- Company Name
- EPITEC
- Job Title
- Machine Learning Engineer
- Job Description
**Job Title**
Senior MLOps Engineer
**Role Summary**
Design, build, and maintain a scalable, secure MLOps platform that enables data‑driven teams to develop, deploy, and operationalize machine‑learning models at enterprise scale. Responsible for creating self‑service tooling, ensuring model reliability, and driving platform adoption through user support and best‑practice standards.
**Expectations**
- Lead the architecture of end‑to‑end ML pipelines, from data preparation to production monitoring.
- Deliver high‑quality, production‑ready models with minimal manual intervention.
- Collaborate cross‑functionally with data science, engineering, and product teams to translate business needs into platform features.
- Uphold and evolve DevOps and MLOps best practices, ensuring continuous integration, delivery, and observability.
**Key Responsibilities**
- Define secure, scalable MLOps architectures and cloud‑native pipelines (AWS, Kubernetes).
- Build and maintain self‑service ML tooling (experiment tracking, model registry, deployment automation).
- Design and implement CI/CD workflows (Git, GitHub, Azure DevOps, JFrog Artifactory).
- Containerize and orchestrate models using Docker, Kubernetes, Helm/Helmfile.
- Develop testing, validation, and monitoring solutions for data drift, model drift, and performance degradation.
- Create user documentation and training materials, and troubleshoot platform issues.
- Prototype and deliver proofs of concept for automated MLOps at scale.
- Advocate for and enforce coding standards, regular refactoring, and practices that improve development velocity.
**Required Skills**
- 5+ years of object‑oriented programming experience (Python, Go, Java, or C/C++).
- Proficiency in Python, R, and SQL; strong scripting abilities.
- Experience with MLOps frameworks: MLflow, Kubeflow, Seldon, or equivalent.
- Experience designing and deploying cloud‑based MLOps pipelines (AWS preferred).
- Deep understanding of DevOps practices: CI/CD, Git workflows, artifact repositories.
- Containerization and orchestration: Docker, Kubernetes, Helm.
- Ability to automate model testing, validation, and end‑to‑end deployment.
- Excellent communication and collaboration skills; able to translate high‑level requirements into user stories and tasks.
- Proactive, self‑directed, and comfortable working with minimal supervision.
**Nice to Have**
- Experience with infrastructure as code (CloudFormation, Terraform).
- Knowledge of inference platforms (Seldon, Langfuse).
- Familiarity with observability tools (Evidently AI).
- Experience building inference systems integrated with MLflow.
**Required Education & Certifications**
- Bachelor’s degree in Computer Science, Data Science, Engineering, or related field with 5+ years of professional experience, *or*
- Master’s degree with 3+ years of professional experience.