Company Name: Avenue Code
Job Title: Machine Learning Engineer
Job Description:
Role Summary:
Design, build, and maintain end‑to‑end MLOps pipelines, develop FastAPI inference microservices, and lead deployment strategies on Azure Kubernetes Service (AKS) using GitOps. Own the full ML lifecycle from data preparation to model monitoring and automated retraining, ensuring high-quality, reproducible, and secure model delivery.
Expectations:
- Deliver production-ready ML services with low latency and high scalability.
- Uphold best practices in code quality, governance, and observability.
- Drive continuous improvement of the MLOps platform and deployment workflows.
- Collaborate cross‑functionally with data scientists, software engineers, and platform and SRE teams.
Key Responsibilities:
- Build and maintain MLOps pipelines for data prep, training, validation, packaging, and deployment.
- Develop FastAPI microservices with clear API contracts, versioning, and documentation.
- Implement deployment strategies on AKS (blue/green, canary, shadow, champion/challenger) using GitOps with Argo CD.
- Architect a self‑serve MLOps platform (standards, templates, CLI/scaffolds).
- Operationalize scikit‑learn, PyTorch, and XGBoost models for low‑latency, scalable serving.
- Build CI/CD for ML: automated testing, security scans, build, package, promote.
- Integrate telemetry and observability, and set SLOs for model services.
- Monitor model/data drift; automate retraining, evaluation, safe rollout/rollback.
- Partner with software engineers to embed ML services in client applications and shared platforms.
- Champion code quality, reproducibility, governance, model registry, artifact approval.
Required Skills:
- Advanced Python programming, FastAPI design, RESTful API best practices.
- Proven MLOps experience: packaging, serving, scaling models as APIs.
- CI/CD expertise (GitHub Enterprise or equivalent), automated testing, release pipelines.
- Containerization & orchestration (Docker, Kubernetes), AKS production deployments.
- GitOps with Argo CD; deployment strategies (blue/green, canary, rollback).
- Understanding of model registry, artifacts, approvals, reproducibility.
- Cross‑functional collaboration with data science, software, platform/SRE.
Required Education & Certifications:
- Bachelor’s degree in Computer Science, Software Engineering, Data Science, or a related field.
- 2+ years of relevant experience in MLOps/ML engineering.
- Certifications such as Azure AI Engineer Associate or equivalent in cloud/MLOps are preferred but not mandatory.