- Company Name
- MillionLogics
- Job Title
- LLM Engineering Manager – Python & Machine Learning
- Job Description
Job Title: LLM Engineering Manager – Python & Machine Learning
Role Summary: Lead cross‑functional ML teams to research, train, and deploy large language models and ML systems, combining deep‑learning expertise with MLOps strategy and execution.
Expectations: Own the end‑to‑end lifecycle of LLM projects, mentor senior engineers while remaining hands‑on in coding and pipeline optimization, align research breakthroughs with product and business goals, and ensure that responsible AI and compliance standards are met.
Key Responsibilities
- Direct and mentor a team of ML engineers, data scientists, and MLOps specialists.
- Own the full LLM/ML project lifecycle: data acquisition, model training, evaluation, deployment, and post‑production monitoring.
- Collaborate with Research, Product, and Infrastructure teams to define project scope, milestones, and success metrics.
- Architect and scale distributed training pipelines and cloud‑based inference deployments, and optimize GPU/TPU utilization.
- Implement MLOps best practices: experiment tracking, model governance, CI/CD, and automated monitoring.
- Manage compute budgets and resources, ensuring adherence to security and responsible AI policies.
- Communicate progress, risks, and outcomes to stakeholders and executive leadership.
Required Skills
- 9+ years of experience in Machine Learning, NLP, and deep‑learning architectures (Transformers/LLMs).
- Proficiency in PyTorch, TensorFlow, Hugging Face, DeepSpeed, or comparable frameworks.
- 2+ years of proven team management experience in production ML/LLM delivery.
- Expertise in distributed training, GPU/TPU optimization, and AWS/GCP/Azure cloud environments.
- Hands‑on experience with MLOps tools such as MLflow, Kubeflow, Vertex AI, or similar.
- Strong leadership, communication, and cross‑functional collaboration skills.
Required Education & Certifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.