- Company Name
- Vortexa
- Job Title
- ML Engineer / Data Scientist
- Job Description
Job Title: ML Engineer / Data Scientist
Role Summary:
Design, build, and operate end‑to‑end machine learning pipelines that process millions of energy sensor events per second. Deliver real‑time classification, anomaly detection, and predictive insights to energy operators while ensuring 100% uptime, fault tolerance, and rigorous model governance.
Expectations:
- Proven experience deploying scalable ML workloads on Kubernetes with MLflow.
- Deep knowledge of Python, PyTorch, XGBoost, and statistical modeling for classification and anomaly detection.
- Full‑cycle ML engineering experience: experiment design, model development, validation, deployment, monitoring, and maintenance.
- Ability to design fault‑tolerant, observable production systems that meet energy industry reliability standards.
- Strong collaboration skills, working closely with data scientists, software engineers, and energy analysts.
Key Responsibilities:
- Architect and maintain distributed ML pipelines that ingest, process, and analyze high‑velocity energy data streams.
- Implement and tune classification, anomaly detection, and forecasting models for production (e.g., equipment failure detection, demand forecasting).
- Manage model lifecycle through MLflow, ensuring reproducibility, versioning, and rollback capability.
- Establish comprehensive data lineage and model governance, including validation with domain experts.
- Design the observability stack (logging, monitoring, tracing) for ML inference endpoints.
- Automate infrastructure and deployment via IaC tools (Terraform, CloudFormation) and container orchestration.
- Mentor junior team members on ML engineering best practices and career growth.
Required Skills:
- Python, PyTorch, XGBoost, and Spark/Databricks for large‑scale data processing.
- Kubernetes, Docker, and MLflow for orchestration and MLOps.
- AWS services: SageMaker, S3, EC2, Lambda, and IAM.
- Infrastructure‑as‑code: Terraform, CloudFormation.
- Streaming platforms: Apache Kafka, Flink/Kinesis.
- Observability: Prometheus, Grafana, Jaeger, ELK stack.
- Model governance: Datadog, Seldon, or an equivalent platform.
- Strong understanding of data privacy and regulatory compliance (GDPR, CO2 reporting).
- Experience with time‑series forecasting and transformer‑based or generative AI models (optional).
Required Education & Certifications:
- Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Data Science, or a related field.
- Optional certifications: AWS Certified Machine Learning – Specialty, TensorFlow Developer, or similar ML/DevOps credentials.