Job Specifications
Job Description: Senior AI/ML Engineer – Platform & Infrastructure
Location: Columbus, OH (Hybrid)
Onsite from day 1, 3 days per week in the client office
Rate: $47/Hr on W2
Role Overview
We are seeking a highly skilled Senior AI/ML Engineer with deep expertise in large-scale machine learning systems, distributed data processing, and cloud-native AI infrastructure. The ideal candidate will design, build, and operate end-to-end ML and GenAI platforms supporting real-time and batch workloads in production.
You will work closely with data scientists, platform engineers, and product teams to enable scalable model training, deployment, monitoring, and annotation workflows across cloud environments.
Key Responsibilities
Design and implement scalable ML pipelines using Python, Java, PySpark, and Apache Spark (batch & streaming)
Build and operate real-time data processing systems using Spark Streaming / Structured Streaming
Develop and manage GenAI and ML infrastructure on AWS (EMR, EKS, EC2, S3)
Deploy and manage Kubernetes-based ML platforms (EKS), including model serving and annotation systems
Implement annotation solutions and manage annotation platform deployments on EKS
Optimize distributed storage and low-latency access using Cassandra
Build production-grade ML workflows using Databricks
Ensure reliability, scalability, security, and cost optimization of ML platforms
Collaborate on MLOps practices including CI/CD, monitoring, logging, and model lifecycle management
Support experimentation, training, inference, and evaluation of ML and GenAI models
Required Skills & Qualifications
Strong proficiency in Python and Java
Hands-on experience with PySpark, Apache Spark, and Spark Streaming
Advanced experience with AWS services: EMR, EKS, EC2, S3
Production experience deploying systems on Kubernetes (EKS)
Experience with Cassandra or other distributed NoSQL databases
Strong experience with Databricks for ML and data engineering workflows
Solid understanding of ML infrastructure, MLOps, and GenAI systems
Experience deploying and managing annotation pipelines and tooling
Strong understanding of distributed systems, performance tuning, and fault tolerance
Nice to Have
Experience with LLMs, GenAI pipelines, or model serving frameworks
Infrastructure as Code (Terraform, CloudFormation)
Monitoring tools (Prometheus, Grafana, CloudWatch)
Experience with large-scale, multi-tenant ML platforms
About the Company
At NAVA Software Solutions, we help enterprises, product companies, and Global Capability Centers (GCCs) scale with - . Headquartered in Connecticut with operations across the USA, Mexico, and India, we specialize in building AI-enabled platforms, modernizing legacy systems, and delivering automation-driven services across cloud, data, and software quality. Whether you're launching - , , or building offshore engineering capabilities, we bring the strategy, execution, and global delivery to make it happen. * AI-Po...