- Company Name
- TVH
- Job Title
- Data Scientist
- Job Description
Role Summary
Design, prototype, and productionize machine learning models that power e‑commerce search, ranking, and product discovery. Collaborate with data science, engineering, and IT to build scalable pipelines, deploy models, and translate business needs into technical solutions.
Expectations
- Deliver end‑to‑end ML workflows that improve customer part discovery.
- Bridge the gap between data science and IT, ensuring implementations meet both business objectives and technical constraints.
- Operate in a fast‑moving, ambiguous environment with a “make it work” mentality.
Key Responsibilities
- Develop and validate classification, forecasting, and clustering models for search and recommendation.
- Engineer features, experiment with vector‑based retrieval and embeddings, and evaluate model performance.
- Build and maintain scalable ML pipelines using Docker, Kubernetes, and CI/CD for automated testing, versioning, and deployment.
- Integrate Elasticsearch, Solr, OpenSearch, or Vespa for high‑performance search and retrieval.
- Fine‑tune transformer‑based LLMs, create prompts, and generate synthetic data for GenAI initiatives.
- Collaborate with cross‑functional teams to iterate on model outputs and incorporate feedback.
- Monitor model drift and retrain models as needed to sustain accuracy.
Required Skills
- Strong foundation in core ML techniques (classification, forecasting, clustering, evaluation metrics, feature engineering).
- Proficiency in search engineering and information retrieval: vector search, embedding architectures, NLP, transformer models, sequence modeling.
- Experience with Elasticsearch, Solr, OpenSearch, or Vespa; multilingual retrieval is a plus.
- GenAI/LLM expertise: prompt engineering, synthetic data generation, agentic AI, and the training, fine‑tuning, and deployment of transformer models.
- MLOps skills: Docker, Kubernetes, CI/CD pipelines (GitLab CI, Jenkins, Argo).
- Solid communication and partnership skills to translate technical concepts for business stakeholders.
- Ability to work independently in a dynamic environment and navigate ambiguity.
Required Education & Certifications
- Bachelor’s degree (or higher) in Computer Science, Data Science, Statistics, or a related field.
- Certifications in ML, DevOps, or cloud platforms (AWS, GCP, Azure) are advantageous but not mandatory.