- Company Name
- SIMARN Solutions
- Job Title
- Senior Data Scientist
- Job Description
**Job Title**
Senior Data Scientist
**Role Summary**
Lead advanced AI/ML initiatives: build, train, and deploy models across batch and streaming pipelines. Translate complex business problems into scalable, production‑ready, data‑driven solutions. Mentor junior engineers, enforce best practices in model governance, and streamline end‑to‑end data workflows.
**Expectations**
- 7–9 years of data science experience, including hands‑on work with Python, PySpark, and SQL.
- Proven track record in developing regression, classification, clustering, anomaly detection, and deep‑learning models (CNN, RNN, LSTM, autoencoders).
- Experience designing and operating big‑data / streaming ML pipelines and ETL processes on AWS (SageMaker, EMR, etc.).
- Strong ability to evaluate, validate, and tune models, ensuring robust performance and governance.
- Leadership skills: code review, mentorship, and effective cross‑functional communication.
**Key Responsibilities**
- Design, implement, and optimize end‑to‑end ML pipelines for batch and real‑time use cases.
- Develop, validate, and fine‑tune models using scikit‑learn, TensorFlow, and PyTorch.
- Perform data transformation, exploratory analysis, and feature engineering (Pandas, NumPy).
- Support ETL workflows and maintain data quality across source and target systems.
- Apply model evaluation, cross‑validation, and hyper‑parameter tuning to deliver high‑impact results.
- Define and enforce model governance, reproducibility, and deployment standards.
- Mentor junior data scientists/engineers and conduct peer code reviews.
- Collaborate with stakeholders to translate business requirements into technical specifications.
**Required Skills**
- Programming: Python, PySpark, SQL.
- Libraries: Pandas, NumPy, scikit‑learn, TensorFlow, PyTorch.
- Deep learning: CNN, RNN, LSTM, autoencoders.
- Analytics: regression, classification, clustering, anomaly detection.
- Big‑data & streaming: Spark, Kafka, AWS (SageMaker, EMR), data pipelines.
- Model lifecycle: evaluation, cross‑validation, hyper‑parameter tuning, governance.
- Collaboration: code review, mentorship, stakeholder communication.
**Required Education & Certifications**
- Bachelor’s or Master’s degree in Computer Science, Statistics, Data Science, Data Analytics, or Machine Learning.
- Relevant certifications (e.g., AWS Certified Machine Learning – Specialty, TensorFlow Developer Certificate) are a plus.