- Company Name
- BREAKERS CONSULTING
- Job Title
- Data Scientist F/H
- Job Description
-
Role Summary
Provide end‑to‑end AI solutions, focusing on generative language models, Retrieval‑Augmented Generation (RAG) architectures, and multimodal data (image, audio, voice). Collaborate with product, engineering, and domain teams to transform business needs into deployable, high‑impact machine‑learning products.
Expectations
- Minimum 2 years of professional experience in Data Science or Machine Learning.
- Proven track record of fine‑tuning and deploying large language models (e.g., OpenAI GPT, Meta Llama, Mistral, Google Gemini).
- Strong communication skills; fluency in English; ability to convey technical concepts to non‑technical stakeholders.
- Agile mindset with experience working in iterative, cross‑functional squads.
Key Responsibilities
• Design, fine‑tune, and optimize large language models for specific client projects.
• Build and maintain high‑performance RAG solutions: create vector stores, semantic search engines, and integrate them with LLMs.
• Develop computer‑vision pipelines (OCR, classification, detection, generative models).
• Process audio and voice data using speech‑to‑text models such as Whisper, text‑to‑speech (TTS) systems, and other deep‑learning audio models.
• Implement full ML/AI pipelines: data ingestion, training, validation, deployment, and monitoring, leveraging MLOps practices (Docker, CI/CD).
• Conduct technological research, propose innovations, and keep the team informed of emerging methods.
• Present POCs, demos, and results to internal teams and clients.
Required Skills
- Deep expertise in LLM fine‑tuning and inference (PyTorch, TensorFlow).
- Experience with RAG components: vector stores, embeddings, retrieval mechanisms.
- Competency in computer‑vision frameworks (OpenCV, Detectron2, torchvision) and audio processing.
- Proficiency in MLOps tools: Docker, CI/CD pipelines, model serving.
- Strong Python programming; familiarity with data‑science libraries (pandas, NumPy, scikit‑learn).
- Knowledge of Agile/Scrum collaboration practices.
Required Education & Certifications
- Engineering degree or Master’s degree in Data Science, Computer Science, Machine Learning, or a related field.
- Relevant certifications (e.g., Agile Practitioner, Data Engineering, AI‑specific courses) preferred but not mandatory.