- Company Name
- Cohere
- Job Title
- Software Engineer - Applied ML (US/CAN)
- Job Description
**Job Title**
Software Engineer – Applied ML
**Role Summary**
Lead the design, development, and deployment of LLM‑based applications such as RAG systems and AI agents. Build scalable backend services, data pipelines, and evaluation frameworks to enable customer‑focused LLM solutions. Collaborate with ML engineers and platform teams to deliver production‑grade ML workflows in cloud and containerized environments.
**Expectations**
- Deliver high‑quality, production‑ready code for LLM and agent applications.
- Participate in the end‑to‑end lifecycle: design, implementation, testing, CI/CD, monitoring, and maintenance.
- Build and support clear, maintainable APIs and services for ML/LLM pipelines at scale.
- Communicate progress and technical decisions effectively to cross‑functional teams.
**Key Responsibilities**
1. Design and build agentic AI applications, including RAG pipelines, for business use cases.
2. Develop, maintain, and optimize data generation and evaluation pipelines that support custom LLM training and fine‑tuning.
3. Build backend services and APIs to support ML/LLM workflows, scaling to production volumes.
4. Collaborate with ML engineers on model deployment, evaluation, experiment tracking, and versioning.
5. Apply software engineering best practices: code quality, unit/integration testing, CI/CD, observability, and source control.
6. Deploy, monitor, and troubleshoot applications on Docker/Kubernetes cloud infrastructure.
**Required Skills**
- Proficiency in Python (application, data processing, ML/LLM integration).
- Strong skills in API design, scalable service architecture, databases, testing, CI/CD, observability, and Git workflows.
- Experience with LLM evaluation fundamentals, data pipelines, RAG systems, and agent development (LangChain, LlamaIndex, MCP).
- Familiarity with ML/LLM tooling: Hugging Face, Weights & Biases, Docker, Kubernetes.
- Knowledge of LLM fine‑tuning and runtime concepts (SFT, DPO, LoRA, vLLM, Transformers).
- Ability to develop and monitor vector embedding and search systems (Pinecone, Weaviate, FAISS, Milvus).
- Good written and verbal communication skills; willingness to travel up to 25% of the time.
**Nice-to-Have**
- Experience with cloud platforms (AWS, GCP, Azure).
- MLOps/experiment tracking with MLflow or equivalent.
- Evaluation frameworks such as LM Evaluation Harness, RAGAs, TruLens.
- Backend skills in Go or JavaScript/TypeScript.
- Applied NLP tooling (spaCy, NLTK, Hugging Face datasets/tokenizers).
**Required Education & Certifications**
- Bachelor’s degree or higher in Computer Science, Engineering, or a related technical field.
- Relevant certifications (e.g., AWS Certified Solutions Architect, Docker Certified Associate) are a plus.