Weights & Biases

wandb.ai

1 Job

312 Employees

About the Company

Weights & Biases: the AI developer platform. Build better models faster, fine-tune LLMs, and develop GenAI applications with confidence, all in one system of record developers are excited to use. W&B Models is the MLOps solution used by foundation model builders and enterprises who are training, fine-tuning, and deploying models into production. W&B Weave is the LLMOps solution for software developers who want a lightweight but powerful toolset to help them track and evaluate LLM applications. Weights & Biases is trusted by over 1,000 companies to productionize AI at scale, including teams at OpenAI, Meta, NVIDIA, Cohere, Toyota, Square, Salesforce, and Microsoft. Sign up for a 30-day free trial today at http://wandb.me/trial.

Listed Jobs

Company Name
Weights & Biases
Job Title
AI Engineer - Gen AI/SWE - Weights & Biases
Job Description
Job Title: AI Engineer – Gen AI / SWE (Weights & Biases)

Role Summary: Design, ship, and maintain end‑to‑end generative AI workflows, including prompting, retrieval‑augmented generation, agentic tool use, evaluation, and production deployment, while ensuring reproducibility, safety, and performance.

Expectations:
- 6+ years of building production software systems.
- Proven track record delivering LLM‑powered features with measurable impact.
- Strong expertise in responsible AI deployment and reproducible research practices.

Key Responsibilities:
- Develop end‑to‑end GenAI pipelines (prompting → RAG → agents → evaluation → serving).
- Build agentic systems with secure tool integration, function‑calling, and multi‑step planners.
- Design evaluation harnesses for RAG/agent performance, including golden sets and regression tests.
- Publish reusable code, documentation, tutorials, and public presentations.
- Partner with product and solutions teams to deliver LLM features with clear latency, cost, and safety targets.
- Conduct growth experiments and analyze usage metrics of deployed artifacts.

Required Skills:
- Python or TypeScript (lead language) with strong system design, testing, CI/CD, and observability skills.
- Experience shipping LLM applications (tools, agents, function‑calling) at production scale.
- Proficiency in agentic patterns: planners, executors, sandboxing, and failure taxonomy.
- Expertise in RAG: chunking, embeddings, vector database design, and retrieval policy.
- Evaluation design: offline golden sets, counterfactuals, user studies, guardrail tests; statistical literacy (variance, CI, power).
- Serving & productization: queueing, caching, streaming, cost control, latency troubleshooting.
- Public signal: ≥2 major OSS projects, blog posts, talks, or videos with significant adoption (stars, forks, views).
- Familiarity with AI SDKs/frameworks, agent frameworks, and developer‑facing examples.

Required Education & Certifications:
- Bachelor’s degree (or higher) in Computer Science, Software Engineering, or related field (or equivalent professional experience).
- Relevant certifications in AI/ML, cloud platforms, or data engineering are a plus.
US, France
On site
21-01-2026