Datadog

datadoghq.com

5 Jobs

8,418 Employees

About the Company

Datadog is the essential monitoring platform for cloud applications. We bring together data from servers, containers, databases, and third-party services to make your stack entirely observable. These capabilities help DevOps teams avoid downtime, resolve performance issues, and ensure customers are getting the best user experience.


Listed Jobs

Company Name
Datadog
Job Title
Manager II, Engineering - Agent Customer Experience
Job Description
Job Title: Manager II, Engineering - Agent Customer Experience

Role Summary: Lead the Agent Customer Experience group, shaping vision and execution for agent onboarding, fleet automation, remote management, and remote configuration. Align engineering and product strategies to deliver reliable, user‑centric observability solutions across diverse platforms.

Expectations:
- Mentor and grow multiple engineering teams, including senior engineers and managers.
- Partner with Product and cross‑functional teams to turn ambiguous challenges into actionable plans.
- Design and deliver high‑reliability systems that meet critical business needs.
- Communicate effectively with technical and non‑technical stakeholders.

Key Responsibilities:
- Own the product vision and roadmap for Agent Customer Experience.
- Collaborate with multiple organizations to embed a product‑first mindset and deliver complex platform‑level features.
- Coach, mentor, and develop engineering leaders within the group.
- Engage customers to understand pain points and inform product direction.
- Translate strategic objectives into clear, achievable engineering plans.

Required Skills:
- Proven leadership of engineering teams, including senior engineers and managers.
- Strong product partnership and strategic planning abilities.
- Experience designing and delivering scalable, highly reliable systems.
- Excellent communication, collaboration, and stakeholder management skills.

Required Education & Certifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Demonstrated track record in engineering leadership and product delivery.
Lyon, France
On site
01-02-2026
Company Name
Datadog
Job Title
Manager I, Engineering - APM SDK Capabilities
Job Description
Job Title: Manager I, Engineering - APM SDK Capabilities

Role Summary: Lead a senior engineering team building core infrastructure for consistent, scalable APM tracing libraries across languages and frameworks, prioritizing developer experience and alignment with industry standards.

Expectations: Manage 5-7 engineers, partner with product and engineering teams, drive adoption of runtime metrics and startup diagnostics, shape the configuration management strategy, and influence OpenTelemetry standards.

Key Responsibilities: Deliver cross-language instrumentation capabilities; simplify APM configuration for customers of every scale; ensure SDK reliability and extensibility; align with the OpenTelemetry ecosystem.

Required Skills: Proven engineering leadership, cross-runtime system design, SDK and library development, observability tooling expertise, clean API design, and open-source collaboration. Must be able to bridge technical and product narratives.

Required Education & Certifications: Bachelor’s degree in Computer Science or a related field (or equivalent experience).
Paris, France
On site
01-02-2026
Company Name
Datadog
Job Title
AI Research Engineer - Datadog AI Research (DAIR)
Job Description
**Job Title:** AI Research Engineer - Datadog AI Research (DAIR)

**Role Summary:** Design, build, and scale end‑to‑end AI systems that transform research concepts into production‑ready services within cloud observability and security domains.

**Expectations:**
- Deliver robust, scalable ML pipelines and inference engines.
- Transition prototypes into reliable, customer‑facing features.
- Collaborate closely with scientists, product, and engineering teams.
- Maintain high code quality, reproducibility, and observability.

**Key Responsibilities:**
- Construct and manage datasets, training and evaluation pipelines, benchmarks, and tooling.
- Implement, experiment with, and tune large‑scale models (forecasting, anomaly detection, multimodal analysis, autonomous agents, code‑repair agents).
- Orchestrate distributed training and RL with Ray or equivalent frameworks; handle scaling, scheduling, and fault tolerance.
- Profile models for reliability, performance, and cost; optimize GPU usage.
- Create automated benchmarks and regression tests for all key research areas.
- Integrate advanced AI capabilities into product pipelines and harden prototypes into production services.
- Produce high‑quality code, documentation, and open‑source artifacts to support community and internal reproducibility.

**Required Skills:**
- Strong software engineering background, preferably in observability, SRE, or security.
- Proficiency in Python; familiarity with Rust, C++, Go, or a similar systems language.
- Experience with distributed computing frameworks (Ray, Slurm, etc.).
- Practical knowledge of ML frameworks (PyTorch, JAX), containerization, orchestration, and GPU acceleration.
- Expertise in training, fine‑tuning, and inference of large foundation models.
- Ability to communicate design trade‑offs to technical and non‑technical stakeholders.
- Passion for open source and open science; experience establishing benchmarks and sharing artifacts.

**Bonus Experience (not mandatory):**
- Proven track record bridging research prototypes and production deployments.
- GPU programming and optimization with CUDA.
- Production data pipeline and application development.
- Contributions to research publications.

**Required Education & Certifications:**
- Bachelor’s degree or higher in Computer Science, Engineering, or a related field (equivalent experience accepted).
- Certifications in distributed systems, ML engineering, or cloud platforms are advantageous.
Paris, France
Hybrid
09-03-2026
Company Name
Datadog
Job Title
Staff AI Engineer - MCP Services
Job Description
**Job Title**: Staff AI Engineer - MCP Services

**Role Summary**
Develop and scale agent-friendly tooling for AI workflows in metrics, logs, and incident management. Drive evaluation frameworks and next-generation agent-tool interaction models for internal and external AI systems.

**Expectations**
- Lead public-facing tool development for AI agent integration.
- Collaborate cross-functionally to standardize tool interfaces.
- Operate in ambiguous environments with autonomy and technical ownership.

**Key Responsibilities**
- Architect and implement evaluation pipelines for agent performance (e.g., investigations, incident triage).
- Design tool surfaces for production and evaluation use across AI agents.
- Analyze failures in tool outputs, refine query parsing, and optimize agent feedback.
- Advise on agent orchestration frameworks (e.g., LangChain) and agentic workflows.

**Required Skills**
- Expertise in applied AI/ML, LLM orchestration frameworks (LangChain, LangGraph), and agent systems.
- Experience building evaluation frameworks for AI systems, including metrics design and instrumentation.
- Proficiency in systems thinking across agent-tool-user interactions.
- Experience using AI coding tools and validating AI-generated output.

**Required Education & Certifications**
- Bachelor’s or advanced degree in Computer Science, AI, or a related field.
- Proven track record of impactful technical contributions in AI or product development.
France
Remote
12-03-2026