Senior Software Engineer (AI)

Datadog

Madrid, Spain · Remote · Dev Eng · 5+ yrs

About the role

At Datadog, we leverage AI across our observability platform to improve monitoring, speed up incident resolution, and ensure data reliability for cloud applications.

Datadog’s Deployment Gates team builds customer-facing systems that decide whether software should ship to production. Deployment Gates sit directly in customers’ CI/CD pipelines and use observability data to answer one of the hardest questions in software delivery:

Given everything we know right now, is this deployment safe to proceed?

In this role, you will:

Work on an analysis service that prevents incidents by detecting faulty software changes in production before they reach clients

Lead the design of progressive deployment automation, starting with zero-setup, conservative AI rules and evolving toward adaptive gates that learn from incidents and organizational patterns

Design the foundation for autonomous remediation, connecting what changed in code to what broke in production, from blocking and rolling back unsafe deployments to proposing fixes and enforcing policies

This is a highly product‑minded engineering role: you’ll work from problem discovery and UX all the way to reliable, scalable production systems.

At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.

What you’ll do:

Build AI-driven deployment gates: Design and ship decision systems that evaluate customer deployments using CI/CD context and Datadog telemetry, producing safe, explainable allow/block outcomes

Own evals and rollout: Define precision, recall, and trust metrics; build offline and online evals; validate changes in shadow mode; and safely promote improvements to enforcement

Design for robustness and safety: Implement conservative defaults, guardrails, fallbacks, and human-in-the-loop paths so gates behave predictably under noisy or incomplete data

Partner closely with Product: Work hand-in-hand with the Product Manager to translate customer problems, adoption signals, and roadmap goals into concrete technical decisions and iterations

Integrate across the Datadog platform: Partner with internal AI teams building the Faulty Deployment Detection pipeline, as well as teams working on LLMs and AI agents

Own production systems: Build and operate reliable backend services that run in the critical path of customer deployments, and be on-call for those services

Who you are:

A product‑minded engineer who ships AI to production

You have 5+ years of experience with backend systems and microservice performance: tracing, latency breakdowns, concurrency, and resiliency patterns

You are proficient in a modern programming language, with strong API/service design skills and production ops experience (monitoring, alerting, on‑call rotation)

You have proven experience delivering software based on LLM/agent featur
