Staff Software Engineer - Alerting Platform @Datadog

Mar 27, 2025 - Datadog is hiring a remote Staff Software Engineer - Alerting Platform. 💸 Salary: unspecified. 📍 Location: Europe.

This description is a summary of our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

We are looking for a Staff Engineer to help us scale Datadog's Alerting Platform, which is responsible for the core systems that define and schedule monitors, create alerts, and ensure the accuracy and timeliness of the end-to-end alerting process across the platform. This is a unique opportunity to contribute to one of the most critical platforms at Datadog. Customers can configure monitors and generate alerts for virtually every product in our unified platform. It’s imperative that we maintain our customers’ trust by delivering these notifications reliably.

In practice, this means the alerting platform has to be the most reliable platform. As we grow, we have to design systems that can degrade gracefully while still ensuring the best customer experience without breaking. This staff engineer will focus on two critical components:

  • The alerting scheduler, responsible for scheduling the timely evaluation of millions of monitors each minute
  • The state processor that makes the critical decision about when a transition in monitor state has occurred

These distributed systems are tied together, one being the consumer (state machine) of the other (scheduler). The reliability and fault tolerance of these systems together, and across the entire alerting platform, is critical to Datadog's customer trust and business success. Upcoming initiatives to achieve an order of magnitude increase in reliability will require deep changes to these complex systems.

At Datadog, we place value in our office culture - the relationships and collaboration it builds and the creativity it brings to the table. We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them.

Qualifications

  • You have led cross-team initiatives in a platform or infrastructure-focused environment for 2+ years.
  • You have led impactful technical initiatives in an environment where performance, reliability, and accuracy are first-order concerns.
  • You have a reliability-oriented mindset and care deeply about designing and building resilient architectures.
  • You have significant back-end programming experience and have architected, built, and operated distributed systems to solve problems at high scale.

Responsibilities

  • Design and drive high priority, high visibility projects that increase the platform's resilience and scalability across multiple teams.
  • Lead and guide others through architectural decisions for new and existing distributed, high-throughput, real-time systems.
  • Identify potential system risks and trends in reliability, and design solutions to address them.
  • Provide input on prioritization of engineering-led initiatives in short- and long-term planning and roadmaps.
  • Collaborate closely with partner platforms that integrate and depend on the alerting platform to provide critical capabilities to their customers.

Benefits

  • New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
  • Continuous professional development, product training, and career pathing
  • Intradepartmental mentor and buddy program for in-house networking
  • An inclusive company culture, ability to join our Community Guilds (Datadog employee resource groups)
  • Access to Inclusion Talks, our internal panel discussions
  • Free, global mental health benefits for employees and dependents age 6+
  • Competitive global benefits
