Senior Site Reliability Engineer II @Braze

Mar 01, 2025 - Braze is hiring a remote Senior Site Reliability Engineer II. 💸 Salary: unspecified. 📍 Location: Northern America, Americas, USA, Canada.

This description is a summary of our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

Site Reliability Engineers (SREs) are responsible for keeping all internal-facing services and platforms running smoothly. In a nutshell, SREs ensure site uptime. They are a blend of sensible system administrators and software engineers who apply sound engineering principles, operational discipline, and mature automation to the environments and infrastructure services we provide.

  • Specialize in systems, whether networking, the Linux kernel, scaling algorithms, or distributed systems.
  • Help improve automation and infrastructure reliability.
  • Empower Braze’s other engineering teams to easily leverage the infrastructure products and platforms we create.
  • Operate at a massive scale with over 3.3 billion monthly active users across our customers.
  • Collect hundreds of billions of data points each month and send billions of messages to end-users daily.
  • Use a diverse technology stack rooted in Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more.

As a Site Reliability Engineer at Braze, you will collaborate with your team and with the engineering teams that consume our platforms to continuously improve the infrastructure, automation, and tooling that turn these technologies into internal products.

Responsibilities

  • Partner with Braze’s engineering teams on:
    • Architecting products to effectively utilize infrastructure platforms in a scalable, reliable manner.
    • Debugging reliability and scalability issues across all stack layers, including the products built using our infrastructure platforms.
    • Making monitoring and alerting trigger on symptoms and not on outages.
    • Ensuring that Braze meets our strict enterprise-grade SLAs with customers.
  • Develop Braze’s internal platform infrastructure:
    • Create infrastructure as code using Chef, Terraform, and Kubernetes.
    • Develop deployment pipelines for applications in multiple languages using Docker, Kubernetes, etc.
    • Provide centralized/common tooling, services, and automation frameworks that are critical for scaling operations, capacity management, reducing operational pain, and improving the day-to-day workflow of Braze’s engineering teams.
  • Manage incidents:
    • Be on a PagerDuty rotation to respond to availability incidents and provide support for other engineers.
    • Use your on-call shift to prevent incidents from ever happening.
    • Run retrospectives on everything that happens to turn lessons into system improvements and changes, automation, etc.

Qualifications

  • 3+ years of experience as a Software, DevOps, or Site Reliability Engineer.
  • Think about systems - interfaces, boundaries, edge cases, failure modes, behaviors, specific implementations.
  • Have an urge to collaborate, document, and deliver quickly.
  • Collaborate across global remote teams, often working asynchronously.
  • Document everything to avoid learning the same thing (or planning the same work) twice.
  • Deliver fast to delight our customers – even internal ones.
  • Have an enthusiastic, go-for-it attitude; when you see something broken, you can't help but fix it.
  • Desire to solve everyday challenges facing software engineers and automate their toil away.
  • Excellent ability to manage multiple tasks and expectations at once.
  • Know your way around Linux and the Unix shell.
  • Strong programming skills - Ruby and/or Go preferred.
  • Experience with Docker, Kubernetes, Terraform, or similar IaC technologies.
  • Experience with MongoDB, Redis, Kafka, Postgres, or similar data technologies.

Benefits

  • Competitive compensation that may include equity.
  • Retirement and Employee Stock Purchase Plans.
  • Flexible paid time off.
  • Comprehensive benefit plans covering medical, dental, vision, life, and disability.
  • Family services that include fertility benefits and equal paid parental leave.
  • Professional development supported by formal career pathing, learning platforms, and tuition reimbursement.
  • Community engagement opportunities throughout the year, including an annual company-wide Volunteer Week.
  • Employee Resource Groups that provide supportive communities within Braze.
  • Collaborative, transparent, and fun culture recognized as a Great Place to Work®.

Category: Devops / Sysadmin
Salary: unspecified
Remote location: Northern America, Americas, USA, Canada
Job type: unspecified
Posted: Mar 01, 2025