[Hiring] Senior Site Reliability Engineer I @Instacart
DevOps
Salary usd 155,000 - 1..
Remote Location
🇺🇸 USA Only
Employment Type full-time
Posted 1 month ago

1 month ago - Instacart is hiring a remote Senior Site Reliability Engineer I. 💸 Salary: USD 155,000 - 195,500 per year 📍 Location: USA

Role Description

Join our team as a Senior Site Reliability Engineer I, where your expertise will play a crucial role in maintaining the backbone of our platform's operations. You'll take on challenges directly, ensuring optimal performance and growth while fostering a culture that prioritizes diligent and effective reliability practices. We're seeking someone eager to take ownership, skilled at addressing complex issues, and ready to explore innovative solutions to support the well-being of our teams and services.

About the Team

The Site Reliability Engineering (SRE) team combines software and systems engineering to design and manage large-scale, distributed, and fault-tolerant systems. This team is tasked with ensuring high reliability, optimal system performance, and continuous improvement for both Instacart's critical internal services and externally facing systems.

  • SRE focuses on optimizing existing systems, building robust infrastructure, and automating processes to minimize manual effort.
  • The team thrives within a culture of intellectual curiosity, problem-solving, and collaboration.
  • Members from diverse backgrounds and experiences foster a supportive and risk-tolerant environment.
  • The team encourages thinking big, taking on impactful projects, and growing with mentorship and guidance.

About the Job

  • Develop scalable infrastructure strategies that ensure high availability, align infrastructure planning with product roadmaps, and optimize cost, risk, and performance with cloud providers.
  • Establish and lead incident management protocols and response plans to coordinate rapid responses, investigate root causes, prevent recurrence, and collaborate with security teams to test response readiness and address security risks.
  • Continuously monitor performance metrics and trends to proactively identify reliability risks.
  • Regularly refine SLOs, SLIs, and error budgets to align with evolving standards, and use data insights to propose improvement plans and architectural updates that enhance system reliability.
  • Oversee regular system evaluations to pinpoint and refine process shortcomings and lead cross-functional projects that promote system optimization and minimize technical debt.
  • Collaborate with product and engineering teams to ensure system enhancements align with user requirements.
  • Design and deploy automation tools to streamline deployment and operations, ensuring seamless processes while overseeing the continuous enhancement of automation scripts and frameworks.
  • Rigorously monitor automated systems for performance and reliability, addressing issues in automated environments promptly to reduce disruptions.
  • Provide technical guidance to junior colleagues, fostering a collaborative culture for problem-solving and innovation.
  • Organize and lead knowledge-sharing sessions and coordinate training in site reliability best practices to enhance team proficiency.

Qualifications

  • Proven experience in programming.
  • Robust knowledge of incident management processes and tools.
  • Exemplary troubleshooting and problem-solving skills.
  • Ability to work under pressure and prioritize tasks during high-stress situations.
  • Expertise in scaling application infrastructure for high availability.

Preferred Qualifications

  • Proficiency in Ruby or Go.
  • Experience with cloud platforms (e.g., AWS, GCP, Azure) and containerization (e.g., Docker, Kubernetes).
  • Skill in risk assessment for foundational infrastructure changes.
  • Experience in monitoring system performance and trend analysis.

Benefits

  • Highly market-competitive compensation and benefits.
  • Remote work flexibility.
  • New hire equity grant and annual refresh grants.

Salary Information

  • CA, NY, CT, NJ: $185,000 - $195,500 USD.
  • WA: $177,000 - $187,000 USD.
  • OR, DE, ME, MA, MD, NH, RI, VT, DC, PA, VA, CO, TX, IL, HI: $170,000 - $179,500 USD.
  • All other states: $155,000 - $163,500 USD.