Site Reliability Engineer @Capchase

Apr 14, 2025 - Capchase is hiring a remote Site Reliability Engineer. 💸 Salary: unspecified. 📍 Location: Latin America (LATAM), USA.

This description summarizes our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

As a foundational member of our growing engineering team, you'll play a pivotal role in shaping the culture, processes, and infrastructure that will support our company as we scale 50x. This is an opportunity to work in a fast-paced environment, collaborate with talented engineers, and directly contribute to building a resilient, high-performance product.

You’ll be responsible for ensuring the availability, latency, performance, efficiency, scalability, and reliability of our systems—while helping define the long-term vision and roadmap for Site Reliability Engineering at our company.

  • Infrastructure & Scalability
    • Design and evolve our systems architecture to scale 50x.
    • Lead infrastructure and team scalability initiatives.
    • Partner with Tech leadership to drive strategy around production-critical systems.
  • Reliability & Performance
    • Own service levels (SLAs, SLOs, and SLIs), helping teams define and uphold them.
    • Conduct capacity planning and cost optimization.
    • Standardize service levels and observability practices across the organization.
  • Monitoring, Observability & Alerting
    • Define requirements and best practices for monitoring, alerting, and logging.
    • Design and implement tools to gain insight into trends, detect anomalies, and compare system behavior.
    • Build visualizations to surface system health and performance.
  • CI/CD & Developer Velocity
    • Improve and maintain our CI/CD pipelines and development environments.
    • Eliminate toil and increase automation across the engineering lifecycle.
    • Accelerate development by enhancing testing, staging, and deployment processes.
  • Incident Management & Disaster Recovery
    • Lead the on-call rotation and incident response efforts.
    • Serve as the first responder during production incidents—owning detection, escalation, and resolution.
    • Drive postmortems and ensure actionable insights are implemented.
  • Security & Compliance
    • Collaborate on in-house practices or third-party partnerships to improve security posture.
    • Ensure that security and compliance support our scalability goals and customer trust.
  • Team & Culture Building
    • Help define the roadmap and future scope of the SRE function.
    • Participate in hiring and mentoring to grow a world-class reliability team.
    • Foster a culture of ownership, collaboration, and continuous improvement.

Qualifications

  • Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
  • Proficiency in one or more programming languages such as C++, Elixir, JavaScript, Python, or Go.
  • Solid understanding of algorithms and data structures.
  • Deep expertise in designing, analyzing, and troubleshooting distributed systems.
  • Hands-on experience with Kubernetes, Terraform, and Google Cloud Platform (GCP).
  • Strong debugging and code optimization skills, with a passion for automation.
  • Systematic problem-solving approach, effective communication skills, and a drive for operational excellence.

Requirements

  • This role is ideal for engineers passionate about building high-performing systems and scaling infrastructure—while collaborating cross-functionally to shape the future of engineering reliability.

Benefits

  • We are an equal opportunity employer and value diversity at our company.
  • We do not discriminate on the basis of race, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Before You Apply

📍 Be aware of the location restriction for this remote position: Latin America (LATAM), USA
‼ Beware of scams! When applying for jobs, you should NEVER have to pay anything.

Category: Devops / Sysadmin
Salary: unspecified
Remote location: Latin America (LATAM), USA
Job type: full-time
Posted: Apr 14, 2025