Principal Reliability Engineer @Enlyte
Software Development
Salary $133,000 - $190..
Remote Location
🇺🇸 USA Only
Job Type full-time
Posted 3d ago

[Hiring] Principal Reliability Engineer @Enlyte

3d ago - Enlyte is hiring a remote Principal Reliability Engineer. 💸 Salary: $133,000 - $190,000 annually 📍 Location: USA

Role Description

This is a full-time remote position that can be located anywhere in the U.S.

The Principal Reliability Engineer is a senior technical leader responsible for the reliability, observability, and operational control of enterprise platforms and services. This role owns the design and evolution of the reliability control plane, including visibility, telemetry, alerting, and integrations across monitoring, incident management, and automation tooling.

At the Principal level, this role leads complex, high-impact reliability initiatives, defines standards for observability and operational readiness, and partners closely with Cloud, Platform, Security, and Application teams to ensure systems are resilient, measurable, and operable at scale.

Key Responsibilities

  • Own the reliability control plane, including standards and architecture for monitoring, logging, tracing, alerting, and incident management.
  • Define how services expose health, performance, and operational signals across the enterprise.
  • Establish and evolve reliability patterns and reference architectures adopted across teams.
  • Lead design decisions that improve system resilience, fault tolerance, and recoverability.
  • Own integrations between platforms and reliability tooling (monitoring, alerting, incident response, on-call, and automation systems).
  • Define consistent approaches to telemetry collection, normalization, and consumption.
  • Ensure observability tooling provides actionable visibility aligned to service-level objectives.
  • Evaluate and recommend tooling improvements that enhance visibility and operational insight.
  • Lead complex reliability initiatives impacting multiple systems or platforms.
  • Partner with engineering teams to design reliable, observable services from inception.
  • Drive adoption of best practices for operational readiness, graceful degradation, and failure handling.
  • Review system designs to ensure reliability and observability requirements are met.
  • Establish standards for alerting quality, escalation, and incident response.
  • Drive improvements in incident detection, diagnosis, and recovery.
  • Lead or support root cause analysis for significant incidents and ensure durable corrective actions.
  • Promote a culture of operational excellence and continuous reliability improvement.
  • Serve as a technical leader and trusted advisor on reliability and observability.
  • Mentor senior engineers and influence reliability practices across teams.
  • Collaborate with Cloud, Security, and Platform leaders to align reliability strategy with business needs.

Qualifications

  • 12+ years of related experience with a Bachelor's degree, or equivalent professional experience.
  • Extensive experience designing and operating reliable, observable systems at scale.
  • Proven success owning or leading observability and incident management platforms.
  • Background in cloud, platform, or infrastructure engineering strongly preferred.
  • Deep expertise in reliability engineering, observability, and distributed systems.
  • Strong understanding of monitoring, logging, tracing, alerting, and incident management concepts.
  • Experience integrating and operating reliability tooling at enterprise scale.
  • Solid grasp of cloud and platform architectures and their operational characteristics.
  • Ability to translate operational risk and system behavior into actionable engineering improvements.

Benefits

  • Medical, Dental, Vision, Health Savings Accounts / Flexible Spending Accounts.
  • Life and AD&D Insurance.
  • 401(k).
  • Tuition Reimbursement.
  • An array of resources that encourage a lifetime of healthier living.
  • Compensation range: $133,000 - $190,000 annually, based on skills, experience, and education.