Senior Site Reliability Engineer @Dealer Tire
DevOps / Sysadmin
Salary usd 110,000 - 1..
Remote Location
🇺🇸 USA Only
Job Type full-time
Posted 2d ago

[Hiring] Senior Site Reliability Engineer @Dealer Tire

2d ago - Dealer Tire is hiring a remote Senior Site Reliability Engineer. 💸 Salary: USD 110,000 - 125,000 per year 📍 Location: USA

Role Description

As a Senior Site Reliability Engineer, you will be a hands-on technical individual contributor embedded within the Core Systems team, responsible for the daily health, stability, and performance of our production environment. You will serve as a primary responder for production incidents, owning triage through resolution, including root cause analysis, infrastructure remediation, and order automation recovery. You will work directly alongside the Manager, Consumer Technology Site Reliability, and Helpdesk to handle day-to-day triage and fix responsibilities, enabling leadership to focus on strategic decisions and team direction. You will also partner with development teams to evaluate production risk before deployment.

Your essential job responsibilities will include the following:

  • Production Triage: Triage all incidents surfaced via the #triage Slack channels, Datadog alerts, Rundeck failures, contact center reports, and proactive monitoring across all business units.
  • Incident Ownership: Serve as the primary on-call responder for production incidents. Acknowledge, investigate, and drive issues to resolution with clear communication throughout the incident lifecycle.
  • Root Cause Analysis: Lead RCA (Root Cause Analysis) for production failures, including order automation breakdowns, Gearman/worker queue degradation, API integration outages, batch job timeouts, and database performance events. Document findings with sufficient detail to support post-mortem review.
  • Hands-On Remediation: Execute infrastructure-level remediation, including EC2 instance restarts, Gearman worker pool resets, Rundeck job recovery, order status resets, and inventory and pricing queue restoration.
  • Regression Identification: Identify deployment-related regressions by correlating incident timelines to recent deployments. Initiate and coordinate revert requests with development teams when causal links are established.
  • Incident Coordination: Direct cross-functional teams during active incidents by assigning investigation tasks, managing parallel workstreams, tracking affected order or customer counts, and keeping all stakeholders informed via Slack threads and JIRA ticket updates.
  • Focus Areas: Monitor the entire Consumer Enterprise Group (CEG) Platform processing environment and proactively surface anomalies, enhancement opportunities, and risk areas to leadership.
  • Assist with data cleanup and order recovery operations following production incidents.
  • Support testing and validation of infrastructure changes prior to production deployment.
  • Ensure accurate and timely entry of incident details, findings, and resolutions into JIRA tracking systems.
  • Continue to develop expertise in the CEG codebase, third-party integrations, and operational tooling through working sessions and self-directed learning.
  • Pursue personal growth opportunities and certifications that will enhance effectiveness in the role.
  • Other duties as assigned.

Qualifications

  • 5+ years in a Site Reliability Engineering, DevOps, or Production Support role at a software or e-commerce company.
  • Demonstrated ability to independently diagnose and resolve production incidents, including infrastructure-level failures (servers, queues, batch jobs, APIs).
  • Hands-on experience with AWS (EC2, CloudWatch, or equivalent) for day-to-day operational tasks.
  • Experience with Datadog, New Relic, PagerDuty, or equivalent platforms for monitoring, alerting, and incident detection.
  • Working knowledge of MySQL/relational databases for investigative queries and data validation. Ability to read and analyze complex SQL queries to diagnose production data issues.
  • Familiarity with PHP, Python, Bash, or similar languages sufficient to read, debug, and modify production scripts and automation jobs.
  • Experience with Rundeck, cron, or equivalent batch job management and monitoring tools.

Requirements

  • Problem-Solving
  • Composure
  • Accountability
  • Detail-Oriented
  • Adaptability
  • Collaborative
  • Proactive
  • Communication
  • Results Orientation

Physical Job Requirements

  • Continuous viewing from and inputting data to a computer screen
  • Talking through the computer for many meetings and one-to-one conversations
  • Sitting for long periods of time
  • Travel required (<10%)

Benefits

  • Competitive & comprehensive benefit package including paid time off, medical, dental, vision, and 401k match (50% on the dollar up to 7% of employee contribution).
  • Compensation offered for this position will depend on qualifications, experience, and geographic location.
  • Total compensation package may also include commission, bonus or profit sharing.