Collaboration Reliability Engineering Lead @EOS

Mar 20, 2025 - EOS is hiring a remote Collaboration Reliability Engineering Lead. 💸 Salary: $135,000 - $150,000 USD. 📍 Location: USA.

This description summarizes our understanding of the job description. Click the 'Apply' button to find out more.

Role Description

We are seeking an experienced and technically proficient Collaboration Reliability Engineering Lead to join our team. In this role, you will support advanced collaboration technologies in a fast-paced and industry-leading environment. The ideal candidate is a highly motivated technical enthusiast with a strong foundation in IT, operations, networking, scripting, and collaboration technologies, and a passion for continuous learning.

  • Lead, mentor, and manage a global team of 8-12 reliability engineers.
  • Foster ownership, accountability, and collaboration within the team.
  • Develop team members' technical and professional skills through coaching and performance reviews.
  • Oversee maintenance of highly available and scalable architecture, including but not limited to Cisco server templates, endpoints, and edge and proxy appliances.
  • Develop, present, and achieve service-level objectives (SLOs), service-level agreements (SLAs), and key performance indicators (KPIs).
  • Perform quality assurance on video conferencing infrastructure, calendar tooling, touch panel hardware, automation bots, Cisco endpoints, and call center tooling.
  • Drive incident response, root cause analysis, and post-mortem processes to identify and address reliability issues impacting users.
  • Implement proactive monitoring, alerting, and automation to minimize downtime and improve recovery times in live production environments.
  • Serve as an escalation point for video conferencing infrastructure and network troubleshooting, maintaining up-to-date documentation and on-call runbooks.
  • Identify opportunities to improve system performance and reduce operational toil.
  • Develop and implement strategies for failure testing and future capacity planning.
  • Work closely with engineering, security, networking, and third-party vendors (e.g., Cisco, BrightSign, Arista, Zoom, Webex) to resolve support cases and critical escalations.
  • Provide highly visible communications to hundreds of users regarding large-scale changes and updates.
  • Advocate for reliability-focused initiatives and communicate their value to stakeholders.
  • Leverage internal tooling to monitor, analyze, and improve system reliability.
  • Lead efforts to automate repetitive tasks, ensuring efficient system operations.

Qualifications

  • 3+ years of experience in Reliability Engineering or similar roles.
  • Health Monitoring: Experience implementing and coordinating telemetry using monitoring tools like Splunk, Grafana, and Prometheus, or similar technologies.
  • VMware Expertise: Hands-on experience with VMware, covering VM deployment, lifecycle management, and API/CLI usage.
  • ITIL Knowledge: Understanding of ITIL processes, service management principles, and IT service delivery best practices.
  • Automation: Experience as an automation advocate with a history of removing operational toil via software.
  • Experience supporting internet-facing production services and distributed systems, including deployments, on-call rotations, and incident management.

Requirements

  • Familiarity with Bash, Python, Terraform, and REST APIs.
  • Fundamental understanding of networking protocols (e.g., HTTP, TCP/IP, WebRTC, SIP).
  • Fundamental understanding of infrastructure components (e.g., load balancers, firewalls, DNS).
  • Expertise in disaster recovery and future capacity planning.
  • Excellent communication and interpersonal skills, with the ability to work effectively in a team-oriented environment.
  • Self-motivated and eager to learn new technologies, tools, and methodologies.
  • Experience with collaboration hardware, collaboration platforms (e.g., Zoom, Microsoft Teams, Webex), or media delivery networks.

Benefits

  • The EOS pay range for this job is a general guideline only and not a guarantee of compensation or salary.
  • Additional factors considered in extending an offer include (but are not limited to) location, responsibilities of the job, experience, education, knowledge, skills, and abilities, as well as internal equity, market data, or other laws.
