[Hiring] Senior Site Reliability Engineer @Juul Labs
Salary usd 141,000 - 2..
Remote Location
🇺🇸 USA Only
Employment Type full-time
Posted 1wk ago

1wk ago - Juul Labs is hiring a remote Senior Site Reliability Engineer. 💸 Salary: USD 141,000 - 227,000 per year 📍 Location: USA

Role Description

A Senior Site Reliability Engineer (SRE) is expected to own the operational stability and performance of Juul's hybrid cloud infrastructure (Nutanix, AWS/GCP). This involves leading automation efforts, architecting for reliability, and acting as the final escalation point for critical incidents to ensure the platform is scalable and efficient.

Nutanix Platform Management

  • Design, deploy, and maintain enterprise-scale Nutanix AHV clusters and Prism Central for multi-cluster management.
  • Expert-level proficiency with Nutanix CLI (nCLI and acli) for advanced operations, troubleshooting, and automation.
  • Develop automation scripts using Nutanix REST APIs, Python SDK, PowerShell, and Terraform for infrastructure-as-code.
  • Create and manage VM templates, golden images, and standardized deployment catalogs for consistent provisioning.
  • Design disaster recovery solutions using Leap, Protection Domains, cross-cluster replication, and metro clustering.
  • Implement network micro-segmentation using Nutanix Flow and configure RBAC, encryption, and security hardening.
  • Lead L3 troubleshooting using advanced diagnostics, log analysis (CVM, Genesis), NCC health checks, and cluster service resolution.
  • Configure high availability, VM affinity rules, QoS policies, and optimize performance for mission-critical workloads.
  • Manage AHV networking with OVS bridges, VLANs, bonds, and LACP, and implement resource reservations and workload balancing.
  • Design, deploy, and maintain hybrid cloud infrastructure across Nutanix HCI, AWS, and GCP platforms.
  • Architect and implement multi-cloud solutions ensuring high availability, scalability, and disaster recovery.

Cloud Platform Engineering

  • Architect and deploy enterprise-scale, highly available multi-cloud solutions across AWS and GCP with multi-region/multi-account strategies.
  • Expert-level proficiency with AWS CLI, GCP CLI, SDK, boto3, and Python for advanced automation and infrastructure orchestration.
  • Design AWS Organizations and GCP Organization hierarchies with consolidated billing, IAM policies, and centralized governance.
  • Configure and manage AWS Systems Manager (SSM) including Session Manager, Run Command, State Manager, and Automation for centralized fleet operations.
  • Implement centralized logging using CloudWatch/CloudTrail and GCP Cloud Logging with S3/Cloud Storage aggregation.
  • Integrate AWS and GCP with Splunk using HEC, CloudWatch subscriptions, Pub/Sub, Dataflow, and cloud-specific add-ons for SIEM correlation.
  • Design and deploy advanced load balancing solutions with AWS ALB/NLB/ELB and GCP Cloud Load Balancing including SSL termination and auto-scaling.
  • Develop infrastructure-as-code using Terraform, CloudFormation, and CDK for repeatable multi-cloud deployments and CI/CD pipelines.
  • Configure AWS SSO, cross-account IAM roles, GCP Workload Identity, and federated access for centralized identity management.
  • Design VPC architectures with AWS Transit Gateway/PrivateLink and GCP Shared VPC/VPC peering for hybrid connectivity.
  • Manage containerized workloads using EKS, GKE, ECS, and Cloud Run with service mesh, observability, and security best practices.
  • Implement disaster recovery using AWS Backup, Cross-Region Replication, GCP snapshots, and multi-region failover strategies.
  • Lead L3 troubleshooting using CloudWatch Insights, GCP Cloud Trace, VPC Flow Logs, X-Ray, and vendor support escalation.
  • Perform cost optimization through Reserved Instances, Committed Use Discounts, rightsizing, and automated resource lifecycle management.

System Administration

  • Administer and support Windows Server and Unix/Linux environments in production and non-production settings.
  • Perform OS-level hardening, patch management, and security compliance across heterogeneous systems.
  • Automate routine administrative tasks using PowerShell, Bash, Python, or similar scripting languages.
  • Manage GitHub organization settings, user permissions, repository access controls, and monitor GitHub Actions workflows and repository health across multiple teams.
  • Configure Splunk forwarders, heavy forwarders and other integrations for data ingestion from cloud and on-premises sources.

Qualifications

  • 8-12+ years of infrastructure experience, including 8+ years with Nutanix HCI and enterprise cloud (AWS/GCP).
  • Expert-level skills in Python, PowerShell, Bash scripting, infrastructure-as-code (Terraform/CloudFormation), and container orchestration (Kubernetes, EKS/GKE).
  • Proven experience managing enterprise-scale environments, hybrid cloud migrations, disaster recovery, and L3 critical incident management.
  • Strong networking knowledge (TCP/IP, VLANs, routing, VPN), security hardening, and compliance frameworks (ITIL).
  • Strategic thinker with exceptional analytical and troubleshooting abilities for complex multi-layer infrastructure issues.
  • Excellent communication skills to translate technical concepts to executives and non-technical stakeholders.
  • Calm under pressure during critical outages with meticulous attention to security, compliance, and configuration management.
  • Self-motivated continuous learner committed to staying current with evolving cloud technologies and automation opportunities.
  • Available for on-call rotations with strong documentation skills and customer service orientation.

Requirements

  • Certifications (plus): Nutanix NCP/NCAP, AWS Solutions Architect Professional, AWS DevOps Professional, GCP Professional Cloud Architect, Terraform.

Benefits

  • People. Work with talented, committed and supportive teammates.
  • Equity and performance bonuses. Every employee is a stakeholder in our success.
  • Cell phone subsidy, commuter benefits and discounts on JUUL products.
  • Excellent medical, dental and vision, disability, and life insurance, plus family support, wellness, legal, and employee assistance program benefits.
  • 401(k) plan with company matching.
  • Plus biannual discretionary performance bonuses.