Senior Site Reliability Engineer @Juul Labs

Salary: USD 141,000 - 227,000 per year
Location: Remote, USA only
Employment type: Full-time
Posted: 1wk ago

Role Description

A Senior Site Reliability Engineer (SRE) is expected to own the operational stability and performance of Juul’s hybrid cloud infrastructure (Nutanix, AWS/GCP). This involves leading automation efforts, architecting for reliability, and acting as the final escalation point for critical incidents to ensure the platform is scalable and efficient.

Nutanix Platform Management

Design, deploy, and maintain enterprise-scale Nutanix AHV clusters and Prism Central for multi-cluster management.
Apply expert-level proficiency with the Nutanix CLIs (nCLI and aCLI) to advanced operations, troubleshooting, and automation.
Develop automation scripts using Nutanix REST APIs, Python SDK, PowerShell, and Terraform for infrastructure-as-code.
Create and manage VM templates, golden images, and standardized deployment catalogs for consistent provisioning.
Design disaster recovery solutions using Leap, Protection Domains, cross-cluster replication, and metro clustering.
Implement network micro-segmentation using Nutanix Flow and configure RBAC, encryption, and security hardening.
Lead L3 troubleshooting using advanced diagnostics, log analysis (CVM, Genesis), NCC health checks, and cluster service resolution.
Configure high availability, VM affinity rules, QoS policies, and optimize performance for mission-critical workloads.
Manage AHV networking with OVS bridges, VLANs, bonds, and LACP, and implement resource reservations and workload balancing.
Design, deploy, and maintain hybrid cloud infrastructure across Nutanix HCI, AWS, and GCP platforms.
Architect and implement multi-cloud solutions ensuring high availability, scalability, and disaster recovery.

Cloud Platform Engineering

Architect and deploy enterprise-scale, highly available multi-cloud solutions across AWS and GCP with multi-region/multi-account strategies.
Apply expert-level proficiency with the AWS CLI, GCP CLI, SDKs, boto3, and Python to advanced automation and infrastructure orchestration.
Design AWS Organizations and GCP Organization hierarchies with consolidated billing, IAM policies, and centralized governance.
Configure and manage AWS Systems Manager (SSM) including Session Manager, Run Command, State Manager, and Automation for centralized fleet operations.
Implement centralized logging using CloudWatch/CloudTrail and GCP Cloud Logging with S3/Cloud Storage aggregation.
Integrate AWS and GCP with Splunk using HEC, CloudWatch subscriptions, Pub/Sub, Dataflow, and cloud-specific add-ons for SIEM correlation.
Design and deploy advanced load balancing solutions with AWS ALB/NLB/ELB and GCP Cloud Load Balancing including SSL termination and auto-scaling.
Develop infrastructure-as-code using Terraform, CloudFormation, and CDK for repeatable multi-cloud deployments and CI/CD pipelines.
Configure AWS SSO, cross-account IAM roles, GCP Workload Identity, and federated access for centralized identity management.
Design VPC architectures with AWS Transit Gateway/PrivateLink and GCP Shared VPC/VPC peering for hybrid connectivity.
Manage containerized workloads using EKS, GKE, ECS, and Cloud Run with service mesh, observability, and security best practices.
Implement disaster recovery using AWS Backup, Cross-Region Replication, GCP snapshots, and multi-region failover strategies.
Lead L3 troubleshooting using CloudWatch Insights, GCP Cloud Trace, VPC Flow Logs, X-Ray, and vendor support escalation.
Perform cost optimization through Reserved Instances, Committed Use Discounts, rightsizing, and automated resource lifecycle management.

System Administration

Administer and support Windows Server and Unix/Linux environments in production and non-production settings.
Perform OS-level hardening, patch management, and security compliance across heterogeneous systems.
Automate routine administrative tasks using PowerShell, Bash, Python, or similar scripting languages.
Manage GitHub organization settings, user permissions, repository access controls, and monitor GitHub Actions workflows and repository health across multiple teams.
Configure Splunk forwarders, heavy forwarders, and other integrations for data ingestion from cloud and on-premises sources.

Qualifications

8-12+ years of infrastructure experience, including 8+ years with Nutanix HCI and enterprise cloud (AWS/GCP).
Expert-level skills in Python, PowerShell, Bash scripting, infrastructure-as-code (Terraform/CloudFormation), and container orchestration (Kubernetes, EKS/GKE).
Proven experience managing enterprise-scale environments, hybrid cloud migrations, disaster recovery, and L3 critical incident management.
Strong networking knowledge (TCP/IP, VLANs, routing, VPN), security hardening, and compliance frameworks (ITIL).
Strategic thinker with exceptional analytical and troubleshooting abilities for complex multi-layer infrastructure issues.
Excellent communication skills to translate technical concepts to executives and non-technical stakeholders.
Calm under pressure during critical outages with meticulous attention to security, compliance, and configuration management.
Self-motivated continuous learner committed to staying current with evolving cloud technologies and automation opportunities.
Available for on-call rotations with strong documentation skills and customer service orientation.

Requirements

Certifications (a plus): Nutanix NCP/NCAP, AWS Solutions Architect Professional, AWS DevOps Professional, GCP Professional Cloud Architect, Terraform.

Benefits

People. Work with talented, committed and supportive teammates.
Equity and performance bonuses. Every employee is a stakeholder in our success.
Cell phone subsidy, commuter benefits and discounts on JUUL products.
Excellent medical, dental and vision, disability, and life insurance, plus family support, wellness, legal, and employee assistance program benefits.
401(k) plan with company matching.
Plus biannual discretionary performance bonuses.
