Role Description
YPO is seeking a Senior / Lead DevOps Engineer to design, build, and operate the cloud infrastructure and developer platform that will power its next generation of products. This is a hands-on technical leadership role spanning the full DevOps surface (cloud infrastructure, CI/CD pipelines, release engineering, observability, platform reliability, and developer experience), all in service of a rapidly scaling AI-first mobile platform.
You will be a close partner to the Director of Product, Lead Security Engineer, and mobile engineering leadership, connecting platform reliability to product velocity and security posture in equal measure. You will bring strong technical depth, a product-minded approach to internal tooling, and the communication skills to champion engineering excellence across the organisation.
Key Responsibilities
- Cloud Infrastructure Design and Operations
  - Own the architecture and day-to-day operation of YPO's cloud infrastructure across its full lifecycle.
  - Architect, implement, and continuously evolve YPO's cloud infrastructure across AWS, Azure, and/or GCP.
  - Design and manage multi-region, highly available environments for a 35,000+ member global community.
  - Own cloud cost management and FinOps practices.
  - Lead the evaluation and adoption of new cloud services, platforms, and tooling.
  - Manage DNS, CDN, load balancing, and networking configurations across cloud environments.
Infrastructure as Code and Automation
-
Lead YPO's Infrastructure as Code practice using Terraform.
-
Define and enforce IaC standards, module structures, and governance practices.
-
Automate environment provisioning, teardown, and configuration management.
-
Build and maintain automation pipelines for routine operational tasks.
-
Write clean, well-tested automation scripts in Python, Bash, or equivalent.
-
CI/CD Pipeline Design and Release Engineering
-
Design, build, and maintain end-to-end CI/CD pipelines for YPO's mobile, backend API, AI platform, and data engineering workloads.
-
Implement branch strategies, environment promotion workflows, and feature flagging patterns.
-
Integrate automated quality gates as non-negotiable steps in every pipeline.
-
Lead the adoption of progressive delivery techniques.
-
Own release documentation, change management workflows, and deployment runbooks.
-
Container Orchestration and Platform Engineering
-
Design, operate, and continuously improve YPO's container orchestration infrastructure using Kubernetes.
-
Manage container image governance.
-
Implement and maintain service mesh, ingress controllers, and network policies.
-
Evaluate and adopt platform engineering tools that improve developer self-service.
-
Lead the migration, decomposition, or consolidation of existing services.
-
Observability, Monitoring, and Site Reliability
-
Design and implement a comprehensive observability stack.
-
Define and enforce SLOs, SLIs, and error budgets across YPO's platform services.
-
Build and maintain dashboards, alerting rules, and on-call runbooks.
-
Lead blameless post-mortem processes following significant incidents.
-
Own capacity planning and performance benchmarking for the AI-first mobile platform.
-
DevSecOps and Compliance Automation
-
Partner with the Lead Security Engineer to embed security controls throughout the CI/CD pipeline.
-
Implement and maintain secrets management solutions.
-
Enforce cloud security baselines using policy-as-code frameworks.
-
Support SOC 2, ISO 27001, and other compliance programmes.
-
Manage network security controls across cloud environments.
-
Developer Experience and Technical Leadership
-
Own the internal developer experience.
-
Define and document engineering standards for environment configuration.
-
Mentor and up-level junior engineers and platform contributors.
-
Act as a cross-functional bridge between product, mobile engineering, AI/data engineering, and security.
-
Contribute to technology investment decisions.
Qualifications
- 5+ years of hands-on experience in DevOps, platform engineering, or site reliability engineering.
- Deep expertise with at least one major cloud provider (AWS strongly preferred).
- Infrastructure as Code proficiency: Terraform is required.
- CI/CD experience: hands-on design and operation of pipelines.
- Strong Kubernetes experience.
- Proficiency in Python for automation and tooling.
- Solid understanding of networking fundamentals.
- Experience implementing observability solutions.
- Practical knowledge of container security and cloud IAM patterns.
- Strong communication skills.
- Demonstrated ability to operate with autonomy in a fast-moving environment.
Preferred
- Experience supporting native iOS and/or Android mobile release pipelines.
- Familiarity with AI/ML infrastructure.
- Experience with platform engineering tools.
- Exposure to FinOps tooling and cloud cost optimisation.
- Experience with multi-region, active-active deployment architectures.
- Prior experience in a global SaaS or membership platform.
Relevant Certifications (Valued)
- AWS Certified DevOps Engineer - Professional.
- AWS Certified Solutions Architect - Professional.
- Microsoft Certified: DevOps Engineer Expert (AZ-400).
- Google Professional Cloud DevOps Engineer.
- Certified Kubernetes Administrator (CKA).
- Certified Kubernetes Application Developer (CKAD).
- HashiCorp Certified: Terraform Associate.
Additional Information
- Travel: 10–15% (domestic & international).
- Flexible hours to support global teams.
- EOE: YPO is an Equal Opportunity Employer.