Director of DevOps @Kipu Systems
DevOps / Sysadmin
Salary: unspecified
Location: Remote, 🇺🇸 USA only
Job type: full-time
Posted: yesterday

Role Description

We’re looking for a Director of DevOps and CloudOps to own the reliability, security, and scalability of Kipu’s platform infrastructure across AWS and Azure. You’ll lead the team responsible for keeping our systems running, our deployments fast and safe, and our infrastructure ready to support the next phase of Kipu’s growth. This is a hands-on leadership role—you write scripts, build automation, maintain and upgrade internal tools, and architect solutions alongside your team every day.

Kipu is the leading technology platform for behavioral health, operating in a HIPAA-regulated environment where uptime and data security are non-negotiable. You’ll manage two established teams and individual contributors—approximately 13 engineers across DevOps Engineering, DevOps Production Support, and specialized infrastructure roles. A critical part of this role is enabling the broader engineering organization: multiple product teams are spinning up new services at a fast pace, and your team is responsible for standing up all CI/CD pipelines, container orchestration (Kubernetes, AWS ECS), cloud infrastructure, and observability for every new service that ships. You’ll partner closely with engineering, product, and security to ensure our cloud infrastructure is a competitive advantage, not a constraint.

What you’ll do

  • Infrastructure strategy and operations:
    • Own Kipu’s cloud infrastructure strategy across AWS and Azure, including architecture decisions, cost optimization, and capacity planning.
    • Drive reliability and availability targets, establishing and maintaining SLAs/SLOs that align with customer and business expectations.
    • Lead incident response, root cause analysis, and post-incident review processes to continuously improve system resilience.
    • Manage infrastructure budgets and optimize cloud spend without sacrificing performance or security.
    • Write Python, Bash, and other scripts daily to automate operations, solve problems, and improve workflows. Own and evolve infrastructure-as-code (Terraform, CDK, Ansible).
    • Maintain, upgrade, and develop internal DevOps applications and automation tools used across the organization.
  • CI/CD and release engineering:
    • Design and maintain CI/CD pipelines (Jenkins, GitHub Actions) that enable engineering teams to ship with speed and confidence.
    • Establish release engineering standards, including deployment strategies (blue-green, canary, feature flags) and rollback procedures.
    • Reduce build times, flaky tests, and deployment friction across the engineering organization.
    • Serve as the infrastructure partner for product engineering teams spinning up new services—own the process of onboarding each service into CI/CD, container platforms (Kubernetes, ECS), and cloud infrastructure.
    • Drive standardization of service deployment patterns, infrastructure templates, and operational runbooks across all teams.
  • Security, compliance, and governance:
    • Ensure infrastructure meets HIPAA, SOC 2, and other regulatory requirements, partnering with security and compliance teams on audits and remediation.
    • Implement and enforce infrastructure security best practices, including network segmentation, IAM policies, secrets management, and encryption at rest and in transit.
    • Maintain disaster recovery and business continuity plans, including regular testing and validation.
    • Own security risk identification, assessment, and remediation across the infrastructure—proactively identify vulnerabilities and drive fixes across cloud resources.
    • Manage security patching, hardening, and compliance remediation at scale across AWS and Azure environments.
  • Observability and platform reliability:
    • Build and evolve Kipu’s observability stack: monitoring, alerting, dashboards, logging, and distributed tracing (Datadog, CloudWatch, Azure Monitor).
    • Establish a data-driven approach to reliability, using SLIs and error budgets to balance velocity with stability.
    • Proactively identify and address infrastructure risks before they become customer-facing incidents.
    • Design and enforce observability standards for every new service—ensure teams ship with proper metrics, logging, and alerting from day one.
    • Provide production support and operational guidance to other engineering teams across the organization.
  • Team leadership:
    • Lead and mentor two managers and their teams, plus direct IC reports (~13 total headcount), fostering a culture of ownership, accountability, and continuous improvement.
    • Define team structure, hiring plans, and career development paths as the organization scales.
    • Collaborate cross-functionally with engineering, product, and security leadership to align infrastructure priorities with business goals.

Qualifications

  • 8+ years of experience in DevOps, CloudOps, SRE, or infrastructure engineering, with at least 3 years leading teams.
  • Deep expertise in AWS (EC2, ECS/EKS, RDS, S3, Lambda, VPC, IAM, CloudWatch, Secrets Manager), including networking, compute, storage, and cost optimization.
  • Strong background in CI/CD pipeline design, release engineering, and deployment automation.
  • Experience operating infrastructure in a HIPAA-compliant or similarly regulated environment.
  • Proven track record building and maintaining observability stacks (monitoring, alerting, logging, tracing).
  • Infrastructure-as-code fluency: Terraform (required) and AWS CDK, plus familiarity with CloudFormation or Pulumi. Experience with configuration management tools (Ansible preferred).
  • Experience managing containerized workloads at scale (Kubernetes, ECS, or similar).
  • Demonstrated ability to recruit, develop, and retain strong infrastructure engineering talent.
  • Working experience with Azure cloud services (Azure DevOps, AKS, Azure Monitor, or equivalent).
  • Strong scripting, coding, and automation skills in Python and Bash—you write code daily, not occasionally.
  • Experience building, maintaining, and upgrading internal tools and applications.
  • Experience with security risk management, vulnerability remediation, and compliance-driven patching across cloud infrastructure at scale.
  • Demonstrated ability to manage managers and lead through others while remaining technically engaged.
  • High personal integrity, strong work ethic, and a commitment to doing the right thing under pressure.

Nice to have

  • Experience with Azure, particularly in a multi-cloud or hybrid environment.
  • Healthcare SaaS or multi-tenant platform experience.
  • SOC 2 or HITRUST audit experience, including evidence collection and control implementation.
  • Background supporting data-intensive or AI/ML infrastructure workloads.
  • Experience leading platform migrations or major infrastructure modernization efforts.
  • Familiarity with FinOps practices and cloud cost governance at scale.
  • Experience with PostgreSQL administration and performance tuning.
  • Familiarity with Datadog, Grafana, PagerDuty, and building observability-as-code.
  • AWS certifications (Solutions Architect Professional, DevOps Engineer Professional) or Azure equivalents.
  • Experience with service mesh, API gateways, or zero-trust networking models.
  • Familiarity with Ruby on Rails, Node.js, or Spring Boot application ecosystems (the services your team will support).

Leadership qualities and culture fit

  • Leads by example—rolls up their sleeves and works alongside the team, not from a distance.
  • Takes ownership and accountability for outcomes, not just tasks.
  • Operates with high ethical standards and transparency in all decisions.
  • Demonstrates commitment and reliability—follows through on promises and is present when it matters.
  • Builds trust through technical credibility and genuine care for team growth.
  • Communicates directly and honestly, escalating risks and issues proactively.