Systems Architect AI/ML Infrastructure @Deepgram
AI / ML
Salary usd 160,000 - 2..
Remote Location
🇺🇸 USA Only
Job Type full-time
Posted 3d ago

[Hiring] Systems Architect AI/ML Infrastructure @Deepgram

3d ago - Deepgram is hiring a remote Systems Architect AI/ML Infrastructure. 💸 Salary: USD 160,000 - 220,000 per year 📍 Location: USA

Role Description

Deepgram's infrastructure spans bare metal GPU clusters, multi-cloud deployments, and global edge presence -- all serving real-time voice AI at massive scale while simultaneously powering large-scale model training. As a Systems Architect, you will own the end-to-end infrastructure architecture that makes this possible. You will:

  • Define and drive the end-to-end infrastructure architecture for Deepgram's AI/ML workloads across production inference and research training.
  • Design multi-cloud and hybrid infrastructure strategies that balance performance, reliability, cost, and vendor flexibility.
  • Architect compute orchestration systems that efficiently schedule and manage GPU and CPU workloads across heterogeneous infrastructure.
  • Design storage architectures that handle the massive datasets required for speech and audio ML -- from high-throughput training data pipelines to low-latency model serving.
  • Lead capacity planning across all infrastructure dimensions, modeling growth and ensuring Deepgram can scale ahead of demand.
  • Drive cost optimization and FinOps practices, identifying opportunities to reduce infrastructure spend without compromising performance or reliability.
  • Design burstable, elastic training infrastructure that can scale up for large training runs and scale down to minimize idle cost.
  • Architect research compute infrastructure that gives ML teams the resources they need while maintaining operational efficiency.
  • Establish architectural standards, design review processes, and technical documentation practices for infrastructure decisions.
  • Collaborate with engineering leadership to align infrastructure strategy with product roadmap and business objectives.
  • Evaluate emerging hardware, cloud services, and infrastructure technologies for potential adoption.

Qualifications

  • 7+ years of experience in infrastructure engineering, systems architecture, or a senior technical role focused on large-scale infrastructure.
  • Proven experience designing multi-cloud architectures spanning AWS and at least one other major cloud provider or on-premises environment.
  • Deep expertise in storage system design -- block, object, and file storage, including performance tuning for large-scale data workloads.
  • Strong experience with compute orchestration using Kubernetes, and an understanding of how to schedule diverse workloads efficiently.
  • Hands-on experience with GPU infrastructure -- procurement considerations, cluster design, driver and runtime management.
  • Track record of capacity planning and infrastructure scaling for high-growth environments.
  • Ability to communicate complex architectural decisions clearly to both technical and non-technical stakeholders.
  • Strong understanding of networking fundamentals as they relate to infrastructure architecture.

Requirements

  • Direct experience architecting infrastructure for ML training workloads -- distributed training, large dataset management, experiment infrastructure.
  • Background in cost optimization and FinOps practices for large-scale cloud and bare metal infrastructure.
  • Experience operating and managing bare metal infrastructure in colocation facilities.
  • Expertise in network architecture design, including high-bandwidth GPU interconnects and global traffic routing.
  • Experience with infrastructure modeling and simulation for capacity planning.
  • Familiarity with Slurm, Ray, or other HPC/ML job scheduling systems.
  • Understanding of power, cooling, and physical infrastructure considerations for GPU-dense deployments.

Benefits

  • Holistic health: Medical, dental, vision benefits.
  • Annual wellness stipend.
  • Mental health support.
  • Life, STD, LTD Income Insurance Plans.
  • Unlimited PTO.
  • Generous paid parental leave.
  • Flexible schedule.
  • 12 Paid US company holidays.
  • Quarterly personal productivity stipend.
  • One-time stipend for home office upgrades.
  • 401(k) plan with company match.
  • Tax Savings Programs.
  • Learning / Education stipend.
  • Participation in talks and conferences.
  • Employee Resource Groups.
  • AI enablement workshops / sessions.
Before You Apply

🇺🇸 Be aware of the location restriction for this remote position: USA Only
‼ Beware of scams! When applying for jobs, you should NEVER have to pay anything.