Role Description
The GPU-focused Technical Account Manager (TAM) leads the post-sales technical success of customers deploying large-scale AI, training, inference, and high-performance GPU workloads on the companyβs platform. This includes customers using NVIDIA GPU clusters, AMD GPU clusters, GPU VMs, and rack-scale bare-metal environments.
You will act as a trusted advisor across:
-
LLM training
-
Fine-tuning
-
RAG workloads
-
Distributed training frameworks
-
Storage throughput requirements
-
Multi-GPU scaling
-
Performance tuning
This role requires deep technical fluency and exceptional customer management skills to help AI/ML teams achieve predictable, cost-efficient, high-performance outcomes.
Key Responsibilities
-
AI/GPU Onboarding & Workload Architecture
-
Lead onboarding for customers deploying GPU clusters (bare metal, VMs, or hybrid).
-
Advise on cluster design: multi-GPU topology, NVLink/NVSwitch considerations, RDMA, Infiniband and RoCE Ethernet, networking throughput, and storage IOPS requirements.
-
Guide customers in selecting GPU types and configurations based on workload (training, fine-tuning, inference, embeddings, RAG pipelines).
-
Support distributed frameworks: PyTorch, TensorFlow, DeepSpeed, Megatron, JAX, Ray, Mosaic, HuggingFace, etc.
-
Advanced hands-on Kubernetes skills.
-
Advanced hands-on SLURM skills.
-
Performance Optimization & Scaling
-
Identify bottlenecks (network, storage, memory bandwidth).
-
Provide tuning recommendations for batch size, mixed precision, parallelization strategies, and checkpointing.
-
Help customers evaluate cost vs. performance tradeoffs (GPU mix, CPU pairing, instance types, cluster sizing).
-
Technical Relationship Ownership
-
Own the long-term technical strategy across assigned GPU/AI accounts, including hyperscalers, labs, and high-growth AI startups.
-
Host recurring technical review meetings, roadmap reviews, and optimization sessions.
-
Define scaling plans, future GPU reservation needs, and capacity forecasting.
-
Incident & Escalation Management
-
Partner with Support, SRE, Networking, NOC, and Product Management & Engineering to resolve high-urgency incidents.
-
Manage outage communications, corrective action plans, and postmortem reviews with customers.
-
Advocate for GPU reliability improvements and influence roadmap priorities.
-
Account Growth & Expansion
-
Identify opportunities for expanded clusters, high-speed storage, or networking upgrades.
-
Support Sales with technical validation and architecture diagrams needed for expansion.
-
Customer Advocacy & Product Feedback
-
Provide structured feedback on existing and future GPU offerings, networking fabrics, storage platforms, and upcoming AI/ML platform features.
-
Partner with Product on early access programs (new GPUs, pipelines, orchestration, etc.).
Qualifications
-
2β5+ years as an AI/ML Engineer, AI/ML Ops, Technical Account Manager, HPC Engineer, Sales/Solutions Engineer or relevant technical role.
-
Strong knowledge of GPU hardware architectures (NVIDIA/AMD), CUDA/ROCm, distributed training, and ML frameworks.
-
Experience with Linux tuning, networking (Infiniband, RoCE fabrics).
-
Experience with high-performance storage systems (DDN, NetApp, Vast, Weka, etc.).
-
Ability to communicate complex concepts clearly to both executives and engineering teams.
-
Prior experience supporting hyperscale, AI labs, or large cluster deployments is a plus.
-
Cloud Native Computing Foundation Certified Kubernetes Administrator (CKA) certification is a plus.
Benefits
-
100% company-paid insurance premiums for employee medical, dental and vision plans.
-
401(k) plan that matches 100% up to 4%, with immediate vesting.
-
Professional Development Reimbursement of $2,500 each year.
-
11 Holidays + Paid Time Off Accrual + Rollover Plan.
-
Increased PTO at 3 year and 10 year anniversary + 1 month paid sabbatical every 5 years + Anniversary Bonus each year.
-
$500 stipend for remote office setup in first year + $400 each following year.
-
Internet reimbursement up to $75 per month.
-
Gym membership reimbursement up to $50 per month.
-
Company paid Wellable subscription.