Principal Machine Learning Engineer, ML Platform @Shippo
AI / ML
Salary usd 212,000 - 2..
Remote Location
🇺🇸 USA Only
Job Type full-time
Posted 2wks ago

[Hiring] Principal Machine Learning Engineer, ML Platform @Shippo

2wks ago - Shippo is hiring a remote Principal Machine Learning Engineer, ML Platform. 💸 Salary: usd 212,000 - 287,000 per year 📍 Location: USA

Role Description

  • Set technical strategy and drive a multi-quarter roadmap for ML platform capabilities aligned to Shippo’s business priorities.
  • Own cross-team architecture decisions, RFCs, and design reviews for ML lifecycle and inference.
  • Raise the engineering bar through mentorship, production readiness standards, and reusable platform primitives.
  • Be accountable for platform adoption, reliability, and cost-performance outcomes.
  • Build and operate core ML platform components:
    • ML lifecycle foundation (experiment tracking, reproducibility, artifact management, model registry, versioning, and controlled promotion workflows using MLflow or equivalent).
    • Training and experimentation enablement (standardized environments, reusable pipelines/templates, evaluation harnesses, and repeatable workflows that let data scientists move from exploration to production with confidence).
    • Kubernetes-native model serving for real-time inference (safe rollout and rollback, autoscaling, reliability practices, and cost controls).
    • Batch inference and scoring pipelines (repeatable backfills, retraining triggers, consistent packaging between training and inference).
    • Observability for ML systems (service health metrics, alerting, and model-quality signals such as drift and data quality).
    • Developer experience (templates, reference implementations, documentation, and self-service workflows).
  • Evaluate and recommend inference frameworks and deployment patterns, and document tradeoffs for Shippo’s workloads.
  • Identify and resolve performance bottlenecks across the inference stack (model runtime, compute utilization, networking, serialization, and autoscaling behavior).
  • Establish ML engineering standards across training, evaluation, testing, model packaging, CI/CD, production readiness, and incident response.
  • Partner with Data Science teams to bridge research and production environments by creating repeatable frameworks, shared standards for code quality and reproducibility, and self-serve paths to deploy models safely.
  • Collaborate with Data and Engineering teams to ensure the platform supports real workflows, drives adoption, and meets reliability expectations.
  • Mentor engineers through design reviews, architecture guidance, and shared best practices across platform and ML development.

Qualifications

  • 15+ years of software engineering experience, including ownership of production systems (platform, infrastructure, or distributed systems).
  • 4+ years owning ML systems end-to-end in production, including on-call and incident response, and making architecture decisions based on operational constraints (latency, throughput, availability, and cost).
  • Strong experience building and running services on Kubernetes, including deployments, autoscaling, and observability.
  • Hands-on experience with ML lifecycle tooling such as MLflow or equivalent (tracking, registry, packaging, and promotion workflows).
  • Demonstrated ability to evaluate inference tradeoffs across batch and real-time serving, CPU versus GPU, latency and throughput, cost, and operational complexity.
  • Demonstrated Principal-level technical leadership, including setting technical direction, driving cross-team alignment via RFCs/design reviews, and delivering multi-quarter roadmaps.
  • Proven ownership of reliability and operational outcomes for production systems (SLOs, incident response, and measurable improvements in stability and performance).
  • Demonstrated ability to ship incrementally, prioritize production reliability over perfect solutions, and drive adoption through pragmatic platform design.
  • Experience working with or evaluating managed ML platforms (Databricks, SageMaker, Vertex AI, or similar), with clear judgement on strengths, limitations, and build-vs-buy decisions.

Bonus

  • Databricks experience (useful, not required), including Databricks workflows and ML tooling integration.
  • Experience with inference and serving frameworks.
  • Experience with feature store patterns, online and offline consistency, and model evaluation at scale.
  • Experience supporting optimization systems and decision engines in production.
  • LLM or agent workflow experience, especially evaluation harnesses, deployment patterns, guardrails, and monitoring.

Benefits

  • Healthcare coverage for medical, dental, and vision (90% covered by the company, incl. dependents). Pet coverage is also available!
  • Take-as-much-as-you-need vacation policy & flexible working hours.
  • One week-long company-wide winter slowdown.
  • 3 Volunteer Days Off (VTOs).
  • WFH stipend to set up your home office.
  • Charity donation match up to $100.
  • Dedicated programs, coaching, tools, and resources for your professional and career growth as well as an individual learning stipend for your personal and focused growth.
  • Fun in-person team time through our Shippos Everywhere program, which includes regular team and company off-sites throughout the year as well as local Shippos gatherings.

Before You Apply

🇺🇸 Be aware of the location restriction for this remote position: USA Only
‼ Beware of scams! When applying for jobs, you should NEVER have to pay anything.