Role Description
In this role, you will focus on MLOps, supporting cross-functional teams in designing, deploying, and operating machine learning solutions while building scalable infrastructure, tools, and best practices across the Machine Learning Engineering (MLE) ecosystem.
What You’ll Do
- Collaborate with Data Scientists and Engineers across the full ML lifecycle, including building and scaling ETL pipelines, deploying models into customer-facing applications, and enabling efficient model development through cloud infrastructure and tooling.
- Design, build, and maintain scalable machine learning infrastructure, including model serving (real-time and batch), training environments, and orchestration systems, with a focus on performance, scalability, and cost efficiency.
- Contribute to the roadmap for Machine Learning Engineering and Data Science tools, including developing reusable frameworks and standardized solutions to streamline model implementation.
- Partner with and support Data Scientists by enabling effective use of cloud-based tools and infrastructure, and providing technical expertise across the ML lifecycle.
- Collaborate with machine learning engineers to share knowledge, improve best practices, and foster a culture of continuous learning and development.
- Support the development and maintenance of monitoring, alerting, and automated testing frameworks to ensure the reliability, performance, and integrity of data pipelines, models, and infrastructure.
- Develop, document, and communicate implementations and best practices across the data science lifecycle.
- Manage and communicate cloud infrastructure costs and budgets to project stakeholders.
- Stay current with GCP services and evolving best practices in Machine Learning Engineering and MLOps.
- Additional tasks may be assigned.
Qualifications
- Experience in MLOps or DevOps practices, including building and operating production ML systems using Docker, Kubernetes, CI/CD pipelines, Git-based version control, API development, model serving (batch and real-time), and automated testing frameworks.
- Bachelor’s degree in Data Science, Computer Science, Statistics, Applied Mathematics, or an equivalent quantitative field.
- Experience working with Data Scientists to deploy, scale, and operationalize machine learning models in production environments.
- 3+ years of experience as a Machine Learning Engineer with a proven track record of successful project delivery.
- In-depth knowledge of a cloud platform, preferably Google Cloud Platform (GCP), particularly Vertex AI, BigQuery, and Dataproc.
- Extensive expertise with CI/CD and infrastructure-as-code (IaC) best practices.
- Extensive knowledge of distributed computing and big data technologies such as Spark, Kubeflow, Airflow, and SQL.
- Extensive expertise in Python and machine learning libraries (e.g., TensorFlow, PyTorch, scikit-learn).
- Experience working in Agile environments with an emphasis on iterative development and continuous delivery.
Requirements
- Master’s Degree (Preferred).
- Proficiency in Java or other languages (Preferred).
- Retail experience (Preferred).
- E-commerce experience (Preferred).
- 5+ years of experience in Machine Learning (Preferred).
- Experience with optimization techniques and tools (e.g., Gurobi, linear programming, mixed-integer programming) (Preferred).
- Experience working with agent-based or agentic AI systems, including orchestration of autonomous workflows or LLM-driven agents (Preferred).
Essential Functions
- Ability to perform the accountabilities listed in the “What You’ll Do” section.
- Ability to comply with dress code requirements.
- Basic math and reading skills, legible handwriting, and basic computer operation.
- Ability to maintain prompt and regular attendance and meet scheduling requirements as set by the company.
- Ability to learn and comply with all company policies, procedures, standards, and guidelines.
- Ability to receive, understand, and proactively respond to direction from leadership and other company personnel.
- Ability to work as part of a team and interact effectively and appropriately with others.
- Ability to maintain composure and work in a fast-paced environment while accomplishing multiple tasks within established timeframes.
- Ability to satisfactorily complete company training programs.
- Ability to use a personal computer for tasks such as communicating and preparing reports.
- Ability to plan, prioritize, and monitor activities across business units.
- Ability to complete or oversee the completion of assigned projects in a timely manner.