[Hiring] Solutions Applied Data Scientist @Protege
Data and Analytics
Salary: unspecified
Remote Location: 🇺🇸 USA Only
Employment Type: full-time
Posted: 3d ago

3d ago - Protege is hiring a remote Solutions Applied Data Scientist. 💸 Salary: unspecified 📍 Location: USA

Role Description

We are hiring a Solutions Applied Data Scientist to help design, construct, and validate complex healthcare data cohorts used for AI model training. This role sits within the delivery organization, working closely with Solutions Leads and delivery engineers to solve complex data challenges that arise during customer projects.

Solutions Leads own the customer relationship and overall delivery of projects. The Solutions Applied Data Scientist serves as their technical partner for more complex data problems, including:

  • Cohort construction
  • Multi-source dataset assembly
  • Feasibility analysis
  • Data validation

You will help translate research generated by Protege's Data Lab and customer requirements into practical dataset definitions, determine whether those requirements can be met with available data, and build the SQL and analysis needed to construct the resulting datasets. You will also collaborate with delivery engineers when solutions require changes to data pipelines, infrastructure, or large-scale data movement.

This is a highly applied role focused on solving real-world dataset challenges, not research or model development. The ideal candidate is someone who enjoys solving messy real-world data problems, working directly with large healthcare datasets, writing complex SQL, and collaborating closely with cross-functional teams.

Qualifications

  • Experience working with large structured healthcare datasets
  • Strong SQL skills and experience writing complex queries
  • Experience using Claude Code / Codex
  • Experience joining and transforming large datasets
  • Experience performing data validation and exploratory analysis
  • Strong Python skills for data analysis and scripting
  • Experience working with structured file formats (CSV, Parquet, etc.)
  • Ability to translate ambiguous requirements into concrete data logic
  • Strong communication skills and ability to collaborate with technical and non-technical stakeholders

Requirements

  • Act as a technical partner during delivery projects to solve complex data challenges
  • Work collaboratively with Solutions Leads to unblock delivery challenges
  • Write complex SQL queries to construct cohorts
  • Implement inclusion and exclusion logic
  • Join datasets across multiple data sources
  • Validate linkage between datasets
  • Identify and resolve inconsistencies or missing fields
  • Perform data completeness analysis
  • Investigate missing or anomalous data
  • Review dataset requests from AI researchers and model development teams
  • Help clarify and refine requirements for model training or evaluation datasets
  • Assess feasibility of requested cohort definitions given real-world data constraints

Benefits

  • Opportunity to work in a fast-paced, high-impact environment
  • Collaborative culture with a focus on innovation
  • Access to world-class investors and partnerships

Company Description

We are building Protege to solve the biggest unmet need in AI: getting access to the right training data. The process today is time-intensive, incredibly expensive, and often ends in failure. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.

Solving AI's data problem is a generational opportunity. We're backed by world-class investors and already powering partnerships with some of the most ambitious teams in AI. The company that succeeds will be one of the largest in AI, and in tech.

We're a lean, fast-moving, high-trust team of builders who are obsessed with velocity and impact. Our culture is built for people who thrive on ambiguity, own outcomes, and want to shape the future of data and AI.
