Senior Data Engineer @Capgemini
Software Development
Salary unspecified
Remote Location
Job Type full-time
Posted 2d ago

[Hiring] Senior Data Engineer @Capgemini

2d ago - Capgemini is hiring a remote Senior Data Engineer. 💸 Salary: unspecified 📍Location: Worldwide

Role Description

We are looking for a Data Engineer to design, build, and operate production data pipelines and platforms that support large-scale AI and ML workloads. The role focuses on end-to-end data lifecycle management, AWS-based infrastructure, and collaboration with ML and data teams.

Qualifications

  • Bachelor’s or master’s degree in computer science, data engineering, software engineering, or related field
  • 2-3+ years of experience building production data pipelines and data platforms for AI or ML systems
  • Strong proficiency in Python, C++ and distributed data processing frameworks
  • Hands-on experience with AWS services including S3, EC2, SageMaker, and Glue
  • Experience designing data systems that support large-scale ML training and experimentation
  • Knowledge of data governance, access control, and lifecycle management
  • Experience working with ML, data science, operations, and cloud engineering teams

Requirements

  • Experience building pipelines across edge devices and cloud systems
  • Background working with large-scale sensor, image, or IoT data
  • Familiarity with data labeling tools and annotation workflows
  • Experience with dataset versioning, lineage, and reproducibility systems
  • Understanding of privacy, compliance, or regulated data environments
  • Experience supporting global multi-region data platforms

Key Responsibilities

  • End-to-End Data Pipeline Ownership
    • Design, build, and maintain research and production data pipelines spanning edge devices, cloud services, and centralized platforms
    • Own the full data lifecycle including collection, ingestion, processing, obfuscation, versioning, access, retention, and retirement
  • Edge-to-Cloud Data Flow
    • Develop resilient ingestion pipelines that handle device variability and connectivity challenges
    • Support secure data transfer from field environments to cloud storage
    • Collaborate with operations teams to improve data coverage, observability, and reliability
  • Data Quality, Governance, and Compliance
    • Implement privacy-preserving transformations and obfuscation pipelines
    • Build automated data cleaning and validation processes
    • Establish data lineage, retention policies, and access controls to ensure compliance and traceability
  • Data Services for AI and ML
    • Provide scalable data services for training, evaluation, and research experimentation
    • Support continuous data refresh and retraining workflows
    • Integrate with labeling and annotation systems
    • Enable efficient access patterns for large-scale ML workloads
  • AWS-Based Cloud Infrastructure
    • Build and optimize pipelines using AWS services such as S3, EC2, SageMaker, Lambda, Glue, and Step Functions
    • Design for cost efficiency, performance, and reliability at scale
  • Collaboration and Feedback Loops
    • Work with AI and ML engineers, scientists, and data teams to gather data requirements
    • Translate feedback into automated improvements in data collection and labeling
    • Support teams with exploratory analysis and data issue debugging
  • Scaling the Data Factory
    • Design and maintain data schemas, dataset versioning, and data factory updates
    • Architect global-scale data systems across large device fleets
    • Ensure the platform is flexible for research and reliable for production

Benefits

  • Health insurance from the first days of employment, regardless of the probationary period
  • Christmas holidays from 25 December to 31 December
  • Cooperation with the Superhumans Center and the Veteran Hub
  • Support for psychological counseling provided by the Veteran Hub
  • Internal policy making the company friendly to military and veterans

Company Description

Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a strong heritage of more than 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.

Before You Apply
Be aware of the location restriction for this remote position: Worldwide
Beware of scams! When applying for jobs, you should NEVER have to pay anything.