Backend AI & Data Pipeline Engineer @Seeka Technology
AI / ML
Salary unspecified
Remote Location
Job Type internship
Posted 6d ago

[Hiring] Backend AI & Data Pipeline Engineer @Seeka Technology

6d ago - Seeka Technology is hiring a remote Backend AI & Data Pipeline Engineer. 💸 Salary: unspecified 📍 Location: Worldwide

Role Description

We are looking for a Backend AI & Data Pipeline Engineer to own the end-to-end data processing infrastructure that powers Yuzee's intelligent course and job matching platform. You will design and maintain scalable, event-driven pipelines that process tens of thousands of daily records, generate semantic embeddings, and feed a growing knowledge graph used for personalised career pathway recommendations.

What you'll do

  • Design and maintain three distinct processing pipelines (scheduled job ingestion, event-driven course processing, and a periodic knowledge graph builder), each with independent trigger logic and cost controls.
  • Generate and manage semantic embeddings via Amazon Bedrock (Titan v2), index them in MongoDB Atlas Vector Search, and calibrate similarity thresholds to ensure match accuracy.
  • Build and maintain a knowledge graph linking jobs, courses, skills, and industries using FP-Growth association rules and archetype-to-SOC code mapping.
  • Build and improve a two-stage discovery and matching API on AWS Lambda: vector retrieval first, then deep eligibility scoring with LLM re-ranking.
  • Right-size Fargate Spot instances and design resumable processing loops that tolerate interruption, keeping infrastructure costs under control as data volume scales.
  • Maintain and improve daily job scrapers across multiple sources and build institution data scrapers with robust HTML cleaning pipelines.

Qualifications

  • 1+ years of backend engineering experience focused on data pipelines, ML infrastructure, or search systems.
  • Hands-on experience with AWS serverless and container services: Lambda, ECS Fargate, EventBridge, and Step Functions.
  • Strong Python skills: Pandas, async processing, bulk database operations, and text cleaning.
  • Familiarity with vector databases and semantic similarity search; MongoDB Atlas Vector Search experience is a strong plus.
  • Cost-conscious infrastructure mindset: you think in per-record compute costs, free tiers, Spot resilience, and right-sizing.
  • Ability to document and communicate complex architecture clearly to both technical and non-technical stakeholders.

Requirements

  • A degree or equivalent proven experience.

Benefits

  • You can work from home for the whole internship period.
  • A reference letter can be requested upon completion of the internship.
  • Some flexibility in working hours beyond the usual 9am to 6pm (e.g. 8am to 5pm or 7:30am to 4:30pm).
  • The possibility of being retained for part-time or full-time work after the internship, based on your performance, even if you are not based in Malaysia.

Before You Apply

Be aware of the location restriction for this remote position: Worldwide
‼ Beware of scams! When applying for jobs, you should NEVER have to pay anything.