[Hiring] AI Evaluation Engineer @Gramian Consulting Group

6d ago - Gramian Consulting Group is hiring a remote AI Evaluation Engineer. 💸 Salary: unspecified 📍 Location: India, Brazil, Colombia, Egypt, Turkey, Indonesia, Bangladesh, Ghana, Kenya, Nigeria

Role Description

We are looking for an AI Evaluation Engineer specializing in planning and operations to design and build benchmark tasks that simulate real-world scheduling, logistics, and resource-allocation scenarios. The role focuses on planning, scheduling, and operational optimization problems in which multiple agents must collaborate to solve constraint-rich scenarios involving resources, timelines, and dependencies.

Commitment required: 8 hours per day, with 4 hours of overlap with PST.

Employment type: Contractor assignment (no medical/paid leave)

Duration of contract: 4 weeks+

Location: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Turkey, Vietnam

Interview: take-home assessment (60 min) + short interview

Responsibilities

  • Design and build multi-agent benchmark tasks involving:
    • Planning, scheduling, and resource allocation
    • Operational decision-making (logistics, project planning, incident response, capacity planning)
  • Create constraint-rich problem statements with multiple interacting variables
  • Develop verification scripts to evaluate:
    • Feasibility (all constraints satisfied)
    • Completeness (all requirements met)
    • Optimality (efficiency of solutions)
  • Define task decomposition strategies across specialized sub-agents (e.g., resource allocation, constraint resolution, optimization)
  • Model realistic operational systems with dependencies, timelines, and constraints
  • Implement validation logic and evaluation pipelines using Python
  • Work with Docker environments for reproducibility and execution
  • Collaborate with internal teams to improve task quality, coverage, and evaluation rigor
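
To give a sense of the verification scripts mentioned above, here is a minimal sketch in Python for a toy scheduling task. It is an illustration only, not Gramian's actual tooling: the `Task` class, the `verify` function, the single worker-capacity constraint, and the sample data are all hypothetical, and the sketch assumes positive durations and dependencies that name existing tasks. It checks completeness (every task is scheduled), feasibility (dependencies and capacity are respected), and optimality (makespan compared with a reference value).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    duration: int            # time units, assumed positive
    workers: int             # workers needed while the task runs
    depends_on: tuple = ()   # names of tasks that must finish first


def verify(tasks, capacity, schedule, best_known_makespan):
    """Score a candidate schedule given as {task_name: start_time}."""
    by_name = {t.name: t for t in tasks}

    # Completeness: every task has a start time and nothing unknown is scheduled.
    if set(schedule) != set(by_name):
        return {"complete": False, "feasible": False, "optimality": 0.0}

    end = {n: schedule[n] + by_name[n].duration for n in by_name}

    # Feasibility, part 1: no negative starts, and each task starts
    # only after all of its dependencies have finished.
    feasible = all(schedule[n] >= 0 for n in by_name) and all(
        schedule[t.name] >= end[d] for t in tasks for d in t.depends_on
    )

    # Feasibility, part 2: worker usage never exceeds capacity.
    # Usage can only rise at a task start, so checking start times is enough.
    for moment in set(schedule.values()):
        in_use = sum(
            t.workers for t in tasks if schedule[t.name] <= moment < end[t.name]
        )
        if in_use > capacity:
            feasible = False

    # Optimality: reference makespan divided by achieved makespan,
    # capped at 1.0; infeasible schedules score 0.0.
    makespan = max(end.values())
    optimality = min(1.0, best_known_makespan / makespan) if feasible else 0.0
    return {"complete": True, "feasible": feasible, "optimality": optimality}


tasks = [
    Task("load", duration=2, workers=2),
    Task("ship", duration=3, workers=1, depends_on=("load",)),
    Task("audit", duration=2, workers=1),
]

# "ship" waits for "load" to finish: feasible, and matches the reference makespan.
good = verify(tasks, 3, {"load": 0, "audit": 0, "ship": 2}, best_known_makespan=5)

# "ship" starts before "load" ends: violates the dependency.
bad = verify(tasks, 3, {"load": 0, "audit": 0, "ship": 1}, best_known_makespan=5)
```

Returning the three scores separately, rather than one pass/fail flag, lets a benchmark distinguish a solution that is incomplete from one that is complete but infeasible, or feasible but inefficient.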

Qualifications

  • 5+ years of experience in operations, project management, logistics, or supply chain
  • Strong ability to formalize constraints, dependencies, and scheduling logic
  • Proficiency in Python for building validation and verification scripts
  • Experience with optimization techniques (linear programming, constraint satisfaction, scheduling algorithms)
  • Strong structured problem-solving and decomposition skills
  • Experience with AI benchmarks or evaluation frameworks (e.g., SWE-bench or similar)
  • Hands-on experience with Docker (Dockerfiles, image builds, debugging)

Nice to Have

  • Background in operations research or optimization-heavy domains
  • Experience with simulation or modeling tools
  • Familiarity with AI planning systems or automated reasoning
  • Project management experience or certifications (PMP, Agile, etc.)