AI Benchmark Engineer | Native Language Specialist @LILT (Production)

Artificial Intelligence

Salary: unspecified
Remote Location: Worldwide
Employment Type: contract
Posted: 1 month ago

Role Description

We are building a rigorous, verifiable evaluation suite of Terminal-Bench tasks designed to test the limits of large language models on multilingual software challenges. Our goal is to measure multilingual robustness across prompt language effects, non-English data processing, and complex locale/encoding edge cases in terminal workflows.

We are seeking experienced native-speaking software engineers to design, build, and validate these benchmarks. You will create high-signal, high-quality tasks that genuinely test a model's ability to handle multilingual environments without relying on English translation crutches.

Note: this is a remote, freelance opportunity.

What You’ll Deliver

Task Engineering: Design and build tasks that evaluate coding agents.
Asset Creation: Build realistic task environments using datasets and files in your native language.
Prompting & Translation: Find failure points where AI models break down when working in your native language.
Implementation & Verification: Support the development of robust solutions (reference implementations) and write highly reliable, deterministic verifier scripts (using rubric-based judging only when strictly necessary).
Calibration & Execution: Analyze execution logs and calibrate task difficulty (Easy to Very Hard) using standard Terminal-Bench run configurations against various model tiers (Haiku, Sonnet, Opus).
Quality Assurance: Participate in a rigorous, 4-layer human quality control process (creation, human review, calibration review, and audit) alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.

Qualifications

5+ years of industry experience in software engineering.
Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
Native or near-native fluency in your working language, with a deep understanding of its grammar, register, and phrasing rules. High English proficiency is also required.
Strong proficiency in Python, standard shell scripting, and data processing.
Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.
Deep technical understanding of multilingual text processing pitfalls, including:
- Encoding/decoding robustness and Unicode normalization.
- Locale-dependent conventions (collation, casing, non-Gregorian dates).
- Text I/O, toolchain interoperability, and safe string operations.
- Bidirectional/RTL handling, font fallbacks, and rendering/typography in UI or artifacts.
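
As an illustration of the first two pitfalls in this list, here is a minimal Python sketch using only the standard library. It is not taken from the benchmark itself; the helper name `normalize_key` and the sample strings are hypothetical, chosen only to show how visually identical text can fail a naive comparison.

```python
import unicodedata


def normalize_key(text: str) -> str:
    """Hypothetical helper: NFC-normalize, then case-fold, for comparisons.

    This is a simplified form of caseless matching; stricter schemes
    re-normalize after folding.
    """
    return unicodedata.normalize("NFC", text).casefold()


composed = "caf\u00e9"     # 'é' stored as a single code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

# The two strings render identically but differ at the code-point level.
naive_equal = composed == decomposed
normalized_equal = normalize_key(composed) == normalize_key(decomposed)

# str.lower() leaves the German sharp s alone; casefold() expands it to "ss".
lower_match = "STRASSE".lower() == "straße".lower()
casefold_match = "STRASSE".casefold() == "straße".casefold()

# The same text occupies a different number of bytes in each encoding.
utf8_len = len(composed.encode("utf-8"))
utf16_len = len(composed.encode("utf-16-le"))

print(naive_equal, normalized_equal)  # False True
print(lower_match, casefold_match)    # False True
print(utf8_len, utf16_len)            # 5 8
```

A task in this spirit might hand an agent filenames or records that mix composed and decomposed forms and check whether its solution still matches them correctly.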

Benefits

Your schedule, your rules: Work when you want, as much or as little as you want. No fixed hours, no check-ins, no micromanaging.
Get paid quickly and fairly: Competitive rates, prompt payments, no chasing invoices.
Work on projects that actually matter: Contribute to cutting-edge AI and language technology that is shaping how humans and machines communicate.
Be part of something bigger: Join a global community of linguists, subject matter experts, and language professionals who are advancing human knowledge together.
Grow without limits: Access to diverse, innovative projects that expand your portfolio and sharpen your skills across industries and domains.
Have fun doing what you love: Bring your language skills to life on projects that are as interesting as they are impactful.

How to join our expert community

Submit your application including an updated copy of your CV in English.
Complete a GenAI assessment to evaluate your skills.
Finalize onboarding and profile set-up in our system, and become eligible for Applied AI projects.

Before You Apply

Be aware of the location restriction for this remote position: Worldwide
Beware of scams! When applying for jobs, you should NEVER have to pay anything.
