Software Engineering & DevOps AI Rater/Evaluator

LILT AI

Posted 4 Days Ago

Hiring Remotely in San Francisco, California, USA

In-Office or Remote

Mid level

Artificial Intelligence • Information Technology • Software

Make anything multilingual. Translation, AI data set creation, and human expert evals. For businesses and governments.

The Role

Evaluate AI outputs in software engineering and DevOps, assess technical correctness, and improve multilingual AI model quality. Senior roles involve higher-level oversight, error analysis, and collaboration with research and product teams.

Summary Generated by Built In

Overview

LILT is building a global network of domain experts to support high-quality AI evaluation across training, benchmarking, red-teaming, and ongoing model monitoring. We are seeking software engineering and DevOps professionals to contribute expert judgment to human-in-the-loop AI evaluation workflows used by leading enterprises and hyperscalers.

This role is designed for professionals who understand how software systems, infrastructure, and development practices work in real production environments and who can apply that expertise to evaluate, assess, and improve multilingual AI systems.

Your expertise will directly influence multilingual AI model quality, safety, and deployment readiness.

This role includes two distinct expert tracks, based on experience level and scope of responsibility.

Track A: Software Engineering & DevOps AI Rater

Raters execute structured evaluation tasks using clearly defined rubrics and instructions.

Responsibilities

Evaluate AI outputs related to software engineering, DevOps, and infrastructure topics
Perform structured scoring, comparison, classification, and judgment tasks
Assess technical correctness, completeness, security implications, and best-practice alignment
Identify hallucinations, incorrect code, unsafe recommendations, or misleading system guidance
Apply domain-specific engineering and DevOps guidelines consistently across tasks

Ideal Background

Software engineers, site reliability engineers, DevOps engineers, or platform engineers
Experience with production systems, CI/CD pipelines, cloud infrastructure, or distributed systems
Strong attention to detail and comfort working with structured evaluation criteria

Track B: Software Engineering & DevOps AI Evaluator (Senior Track)

Evaluators provide higher-level technical oversight and help shape how evaluation is performed.

Responsibilities

Validate and refine evaluation rubrics and edge-case handling
Perform adjudication where raters disagree
Conduct error analysis and qualitative reviews of model behavior
Partner with LILT research, product, and customer teams on evaluation design
Support red-teaming, security review, and model readiness assessments

Ideal Background

Senior software engineers, DevOps leads, SREs, or technical architects
Experience defining technical standards, reviewing complex edge cases, or advising on system design and reliability
Ability to clearly explain nuanced technical reasoning and tradeoffs

Evaluation Focus & Requirements

Types of AI Evaluation Work

Depending on project demands, work may include:

Software engineering and infrastructure content evaluation
Code correctness and reasoning assessment
DevOps, CI/CD, and cloud architecture evaluation
Security and reliability-focused red-teaming
Ongoing model monitoring and regression testing

What We Look For

Deep domain expertise in software engineering, DevOps, or infrastructure
Strong technical judgment and ability to apply criteria consistently
Comfort working with structured evaluation workflows
Ability to explain reasoning clearly, especially in complex or high-risk technical scenarios
Reliability, professionalism, and respect for quality standards

Engagement Model

Contract-based, flexible participation
Project-based work with clear expectations and timelines
Opportunities for recurring work based on performance and demand
Compensation communicated upfront per project or task type

Why This Work Matters

Your expertise helps ensure that AI systems:

Provide accurate and safe technical guidance
Align with real-world engineering and DevOps best practices
Are reliable, secure, and trustworthy across languages

Language Requirements

Native or professional fluency in one or more supported languages is required
Supported languages span 30+ global languages
Language-specific nuance is assessed through screening and task-based evaluation, not separate job descriptions
English fluency is required for guidelines, feedback, and collaboration

AI is changing how the world communicates — and LILT is leading that transformation.

LILT's mission is to make the world's information available to everyone, no matter the language they speak. Join a global community of people who thrive on innovation and excellence. Our collective knowledge, uniqueness, and skills deliver multilingual AI and human-verified services to enterprises, governments, and AI developers around the world.

Earn money. Have fun. Advance human knowledge. Work on diverse projects from anywhere, any time you want. Get paid quickly and fairly, and build your professional network in a supportive community—all through a streamlined application process tailored to your expertise.

Information collected and processed as part of your application process, including any job applications you choose to submit, is subject to LILT's Privacy Policy at https://lilt.com/legal/privacy.

At LILT, we are committed to a fair, inclusive, and transparent hiring process. As part of our recruitment efforts, we may use artificial intelligence (AI) and automated tools to assist in the evaluation of applications, including résumé screening, assessment scoring, and interview analysis. These tools are designed to support human decision-making and help us identify qualified candidates efficiently and objectively. All final hiring decisions are made by people. If you have any concerns, require accommodations, or would like to opt out of the use of AI in our hiring process, please let us know at [email protected].

LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual’s race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state or federal laws. We are committed to the principles of fair employment and the elimination of all discriminatory practices.

Skills Required

Experience with production systems
Experience with CI/CD pipelines
Experience with cloud infrastructure
Strong attention to detail
Native or professional fluency in one or more supported languages
Experience defining technical standards
Ability to explain nuanced technical reasoning

LILT AI Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about LILT AI and has not been reviewed or approved by LILT AI.

Fair & Transparent Compensation — Pay for full‑time roles is characterized as market‑aligned in engineering and go‑to‑market functions; overall compensation is seen as acceptable to good for these teams.
Healthcare Strength — Core benefits include full healthcare coverage (medical, dental, vision) for full‑time employees; this provides a solid baseline for the package.
Retirement Support — U.S. offerings include a 401(k) match; retirement support is clearly part of the total rewards mix.

The Company

HQ: San Francisco, California

690 Employees

Year Founded: 2015

What We Do

Make anything multilingual. A complete solution for translation and data set creation for businesses and governments. Founded by research scientists who met working on Google Translate, LILT is a global team of engineers, scientists, GTM experts, and operators transforming global business communications.