AI Researcher – Multilingual Data

Reposted 25 Days Ago
Hiring Remotely in World Golf Village, FL, USA
In-Office or Remote
Expert/Leader
Artificial Intelligence • Information Technology • Software
The Role
The AI Researcher will design and execute research on multilingual datasets, improve cross-lingual transfer in language models, and publish findings at top conferences while translating research into production improvements.
About the Role

We’re looking for an AI Researcher focused on multilingual data to help us build and scale next-generation language models across diverse languages and domains. You’ll own research and execution around data sourcing, curation, evaluation, and training strategies for multilingual and low-resource languages, with a strong emphasis on publishing high-quality research and translating it into production systems.

This role is ideal for someone who enjoys working close to the frontier: balancing papers, prototypes, and real-world impact in a fast-moving startup environment.

What You’ll Do
  • Design and execute research on multilingual datasets, including data collection, filtering, deduplication, and quality measurement

  • Develop strategies for low-resource and long-tail languages (sampling, augmentation, curriculum design)

  • Research and improve cross-lingual transfer, alignment, and robustness in large language models

  • Build and maintain evaluation benchmarks for multilingual performance

  • Collaborate with engineers and researchers on training pipelines and model architecture decisions

  • Publish research at top venues (e.g., ACL, EMNLP, NeurIPS, ICML, ICLR) and contribute to open-source when appropriate

  • Translate research insights into practical improvements in production models

What We’re Looking For
  • Strong background in NLP / ML research, with a focus on multilingual or cross-lingual modeling

  • Publication record at respected conferences or journals (ACL, EMNLP, NeurIPS, ICML, ICLR, etc.)

  • Experience working with large-scale text datasets across multiple languages

  • Solid understanding of:

    • Tokenization and vocabulary design for multilingual models

    • Data quality metrics, filtering, and dataset bias

    • Transfer learning and multilingual representation learning

  • Comfortable prototyping in Python with modern ML frameworks (PyTorch, JAX, etc.)

  • Ability to operate independently and ship research at a startup pace

Nice to Have
  • Experience with low-resource languages or non-Latin scripts

  • Open-source contributions in NLP or data tooling

  • Experience training or evaluating large language models

  • Familiarity with multilingual benchmarks (e.g., XTREME, FLORES, TyDi QA)

Why Join Us
  • Real ownership over research direction and impact

  • A team that values papers and production

  • Access to meaningful scale: large datasets, modern infrastructure, and fast iteration

  • Competitive compensation and meaningful equity at an early stage

Top Skills

JAX
ML
NLP
Python
PyTorch

The Company
HQ: San Francisco, California
20 Employees
Year Founded: 2023

What We Do

We enable serverless inference via our GPU orchestration and model load-balancing system. We unlock fine-tuning by enabling organizations to size their server fleet to their throughput needs, not to the number of models in the catalogue. See it in action on our public cloud, which offers inference for 10k+ open-weight models.
