Scientific Lead, Generative AI Engineer, Applied Intelligence for Discovery

Reposted 6 Days Ago
San Francisco, CA, USA
In-Office
182K-284K Annually
Senior level
Healthtech • Biotech • Pharmaceutical
The Role
Lead design and delivery of production LLM systems for drug discovery: build RAG pipelines, hybrid retrieval, text-to-SQL, agentic workflows, evaluation frameworks, and orchestration for scientific analyses.
Summary Generated by Built In

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

The Opportunity

We are building something unprecedented: an AI foundation that will fundamentally change how drug discovery research is conducted.

The Applied Intelligence for Discovery (AI4D) team is a newly formed group within Lilly Research Laboratories that operates at the intersection of scientific delivery and core platform development. AI4D’s mission is to connect scientists to petabyte-scale data through natural language interfaces, automated analysis workflows, and intelligent search, and to convert early deployments into repeatable system standards and evaluation practices that scale across therapeutic areas.

As a Generative AI Engineer, you will design, build, and operate the core AI systems that power this transformation: retrieval-augmented generation over internal scientific documents, text-to-SQL over complex omics databases, agentic workflows that automate multi-step analyses, and the evaluation infrastructure that enables the next generation of medicines for patients.

Key Responsibilities

  • Design, build, and optimize RAG pipelines over internal publications, study reports, electronic lab notebooks, and other scientific documents

  • Build hybrid retrieval systems combining vector search with structured metadata, knowledge graphs, and ontology-aware filtering

  • Build and optimize text-to-SQL systems over Lilly’s databases, enabling scientists to query gene expression, proteomics, pathway, and variant data through natural language

  • Develop schema documentation, semantic annotations, and gold-standard question/SQL pairs that bridge how scientists think about data and how it is stored

  • Implement multi-step reasoning approaches (chain-of-thought, self-correction, Reflexion loops) to improve accuracy on complex scientific queries

  • Design agentic AI workflows that chain database queries, bioinformatics tools, literature search, and visualization into automated multi-step scientific analyses

  • Evaluate and integrate emerging orchestration frameworks (LangGraph, CrewAI, custom architectures) for scientific use cases

  • Build evaluation frameworks measuring accuracy, reliability, and scientific validity of AI outputs

Basic Qualifications

  • PhD in Computer Science, Data Science, or a related technical field with 0-3+ years of experience; or MS in Computer Science, Data Science, or a related technical field with 5+ years of experience; or equivalent experience building production LLM systems

Additional Skills/Preferences

  • Experience building LLM-powered applications, including at least two of: RAG systems, text-to-SQL, agentic workflows, or fine-tuning pipelines

  • Strong software engineering skills in Python with experience building production-grade systems

  • Deep familiarity with the modern LLM ecosystem: embedding models, vector databases, and orchestration frameworks

  • Experience designing evaluation frameworks for LLM systems — systematic approaches to measuring accuracy, detecting hallucinations, and tracking regressions

  • Comfort working with complex, heterogeneous data — databases with hundreds of tables, specialized schemas, or domain-specific vocabularies

  • Familiarity with cloud computing environments (AWS preferred), containerization (Docker), and CI/CD practices

  • Experience in pharmaceutical, biotech, or life sciences environments

  • Familiarity with biomedical data types (omics, clinical, molecular) or scientific databases

  • Experience with MLOps/LLMOps tooling: experiment tracking, model registries, prompt versioning, A/B testing for AI systems

  • Knowledge of biomedical ontologies (Gene Ontology, MeSH, ChEBI) or experience integrating domain-specific knowledge into LLM systems

  • Experience building for regulated environments where auditability, reproducibility, and explainability are requirements

Lilly is dedicated to helping individuals with disabilities to actively engage in the workforce, ensuring equal opportunities when vying for positions. If you require accommodation to submit a resume for a position at Lilly, please complete the accommodation request form (https://careers.lilly.com/us/en/workplace-accommodation) for further assistance. Please note this is for individuals to request an accommodation as part of the application process and any other correspondence will not receive a response.

Lilly is proud to be an EEO Employer and does not discriminate on the basis of age, race, color, religion, gender identity, sex, gender expression, sexual orientation, genetic information, ancestry, national origin, protected veteran status, disability, or any other legally protected status.


Our employee resource groups (ERGs) offer strong support networks for their members and are open to all employees. Our current groups include: Africa, Middle East, Central Asia Network, Black Employees at Lilly, Chinese Culture Network, Japanese International Leadership Network (JILN), Lilly India Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ+ Allies), Veterans Leadership Network (VLN), Women’s Initiative for Leading at Lilly (WILL), enAble (for people with disabilities).

Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $181,500 - $283,800.

Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly’s compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly

Skills Required

  • PhD in Computer Science, Data Science, or related (0-3+ years) OR MS in Computer Science, Data Science, or related (5+ years) or equivalent experience building production LLM systems
  • Experience building LLM-powered applications, including at least two of: RAG systems, text-to-SQL, agentic workflows, fine-tuning pipelines
  • Strong software engineering skills in Python with experience building production-grade systems
  • Deep familiarity with embedding models, vector databases, and orchestration frameworks
  • Experience designing evaluation frameworks for LLM systems (accuracy, hallucination detection, regression tracking)
  • Experience working with complex, heterogeneous scientific data, large databases, and specialized schemas
  • Familiarity with cloud environments (AWS preferred), containerization (Docker), and CI/CD practices
  • Experience in pharmaceutical, biotech, or life sciences environments
  • Familiarity with biomedical data types (omics, clinical, molecular) or scientific databases
  • Experience with MLOps/LLMOps tooling: experiment tracking, model registries, prompt versioning, A/B testing
  • Knowledge of biomedical ontologies (Gene Ontology, MeSH, ChEBI) or integrating domain knowledge into LLM systems
  • Experience building systems for regulated environments where auditability, reproducibility, and explainability are required

Eli Lilly and Company Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Eli Lilly and Company and has not been reviewed or approved by Eli Lilly and Company.

  • Strong & Reliable Incentives: Pay is considered competitive with annual increases, bonuses, and equity programs that link rewards to contributions and business performance. Incentive structures and stock opportunities strengthen total compensation.
  • Retirement Support: Retirement programs combine a matched savings plan, a pension, and company equity options. Financial advising and retiree health coverage reinforce long-term security.
  • Parental & Family Support: Parental leave is generous for all parents, with additional paid time for birth mothers and financial support for adoption or surrogacy. Backup care services, childcare options, and caregiver concierge support further aid families.

The Company
HQ: Indianapolis, IN
39,451 Employees
Year Founded: 1876

What We Do

Eli Lilly and Company engages in the discovery, development, manufacture, and sale of products in the pharmaceutical products business segment. For more than a century, we have stayed true to a core set of values – excellence, integrity, and respect for people – that guide us in all we do: discovering medicines that meet real needs, improving the understanding and management of disease, and giving back to communities through philanthropy and volunteerism.
