Google DeepMind

Research Scientist/Research Engineer, Multimodal Agents

Sorry, this job was removed at 08:39 p.m. (CST) on Thursday, Jan 22, 2026

Mountain View, CA, USA

In-Office

Artificial Intelligence

The Role

Snapshot

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

About Us

Our team is part of Google DeepMind (GDM) in the Frontier-AI unit. We specialize in multimodal foundational models, with a focus on image and video domains.
We are looking for a research scientist to develop agentic solutions to improve the capabilities of multimodal models in GDM.
Candidates must have strong machine learning skills, including experience with LLMs and computer vision. We also require competency in software engineering, which is needed to implement robust solutions at the scale at which we operate.
Our team values both internal and external impact and there should be opportunities for both in this role.

The Role

As a Research Scientist specializing in Multimodal Agents, you will be at the forefront of developing innovative agentic solutions to enhance the capabilities of Google DeepMind's foundational models, particularly within the image and video domains. This is an exciting opportunity to contribute directly to advancing the state of the art in artificial intelligence, working with cutting-edge technologies and a team of world-class experts. You will be instrumental in designing, implementing, and deploying robust machine learning solutions at scale, with a clear path to both internal and external impact through product integration and publications. This role offers a unique chance to shape the future of AI agents by pushing the boundaries of multimodal understanding and interaction.

Key responsibilities:

Design and implement novel agentic solutions to enhance the capabilities of multimodal foundational models, specifically in image and video domains.
Conduct cutting-edge research in machine learning, with a focus on large language models (LLMs) and computer vision, to drive advancements in multimodal understanding and interaction.
Develop and deploy robust, scalable machine learning systems and prototypes that integrate effectively with Google DeepMind's existing infrastructure.
Collaborate with cross-functional teams of scientists and engineers to translate research insights into impactful product features and publications.
Analyze and evaluate the performance of agentic models, iterating on designs and approaches to continuously improve their effectiveness and efficiency.
Stay abreast of the latest research and developments in AI, particularly in multimodal learning and agent systems, and contribute to the scientific community through publications and presentations.

About You

In order to set you up for success as a Research Scientist/Research Engineer at Google DeepMind, we look for the following skills and experience:

PhD in machine learning, computer vision or related field
3+ publications in top ML or vision conferences/journals
Python experience
JAX/PyTorch experience

In addition, the following would be an advantage:

Distributed data pipeline experience (e.g., Apache Beam)
C++ experience
Experience developing LLM-based agents

The US base salary range for this full-time position is between $141,000 and $244,000 + bonus + equity + benefits. Your recruiter can share more about the specific salary range for your targeted location during the hiring process.

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.

The Company

1,218 Employees

Year Founded: 2010

What We Do

We’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority. Our long-term aim is to solve intelligence, developing more general and capable problem-solving systems, known as artificial general intelligence (AGI). Guided by safety and ethics, this invention could help society find answers to some of the world’s most pressing and fundamental scientific challenges. We have a track record of breakthroughs in fundamental AI research, published in journals like Nature, Science, and more. Our programs have learned to diagnose eye diseases as effectively as the world’s top doctors, to save 30% of the energy used to keep data centres cool, and to predict the complex 3D shapes of proteins, which could one day transform how drugs are invented.