Data Engineer - Multimodal Systems

Reposted 21 Days Ago
San Francisco, CA, USA
In-Office
Mid level
Information Technology • Software
The Role
As a Data Engineer, you will create and optimize datasets and data pipelines, focusing on large-scale data collection across various modalities and collaborating with multiple teams.
Summary Generated by Built In
Zyphra is an artificial intelligence company based in San Francisco, California.

The Role:

As a Data Engineer - Multimodal Systems, you will be a core contributor to creating, collecting, and improving Zyphra’s datasets and data pipelines across a variety of modalities. Your work will intersect with almost every team at Zyphra. You will be involved in collecting large-scale datasets and implementing and optimizing highly parallel data pipelines.

You’ll Work Across:
  • Large-scale data collection across a variety of modalities (text, audio, image)

  • Designing and working with highly efficient, parallelized data processing pipelines across modalities

  • Designing and running rigorous experimental ablations to demonstrate the impact of new data improvements

What We're Looking For / Requirements:
  • Strong implementation and prototyping ability

  • Can take an idea from conception to experimentation quickly

  • The ability to work well with others in a fast-paced research setting

  • Can rapidly learn new fields and are excited to implement new ideas

  • Excellent communication and collaboration skills, with the ability to work effectively on both research and engineering implementation at scale

Qualifications / Additional Skills:
  • Experience collecting, handling, and processing large datasets

  • Experience with parallel Python programming frameworks such as Dask

  • Understanding of the state-of-the-art in dataset curation across modalities

  • A meticulous nature and a strong interest in actually looking at data and sanity-checking it

  • Strong grasp of proper experimental methodology for running rigorous ablations and other hypothesis testing

  • Understanding of and interest in large-scale, highly parallel data processing pipelines

  • Proficiency with PyTorch and Python

  • Experience contributing to large pre-existing codebases and rapidly getting up to speed

  • Previously published machine learning research in well-respected venues

  • Postgraduate degree in a scientific subject (Computer Science, EE/EECS, Mathematics, Physics, Machine Learning)

Why Work at Zyphra:
  • Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals. Deep research and engineering excellence are equally valued

  • We strongly value new and crazy ideas and are very willing to bet big on them

  • We move as quickly as we can; we aim to keep the bar to impact as low as possible

  • We all enjoy what we do and love discussing AI

Benefits and Perks:
  • Comprehensive medical, dental, vision, and FSA plans

  • Competitive compensation and 401(k) plan

  • Relocation and immigration support on a case-by-case basis

  • In-office snacks and meals provided

  • Unlimited PTO and company holidays

  • In-person team in San Francisco with a collaborative, high-energy environment

Skills Required

  • Strong implementation and prototyping ability
  • Ability to work well with others in a fast-paced research setting
  • Excellent communication and collaboration skills
  • Experience collecting, handling, and processing large datasets
  • Proficiency with PyTorch and Python
  • Postgraduate degree in a scientific subject

The Company
HQ: Palo Alto, California
35 Employees

What We Do

Zyphra is a full-stack AGI company based in Palo Alto, California.
