Bioinformatics Data Engineer

Posted 6 Days Ago
Emeryville, CA
125K-190K Annually
Mid level
Artificial Intelligence • Healthtech • Biotech • Pharmaceutical
The Role
The Bioinformatics Scientist/Engineer will perform genomic data mining, manage and expand protein sequence databases, deploy cloud-based pipelines for genomic data processing, and document code while collaborating with a multidisciplinary team.
Summary Generated by Built In

Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Emeryville, CA, we are backed by leading investors including Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures.

Here at Profluent, data is our lifeline. Our generative models learn the blueprint of life by modeling large-scale evolutionary data, enabling us to engineer and write biology in unprecedented ways. As we continue to push the boundaries of what is possible, the volume of available genomic data is growing exponentially. Managing and extracting insights from this ever-expanding data is at the core of this position. 

We're seeking a Bioinformatics Data Engineer to design and build cutting-edge data and cloud infrastructure capable of handling the immense scale and complexity of our genomic datasets. This role is vital to ensuring that we can efficiently process, store, and analyze petabytes of data, unlocking the full potential of our models and driving discoveries across the life sciences.

As an early team member, you’ll play a pivotal role in building the foundations of our data and cloud architecture. You will have the autonomy to make critical decisions and directly influence the success of our mission to harness the power of data for machine learning.

This is an excellent opportunity to shape the future of AI-driven protein design and to work cross-functionally with a diverse team of experts across machine learning, protein engineering, cell biology, and gene editing.

Responsibilities

  • Maintain and expand the world’s largest database of protein sequences
  • Deploy cloud-based pipelines to process and search large-scale genomic datasets
  • Build cloud databases for scalable storage and fast retrieval of terabases of genomic data, including genomes, genes, proteins, and structures
  • Clearly document code and communicate outcomes to colleagues

Qualifications

  • BS, MS, or PhD in Bioinformatics, Genomics, Computer Science, or a related quantitative bioscience field
  • 3+ years of industry or postdoc experience
  • Experience working with Google Cloud Platform (GCP) or other cloud-based compute services (e.g. AWS)
  • Experience building cloud pipelines, using pipelining tools (Snakemake, Nextflow), and building containerized applications (Docker)
  • Experience with highly parallelized cloud-based computing platforms (Batch or Kubernetes)
  • Experience with scalable databases (BigQuery, Bigtable) and proficiency in database programming (SQL)
  • Fluency in Python data analysis tools (NumPy, pandas, Jupyter Notebook, Biopython)
  • Experience with Linux environments and version control (Git)

Preferred (but not required)

  • Experience with bioinformatics tools for sequence and structure analysis
  • Experience working with next-generation sequencing data
  • Familiarity with public repositories like UniProt, EBI, JGI, NCBI, and SRA
  • Familiarity with concepts in molecular biology, biochemistry, and structural biology
  • Knowledge of prokaryotic gene and genome structure
  • Publications in major scientific journals or conferences

Actual salary will be determined based on relevant skills, qualifications, experience, training, and market data. Benefits package may vary depending on company policies and eligibility criteria.

Hiring Salary Range

$150,000 - $200,000 USD

What we offer at Profluent

  • A high-growth opportunity with meaningful impact
  • Competitive compensation package
  • Health insurance (health/dental/vision)
  • Generous paid time off (PTO) policy
  • Commitment to physical and mental well-being
  • More benefits and perks to be added!

Profluent Bio, Inc. is an equal opportunity employer promoting diversity and inclusion in the workplace. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical conditions, veteran status, sexual orientation, gender (including gender identity and gender expression), sex (which includes pregnancy, childbirth, and breastfeeding), genetic information, taking or requesting statutorily protected leave, or any other basis protected by law.

Legal authorization to work in the United States is required. In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

Top Skills

Python
The Company
HQ: Berkeley, California
23 Employees
On-site Workplace
Year Founded: 2022

What We Do

Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Berkeley, CA, we are backed by leading investors including Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures. To learn more about our mission to decode the language of life with AI, visit profluent.bio
