Passionate about precision medicine and advancing the healthcare industry?
Recent advancements in genomics and computer technology have finally made it possible for AI to impact clinical care in a meaningful way. Tempus' proprietary platform connects an entire ecosystem of real-world evidence to deliver real-time, actionable insights to physicians, providing critical information about the right treatments for the right patients, at the right time.
We are seeking a genomics data scientist with experience in supporting innovative research by managing and modelling large volumes of clinical, genetic and/or genomic data. You will work on data ingestion, organization, and implementation of analysis workflows for large-scale human cohorts with genetic and multi-dimensional, multi-modality phenotypic data. These data will be obtained from public sources, as well as from private datasets generated in-house or obtained through our collaborations. This position is based in the San Francisco Bay Area.
- Bring in genomics/genetics datasets from external and internal sources to help develop internal resources for various analytical approaches
- Develop scalable, high-quality analysis pipelines for clinical trials and clinical diagnostics products.
- Leverage the opportunities and efficiencies afforded by access to a hybrid, cloud-based, distributed ecosystem of database technologies.
- Collaborate with other data scientists and statistical geneticists to leverage multimodal data in training polygenic risk scores, machine learning, and other predictive models.
- Work with scientists and clinicians to design and perform analyses on clinical sequencing data that generate clinically actionable insights in order to improve quality of care.
- Produce high-quality, detailed documentation for all projects.
- PhD/Masters or equivalent experience in genetics, biomedical informatics, or related life sciences areas.
- At least years of experience in complex data analysis and familiarity with applications of FAIR principles
- Hands-on development of database systems and data manipulation using SQL, working within a POSIX CLI environment
- Computational skills using Python (strongly preferred), Java, C/C++ or other programming languages.
- Good understanding of life sciences domain and omics technologies.
- Experience with longitudinal human clinical/phenotype data, e.g. from electronic health records, epidemiological cohorts, or clinical trials.
- Experience with genetic and genomic data types, including public genetic databases and results data from high-throughput genetic assays (e.g. UK Biobank, gnomAD, etc.) is a plus.
Ideal Candidates Will Possess
- Experience with Python/Jupyter notebooks, NumPy/Pandas, and/or R/Bioconductor in analyzing large data sets.
- Experience mining modern, large-scale genetic databases (e.g. ExAC/gnomAD, UK Biobank, UK10K, EBI GWAS Catalog, 1KG, etc.).
- Experience with distributed database technologies and related big-data analysis tools (e.g. Spark, BigQuery; the Apache Hadoop/Hive ecosystem).
- Ability to communicate insights and present concepts.
- Self-driven and works well in interdisciplinary teams.
- Track record of publications in related domains is a plus.
Candidates based in the SF Area with current authorization to work in the US are strongly preferred.