Flatiron Health is a healthtech company using data for good to power smarter care for every person with cancer, around the world. Flatiron partners with cancer centers in the US, Europe and Asia to transform patients' real-life experiences into real-world evidence and create a more modern, connected oncology ecosystem. Our multidisciplinary teams include oncologists, data scientists, software engineers, epidemiologists, product experts and more. Flatiron Health is an independent affiliate of the Roche Group.
What You'll Do
At Flatiron, we're advancing the use of machine learning and generative AI to extract clinically relevant information from unstructured medical notes for use in oncology research. The Data Curation team is helping to build these next-generation research data products, developing and applying ML and LLMs to capture a complete picture of the patient journey.
As part of our team, you will apply existing internal and off-the-shelf external AI systems and validate AI-generated datasets that are used by clinicians and researchers to evolve cancer research, generate clinical insights, and learn from the experience of thousands of people living with cancer. Engaging with a cross-functional group of stakeholders across our teams in Europe, Japan, and the US, you will contribute to developing ML-derived datasets from scoping to validation, productionization and delivery.
In addition, you'll:
- Work with our clinical stakeholders to apply existing AI systems to turn raw clinical data into high-quality research data.
- Become a subject matter expert on our data and its capabilities and collaborate closely across the team to understand data needs and provide analytical support that enhances model development and deployment.
- Work with research scientists and oncologists to validate that our team's models can be used to generate sound scientific insights, including full dataset performance analyses.
- Work closely with subject matter experts & ML researchers to define requirements for training and evaluation datasets, and maintain software pipelines for the generation of these sets.
- Provide analytic support and create custom data outputs for cross-functional teams such as our team of clinical experts.
- Interface with internal scientific & clinical stakeholders to understand what data they need to conduct high-quality research.
- Work cross-functionally with software engineers to productionize, scale, and monitor our team's models.
Who You Are
You are a product-focused data scientist with creative, analytical problem-solving skills, ready to tackle the problems of measuring the performance of complex datasets & the systems that build them. You're excited to learn about oncology from our clinical stakeholders and work with them to apply AI to extract nuanced clinical concepts from the medical record and validate the fitness-for-use of that data for oncology research. You're a kind, passionate and collaborative problem-solver who seeks and gives candid feedback, and values the chance to make an important impact.
- You have 3+ years of relevant working experience as an applied data scientist or in a similar technical, data-oriented role, including relevant applied work in an academic setting. You have some experience working with ML or LLMs.
- You understand how machine learning and AI systems are measured and can analyze an existing system to understand the quality of its output, assess where improvements are needed, and communicate the impact of those improvements to stakeholders.
- You are a clear and confident communicator who can break down complex data analyses to tell a compelling story.
- You are excited to work in a startup environment, think creatively and be scrappy to get the job done. You have a nose for value and empathy for your customers.
- You have collaborated with other technical team members in a production development environment using formal version control, Python (including data manipulation in pandas, polars or a similar framework), and SQL.
- You are proficient in English and German.
Extra Credit
- You have experience working with data in a healthcare setting.
- You have experience with the risks of bias in machine learning, health equity research/analysis or have worked with underrepresented groups in a clinical research setting.
- You have experience working in dbt or other ETL frameworks.
- You have experience with deep learning and traditional NLP methods.
If this sounds like you, you'll fit right in at Flatiron.
Preferred Primary Location: Berlin office
The annual pay range reflected above for this position is based on the preferred primary location of the role which is listed in the job description. Salary ranges for other locations vary from the range reflected above. Base pay offered may vary depending on job-related knowledge, skills, and experience. An annual bonus and equity may be provided as part of the compensation package, in addition to a full range of medical, financial, and/or other benefits, dependent on the position offered.
What We Do
Flatiron Health is a healthtech company dedicated to helping cancer centers thrive and deliver better care for patients today and tomorrow. Through clinical and data science, we translate patient experiences into real-world evidence to improve treatment, inform policy, and advance research. Cancer is smart. Together, we can be smarter. Flatiron Health is an independent affiliate of the Roche Group.
Why Work With Us
Reimagine the infrastructure of cancer care within a technology and science community that values integrity, inspires growth, and is uniquely positioned to create a more modern, connected oncology ecosystem.
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
At Flatiron, attracting and inspiring a diverse team is essential to our success. Our hybrid work approach, built on flexibility and clarity, allows you to choose your office days while optimizing productivity and well-being.