AstraZeneca PLC

Analyst - Data Scientist

Bangalore, Bengaluru Urban, Karnataka, IND

In-Office

Mid level

Biotech • Pharmaceutical • Manufacturing

The Role

Build and maintain ETL/ELT pipelines and orchestration (Airflow) across Snowflake/Redshift/S3; integrate and quality-check EHR/claims/lab data; engineer feature-ready datasets and performant data models; support ML model training, scoring and deployment; ensure governance, security, and observability; provide operational support and CI/CD for data solutions.

Summary Generated by Built In

Job Title: Analyst - Data Scientist

Grade: C3

Shift: 2 pm to 11 pm IST

Role: Individual contributor role.

Location: Manyata Tech Park, Bangalore.

Introduction to role:

Are you ready to dive into the world of commercial analytics and make a real impact? As a Data Scientist, you'll partner with cross-functional teams to tackle business challenges head-on using data-driven solutions. Your mission will be to build, develop, and deploy advanced AI and machine learning models. You will use diverse healthcare datasets to uncover actionable insights and optimize business processes.

Accountabilities:

Platform & Pipeline Ownership: Design, build, and maintain robust ETL/ELT pipelines across Snowflake, Redshift, S3, and AWS services; implement workflow orchestration with Apache Airflow and similar tools; manage environments for reliability, observability, and cost efficiency.

Data Integration & Quality: Ingest and harmonize internal and external data, including EHR, administrative claims, and laboratory data (e.g., IQVIA, Komodo, Symphony, Prognos); implement data profiling, quality checks, reconciliation, and lineage to ensure accuracy and timeliness.

Data Modeling & Performance: Engineer curated data models and feature-ready datasets and marts; optimize SQL and storage strategies; enable performant data services that power analytics dashboards and ML scoring.

Automation & DevOps: Apply scheduling, monitoring, alerting, and process improvements. Use Git-based CI/CD methods. Build reusable components and frameworks that coordinate and prioritize data descriptions to streamline transformations and business rules at scale.

Architecture, Security & Governance: Contribute to cloud architecture decisions, data cataloging, access controls, and governance standards; promote best practices for scalability, security, and compliance in healthcare/biopharma contexts.

BAU & Support: Provide operational support for data solutions, manage tickets and incident response, and drive continuous improvement across pipelines and platform processes.

Essential Skills/Experience:

Education & Experience: BS/MS in a quantitative field (Computer Science, Data Science, Engineering, Information Systems, Economics) and at least 4 years of relevant data engineering/analytics engineering experience.
AI-Native Data Science Orientation: Strong inclination toward AI-first problem solving, with an understanding of modern AI paradigms including Generative AI and Agentic AI systems. Ability to design data systems that support intelligent, autonomous decision-making workflows.
Core Technologies: Proficiency with Python and SQL; hands-on experience with Snowflake, AWS (e.g., S3, Lambda), Amazon Redshift, Apache Spark, and Apache Airflow. Familiarity with MongoDB, Oracle, Teradata, and/or SQL Server.
Data Engineering Skills: Strong track record building ETL/ELT pipelines, integrating diverse sources, orchestrating workflows, and implementing data quality, lineage, and governance. Proven performance tuning and scalability practices.
Statistical & Analytical Expertise: Solid grounding in statistics, experimental design, and advanced analytics, with the ability to translate business problems into data science solutions and validate outputs rigorously.
Healthcare Data Expertise: Hands-on experience working with EMR/EHR, claims, laboratory, and epidemiological datasets, with a strong understanding of healthcare data models, standards, and challenges.
Analytics & DS Fundamentals: Ability to translate business logic into technical requirements and analytical datasets; experience supporting feature engineering and model operations; proficiency in data manipulation, cleansing, and interpretation; ability to analyze results and present findings.
Business & Domain Context: Ability to bridge data science, epidemiology, and business context, ensuring solutions are impactful and aligned with commercial and healthcare objectives.
Communication & Collaboration: Clear written and verbal communication to convey complex methods and outcomes to technical and non-technical stakeholders; comfortable working cross-functionally and in cross-cultural environments.
Operational Excellence: Experience with support/maintenance projects, ticket handling, and change management; experience with Git and CI/CD workflows.

Desirable Skills/Experience:

Agentic AI & Advanced AI Systems: Experience designing or working with Agentic AI frameworks, multi-agent systems, or autonomous decision pipelines, including orchestration of AI agents for complex problem-solving.
Model Enablement: Partner with Commercial Data Science and Advanced Analytics teams to translate business logic into technical requirements; prepare analytical datasets and features; optimize queries and data flows for model training, scoring, and monitoring.
Applied Analytics: Conduct analyses using HCP and de-identified patient-level healthcare data to generate disease insights, lead generation, targeting, segmentation, and field alerts; maintain scalable analytical assets for recurring use cases.
Methods & Tools: Contribute to predictive modeling workflows (e.g., regression, classification, clustering, time series); support development and optimization of ML tools that improve patient finding and HCP identification; build dashboards and reports in Tableau, Power BI, Qlik, or Databricks/Dataiku notebooks.
Collaboration: Work with Commercial partners and the internal AI community to identify value-driving opportunities and share best practices; support ad hoc analyses and production operations as needed.
Big Data & Distributed Computing: Knowledge of Hive, Spark, Scala, HDFS; designing solutions for large-scale distributed environments.
APIs & Integration: Experience with HTTP requests/responses and RESTful services for data ingestion and application integration.
Visualization & BI: Experience with Tableau, Qlik, Power BI, and Excel-based reporting.
Domain & Tools: Working knowledge of Salesforce/Veeva CRM, data governance practices, and data mining algorithms; consulting, healthcare, or biopharma experience; hands-on work with EHR, claims, and lab data ecosystems.
ML Enablement: Experience operationalizing ML pipelines (training, scoring, monitoring) and optimizing data flows for model performance in Databricks or Dataiku.

When we put unexpected teams in the same room, we unleash bold thinking with the power to inspire life-changing medicines. In-person working gives us the platform we need to connect, work at pace and challenge perceptions. That's why we work, on average, a minimum of three days per week from the office. But that doesn't mean we're not flexible. We balance the expectation of being in the office while respecting individual flexibility. Join us in our unique and ambitious world.

At AstraZeneca's Alexion division for Rare Diseases, you'll find an environment where innovation thrives! Our commitment to patients drives us forward every day. Here you'll be part of a team that values diversity of thought as much as diversity of people. We believe in empowering our employees through tailored development programs that align personal growth with our mission. With a rapidly expanding portfolio, a dynamic biotech atmosphere, and the resources of a global biopharma leader, your career here is more than just a path: it's a journey towards making a difference where it truly counts.

Ready to make an impact? Apply now!

Date Posted

22-Jun-2026

Closing Date

04-Jul-2026

Alexion is proud to be an Equal Employment Opportunity and Affirmative Action employer. We are committed to fostering a culture of belonging where every single person can belong because of their uniqueness. The Company will not make decisions about employment, training, compensation, promotion, and other terms and conditions of employment based on race, color, religion, creed or lack thereof, sex, sexual orientation, age, ancestry, national origin, ethnicity, citizenship status, marital status, pregnancy (including childbirth, breastfeeding, or related medical conditions), parental status (including adoption or surrogacy), military status, protected veteran status, disability, medical condition, gender identity or expression, genetic information, mental illness, or other characteristics protected by law. Alexion provides reasonable accommodations to meet the needs of candidates and employees. To begin an interactive dialogue with Alexion regarding an accommodation, please contact [email protected]. Alexion participates in E-Verify.

Skills Required

BS/MS in a quantitative field (Computer Science, Data Science, Engineering, Information Systems, Economics)
At least 4 years of relevant data engineering/analytics engineering experience
Proficiency with Python
Proficiency with SQL
Hands-on experience with Snowflake
Hands-on experience with AWS (S3, Lambda)
Hands-on experience with Amazon Redshift
Hands-on experience with Apache Spark
Hands-on experience with Apache Airflow (workflow orchestration)
Experience building ETL/ELT pipelines, data integration, data quality, lineage, and governance
Solid grounding in statistics, experimental design, and advanced analytics
Hands-on experience with EMR/EHR, claims, laboratory, and epidemiological healthcare datasets
Experience with Git and CI/CD workflows
Familiarity with MongoDB, Oracle, Teradata, and/or SQL Server
Experience with Generative AI, Agentic AI, or multi-agent/autonomous AI systems
Experience operationalizing ML pipelines and working with Databricks or Dataiku
Experience with Tableau, Power BI, Qlik or Databricks/Dataiku notebooks for visualization and reporting
Knowledge of big-data/distributed computing technologies (Hive, Scala, HDFS)
Experience integrating data via RESTful APIs/HTTP
Working knowledge of Salesforce/Veeva CRM and healthcare commercial tools

The Company

90,000 Employees

Year Founded: 1999

What We Do

AstraZeneca is a global, science-led, patient-focused biopharmaceutical company committed to excellence in the research, development, and commercialization of prescription medicines. With approximately 90,000 employees across 85 countries, the company aims to unlock the power of science to deliver innovative medicines that transform patient outcomes and improve healthcare for people, society, and the planet worldwide.