Job Description
Works directly with the Data Science and Predictive Analytics team to advance data engineering and infrastructure within Northwell, as well as migrate and support machine learning projects in production. Defines and builds the next iteration of features for the Data Science team and is responsible for modifying, expanding, and optimizing our data warehousing to include feature stores, big data, and cloud technologies. Works collaboratively with members from both the Information Technology and clinical functions to support data scientists, systems and initiatives at both the department and the enterprise level.
Job Responsibility
- Manages and coordinates implementation of systems and processes to meet user requirements; communicates with user departments and project teams regarding implementation activities and system changes to ensure feasibility; develops system specifications that appropriately meet department requirements.
- Builds and maintains infrastructure for data management and assembles datasets from disparate sources to meet team requirements.
- Maintains data acquisition processes and data pipelines; verifies data quality and ensures it through data cleaning and processing. Develops and optimizes ETL processes, implements transformations, and quality-checks results.
- Designs, develops, and maintains data pipelines between servers, databases, and other sources to support Data Science research and development and production pipelines.
- Identifies, designs, and implements process improvements for optimization, efficiency, greater scalability, and automation.
- Develops and shares best practices for code quality, versioning, repository, documentation, and data flows among the Data Science team.
- Assists data scientists, engineers, cloud architects, and subject matter advisors in testing, deploying, and maintaining artificial intelligence and machine learning algorithms.
- Facilitates deployment of machine learning and other models to production; monitors performance and health; and updates/retrains models.
- Works collaboratively to develop, construct, test, and maintain large-scale data processing systems and databases.
- Maintains development and production environments, both on-premises and cloud-based.
- Participates in projects to architect (research, recommend, design, develop, and deploy) advanced systems for the collection, aggregation, and analysis of data in alignment with business objectives.
- Provides big data technology assessments, strategies, and roadmaps in several technical domains and acts as a subject matter advisor on big data.
- Works with cross-functional research leadership and technical and analytical teams to understand current and future enterprise-wide big data analytics goals spanning disparate platforms and data types.
- Assists in ensuring that systems are implemented to support Health System initiatives and goals to improve the quality of patient care, to maximize patient safety, and to provide operational efficiencies.
- Demonstrates familiarity with current health system information systems.
- Operates under limited guidance; work assignments involve moderately complex to complex issues in which the analysis of situations or data requires in-depth evaluation of variable factors.
- Makes decisions on moderately complex to complex issues regarding technical approach and completion of own tasks/responsibilities of substantial complexity.
- Performs related duties as required. All responsibilities noted here are considered essential functions of the job under the Americans with Disabilities Act. Duties not mentioned here, but considered related, are not essential functions.
Job Qualification
- Bachelor's Degree in Computer Science, Informatics, Statistics, Engineering, Data Science, or related quantitative field required, or equivalent combination of education and related experience. Master's Degree, preferred.
- 3-5 years of experience with enterprise-level design and implementation of relational databases, big data pipelines, cloud computing, and other advanced data science and big data technologies, required.
- Advanced working knowledge and experience with Google BigQuery, required. Strong knowledge and experience using the following software and tools, required: SQL/NoSQL, Python, Git, data pipeline tools (e.g., Airflow), and cloud infrastructure (Google Cloud Platform preferred).
- Experience in building and optimizing data pipelines; deploying and maintaining production machine learning algorithms; and building, testing, and deploying code on cloud infrastructure, required.
- Experience with managing healthcare data, preferred.
- Experience with managing unstructured data and streaming data, preferred.
- Experience in architecting data warehouses and/or data lakes with traditional database enterprise-class RDBMS technologies, preferred.
- Strong knowledge of Business Intelligence & Analytics concepts and platforms, inclusive of data virtualization, data preparation, data visualization and advanced analytics technologies, preferred.
Additional Salary Detail
The salary range and/or hourly rate listed is a good faith determination of potential base compensation that may be offered to a successful applicant for this position at the time of this job advertisement and may be modified in the future. When determining a team member's base salary and/or rate, several factors may be considered as applicable (e.g., location, specialty, service line, years of relevant experience, education, credentials, negotiated contracts, budget and internal equity).
Skills Required
- Bachelor's Degree in Computer Science, Informatics, Statistics, Engineering, Data Science, or related quantitative field (or equivalent experience)
- Master's Degree (preferred)
- 3-5 years enterprise-level experience designing and implementing relational databases, big data pipelines, cloud computing and related technologies
- Advanced working knowledge and experience with Google BigQuery
- Experience with SQL and NoSQL
- Experience programming in Python
- Experience with Git (version control)
- Experience with data pipeline tools (e.g., Apache Airflow)
- Experience with cloud infrastructure (Google Cloud Platform preferred)
- Experience building and optimizing data pipelines, ETL processes, and data quality checks
- Experience deploying and maintaining production machine learning algorithms and model monitoring
- Experience building, testing, and deploying code on cloud infrastructure
- Experience managing healthcare data (preferred)
- Experience with unstructured data and streaming data (preferred)
- Experience architecting data warehouses and/or data lakes (preferred)
- Knowledge of Business Intelligence & Analytics concepts and platforms, including data virtualization, preparation, and visualization (preferred)
What We Do
Northwell Health is New York’s largest private employer and health care provider, with 23 hospitals and nearly 800 outpatient facilities. We care for over two million people annually in the New York metro area and beyond, thanks to philanthropic support from our communities. Our 74,000+ employees – 18,500+ nurses and 14,200+ credentialed physicians, including about 4,500 employed doctors and nearly 3,300 members of Northwell Health Physician Partners – are working to change health care for the better. We’re making breakthroughs in medicine at the Feinstein Institutes for Medical Research. We’re training the next generation of medical professionals at the visionary Donald and Barbara Zucker School of Medicine at Hofstra/Northwell and the Hofstra Northwell School of Graduate Nursing and Physician Assistant Studies. For information on our more than 100 medical specialties, visit Northwell.edu.





