Data Engineer Specialist
Description:
Headquartered in the heart of downtown Chicago, CNA is a leading commercial and specialty insurer, offering a diverse range of insurance products including Workers' Compensation, Property, General Liability, Professional Liability, Cyber Insurance, Surety, and Warranty. We are one of the world's leading underwriters of non-medical professionals, from lawyers and accountants to architects and management consultants.
What CNA offers:
- A collaborative and growing analytics team with diverse skills and experiences, combined with deep expertise in insurance applications of data and analytics
- A modern cloud computing environment that enables you to explore data and to build and deploy sophisticated processes that impact key areas such as underwriting, pricing, claims management and risk control
- Sponsorship of continued professional growth through support for attending technical conferences, meetings and symposia
What we are looking for:
The successful candidate will:
- Work cross-functionally at CNA to build next-generation data capabilities that enable superior decision support and insight generation
- Support data and processes for pricing, underwriting, claims, operations, and marketing across an exciting mix of business insurance products
Essential Duties & Responsibilities:
- Assemble large and complex data sets from disparate data sources into consumable formats that meet business requirements
- Create efficient and reproducible ETL data pipelines using SQL, Python, or big data tools such as Spark
- Work closely with Data Science, DevOps and data management teams to assist with data-related technical issues and support their data infrastructure needs
- Build and maintain data quality controls that identify data quality issues and pipeline failures
- Build exploratory dashboards and tools for data scientists and business partners that can be deployed quickly and require little maintenance
- Create a streamlined process for geocoding internal data and matching it to external sources
- Collaborate with application owners to help define data collection requirements
- Work with Data Scientists to understand requirements, help design systems and processes that deliver business value, and research new uses for existing data
- Build the infrastructure required for flexible, scalable extraction, transformation, and loading of data from a wide variety of sources
- Design and implement functionality, participate in team code reviews, and provide feedback on performance, logic, best practices, and maintainability to ensure code-level consistency
- Create production-quality code to support deployment of predictive models
- Produce coherent documentation, metadata, and reports
- Own data processing pipelines from conception to production deployment
Required Skills, Knowledge & Abilities:
- Advanced SQL knowledge and proven experience working with relational databases
- Demonstrated experience manipulating, merging, cleaning, profiling, and preparing large datasets from disparate sources for analytics
- Working knowledge of Python, including pandas
- Experience working with XML and JSON formats
- Practical experience with version control, preferably Git
- Experience implementing and maintaining ETL and CI/CD data pipelines
- Ability to write efficient, well-documented data wrangling code
- Intellectual curiosity to find new and innovative ways to solve data management issues
- Ability to employ an array of technologies and tools to connect systems
- Strong analytical, problem solving and critical thinking skills
- Attention to detail and accuracy of work, ability to spot and correct issues
- Strong interpersonal and communication skills
- Drive to continuously improve and learn new tools and methods
- Ability to work collaboratively with colleagues with diverse perspectives and backgrounds
- Strong time management skills
- Capable of operating with little supervision and thinking independently and innovatively
Preferred Skills, Knowledge & Abilities:
- Experience with data pipeline and workflow management tools such as Airflow
- Experience with GCP cloud services such as BigQuery, Google Cloud Storage, and Google Cloud Functions
- Experience with distributed data processing technologies such as Spark
- Knowledge of R, including the dplyr and data.table packages
- Experience working with unstructured data
- Experience working with insurance data
- Familiarity with dashboarding tools such as Dash (Python), Shiny (R), or Tableau
- Experience in extracting meaningful information from data using visualization
Reporting Relationship:
Director or above
Education & Experience:
- Bachelor's degree with two or more years of relevant work experience