A Day in the Life of a Data Engineer

For data engineers to get the best insights possible, cutting-edge tools and technical prowess are a must. But there’s another important element needed, too: great communication skills.

Written by Adrienne Teeley
Published on Feb. 12, 2021

Good luck making a business decision these days without solid data on hand to back it up. 

Instead of gambling on gut feelings, successful businesses build roadmaps around key information derived from raw data, which has been packaged neatly by talented data engineers. For data engineers to get the best insights possible, cutting-edge tools and technical prowess are a must. But according to David Shepard, a senior data engineer at Evidation Health, there’s another important element needed: great communication skills. 

“A data engineer mediates between different groups who understand data in different ways: data scientists, software engineers, DevOps, product owners and management,” Shepard said.

The ability to work cross-functionally isn’t a nice-to-have trait, Kelsea Tomaino, a data engineer for Grand Rounds, said. It’s a vital part of the gig. 

“Being a data engineer requires you to have a wide breadth of knowledge and a thorough understanding of the underlying data that powers a company’s products and influences decision-making,” Tomaino said. “This knowledge is unlocked when you are able to collaborate with other teams and subject-matter experts to gain valuable insight on how best to use the data we have.”

It might sound like a lot to juggle, but for data engineers, it’s just another day. To find out how they pull it all off, we connected with Tomaino and Shepard to glean some insights of our own.

 

Grand Rounds team
Kelsea Tomaino
Data Engineer • Included Health

What they do: Grand Rounds is an employee benefit that was built to help users navigate the complicated world of healthcare and insurance. The company’s platform strives to be a “personal healthcare assistant” that painlessly connects patients with top-rated doctors, and walks employees through deciphering their insurance. 

 

What’s a typical day like for you? 

My typical day consists of a variety of tasks. I spend a portion of my time working with other teams to identify how our data can support the needs of the business, whether it be for our internal tools, applications or reporting needs. This involves developing an understanding of the raw shape of the data, and how it needs to be processed and transformed into a usable form for our data scientists, data analysts or application engineers. 

I am responsible for figuring out how to get a piece of data from one point to another and finding the best way to model it so that the important fields are readily available to our different systems. Part of my role as a data engineer requires me to work closely with team members on a daily basis to diagnose any issues that arise, so that we are consistently delivering the most up-to-date information. 

At Grand Rounds, we use a Spark-based pipeline to process a variety of datasets every single day. Because the organization’s data requirements are constantly changing, our pipeline needs to accommodate changes in the volume of incoming data. A typical day requires some analysis of whether we are processing these large datasets as efficiently as possible.
 

“One of the most important characteristics of a successful data engineer is having strong interpersonal skills.”


Tell us about a project you’re working on right now that you’re really excited about. 

Recently, I worked on a cross-functional engineering team to build a new kind of digital healthcare assistant. Many of our large customers encourage their employees to get an annual physical and speak to a doctor. Grand Rounds helps route those people to the right level of care. Most people have relatively simple healthcare needs and may have already seen a PCP, while others might benefit from some of our advanced services. 

To implement this new feature, we worked with a third party to ingest initial health-screening data. This was used to power data science algorithms to identify high-risk members. The data integration was particularly challenging because the data had to be transformed to conform to a certain standard that would be easily integrated into our existing pipeline. 

At Grand Rounds, we use the FHIR (Fast Healthcare Interoperability Resources) specification to model electronic health records and member information, including third party data. In addition to the modeling, a lot of my time was spent validating the match rates between incoming data and pre-existing data. Associating healthcare information with the right person is challenging and requires expert knowledge of our internal identity resolution framework. This association is important so that our employees engaging with our members have all the information they need to understand someone’s complete healthcare journey.
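For readers curious what "validating the match rate" between incoming and pre-existing data can mean in practice, here is a minimal sketch in plain Python. It is not Grand Rounds' identity resolution framework, which the interview does not describe. The field names (`first`, `last`, `dob`), the helper names and the sample records are all hypothetical, and real matching would be far more tolerant of variation than this exact-key comparison.

```python
from typing import Iterable


def normalize_key(first: str, last: str, dob: str) -> tuple[str, str, str]:
    """Build a crude matching key (hypothetical; real identity resolution is more involved)."""
    return (first.strip().lower(), last.strip().lower(), dob.strip())


def match_rate(incoming: Iterable[dict], existing: Iterable[dict]) -> float:
    """Return the fraction of incoming records whose key appears among existing records."""
    known = {normalize_key(r["first"], r["last"], r["dob"]) for r in existing}
    incoming = list(incoming)
    if not incoming:
        return 0.0
    matched = sum(
        1
        for r in incoming
        if normalize_key(r["first"], r["last"], r["dob"]) in known
    )
    return matched / len(incoming)


# Made-up sample data, for illustration only.
existing_members = [
    {"first": "Jordan", "last": "Rivera", "dob": "1980-03-14"},
    {"first": "Sam", "last": "Okafor", "dob": "1975-11-02"},
]
screening_records = [
    {"first": " jordan ", "last": "RIVERA", "dob": "1980-03-14"},  # matches after normalizing
    {"first": "Taylor", "last": "Nguyen", "dob": "1990-07-21"},    # no existing member
]

rate = match_rate(screening_records, existing_members)
print(f"match rate: {rate:.0%}")
```

A low rate in a check like this would be a prompt to inspect the unmatched records before the data reaches anyone relying on it.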

 

What’s the most important skill a data engineer needs to be successful in their role?

I think that one of the most important characteristics of a successful data engineer is having strong interpersonal skills. Being a data engineer requires you to have a wide breadth of knowledge and a thorough understanding of the underlying data that powers a company’s products and influences decision-making. This knowledge is unlocked when you are able to collaborate with other teams and subject-matter experts to gain valuable insight on how best to use the data we have. 

At Grand Rounds, we are lucky to work with folks across departments who have a deep understanding of healthcare data. They share their knowledge to ensure we are interpreting data correctly and building the most impactful products. Being good communicators allows us to leverage their expertise and work with other teams to make smart design decisions. 

An example of this was when I was building a tool to assist the marketing team with their email campaigns. To make sure I was building the right thing, I had to really understand their needs, their jobs and how I could make their lives easier. Being a good data engineer means being an active listener during conversations involving people across different teams and being able to effectively communicate the impact of data engineering work across the organization.

 

Evidation Health
David Shepard
Senior Data Engineer • Evidation

What they do: Evidation Health is on a mission to help healthcare companies learn more about disease and health by branching out of the research clinic and allowing everyday people to participate in studies. The platform collects data securely from its users, then turns that data into insights that can lead to a better understanding of overall health. 

 

What’s a typical day like for you? 

My job is to make sure data scientists and clients have access to the right data at the right time. It’s a very cross-functional job: I spend part of my time writing code for our data platform and part of the time maintaining our system, but beyond that, I also interact with a large number of teams. 

On a typical day, I’ll start by checking on my projects to make sure the systems ran correctly overnight. Then I check in with my team and our product owners at our standup. I’ll also read through the Slack messages from the other teams whose data my project stores. Then, I’ll spend time writing code, fixing bugs or building new features. Of course, I spend a lot of time working with DevOps — my team can’t do their job without them.

The technology we work with is Python, Spark and AWS services, but we’re moving toward third-party software like Snowflake and ETL tools. I imagine a lot of organizations are going in this direction, too.

 

Tell us about a project you’re working on right now that you’re really excited about. 

Evidation recently developed an app designed to encourage healthy behaviors like exercising, eating right and resting, and launched it across a large population. The most challenging and rewarding part of working on this initiative has been scaling up. The first day, we got two terabytes of data. Bringing that data in took a lot of work: at one point, we had 50 worker nodes just ingesting the data and a large Spark cluster de-duplicating it. 

Working on this app has taught me a lot about testing. Testing code that runs at that scale is vital because the system operates on a tight, well-defined schedule that we have to be careful about changing. If a task fails, it takes a lot of work to re-run it, and we may not have time to do that and meet our other deadlines. The reward comes from knowing that this app is helping to educate people about healthy behaviors and helping people to take better care of themselves. Plus, I’ve learned a lot about scaling up. 
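To give a sense of the de-duplication step Shepard mentions, and of the kind of small check that can be run before a job goes out at scale, here is a minimal single-machine sketch in plain Python. It is not Evidation's pipeline, which runs on Spark. The record fields and the function names `record_fingerprint` and `deduplicate` are hypothetical, and the approach shown (hashing a canonical form of each record) is only one common way to spot exact duplicates.

```python
import hashlib
import json
from typing import Iterable, Iterator


def record_fingerprint(record: dict) -> str:
    """Hash a canonical JSON form so key order does not affect the result."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records: Iterable[dict]) -> Iterator[dict]:
    """Yield each distinct record once, keeping the first occurrence."""
    seen: set[str] = set()
    for record in records:
        fingerprint = record_fingerprint(record)
        if fingerprint not in seen:
            seen.add(fingerprint)
            yield record


# Made-up sample events, for illustration only.
events = [
    {"user_id": 1, "steps": 4200, "day": "2021-01-04"},
    {"day": "2021-01-04", "steps": 4200, "user_id": 1},  # same content, different key order
    {"user_id": 2, "steps": 9100, "day": "2021-01-04"},
]

unique_events = list(deduplicate(events))
print(len(unique_events), "unique events")
```

Running a check like this on a small sample is cheap compared with re-running a failed task on a tight schedule.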
 

“A data engineer mediates between different groups who understand data in different ways.”


What’s the most important skill a data engineer needs to be successful in their role?

Knowing data tools and software engineering is important, but aside from that, the most important skill a data engineer needs is communication. A data engineer mediates between different groups who understand data in different ways: data scientists, software engineers, DevOps, product owners and management. 

Communication takes many forms: talking to people, of course, but it also includes making expectations clear, writing documentation and keeping that documentation up to date. I’ve benefitted from my team’s commitment to writing good documentation and I’ve learned how important that is in my own projects. The clearer everyone’s understanding of a project’s data requirements is, the easier a data engineer’s job is, and the easier everyone else’s job is, too.

Responses edited for length and clarity. Photography provided by companies listed.