What Skills Make a Desirable Data Scientist?

Four data leaders outline the most important and underrated skills in the field.
Quinten Dol
October 16, 2020
Updated: October 20, 2020

What does it take to be a successful data scientist?

When we put this question to data science leaders at four tech companies representing a cross section of the U.S. tech industry, we were surprised to learn that coding chops or traditional “full-stack” abilities didn’t necessarily top the list. Instead, less tangible qualities, like effective communication and listening skills, a troubleshooter’s approach to problems and specialized math skills, came first on their hiring wish lists.

 

1. Communication, Practicality and Curiosity

Says who?: Lacey Plache, Vice President of Data & Analytics at Age of Learning

The company: Age of Learning combines gaming elements with educational technology to create an online curriculum for pre-K, kindergarten and early elementary school students. The company uses data science to develop a deep understanding of the children who use its programs, influencing product development decisions as engineers iterate on existing platforms. The company says it plans to grow its data science team heading into 2021.

What are the most important skills for a data science professional?

“One of the most important skills for data scientists is communication. This includes mapping business questions into data science solutions, selling these solutions and translating the results into insights the business can act on. 

A second valuable skill is practicality. The most successful data scientists choose the right solution for the problem at hand: for example, not building a neural network when a logistic regression will suffice. They also understand the optimal point between what is necessary and sufficient to be confident in results, such as knowing when hyperparameter tuning is and isn’t needed.

Finally, data curiosity is an important skill that relies on instinct. It involves constantly thinking creatively about how to use data, asking questions about why and how patterns occur and trying different approaches, interpretations and viewpoints.”

How does your team leverage data science techniques and technology?

“We are currently using clustering algorithms, such as k-medoids and hierarchical clustering to identify high- and low-value user segments, which informs product strategy on how to increase user engagement. Additionally, item response theory has helped us measure ability levels in English learners, including pronunciation, listening comprehension and the difficulty of words and phrases for these learners. Through these findings, we can predict and influence positive learning outcomes by building personalized and adaptive learning paths for each child based on their ability level.”
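To make the item response theory idea concrete, here is a minimal sketch of a logistic IRT model, in which the chance of a correct answer depends on the gap between a learner's ability and an item's difficulty. The source does not say which IRT model or tooling Age of Learning uses; the function names, the item difficulties and the grid-search estimator below are all illustrative assumptions.

```python
import math


def p_correct(theta, difficulty, discrimination=1.0):
    """Probability that a learner with ability `theta` answers an item correctly.

    Logistic IRT curve; with discrimination fixed at 1.0 this is the
    one-parameter (Rasch) form.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def estimate_ability(responses, grid=None):
    """Maximum-likelihood ability estimate via a simple grid search.

    `responses` is a list of (item_difficulty, answered_correctly) pairs.
    The grid from -4.0 to 4.0 is an arbitrary illustrative range.
    """
    if grid is None:
        grid = [x / 10 for x in range(-40, 41)]

    def log_likelihood(theta):
        total = 0.0
        for difficulty, correct in responses:
            p = p_correct(theta, difficulty)
            total += math.log(p if correct else 1.0 - p)
        return total

    return max(grid, key=log_likelihood)


# Hypothetical item difficulties, from easy (-2) to hard (+2).
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
learner_a = list(zip(difficulties, [True, True, False, False, False]))
learner_b = list(zip(difficulties, [True, True, True, True, False]))

ability_a = estimate_ability(learner_a)
ability_b = estimate_ability(learner_b)
```

In this toy setup, the learner who answers more of the harder items correctly receives the higher ability estimate, which is the kind of signal an adaptive learning path could use to pick the next item.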

 

2. Math

Says who?: Niels Joaquin, Senior Director of Data Science at Maven Clinic

The company: Maven is a virtual healthcare clinic for women, offering consultations with a network of more than 1,700 women’s and family health providers via text and video conferencing. So-called “care advocates” are on hand to guide users through family planning programs, and the company also offers a benefits option for employers, health plans and individual customers. Maven uses data science to study how usage of its app can affect clinical outcomes. For example, how does engagement with care advocates, clinicians and content impact C-section rates or admissions to neonatal intensive care units?

What are the most important and/or underrated skills for a data science professional?

“I like data scientists with a broad background in applied math — not just machine learning, but also some subset of optimization, stochastic processes, Bayesian methods, discrete math. You won't always be able to use supervised learning, so when you’re faced with a novel problem, it’s important to be able to pull from different frameworks in your toolbelt.”

How will data science move the needle for your business heading into 2021?

“As we continue to onboard new clients and members, we need to scale our operations sustainably, but we also want to continue personalizing our experience at the same time. Machine learning enables both of these goals to happen: Our models help us handle a growing user base and smartly prioritize the at-risk members who are most in need of guidance during their pregnancy. But whether a member’s risk for adverse outcomes is high or low, predicting these outcomes allows us to customize her journey as she moves through Maven’s different ‘tracks’ like pregnancy or postpartum.”

 

3. Troubleshooting and Communication 

Says who?: Lauren Talbot, Vice President of Data and Analytics at BARK

The company: BARK oversees BarkBox, a leading subscription box service for pet owners and their companions, and BarkShop, an e-commerce destination for pet owners. As it heads into 2021, the company plans to use data science technology to drive decision-making around customer-facing features and functionality. The team is also applying its expertise to customer feedback comments, which it will use to recommend pet products that align with each individual.

What are the most important and underrated skills for a data science professional?

“Communication is the most important skill for a data science professional. Analysis and decision making almost always precede machine learning, which means the way another human receives the output of the data scientist often means the difference between moving the needle or not. But listening (the other side of communication) is just as crucial, since data science is in constant conversation with the business on the rules of engagement. 

When you add in the need to create models, data pipelines and software, context switching becomes the most underrated skill. Soft skills aside, the most important hard skill is troubleshooting ability. Data science professionals are swimming in new frameworks and open source libraries, not to mention new datasets. They need to be able to go into almost any situation somewhat cold and work up to competency quickly by asking the right questions and knowing when to double down or try a new approach.”

How does your team leverage data science techniques and technology?

“There’s a dose of data science almost everywhere you look. It’s in our reporting, business analysis, how we partner with the product teams, our data pipeline jobs — even in the data monitoring. The most basic applications tend to be in Periscope/Sisense, which is primarily a business intelligence tool, but we also use it for time series forecasting, statistical significance testing and even some model scoring. The ability to create filters and visualizations that interact with the data science components is really a nice perk, because it allows you to bring data science to a business audience in a place they feel comfortable. 

When we need to get more exploratory or really dig into a particular question with a fuller tool set, anything that can be displayed in a Jupyter Notebook without an insane amount of scrolling is fair game. We have our pip-installable ‘barkutils’ and ‘barkstats’ libraries where we put reusable data code along with whatever open source frameworks we find useful: Scikit, TensorFlow, Keras, H2O, PyTorch, SciPy, statsmodels, Polara, gensim and so on. We host a series of notebooks on Knowledge Repo with summaries and deep dives, sometimes for a business audience, sometimes just for posterity and reproducibility. When we work with the product teams we might find reasons to use machine learning in production. If it’s something that can be batched or creates high latency, putting it into Airflow alongside the other data pipelines (sometimes with AWS Lambda) is the easiest way to deploy. Recently we’ve moved some of our data science workflows to Google Cloud AI and Kubeflow. And for monitoring, we recently started using an anomaly detection platform called Monte Carlo. They send Slack alerts when there are unexpected data changes.

And then sometimes, when I’m between meetings, I’ll play with the Evan Miller sample size calculator.”
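For readers who have not used a sample size calculator, the sketch below shows the kind of arithmetic such a tool performs for an A/B test on conversion rates. It uses a standard textbook approximation for a two-sided, two-proportion z-test; it is not claimed to match the exact method of any particular calculator, and the function name and the example rates are hypothetical rather than BARK figures.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect a change in a rate.

    Standard normal-approximation formula for comparing two proportions
    with a two-sided test at significance `alpha` and the given `power`.
    """
    if baseline_rate == expected_rate:
        raise ValueError("rates must differ; a zero effect cannot be detected")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power

    pooled = (baseline_rate + expected_rate) / 2
    sd_null = math.sqrt(2 * pooled * (1 - pooled))
    sd_alt = math.sqrt(
        baseline_rate * (1 - baseline_rate) + expected_rate * (1 - expected_rate)
    )
    effect = abs(expected_rate - baseline_rate)

    return math.ceil(((z_alpha * sd_null + z_power * sd_alt) / effect) ** 2)


# Hypothetical example: detect a lift from a 10% to a 12% conversion rate.
n_per_group = sample_size_per_group(0.10, 0.12)
```

The useful intuition is that the required sample grows quickly as the effect shrinks: halving the detectable lift roughly quadruples the number of users needed in each group.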

4. Resource Allocation

Says who?: Sanjay Castelino, Chief Product Officer at Snow Software

The company: Snow Software helps enterprises optimize how they use software, apps, hardware and cloud technology, with a range of software products designed to offer visibility across an entire information technology ecosystem. Using its products, users can track and optimize spending and secure operations against cyber threats. The company leverages its data science team to refine the identification of software and improve internal efficiency and impact by helping decide where to focus a given team’s energy and attention.

What are the most underrated skills for a data science professional?

“I believe the most underrated skills for data science have little to do with data or science. They have to do with understanding the scope and impact of solving a problem, so that data scientists know where to focus efforts on getting the next one percent improvement in outcomes. Absent a good understanding of where to apply data science, we won’t achieve our outcomes because the methods and techniques are not a magic wand. They take disciplined experimentation and that takes time, and you have to decide where to invest time, when to stop experimenting and when the potential return is worth additional time.”

How will data science move the needle for your business heading into 2021?

“In 2021 we hope to leverage data science to drive more predictive forecasting for customers in the value we deliver to them. In particular, when it comes to usage and spend, we want to be able to offer customers a better understanding of where they are headed so that they can make better decisions about how technology is consumed in their organizations.”
