Data scientists tackle questions about the future. They start with big data, characterized by the three V’s: volume, variety and velocity. Then, they use it as fodder for algorithms and models. The most cutting-edge data scientists, working in machine learning and AI, make models that automatically self-improve, noting and learning from their mistakes.
Data scientists have changed almost every industry. In medicine, their algorithms help predict patient side effects. In sports, their models and metrics have redefined “athletic potential.” Data science applications have even tackled traffic, with route-optimization models that capture typical rush hours and weekend lulls.
Data Science Applications and Examples
- Healthcare: Data science can identify and predict disease, and personalize healthcare recommendations.
- Transportation: Data science can optimize shipping routes in real time.
- Sports: Data science can accurately evaluate athletes’ performance.
- Government: Data science can prevent tax evasion and predict incarceration rates.
- E-commerce: Data science can automate digital ad placement.
- Gaming: Data science can improve online gaming experiences.
- Social media: Data science can create algorithms to pinpoint compatible partners.
What Is Data Science?
Data science shouldn’t be confused with data analytics. Both fields are ways of understanding big data, and both often involve analyzing massive databases using R and Python. Because of this overlap, the two are often treated as a single field, but they differ in important ways.
For one, they have different relationships with time. Data analysts synthesize big data to answer concrete questions grounded in the past, e.g., “How has our subscriber base grown from 2018 to 2022?” In other words, they mine big data for insights on what’s already happened. Meanwhile, data scientists build on big data, creating models that can predict or analyze whatever comes next.
Of course, it’s impossible to perfectly model all the complexities of real life. As statistician George E.P. Box famously put it, “All models are wrong, but some are useful.” Still, data science at its best can make informed recommendations about key areas of uncertainty.
Below we’ve rounded up 22 examples of data science applications at work, in areas from e-commerce to healthcare.
Data Science in Healthcare Examples
Back in 2008, data science made its first major mark on the healthcare industry. Google staffers discovered they could map flu outbreaks in real time by tracking location data on flu-related searches. The CDC’s existing map of documented flu cases, FluView, was updated only once a week. Google quickly rolled out a competing tool with more frequent updates: Google Flu Trends.
But it didn’t work. In 2013, Google estimated about twice the flu cases that were actually observed. The tool’s secret methodology seemed to involve finding correlations between search term volume and flu cases. That meant the Flu Trends algorithm sometimes put too much stock in seasonal search terms like “high school basketball.”
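The failure mode described above can be sketched in a few lines of Python. This is only an illustration of correlation-based term selection in general, not Google’s actual (unpublished) method; the weekly counts are synthetic and the `pearson` helper is written here for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Synthetic weekly counts over one winter (made-up numbers, not real data).
flu_cases = [10, 20, 45, 80, 120, 90, 50, 20]
searches = {
    "flu symptoms": [12, 25, 50, 85, 110, 95, 55, 22],
    # Seasonal but medically unrelated: peaks in winter just like the flu.
    "high school basketball": [15, 30, 50, 75, 100, 85, 45, 25],
    "tax forms": [5, 5, 6, 8, 12, 20, 35, 60],
}

# Score every search term by how closely its volume tracks flu cases.
correlations = {term: pearson(volume, flu_cases) for term, volume in searches.items()}
for term, r in sorted(correlations.items(), key=lambda kv: -kv[1]):
    print(f"{term}: r = {r:.2f}")
```

A model that keeps whichever terms correlate best would hold on to “high school basketball” here, even though it says nothing about illness. That is the kind of seasonal coincidence the paragraph above describes.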
Even so, it demonstrated the serious potential of data science in healthcare. Here are some examples of more powerful and precise healthcare tools developed in the years after Google’s initial attempt. All of them are powered by data science.
Location: Mountain View, California
Google hasn’t abandoned applying data science to healthcare. In fact, the company developed a tool, LYNA, for identifying breast cancer tumors that metastasize to nearby lymph nodes. That can be difficult for the human eye to see, especially when the new cancer growth is small. In one trial, LYNA (short for Lymph Node Assistant) accurately identified metastatic cancer 99 percent of the time using its machine-learning algorithm. More testing is required, however, before doctors can use it in hospitals.
Location: Berlin, Germany
The popular Clue app employs data science to forecast users’ menstrual cycles and reproductive health by tracking cycle start dates, moods, stool type, hair condition and many other metrics. Behind the scenes, data scientists mine this wealth of anonymized data with tools like Python and Jupyter Notebook. Users are then algorithmically notified when they’re fertile, on the cusp of a period or at an elevated risk for conditions like an ectopic pregnancy.
Location: Philadelphia, Pennsylvania
Oncora’s software uses machine learning to create personalized recommendations for current cancer patients based on data from past ones. Healthcare facilities using the company’s platform include UT Health San Antonio and Scripps Health. Their radiology team collaborated with Oncora data scientists to mine 15 years’ worth of data on diagnoses, treatment plans, outcomes and side effects from more than 50,000 cancer records. Based on this data, Oncora’s algorithm learned to suggest personalized chemotherapy and radiation regimens.
Location: New York, New York and Boston, Massachusetts
Veeva is a cloud software company that provides data and software solutions for the healthcare industry. The company’s reach extends through clinical, regulatory and commercial medical fields. Veeva’s Vault EDC uses data science to clean clinical trial findings and help medical professionals make adjustments mid-study.
Data Science in Transportation and Logistics Examples
Driving plays a central role in American life. The Supreme Court has called it “a virtual necessity,” and the vast majority of Americans — 86 percent — own or lease cars. In 2021, American automobiles burned about 134 billion gallons of gasoline. Unfortunately, this habit contributes to climate change. That’s where data science comes in.
While both biking and public transit can curb driving-related emissions, data science can do the same by optimizing road routes. And though data-driven route adjustments are often small, they can help save thousands of gallons of gas when spread across hundreds of trips and vehicles — even among companies that aren’t explicitly eco-focused. Here are some examples of data science hitting the road.
Location: San Francisco, California
StreetLight uses data science to model traffic patterns for cars, bikes and pedestrians on North American streets. Based on a monthly influx of trillions of data points from smartphones, in-vehicle navigation devices and more, StreetLight’s traffic maps stay up to date. They’re more granular than mainstream maps apps too: they can identify groups of commuters that use multiple transit modes to get to work, like a train followed by a scooter. The company’s maps inform various city planning enterprises, including commuter transit design.
Location: San Francisco, California
The data scientists at UberEats have a fairly simple goal: getting hot food delivered quickly. Making that happen across the country, though, takes machine learning, advanced statistical modeling and staff meteorologists. In order to optimize the full delivery process, the team has to predict how every possible variable, from storms to holiday rushes, will impact traffic and cooking time.
Location: Atlanta, Georgia
UPS uses data science to optimize package transport from drop-off to delivery. The company’s integrated navigation system, ORION, helps drivers choose over 66,000 fuel-efficient routes. Using advanced algorithms, AI and machine learning, ORION has saved UPS approximately 100 million miles and 10 million gallons of fuel per year. The company continues to update ORION; the most recent version, rolled out in 2021, allowed drivers to reduce their routes by two to four miles.
Data Science in Sports Examples
In the early 2000s, the Oakland Athletics’ recruitment budget was so small the team couldn’t recruit quality players. At least, they couldn’t recruit players any other teams considered quality. So the general manager redefined quality, using in-game statistics other teams ignored to predict player potential and assemble a strong team despite their budget.
His strategy helped the A’s make the playoffs, and it snowballed from there. Author Michael Lewis wrote a book about the phenomenon, Moneyball, which spawned a film of the same name starring Brad Pitt. The global market for sports analytics is expected to reach $8.4 billion by 2026. Here are some examples of how data science is transforming sports beyond baseball.
Location: Tel Aviv, Israel
RSPCT’s shooting analysis system, adopted by NBA and college teams, relies on a sensor on a basketball hoop’s rim, whose tiny camera tracks exactly when and where the ball strikes on each basket attempt. It funnels that data to a device that displays shot details in real time and generates predictive insights.
“Based on our data… We can tell [a shooter], ‘If you are about to take the last shot to win the game, don’t take it from the top of the key, because your best location is actually the right corner,’” RSPCT COO Leo Moravtchik told SVG News.
Location: Boston, Massachusetts
WHOOP makes wearable devices that track athletes’ physical data like resting heart rate, sleep cycle and respiratory rate. The goal is to help athletes understand when to push their training and when to rest, and to make sure they’re taking the necessary steps to get the most out of their bodies. Professional athletes like Olympic sprinter Gabby Thomas, Olympic golfer Nelly Korda and PGA golfer Nick Watney are among WHOOP’s users, according to the company’s website.
Location: Austin, Texas
Trace provides soccer coaches with recording gear and an AI system that analyzes game film. Players wear a tracking device, called a Tracer, while its specially designed camera records the game. The AI bot then takes that footage and stitches together all of the most important moments in a game — from shots on goal to defensive lapses and more. This technology allows coaches and players to have more detailed insights from game film. Beyond stitching together clips, the software also provides performance metrics and a field heat map.
Data Science in Government Examples
Though few think of the U.S. government as “extremely online,” its agencies can access more data than Google and Meta combined. Not only do its agencies maintain their own databases of ID photos, fingerprints and phone activity, but government agents can also get warrants to obtain data from any American data warehouse. Investigators often reach out to Google’s warehouse, for instance, to get a list of the devices that were active at the scene of a crime.
Though many view such activity as an invasion of privacy, the United States has minimal privacy regulations. Even California’s radical privacy law offers citizens no protections against government monitoring. In short, the government’s data well won’t run dry anytime soon. Here are some of the ways government agencies apply data science to vast stores of data.
Location: Canton, Ohio
Widely used by the American judicial system and law enforcement, Equivant’s Northpointe software suite attempts to gauge an incarcerated person’s risk of reoffending. Its algorithms predict that risk based on a questionnaire that covers the person's employment status, education level and more. No questionnaire items explicitly address race, but according to a ProPublica analysis that was disputed by Northpointe, the Equivant algorithm pegs Black people as higher recidivism risks than white people 77 percent of the time — even when they’re the same age and gender, with similar criminal records. ProPublica also found that Equivant's predictions were 60 percent accurate.
Location: Washington, D.C.
U.S. Immigration and Customs Enforcement, a.k.a. ICE, has used facial recognition technology to mine driver’s license photo databases in at least two states, with the goal of deporting undocumented immigrants. The practice, which has sparked criticism on both ethical and technological grounds (facial recognition technology remains shaky), falls under the umbrella of data science. Facial recognition builds on photos of faces, a.k.a. raw data, with AI and machine learning capabilities.
Location: Washington, D.C.
Tax evasion costs the U.S. government $458 billion a year, by one estimate, so it’s no wonder the IRS has modernized its fraud-detection protocols in the digital age. To the dismay of privacy advocates, the agency has improved efficiency by constructing multidimensional taxpayer profiles from public social media data, assorted metadata, email analysis, electronic payment patterns and more. Based on those profiles, the agency forecasts individual tax returns; anyone with wildly different real and forecasted returns gets flagged for auditing.
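The forecast-versus-actual check described above can be sketched as follows. Everything here is hypothetical: the `TaxReturn` fields, the 50 percent threshold and the sample figures are assumptions made for illustration, and nothing in it reflects how the IRS actually scores returns.

```python
from dataclasses import dataclass

@dataclass
class TaxReturn:
    taxpayer_id: str
    forecast_income: float   # hypothetical output of a profile-based model
    reported_income: float   # what the taxpayer actually filed

def flag_for_review(returns, max_relative_gap=0.5):
    """Return IDs whose reported income differs from the forecast by more
    than max_relative_gap (expressed as a fraction of the forecast)."""
    flagged = []
    for r in returns:
        # Guard against division by zero for very small forecasts.
        gap = abs(r.reported_income - r.forecast_income) / max(r.forecast_income, 1.0)
        if gap > max_relative_gap:
            flagged.append(r.taxpayer_id)
    return flagged

# Made-up sample data.
returns = [
    TaxReturn("A", forecast_income=80_000, reported_income=78_000),
    TaxReturn("B", forecast_income=150_000, reported_income=40_000),
    TaxReturn("C", forecast_income=30_000, reported_income=31_500),
]

flagged = flag_for_review(returns)
print(flagged)
```

Only taxpayer B, whose filing falls far below the forecast, crosses the threshold. A real system would presumably weigh many more signals than a single gap.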
Data Science in Gaming Examples
The gaming industry is growing, and it’s using data science to help expand. The global video game market was valued at $195.65 billion in 2021, and is expected to grow by nearly 13 percent by 2030.
Data science and AI have been used in video games since as early as the 1950s with the creation of Nim, a mathematical strategy game in which two players take turns removing objects from piles. The innovation continued with Pac-Man, where AI and data science were used in the game’s mazes and to give the ghosts distinct personalities.
With the development of online, multi-player games like Call of Duty, World of Warcraft and Halo, the video game industry continues to find creative ways to implement data science and AI to improve game play and entertain millions of people across the globe. Here are just a few examples of how data science is used in video games.
Location: Santa Monica, California
Known as the company behind games with cult followings like Call of Duty, World of Warcraft, Candy Crush and Overwatch, Activision Blizzard uses big data to improve its online gaming experiences. For example, the company’s game science division analyzes gaming data to prevent empowerment (the attempt to improve someone else’s sports scores through negative means) among Call of Duty players. The company also uses machine learning to detect power boosting and to identify and track key indicators for increasing quality of game time.
Location: Novato, California
2K Games is a video game studio that has created popular titles like BioShock and Borderlands, as well as the WWE and PGA game series. The company’s growing game science team focuses on extracting gaming data and building models in order to improve its sports games like NBA 2K. Data scientists at 2K Games analyze player gameplay and economy telemetry data to understand player behavior and suggest actions to improve the player experience.
Location: San Francisco, California
Unity is a platform for creating and operating interactive, real-time 3D content, including games. The platform is used by gaming companies like Riot Games, Atari and Respawn Entertainment, according to its website. Unity uses gaming data to support data-driven decision-making within its product development team and to monitor business metrics.
Data Science in E-Commerce Examples
Once upon a time, everyone in a given town shopped at the same mall: a physical place with some indoor fountains, a jewelry kiosk and probably a Body Shop. Today, citizens of that same town can each shop in their own personalized digital mall — also known as the internet. Online retailers often automatically tailor their web storefronts based on viewers’ data profiles. That can mean tweaking page layouts and customizing spotlighted products, among other things. Some stores may also adjust prices based on what consumers seem able to pay, a practice called personalized pricing. Even websites that sell nothing feature personalized ads. Here are some examples of companies using data science to automatically personalize the online shopping experience.
Location: Boulder, Colorado
Sovrn brokers deals between advertisers and outlets like Bustle, ESPN and Encyclopedia Britannica. Since these deals happen millions of times a day, Sovrn has mined a lot of data for insights, which manifest in its intelligent advertising technology. Compatible with Google and Amazon’s server-to-server bidding platforms, its interface can monetize media with minimal human oversight — or, on the advertiser end, target campaigns to customers with specific intentions.
Location: San Francisco, California
Data science helped Airbnb totally revamp its search function. Once upon a time, it prioritized top-rated vacation rentals that were located a certain distance from a city’s center. That meant users could always find beautiful rentals, but not always in cool neighborhoods. Engineers solved that issue by boosting a rental’s search ranking if it’s in an area with a high density of Airbnb bookings. There’s still breathing room for quirkiness in the algorithm, too, so cities don’t dominate towns and users can stumble on the occasional rental treehouse.
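As a rough illustration of the idea, the sketch below boosts a listing’s score by how heavily booked its surrounding area is. The scoring formula, the `density_weight` parameter and the sample listings are all assumptions invented for this example; Airbnb’s real ranking model is not described in that detail here.

```python
def rank_listings(listings, density_weight=0.5):
    """Sort listings by rating, boosted by relative booking density of the area."""
    # Normalize against the busiest area; fall back to 1 to avoid dividing by zero.
    max_bookings = max(item["area_bookings"] for item in listings) or 1

    def score(item):
        density = item["area_bookings"] / max_bookings  # value between 0 and 1
        return item["rating"] * (1 + density_weight * density)

    return sorted(listings, key=score, reverse=True)

# Made-up listings.
listings = [
    {"name": "Quiet suburb villa", "rating": 4.9, "area_bookings": 40},
    {"name": "Popular-neighborhood loft", "rating": 4.6, "area_bookings": 900},
    {"name": "Treehouse", "rating": 4.8, "area_bookings": 5},
]

ranked = rank_listings(listings)
for item in ranked:
    print(item["name"])
```

With the boost applied, the loft in the heavily booked neighborhood outranks the slightly higher-rated villa. Setting `density_weight` to 0 falls back to ranking by rating alone.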
Location: Menlo Park, California
Instagram uses data science to target its sponsored posts, which hawk everything from trendy sneakers to influencers posting sponsored ads. The company’s data scientists pull data from Instagram as well as its owner, Meta, which has exhaustive web-tracking infrastructure and detailed information on many users, including age and education. From there, the team crafts algorithms that convert users’ likes and comments, their usage of other apps and their web history into predictions about the products they might buy.
Though Instagram’s advertising algorithms remain shrouded in mystery, they work impressively well, according to The Atlantic’s Amanda Mull: “I often feel like Instagram isn’t pushing products, but acting as a digital personal shopper I’m free to command.”
Location: New York, New York
Taboola uses deep learning, AI and large datasets to create engagement opportunities for advertisers and digital properties. Its discovery platform creates new monetization, audience and engagement opportunities by placing advertisements across a variety of online publishers and sites. The platform can expose readers to news, entertainment, topical information or advice as well as a new product or service. The company partners with outlets like USA Today, Bloomberg, Business Insider and MSN, according to its website.
Data Science in Social Platforms Examples
The rise of social networks has completely altered how people socialize. Romantic relationships unfold publicly on Venmo. Meta engineers can rifle through users’ birthday party invite lists. Friendship, acquaintanceship and coworker-ship all leave extensive online data trails.
Some argue that these trails — Meta friend lists or LinkedIn connections — don’t mean much. Anthropologist Robin Dunbar, for instance, has found that people can maintain only about 150 casual connections at a time; cognitively, humans can’t handle much more than that. In Dunbar’s view, racking up more than 150 digital connections says little about a person's day-to-day social life.
Catalogs of social network users’ most glancing acquaintances hold another kind of significance, though. Now that many relationships begin online, data about your social world impacts who you get to know next. Here are some examples of data science fostering human connection.
Location: West Hollywood, California
When singles match on Tinder, they can thank the company’s data scientists. A carefully crafted algorithm works behind the scenes, boosting the probability of matches. Once upon a time, this algorithm relied on users’ Elo scores, essentially an attractiveness ranking. Now, it prioritizes matches between active users, users near each other and users who seem like each other’s “types” based on their swiping history.
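For readers unfamiliar with Elo scores, here is a minimal sketch of the standard Elo rating update as used in chess. It shows how such a score moves in general; how Tinder translated swipes into “wins” and “losses” is not public, and the starting rating of 1500 and the factor `k=32` are conventional chess defaults assumed here.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expectation that A 'wins' against B under Elo."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update_ratings(rating_a, rating_b, score_a, k=32):
    """Return new (rating_a, rating_b) after one encounter.

    score_a is 1 for a win by A, 0 for a loss, 0.5 for a draw.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta

# Two evenly rated profiles; A 'wins' the encounter.
new_a, new_b = update_ratings(1500, 1500, score_a=1)
print(new_a, new_b)
```

The key property is that an outcome against a higher-rated counterpart moves a score more than the same outcome against a lower-rated one.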
Location: Menlo Park, California
Meta, of course, uses data science in various ways, but one of its buzzier data-driven features is the “People You May Know” sidebar, which appears on the social network’s home screen. Often creepily prescient, it’s based on a user’s friend list, the people they’ve been tagged with in photos and where they’ve worked and gone to school. It’s also based on “really good math,” according to the Washington Post — specifically, a type of data science known as network science, which essentially forecasts the growth of a user’s social network based on the growth of similar users’ networks.
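One of the simplest ideas in network science is to suggest people who share many mutual friends with a user. The sketch below implements that common-neighbors heuristic on a toy graph. It is a textbook simplification, not Meta’s actual system, and the names and friendships are invented.

```python
from collections import Counter

def suggest_connections(graph, user, top_n=3):
    """Rank non-friends of `user` by the number of mutual friends they share."""
    friends = graph.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and people they already know.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Sort by mutual-friend count (descending), then name, for a stable order.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# Toy friendship graph: each person maps to the set of their friends.
graph = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev", "eli"},
    "dev": {"ben", "cara"},
    "eli": {"cara"},
}

suggestions = suggest_connections(graph, "ana")
print(suggestions)
```

Here “dev” ranks first for “ana” because they share two mutual friends, while “eli” shares only one. Production systems presumably fold in far more signals, such as the photo tags, workplaces and schools mentioned above.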