Data analysis is the part of data science and data analytics concerned with examining data to answer questions and support decisions. The data analysis process involves inspecting, cleaning, transforming and modeling data to draw useful insights from it.

## Types of Data Analysis

- Descriptive analysis
- Diagnostic analysis
- Exploratory analysis
- Inferential analysis
- Predictive analysis
- Causal analysis
- Mechanistic analysis
- Prescriptive analysis

With its many methodologies and techniques, data analysis is used in a variety of fields, including energy, healthcare and marketing. As data analytics technology advances, data analysis plays a growing role in business decision-making, supporting faster and better-informed decisions that can reduce risk and human bias.

That said, there are different kinds of data analysis with different goals. We’ll examine each one below.

## Two Camps of Data Analysis

Data analysis can be divided into two camps, according to the book *R for Data Science*:

- **Hypothesis Generation:** This involves looking deeply at the data and combining your domain knowledge to generate hypotheses about why the data behaves the way it does.
- **Hypothesis Confirmation:** This involves using a precise mathematical model to generate falsifiable predictions with statistical sophistication to confirm your prior hypotheses.

## Types of Data Analysis

Data analysis can be separated and organized into types, arranged in an increasing order of complexity.

### 1. Descriptive Analysis

The goal of descriptive analysis is to describe or summarize a set of data. Here’s what you need to know:

- Descriptive analysis is the very first analysis performed in the data analysis process.
- It generates simple summaries of samples and measurements.
- It involves common, descriptive statistics like measures of central tendency, variability, frequency and position.

#### Descriptive Analysis Example

Take the Covid-19 statistics page on Google, for example. The line graph is a pure summary of the cases/deaths, a presentation and description of the population of a particular country infected by the virus.

Descriptive analysis is the first step in analysis where you summarize and describe the data you have using descriptive statistics, and the result is a simple presentation of your data.
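As a minimal sketch of the summary statistics described above, the snippet below uses Python's standard `statistics` module. The `daily_cases` values are invented for illustration and are not real Covid-19 figures.

```python
import statistics

# Hypothetical daily case counts, invented purely for illustration.
daily_cases = [120, 135, 150, 150, 160, 175, 210]

summary = {
    "mean": statistics.mean(daily_cases),                 # central tendency
    "median": statistics.median(daily_cases),             # central tendency
    "mode": statistics.mode(daily_cases),                 # frequency
    "stdev": statistics.stdev(daily_cases),               # variability (sample)
    "range": max(daily_cases) - min(daily_cases),         # variability
    "quartiles": statistics.quantiles(daily_cases, n=4),  # position
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Each entry maps to one of the families of descriptive statistics: central tendency, variability, frequency and position.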

### 2. Diagnostic Analysis

Diagnostic analysis seeks to answer the question “Why did this happen?” by taking a more in-depth look at data to uncover subtle patterns. Here’s what you need to know:

- Diagnostic analysis typically comes after descriptive analysis, taking initial findings and investigating why certain patterns in data happen.
- Diagnostic analysis may involve analyzing other related data sources, including past data, to reveal more insights into current data trends.
- Diagnostic analysis is ideal for further exploring patterns in data to explain anomalies.

#### Diagnostic Analysis Example

A footwear store wants to review its website traffic levels over the previous 12 months. Upon compiling and assessing the data, the company’s marketing team finds that June experienced above-average levels of traffic while July and August witnessed slightly lower levels of traffic.

To find out why this difference occurred, the marketing team takes a deeper look. Team members break down the data to focus on specific categories of footwear. They discover that pages featuring sandals and other beach-related footwear received a high number of views in June, while these numbers dropped in July and August.

Marketers may also review other factors like seasonal changes and company sales events to see if other variables could have contributed to this trend.
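The drill-down in this example could be sketched roughly as follows. The `page_views` records, category names and view counts are all hypothetical. The idea is simply to break the total down by category and see which category accounts for the change between months.

```python
from collections import defaultdict

# Hypothetical page-view records: (month, footwear category, views).
page_views = [
    ("June", "sandals", 9000), ("June", "sneakers", 4000), ("June", "boots", 1000),
    ("July", "sandals", 5000), ("July", "sneakers", 4100), ("July", "boots", 900),
    ("August", "sandals", 4500), ("August", "sneakers", 4200), ("August", "boots", 1100),
]

# Reorganize the records as views[category][month].
views = defaultdict(dict)
for month, category, count in page_views:
    views[category][month] = count

# Change in views per category from June to July.
change = {cat: months["July"] - months["June"] for cat, months in views.items()}

# The category with the most negative change explains most of the drop.
biggest_drop = min(change, key=change.get)
print(biggest_drop, change[biggest_drop])
```

With this made-up data, the overall dip comes almost entirely from one category, which is the kind of pattern a diagnostic analysis is meant to surface.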

### 3. Exploratory Analysis (EDA)

Exploratory analysis involves examining or exploring data and finding relationships between variables that were previously unknown. Here’s what you need to know:

- EDA helps you discover relationships between measures in your data, but these relationships are not evidence of causation, as captured by the phrase, “Correlation doesn’t imply causation.”
- It’s useful for discovering new connections and forming hypotheses. It drives design planning and data collection.

#### Exploratory Analysis Example

Climate change is an increasingly important topic as the global temperature has gradually risen over the years. One example of an exploratory data analysis on climate change involves taking the rise in temperature from 1950 to 2020 and the increase in human activities and industrialization to find relationships in the data. For example, you may take the number of factories, cars on the road and airplane flights to see how each correlates with the rise in temperature.

Exploratory analysis explores data to find relationships between measures without identifying the cause. It’s most useful when formulating hypotheses.
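One common way to quantify a relationship between two measures is the Pearson correlation coefficient. The sketch below computes it by hand. The helper `pearson_r` and both data series are assumptions made up for illustration, not real climate or aviation data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, one per decade, invented for illustration only.
flights_millions = [10, 15, 22, 28, 35, 39]
temp_anomaly_c = [0.1, 0.2, 0.35, 0.5, 0.7, 0.85]

r = pearson_r(flights_millions, temp_anomaly_c)
print(f"correlation: {r:.3f}")
```

A value of `r` near 1 or -1 signals a strong linear relationship worth turning into a hypothesis. It says nothing about which variable, if either, causes the other.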

### 4. Inferential Analysis

Inferential analysis involves using a small sample of data to infer information about a larger population of data.

Statistical modeling as a whole is about using a small amount of information to extrapolate and generalize to a larger group. Here’s what you need to know:

- Inferential analysis uses a sample that is representative of a population to produce an estimate, along with a measure of the uncertainty in that estimate, such as a standard error.
- The accuracy of inference depends heavily on your sampling scheme. If the sample isn’t representative of the population, the generalization will be inaccurate. This problem is known as sampling bias.

#### Inferential Analysis Example

A psychological study on the benefits of sleep might involve a total of 500 people. At follow-up, participants who slept seven to nine hours reported better overall attention spans and well-being, while those who slept less or more than that range reported reduced attention spans and energy. The 500 people in the study are just a tiny portion of the roughly 8 billion people in the world, so any conclusion about the larger population is an inference.

Inferential analysis uses a smaller sample to extrapolate and generalize to the larger group, producing estimates and predictions that come with a measure of uncertainty.
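To make the idea of an estimate with a measure of uncertainty concrete, here is a minimal sketch that computes a sample mean, its standard error and an approximate 95% confidence interval. The `sample` scores are hypothetical, and the normal approximation is an assumption made for brevity.

```python
import math
import statistics

# Hypothetical attention scores from a small sample of study participants.
sample = [72, 75, 68, 80, 77, 74, 71, 79, 76, 73]

n = len(sample)
mean = statistics.mean(sample)

# Standard error of the mean: sample standard deviation / sqrt(n).
std_err = statistics.stdev(sample) / math.sqrt(n)

# Normal approximation (assumption). With a sample this small, a
# t-distribution would give a somewhat wider interval.
z = statistics.NormalDist().inv_cdf(0.975)
ci = (mean - z * std_err, mean + z * std_err)

print(f"mean = {mean}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval only describes the population if the sample is representative of it. A biased sampling scheme produces a precise-looking answer to the wrong question.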

### 5. Predictive Analysis

Predictive analysis involves using historical or current data to find patterns and make predictions about the future. Here’s what you need to know:

- The accuracy of the predictions depends on the input variables.
- Accuracy also depends on the types of models. A linear model might work well in some cases, and in other cases it might not.
- Using a variable to predict another one doesn’t denote a causal relationship.

#### Predictive Analysis Example

U.S. presidential elections are a popular target for prediction models, many of which are built to predict the winning candidate. FiveThirtyEight did this to forecast the 2016 and 2020 elections. Predictive analysis for an election requires input variables such as historical polling data, trends and current polling data in order to return a good prediction. Something as large as an election wouldn’t rely on a simple linear model alone, but on a more complex model tuned to best serve its purpose.
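Real election forecasts are far more complex, but the basic mechanics of fitting a model to past data and applying it to a new input can be sketched with a deliberately simple linear model. The `fit_line` helper, the polling figures and the results are all hypothetical, and this is not how FiveThirtyEight or any other forecaster builds its models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical past races: final polling average (%) and actual vote share (%).
polls = [44.0, 47.0, 50.0, 52.0, 55.0]
results = [45.0, 47.5, 49.0, 52.5, 54.0]

slope, intercept = fit_line(polls, results)

# Predict the vote share for a new race where the candidate polls at 51%.
predicted = slope * 51.0 + intercept
print(f"predicted vote share: {predicted:.1f}%")
```

The prediction is only as good as the input variable and the choice of model. A good fit here also would not show that polls cause results.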

### 6. Causal Analysis

Causal analysis looks at cause-and-effect relationships between variables and is focused on finding the cause of a correlation. This way, researchers can examine how a change in one variable affects another. Here’s what you need to know:

- To find the cause, you have to question whether the observed correlations driving your conclusion are valid. Just looking at the surface data won’t help you discover the hidden mechanisms underlying the correlations.
- Causal analysis is applied in randomized studies focused on identifying causation.
- Causal analysis is the gold standard in data analysis and scientific studies where the cause of a phenomenon is to be extracted and singled out, like separating wheat from chaff.
- Good data is hard to find and requires expensive research and studies. These studies are analyzed in aggregate (multiple groups), and the observed relationships are just average effects (mean) of the whole population. This means the results might not apply to everyone.

#### Causal Analysis Example

Say you want to test whether a new drug improves human strength and focus. To do that, you run a randomized controlled trial. You randomly assign participants to receive either the new drug or a placebo, then compare the two groups on tests of strength, focus and attention. Because assignment is random, differences in outcomes between the groups can be attributed to the drug.
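The logic of a randomized trial can be illustrated with a small simulation. Everything here is assumed for the sake of the sketch: the number of participants, the score distribution and `TRUE_EFFECT`, which is built into the simulated data so the estimate has something to recover.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Randomly assign 100 hypothetical participants to treatment or control.
participants = list(range(100))
rng.shuffle(participants)
treatment_ids = set(participants[:50])

TRUE_EFFECT = 5.0  # assumed effect of the drug, built into the simulation

def simulated_score(person_id):
    """Simulated focus score: noisy baseline plus the effect if treated."""
    baseline = rng.gauss(50, 10)
    return baseline + (TRUE_EFFECT if person_id in treatment_ids else 0.0)

scores = {pid: simulated_score(pid) for pid in range(100)}
treated = [s for pid, s in scores.items() if pid in treatment_ids]
control = [s for pid, s in scores.items() if pid not in treatment_ids]

# Difference in group means estimates the average effect of the drug.
estimated_effect = statistics.mean(treated) - statistics.mean(control)
print(f"estimated average effect: {estimated_effect:.2f}")
```

The estimate is an average effect across the group. It will differ from `TRUE_EFFECT` because of random noise, and it says nothing about how any one individual responds.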

### 7. Mechanistic Analysis

Mechanistic analysis is used to understand exact changes in variables that lead to other changes in other variables. In some ways, it is a predictive analysis, but it’s modified to tackle studies that require high precision and meticulous methodologies for physical or engineering science. Here’s what you need to know:

- It’s applied in the physical or engineering sciences, in situations that require high precision and leave little room for error. The only noise in the data is measurement error.
- It’s designed to understand a biological or behavioral process, the pathophysiology of a disease or the mechanism of action of an intervention.

#### Mechanistic Analysis Example

Say an experiment is done to simulate safe and effective nuclear fusion to power the world. A mechanistic analysis of the study would entail a precise balance of controlling and manipulating variables with highly accurate measures of both variables and the desired outcomes. It’s this intricate and meticulous modus operandi toward these big topics that allows for scientific breakthroughs and advancement of society.
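A fusion experiment is too involved to sketch in a few lines, so the snippet below swaps in a much simpler physical system: an object in free fall, where the governing equation d = 0.5 * g * t² is known exactly. The measurements are hypothetical. The point is that once the mechanism is known, the only thing left unexplained in the data is measurement error.

```python
# Known physical model (free fall from rest): d = 0.5 * g * t**2
times_s = [0.5, 1.0, 1.5, 2.0]

# Hypothetical measured distances in meters, with small measurement error.
measured_m = [1.24, 4.89, 11.05, 19.60]

# Rewrite the model as d = g * x with x = 0.5 * t**2, then estimate g
# by least squares through the origin: g = sum(x * d) / sum(x * x).
xs = [0.5 * t ** 2 for t in times_s]
g_estimate = sum(x * d for x, d in zip(xs, measured_m)) / sum(x * x for x in xs)

# Residuals are what the model leaves unexplained: the measurement error.
residuals = [d - g_estimate * x for x, d in zip(xs, measured_m)]

print(f"estimated g: {g_estimate:.3f} m/s^2")
print("residuals:", [round(r, 3) for r in residuals])
```

Unlike a purely predictive model, every term here has a physical meaning, which is what allows the exact effect of changing one variable on another to be stated.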

### 8. Prescriptive Analysis

Prescriptive analysis compiles insights from other previous data analyses and determines actions that teams or companies can take to prepare for predicted trends. Here’s what you need to know:

- Prescriptive analysis may come right after predictive analysis, but it may involve combining many different data analyses.
- Companies need advanced technology and plenty of resources to conduct prescriptive analysis. Artificial intelligence systems that process data and adjust automated tasks are an example of the technology required to perform prescriptive analysis.

#### Prescriptive Analysis Example

Prescriptive analysis is pervasive in everyday life, driving the curated content users consume on social media. On platforms like TikTok and Instagram, algorithms can apply prescriptive analysis to review past content a user has engaged with and the kinds of behaviors they exhibited with specific posts. Based on these factors, an algorithm seeks out similar content that is likely to elicit the same response and recommends it on a user’s personal feed.
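A heavily simplified sketch of this idea follows: score topics by a user's past engagement, then rank candidate posts by those scores. The engagement log, the `ACTION_WEIGHTS` values and the post IDs are all hypothetical, and production recommendation systems are far more sophisticated than this.

```python
from collections import Counter

# Hypothetical engagement log for one user: (topic, action).
engagement_log = [
    ("cooking", "like"), ("cooking", "share"), ("travel", "view"),
    ("cooking", "view"), ("fitness", "like"), ("travel", "like"),
]

# Assumed weights: stronger actions count for more.
ACTION_WEIGHTS = {"view": 1, "like": 3, "share": 5}

# Insight from past behavior: how much the user engages with each topic.
topic_scores = Counter()
for topic, action in engagement_log:
    topic_scores[topic] += ACTION_WEIGHTS[action]

# Action: rank candidate posts by the score of their topic.
candidates = [
    {"id": "post-1", "topic": "travel"},
    {"id": "post-2", "topic": "cooking"},
    {"id": "post-3", "topic": "gardening"},
]
ranked = sorted(candidates, key=lambda post: topic_scores[post["topic"]], reverse=True)

print([post["id"] for post in ranked])
```

The output is a decision (what to show next) rather than a summary or a forecast, which is what distinguishes prescriptive analysis from the earlier types.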

## When to Use the Different Types of Data Analysis

- **Descriptive analysis** summarizes the data at hand and presents your data in a comprehensible way.
- **Diagnostic analysis** takes a more detailed look at data to reveal why certain patterns occur, making it a good method for explaining anomalies.
- **Exploratory data analysis** helps you discover correlations and relationships between variables in your data.
- **Inferential analysis** is for generalizing the larger population with a smaller sample size of data.
- **Predictive analysis** helps you make predictions about the future with data.
- **Causal analysis** emphasizes finding the cause of a correlation between variables.
- **Mechanistic analysis** is for measuring the exact changes in variables that lead to other changes in other variables.
- **Prescriptive analysis** combines insights from different data analyses to develop a course of action teams and companies can take to capitalize on predicted outcomes.

A few important tips to remember about data analysis include:

- Correlation doesn’t imply causation.
- EDA helps discover new connections and form hypotheses.
- Accuracy of inference depends on the sampling scheme.
- A good prediction depends on the right input variables.
- A simple linear model with enough data often does the trick, though not in every case.
- Using a variable to predict another doesn’t denote causal relationships.
- Good data is hard to find, and to produce it requires expensive research.
- Study results are analyzed in aggregate, reflect average effects and might not apply to everyone.

## Frequently Asked Questions

### What is an example of data analysis?

A marketing team reviews a company’s web traffic over the past 12 months. To understand why traffic rises and falls during certain months, the team breaks down the data by shoe type, seasonal patterns and sales events. Based on this in-depth analysis, the team can determine which variables influenced web traffic and make adjustments as needed.

### How do you know which data analysis method to use?

Selecting a data analysis method depends on the goals of the analysis and the complexity of the task, among other factors. It’s best to assess the circumstances and consider the pros and cons of each type of data analysis before moving forward with a particular method.