Although Excel and Python overlap significantly in the tasks they support, many Excel users are unfamiliar with Python. Fortunately, Microsoft recently integrated Python with Excel spreadsheets. This upgrade significantly lowers the barrier to entry for Excel users who want to incorporate Python.

Thanks to the new Python/Excel integration, Excel users can explore and analyze data using Python operations within the familiar environment of an Excel spreadsheet. Further, the integrated version of Excel is appropriate for all levels of Python developers and Excel users, whether beginner or advanced. It also doesn’t require the hassles of development environment setup and package installations that often plague Python developers.

 

What Is Python?

Python is an object-oriented programming language used for a wide variety of software tasks. Common applications include web development, scientific computing and research, and data science and machine learning. Python is known for its readability and its vast community of developers. These qualities make Python a great language for beginners just getting started with software development.

Python is particularly useful for performing data analysis and data manipulation tasks. Because of its utility in these areas, Python is often used in conjunction with other data analysis tools like Excel. 

 

What Is Excel?

Excel is a popular software package for spreadsheet editing. It is commonly used for data-related tasks like data entry, applying formulas for column calculations, sorting and filtering numerical columns, and more. A wide variety of professions use it, including business analysts, sales and marketing teams, data scientists, and many others. 

Because of its widespread utility across industries, Excel is the most widely used spreadsheet editor program. Professionals working with data in any capacity most likely will have used Excel. So, given the overlap of their applications, Excel and Python are worth learning for anyone working with data. 

 

Excel Python Function    

Those familiar with the Python development community will recognize the Anaconda distribution, which the Excel/Python integration uses. This configuration is ideal since Anaconda is a secure and trusted enterprise-grade distribution for data analytics.

In order to use Python in Excel, you simply need to use the PY function. This allows you to perform common Python tasks like creating data visualizations, generating descriptive statistics, and training machine learning models. 

Specifically, the PY function allows you to use the Python data analysis library Pandas. With it, you can read data into a Pandas dataframe and calculate descriptive statistics such as the mean, sum and standard deviation. You can also execute more complex tasks like performing groupby operations and imputing missing values.

In this post, we will learn how to use Python in Excel. We will use the Excel PY function to perform data analysis using synthetic sales data. In our analysis, we will walk through the steps of importing Python packages, performing descriptive statistics, and generating data visualizations. 

 

How to Import Python Libraries    

Importing Python packages in Excel is a straightforward process. To do so, let’s open sample electronic product sales data in an Excel spreadsheet:

An Excel spreadsheet showing sales data
Image: Screenshot by the author.

Next, in order to work with Python, we need to navigate to Python mode. We can do this by clicking the Insert Python tab on the top of the spreadsheet:

The updated Excel ribbon
Image: Screenshot by the author.

Upon clicking on Insert Python, you will see options for Custom Python Formula and Explore Python Samples:

MS Excel navigation options
Image: Screenshot by the author.

Upon clicking on Custom Python Formula, you should see a green PY pop up on the left side of the formula bar:

The new PY function in MS Excel
Image: Screenshot by the author.

Another way to navigate to Python mode is by typing =PY in the formula bar and pressing enter.

By default, Python in Excel imports the following packages:

  1. Pandas
  2. NumPy
  3. Matplotlib
  4. Statsmodels
  5. Seaborn

Importing additional packages works the same way as in any other Python environment. Once you’re in Python mode, type import in the formula bar, followed by the name of the package you want to import. For example, to import Plotly, a visualization library that is not imported by default, we enter the following:

A command importing Plotly in Excel
Image: Screenshot by the author.

Another library that isn’t imported by default is Python’s regular expression library, re, which is used for manipulating text data. To import it, we do the following:

Importing the regular expressions library in Excel
Image: Screenshot by the author.
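
Since the screenshots don't carry over as text, here is a minimal sketch of what these import lines might look like. The product label is made up for illustration, and the availability check around Plotly is an assumption added only so the snippet also runs where Plotly is not installed. In Excel's Python mode, a plain import line should be all you need.

```python
import importlib.util
import re  # regular expressions; part of Python's standard library

# Plotly is imported the same way. The check below only keeps this sketch
# runnable in environments where Plotly is unavailable.
if importlib.util.find_spec("plotly") is not None:
    import plotly

# Hypothetical product label, used only to show that `re` is ready to use.
label = "Laptop - 15 inch (2023)"
year = re.search(r"\((\d{4})\)", label).group(1)
print(year)  # 2023
```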

 

Python Descriptive Statistics in Excel    

We can easily generate insights from our data using Python logic. To work with our data within Python, while in Python mode, we select the cells we would like to work with:

An Excel spreadsheet
Image: Screenshot by the author.

 

Using Pandas

To run this logic, we press Control + Enter:

A Pandas dataframe object in Excel
Image: Screenshot by the author.

This results in a Pandas dataframe object, so we can work with it in Excel the same way we would in a Python environment.
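
As a rough sketch of what happens here: selecting a range in Python mode inserts a reference through Excel's xl() function, which returns the selected cells as a dataframe. The range "A1:D6", the column names and the values below are hypothetical stand-ins for the sales data. Because xl() exists only inside Excel, the sketch builds an equivalent dataframe by hand.

```python
import pandas as pd

# Inside Excel's Python mode, the selected range would be referenced with
# something like:  df = xl("A1:D6", headers=True)
# xl() exists only inside Excel, so this sketch builds a stand-in by hand.
# Column names and values are hypothetical.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02", "2023-01-03"]
        ),
        "product": ["Laptop", "Phone", "Laptop", "Tablet", "Phone"],
        "units_sold": [3, 5, 2, 4, 6],
        "revenue": [3000.0, 2500.0, 2000.0, 1200.0, 3000.0],
    }
)

print(type(df).__name__)  # DataFrame
print(df.shape)           # (5, 4)
```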

 

Describe

The describe method is a quick and easy way to get high-level insights into your data. Within Python mode, you simply call the describe method on your Pandas dataframe object:

A Python command in Excel
Image: Screenshot by the author.

From there, you press Control + Enter to get high-level statistics describing your data. This includes the row count, the mean, standard deviation, quartiles, minimum and maximum values:

Data output in MS Excel
Image: Screenshot by the author.

A common practice in Python is to define variables that allow you to reference objects for later use. Let’s store our dataframe in a variable called df:

A Python command in Excel
Image: Screenshot by the author.

Next, we’ll call the describe method on our dataframe object:

A Python command in Excel
Image: Screenshot by the author.

These steps will result in the same output as before. 
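
The two steps above can be sketched in plain Pandas as follows. The small dataframe is a hypothetical stand-in for the sales range, which in Excel would come from the selected cells, and the column names are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the sales range selected in Excel.
df = pd.DataFrame(
    {
        "units_sold": [3, 5, 2, 4, 6],
        "revenue": [3000.0, 2500.0, 2000.0, 1200.0, 3000.0],
    }
)

# describe() summarizes every numeric column: count, mean, std,
# min, quartiles (25%, 50%, 75%) and max.
summary = df.describe()
print(summary)
```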

 

Sum

If you’re interested in taking the sum of values in a numerical column, you can use the sum method on the dataframe object:

A Python command in Excel
Image: Screenshot by the author.
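
A minimal sketch of the sum calculation, again using a hypothetical dataframe with assumed column names in place of the real sales data:

```python
import pandas as pd

# Hypothetical stand-in for the sales range selected in Excel.
df = pd.DataFrame(
    {
        "product": ["Laptop", "Phone", "Laptop", "Tablet", "Phone"],
        "units_sold": [3, 5, 2, 4, 6],
        "revenue": [3000.0, 2500.0, 2000.0, 1200.0, 3000.0],
    }
)

# Sum of a single numerical column.
total_revenue = df["revenue"].sum()
print(total_revenue)  # 11700.0

# Sum of every numerical column at once; text columns are skipped.
numeric_totals = df.sum(numeric_only=True)
print(numeric_totals)
```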

 

Mean

We can do something similar for calculating the mean of a numerical column. Let’s call the mean method on our dataframe object:

A Python command in Excel
Image: Screenshot by the author.
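
The mean works the same way. This sketch assumes the same hypothetical columns as before:

```python
import pandas as pd

# Hypothetical stand-in for the sales range selected in Excel.
df = pd.DataFrame(
    {
        "product": ["Laptop", "Phone", "Laptop", "Tablet", "Phone"],
        "units_sold": [3, 5, 2, 4, 6],
        "revenue": [3000.0, 2500.0, 2000.0, 1200.0, 3000.0],
    }
)

# Mean of a single numerical column.
average_revenue = df["revenue"].mean()
print(average_revenue)  # 2340.0

# Mean of every numerical column at once; text columns are skipped.
numeric_means = df.mean(numeric_only=True)
print(numeric_means)
```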

 

Groupby

A Python command in Excel
Image: Screenshot by the author.

This results in a series object:

A series object in Excel/Python
Image: Screenshot by the author.

We can expand this into Excel values:

Excel values
Image: Screenshot by the author.

This gives the following output:

A list of Excel values
Image: Screenshot by the author.
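
The exact groupby call is not recoverable from the screenshots, so the following is only a sketch of one plausible version: grouping the sales rows by date and summing revenue. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the sales range selected in Excel.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02", "2023-01-03"]
        ),
        "revenue": [3000.0, 2500.0, 2000.0, 1200.0, 3000.0],
    }
)

# Group rows by date and total the revenue for each day.
# Selecting a single column after groupby yields a Series indexed by date.
daily_revenue = df.groupby("date")["revenue"].sum()
print(type(daily_revenue).__name__)  # Series
print(daily_revenue)
```

In Excel, a result like this first appears in the cell as a Python object. Expanding it into Excel values spills the dates and totals into worksheet cells.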

 

Python Plots in Excel    

We can use the Pandas plot method to generate a time series line plot of our sales data. Let’s store our grouped data in a variable called plot_data:

A Python command in Excel
Image: Screenshot by the author.

Next, let’s call the plot method on our grouped data. Here, we will specify the x-axis, y-axis and the kind of plot:

Python commands in Excel
Image: Screenshot by the author.

And upon pressing Control + Enter, we generate the following line plot:

A data visualization generated through Python in Excel
Image: Screenshot by the author.
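
A sketch of the plotting steps, under the same assumptions as before (hypothetical column names and values). The non-interactive "Agg" backend is set only so the snippet runs outside Excel. Inside Excel, the plot renders in the cell and that line should not be needed.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs outside Excel

import pandas as pd

# Hypothetical stand-in for the sales range selected in Excel.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02", "2023-01-03"]
        ),
        "revenue": [3000.0, 2500.0, 2000.0, 1200.0, 3000.0],
    }
)

# Keep "date" as a regular column so it can be passed as the x-axis.
plot_data = df.groupby("date", as_index=False)["revenue"].sum()

# Specify the x-axis, the y-axis and the kind of plot.
ax = plot_data.plot(x="date", y="revenue", kind="line")
ax.set_ylabel("revenue")
```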

The data used in this post is available on GitHub.

 

Use Python in Excel to Gain Data Insights

Now that Excel enables Python logic within its interface, a wide variety of professionals can draw on the power of Python tools to supplement their day-to-day analytics workflows. Python in Excel is a step towards bridging the gap between analysts who are only comfortable using Excel and Python developers who use these tools to generate analytic insights. 

Although we only covered some of the basics here, the capabilities are not limited to these tasks. Because Excel lets you import packages that aren’t imported by default, analysts can also use Python tools for machine learning, natural language processing and more.
