When data is imported into a Pandas DataFrame, it sometimes contains incorrect or messy column names, and you have to go through the tedious process of renaming some or all of them.

4 Ways to Rename Columns in Pandas

  1. Rename columns in Pandas using a dictionary with the pandas.DataFrame.rename() function.
  2. Rename columns in Pandas by passing a function to the columns parameter.
  3. Rename columns in Pandas with pandas.DataFrame.columns.
  4. Rename columns in Pandas with pandas.DataFrame.set_axis().

Replacing messy column names with meaningful ones is an essential step in data cleaning. It makes the entire code more readable and saves a lot of time during the next steps of data processing.

I’m going to demonstrate the four best methods to easily change the Pandas DataFrame column names.

I’ll be using a self-created data set, Dummy_Sales_Data, which you can get on my GitHub repo.

Let’s import the data set first:

import pandas as pd
df = pd.read_csv("Dummy_Sales_Data_v1.csv")
df.head()
Dummy sales data. | Image: Suraj Gurav

It is a simple 10,000 x 12 data set that I created.

Now, let’s get started. We’ll begin with the simplest and most straightforward method before moving on to other approaches.

 

Rename Columns in Pandas Using Dictionary

pandas.DataFrame.rename() is a DataFrame function that alters the axis labels. Here, the word "axis" refers to either rows or columns, depending on the value we set for the axis parameter of this function.

As we are more interested in understanding how to change the column names, let’s focus on that. The important parameter for us in the .rename() function is columns, as shown below.

pandas.dataframe.rename() function. | Image: Suraj Gurav
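To make the role of the axis concrete, here is a minimal sketch on a tiny stand-in DataFrame. The names frame, qty, r1 and r2 are made up for illustration and are not part of the sales data. It shows that passing a mapping to columns, or passing it positionally with axis="columns", should give the same result, while index targets the row labels instead.

```python
import pandas as pd

# Hypothetical stand-in data; only the labels matter here.
frame = pd.DataFrame({"qty": [1, 2]}, index=["r1", "r2"])

# These two calls rename the column in the same way.
by_columns = frame.rename(columns={"qty": "Quantity"})
by_axis = frame.rename({"qty": "Quantity"}, axis="columns")

# The index parameter renames row labels and leaves columns alone.
by_index = frame.rename(index={"r1": "row_1"})

print(list(by_columns.columns))
print(list(by_axis.columns))
print(list(by_index.index))
```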

To replace some or all of the column names, all you need to do is pass a dictionary in which the keys are the old column names and the values are the new column names, as shown below.

Rename Pandas DataFrame columns using df.rename(). | Image: Suraj Gurav

As you can see, I passed a dictionary to the columns parameter of df.rename(). The keys, Status and Quantity, are the old column names, and the values, Order_Status and Order_Quantity, are the new column names.
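Since the example above is only visible as an image, here is a rough text version of the same idea. It uses a small stand-in DataFrame (df_demo, with made-up values) rather than the full sales data, so treat it as a sketch of the call, not a reproduction of the original output.

```python
import pandas as pd

# Small stand-in for the sales data; the values are invented.
df_demo = pd.DataFrame({
    "OrderID": [1, 2],
    "Quantity": [10, 5],
    "Status": ["Delivered", "Shipped"],
})

# Keys are the old column names, values are the new ones.
renamed = df_demo.rename(columns={"Status": "Order_Status",
                                  "Quantity": "Order_Quantity"})

print(list(renamed.columns))   # renamed copy
print(list(df_demo.columns))   # original DataFrame is unchanged
```

Columns that are not mentioned in the dictionary, such as OrderID here, keep their names.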

df.rename() has an inplace parameter that is False by default, which means it returns a renamed copy and leaves df unchanged. To retain the changes in the column names, you need to set inplace=True or assign the result back to df.

Because I didn’t want to retain the changed column names, I used the .head() method to see how the DataFrame looks with the changed column names.

Before setting inplace=True in any function, it’s always a good idea to use .head() to see how the change looks before you finalize it. The next method is a slight variation of the .rename() function.
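The preview-then-commit workflow can be sketched as follows. The DataFrame sales and the dictionary mapping are stand-ins invented for this example. Note that with inplace=True the call returns None, so its result should not be assigned to anything.

```python
import pandas as pd

# Hypothetical stand-in data for illustration.
sales = pd.DataFrame({"Quantity": [10, 5],
                      "Status": ["Delivered", "Shipped"]})
mapping = {"Status": "Order_Status"}

# Step 1: preview. This returns a renamed copy; sales is untouched.
preview = sales.rename(columns=mapping).head()
names_after_preview = list(sales.columns)

# Step 2: commit. With inplace=True, sales itself is modified
# and the call returns None.
result = sales.rename(columns=mapping, inplace=True)

print(names_after_preview)
print(list(sales.columns))
```

An equivalent way to commit the change without inplace is to reassign: sales = sales.rename(columns=mapping).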


 

Rename Columns in Pandas Using Functions

Just like the first method above, we will still use the columns parameter of the .rename() function. But instead of passing old name-new name key-value pairs, we can pass a function, which is then applied to every column name.

For example, converting all column names to upper case is quite simple using the trick below.

df.rename(columns=str.upper).head()
Rename columns using functions. | Image: Suraj Gurav

I simply used the string function str.upper to convert all column names to upper case, as you can see in the above picture.

In this way, all the column names are altered in one go. You are not limited to built-in functions, either: any user-defined function that takes a column name and returns a new name can be passed to the columns parameter.

For example, you can write a simple function that splits each column name on the underscore ( _ ) and keeps only the first part. You then pass this function to the columns parameter, as shown below.

def function_1(x):
    # Keep only the part of the column name before the first underscore
    return x.split('_')[0]

df.rename(columns=function_1).head()
Change column names based on a user defined function. | Image: Suraj Gurav

You can see the changed column names in the above output. As per the applied function, the column names containing _ are split on _ and only the first part is assigned as the new column name. For example, Product_Category becomes Product.

And if it’s a simple function, like the one above, you can use a lambda function as well.
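As a quick sketch of the lambda variant, the snippet below applies the same split-on-underscore logic to a small stand-in DataFrame. The DataFrame products and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in data for illustration.
products = pd.DataFrame({
    "Product_Category": ["Home"],
    "Sales_Manager": ["Ann"],
    "Status": ["Shipped"],
})

# Same logic as function_1, written inline as a lambda.
shortened = products.rename(columns=lambda name: name.split("_")[0])

print(list(shortened.columns))
```

One thing to watch for with any renaming function: if two columns map to the same result (for instance, two names that share the same prefix before the underscore), .rename() will not stop you, and you end up with duplicate column names.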


 

Rename Columns in Pandas With pandas.DataFrame.columns

df.columns is an attribute, not a method. It returns the column labels of the DataFrame as a Pandas Index object, such as: df.columns

List column names using df.columns. | Image: Suraj Gurav

It also works the other way around: we can assign a list of new column names to df.columns, and those names are then applied to the DataFrame.

Here is how it works.

df.columns = ['OrderID', 'Order_Quantity',
             'UnitPrice(USD)', 'Order_Status',
             'OrderDate', 'Product_Category',
             'Sales_Manager', 'Shipping_Cost(USD)',
             'Delivery_Time(Days)', 'Shipping_Address',
             'Product_Code', 'OrderCode']
df.head()
Changing all column names at once using df.columns. | Image: Suraj Gurav

As you can see, I assigned a list of new column names to df.columns and the names of all columns are changed accordingly.

For this to work, you need to pass the names of all the columns. The length of this list must be exactly equal to the total number of columns in the DataFrame.

Because this is a direct assignment, there is no option like inplace and no preview step: the column names of df are changed immediately. As a result, this method is a bit risky.

So, I would suggest using it only when you are 100 percent sure that you want to change the column names.

You should also remember that the sequence of the column names in the list should match the columns in the DataFrame. Otherwise, the column names can be assigned incorrectly.
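The length requirement can be checked with a short sketch. The DataFrame orders and its placeholder column names a, b and c are made up for this example. In the Pandas versions I am aware of, a list of the wrong length raises a ValueError and leaves the existing names in place, while a list in the wrong order is accepted silently, which is why the ordering point above matters.

```python
import pandas as pd

# Hypothetical three-column stand-in for the sales data.
orders = pd.DataFrame({"a": [1], "b": [2], "c": [3]})

# Correct: exactly one new name per existing column, in order.
orders.columns = ["OrderID", "Order_Quantity", "Order_Status"]

# Too few names: pandas rejects the assignment.
try:
    orders.columns = ["OrderID", "Order_Quantity"]
    length_error = None
except ValueError as exc:
    length_error = exc

print(type(length_error).__name__)
print(list(orders.columns))  # still the three names assigned above
```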

With all of the above points kept in mind, this is the best method to change all columns in one go.



 

Rename Columns in Pandas With pandas.DataFrame.set_axis

This method is used to set the labels of a DataFrame’s axis, so it can label columns as well as rows.

All you need to do is pass the list of column names to the .set_axis() function and specify axis=1 to rename columns, like below:

df.set_axis(['A', 'B', 'C', 'D', 'E', 'F',
            'G', 'H', 'I', 'J', 'K', 'L'], axis=1).head()
Change column names using set_axis(). | Image: Suraj Gurav

This is how you can change all the column names at once. As with df.columns, the list must contain a name for every column, in the same order as the existing columns, so all of the points that I mentioned in the previous method apply here too.

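However, .set_axis() is a safer version of the previous method, df.columns, because it returns a new DataFrame and leaves the original untouched. So you can preview the changes before applying them. Older Pandas versions offered an inplace parameter for this method, but it was deprecated in version 1.5 and removed in version 2.0.

To retain the changed column names in current versions, assign the result back to df, for example df = df.set_axis(new_names, axis=1), where new_names is your list of column names.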
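Here is a minimal sketch of previewing and then keeping a .set_axis() result, using a small invented DataFrame called small. It assumes a recent Pandas release (2.0 or later), where .set_axis() has no inplace parameter and the renamed DataFrame has to be assigned back.

```python
import pandas as pd

# Hypothetical stand-in data for illustration.
small = pd.DataFrame({"OrderID": [1], "Quantity": [10], "Status": ["Shipped"]})

# Preview: set_axis returns a relabeled DataFrame; small is unchanged.
relabeled = small.set_axis(["A", "B", "C"], axis=1)
names_before_commit = list(small.columns)

# Commit: assign the result back to keep the new names.
small = small.set_axis(["A", "B", "C"], axis=1)

print(names_before_commit)
print(list(small.columns))
```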

That’s all you need to know about changing column names. I hope you’ve found this article interesting, useful and refreshing. It’s always important to have column names in a more readable and uniform style. Therefore, renaming columns is one of the essential steps that needs to be done at the beginning of your project.
