Pandas is a data analysis and manipulation library for Python. It provides numerous functions and methods to manage tabular data. The core data structure of pandas is the DataFrame, which stores data in tabular form with labeled rows and columns.

4 Ways to Add a Column in Pandas

  1. Add columns at the end of the table.
  2. Add columns at a specific index.
  3. Add columns with the loc method.
  4. Add columns with the assign function.

From a data perspective, rows represent observations or data points. Columns represent features or attributes about the observations. Consider a DataFrame of house prices. Each row is a house and each column is a feature about the house such as age, number of rooms, price and so on.

Adding or dropping columns is a common operation in data analysis. We’ll go over four different ways of adding a new column to a DataFrame.

First, let’s create a simple DataFrame to use in the examples.

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4],
                   "B": [5, 6, 7, 8]})

df
A two column set of data presented in pandas DataFrame. | Image: Soner Yildirim

 

4 Pandas Add Column Methods

Below are four methods for adding columns to a pandas DataFrame.


Method 1: Adding Columns on the End

This might be the most commonly used method for creating a new column.

df["C"] = [10, 20, 30, 40]

df
Adding a column at the end of a DataFrame in pandas. | Image: Soner Yildirim

We specify the column name as if we were selecting a column in the DataFrame. Then, the values are assigned to this column. The new column is added as the last column, i.e. the column with the highest index.
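The same bracket syntax accepts more than a list of values. The following is a minimal sketch on a separate, hypothetical DataFrame named demo (not the article's df), showing a constant, a column derived from existing columns, and what happens when the list length does not match the number of rows.

```python
import pandas as pd

# Hypothetical demo frame, kept separate from the article's df
demo = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# A scalar is broadcast to every row
demo["flag"] = 0

# A new column can be derived from existing columns
demo["total"] = demo["A"] + demo["B"]

# A list must have one value per row, otherwise pandas raises ValueError
try:
    demo["short"] = [1, 2, 3]
    length_error = None
except ValueError as exc:
    length_error = exc

print(demo)
print("length mismatch rejected:", length_error is not None)
```

The failed assignment leaves demo unchanged, so it ends up with the columns A, B, flag and total.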

We can also add multiple columns at once. Column names are passed in a list, and the values need to be two-dimensional, with a shape that matches the number of rows and the number of new columns. For instance, the following code adds three columns filled with random integers from zero to nine (the upper bound of 10 passed to randint is exclusive).

df[["1of3", "2of3", "3of3"]] = np.random.randint(10, size=(4,3))

df
Adding three columns filled with integers at the end of a pandas DataFrame. | Image: Soner Yildirim
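To make the shape requirement concrete, here is a small sketch that uses fixed values instead of random ones so the result is reproducible. The frame demo and the column names x, y and z are hypothetical, and the example assumes a reasonably recent pandas version that allows assigning to several new columns at once, as the code above does.

```python
import numpy as np
import pandas as pd

# Hypothetical demo frame with 4 rows
demo = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# 4 rows and 3 new columns, so the values need shape (4, 3)
values = np.arange(12).reshape(4, 3)

# Each column of the array becomes one new DataFrame column
demo[["x", "y", "z"]] = values

print(values.shape)
print(demo)
```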

Let’s drop these three columns before going to the next method.

df.drop(["1of3", "2of3", "3of3"], axis=1, inplace=True)

 

Method 2: Add Columns at a Specific Index

In the first method, the new column is added at the end. Pandas also allows for adding new columns at a specific index. The insert function can be used to customize the location of the new column. Let’s add a column next to column A.

df.insert(1, "D", 5)

df
Inserting column D in between columns A and B in pandas DataFrame. | Image: Soner Yildirim

The insert function takes three required parameters: the position (loc), the name of the column (column) and the values (value). Column positions start from zero, so we set the position to one to add the new column right after column A. Passing a constant value, as we did here, fills every row with it.
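The position does not have to be hard-coded. As a sketch on a hypothetical frame named demo, the snippet below looks up where column A sits, inserts a list of values right after it, and shows that insert refuses a column name that already exists unless allow_duplicates is set.

```python
import pandas as pd

# Hypothetical demo frame, kept separate from the article's df
demo = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Find the position of column A, then target the slot right after it
position = demo.columns.get_loc("A") + 1

# insert modifies demo in place and returns None
result = demo.insert(position, "D", [10, 20, 30, 40])

# Inserting a name that already exists raises ValueError by default
try:
    demo.insert(0, "D", 0)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(demo.columns.tolist())
print("duplicate rejected:", duplicate_error is not None)
```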


Method 3: Add Columns with Loc

The loc indexer, often referred to as the loc method, allows you to select rows and columns using their labels. It’s also possible to create a new column with it.

df.loc[:, "E"] = list("abcd")

df
Using the loc method to select rows and column labels to add a new column. | Image: Soner Yildirim

In order to select rows and columns, we pass the desired labels. The colon indicates that we want to select all the rows. In the column part, we specify the labels of the columns to be selected. Since the DataFrame does not have column E, pandas creates a new column.
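Because loc selects rows as well as columns, it can also create a column that is filled only for some rows. The sketch below uses a hypothetical frame demo and a hypothetical column name size; rows that do not match the condition are left as missing values.

```python
import pandas as pd

# Hypothetical demo frame, kept separate from the article's df
demo = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Only rows where A > 2 receive a value; the other rows are left as NaN
demo.loc[demo["A"] > 2, "size"] = "large"

print(demo)
```

A second loc assignment with the opposite condition could then fill in the remaining rows.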


Method 4: Add Columns With the Assign Function

The last method is the assign function.

df = df.assign(F=df.C * 10)

df
Using the assign function to create column F. | Image: Soner Yildirim

We specify both the column name and values inside the assign function. You may notice that we derive the values using another column in the DataFrame. The previous methods also allow for similar derivations.

There is an important difference between the insert and assign functions. The insert function works in place: it modifies the DataFrame it is called on and returns None, so there is nothing to assign back.

The situation is a little different with the assign function. It returns the modified DataFrame but does not change the original one. In order to use the modified version with the new column, we need to explicitly assign it.
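The difference is easy to check. In this sketch, demo, with_f and chained are hypothetical names. It also shows that assign accepts callables, which receive the DataFrame being built, so a later column in the same call can use an earlier one.

```python
import pandas as pd

# Hypothetical demo frame, kept separate from the article's df
demo = pd.DataFrame({"A": [1, 2, 3, 4], "C": [10, 20, 30, 40]})

# assign returns a new DataFrame; demo itself is left unchanged
with_f = demo.assign(F=demo["C"] * 10)

# Callables are evaluated in order, so G can refer to the new column F
chained = demo.assign(
    F=lambda d: d["C"] * 10,
    G=lambda d: d["F"] + d["A"],
)

print(demo.columns.tolist())
print(with_f.columns.tolist())
print(chained.columns.tolist())
```

Returning a new object is what makes assign convenient inside method chains, where each step hands its result to the next.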

We’ve now covered four different methods for adding new columns to a pandas DataFrame, a common operation in data analysis and manipulation. One of the things I like about pandas is that it usually provides multiple ways to perform a given task, making it a flexible and versatile tool for analyzing and manipulating data.
