How to Add Columns in a Pandas DataFrame

A pandas DataFrame presents data in tabular rows and columns. Adding new columns is an important task in data analysis. Here are four ways to do it in pandas.

Written by Soner Yıldırım
UPDATED BY
Brennan Whitfield | Jul 30, 2024

Pandas is a data analysis and manipulation library for Python. It provides numerous functions and methods to manage tabular data. The core data structure of pandas is the DataFrame, which stores data in tabular form with labeled rows and columns.

How to Add a Column to a pandas DataFrame

How to add a column onto the end of a pandas DataFrame:

df["new column"] = 1

or 

df["new column"] = [1, 2, 3]

In this code, the string inside the brackets is the name of the new column, and whatever follows the = sign is the value or values assigned to that column. A single value is repeated in every row, while a list must contain one value per row.

How to insert a column in a pandas DataFrame:

df.insert(1, "new column", 1)

or 

df.insert(1, "new column", [1, 2, 3])

This code uses insert(), which requires three parameters: the index at which the new column will be inserted, the name of the new column and the value or values to place in the column.

Both methods above use example code, so make sure to replace the values and parameters with the ones you need for your own DataFrame.

From a data perspective, rows represent observations or data points. Columns represent features or attributes about the observations. Consider a DataFrame of house prices. Each row is a house and each column is a feature about the house such as age, number of rooms, price and so on.

Adding or dropping columns is a common operation in data analysis. We’ll go over four different ways of adding a new column to a DataFrame.

First, let’s create a simple DataFrame to use in the examples.

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4],
                   "B": [5, 6, 7, 8]})

df
A two column set of data presented in pandas DataFrame. | Image: Soner Yildirim


A walkthrough on how to add new columns to pandas. | Video: Data Independent

4 Pandas Add Column Methods

Below are four methods for adding columns to a pandas DataFrame.

Method 1: Adding Columns on the End

This might be the most commonly used method for creating a new column.

df["C"] = [10, 20, 30, 40]

df
Adding a column at the end of a DataFrame in pandas. | Image: Soner Yildirim

We specify the column name like we are selecting a column in the DataFrame. Then, the values are assigned to this column. A new column is added as the last column, i.e. the column with the highest index.
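As a minimal sketch of how bracket assignment behaves, the snippet below rebuilds the example DataFrame and tries a scalar, a list and a list of the wrong length. The column names const and bad and the variable mismatch_error are made up for illustration.

```python
import pandas as pd

# Stand-in for the article's example DataFrame.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# A single value is repeated in every row ("const" is a hypothetical name).
df["const"] = 0

# A list-like needs exactly one value per row.
df["C"] = [10, 20, 30, 40]

# A list of the wrong length raises ValueError and adds nothing.
try:
    df["bad"] = [1, 2, 3]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(list(df.columns))
print(type(mismatch_error).__name__)
```

The failed assignment leaves the DataFrame unchanged, so only const and C appear after the original columns.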

We can also add multiple columns at once. Column names are passed in a list, and the values need to be two-dimensional, with one row per DataFrame row and one column per new column name. For instance, the following code adds three columns filled with random integers from zero to nine (the upper bound of 10 is exclusive).

df[["1of3", "2of3", "3of3"]] = np.random.randint(10, size=(4,3))

df
Adding three columns filled with integers at the end of a pandas DataFrame. | Image: Soner Yildirim
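If you want the random values to be reproducible, one possible variation is to draw them from a seeded NumPy generator and build the shape from the DataFrame itself. This is only a sketch: the seed 0 and the variable names rng, values and new_cols are arbitrary choices, not part of the original example.

```python
import numpy as np
import pandas as pd

# Stand-in for the article's example DataFrame.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

new_cols = ["1of3", "2of3", "3of3"]

# Seeded generator so reruns produce the same numbers (seed chosen arbitrarily).
rng = np.random.default_rng(0)

# One row per DataFrame row, one column per new name; the upper bound is exclusive.
values = rng.integers(0, 10, size=(len(df), len(new_cols)))

df[new_cols] = values

print(df.shape)
```

Deriving the shape from len(df) and len(new_cols) keeps the array in step with the DataFrame if either one changes.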

Let’s drop these three columns before going to the next method.

df.drop(["1of3", "2of3", "3of3"], axis=1, inplace=True)

Method 2: Add Columns at a Specific Index

In the first method, the new column is added at the end. Pandas also allows for adding new columns at a specific index. The insert function can be used to customize the location of the new column. Let’s add a column next to column A.

df.insert(1, "D", 5)

df
Inserting column D in between columns A and B in pandas DataFrame. | Image: Soner Yildirim

The insert function takes three main parameters: the index, the name of the column and the values. Column indices start from zero, so we set the index parameter to one to place the new column right after column A. Here we pass a constant value, which is filled in all rows; a list with one value per row works as well.
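The sketch below shows two details that may be useful with insert: looking up the position from a column name instead of hard-coding it, and what happens when the column name already exists. The values and the variable names pos and duplicate_error are illustrative only.

```python
import pandas as pd

# Stand-in for the article's example DataFrame.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# Find where "A" sits and insert right after it.
pos = df.columns.get_loc("A") + 1
df.insert(pos, "D", [50, 60, 70, 80])

# By default, inserting a column name that already exists raises ValueError.
try:
    df.insert(0, "D", 0)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(list(df.columns))
```

insert also accepts an allow_duplicates argument, but duplicate column names tend to make later selections ambiguous, so the default is usually the safer choice.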

Method 3: Add Columns with Loc

The loc indexer allows you to select rows and columns using their labels. It’s also possible to create a new column with it.

df.loc[:, "E"] = list("abcd")

df
Using the loc method to select rows and column labels to add a new column. | Image: Soner Yildirim

In order to select rows and columns, we pass the desired labels. The colon indicates that we want to select all the rows. In the column part, we specify the labels of the columns to be selected. Since the DataFrame does not have column E, pandas creates a new column.
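Because loc takes a row selector as well as a column label, it can also fill a new column for only some rows. The following is a small sketch of that idea; the column name flag, the value "high" and the condition on column A are hypothetical.

```python
import pandas as pd

# Stand-in for the article's example DataFrame.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# All rows: equivalent in effect to df["E"] = list("abcd").
df.loc[:, "E"] = list("abcd")

# Only rows where A is greater than 2 get a value;
# the remaining rows of the new column are left as missing.
mask = df["A"] > 2
df.loc[mask, "flag"] = "high"

print(df)
```

Rows that do not match the mask hold missing values in the new column, which you can fill afterward if a default is needed.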


Method 4: Add Columns With the Assign Function

The last method is the assign function.

df = df.assign(F=df["C"] * 10)

df
Using the assign function to create column F. | Image: Soner Yildirim

We specify both the column name and values inside the assign function. You may notice that we derive the values using another column in the DataFrame. The previous methods also allow for similar derivations.
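To illustrate derived columns, here is a short sketch that computes one column with bracket assignment and two more with assign. The column names sum_AB and G are invented for this example; passing a lambda to assign is one option among several, and it lets a later column refer to one created earlier in the same call.

```python
import pandas as pd

# Stand-in for the article's DataFrame at this point, including column C.
df = pd.DataFrame({"A": [1, 2, 3, 4],
                   "B": [5, 6, 7, 8],
                   "C": [10, 20, 30, 40]})

# Bracket assignment with a value derived from existing columns.
df["sum_AB"] = df["A"] + df["B"]

# assign with callables: each lambda receives the DataFrame built so far,
# so G can use the F column defined just before it.
df = df.assign(
    F=lambda d: d["C"] * 10,
    G=lambda d: d["F"] + d["A"],
)

print(df[["sum_AB", "F", "G"]])
```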

There is an important difference between the insert and assign functions. The insert function works in place: it modifies the DataFrame it is called on and returns nothing.

The situation is a little different with the assign function. It returns the modified DataFrame but does not change the original one. In order to use the modified version with the new column, we need to explicitly assign it.
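The difference can be checked directly. In this sketch, the names result, unassigned and reassigned are placeholders used only to show what each call returns.

```python
import pandas as pd

# Stand-in for the article's example DataFrame.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8]})

# insert changes df itself and returns None.
result = df.insert(1, "D", 5)

# assign returns a new DataFrame; df is untouched unless we rebind it.
unassigned = df.assign(F=df["A"] * 10)
has_f_before = "F" in df.columns

# Rebinding the name keeps the new column.
reassigned = df.assign(F=df["A"] * 10)
df = reassigned

print(result, has_f_before, "F" in df.columns)
```

Writing df = df.insert(...) by mistake would replace the DataFrame with None, so only assign should be used on the right-hand side of an assignment.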

We’ve now covered four different methods for adding new columns to a pandas DataFrame, a common operation in data analysis and manipulation. One of the things I like about pandas is that it usually provides multiple ways to perform a given task, making it a flexible and versatile tool for analyzing and manipulating data.
