In Python, a datetime object is part of the built-in datetime module and is used to represent and work with dates and times. It combines both date (year, month, day) and time (hour, minute, second, microsecond) into a single object, making it ideal for handling timestamps, scheduling and time-based calculations.
Datetime Object in Python Definition
A datetime object in Python represents a specific point in time, combining both date and time components. It's part of the datetime module and allows easy manipulation, formatting and comparison of dates and times.
Working with datetime data sets can be one of the most frustrating aspects of computer programming. Not only do you need to keep track of the date but you also need to learn how to represent dates and times in each language, create indices out of those data points and ensure all your data sets handle daylight saving time in the same manner.
Fortunately, this primer on datetime data will help you get started working in Python.
What Is a Datetime Object in Python?
Most representations of date and time in Python are presented as datetime objects, created from the datetime standard module. A datetime object is a Python variable that contains information about a date and time. It can also include time zone information, and there are tools for changing time zones as needed.
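To make that description concrete, here is a minimal sketch of what a datetime object holds, using only the standard library. The variable names (meeting, later) are my own, chosen for illustration.

```python
import datetime

# A naive datetime: date and time fields, but no time zone attached
meeting = datetime.datetime(2022, 5, 11, 9, 19, 1)

# Each component is available as an attribute
print(meeting.year, meeting.month, meeting.day)      # 2022 5 11
print(meeting.hour, meeting.minute, meeting.second)  # 9 19 1
print(meeting.tzinfo)                                # None

# Arithmetic uses timedelta objects
later = meeting + datetime.timedelta(hours=3, minutes=30)
print(later)                                         # 2022-05-11 12:49:01

# Comparison and formatting
print(later > meeting)                               # True
print(meeting.strftime('%Y-%m-%d %H:%M'))            # 2022-05-11 09:19
```

A datetime with tzinfo set to None is called naive. The examples that follow attach a time zone to make it aware.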
How to Create a Datetime Object in Python
Let’s look at a few examples of datetime objects in Python, created using the datetime module. First, we can call datetime.datetime.now() to create a variable storing the current time as follows:
import datetime
import pytz
now = datetime.datetime.now(pytz.timezone('US/Pacific'))
print(now)
print(now.tzinfo)
# Output:
2022-05-11 09:19:01.859385-07:00
US/Pacific
The first two lines import the important libraries for this task. The first is the datetime module, which enables us to create and manipulate datetime objects. The second is the pytz package, which provides time zone information.
The third line uses the datetime.datetime.now() function along with a pytz timezone object to create a timezone-aware datetime object representing the current time in the U.S. Pacific time zone.
The fourth and fifth lines both print outputs used to demonstrate the result of the code.
The first output (2022-05-11 09:19:01.859385-07:00) shows the full information of the variable now. It shows that the variable was created on May 11, 2022 at 9:19 a.m. and about 1.86 seconds. Since I set the time zone as 'US/Pacific', the program attaches the appropriate offset of -7 hours relative to UTC (Coordinated Universal Time) to the variable. The offset is -7 rather than -8 because daylight saving time is in effect in May. The second output (US/Pacific) confirms the time zone information by printing that the variable’s time zone is the U.S. Pacific time zone.
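If you would rather read the offset from code than from the printed string, the following sketch shows one way to do it. It uses a fixed date instead of now() so the result is repeatable, and the variable name stamp is my own.

```python
import datetime
import pytz

# Build a fixed, timezone-aware datetime so the result is repeatable
# (localize() is covered in more detail later in this article)
pacific = pytz.timezone('US/Pacific')
stamp = pacific.localize(datetime.datetime(2022, 5, 11, 9, 19, 1))

print(stamp.utcoffset())      # -1 day, 17:00:00  (that is, -7 hours)
print(stamp.tzname())         # PDT
print(stamp.strftime('%z'))   # -0700
```

Python prints negative timedelta values as a negative day plus a positive remainder, which is why -7 hours appears as "-1 day, 17:00:00".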
Pytz for Datetime Objects in Python
You’ll notice that in the above code I set the time zone to the U.S. Pacific time zone by calling pytz.timezone('US/Pacific'). If you want to use a different time zone you need to know its correct name (though the names all follow the same format and reference known time zones, so they’re fairly predictable).
If you want to find your time zone you can print a list of all options with the following command:
print(pytz.all_timezones)
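That list is long, so it can help to filter it. The sketch below is one way to narrow it down. The 'US/' prefix is only an example, and us_zones is a name I made up for illustration.

```python
import pytz

# Keep only the zone names that start with a given prefix
us_zones = [name for name in pytz.all_timezones if name.startswith('US/')]
print(us_zones)

# Check whether a specific name is valid before using it
print('US/Pacific' in pytz.all_timezones)   # True
```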
How to Create a Datetime Object With a Specific Date and Time in Python
To create a datetime object for a specified date and time, you can provide the year, month, day, hour, minute and second as arguments to the datetime.datetime() constructor.
In other words, if we want to create a datetime object representing May 11, 2022 at 12:11:03 in the U.S. Pacific time zone we can use the following code:
import datetime
import pytz
# Define the Pacific timezone
pacific = pytz.timezone('US/Pacific')
# Create a naive datetime object
naive_datetime = datetime.datetime(2022, 5, 11, 12, 11, 3)
# Localize the naive datetime to Pacific time
pacific_datetime = pacific.localize(naive_datetime)
print(pacific_datetime)
print(pacific_datetime.tzinfo)
# Output:
2022-05-11 12:11:03-07:00
US/Pacific
Notice the order of the arguments passed to datetime.datetime(): year, month, day, hour, minute and second. The naive result is then passed to the .localize() method, which attaches the U.S. Pacific time zone.
Turns out that the code created the variable we wanted. As desired, pacific_datetime now returns May 11, 2022 at 12:11:03 in the U.S. Pacific time zone.
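You might wonder why the code localizes a naive datetime instead of passing the time zone straight to the constructor. The sketch below compares the two approaches. With pytz, the shortcut tends to pick up one of the zone's historical offsets rather than the one in effect on that date, so I'd treat it as a pitfall. The names localized and shortcut are my own.

```python
import datetime
import pytz

pacific = pytz.timezone('US/Pacific')

# Recommended with pytz: build a naive datetime, then localize it
localized = pacific.localize(datetime.datetime(2022, 5, 11, 12, 11, 3))

# Tempting shortcut: pass the pytz zone straight to the constructor
shortcut = datetime.datetime(2022, 5, 11, 12, 11, 3, tzinfo=pacific)

print(localized.utcoffset())  # -1 day, 17:00:00  (-7 hours, as expected)
print(shortcut.utcoffset())   # a different offset, not the -7 hours we want
```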
How to Convert the Timezone of a Datetime Object in Python
Now what if we want to represent the same time in a different time zone? We could work out the time difference ourselves, do the math and create a new datetime object by hand. A simpler option is to let Python convert the existing object to the desired time zone and save the output to a new variable.
So if we want to convert pacific_datetime to the U.S. Eastern time zone, we can use the following code:
import datetime
import pytz
pacific = pytz.timezone('US/Pacific')
naive_datetime = datetime.datetime(2022, 5, 11, 12, 11, 3)
pacific_datetime = pacific.localize(naive_datetime)
print(pacific_datetime)
print(pacific_datetime.tzinfo)
eastern_datetime = pacific_datetime.astimezone(pytz.timezone('US/Eastern'))
print(eastern_datetime)
print(eastern_datetime.tzinfo)
That code calls the .astimezone() method of pacific_datetime with a new time zone object representing the U.S. Eastern time zone. The printed outputs are:
2022-05-11 12:11:03-07:00
US/Pacific
2022-05-11 15:11:03-04:00
US/Eastern
Notice the changes between pacific_datetime and eastern_datetime. The hour has changed from 12 to 15, due to U.S. Eastern time being three hours ahead of U.S. Pacific time. The time zone information has changed from -7 to -4, because U.S. Eastern time is four hours different from UTC instead of seven hours different. Finally, notice that the printed time zone information is now US/Eastern instead of US/Pacific.
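As a note, calling .astimezone() on a naive datetime object (one with no time zone attached) does not do what you might expect. In Python 3.6 and later it assumes the value is in your system’s local time zone, which can silently produce the wrong result, and older versions raise an error. To properly add timezone information, it’s best to use pytz’s localize() method first, before any conversions.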
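One way to keep those two steps together is to wrap them in a small helper. This is only a sketch, and convert_naive is a hypothetical function of my own, not part of pytz or datetime.

```python
import datetime
import pytz

def convert_naive(naive_dt, from_zone, to_zone):
    """Attach from_zone to a naive datetime, then convert it to to_zone."""
    aware = pytz.timezone(from_zone).localize(naive_dt)
    return aware.astimezone(pytz.timezone(to_zone))

naive = datetime.datetime(2022, 5, 11, 12, 11, 3)
eastern = convert_naive(naive, 'US/Pacific', 'US/Eastern')
print(eastern)  # 2022-05-11 15:11:03-04:00

# Both objects describe the same instant, so they compare as equal
pacific = pytz.timezone('US/Pacific').localize(naive)
print(pacific == eastern)  # True
```

The equality check is a handy sanity test: converting between time zones changes how the moment is displayed, not the moment itself.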
Pandas and Datetime Objects in Python
Pandas DataFrames often use datetime objects as the indices because this enables data sets to track the date and time at which a measurement was recorded. Therefore Pandas provides many tools you can use.
Consider the following example and the outputs provided:
import pandas as pd
data = pd.read_csv(r'C:\Users\Peter Grant\Desktop\Sample_Data.csv', index_col = 0) # Replace with your own path to the CSV file
print(data.index)
print(type(data.index))
print(type(data.index[0]))
The index in the sample data set uses datetime, so you’d expect the DataFrame’s index to be a DatetimeIndex, right? Well, unfortunately, you’d be wrong.
Let’s take a look at the outputs:
Index(['10/1/2020 0:00', '10/1/2020 0:00', '10/1/2020 0:00', '10/1/2020 0:01',
'10/1/2020 0:01', '10/1/2020 0:01', '10/1/2020 0:01', '10/1/2020 0:02',
'10/1/2020 0:02', '10/1/2020 0:02',
...
'4/1/2021 2:01', '4/1/2021 2:01', '4/1/2021 2:02', '4/1/2021 2:02',
'4/1/2021 2:02', '4/1/2021 2:02', '4/1/2021 2:03', '4/1/2021 2:03',
'4/1/2021 2:03', '4/1/2021 2:03'],
dtype='object', length=1048575)
<class 'pandas.core.indexes.base.Index'>
<class 'str'>
How to Convert a Pandas DataFrame Index to a DatetimeIndex
While the index in the sample data set above looks like it contains datetime values, Pandas often reads it in as a generic object index containing strings. This can happen if the date format isn’t standard or if the read_csv() function isn’t explicitly told to parse the dates.
Fortunately, the pd.to_datetime() function can convert it to a proper DatetimeIndex.
import pandas as pd
data.index = pd.to_datetime(data.index)
print(data.index)
print(type(data.index))
print(type(data.index[0]))
And the outputs:
DatetimeIndex(['2020-10-01 00:00:00', '2020-10-01 00:00:00',
'2020-10-01 00:00:00', '2020-10-01 00:01:00',
'2020-10-01 00:01:00', '2020-10-01 00:01:00',
'2020-10-01 00:01:00', '2020-10-01 00:02:00',
'2020-10-01 00:02:00', '2020-10-01 00:02:00',
...
'2021-04-01 02:01:00', '2021-04-01 02:01:00',
'2021-04-01 02:02:00', '2021-04-01 02:02:00',
'2021-04-01 02:02:00', '2021-04-01 02:02:00',
'2021-04-01 02:03:00', '2021-04-01 02:03:00',
'2021-04-01 02:03:00', '2021-04-01 02:03:00'],
dtype='datetime64[ns]', length=1048575, freq=None)
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
The index of the DataFrame is now a DatetimeIndex and the type of the first entry is now a Pandas Timestamp (a subclass of Python’s datetime.datetime, so it behaves like a datetime object). This is what we wanted, and something we can work with.
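The other route mentioned above is to tell read_csv() to parse the dates as it loads the file. Here is a minimal sketch of both options. The in-memory CSV is a made-up stand-in for the sample file, the column names Time and Value are assumptions, and the format string assumes the month comes first, as in the sample index.

```python
import io
import pandas as pd

# Hypothetical stand-in for Sample_Data.csv, using the same date format
csv_text = (
    "Time,Value\n"
    "10/1/2020 0:00,1.0\n"
    "10/1/2020 0:01,2.0\n"
    "4/1/2021 2:03,3.0\n"
)

# Option 1: ask read_csv() to parse the index column as dates
parsed = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
print(type(parsed.index))

# Option 2: read as strings, then convert with an explicit format
raw = pd.read_csv(io.StringIO(csv_text), index_col=0)
raw.index = pd.to_datetime(raw.index, format='%m/%d/%Y %H:%M')
print(raw.index[0])  # 2020-10-01 00:00:00
```

Spelling out the format removes any doubt about whether 10/1/2020 means October 1 or January 10.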
How to Create a Pandas DatetimeIndex
But what if you want to create a DatetimeIndex of your own? If you know the date range and frequency for which you want to create a DatetimeIndex, you can use the Pandas date_range() function.
Here’s an example:
import pandas as pd
import datetime
index = pd.date_range(datetime.datetime(2022, 1, 1, 0, 0), datetime.datetime(2022, 12, 31, 23, 55), freq = '5min')
print(index)
This code returns the following output:
DatetimeIndex(['2022-01-01 00:00:00', '2022-01-01 00:05:00',
'2022-01-01 00:10:00', '2022-01-01 00:15:00',
'2022-01-01 00:20:00', '2022-01-01 00:25:00',
'2022-01-01 00:30:00', '2022-01-01 00:35:00',
'2022-01-01 00:40:00', '2022-01-01 00:45:00',
...
'2022-12-31 23:10:00', '2022-12-31 23:15:00',
'2022-12-31 23:20:00', '2022-12-31 23:25:00',
'2022-12-31 23:30:00', '2022-12-31 23:35:00',
'2022-12-31 23:40:00', '2022-12-31 23:45:00',
'2022-12-31 23:50:00', '2022-12-31 23:55:00'],
dtype='datetime64[ns]', length=105120, freq='5T')
Comparing the code calling date_range() to the documentation for the function, you can see that the first two entries set the start and end dates of the range. The start date was set to January 1, 2022 at midnight, and the end date was set to December 31, 2022 at 23:55:00. The third entry sets the frequency of the DatetimeIndex to five minutes. Notice that the code for five minutes is '5min' (the output above prints it as '5T', an older alias for the same frequency). To get the frequency you want, you need to pass the correct code. Fortunately, a list of Pandas codes is available.
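To get a feel for how the frequency code shapes the result, here is a short sketch with two other codes. It also uses the periods argument, which I'm introducing here as an alternative to giving an end date. The variable names are my own.

```python
import pandas as pd

# Daily timestamps between two dates (both endpoints included)
daily = pd.date_range('2022-01-01', '2022-01-07', freq='D')
print(len(daily))         # 7

# A fixed number of periods instead of an end date
quarter_hours = pd.date_range('2022-01-01', periods=4, freq='15min')
print(quarter_hours[-1])  # 2022-01-01 00:45:00

# Sanity check on the example above: 365 days x 288 five-minute steps
five_min = pd.date_range('2022-01-01 00:00', '2022-12-31 23:55', freq='5min')
print(len(five_min))      # 105120
```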
How to Access Data in a Pandas DataFrame With a DatetimeIndex
Indexing with Pandas datetime indices can also be a bit of a pain. At first glance it seems that you need to create complex datetime objects to reference the correct part of the DataFrame.
Consider the following example wherein I create a new DataFrame using the index from the last example, set a value in the DataFrame and print that value to ensure it updates correctly.
import pandas as pd
import datetime
index = pd.date_range(datetime.datetime(2022, 1, 1, 0, 0), datetime.datetime(2022, 12, 31, 23, 55), freq = '5min')
df = pd.DataFrame(index = index, columns = ['Example'])
df.loc[datetime.datetime(2022, 1, 1, 0, 0, 0), 'Example'] = 2
print(df.loc[datetime.datetime(2022, 1, 1, 0, 0, 0), 'Example'])
# Output: 2
The output from this code is exactly what I wanted. It prints 2, showing that the DataFrame’s value at [datetime.datetime(2022, 1, 1, 0, 0, 0), ‘Example’] is 2 as desired.
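You don't have to build a datetime object for every lookup, because .loc also accepts date strings on a DatetimeIndex. The sketch below uses a one-week index to keep things small, and it fills the column with 0.0 up front, which is my own simplification.

```python
import pandas as pd

index = pd.date_range('2022-01-01 00:00', '2022-01-07 23:55', freq='5min')
df = pd.DataFrame({'Example': 0.0}, index=index)

# A full timestamp string selects a single row, like the datetime object did
df.loc['2022-01-01 00:00:00', 'Example'] = 2
print(df.loc['2022-01-01 00:00:00', 'Example'])  # 2.0

# A date-only string selects every row on that day
one_day = df.loc['2022-01-01']
print(len(one_day))                               # 288

# Slices by label include both endpoints
two_days = df.loc['2022-01-01':'2022-01-02']
print(len(two_days))                              # 576
```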
Accessing Data by .iloc Instead of Datetime
Specifying the full datetime over and over again gets tedious. Fortunately, you don’t have to — you can use .iloc to reference rows by position. For example, if you want to edit the first row, you can do:
import pandas as pd
import datetime
index = pd.date_range(datetime.datetime(2022, 1, 1, 0, 0), datetime.datetime(2022, 12, 31, 23, 55), freq = '5min')
df = pd.DataFrame(index = index, columns = ['Example'])
df.iloc[0, 0] = 2
print(df.iloc[0, 0])
# Output: 2
Notice how that code does the exact same thing, except it selects the cell by integer position (row 0, column 0) instead of by its datetime label.
You don’t even need to know what the value there is, you just need to know that you want to work with the first one — or second, or third or whatever value you want. You only need to update the call accordingly.
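If you ever need to move between the two styles, the index can translate a position into a label and back. This is a small sketch on a one-hour index. The variable name position is my own.

```python
import pandas as pd

index = pd.date_range('2022-01-01 00:00', '2022-01-01 00:55', freq='5min')
df = pd.DataFrame({'Example': 0.0}, index=index)

# Position -> label: which timestamp sits in the third row?
print(df.index[2])                       # 2022-01-01 00:10:00

# Label -> position: which row holds 00:10?
position = df.index.get_loc(pd.Timestamp('2022-01-01 00:10'))
print(position)                          # 2

# Both routes reach the same cell
df.iloc[position, 0] = 5
print(df.loc[pd.Timestamp('2022-01-01 00:10'), 'Example'])  # 5.0
```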
Daylight Saving Time and Datetime Objects in Python
Working with daylight saving time is one of the biggest pains in Python, and one that can cause very serious data analysis errors. Consider the example of comparing physical measurements to theoretical approximations. Now you have two data sets, and you want to make sure they say the same thing. What if one data set uses daylight saving time and the other doesn’t? Suddenly you’re comparing data sets that disagree by one hour.
One way to resolve this issue is to remove one hour from the DataFrame index with daylight saving time during the hours when daylight saving time occurs. To help with this, the .dst() method is available on timezone-aware datetime or Pandas timestamp objects and returns the daylight saving time offset as a timedelta. If the timestamp occurs during daylight saving time, it returns a datetime.timedelta value of one hour. If the timestamp does not occur during daylight saving time, it returns a datetime.timedelta value of zero hours. This enables us to identify timestamps that occur during daylight saving time and remove one hour from it accordingly.
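Before applying this to a whole index, it helps to see what .dst() returns for individual timestamps. Here is a quick sketch with one winter and one summer timestamp. The dates are arbitrary examples I picked.

```python
import datetime
import pandas as pd

winter = pd.Timestamp('2022-01-15 12:00', tz='US/Pacific')
summer = pd.Timestamp('2022-07-15 12:00', tz='US/Pacific')

print(winter.dst())  # 0:00:00
print(summer.dst())  # 1:00:00

# The comparison used to flag daylight saving timestamps
print(summer.dst() == datetime.timedelta(hours=1))  # True
```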
Imagine that you have a DataFrame with a DatetimeIndex that includes a daylight saving time offset. To remove daylight saving time you can iterate through the index, make the index time zone naive and remove one hour from the index. Unfortunately, indexes are immutable so you can’t edit them directly. What you can do is create an external list of values, add your updated index values to that list and replace the index with that list at the end.
That code would look like this:
import datetime
import pandas as pd
import pytz
# Create naive datetime range
index = pd.date_range(
    datetime.datetime(2022, 3, 1),
    datetime.datetime(2022, 11, 1),
    freq='D'
)
df = pd.DataFrame(index=index, columns=['Example'])
# Localize index to US/Pacific (make it timezone-aware)
df.index = df.index.tz_localize('US/Pacific')
print("Original timezone-aware index:")
print(df.index)
temp = []
for ts in df.index:
    # Subtract 1 hour if DST is in effect
    if ts.dst() == datetime.timedelta(hours=1):
        # Convert to naive UTC-8 equivalent (standard time)
        temp.append((ts - datetime.timedelta(hours=1)).replace(tzinfo=None))
    else:
        # Convert to naive time directly
        temp.append(ts.replace(tzinfo=None))
df.index = pd.DatetimeIndex(temp)
print("\nAdjusted index (naive, normalized to standard time):")
print(df.index)
# Output:
Original timezone-aware index:
DatetimeIndex(['2022-03-01 00:00:00-08:00', '2022-03-02 00:00:00-08:00',
'2022-03-03 00:00:00-08:00', '2022-03-04 00:00:00-08:00',
'2022-03-05 00:00:00-08:00', '2022-03-06 00:00:00-08:00',
'2022-03-07 00:00:00-08:00', '2022-03-08 00:00:00-08:00',
'2022-03-09 00:00:00-08:00', '2022-03-10 00:00:00-08:00',
...
'2022-10-23 00:00:00-07:00', '2022-10-24 00:00:00-07:00',
'2022-10-25 00:00:00-07:00', '2022-10-26 00:00:00-07:00',
'2022-10-27 00:00:00-07:00', '2022-10-28 00:00:00-07:00',
'2022-10-29 00:00:00-07:00', '2022-10-30 00:00:00-07:00',
'2022-10-31 00:00:00-07:00', '2022-11-01 00:00:00-07:00'],
dtype='datetime64[ns, US/Pacific]', length=246, freq=None)
Adjusted index (naive, normalized to standard time):
DatetimeIndex(['2022-03-01 00:00:00', '2022-03-02 00:00:00',
'2022-03-03 00:00:00', '2022-03-04 00:00:00',
'2022-03-05 00:00:00', '2022-03-06 00:00:00',
'2022-03-07 00:00:00', '2022-03-08 00:00:00',
'2022-03-09 00:00:00', '2022-03-10 00:00:00',
...
'2022-10-22 23:00:00', '2022-10-23 23:00:00',
'2022-10-24 23:00:00', '2022-10-25 23:00:00',
'2022-10-26 23:00:00', '2022-10-27 23:00:00',
'2022-10-28 23:00:00', '2022-10-29 23:00:00',
'2022-10-30 23:00:00', '2022-10-31 23:00:00'],
dtype='datetime64[ns]', length=246, freq=None)
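On a large index the loop can be slow. As an alternative sketch (a swapped-in technique, not the approach shown above), you can convert the whole index to a fixed UTC-8 offset and then drop the time zone. The names aware, standard and adjusted are my own.

```python
import datetime
import pandas as pd

index = pd.date_range('2022-03-01', '2022-11-01', freq='D')
aware = index.tz_localize('US/Pacific')

# Fixed offset for Pacific Standard Time (UTC-8), with no daylight saving rules
standard = datetime.timezone(datetime.timedelta(hours=-8))

# Convert every timestamp to the fixed offset, then drop the time zone
adjusted = aware.tz_convert(standard).tz_localize(None)

print(adjusted[0])   # 2022-03-01 00:00:00
print(adjusted[-1])  # 2022-10-31 23:00:00
```

This should give the same values as the loop for this index, because a timestamp at UTC-7 during daylight saving time is one hour earlier when expressed at UTC-8.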
And there you have it. Now you know how to work with datetime objects, use them to form the indices of your Pandas DataFrames and remove daylight saving time from your data sets.
Frequently Asked Questions
What is a datetime object in Python?
A datetime object is a variable in Python that stores a date and a time together. It can also contain time zone information.
How do I create a datetime object for the current time?
In Python, you can create a datetime object for the current time by calling the datetime.now() function from the datetime module.
For example:
from datetime import datetime
current_time = datetime.now()
print(current_time)
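Keep in mind that the call above returns a naive datetime. If you want the current time with a time zone attached, you can pass a tzinfo object to now(). The sketch below uses the standard library's timezone.utc, and the variable names are my own.

```python
from datetime import datetime, timezone

# Without arguments, now() returns a naive datetime in the system's local time
local_now = datetime.now()
print(local_now.tzinfo)      # None

# Passing a tzinfo object returns an aware datetime instead
utc_now = datetime.now(timezone.utc)
print(utc_now.tzinfo)        # UTC
```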
How do I change a datetime object to a different time zone?
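You can use the pytz package to manage time zones for datetime objects in Python. Use pytz.timezone() to get a timezone object. To set the time zone on a naive datetime, pass it to the timezone object’s localize() method. To convert an aware datetime to another zone, call its astimezone() method with the new timezone object. You can also pass the timezone object to datetime.datetime.now() to get the current time in that zone.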
For example:
import datetime
import pytz
now = datetime.datetime.now(pytz.timezone('US/Pacific'))
print(now)
print(now.tzinfo)
