
Data Science

May 2026, Notebook lesson

Notebook converted from Jupyter for blog publishing.


Driptanil Datta, Software Developer

Time Methods

Python Datetime Review

Python's standard library, independent of pandas, includes a datetime module:

from datetime import datetime

# To illustrate the order of arguments
my_year = 2017
my_month = 1
my_day = 2
my_hour = 13
my_minute = 30
my_second = 15

# January 2nd, 2017
my_date = datetime(my_year, my_month, my_day)

# Defaults to 0:00
my_date

RESULT

datetime.datetime(2017, 1, 2, 0, 0)

# January 2nd, 2017 at 13:30:15
my_date_time = datetime(my_year, my_month, my_day, my_hour, my_minute, my_second)

my_date_time

RESULT

datetime.datetime(2017, 1, 2, 13, 30, 15)

You can read any component of a datetime object as an attribute:

my_date.day

RESULT

2

my_date_time.hour

RESULT

13
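As a small sketch of the attribute access described above, the snippet below rebuilds the same January 2nd, 2017 timestamp and reads several components at once. The names `stamp`, `parts`, and `midnight` are illustrative and not part of the original notebook.

```python
from datetime import datetime

# Same moment as my_date_time above: January 2nd, 2017 at 13:30:15
stamp = datetime(2017, 1, 2, 13, 30, 15)

# Each component is a plain integer attribute
parts = (stamp.year, stamp.month, stamp.day, stamp.hour, stamp.minute, stamp.second)
print(parts)  # (2017, 1, 2, 13, 30, 15)

# Time components that are not supplied default to zero
midnight = datetime(2017, 1, 2)
print(midnight.hour, midnight.minute)  # 0 0
```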

Pandas

Converting to datetime

Often when data sets are stored, the time component may be a string. Pandas easily converts strings to datetime objects.

import pandas as pd

myser = pd.Series(['Nov 3, 2000', '2000-01-01', None])

myser

RESULT

0    Nov 3, 2000
1     2000-01-01
2           None
dtype: object

myser[0]

RESULT

'Nov 3, 2000'

pd.to_datetime()

https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#converting-to-timestamps

pd.to_datetime(myser)

RESULT

0   2000-11-03
1   2000-01-01
2          NaT
dtype: datetime64[ns]

pd.to_datetime(myser)[0]

RESULT

Timestamp('2000-11-03 00:00:00')
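Real columns often contain entries that are not dates at all. A minimal sketch, assuming you would rather get NaT than an exception for such entries, is to pass errors='coerce'. The Series `raw` and its values are made up for illustration. Separately, be aware that pandas 2.0 and later infer a single format from the first string, so a Series that mixes layouts (as myser does) may raise there unless you tell pandas how to handle the mixture.

```python
import pandas as pd

# Hypothetical raw column: one valid date, one junk entry, one missing value
raw = pd.Series(['2000-01-01', 'not a date', None])

# errors='coerce' turns anything unparseable into NaT instead of raising
clean = pd.to_datetime(raw, errors='coerce')
print(clean)

# NaT behaves like a missing value, so isna() can count the failures
print(clean.isna().sum())  # 2
```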

obvi_euro_date = '31-12-2000'

pd.to_datetime(obvi_euro_date)

RESULT

Timestamp('2000-12-31 00:00:00')

# 10th of Dec OR 12th of October?
# We may need to tell pandas
euro_date = '10-12-2000'

pd.to_datetime(euro_date)

RESULT

Timestamp('2000-10-12 00:00:00')

pd.to_datetime(euro_date,dayfirst=True)

RESULT

Timestamp('2000-12-10 00:00:00')
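When every string in a column follows the same day-first layout, an explicit format removes the ambiguity altogether. The sketch below compares the default guess, dayfirst=True, and an explicit format string on the same input; the variable names other than euro_date are illustrative.

```python
import pandas as pd

euro_date = '10-12-2000'

# Default parsing reads the string month-first: October 12th
month_first = pd.to_datetime(euro_date)

# dayfirst=True reads it as December 10th
day_first = pd.to_datetime(euro_date, dayfirst=True)

# An explicit format states the layout outright
explicit = pd.to_datetime(euro_date, format='%d-%m-%Y')

print(month_first.month, day_first.month, explicit.month)  # 10 12 12
```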

Custom Time String Formatting

Sometimes dates come in a non-standard format. Fortunately, you can always tell pandas the format explicitly. Specifying it can also speed up the conversion, so it may be worth doing even when pandas can parse the strings on its own.

A full table of codes can be found here: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes

style_date = '12--Dec--2000'

pd.to_datetime(style_date, format='%d--%b--%Y')

RESULT

Timestamp('2000-12-12 00:00:00')

strange_date = '12th of Dec 2000'

pd.to_datetime(strange_date)

RESULT

Timestamp('2000-12-12 00:00:00')
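If you prefer not to rely on automatic parsing for a string like this, the literal text can be written directly into the format string. This is a minimal sketch that assumes the ordinal suffix is always 'th'; a suffix such as 'st' or 'nd' would need its own handling.

```python
import pandas as pd

strange_date = '12th of Dec 2000'

# Literal characters ('th of ') sit alongside the %d, %b and %Y codes
parsed = pd.to_datetime(strange_date, format='%dth of %b %Y')
print(parsed)  # 2000-12-12 00:00:00

# strftime uses the same codes to turn the Timestamp back into text
print(parsed.strftime('%Y/%m/%d'))  # 2000/12/12
```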

Data

Retail Sales: Beer, Wine, and Liquor Stores

Units: Millions of Dollars, Not Seasonally Adjusted

Frequency: Monthly

U.S. Census Bureau, Retail Sales: Beer, Wine, and Liquor Stores [MRTSSM4453USN], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/MRTSSM4453USN, July 2, 2020.

sales = pd.read_csv('RetailSales_BeerWineLiquor.csv')

sales

HTML

         DATE  MRTSSM4453USN
0  1992-01-01           1509

sales.iloc[0]['DATE']

RESULT

'1992-01-01'

type(sales.iloc[0]['DATE'])

RESULT

str

sales['DATE'] = pd.to_datetime(sales['DATE'])

sales

HTML

         DATE  MRTSSM4453USN
0  1992-01-01           1509

sales.iloc[0]['DATE']

RESULT

Timestamp('1992-01-01 00:00:00')

type(sales.iloc[0]['DATE'])

RESULT

pandas._libs.tslibs.timestamps.Timestamp

Attempt to Parse Dates Automatically

parse_dates: bool, list of int or names, list of lists, or dict; default False. The behavior is as follows:

boolean. If True -> try parsing the index.

list of int or names, e.g. if [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.

list of lists, e.g. if [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.

dict, e.g. {'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'.

If a column or index cannot be represented as an array of datetimes, say because of an unparseable value or a mixture of timezones, the column or index will be returned unaltered as an object data type. For non-standard datetime parsing, use pd.to_datetime after pd.read_csv. To parse an index or column with a mixture of timezones, specify date_parser to be a partially-applied pandas.to_datetime() with utc=True. See Parsing a CSV with mixed timezones for more. Note that recent pandas releases deprecate some of these options (date_parser since 2.0, and the list-of-lists and dict forms of parse_dates since 2.2), so check the documentation for your installed version.

# Parse Column at Index 0 as Datetime
sales = pd.read_csv('RetailSales_BeerWineLiquor.csv',parse_dates=[0])

sales

HTML

         DATE  MRTSSM4453USN
0  1992-01-01           1509

type(sales.iloc[0]['DATE'])

RESULT

pandas._libs.tslibs.timestamps.Timestamp
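The date column can also be selected by name rather than by position. The sketch below uses an in-memory CSV so it runs without the data file; the column name SALES and its values are made up for illustration, and `toy` is a hypothetical variable name.

```python
import io
import pandas as pd

# Toy CSV text standing in for the real file; the SALES values are made up
csv_text = "DATE,SALES\n1992-01-01,100\n1992-02-01,110\n"

# Select the date column by name instead of by position
toy = pd.read_csv(io.StringIO(csv_text), parse_dates=['DATE'])

# DATE is now a datetime column, and each value is a Timestamp
print(toy.dtypes)
print(type(toy.loc[0, 'DATE']))
```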

Resample

A common operation with time series data is resampling based on the time series index. Let's see how to use the resample() method.

# Our index
sales.index

RESULT

RangeIndex(start=0, stop=340, step=1)

# Set DATE as the index

sales = sales.set_index("DATE")

sales

HTML

            MRTSSM4453USN
DATE
1992-01-01           1509
1992-02-01            ...
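resample() works on a datetime-like index, which is why DATE is moved into the index here. As a hedged alternative, DataFrame.resample also accepts an on= argument naming a datetime column, so the frame can keep its default index. The sketch below compares the two on made-up data; the SALES column and its values are illustrative, and 'QS' is the quarter-start alias.

```python
import pandas as pd

# Toy monthly data; the SALES values are made up for illustration
toy = pd.DataFrame({
    'DATE': pd.date_range('1992-01-01', periods=4, freq='MS'),
    'SALES': [100, 110, 120, 130],
})

# Option 1: move the dates into the index first
by_index = toy.set_index('DATE').resample('QS').sum()

# Option 2: leave the frame alone and name the date column with on=
by_column = toy.resample('QS', on='DATE').sum()

print(by_index)
print(by_column)
```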

When calling .resample(), you first pass in a rule parameter and then call an aggregation function.

The rule parameter describes the frequency with which to apply the aggregation function (daily, monthly, yearly, etc.).
It is passed in using an "offset alias"; refer to the table below.

The aggregation function is needed because resampling groups several rows into one, so we need a mathematical rule to combine them (mean, sum, count, etc.).

<table style="display: inline-block"> <caption style="text-align: center"><strong>TIME SERIES OFFSET ALIASES</strong></caption> <tr><th>ALIAS</th><th>DESCRIPTION</th></tr> <tr><td>B</td><td>business day frequency</td></tr> <tr><td>C</td><td>custom business day frequency (experimental)</td></tr> <tr><td>D</td><td>calendar day frequency</td></tr> <tr><td>W</td><td>weekly frequency</td></tr> <tr><td>M</td><td>month end frequency</td></tr> <tr><td>SM</td><td>semi-month end frequency (15th and end of month)</td></tr> <tr><td>BM</td><td>business month end frequency</td></tr> <tr><td>CBM</td><td>custom business month end frequency</td></tr> <tr><td>MS</td><td>month start frequency</td></tr> <tr><td>SMS</td><td>semi-month start frequency (1st and 15th)</td></tr> <tr><td>BMS</td><td>business month start frequency</td></tr> <tr><td>CBMS</td><td>custom business month start frequency</td></tr> <tr><td>Q</td><td>quarter end frequency</td></tr> <tr><td></td><td><font color=white>intentionally left blank</font></td></tr></table>

<table style="display: inline-block; margin-left: 40px"> <caption style="text-align: center"></caption> <tr><th>ALIAS</th><th>DESCRIPTION</th></tr> <tr><td>BQ</td><td>business quarter end frequency</td></tr> <tr><td>QS</td><td>quarter start frequency</td></tr> <tr><td>BQS</td><td>business quarter start frequency</td></tr> <tr><td>A</td><td>year end frequency</td></tr> <tr><td>BA</td><td>business year end frequency</td></tr> <tr><td>AS</td><td>year start frequency</td></tr> <tr><td>BAS</td><td>business year start frequency</td></tr> <tr><td>BH</td><td>business hour frequency</td></tr> <tr><td>H</td><td>hourly frequency</td></tr> <tr><td>T, min</td><td>minutely frequency</td></tr> <tr><td>S</td><td>secondly frequency</td></tr> <tr><td>L, ms</td><td>milliseconds</td></tr> <tr><td>U, us</td><td>microseconds</td></tr> <tr><td>N</td><td>nanoseconds</td></tr></table>

# Yearly Means
sales.resample(rule='A').mean()

HTML

            MRTSSM4453USN
DATE
1992-12-31    1807.250000
1993-12-31            ...

Resampling rule 'A' takes all of the data points in a given year, applies the aggregation function (in this case we calculate the mean), and reports the result as the last day of that year. Note that 2020 is incomplete in this data set, so its mean covers only the months available.
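To see the grouping on numbers that are easy to check by hand, here is a minimal sketch on two years of made-up monthly values. It uses the 'YS' (year start) alias rather than 'A', on the assumption that you may be running a recent pandas release, where the single-letter 'A' alias is deprecated in favor of 'YE'. The frame `toy` and its SALES column are hypothetical.

```python
import pandas as pd

# Two years of made-up monthly values: 0..11 for 2000, 12..23 for 2001
idx = pd.date_range('2000-01-01', periods=24, freq='MS')
toy = pd.DataFrame({'SALES': range(24)}, index=idx)

# Rule 'YS' groups rows by calendar year and labels each group
# with the first day of that year
yearly_mean = toy.resample('YS').mean()
print(yearly_mean)

# The same rule with a different aggregation
yearly_sum = toy.resample('YS').sum()
print(yearly_sum)
```

With 'YS' the labels are 2000-01-01 and 2001-01-01; a year-end rule would label the same groups with December 31st instead.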

.dt Method Calls

Once a column is in a datetime format, you can call a variety of properties and methods through the .dt accessor in pandas:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dt.html

sales = sales.reset_index()

sales

HTML

        DATE  MRTSSM4453USN
0 1992-01-01           1509

help(sales['DATE'].dt)

STDOUT

Help on DatetimeProperties in module pandas.core.indexes.accessors object:

class DatetimeProperties(Properties)
 |  Accessor object for datetimelike properties of the Series values.
 |

sales['DATE'].dt.month

RESULT

sales['DATE'].dt.is_leap_year

RESULT

0       True
1       True
2       True
3       True
4       True
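Several .dt properties can be collected side by side to build calendar features. The sketch below does this for two hypothetical dates, one in a leap year and one not; the names `dates` and `summary` are illustrative.

```python
import pandas as pd

# Two hypothetical dates: one in a leap year, one not
dates = pd.to_datetime(pd.Series(['1992-01-01', '1993-02-01']))

# Properties (year, month, is_leap_year) need no parentheses;
# day_name() is a method and does
summary = pd.DataFrame({
    'year': dates.dt.year,
    'month': dates.dt.month,
    'weekday': dates.dt.day_name(),
    'leap': dates.dt.is_leap_year,
})
print(summary)
```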
