Pandas
07 Text Methods
Data Science
May 2026 · Notebook lesson

Notebook converted from Jupyter for blog publishing.

Driptanil Datta, Software Developer

Text Methods

A normal Python string has a variety of method calls available:

mystring = 'hello'
mystring.capitalize()
RESULT
'Hello'
mystring.isdigit()
RESULT
False
help(str)
STDOUT
Help on class str in module builtins:

class str(object)
 |  str(object='') -> str
 |  str(bytes_or_buffer[, encoding[, errors]]) -> str
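
To browse these methods without reading the full help(str) output, a minimal sketch like the following lists the public method names. The variable name public_methods is only illustrative and is not part of the original notebook.

```python
# Collect the names of str attributes that do not start with an underscore.
public_methods = [name for name in dir(str) if not name.startswith('_')]

print(len(public_methods))
print(public_methods)
```

Both capitalize and isdigit, used above, appear in this list.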

Pandas and Text

Pandas can do a lot more with text than what we show here. Full online documentation on things like advanced string indexing and regular expressions with pandas can be found here: https://pandas.pydata.org/docs/user_guide/text.html

Text Methods on Pandas String Column

import pandas as pd
names = pd.Series(['andrew','bobo','claire','david','4'])
names
RESULT
0    andrew
1      bobo
2    claire
3     david
4         4
names.str.capitalize()
RESULT
0    Andrew
1      Bobo
2    Claire
3     David
4         4
names.str.isdigit()
RESULT
0    False
1    False
2    False
3    False
4     True
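
One practical difference from plain Python strings is how missing values behave. The sketch below uses a hypothetical Series, names_with_gap, which is not part of the lesson's data, to show that the .str methods pass a missing entry through while a plain string method call on each element fails.

```python
import pandas as pd

# Hypothetical Series with a missing entry (not from the lesson's data).
names_with_gap = pd.Series(['andrew', None, 'claire'])

# The .str accessor skips the missing entry and leaves it missing.
capitalized = names_with_gap.str.capitalize()
print(capitalized)

# Calling the plain string method on each element raises on the missing value.
try:
    [name.capitalize() for name in names_with_gap]
except AttributeError as err:
    print('plain Python failed:', err)
```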

Splitting, Grabbing, and Expanding

tech_finance = ['GOOG,APPL,AMZN','JPM,BAC,GS']
len(tech_finance)
RESULT
2
tickers = pd.Series(tech_finance)
tickers
RESULT
0    GOOG,APPL,AMZN
1        JPM,BAC,GS
dtype: object
tickers.str.split(',')
RESULT
0    [GOOG, APPL, AMZN]
1        [JPM, BAC, GS]
dtype: object
tickers.str.split(',').str[0]
RESULT
0    GOOG
1     JPM
dtype: object
tickers.str.split(',',expand=True)
RESULT
      0     1     2
0  GOOG  APPL  AMZN
1   JPM   BAC    GS
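
As a small extension of the split examples above, the following sketch limits the number of splits with the n parameter and labels the resulting columns. The column labels 'first' and 'rest' are hypothetical, chosen only for readability.

```python
import pandas as pd

tickers = pd.Series(['GOOG,APPL,AMZN', 'JPM,BAC,GS'])

# n=1 stops after the first comma, so each row yields exactly two pieces.
first_and_rest = tickers.str.split(',', n=1, expand=True)

# Hypothetical column labels, not taken from the original notebook.
first_and_rest.columns = ['first', 'rest']
print(first_and_rest)
```

Because expand=True returns a DataFrame, the usual column operations (renaming, selecting, joining back to the original data) apply to the result.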

Cleaning or Editing Strings

messy_names = pd.Series(["andrew  ","bo;bo","  claire  "])
# Notice the "mis-alignment" on the right hand side due to spacing in "andrew  " and "  claire  "
messy_names
RESULT
0      andrew  
1         bo;bo
2      claire  
dtype: object
messy_names.str.replace(";","")
RESULT
0      andrew  
1          bobo
2      claire  
dtype: object
messy_names.str.strip()
RESULT
0    andrew
1     bo;bo
2    claire
dtype: object
messy_names.str.replace(";","").str.strip()
RESULT
0    andrew
1      bobo
2    claire
dtype: object
messy_names.str.replace(";","").str.strip().str.capitalize()
RESULT
0    Andrew
1      Bobo
2    Claire
dtype: object
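
The replace and strip steps could also be combined using a regular expression. This is a sketch of an alternative technique, not the approach used in the notebook, and it assumes that removing every whitespace character (including any inside a name) is acceptable, which strip() alone would not do.

```python
import pandas as pd

messy_names = pd.Series(["andrew  ", "bo;bo", "  claire  "])

# regex=True treats the pattern as a regular expression:
# the character class matches a semicolon or any whitespace character.
cleaned = messy_names.str.replace(r"[;\s]", "", regex=True).str.capitalize()
print(cleaned)
```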

Alternative with Custom apply() call

def cleanup(name):
    name = name.replace(";","")
    name = name.strip()
    name = name.capitalize()
    return name
messy_names
RESULT
0      andrew  
1         bo;bo
2      claire  
dtype: object
messy_names.apply(cleanup)
RESULT
0    Andrew
1      Bobo
2    Claire
dtype: object
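
A custom function like cleanup() assumes every value is a string, so a missing entry would raise an error. One possible way around that, sketched here with a hypothetical Series messy_with_gap, is Series.map() with na_action='ignore', which skips missing values instead of passing them to the function.

```python
import pandas as pd

def cleanup(name):
    name = name.replace(";", "")
    name = name.strip()
    name = name.capitalize()
    return name

# Hypothetical data with a missing entry (not from the lesson's data).
messy_with_gap = pd.Series(["andrew  ", None, "  claire  "])

# na_action='ignore' leaves missing values untouched rather than calling cleanup on them.
cleaned_with_gap = messy_with_gap.map(cleanup, na_action='ignore')
print(cleaned_with_gap)
```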

Which one is more efficient?

import timeit 
  
# code snippet to be executed only once 
setup = '''
import pandas as pd
import numpy as np
messy_names = pd.Series(["andrew  ","bo;bo","  claire  "])
def cleanup(name):
    name = name.replace(";","")
    name = name.strip()
    name = name.capitalize()
    return name
'''
  
# code snippet whose execution time is to be measured 
stmt_pandas_str = ''' 
messy_names.str.replace(";","").str.strip().str.capitalize()
'''
 
stmt_pandas_apply = '''
messy_names.apply(cleanup)
'''
 
stmt_pandas_vectorize='''
np.vectorize(cleanup)(messy_names)
'''
timeit.timeit(setup = setup, 
                    stmt = stmt_pandas_str, 
                    number = 10000)
RESULT
3.931618999999955
timeit.timeit(setup = setup, 
                    stmt = stmt_pandas_apply, 
                    number = 10000)
RESULT
1.2268500999999787
timeit.timeit(setup = setup, 
                    stmt = stmt_pandas_vectorize, 
                    number = 10000)
RESULT
0.28283379999993485

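Wow! In this run, np.vectorize() finished 10,000 repetitions in about 0.28 seconds, compared with about 1.23 seconds for apply() and about 3.93 seconds for the chained .str methods. While the .str methods can be extremely convenient, when it comes to performance, don't forget about np.vectorize()! Keep in mind that these timings come from a three-element Series, where per-call overhead dominates, so results may differ on larger data. Review the "Useful Methods" lecture for a deeper discussion on np.vectorize().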
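
Since the timings above use only three names, it may be worth repeating the comparison on a larger Series before drawing conclusions for your own data. The sketch below is one way to do that. The helper time_approaches and its parameters n_rows and repeats are hypothetical names, and otypes=[object] is an assumption made here so that np.vectorize() does not have to infer the output type.

```python
import timeit

import numpy as np
import pandas as pd

def cleanup(name):
    name = name.replace(";", "")
    name = name.strip()
    name = name.capitalize()
    return name

def time_approaches(n_rows, repeats):
    """Time the three cleanup approaches on a Series with n_rows entries."""
    # Cycle through the lesson's three messy names until there are n_rows entries.
    base = ["andrew  ", "bo;bo", "  claire  "]
    messy = pd.Series([base[i % 3] for i in range(n_rows)])

    # Assumption: fix the output type up front instead of letting it be inferred.
    vectorized_cleanup = np.vectorize(cleanup, otypes=[object])

    approaches = {
        "str": lambda: messy.str.replace(";", "").str.strip().str.capitalize(),
        "apply": lambda: messy.apply(cleanup),
        "vectorize": lambda: vectorized_cleanup(messy),
    }
    # timeit accepts a callable; each one is run `repeats` times.
    return {label: timeit.timeit(func, number=repeats)
            for label, func in approaches.items()}

# Compare a tiny Series with a larger one; timings depend on the machine.
for n_rows in (3, 30_000):
    print(n_rows, time_approaches(n_rows, repeats=5))
```

No expected timings are shown because they vary by machine and library version; the point is to compare the three numbers at each size.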

© 2026 Driptanil Datta. All rights reserved.