Seaborn
02 Categorical Plots Stat Estimation
Data Science
May 2026 · Notebook lesson

Notebook converted from Jupyter for blog publishing.

Driptanil Datta, Software Developer

Categorical Plots - Statistical Estimation within Categories

Often we have categorical data, meaning the data falls into distinct groups, such as countries or companies. There is no country value "between" USA and France and no company value "between" Google and Apple, unlike continuous data such as age or price, where values can exist between any two data points.

To begin with categorical plots, we'll focus on statistical estimation within categories. This means we visually report a statistic (such as the mean or the count) for each category in a plot. We already know how to compute these statistics with pandas, but the data is often easier to understand when we plot it.

Imports

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The Data

df = pd.read_csv("dm_office_sales.csv")
df.head()
Output: the first five rows of the DataFrame. Columns shown: division, level of education, training level, work experience, salary.

countplot()

A simple plot that shows the total count of rows per category.

plt.figure(figsize=(10,4),dpi=200)
sns.countplot(x='division',data=df)
Output 1: bar chart of row counts, with 'division' on the x-axis and 'count' on the y-axis.

plt.figure(figsize=(10,4),dpi=200)
sns.countplot(x='level of education',data=df)
Output 2: bar chart of row counts, with 'level of education' on the x-axis and 'count' on the y-axis.
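The counts that countplot() draws can also be checked directly in pandas. The sketch below uses a tiny made-up DataFrame as a stand-in for dm_office_sales.csv (the division labels are hypothetical, not taken from the real file) and computes the per-category row count with value_counts().

```python
import pandas as pd

# Hypothetical stand-in for dm_office_sales.csv; these values are made up.
sample = pd.DataFrame({
    "division": ["printers", "printers", "peripherals",
                 "office supplies", "printers"],
})

# Number of rows per category: the bar heights countplot() would draw.
counts = sample["division"].value_counts()
print(counts)
```

On the real data, the same call would be df['division'].value_counts().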

Breakdown within another category with 'hue'

plt.figure(figsize=(10,4),dpi=200)
sns.countplot(x='level of education',data=df,hue='training level')
Output 3: bar chart of row counts per 'level of education', with one bar per 'training level' inside each group.
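Adding hue splits each category's count by a second column. A rough pandas equivalent is a cross-tabulation, sketched here on made-up rows (the education labels are hypothetical; the integer training levels follow the 0 to 3 range mentioned in this lesson).

```python
import pandas as pd

# Hypothetical rows; only the column names come from the lesson's dataset.
sample = pd.DataFrame({
    "level of education": ["high school", "high school",
                           "bachelor's degree", "bachelor's degree",
                           "bachelor's degree"],
    "training level": [0, 1, 1, 1, 2],
})

# Rows: the x-axis categories. Columns: the hue categories.
# Each cell is the height of one bar in the grouped count plot.
table = pd.crosstab(sample["level of education"], sample["training level"])
print(table)
```

Combinations that never occur appear as 0 in the table.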

NOTE: You can set the palette to any matplotlib colormap you like.

plt.figure(figsize=(10,4),dpi=200)
sns.countplot(x='level of education',data=df,hue='training level',palette='Set1')
Output 4: the same grouped count plot drawn with the 'Set1' palette.

plt.figure(figsize=(10,4),dpi=200)
# Paired would be a good choice if there was a distinct jump from 0 and 1 to 2 and 3
sns.countplot(x='level of education',data=df,hue='training level',palette='Paired')
Output 5: the same grouped count plot drawn with the 'Paired' palette.
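To see which colors a named colormap contains before using it as a palette, you can query matplotlib directly. This is a minimal sketch; the helper first_colors is a hypothetical name introduced here, and it only inspects the colormap. It does not claim to reproduce exactly how seaborn samples colors from it.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex


def first_colors(name, n):
    """Return the first n colors of a qualitative colormap as hex strings."""
    cmap = plt.get_cmap(name)
    # For listed (qualitative) colormaps, an integer indexes a color directly.
    return [to_hex(cmap(i)) for i in range(n)]


set1 = first_colors("Set1", 4)
paired = first_colors("Paired", 4)
print("Set1:  ", set1)
print("Paired:", paired)
```

'Paired' stores its colors as light and dark shades of the same hue, which is why it suits hue levels that naturally fall into pairs.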

barplot()

So far the y-axis has defaulted to a count (similar to a .groupby(x_axis).count() call in pandas). We can expand our visualizations by specifying a continuous feature for the y-axis, in which case each bar shows an estimate (the mean by default) of that feature within the category. Be careful with these plots: a bar may imply continuity along the y-axis where there is none.

plt.figure(figsize=(10,6),dpi=200)
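# By default barplot() shows the mean of y for each category
# The black bar is the error bar: https://stackoverflow.com/questions/58362473/what-does-black-lines-on-a-seaborn-barplot-mean
# ci='sd' sets the error bar to the standard deviation; seaborn 0.12 and later deprecate ci in favor of errorbar='sd'
sns.barplot(x='level of education',y='salary',data=df,estimator=np.mean,ci='sd')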
Output 6: bar chart of mean 'salary' per 'level of education', with standard-deviation error bars.
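The bar heights and error bars can be approximated with a pandas groupby. The sketch below uses made-up salaries (hypothetical values, not from the real file) and computes the mean and standard deviation per category. Note that pandas' std() is the sample standard deviation (ddof=1); seaborn's convention for the 'sd' error bar may differ between versions, so treat this as an approximation of the plotted values.

```python
import pandas as pd

# Hypothetical salaries; only the column names come from the lesson's dataset.
sample = pd.DataFrame({
    "level of education": ["high school", "high school",
                           "bachelor's degree", "bachelor's degree"],
    "salary": [40000, 50000, 60000, 80000],
})

# Mean is the bar height; std approximates the half-length of the error bar.
summary = sample.groupby("level of education")["salary"].agg(["mean", "std"])
print(summary)
```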
plt.figure(figsize=(12,6))
sns.barplot(x='level of education',y='salary',data=df,estimator=np.mean,ci='sd',hue='division')
Output 7: bar chart of mean 'salary' per 'level of education', split by 'division'.
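With hue added, each bar is the mean of one (x category, hue category) pair. One way to get the same numbers as a table is pivot_table, sketched here on made-up rows (all values are hypothetical).

```python
import pandas as pd

# Hypothetical rows; only the column names come from the lesson's dataset.
sample = pd.DataFrame({
    "level of education": ["high school", "high school",
                           "bachelor's degree", "bachelor's degree"],
    "division": ["printers", "peripherals", "printers", "printers"],
    "salary": [40000, 50000, 60000, 80000],
})

# Rows: x-axis categories. Columns: hue categories. Cells: mean salary.
means = sample.pivot_table(index="level of education", columns="division",
                           values="salary", aggfunc="mean")
print(means)
```

A pair with no rows shows up as NaN in the table, and no bar is drawn for it in the plot.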
plt.figure(figsize=(12,6),dpi=100)
sns.barplot(x='level of education',y='salary',data=df,estimator=np.mean,ci='sd',hue='division')
# Move the legend outside the axes: https://stackoverflow.com/questions/30490740/move-legend-outside-figure-in-seaborn-tsplot
plt.legend(bbox_to_anchor=(1.05, 1))
Output 8: the same grouped bar chart with the legend placed outside the plotting area.
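The bbox_to_anchor coordinates are in axes units, so (1.05, 1) is a point just to the right of the top-right corner of the plot. A minimal matplotlib-only sketch follows, with made-up bars and labels. Passing loc as well is an addition not used in the cell above: it makes explicit which corner of the legend is pinned to the anchor point.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 3))
# Made-up data, only to have two legend entries.
ax.bar(["a", "b"], [1, 2], label="group 1")
ax.bar(["a", "b"], [2, 1], bottom=[1, 2], label="group 2")

# Pin the legend's upper-left corner just outside the right edge of the axes.
legend = ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left")

# bbox_inches="tight" expands the saved image so the legend is not cut off.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
```

When saving to a file, the same bbox_inches="tight" option helps keep an outside legend inside the image.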



© 2026 Driptanil Datta. All rights reserved.