Seaborn
03 Categorical Plots Distributions
Data Science
May 2026 · Notebook lesson

Notebook converted from Jupyter for blog publishing.

Driptanil Datta, Software Developer

Categorical Plots - Distribution within Categories

So far we've seen how to apply a statistical estimate (like the mean or a count) to categories and compare them to one another. Let's now explore how to visualize the distribution within categories. We already know about distplot(), which shows the distribution of a single feature (it has been deprecated since seaborn 0.11 in favor of histplot() and displot()); now we will break that same distribution down per category.

Imports

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The Data

df = pd.read_csv("StudentsPerformance.csv")
df.head()
HTML
Columns shown: gender, race/ethnicity, parental level of education, lunch, test preparation course

Boxplot

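A boxplot displays a distribution through its quartiles: the box spans the interquartile range (IQR) with a line at the median, and by default points more than 1.5 × IQR beyond the box are drawn individually as outliers.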

plt.figure(figsize=(12,6))
sns.boxplot(x='parental level of education',y='math score',data=df)
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 1
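
To make the quartile and IQR description above concrete, here is a minimal sketch of the numbers a boxplot draws for one category. It uses only the standard library and a made-up list of scores (not the StudentsPerformance data); the helper name box_stats is hypothetical, and the "inclusive" quantile method is assumed here because it matches linear interpolation between data points.

```python
import statistics

def box_stats(values, whis=1.5):
    """Return the quantities a box plot draws for one group of values."""
    data = sorted(values)
    # Quartiles via linear interpolation between data points
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences: anything outside them is treated as an outlier
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers stop at the most extreme points still inside the fences
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Made-up scores for illustration only
scores = [8, 45, 52, 58, 61, 64, 66, 70, 73, 79, 85]
stats = box_stats(scores)
print(stats)
```

Here the single very low score falls below the lower fence, so it would be drawn as an individual point rather than stretching the whisker.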

Adding hue for further segmentation

plt.figure(figsize=(12,6))
sns.boxplot(x='parental level of education',y='math score',data=df,hue='gender')
 
# Optional: move the legend outside the plot area
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
RESULT
<matplotlib.legend.Legend at 0x215583a7e08>
PLOT
Output 2
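
Conceptually, adding hue splits each category into sub-groups before the box statistics are computed. The sketch below illustrates that grouping step by hand, using the standard library and a handful of made-up rows (not the real dataset); it is only an illustration of the idea, not how seaborn is implemented.

```python
from collections import defaultdict
import statistics

# Made-up (parental education, gender, math score) rows for illustration
rows = [
    ("high school", "female", 62),
    ("high school", "female", 70),
    ("high school", "male", 66),
    ("high school", "male", 74),
    ("high school", "male", 81),
    ("bachelor's degree", "female", 71),
    ("bachelor's degree", "male", 78),
    ("bachelor's degree", "male", 84),
]

# Group scores by (x category, hue level): one box per key
groups = defaultdict(list)
for education, gender, score in rows:
    groups[(education, gender)].append(score)

# Each box then summarizes only its own sub-group
medians = {key: statistics.median(values) for key, values in groups.items()}
for key, value in medians.items():
    print(key, value)
```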

Boxplot Styling Parameters

Orientation

# Note that x and y are swapped so the horizontal orientation makes sense
sns.boxplot(x='math score',y='parental level of education',data=df,orient='h')
RESULT
<AxesSubplot:xlabel='math score', ylabel='parental level of education'>
PLOT
Output 3

Width

plt.figure(figsize=(12,6))
sns.boxplot(x='parental level of education',y='math score',data=df,hue='gender',width=0.3)
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 4

Violinplot

A violin plot plays a similar role to a box-and-whisker plot. It shows the distribution of quantitative data across several levels of one (or more) categorical variables such that those distributions can be compared. Unlike a box plot, in which all of the plot components correspond to actual datapoints, the violin plot features a kernel density estimation of the underlying distribution.

plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df)
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 5
plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df,hue='gender')
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 6
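
The kernel density estimation mentioned above can be sketched by hand: place a small Gaussian bump on every observation and average them. This is a minimal standard-library illustration on made-up scores, not seaborn's implementation; the function name gaussian_kde and the bandwidth choice (Scott's rule of thumb, sample standard deviation times n^(-1/5)) are assumptions made for the example.

```python
import math
import statistics

def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian KDE at a point x."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum one Gaussian bump per observation, then normalize
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        ) / norm

    return density

# Made-up scores for illustration only
scores = [45, 52, 58, 61, 64, 66, 70, 73, 79, 85]

# Assumed bandwidth: Scott's rule of thumb
h = statistics.stdev(scores) * len(scores) ** (-1 / 5)
density = gaussian_kde(scores, h)

# Evaluate on a grid from 0 to 130 and check the area is close to 1
step = 0.5
grid = [i * step for i in range(261)]
area = sum(density(x) for x in grid) * step
print(round(h, 2), round(area, 4))
```

The outline of each violin is this density curve, mirrored around the category axis.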

Violinplot Parameters

split

When using hue nesting with a variable that takes two levels, setting split to True will draw half of a violin for each level. This can make it easier to directly compare the distributions.

plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df,hue='gender',split=True)
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 7

inner

Representation of the datapoints in the violin interior. If 'box', draw a miniature boxplot. If 'quartile', draw the quartiles of the distribution. If 'point' or 'stick', show each underlying datapoint. Using None draws unadorned violins.

plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df,inner=None)
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 8
plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df,inner='box')
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 9
plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df,inner='quartile')
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 10
plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df,inner='stick')
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 11

orientation

# Simply switch the continuous variable to x and the categorical to y
sns.violinplot(x='math score',y='parental level of education',data=df)
RESULT
<AxesSubplot:xlabel='math score', ylabel='parental level of education'>
PLOT
Output 12

bandwidth

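Controls the KDE bandwidth, similar to the bandwidth argument of kdeplot: smaller values follow the data more closely, larger values smooth it more. Note that bw is deprecated as of seaborn 0.13 in favor of bw_method and bw_adjust.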

plt.figure(figsize=(12,6))
sns.violinplot(x='parental level of education',y='math score',data=df,bw=0.1)
RESULT
<AxesSubplot:xlabel='parental level of education', ylabel='math score'>
PLOT
Output 13
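
To see why a small bandwidth produces a bumpy violin, the sketch below counts the local peaks of a hand-rolled Gaussian KDE at two bandwidths. It uses made-up scores and only the standard library; the helper names kde and count_peaks are hypothetical, and the bandwidths here are absolute widths in score units, which is an assumption for the example and not the same scale as the bw value passed to seaborn above.

```python
import math

def kde(samples, bandwidth):
    """Gaussian KDE with a fixed bandwidth given in data units."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return lambda x: sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    ) / norm

def count_peaks(density, grid):
    """Count grid points that are higher than both neighbours."""
    ys = [density(x) for x in grid]
    return sum(1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1])

# Made-up scores forming two loose clusters
scores = [50, 52, 54, 70, 72, 74]
grid = [40 + 0.25 * i for i in range(177)]  # 40 to 84 in steps of 0.25

narrow_peaks = count_peaks(kde(scores, 0.5), grid)   # very little smoothing
wide_peaks = count_peaks(kde(scores, 15.0), grid)    # heavy smoothing
print(narrow_peaks, wide_peaks)
```

With the narrow bandwidth every observation keeps its own bump, while the wide bandwidth merges everything into a single smooth hump.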

Advanced Plots

We can use a boxenplot and a swarmplot to play the same role as the boxplot and violinplot while showing slightly more information. Be careful with these plots, as they often require you to educate the viewer on how the plot is actually constructed. Only use them if you are sure your audience will understand the visualization.

df.head()
HTML
Columns shown: gender, race/ethnicity, parental level of education, lunch, test preparation course

swarmplot

sns.swarmplot(x='math score',data=df)
STDERR
c:\users\marcial\anaconda3\envs\ml_master\lib\site-packages\seaborn\categorical.py:1296: UserWarning: 15.8% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
RESULT
<AxesSubplot:xlabel='math score'>
PLOT
Output 14
sns.swarmplot(x='math score',data=df,size=2)
RESULT
<AxesSubplot:xlabel='math score'>
PLOT
Output 15
sns.swarmplot(x='math score',y='race/ethnicity',data=df,size=3)
RESULT
<AxesSubplot:xlabel='math score', ylabel='race/ethnicity'>
PLOT
Output 16
sns.swarmplot(x='race/ethnicity',y='math score',data=df,size=3)
RESULT
<AxesSubplot:xlabel='race/ethnicity', ylabel='math score'>
PLOT
Output 17
plt.figure(figsize=(12,6))
sns.swarmplot(x='race/ethnicity',y='math score',data=df,hue='gender')
RESULT
<AxesSubplot:xlabel='race/ethnicity', ylabel='math score'>
PLOT
Output 18
plt.figure(figsize=(12,6))
sns.swarmplot(x='race/ethnicity',y='math score',data=df,hue='gender',dodge=True)
STDERR
c:\users\marcial\anaconda3\envs\ml_master\lib\site-packages\seaborn\categorical.py:1296: UserWarning: 6.7% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
RESULT
<AxesSubplot:xlabel='race/ethnicity', ylabel='math score'>
PLOT
Output 19
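
The warnings above appear because a swarmplot keeps every point at its true value and only nudges it sideways until it no longer overlaps its neighbours; with many points and limited width, some cannot be placed. The sketch below is a deliberately simplified greedy version of that idea on made-up values. It is not seaborn's actual algorithm, and the function name simple_swarm and the diameter parameter are hypothetical.

```python
import math

def simple_swarm(values, diameter=1.0):
    """Greedy beeswarm: keep each value, shift sideways until nothing overlaps."""
    placed = []  # (value, sideways offset) pairs
    for v in sorted(values):
        step = 0
        while True:
            # Try offset 0 first, then +d, -d, +2d, -2d, ...
            if step == 0:
                candidates = [0.0]
            else:
                candidates = [step * diameter, -step * diameter]
            free = [
                off for off in candidates
                if all(
                    math.hypot(v - pv, off - po) >= diameter
                    for pv, po in placed
                )
            ]
            if free:
                placed.append((v, free[0]))
                break
            step += 1
    return placed

# Made-up scores with several ties
points = simple_swarm([60, 60, 60, 61, 62, 62, 75], diameter=1.0)
for value, offset in points:
    print(value, offset)
```

Shrinking the marker (the size argument used earlier) plays the role of a smaller diameter here: the same points then fit into a narrower band.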

boxenplot (letter-value plot)

Official paper on this plot: https://vita.had.co.nz/papers/letter-value-plot.html

This style of plot was originally named a “letter value” plot because it shows a large number of quantiles that are defined as “letter values”. It is similar to a box plot in plotting a nonparametric representation of a distribution in which all features correspond to actual observations. By plotting more quantiles, it provides more information about the shape of the distribution, particularly in the tails.

sns.boxenplot(x='math score',y='race/ethnicity',data=df)
RESULT
<AxesSubplot:xlabel='math score', ylabel='race/ethnicity'>
PLOT
Output 20
sns.boxenplot(x='race/ethnicity',y='math score',data=df)
RESULT
<AxesSubplot:xlabel='race/ethnicity', ylabel='math score'>
PLOT
Output 21
plt.figure(figsize=(12,6))
sns.boxenplot(x='race/ethnicity',y='math score',data=df,hue='gender')
RESULT
<AxesSubplot:xlabel='race/ethnicity', ylabel='math score'>
PLOT
Output 22
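
The "letter values" behind the boxenplot can be approximated as pairs of quantiles at depths 1/4, 1/8, 1/16 and so on, each pair giving one nested box. The sketch below computes them with the standard library on made-up data. Treat it as an approximation under stated assumptions: the paper defines letter values through depths in the sorted data, whereas this version uses interpolated quantiles, and the function name letter_values is hypothetical.

```python
import statistics

def letter_values(values, levels=4):
    """Approximate letter values: median, fourths, eighths, sixteenths, ..."""
    data = sorted(values)
    result = {"median": statistics.median(data)}
    for k in range(2, levels + 1):
        n = 2 ** k
        # First and last cut points are the 1/n and (n-1)/n quantiles
        cuts = statistics.quantiles(data, n=n, method="inclusive")
        result[f"1/{n}"] = (cuts[0], cuts[-1])
    return result

# Made-up data: the integers 1 to 100
lv = letter_values(range(1, 101))
for name, value in lv.items():
    print(name, value)
```

Each successive pair reaches further into the tails, which is why the boxenplot shows more tail detail than a plain boxplot.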

