🚀
4. Cohort Analysis
++++
Languages
Mar 2025×10 min read

A cohort is a group of users who share a common characteristic over a specified period. The most common cohort is bas...

Page 4: Cohort Analysis — Understanding Groups Over Time 👥

Driptanil Datta
Driptanil DattaSoftware Developer

Page 4: Cohort Analysis — Understanding Groups Over Time 👥

A cohort is a group of users who share a common characteristic over a specified period. The most common cohort is based on Acquisition Date (e.g., "Users who joined in January").

🌍
References & Disclaimer

This content is adapted from Mastering System Design from Basics to Cracking Interviews (Udemy). It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.


Why Cohorts Matter

Raw metrics like "Total Active Users" can be misleading. A massive influx of new users can hide the fact that your old users are leaving. Cohort analysis reveals the true health of your retention and product-market fit.

Building a Retention Curve

How to calculate retention in SQL
  1. Define the Cohort: Find the first activity date for every user.
  2. Calculate Activity Age: Subtract the cohort date from the activity date for every subsequent action.
  3. Count Unique Users: Group by Cohort Month and Month Number (Month 0, Month 1, etc.).
  4. Percentage: Divide the count of Month N by the count of Month 0.

Handling Sparse Cohorts

In many systems, users don't perform actions every single period. If you only look at "Month 1 Retention," you might miss users who returned in "Month 2."

  • Returnship: Looking for any activity after a certain gap.
  • Repeat Purchase Behavior: Tracking the frequency and recency of repeat conversions.

Survivorship and Churn

  • Survivorship: What % of the original cohort is still active after X months?
  • Churn Rate: The inverse of retention—how many users are we losing?

Cumulative Cohort Calculations

Tracking the LTV (Lifetime Value) of a cohort:

SUM(revenue) OVER(PARTITION BY cohort_id ORDER BY month_age)

This tells you how much value a specific group of users has generated desde joining.

Cross-Sectional Analysis

Comparing different cohorts against each other at the same "age" (e.g., comparing "Month 3 Retention" for the January cohort vs. the February cohort). This helps determine if product updates or marketing campaigns are improving long-term stickiness.

Drip

Driptanil Datta

Software Developer

Building full-stack systems, one commit at a time. This blog is a centralized learning archive for developers.

Legal Notes
Disclaimer

The content provided on this blog is for educational and informational purposes only. While I strive for accuracy, all information is provided "as is" without any warranties of completeness, reliability, or accuracy. Any action you take upon the information found on this website is strictly at your own risk.

Copyright & IP

Certain technical content, interview questions, and datasets are curated from external educational sources to provide a centralized learning resource. Respect for original authorship is maintained; no copyright infringement is intended. All trademarks, logos, and brand names are the property of their respective owners.

System Operational

© 2026 Driptanil Datta. All rights reserved.