
Self-Consistency ⚖️

Improve LLM performance on complex arithmetic and commonsense reasoning by sampling multiple, diverse reasoning paths and selecting the most consistent answer.

Mar 2026 · 6 min read
🌐
References & Disclaimer

This content is adapted from Prompting Guide: Self-Consistency. It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.

Introduction

Perhaps one of the more advanced techniques for prompt engineering is self-consistency. Proposed by Wang et al. (2022), it aims "to replace the naive greedy decoding used in chain-of-thought prompting."

The core idea is to sample multiple, diverse reasoning paths through few-shot CoT and use those generations to select the most consistent answer. This significantly boosts performance on tasks involving arithmetic and commonsense reasoning.


The "Sister's Age" Problem

Let's look at a task where standard reasoning often fails due to simple arithmetic lapses.

🛑

Prompt:
"When I was 6 my sister was half my age. Now I'm 70 how old is my sister?"

Output: 35 (Incorrect)

Improving with Self-Consistency

To solve this using self-consistency, we first provide several few-shot CoT exemplars (like the ones from Table 17 in Wang et al. 2022) to prime the model for reasoning. We then generate multiple outputs for the same question.
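In code, the sampling step looks roughly like this. This is a sketch, not the authors' implementation: `sample_cot` is a hypothetical stand-in for a temperature-sampled LLM call, here stubbed with canned reasoning paths so the example runs on its own, and the few-shot exemplar is abbreviated.

```python
import random

# Few-shot CoT exemplar (abbreviated) followed by the target question.
PROMPT = (
    "Q: There were nine computers in the server room. Five more computers were "
    "installed each day, from monday to thursday. How many computers are now in "
    "the server room?\n"
    "A: ... The answer is 29.\n\n"
    "Q: When I was 6 my sister was half my age. Now I'm 70 how old is my sister?\n"
    "A:"
)

def sample_cot(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a temperature-sampled LLM call.

    A real implementation would call an LLM API with temperature > 0 so each
    call can follow a different reasoning path. Here we draw from canned paths
    to keep the sketch self-contained and runnable.
    """
    paths = [
        "When I was 6 my sister was half my age, so she was 3. "
        "Now I am 70, so she is 70 - 3 = 67. The answer is 67.",
        "When the narrator was 6, his sister was 3. Now that the narrator is 70, "
        "his sister would be 70 - 3 = 67 years old. The answer is 67.",
        "When I was 6 my sister was half my age, so she was 3. "
        "Now I am 70, so she is 70/2 = 35. The answer is 35.",
    ]
    return rng.choice(paths)

rng = random.Random(0)
# Generate multiple diverse reasoning paths for the same question.
samples = [sample_cot(PROMPT, rng) for _ in range(5)]
```

The key design choice is sampling (temperature > 0) rather than greedy decoding: each call may take a different route to an answer, which is exactly what the voting step needs.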

⚖️

Sample Output 1:
"When I was 6 my sister was half my age, so she was 3. Now I am 70, so she is 70 - 3 = 67. The answer is 67."

Sample Output 2:
"When the narrator was 6, his sister was 3. Now that the narrator is 70, his sister would be 70 - 3 = 67 years old. The answer is 67."

Sample Output 3:
"When I was 6 my sister was half my age, so she was 3. Now I am 70, so she is 70/2 = 35. The answer is 35."

The Majority Vote

While some reasoning paths might still lead to incorrect results (like Output 3), by taking a majority vote of the final answers (67 appearing twice vs. 35 once), the model can arrive at the objectively correct answer.

This works because correct reasoning paths tend to converge on the same answer, while incorrect paths tend to scatter across many different wrong answers, so no single mistake accumulates enough votes to win.


Why it Works

  • Replacement of Greedy Decoding: Standard "greedy" decoding picks the most likely next token at every step, which can lock the model into an early reasoning error.
  • Diversity as a Filter: By sampling diverse paths, the model "filters" out random hallucinations or logic slips through statistical consensus.
  • Complementary to CoT: Self-consistency doesn't replace Chain-of-Thought; it enhances it by providing a robust decision-making layer on top of the reasoning tokens.
🚀

Next Steps: For even more complex multi-step problems that require exploring multiple branches of a solution, we can look at Tree of Thoughts (ToT).

ยฉ 2026 Driptanil Datta. All rights reserved.

Software Developer & Engineer

Disclaimer: The content provided on this blog is for educational and informational purposes only. While I strive for accuracy, all information is provided "as is" without any warranties of completeness, reliability, or accuracy. Any action you take upon the information found on this website is strictly at your own risk.

Copyright & IP: Certain technical content, interview questions, and datasets are curated from external educational sources to provide a centralized learning resource. Respect for original authorship is maintained; no copyright infringement is intended. All trademarks, logos, and brand names are the property of their respective owners.


Built with Love โค๏ธ | Last updated: Mar 16 2026