LLMs Can’t Learn Maths & Reasoning: What Recent Research Reveals

Whether Large Language Models (LLMs) can truly learn mathematics and reasoning has been a long-running debate in AI research. A recent paper makes significant strides in unpacking this question by applying causal analysis to examine how LLMs tackle arithmetic reasoning tasks. By identifying specific circuits responsible for arithmetic logic, the researchers provide a glimpse into how these models operate under the hood.

Let’s dive into this groundbreaking research and uncover its implications.


Defining Reasoning

In his seminal 2019 paper, “On the Measure of Intelligence,” François Chollet defines intelligence as "skill-acquisition efficiency," emphasizing adaptability and generalization over performance on specific tasks. Reasoning, within this context, involves deriving conclusions from principles or evidence, requiring both logical consistency and flexibility.

For LLMs, reasoning has always been a contentious subject. While these models can generate coherent text and solve structured problems, the mechanisms behind their reasoning remain elusive.


Types of Reasoning

Reasoning can be broadly classified into:

  1. Deductive Reasoning – Drawing specific conclusions from general principles. Example: Solving a geometry theorem.
  2. Inductive Reasoning – Formulating general rules from specific examples. Example: Predicting patterns in data.
  3. Abductive Reasoning – Inferring the most likely explanation from incomplete information. Example: Diagnosing a disease based on symptoms.

In mathematical contexts, LLMs appear to excel at deductive tasks within their training scope but struggle to generalize beyond it.


Understanding Heuristics

A key takeaway from the research is that LLMs often rely on heuristics rather than true reasoning. Heuristics are mental shortcuts or rules of thumb that guide problem-solving but may not always yield accurate results.

For instance:

  • In arithmetic problems, LLMs may rely on token patterns (e.g., "+" suggests addition) without computing the underlying operation; a toy contrast is sketched after this list.
  • They may also replicate patterns from training data, leading to plausible but incorrect solutions.
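
To make the distinction concrete, here is a minimal, purely illustrative Python sketch (mine, not the paper's): a "heuristic" solver that can only look up memorised examples, next to one that actually performs the operation. The memorised answers and the fallback rule are invented for illustration.

```python
# Toy contrast: a memorising "heuristic" solver vs. genuine computation.

# Hypothetical "training data": every single-digit addition the heuristic has seen.
seen_examples = {(a, b): a + b for a in range(10) for b in range(10)}

def heuristic_add(a: int, b: int) -> int:
    """Pattern lookup: return a memorised answer, or a plausible-looking guess."""
    if (a, b) in seen_examples:
        return seen_examples[(a, b)]
    # Out of distribution: fall back to the nearest memorised pair,
    # which yields a fluent-looking but wrong answer.
    nearest = min(seen_examples, key=lambda k: abs(k[0] - a) + abs(k[1] - b))
    return seen_examples[nearest]

def true_add(a: int, b: int) -> int:
    """Actual arithmetic: generalises to any operands."""
    return a + b

print(heuristic_add(3, 4), true_add(3, 4))          # 7 7    -> both correct in-distribution
print(heuristic_add(123, 456), true_add(123, 456))  # 18 579 -> the heuristic breaks down
```

Real LLM heuristics are, of course, distributed across many neurons rather than stored in a neat lookup table, but the failure mode, confident answers that stop tracking the actual computation, is the same in spirit.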


Breaking Down Black-Box AI Internals

The biggest challenge with LLMs is their “black-box” nature. Researchers have now begun using causal analysis to break this box open. By isolating subsets of the model responsible for specific behaviors—called circuits—they’ve started mapping how these models “reason” through tasks.

For arithmetic reasoning:

  • Specific circuits were identified that predominantly influence the model’s output for tasks like addition or subtraction.
  • By analyzing these circuits, researchers confirmed that the models mimic arithmetic logic without genuinely understanding it (an illustrative activation-patching sketch follows below).
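
Activation patching is one of the standard techniques behind this kind of causal analysis. The sketch below illustrates the general recipe rather than the paper's exact setup; the model (GPT-2 loaded via the TransformerLens library), the layer, the prompts, and the answer token are all placeholder assumptions. The idea: cache activations from a "clean" prompt, run a "corrupted" prompt while splicing in one component's clean activation, and check how much of the correct answer is restored.

```python
# Minimal activation-patching sketch (assumes the TransformerLens library;
# model, layer, prompts, and answer token are illustrative placeholders).
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")  # small stand-in model

clean_prompt = "12 + 7 ="    # prompt whose correct answer we care about (19)
corrupt_prompt = "12 + 2 ="  # minimally different prompt with a different answer

clean_tokens = model.to_tokens(clean_prompt)
corrupt_tokens = model.to_tokens(corrupt_prompt)

# Cache every activation from the clean run.
_, clean_cache = model.run_with_cache(clean_tokens)

layer = 6  # hypothetical layer under test
hook_name = utils.get_act_name("resid_post", layer)

def patch_hook(activation, hook):
    # Overwrite the corrupted run's activation at the final token position
    # with the value recorded on the clean run.
    activation[:, -1, :] = clean_cache[hook_name][:, -1, :]
    return activation

corrupt_logits = model(corrupt_tokens)
patched_logits = model.run_with_hooks(
    corrupt_tokens, fwd_hooks=[(hook_name, patch_hook)]
)

# If patching this one component restores the clean answer's logit,
# the component is causally implicated in producing that answer.
answer_token = model.to_single_token(" 19")
print("corrupted logit:", corrupt_logits[0, -1, answer_token].item())
print("patched logit  :", patched_logits[0, -1, answer_token].item())
```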


Mathematical Circuits

Circuits in LLMs are analogous to pathways in human brains. For arithmetic tasks:

  • These circuits act as deterministic pathways for handling numbers and operations.
  • While efficient for predefined tasks, they falter when introduced to novel scenarios requiring true generalization.

An interesting finding was how these circuits handle carry-over logic in addition. Instead of understanding it conceptually, the circuits lean on patterns in the training data to approximate results; the sketch after this paragraph shows why carries are such a revealing stress test.
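
A solver that has merely memorised column-level patterns can get every digit pair "right" and still miss any answer where a carry propagates. The Python sketch below is purely illustrative (not the paper's analysis) and contrasts textbook column addition with a digit-wise shortcut that drops the carry.

```python
# Textbook column addition (with carry) vs. a digit-wise shortcut (no carry).

def add_with_carry(a: str, b: str) -> str:
    """Column-by-column addition with explicit carry propagation."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, digits = 0, []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

def add_digitwise_no_carry(a: str, b: str) -> str:
    """Surface shortcut: add each column independently and drop the carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    return "".join(str((int(da) + int(db)) % 10) for da, db in zip(a, b))

print(add_with_carry("27", "35"))          # 62 (correct)
print(add_digitwise_no_carry("27", "35"))  # 52 (the carry from 7 + 5 is lost)
```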


Understanding Circuits in More Detail

Using causal intervention techniques, researchers:

  • Examined the activation patterns of circuits when solving arithmetic tasks.
  • Found that the circuits’ “reasoning” is predominantly influenced by surface-level patterns rather than a deeper mathematical understanding (a complementary ablation sketch follows below).
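
A complementary intervention is ablation: knock out a single component, such as one attention head, and measure how much the correct answer's logit drops. The sketch below is generic and again assumes the TransformerLens library; the layer, head, prompt, and answer token are chosen purely for illustration.

```python
# Minimal head-ablation sketch (assumes TransformerLens; layer, head,
# prompt, and answer token are illustrative placeholders).
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")
tokens = model.to_tokens("12 + 7 =")
answer = model.to_single_token(" 19")

layer, head = 5, 3  # hypothetical head under test
hook_name = utils.get_act_name("z", layer)  # per-head outputs at this layer

def ablate_head(z, hook):
    z[:, :, head, :] = 0.0  # zero out this head's contribution everywhere
    return z

baseline = model(tokens)[0, -1, answer].item()
ablated = model.run_with_hooks(
    tokens, fwd_hooks=[(hook_name, ablate_head)]
)[0, -1, answer].item()

print(f"answer logit: baseline={baseline:.2f}, head ablated={ablated:.2f}")
# A large drop suggests this head sits on the circuit driving the answer.
```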

This revelation underscores the limitations of LLMs in acquiring true reasoning capabilities. They are exceptional imitators but lack the innate ability to reason abstractly like humans.


Conclusion

The recent research confirms that while LLMs excel at simulating reasoning and performing arithmetic tasks, their abilities are bounded by their reliance on heuristics and learned, pattern-driven circuits. This challenges the notion of “intelligence” in AI and calls for further exploration into building models that genuinely understand and generalize beyond their training.

Key Takeaways:

  • LLMs use specialized circuits for arithmetic and reasoning tasks but lack true comprehension.
  • Heuristics, not genuine reasoning, drive their problem-solving abilities.
  • Future advancements in AI should focus on enhancing generalization and adaptability.


