Act 2: Can AI Really Think Like You? The Evolution of Reflective AI

In my last article, From Fast Talkers to Deep Thinkers: How Slow-Thinking AI will Transform Law, I talked about the rise of LLMs, their surprising capabilities, and their limits. I used Daniel Kahneman's analogy of fast (System 1) and slow (System 2) thinking to show why LLMs can be both brilliant and frustratingly foolish. This analogy also gives us insight into how GenAI needs to evolve to tackle deeper business challenges. In this article, let's pick up that thread and dive into the advancements driving System 2 AI.

How AI Is Becoming Smarter and More Reflective for You

AI has come a long way. It can now handle huge amounts of data and perform certain tasks better than humans. But to understand the next steps in AI, it's essential to look not just at how well AI models perform but at *how* they make decisions. Recent breakthroughs are pushing AI closer to "System 2 thinking": deliberate, reflective, and slow, unlike the rapid, instinctive "System 1 thinking." Below, we'll explore how AI is evolving toward System 2 thinking, using real examples, exciting techniques, and emerging cognitive architectures designed to deepen machine reasoning.

AI breakthroughs tend to come in waves. Right now it's all about Generative AI; in 2016 the noise was about deep learning and reinforcement learning, and before that, symbolic reasoning ruled the scene for decades. To get to true Artificial General Intelligence (AGI), we may need all of these approaches to work together. If you want to learn more, The Master Algorithm by Pedro Domingos is a great read on this topic.

The Game-Changing Move 37

Move 37 was a pivotal moment not just for AI experts but for everyone interested in the potential of machine intelligence. Let me take you back to 2016, when DeepMind's AlphaGo took on Lee Sedol, one of the greatest Go players ever (Go is one of the most complex board games out there—much tougher for AI than chess). Sedol was expected to win the five-game series easily.

But in the second game, AlphaGo made a move that no one saw coming. "Move 37", as it became known, stunned experts. Sedol called it "a move that no human would ever think of." This wasn't just pattern recognition; it was strategic reasoning that seemed to go beyond its training data.

Move 37 was a glimpse of AI getting into System 2 territory—showing structured reasoning and calculated risk-taking rather than simply reacting based on previous patterns.

In the end, AlphaGo won the series 4-1.

DeepMind’s successor to AlphaGo, AlphaZero, went even further. Unlike AlphaGo, which focused only on Go, AlphaZero was a general-purpose game-playing AI. It mastered Go, chess, and shogi—all by playing against itself, with no human input. AlphaZero’s flexibility and creativity were a massive leap forward. Peter Heine Nielsen, a grandmaster, called its chess games "the most beautiful ever played." It wasn't just about winning; it was about creative, unexpected play that changed how even the best players saw the game.

Exploring What's Next for AI Reasoning

Bringing the flash of strategic insight AlphaGo showed to LLMs is the next frontier for AI. There's a huge amount happening in this space, but in the interests of brevity, I'll focus on just one technique: Chain-of-Thought Reasoning.

Chain-of-Thought Reasoning

One of the key ways to push LLMs toward System 2 thinking is through Chain-of-Thought (CoT) Reasoning. CoT is a prompting technique that encourages models to generate intermediate steps before arriving at a final answer. It’s like teaching AI to "think aloud," breaking down complex problems just like you or I would.
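
To make this concrete, here's a minimal sketch of CoT prompting in Python. It's illustrative only: `call_llm` is a hypothetical placeholder for whichever LLM client you use, and the technique is simply appending an instruction to reason step by step before answering.

```python
# A minimal sketch of Chain-of-Thought prompting. `call_llm` is a
# hypothetical stand-in for your actual LLM client of choice.

def call_llm(prompt: str) -> str:
    """Placeholder: wire this up to your LLM provider."""
    return "(model response)"

question = ("A contract imposes a 2% late fee per month on a £50,000 invoice. "
            "What is owed after 3 months late?")

# Plain prompt: the model answers directly, System 1 style.
direct = call_llm(question)

# CoT prompt: asking for intermediate steps nudges the model to reason
# before committing to a final answer.
cot = call_llm(question + "\n\nLet's think step by step, showing each "
               "intermediate calculation before giving the final answer.")

print(cot)
```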

CoT has been around for a while and has helped LLMs tackle complex problems in maths, programming, scientific research, and medicine. In law, for instance, analyzing a merger for antitrust concerns is much easier if you break it down into smaller questions—like market definition, competitive impact, consumer effects, and regulatory considerations.

Take a hypothetical analysis of Company A's proposed acquisition of Company B. We might break our prompt flow into a sequence addressing the following areas (sketched in code after the list):

  1. What is the relevant market definition for the companies involved?
  2. What would be the resulting market share and concentration, and how would this affect competitive dynamics and raise concerns around market dominance?
  3. What would be the competitive effects: would the deal raise barriers to entry, reduce consumer choice, or increase pricing power?
  4. What justifications are offered? For example, would the merger lead to operational efficiencies?
  5. Which regulatory framework applies? For example, review by the Competition and Markets Authority (CMA) under the Enterprise Act 2002.
  6. Are there other public interest considerations?
  7. What are the proposed remedies, for example divestitures or behavioral commitments?
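
Here's how that sequence might look as a simple prompt flow in code. This is an illustrative sketch, not our production setup: `call_llm` is a hypothetical stand-in, and each step's answer is carried forward as context for the next.

```python
# A sketch of the antitrust analysis above as a sequential prompt flow.
# Each step's answer is fed into the next, so later questions build on
# earlier findings. `call_llm` is a hypothetical stand-in.

def call_llm(prompt: str) -> str:
    return "(model response)"  # placeholder: connect to your provider

STEPS = [
    "Define the relevant product and geographic markets for Company A and Company B.",
    "Estimate the resulting market share and concentration, and assess the "
    "impact on competitive dynamics and any dominance concerns.",
    "Analyze competitive effects: barriers to entry, consumer choice, pricing power.",
    "Summarize the justifications offered, such as claimed operational efficiencies.",
    "Identify the relevant regulatory framework, e.g. CMA review under the "
    "Enterprise Act 2002.",
    "Flag any other public interest considerations.",
    "Evaluate the proposed remedies, such as divestitures or behavioral commitments.",
]

context = "Company A proposes to acquire Company B. [Insert deal facts here.]"
for step in STEPS:
    answer = call_llm(f"{context}\n\nTask: {step}\nThink step by step.")
    context += f"\n\n{step}\n{answer}"  # carry findings forward

print(context)
```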

When AI is prompted to "think aloud," it's better at handling tasks that need multi-step reasoning. This mirrors how humans tackle problems: breaking them down to make them manageable. But we're only focusing on the "logic" here. Plenty else can go wrong: to be useful, the LLM will need access to excellent data (for example, on market share) and must be grounded to manage the potential for hallucination.
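
As a rough illustration of grounding, the sketch below fetches market-share data from a trusted source and instructs the model to rely only on that data. Both `fetch_market_data` and `call_llm` are hypothetical stand-ins.

```python
# A minimal sketch of grounding: fetch the facts the model needs (here,
# market-share data) and put them in the prompt, rather than trusting
# the model's memory. `fetch_market_data` and `call_llm` are hypothetical.

def fetch_market_data(company: str) -> dict:
    # Placeholder: in practice, a curated database or research provider.
    return {"company": company, "uk_market_share": "18% (2023, filed accounts)"}

def call_llm(prompt: str) -> str:
    return "(model response)"

facts = [fetch_market_data(c) for c in ("Company A", "Company B")]
prompt = (
    "Using ONLY the data below, assess post-merger market concentration. "
    "If the data is insufficient, say so rather than guessing.\n\n"
    f"Data: {facts}"
)
print(call_llm(prompt))
```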

At Taylor Wessing, we've used this approach to create many useful applications. By pairing legal experts with prompt specialists, we've built fact extraction flows that make legal information easily accessible. Fact extraction is a foundational part of legal GenAI work: these flows surface information within texts for downstream tasks, such as summarizing content or supporting decision-making processes.
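
To give a flavor of what a fact extraction flow can look like (a generic sketch, not our actual implementation), the prompt below pins down the fields to extract and asks for structured JSON that downstream tasks can consume. The schema and `call_llm` are illustrative assumptions.

```python
# A sketch of a fact extraction flow: the prompt fixes the fields to
# extract and requests structured JSON for downstream summarization or
# decision-support tools. Schema and `call_llm` are illustrative.

import json

def call_llm(prompt: str) -> str:
    # Placeholder: wire to your provider. Canned output keeps this runnable.
    return ('{"parties": ["Acme Ltd", "Beta GmbH"], '
            '"governing_law": "England and Wales", '
            '"termination_notice": "90 days"}')

EXTRACTION_PROMPT = """Extract the following facts from the contract text.
Return JSON with exactly these keys: parties, governing_law, termination_notice.
Use null where a fact is absent; do not infer missing values.

Contract text:
{text}"""

contract_text = "..."  # the document under review
facts = json.loads(call_llm(EXTRACTION_PROMPT.format(text=contract_text)))
print(facts["governing_law"])
```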

Creating these prompt flows, though, takes a lot of effort. So, how can we generalize this process to make LLMs better at System 2 thinking?

How We Can Scale Up AI's Step-by-Step Thinking

Right now, there are three main approaches to generalizing CoT:

  1. Training with CoT Examples: This means creating datasets with step-by-step annotated solutions provided by experts, or leveraging textbooks containing worked solutions.
  2. Few-Shot Learning and Interactive Exploration: For complex domains without existing training data, CoT can be applied using few-shot learning (i.e., providing a few examples), interactive exploration (like our antitrust example), or meta-reasoning (where models iteratively refine their outputs).
  3. General Frameworks for CoT (see the sketch after this list). These include:
     - Universal CoT Framework: prompting models to identify problem components, solve sub-problems, and combine solutions.
     - Tree-of-Thought Reasoning: exploring multiple candidate solutions like branches of a decision tree.
     - Stepwise Templates: using generic steps to define, break down, and solve problems across domains.
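
As a toy illustration of the Tree-of-Thought idea, the sketch below samples several candidate reasoning steps at each level, scores them, and keeps only the most promising branches. `call_llm` and `score` are hypothetical stand-ins; real implementations typically use a model-based evaluator for scoring.

```python
# A toy sketch of Tree-of-Thought reasoning: at each depth we sample
# several candidate next thoughts, score them, and prune to the most
# promising branches. `call_llm` and `score` are hypothetical stand-ins.

def call_llm(prompt: str) -> str:
    return "(candidate thought)"

def score(partial_solution: str) -> float:
    # Placeholder: in practice, a model-based or heuristic evaluator.
    return 0.5

def tree_of_thought(problem: str, depth: int = 3, branches: int = 3, keep: int = 2) -> str:
    frontier = [problem]
    for _ in range(depth):
        candidates = []
        for path in frontier:
            for _ in range(branches):
                thought = call_llm(f"{path}\n\nPropose the next reasoning step.")
                candidates.append(path + "\n" + thought)
        # Prune: keep only the highest-scoring partial solutions.
        frontier = sorted(candidates, key=score, reverse=True)[:keep]
    return frontier[0]

print(tree_of_thought("Should the merger be referred for a Phase 2 investigation?"))
```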

If you've used OpenAI's o1 model (codenamed "Strawberry" - geddit??), you may have noticed it showing its thought process as it works through a problem. This is CoT in action. These methods can also be tailored to specific domains, like M&A due diligence: breaking tasks into identifying key contractual clauses, cross-referencing regulatory requirements, and assessing risks.

Fine-tuning these models on domain-specific examples and connecting them to knowledge libraries adds important safeguards, like ensuring consistency with precedent and avoiding pitfalls by explicitly building in critical intermediate checks.
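
For a sense of what domain-specific CoT training data might look like, here's a hypothetical JSONL record pairing a legal question with expert reasoning steps, including an explicit precedent check as an intermediate step. The field names are illustrative, not any provider's actual fine-tuning schema.

```python
# A sketch of domain-specific CoT training data: JSONL records pairing a
# legal question with an expert's step-by-step reasoning. Field names are
# illustrative assumptions, not a real provider's schema.

import json

example = {
    "prompt": "Does the 90-day notice clause survive assignment of the contract?",
    "reasoning_steps": [
        "Identify the assignment and notice provisions in the contract.",
        "Check whether the contract restricts assignment of obligations.",
        "Precedent check: compare against prior analyses of similar clauses.",
        "State the conclusion, flagging any remaining uncertainty.",
    ],
    "answer": "Yes, subject to the restrictions in clause 14.2.",
}

with open("cot_training_data.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```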

What's Next for AI? What Role Will Reflective AI Play in Your Life?

The emergence of System 2 thinking in AI is a glimpse into a future where machines can both respond and reason. Techniques like chain-of-thought prompting and cognitive architectures are helping AI simulate reflective, deliberate processes similar to human thinking. Each advancement—from Move 37 to general reasoning frameworks—takes us closer to thoughtful, reasoned AI, moving beyond simple automation.

This journey is just beginning. System 2 thinking is still in its early stages, and AI has a long way to go before it can fully emulate human reasoning. But the progress we’re seeing is promising. The future of AI isn’t just about scaling up—it’s about mastering the art of thoughtful, deliberate reasoning. It’s about finding the right balance between instant responses and deeper, reflective thinking.

#genAI #AIagents #artificialintelligence #legaltech
