🎁 Meta Reveals New AI Architecture

In this issue:

  1. How Meta wants to take LLMs to the next level
  2. A smaller, more transparent o1 alternative
  3. Graph agents improving RAG


1. Large Concept Models: Language Modeling in a Sentence Representation Space

Watching: LCMs (paper)

What problem does it solve? Current Large Language Models (LLMs) operate at the token level, processing input and generating output word by word. Humans, by contrast, process information at higher levels of abstraction than single words. By introducing a new architecture that operates on explicit higher-level semantic representations called "concepts," this research aims to bridge the gap between human-like understanding and the token-based approach of today's LLMs.

How does it solve the problem? The proposed "Large Concept Model" operates on language- and modality-agnostic representations of ideas or actions called "concepts." In this study, a concept is assumed to correspond to a sentence, and the SONAR sentence embedding space, which supports up to 200 languages in both text and speech modalities, is used. The model is trained to perform autoregressive sentence prediction in the embedding space using various approaches, including MSE regression, diffusion-based generation, and models operating in a quantized SONAR space.
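
To make the simplest of these objectives concrete, here is a minimal sketch in PyTorch of autoregressive next-sentence prediction trained with MSE regression. It is an illustration of the idea, not the authors' code: encode_sentences is a hypothetical stand-in for a SONAR-style text encoder, and the embedding size and model shape are assumptions.

import torch
import torch.nn as nn

EMB_DIM = 1024  # assumed embedding size, for illustration only

def encode_sentences(sentences):
    # Placeholder: a real system would map sentences into the SONAR space here.
    torch.manual_seed(0)
    return torch.randn(len(sentences), EMB_DIM)

class ConceptPredictor(nn.Module):
    """Predicts the embedding of the next sentence ("concept") from the previous ones."""
    def __init__(self, dim=EMB_DIM, heads=8, layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=layers)
        self.head = nn.Linear(dim, dim)

    def forward(self, concepts):  # concepts: (batch, seq, dim)
        seq_len = concepts.size(1)
        causal_mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        hidden = self.backbone(concepts, mask=causal_mask)
        return self.head(hidden)  # predicted embeddings of the following concepts

# One training step: shift the concept sequence by one position and regress with MSE.
sentences = ["Concepts are sentence-level units.",
             "The model predicts the next concept.",
             "A decoder maps embeddings back to text."]
embs = encode_sentences(sentences).unsqueeze(0)       # (1, 3, EMB_DIM)
model = ConceptPredictor()
pred = model(embs[:, :-1])                            # predict concepts 2..n
loss = nn.functional.mse_loss(pred, embs[:, 1:])      # MSE regression objective
loss.backward()

The diffusion-based and quantized variants mentioned above replace this MSE objective with their own generation procedures, but the sentence-level, autoregressive structure stays the same.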

What's next? The Large Concept Model demonstrates impressive zero-shot generalization performance across many languages, outperforming existing LLMs of the same size. Future work could explore more sophisticated definitions of "concepts" beyond sentences and investigate the model's performance on a wider range of tasks. Additionally, scaling up the model size and training data could potentially lead to even more impressive results.


2. Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Watching: Mulberry (paper)

What problem does it solve? While Multimodal Large Language Models (MLLMs) have shown impressive performance on a wide range of tasks, their reasoning abilities are still limited. They often struggle to provide step-by-step explanations for their answers, which is crucial for building trust and understanding in AI systems. Mulberry aims to address this by developing an MLLM that generates intermediate reasoning steps on its way to the final answer.

How does it solve the problem? Mulberry introduces a novel learning-to-reason method called Collective Monte Carlo Tree Search (CoMCTS). CoMCTS leverages the collective knowledge of multiple models to collaboratively search for effective reasoning paths. It involves four iterative operations: Expansion, Simulation and Error Positioning, Backpropagation, and Selection. By using CoMCTS, the authors constructed Mulberry-260k, a multimodal dataset with explicit reasoning nodes for each question. This dataset is then used to train Mulberry, a series of MLLMs with step-by-step reasoning and reflection capabilities.
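
To make the four operations more tangible, below is a toy Python sketch of a collective tree search over reasoning steps. The policy_models and judge callables are hypothetical placeholders, and the loop follows a generic MCTS skeleton rather than the exact CoMCTS procedure from the paper.

import math
import random

def collective_mcts(question, policy_models, judge, iterations=50, c=1.4):
    # Each node stores one reasoning step plus the usual search statistics.
    root = {"step": question, "children": [], "visits": 0, "value": 0.0, "parent": None}

    def ucb(node, parent_visits):
        if node["visits"] == 0:
            return float("inf")
        return node["value"] / node["visits"] + c * math.sqrt(math.log(parent_visits) / node["visits"])

    def path_to(node):
        # Reconstruct the reasoning path from the root down to this node.
        steps = []
        while node is not None:
            steps.append(node["step"])
            node = node["parent"]
        return list(reversed(steps))

    for _ in range(iterations):
        # Selection: descend the tree by UCB until a leaf is reached.
        node = root
        while node["children"]:
            node = max(node["children"], key=lambda ch: ucb(ch, node["visits"] + 1))

        # Expansion: every model in the collective proposes a candidate next step.
        for model in policy_models:
            child = {"step": model(question, path_to(node)), "children": [],
                     "visits": 0, "value": 0.0, "parent": node}
            node["children"].append(child)

        # Simulation and error positioning: score one expanded path; a low score
        # flags where the reasoning likely went wrong.
        leaf = random.choice(node["children"])
        reward = judge(question, path_to(leaf))

        # Backpropagation: push the reward back up to the root.
        while leaf is not None:
            leaf["visits"] += 1
            leaf["value"] += reward
            leaf = leaf["parent"]

    return root

The key difference to vanilla MCTS is the expansion step: several models contribute candidate reasoning steps, so the search draws on their collective knowledge rather than a single policy.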

What's next? The development of (M)LLMs with verbose reasoning steps might enable AI systems to provide more transparent and interpretable explanations for their decisions. This is particularly important in domains such as healthcare, finance, and legal systems, where trust and accountability are crucial. We can expect to see more research focused on improving the reasoning capabilities of (M)LLMs and developing datasets that facilitate this process. Additionally, the integration of multimodal data, as demonstrated in Mulberry-260k, could lead to more comprehensive and robust reasoning systems.


3. GeAR: Graph-enhanced Agent for Retrieval-augmented Generation

Watching: GeAR (paper)

What problem does it solve? Retrieval-augmented generation (RAG) systems rely on effective document retrieval to provide relevant information for generating accurate responses. However, conventional sparse or dense retrievers face challenges in multi-hop retrieval scenarios, where the required information is spread across multiple documents. This limitation hinders the performance of RAG systems in complex question answering tasks that require reasoning over multiple pieces of information.

How does it solve the problem? GeAR addresses the limitations of conventional retrievers in multi-hop scenarios through two key innovations. First, it introduces graph expansion, which enhances any base retriever, such as BM25, by leveraging the LLM to synchronize information from passages with triples and expand the graph by exploring diverse beams of triples that link multi-hop contexts. This strategy allows GeAR to effectively retrieve relevant information spread across multiple documents. Second, GeAR incorporates an agent framework that utilizes the multi-hop contexts returned by the graph retriever to construct a gist memory, which summarizes the retrieved information across iterations. This gist memory enables the LLM to reason over the collected information and generate accurate responses.
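
As a rough illustration of how these two pieces interact, the loop below sketches iterative retrieval with graph expansion and a gist memory. All helpers (base_retriever, extract_triples, summarize, answer) are hypothetical stand-ins, not functions from the GeAR codebase, and the triple format is assumed.

def gear_style_retrieve(question, base_retriever, extract_triples, summarize, answer,
                        hops=3, beam=5):
    gist_memory = ""        # running summary of the evidence gathered so far
    query = question        # query used for the next retrieval hop

    for _ in range(hops):
        # Base retrieval (e.g. BM25) over the current query.
        passages = base_retriever(query)

        # Graph expansion: align passages with (subject, relation, object) triples
        # and keep the top-scoring beams that link to new, multi-hop context.
        triples = extract_triples(passages)  # assumed to return dicts with "score", "text", "object"
        beams = sorted(triples, key=lambda t: t["score"], reverse=True)[:beam]

        # Gist memory: summarize everything retrieved so far so the agent
        # can reason over it without re-reading full passages.
        evidence = "\n".join(t["text"] for t in beams)
        gist_memory = summarize(gist_memory, evidence)

        # Next hop: follow the entities the new triples point to.
        query = " ".join(t["object"] for t in beams)

    # Final generation step, conditioned on the accumulated gist memory.
    return answer(question, gist_memory)

The important point is that retrieval is iterative: each hop follows links discovered in the previous one, while the gist memory keeps the agent's context compact.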

What's next? Future research could explore the application of graph-based retrievers and agent frameworks to other complex natural language processing tasks that require reasoning over multiple pieces of information. Additionally, the synergy between the graph retriever and the LLM within the GeAR framework highlights the potential for further improvements by leveraging the capabilities of large language models to guide the retrieval process. We can expect to see more advanced techniques that enable effective reasoning over large amounts of information, leading to more accurate and informative responses.


Papers of the Week:


👍 If you enjoyed this article, give it a like and share it with your peers.


