Building Our Own Knowledge System: Why We Took This Path

A few weeks ago, we unveiled SvenAI at an event held at Friends Arena, now known as Strawberry Arena. Amidst the social media chatter, some people repeated the outdated claim that large language models (LLMs) can't count the R's in "strawberry", a myth that was debunked quite some time ago. It makes you wonder whether they even tried it before echoing the misconception. Humans are a funny bunch.

As always, the demo gods were in our favour. Patrik Mälarholm, Abrahamsson Christer, and I were all thrilled with the launch, and the positive reception reinforced our belief in the project's potential.


Our Vision and Intentions

Before diving into the technical depths of the system we've built at Tyréns AB with the help of Predli, it's important to clarify our intentions. For obvious reasons, we're not aiming to compete with major players like OpenAI or Anthropic; attempting that with our small team wouldn't be feasible. Instead, our focus is on constructing a state-of-the-art knowledge system: a platform that allows us to store, utilize, and combine our unique information to generate powerful, data-driven insights.

By building our own tailored solution, we're able to address our specific needs while maintaining control over our proprietary data. This approach empowers us to leverage the benefits of advanced LLMs without compromising on data privacy or becoming overly dependent on third-party providers.

Mindful Integration with Large Providers

While models like GPT-4o are among the most advanced in the world and accessible to all ChatGPT users, this accessibility comes with caveats. These sophisticated models are available at low or no cost largely due to data collection—they've already amassed vast amounts of publicly available information. This raises concerns about how our proprietary data might be used if we rely solely on these external services without any modifications.


Consider the example of Svensk Byggtjänst. Although they haven't publicly uploaded their core assets like the AMA codes, it's plausible that someone, somewhere, has made that information available by interacting with ChatGPT. We wouldn't want to risk our sensitive data being inadvertently shared or used to further train external models, potentially benefiting even our competitors. Therefore, mindful integration with big providers is essential to safeguard our interests.


Understanding Model Enhancements

An insightful conversation between Dwarkesh Patel and John Schulman, co-founder of OpenAI, sheds light on how models like GPT-4 have improved over time. Schulman confirms that post-training—refinements made after the initial training phase—is a significant factor in the performance gains of these models:

Dwarkesh Patel: "The current GPT-4 has an Elo score that is like a hundred points higher than the original one that was released. Is that all because of what you're talking about, with these improvements that are brought on by post-training?"
John Schulman: "Yeah, most of that is post-training. There are a lot of different, separate axes for improvement."

This highlights the importance of ongoing development and customization to meet specific needs. It reinforces our decision to focus on refining and tailoring our system rather than relying entirely on external models.


Choosing the Right Tools for Our Goals

When it comes to LLMs, there are three key areas to consider:

  1. Pre-training: Building foundational models from scratch, which is resource-intensive and typically reserved for organizations with extensive GPU resources.
  2. Fine-tuning: Refining the style and format of a model's output to better suit specific applications.
  3. Retrieval-Augmented Generation (RAG): Using external data sources to generate fact-based responses, ensuring accuracy and relevance.


Our focus has been on RAG and fine-tuning. RAG, in particular, allows us to combine a retrieval system with a generative model. This means the model can pull relevant information from our proprietary data sources, providing accurate, real-time responses without relying on external models that might compromise data privacy.
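
To make this concrete, the sketch below shows the retrieval half of a plain RAG loop. It is a minimal illustration, not Sven's implementation: the toy `embed` function stands in for a real embedding model, and all names are hypothetical.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embedding model: hash words into a
    # fixed-size bag-of-words vector so the sketch runs on its own.
    vec = np.zeros(256)
    for word in text.lower().split():
        vec[hash(word) % 256] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, documents: list[str], k: int = 3) -> list[str]:
    # Rank documents by cosine similarity to the query embedding
    # (vectors are normalized, so the dot product is the cosine).
    q = embed(query)
    return sorted(documents, key=lambda d: float(embed(d) @ q), reverse=True)[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    # Ground the generative model in the retrieved context.
    context = "\n---\n".join(retrieve(query, documents))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
```

In production, the placeholder embedding would be swapped for a proper embedding model and a vector store, and the assembled prompt would be sent to the generative model.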

By selecting tools that align with our goals, we're able to create a system that not only meets our specific needs but also mitigates the risks associated with data sharing and dependency on third-party providers.


SvenAI's Role at Tyréns

SvenAI is an AI-powered tool supporting various applications in infrastructure planning, construction management, and research. Its primary objectives are:

  • Efficient Data-Driven Decision-Making: Enabling quick access to relevant information to make informed decisions.
  • Streamlined Retrieval of Information from Extensive Datasets: Simplifying the process of finding necessary data within large repositories.
  • Contextual Response Generation for Complex Queries: Providing answers that consider the broader context, leading to more accurate and useful insights.

By combining sophisticated AI approaches, SvenAI becomes a highly versatile solution capable of responding to a wide range of queries while delivering accurate and insightful information.

Let's take a look under the hood!


HyDE and LLM ReRank: Enhancing Retrieval with Hypothetical Generation and Reranking

HyDE (Hypothetical Document Embeddings) and LLM ReRank enhance information retrieval by generating hypothetical documents that could answer a query and then reranking results based on contextual relevance. HyDE creates synthetic documents related to the query and converts them into vector embeddings, allowing the system to consider both existing and potential answers. LLM ReRank reevaluates and scores these documents using a large language model to prioritize the most relevant information. This combination ensures that even when direct answers aren't available, the AI can generate plausible responses and present them effectively.

1. Hypothetical Document Embeddings (HyDE)

HyDE improves document retrieval by generating synthetic documents based on a query. Here's how it works (a minimal sketch follows the list):

  • Hypothetical Generation: When a query is submitted, the system generates hypothetical documents that could contain potential answers. These documents are based on the AI’s understanding and are used to supplement the dataset when direct information is unavailable.
  • Embedding: Both real and hypothetical documents are converted into vector embeddings to compare their semantic content.
  • Comparison: The hypothetical document embeddings are compared against the real document embeddings, so retrieval is driven by what a plausible answer would look like rather than by the terse query alone, expanding the searchable answer space beyond the literal query wording.
  • Enhancement: The use of hypothetical documents allows the AI to fill in potential gaps, providing more robust responses in situations where existing data is sparse.
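
A hedged sketch of that flow, reusing the toy `embed` helper from the RAG sketch above; the `llm` argument is assumed to be any callable mapping a prompt string to generated text. Note that in the original HyDE formulation, the hypothetical document's embedding drives retrieval in place of the raw query embedding:

```python
def hyde_search(query: str, documents: list[str], llm, k: int = 3) -> list[str]:
    # 1. Hypothetical generation: ask the LLM to write a passage that
    #    *would* answer the query.
    hypothetical = llm(f"Write a short passage that answers: {query}")
    # 2. Embedding: embed the hypothetical document; its phrasing is
    #    closer to real documents than a terse question would be.
    h = embed(hypothetical)
    # 3. Comparison: score real documents against the hypothetical one.
    return sorted(documents, key=lambda d: float(embed(d) @ h), reverse=True)[:k]
```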

2. LLM ReRank

Once the initial documents are retrieved, LLM ReRank refines the results to ensure relevance (a minimal sketch follows the list):

  • Initial Retrieval: The system identifies documents that are semantically similar to the query using standard vector similarity techniques like cosine similarity or Euclidean distance.
  • Reranking: The LLM reevaluates the retrieved documents based on a deeper understanding of the query, ensuring that the most contextually appropriate responses are ranked higher.
  • Final Ranking: The candidates are ordered based on their new relevance scores, ensuring that users receive accurate and meaningful results.
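
In its simplest form, the reranking step could look like this; the scoring prompt is an illustrative assumption, and a production system would batch the calls or use a dedicated cross-encoder rather than one LLM call per document:

```python
def llm_rerank(query: str, candidates: list[str], llm) -> list[str]:
    # Ask the LLM to rate each candidate's relevance, then reorder.
    def score(doc: str) -> float:
        reply = llm(
            f"Query: {query}\nDocument: {doc}\n"
            "Rate the document's relevance to the query from 0 to 10. "
            "Reply with a number only."
        )
        try:
            return float(reply.strip())
        except ValueError:
            return 0.0  # unparseable replies sink to the bottom
    return sorted(candidates, key=score, reverse=True)
```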

Key Strengths of HyDE and LLM ReRank

This approach offers several key benefits:

  • Flexibility: HyDE can generate answers even when the dataset lacks direct matches.
  • Contextual Accuracy: LLM ReRank provides precision by reordering results based on deep contextual relevance.
  • Efficiency: The system can quickly retrieve and rank relevant documents, making it ideal for complex tasks.


Graph RAG: Structured, Contextual Retrieval for Complex Queries

While HyDE and LLM ReRank are focused on generating and refining individual document results, Graph RAG takes a different approach by structuring information into a knowledge graph. This graph allows the AI to explore complex relationships, themes, and entities across the dataset.

1. Knowledge Graph Generation

Graph RAG automatically generates a knowledge graph from documents using an LLM, capturing the relationships between key entities, concepts, and topics; a brief extraction sketch follows the list below.

  • Entity and Relationship Extraction: The system identifies important entities and relationships within the text, organizing them into a graph that can be queried directly.
  • Graph Structure: The graph, made up of nodes (entities) and edges (relationships), allows for queries that explore not just documents but the relationships between concepts within them.
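
One way the extraction step could look in practice, using `networkx` for the graph and assuming the LLM can be prompted to return JSON triples; the prompt and output format are illustrative assumptions, not the exact GraphRAG pipeline:

```python
import json
import networkx as nx

def build_graph(chunks: list[str], llm) -> nx.Graph:
    # Extract (source, relation, target) triples from each text chunk
    # and accumulate them into one knowledge graph.
    graph = nx.Graph()
    for chunk in chunks:
        reply = llm(
            "Extract the entities and relationships in the text below as a "
            "JSON list of [source, relation, target] triples.\n\n" + chunk
        )
        for source, relation, target in json.loads(reply):  # assumes valid JSON
            graph.add_edge(source, target, relation=relation)
    return graph
```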

2. Community Detection and Summarization

A standout feature of Graph RAG is its ability to detect communities of related entities and generate summaries for each (a brief sketch follows below):

  • Hierarchical Summaries: These summaries offer high-level insights, making it easier to understand the broader context of the data, such as how different infrastructure projects might relate to one another.
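
Continuing the sketch above: Microsoft's GraphRAG uses the Leiden algorithm for community detection; the readily available Louvain implementation in `networkx` serves as a stand-in here, and the summary prompt is again an assumption:

```python
def summarize_communities(graph: nx.Graph, llm) -> list[str]:
    # Detect clusters of densely connected entities, then ask the LLM
    # to summarize the relationships inside each cluster.
    summaries = []
    for community in nx.community.louvain_communities(graph):
        edges = [
            f"{u} -[{d['relation']}]-> {v}"
            for u, v, d in graph.subgraph(community).edges(data=True)
        ]
        summaries.append(llm("Summarize these relationships:\n" + "\n".join(edges)))
    return summaries
```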

3. Global Question Handling

While traditional retrieval methods focus on finding semantically similar chunks of text, Graph RAG can handle global queries by exploring connections and relationships across the entire dataset. This provides more comprehensive answers, considering the full context of the data.

4. Map-Reduce Question Answering

Graph RAG utilizes a map-reduce approach to answer complex queries:

  • Map Phase: Breaks down the query into smaller sub-tasks, processing them across relevant parts of the graph.
  • Reduce Phase: Synthesizes these results into a complete, comprehensive answer, ensuring all relevant information is considered.

This method lets the AI handle large datasets efficiently, offering precise responses even to complex queries.
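
A minimal sketch of the two phases, continuing with the community summaries from above (the prompts are illustrative assumptions):

```python
def map_reduce_answer(query: str, summaries: list[str], llm) -> str:
    # Map: answer the query against each community summary independently.
    partials = [
        llm(f"Using only this context, answer '{query}':\n{s}") for s in summaries
    ]
    # Reduce: synthesize the partial answers into one final response.
    return llm(
        f"Combine these partial answers to '{query}' into one answer:\n"
        + "\n---\n".join(partials)
    )
```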

5. Improved Performance with Graph RAG

The Graph RAG approach brings several advantages:

  • Comprehensive Answers: By traversing the graph, the AI can generate responses that pull from different parts of the dataset, offering a more nuanced view.
  • Efficient Token Usage: The system uses fewer tokens than stuffing raw source documents into the prompt while still providing detailed insights, making it well suited to large datasets.
  • Contextual Awareness: The graph structure ensures a deeper understanding of relationships between entities and topics.


Agentic Approaches: Dynamic Query Handling

One of the more advanced features being explored is the introduction of agentic approaches in Graph RAG. These approaches enable the AI to dynamically switch between local and global search strategies based on the specific nature of the query, making the system more flexible and intelligent in its query processing.

1. Local Search for Precision

In local search, the system identifies specific nodes and relationships within a defined subset of the graph, focusing on precision over breadth. This approach is ideal for highly detailed, targeted queries where the answer is likely contained within a limited part of the dataset.

2. Global Search for Comprehensive Context

For broader, more exploratory queries, the global search strategy enables the system to traverse the entire graph. This strategy provides a comprehensive understanding of the relationships and entities within the dataset, perfect for high-level queries requiring an understanding of the bigger picture.

3. Adaptive Search with Agents

By integrating agentic behavior, the AI can adapt its approach based on the query's complexity and the information available. It can:

  • Start with local search for direct, specific questions.
  • Expand to global search if the initial retrieval doesn’t yield sufficient context.
  • Combine both strategies dynamically to balance precision and comprehensiveness.

This agentic behavior makes response generation more intelligent and dynamic: the AI refines its search methods on the fly, leading to more accurate and contextually appropriate results.
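
A toy routing policy along these lines, building on the graph and community summaries from the earlier sketches; the `min_hits` threshold is an illustrative heuristic, not Sven's actual switching logic:

```python
def local_search(query: str, graph) -> list[str]:
    # Local search: find entities mentioned in the query and collect
    # the relations around them (precision over breadth).
    hits = []
    for node in graph.nodes:
        if str(node).lower() in query.lower():
            for _, neighbor, data in graph.edges(node, data=True):
                hits.append(f"{node} -[{data['relation']}]-> {neighbor}")
    return hits

def agentic_search(query: str, graph, summaries: list[str], llm,
                   min_hits: int = 2) -> str:
    # Start local for specific questions; fall back to the global
    # community summaries when local context is too thin.
    local_hits = local_search(query, graph)
    if len(local_hits) >= min_hits:
        context = "\n".join(local_hits)        # local: precise
    else:
        context = "\n---\n".join(summaries)    # global: comprehensive
    return llm(f"Context:\n{context}\n\nQuestion: {query}")
```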


Connecting the Methodologies to Our Objectives

These methodologies are not just abstract concepts; they directly contribute to achieving SvenAI's primary objectives. By integrating HyDE and LLM ReRank, we improve the precision and relevance of information retrieval, which is crucial for efficient data-driven decision-making. Graph RAG and agentic approaches further enhance our ability to handle complex queries, providing contextual responses that consider the broader landscape of information.

To illustrate how these methodologies benefit our work at Tyréns, let's explore two use cases:

Use Case 1: Research Support for Large Infrastructure Projects

Context:

When planning large-scale infrastructure projects, decision-makers rely on vast amounts of technical reports, research papers, regulations, and environmental studies. Finding the most relevant and up-to-date information quickly is challenging.

Solution Using HyDE and LLM ReRank:

  • HyDE generates hypothetical document embeddings representing what an ideal, highly relevant document would contain. This improves retrieval breadth, ensuring relevant documents are identified even without exact keyword matches.
  • LLM ReRank reorganizes the retrieved documents based on their relevance to specific sub-questions, enhancing result precision.

Value:

This combination saves time and improves decision quality by providing comprehensive and precise information retrieval, which is crucial for industries like infrastructure, healthcare, or legal services. By efficiently navigating large datasets, project managers and engineers can make informed decisions without the tedious process of manual filtering.


Use Case 2: Optimizing Knowledge Management for Urban Planning

Context:

Urban planners working on complex city infrastructure projects need to access, integrate, and summarize dispersed information from various documents and data sources.

Solution Using Graph RAG:

  • Graph RAG captures document relationships and global context using graph structures, allowing the system to retrieve and synthesize information across multiple sources.
  • It provides planners with contextually rich summaries that balance local relevance with global understanding.

Value:

This approach supports more informed and strategic decision-making in urban development by offering comprehensive, context-rich summaries that reflect both specific details and broader insights. The ability to understand relationships between different data points enhances the quality of planning and execution in urban projects.


What’s Next for Tyréns’ AI Solution?

The journey to improving Sven’s capabilities is ongoing, with several exciting developments on the horizon:

1. Expanding Agentic Approaches

The integration of more agentic search strategies will continue, allowing the AI to better manage both local and global searches. By dynamically adjusting the search methodology based on the query’s needs, future iterations of Sven will be more adaptable, leading to even more precise and comprehensive responses.

2. Hyperparameter Tuning for Efficiency

Fine-tuning the system's hyperparameters—such as chunk size, token overlap, and rank thresholds—will enhance processing efficiency and performance. This optimization will lead to faster response times and more accurate results, improving the overall user experience.
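
For illustration, those knobs might be collected into a configuration object like the one below; the names and defaults are assumptions for the sketch, not Sven's actual settings:

```python
from dataclasses import dataclass

@dataclass
class RetrievalConfig:
    chunk_size: int = 512          # tokens per document chunk
    chunk_overlap: int = 64        # tokens shared by adjacent chunks
    top_k: int = 10                # candidates fetched before reranking
    rerank_threshold: float = 5.0  # minimum 0-10 relevance score to keep
```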


And lots and lots more... Stay tuned!



