Building Our Own Knowledge System: Why We Took This Path
A few weeks ago, we unveiled SvenAI at an event held at Friends Arena—now known as Strawberry Arena. Amidst the social media chatter, some people repeated the outdated claim that large language models (LLMs) can't count the R's in "strawberry"—a myth that's been debunked for quite some time. It makes you wonder if they've even tried it before echoing such misconceptions. Humans are a funny bunch.
As always, the demo gods were in our favour. Patrik Mälarholm, Abrahamsson Christer, and I were thrilled with the launch, and the positive reception reinforced our belief in the project's potential.
Our Vision and Intentions
Before diving into the technical depths of the system we've built at Tyréns AB with the help of Predli, it's important to clarify our intentions. For obvious reasons, we're not aiming to compete with major players like OpenAI or Anthropic; attempting that with our small team wouldn't be feasible. Instead, our focus is on constructing a state-of-the-art knowledge system—a platform that allows us to store, utilize, and combine our unique information to generate powerful, data-driven insights.
By building our own tailored solution, we're able to address our specific needs while maintaining control over our proprietary data. This approach empowers us to leverage the benefits of advanced LLMs without compromising on data privacy or becoming overly dependent on third-party providers.
Mindful Integration with Large Providers
While models like GPT-4o are among the most advanced in the world and accessible to all ChatGPT users, this accessibility comes with caveats. These sophisticated models are available at low or no cost largely due to data collection—they've already amassed vast amounts of publicly available information. This raises concerns about how our proprietary data might be used if we rely solely on these external services without any modifications.
Consider the example of Svensk Byggtjänst. Although they haven't publicly uploaded their core assets like the AMA codes, it's plausible that someone, somewhere, has made that information available by interacting with ChatGPT. We wouldn't want to risk our sensitive data being inadvertently shared or used to further train external models, potentially benefiting even our competitors. Therefore, mindful integration with big providers is essential to safeguard our interests.
Understanding Model Enhancements
An insightful conversation between Dwarkesh Patel and John Schulman, co-founder of OpenAI, sheds light on how models like GPT-4 have improved over time. Schulman confirms that post-training—refinements made after the initial training phase—is a significant factor in the performance gains of these models:
Dwarkesh Patel: "The current GPT-4 has an Elo score that is like a hundred points higher than the original one that was released. Is that all because of what you're talking about, with these improvements that are brought on by post-training?"
John Schulman: "Yeah, most of that is post-training. There are a lot of different, separate axes for improvement."
This highlights the importance of ongoing development and customization to meet specific needs. It reinforces our decision to focus on refining and tailoring our system rather than relying entirely on external models.
Choosing the Right Tools for Our Goals
When it comes to large language models, there are several key areas where a system can be enhanced, from prompting to retrieval to training itself.
Our focus has been on retrieval-augmented generation (RAG) and fine-tuning. RAG, in particular, allows us to combine a retrieval system with a generative model. This means the model can pull relevant information from our proprietary data sources, providing accurate, real-time responses without relying on external models that might compromise data privacy.
By selecting tools that align with our goals, we're able to create a system that not only meets our specific needs but also mitigates the risks associated with data sharing and dependency on third-party providers.
SvenAI's Role at Tyréns
SvenAI is an AI-powered tool supporting various applications in infrastructure planning, construction management, and research. Its primary objectives are to make our collective knowledge easy to retrieve, to support data-driven decision-making, and to keep proprietary information under our own control.
By combining sophisticated AI approaches, SvenAI becomes a highly versatile solution capable of responding to a wide range of queries while delivering accurate and insightful information.
Let's take a look under the hood!
HyDE and LLM ReRank: Enhancing Retrieval with Hypothetical Generation and Reranking
HyDE (Hypothetical Document Embeddings) and LLM ReRank enhance information retrieval by generating hypothetical documents that could answer a query and then reranking results based on contextual relevance. HyDE creates synthetic documents related to the query and converts them into vector embeddings, allowing the system to consider both existing and potential answers. LLM ReRank reevaluates and scores these documents using a large language model to prioritize the most relevant information. This combination ensures that even when direct answers aren't available, the AI can generate plausible responses and present them effectively.
1. Hypothetical Document Embeddings (HyDE)
HyDE improves document retrieval by generating synthetic documents based on a query. The model first drafts a hypothetical answer to the question, that draft is converted into a vector embedding, and the embedding is then used to retrieve the real documents whose embeddings are most similar. In effect, the system matches answer-shaped text against answer-shaped text, rather than a short question against long documents.
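The loop above can be sketched in a few lines. This is a minimal illustration, not our production setup: the embedding is a toy bag-of-words counter, and `generate_hypothetical_document` is a stub standing in for a real LLM call.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words embedding; a real system would use a neural embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def generate_hypothetical_document(query):
    # Stub standing in for an LLM call that drafts a plausible answer.
    return f"A report about {query} would cover {query} in detail."

def hyde_retrieve(query, corpus, k=2):
    # Embed the hypothetical answer rather than the raw query, then
    # rank the real documents by similarity to that embedding.
    hypo = embed(generate_hypothetical_document(query))
    return sorted(corpus, key=lambda d: cosine(hypo, embed(d)), reverse=True)[:k]

corpus = [
    "Concrete bridge maintenance schedules and inspection intervals.",
    "Noise regulations for urban rail construction projects.",
    "Cafeteria opening hours at the head office.",
]
print(hyde_retrieve("bridge maintenance", corpus, k=1))
```

The key design point is that the hypothetical document, not the query, is what gets embedded and compared against the corpus.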
2. LLM ReRank
Once the initial documents are retrieved, LLM ReRank refines the results to ensure relevance: each candidate is scored by a large language model on how well it actually answers the query, and the list is reordered so the most relevant documents are presented first.
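A minimal sketch of that reranking step follows. The scoring function here is a term-overlap stub labeled as such; in a real pipeline it would be a prompt asking the LLM to rate each document's relevance.

```python
def llm_relevance_score(query, document):
    # Stub for an LLM scoring call: a real system would prompt the model
    # to rate how well the document answers the query (e.g. 0-10).
    q_terms = set(query.lower().split())
    d_terms = set(document.lower().split())
    return len(q_terms & d_terms)

def rerank(query, candidates, top_n=3):
    # Re-score each retrieved candidate and keep the highest-scoring ones.
    scored = [(llm_relevance_score(query, d), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_n]]

candidates = [
    "General overview of Nordic infrastructure trends.",
    "Tunnel ventilation requirements for long rail tunnels.",
    "Rail tunnel fire safety and ventilation design guidance.",
]
print(rerank("rail tunnel ventilation", candidates, top_n=2))
```

Because reranking runs only over the small candidate set from the first retrieval pass, the expensive LLM scoring stays cheap.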
Key Strengths of HyDE and LLM ReRank
This approach offers several key benefits: hypothetical documents bridge the vocabulary gap between short queries and long documents, improving recall even when no document directly answers the question, while reranking improves precision by pushing the most contextually relevant results to the top.
Graph RAG: Structured, Contextual Retrieval for Complex Queries
While HyDE and LLM ReRank are focused on generating and refining individual document results, Graph RAG takes a different approach by structuring information into a knowledge graph. This graph allows the AI to explore complex relationships, themes, and entities across the dataset.
1. Knowledge Graph Generation
Graph RAG automatically generates a knowledge graph from documents using an LLM. This graph captures the relationships between key entities, concepts, and topics.
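The extraction step can be sketched as follows. The triple extractor here is a stub: for the sake of a runnable example, the "documents" carry one pre-annotated "subject -> relation -> object" per line, standing in for what the LLM would extract from free text.

```python
def extract_triples(text):
    # Stub for an LLM extraction call; in this sketch the documents are
    # pre-annotated with one "subject -> relation -> object" per line.
    triples = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("->")]
        if len(parts) == 3:
            triples.append(tuple(parts))
    return triples

def build_graph(documents):
    # Merge triples from all documents into one adjacency structure.
    graph = {}
    for doc in documents:
        for subj, rel, obj in extract_triples(doc):
            graph.setdefault(subj, []).append((rel, obj))
    return graph

docs = [
    "SvenAI -> uses -> Graph RAG\nGraph RAG -> builds -> knowledge graph",
    "SvenAI -> uses -> HyDE",
]
graph = build_graph(docs)
print(graph)
```

Note how entities mentioned in different documents merge into the same node, which is what later allows cross-document reasoning.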
2. Community Detection and Summarization
A standout feature of Graph RAG is its ability to detect communities of related entities and generate summaries for each: entities that are densely connected in the graph are clustered together, and an LLM writes a summary of every cluster, giving the system ready-made overviews of the dataset's main themes.
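A simplified sketch of the idea, with two deliberate simplifications: connected components stand in for real community detection (implementations such as Microsoft's Graph RAG use proper graph clustering), and the summarizer is a stub for an LLM call.

```python
def communities(edges):
    # Build an undirected adjacency map, then find connected components
    # via depth-first search; each component is treated as a community.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:
            n = stack.pop()
            if n in group:
                continue
            group.add(n)
            stack.extend(adj[n] - group)
        seen |= group
        groups.append(group)
    return groups

def summarize_community(group):
    # Stub for an LLM summarization call over the community's entities.
    return "Community covering: " + ", ".join(sorted(group))

edges = [("bridge", "steel"), ("steel", "corrosion"), ("zoning", "permits")]
for g in communities(edges):
    print(summarize_community(g))
```

The summaries are computed once at indexing time, so query time only has to read them, not rebuild them.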
3. Global Question Handling
While traditional retrieval methods focus on finding semantically similar chunks of text, Graph RAG can handle global queries by exploring connections and relationships across the entire dataset. This provides more comprehensive answers, considering the full context of the data.
4. Map-Reduce Question Answering
Graph RAG utilizes a map-reduce approach to answer complex queries: in the map step, each relevant community summary is used to produce a partial answer, and in the reduce step those partial answers are combined into a single, coherent response.
This method ensures the AI is able to handle large datasets efficiently, offering precise responses even for complex queries.
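The two steps can be sketched in one function. Both the per-community answerer and the combiner are stubs standing in for LLM calls; the filtering heuristic is an illustrative assumption.

```python
def map_reduce_answer(question, community_summaries):
    # Map: ask (a stub of) the LLM for a partial answer per community.
    def partial_answer(summary):
        # Stub: keep the summary only if it mentions a question term.
        terms = set(question.lower().split())
        return summary if terms & set(summary.lower().split()) else None

    partials = [partial_answer(s) for s in community_summaries]
    # Reduce: combine the useful partial answers into one response.
    useful = [p for p in partials if p]
    return " / ".join(useful) if useful else "No relevant information found."

summaries = [
    "flood risk along the river corridor",
    "heritage buildings in the city centre",
    "flood defences and pumping capacity",
]
print(map_reduce_answer("flood planning", summaries))
```

Because each community is handled independently in the map step, the work parallelizes naturally across large datasets.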
5. Improved Performance with Graph RAG
The Graph RAG approach brings several advantages: it can answer global questions that span the whole dataset, it grounds responses in explicit relationships rather than isolated text chunks, and its pre-computed community summaries keep query handling efficient even at scale.
Agentic Approaches: Dynamic Query Handling
One of the more advanced features being explored is the introduction of agentic approaches in Graph RAG. These approaches enable the AI to dynamically switch between local and global search strategies based on the specific nature of the query, making the system more flexible and intelligent in its query processing.
1. Local Search for Precision
In local search, the system identifies specific nodes and relationships within a defined subset of the graph, focusing on precision over breadth. This approach is ideal for highly detailed, targeted queries where the answer is likely contained within a limited part of the dataset.
2. Global Search for Comprehensive Context
For broader, more exploratory queries, the global search strategy enables the system to traverse the entire graph. This strategy provides a comprehensive understanding of the relationships and entities within the dataset, perfect for high-level queries requiring an understanding of the bigger picture.
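The contrast between the two strategies can be made concrete over the same adjacency structure of entities and relations. This is a sketch under assumed data shapes, not how a full Graph RAG engine is implemented.

```python
def local_search(graph, start, depth=2):
    # Local: explore only the neighbourhood around a starting entity,
    # trading breadth for precision.
    frontier, found = {start}, {start}
    for _ in range(depth):
        frontier = {tgt for node in frontier
                    for _, tgt in graph.get(node, [])} - found
        found |= frontier
    return found

def global_search(graph):
    # Global: consider every entity in the graph for full context.
    nodes = set(graph)
    for edges in graph.values():
        nodes |= {tgt for _, tgt in edges}
    return nodes

graph = {
    "SvenAI": [("uses", "HyDE"), ("uses", "Graph RAG")],
    "Graph RAG": [("builds", "knowledge graph")],
}
print(local_search(graph, "SvenAI", depth=1))
print(global_search(graph))
```

With depth 1, the local search never reaches entities two hops away, which is exactly the precision-over-breadth trade-off described above.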
3. Adaptive Search with Agents
By integrating agentic behavior, the AI can adapt its approach based on the query's complexity and the information available. It can assess whether a question is narrow or broad in scope, pick the local or global search strategy accordingly, and fall back to the other strategy if the first pass returns too little.
This agentic behavior introduces a more intelligent, dynamic response generation system, enabling the AI to refine its search methods on the fly, which can lead to more accurate and contextually appropriate results.
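A minimal router sketch follows. The keyword classifier is a heuristic stand-in for an LLM routing decision, and the two search backends are hypothetical lambdas for illustration only.

```python
GLOBAL_CUES = {"overall", "themes", "across", "summary", "trends", "compare"}

def classify_query(query):
    # Heuristic stand-in for an LLM routing decision: broad, thematic
    # wording suggests a global search, otherwise try local first.
    return "global" if set(query.lower().split()) & GLOBAL_CUES else "local"

def answer(query, local_search, global_search, min_results=1):
    # Route to the chosen strategy; fall back to global search when the
    # local pass returns too little to work with.
    if classify_query(query) == "global":
        return global_search(query)
    results = local_search(query)
    return results if len(results) >= min_results else global_search(query)

# Hypothetical search backends for illustration only.
local = lambda q: ["bridge B12 load rating report"] if "b12" in q.lower() else []
global_ = lambda q: ["dataset-wide summary"]

print(answer("load rating of bridge B12", local, global_))
print(answer("overall themes in our projects", local, global_))
```

The fallback branch is what makes the behavior "agentic" in the simplest sense: the system inspects its own intermediate result and changes strategy on the fly.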
Connecting the Methodologies to Our Objectives
These methodologies are not just abstract concepts; they directly contribute to achieving SvenAI's primary objectives. By integrating HyDE and LLM ReRank, we improve the precision and relevance of information retrieval, which is crucial for efficient data-driven decision-making. Graph RAG and agentic approaches further enhance our ability to handle complex queries, providing contextual responses that consider the broader landscape of information.
To illustrate how these methodologies benefit our work at Tyréns, let's explore two use cases:
Use Case 1: Research Support for Large Infrastructure Projects
Context:
When planning large-scale infrastructure projects, decision-makers rely on vast amounts of technical reports, research papers, regulations, and environmental studies. Finding the most relevant and up-to-date information quickly is challenging.
Solution Using HyDE and LLM ReRank:
When an engineer poses a question, HyDE drafts a hypothetical answer and retrieves the reports, regulations, and studies that most resemble it; LLM ReRank then reorders those candidates so the most relevant material is presented first.
Value:
This combination saves time and improves decision quality by providing comprehensive and precise information retrieval, which is crucial for industries like infrastructure, healthcare, or legal services. By efficiently navigating large datasets, project managers and engineers can make informed decisions without the tedious process of manual filtering.
Use Case 2: Optimizing Knowledge Management for Urban Planning
Context:
Urban planners working on complex city infrastructure projects need to access, integrate, and summarize dispersed information from various documents and data sources.
Solution Using Graph RAG:
Dispersed documents and data sources are consolidated into a knowledge graph, letting planners explore the relationships between entities across the material; community summaries provide high-level overviews, while local search surfaces the specific details behind them.
Value:
This approach supports more informed and strategic decision-making in urban development by offering comprehensive, context-rich summaries that reflect both specific details and broader insights. The ability to understand relationships between different data points enhances the quality of planning and execution in urban projects.
What’s Next for Tyréns’ AI Solution?
The journey to improving Sven’s capabilities is ongoing, with several exciting developments on the horizon:
1. Expanding Agentic Approaches
The integration of more agentic search strategies will continue, allowing the AI to better manage both local and global searches. By dynamically adjusting the search methodology based on the query’s needs, future iterations of Sven will be more adaptable, leading to even more precise and comprehensive responses.
2. Hyperparameter Tuning for Efficiency
Fine-tuning the system's hyperparameters—such as chunk size, token overlap, and rank thresholds—will enhance processing efficiency and performance. This optimization will lead to faster response times and more accurate results, improving the overall user experience.
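To make those knobs concrete, here is a toy character-based splitter showing how chunk size and overlap interact. A real pipeline would split on tokens with a proper tokenizer; this is a sketch of the trade-off, not our tuning code.

```python
def chunk(text, chunk_size=200, overlap=50):
    # Split text into overlapping windows. chunk_size and overlap are
    # exactly the kind of hyperparameters tuned for retrieval quality:
    # larger chunks give more context per hit, more overlap reduces the
    # risk of splitting an answer across a boundary, and both raise cost.
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

pieces = chunk("x" * 500, chunk_size=200, overlap=50)
print(len(pieces), [len(p) for p in pieces])
```

Raising the overlap from 50 to 100 here would produce more, more redundant chunks, which is why these values are tuned empirically rather than fixed.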
And lots and lots more... Stay tuned!