# RAG is the concept, not a product

I thank Sherman for showcasing the work of our ML team at Nyota AI: https://lnkd.in/epX965Ch His illustrative diagram captures well what RAG is all about. It is now up to me to build on his introduction and dig a bit deeper under the hood.

The "RAG" label hides a large variety of algorithmic approaches and technologies, most of which share the idea of in-context learning. The "A" in the RAG acronym stands for Augmentation: the input of the Large Language Model (LLM) is augmented, in contrast with a "plain" LLM invocation where everything the model "knows" comes from pretraining. In other words, in the RAG case the LLM input contains not only the user's question but also other useful information, which we assume makes the model's job easier.

That's why RAG offers important promises:

1. An alternative to personalization/fine-tuning: unlike foundation models trained by providers on large volumes of publicly available and synthetic data (plus RLHF), RAG allows the model to use your local data directly, without extensive data preparation or expensive, lengthy model fine-tuning.
2. Reduction of hallucinations: by finding facts and answers in your data, the model does not have to invent them uninformed.

The above implies the limitations of this approach:

- The retriever must obtain relevant context and preferably avoid presenting irrelevant information (focus, noise reduction).
- The LLM must be able to process a larger amount of presented information, guided by prompts, and generate a quality response.

Both represent significant challenges: RAG is only as good as its critical components. It is not difficult to see that overall performance is largely driven by the application's needs. The quantity, nature, and structure of stored documents, the way they are preprocessed, the models used, and prompt engineering all mean that there is no one-size-fits-all solution. Instead, attention must be paid to details, as they will determine how well the system works.
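The augmentation step is easy to see in code. Below is a minimal, self-contained sketch of the idea: toy bag-of-words retrieval stands in for a real embedding model and vector index, and the sample documents and prompt wording are illustrative assumptions, not any particular product's implementation.

```python
# Minimal sketch of the "A" in RAG: retrieve context, then augment the prompt.
# Toy bag-of-words retrieval for illustration only; a real system would use an
# embedding model and a vector index.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_augmented_prompt(query: str, docs: list[str]) -> str:
    # The augmented input: retrieved context + the user's question.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer the question using ONLY the context below.\nContext:\n{context}\n\nQuestion: {query}"

docs = [
    "Nyota AI builds meeting assistants.",
    "RAG augments LLM inputs with retrieved context.",
]
print(build_augmented_prompt("What does RAG do?", docs))
```

The resulting prompt would then be passed to the LLM in place of the bare question.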
*Pavel Suchmann's Post*

## More Relevant Posts
Enterprise search capabilities and the methods we use to store enterprise data are outdated. How can we re-imagine the way we create and store data so AI systems can start consuming it without pre-processing/grounding? What you have built, Pulse AI (YC S24), is a great achievement!
Pulse AI (YC S24) just launched an API for production-grade unstructured document extraction, turning complex information into LLM-ready inputs. No training required.

Approximately 75% of enterprise data is unstructured, and the majority of it sits directly in PDF files. This makes it extremely difficult to build RAG applications with this data, and ingestion is often the bottleneck. The team tested every tool on the market and found they lacked accurate contextual understanding and support for multi-column PDFs and multimodal documents. Most current offerings are simply wrappers around Textract or Gemini, which have their own inherent flaws.

Pulse trained its own set of Vision Language Models (VLMs) and OCR techniques to bridge this gap, reaching state-of-the-art (SOTA) performance on its vision model for documents and spreadsheets. The API processes all PDF types, including handwritten documents, foreign languages, and more, and integrates seamlessly into new or existing engineering workflows. The team is also actively working on a novel reasoning tool for spreadsheets using this technology – stay tuned.

Pulse's API saw initial success with supply chain teams in the three-way match process and is deployed at companies across hardware, healthcare, manufacturing, and more. Sign up at studio.trypulse.ai. Congrats on the launch Sid Manchkanti and Ritvik Pandey! 🚀 https://lnkd.in/gKW37rcH
---
🤖 When you use AI properly, amazing things can happen. Pulse AI (YC S24)* is a great example of that. Why is this such a useful case study? We're glad you asked...

Data** can be a miracle worker, particularly when you're in a large or complex organisation. Data - or, more accurately, the insights it holds - can make a profound difference to your operational efficiency, your decision-making, and your ability to uncover opportunities and innovations that will futureproof your business.

🏆 And here's the kicker: YOU ALREADY HAVE IT.

But that data can only fulfil its potential if it's structured properly and then presented in ways that make it accessible and usable to everyone in your business - from domain experts to the C-level. And there are huge challenges with collecting fragmented data, even using AI. AI is great at reading reports and the like, but it hasn't been great at extracting and structuring data that's held in different places and in different formats.

👏 That's what these Pulse AI guys are looking to sort out. If you can extract your data and structure it, you can then use machine learning and generative AI models to create a knowledge base your whole business can access, understand, and take advantage of.

💥 This is properly transformative stuff. But the lesson here is: you've got to use the right models in the right way. Give our MD Andreas a shout if you want to chat about the ways you can use AI to transform how you do business.

* We are not connected to Pulse in any way, in case you're wondering. We just like what they're up to.

** Grammar pedants: we are aware of the singular and plural of data. It just sounds weird applying the correct grammatical rules.
---
Data cleaning and preparation did not go away with gen AI; after all, these are machine learning models. Here are some of the preprocessing steps I have personally used to get better retrieval:

1. Document collection: Gather all relevant text documents (web scraping, PDFs, and more).
2. Text preprocessing:
- Remove irrelevant content (e.g., headers, footers)
- Clean the text (remove special characters and stray Unicode, lowercase everything, normalize whitespace)
- Correct spelling and formatting issues
3. Text segmentation:
- Split documents into smaller chunks (e.g., with LangChain)
- Ensure chunks are semantically meaningful and self-contained
4. Metadata extraction:
- Extract relevant metadata (e.g., document title, date, author)
- Associate metadata with corresponding text chunks
5. Embedding generation (Chroma DB, for example, can do it for you, or you can specify a model):
- Use a suitable embedding model to convert text chunks into vector representations
- Store embeddings alongside the original text and metadata
6. Indexing:
- Create an efficient index for quick retrieval (e.g., using vector databases like Pinecone, FAISS, or Chroma DB)
7. Quality assurance:
- Verify the accuracy and relevance of processed documents
- Ensure proper linking between chunks, metadata, and embeddings

#RAG #LLMs #DataPreparation #DataCleaning
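A minimal sketch of steps 2-6 above (cleaning, chunking, embedding, indexing), using Chroma's default embedding model as the post suggests. Assumes `pip install chromadb`; the sample document, chunk size, and cleaning rules are illustrative assumptions.

```python
# Clean -> chunk -> embed -> index, end to end, with Chroma.
import re
import chromadb

def clean(text: str) -> str:
    text = re.sub(r"\s+", " ", text)            # normalize whitespace
    text = re.sub(r"[^\w\s.,;:!?-]", "", text)  # strip special characters
    return text.strip().lower()

def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    # Naive fixed-size chunking with overlap; LangChain's splitters add
    # sentence awareness if you need semantically self-contained chunks.
    return [text[i : i + size] for i in range(0, len(text), size - overlap)]

docs = {"report-2024": "Quarterly revenue grew 12 percent... (imagine a long document)"}

client = chromadb.Client()                     # in-memory; PersistentClient stores to disk
collection = client.create_collection("docs")  # Chroma embeds with its default model

for doc_id, raw in docs.items():
    chunks = chunk(clean(raw))
    collection.add(
        ids=[f"{doc_id}-{i}" for i in range(len(chunks))],
        documents=chunks,
        # Metadata stays linked to each chunk (step 4 and step 7).
        metadatas=[{"source": doc_id, "chunk": i} for i in range(len(chunks))],
    )

print(collection.query(query_texts=["revenue growth"], n_results=2)["documents"])
```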
---
Great insights here! The potential of AI to unlock hidden value in a company’s data is truly transformative, especially when data from different sources can be organized and centralized for easy access. I completely agree that structuring and standardizing data is essential to make AI work effectively and drive impactful insights. Pulse AI’s approach sounds like a game-changer, especially for complex organizations with fragmented data. It's exciting to think how accessible and actionable data could be when it's organized properly. Thanks for sharing this perspective – it’s inspiring to see how the right use of AI can enable such powerful changes!
---
**Unlock the Secrets of Transformers with Transformer Explainer**

Transformers have revolutionized machine learning, but their inner workings often remain a mystery to many. Introducing Transformer Explainer, an innovative, interactive visualization tool designed to demystify Transformers for non-experts through the GPT-2 model.

**Key Features:**
- **Interactive Learning:** Understand complex Transformer concepts with a model overview and smooth transitions across different abstraction levels.
- **Live GPT-2 Instance:** Experiment with your own input and observe in real time how the Transformer predicts the next tokens.
- **No Installation Required:** Access modern generative AI techniques without needing special hardware or software installations.

**Explore Now:** Tool: https://lnkd.in/ebCzFMrb https://lnkd.in/eMZSRztC

#TransformerExplainer #AIforAll #MachineLearning #GPT2 #InteractiveLearning
---
You can build this solution with a simple PDF data extract and an LLM. I know because I built one. It automatically took email attachments containing an #RFQ, inserted the data into ServiceM8, and generated a quote based on its knowledge base.
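A rough sketch of the extract-then-prompt approach described here. The file name, prompt, and `call_llm` stand-in are hypothetical, and the ServiceM8 insertion step is omitted. Assumes `pip install pypdf`.

```python
# Extract text from a PDF attachment, then hand it to an LLM for quoting.
from pypdf import PdfReader

def extract_pdf_text(path: str) -> str:
    reader = PdfReader(path)
    # extract_text() can return None for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)

rfq_text = extract_pdf_text("rfq_attachment.pdf")  # hypothetical file saved from the email
prompt = (
    "Extract the requested items and quantities from this RFQ, then draft a "
    "quote using our standard price list.\n\n" + rfq_text
)
# response = call_llm(prompt)  # hypothetical LLM client call
```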
---
Excited to share insights on tackling one of the most persistent challenges in AI: extracting contextually accurate information from complex documents!

🔍 The Challenge: As humans, we understand entities, relationships, and context intuitively, but LLMs struggle with these concepts, especially when they are not stated explicitly.

💡 The Solution: HybridRAG - a different approach combining Knowledge Graphs and Vector Retrieval

Key benefits:
• Navigates complex document structures with ease
• Resolves ambiguities through enhanced contextual understanding
• Handles inconsistent terminology and non-standard formats
• Scales to process large volumes efficiently

I tried applying this technique in creating a Legal Case RAG app, and the results are truly exciting. By leveraging both graph-based and vector-based retrieval, we're able to extract nuanced information from even the most complex legal documents, opening up new possibilities for automation and analysis in the legal tech space (a toy sketch of the combined retrieval follows the paper link below).

What are your thoughts? Have you encountered similar challenges in your field? How do you see technologies like HybridRAG transforming industries that rely heavily on complex document analysis?

#ArtificialIntelligence #RAG #DataExtraction

Research paper: https://lnkd.in/gyGVupF5
Example Code: https://lnkd.in/gVHkNsfT

Special thanks and appreciation to Jaelin Lee, Hanieh Moshki, William Green, Subhrajit Makur, and the team.

Full story: https://lnkd.in/gCvuHzdQ
HybridRAG: Integrating Knowledge Graphs and Vector Retrieval Augmented Generation for Efficient Information Extraction (arxiv.org)
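A toy sketch of the HybridRAG idea: answer context is drawn from both a knowledge graph (explicit entities and relations) and vector retrieval (semantic similarity), then merged. The triples, documents, and bag-of-words scoring are illustrative assumptions, not the paper's implementation.

```python
import math
from collections import Counter

# --- Knowledge-graph side: explicit (subject, relation, object) triples ---
triples = [
    ("Acme Corp", "acquired", "Widget Ltd"),
    ("Widget Ltd", "headquartered_in", "Dublin"),
]

def graph_context(entity: str) -> list[str]:
    # Return every triple that mentions the entity, verbalized as text.
    return [f"{s} {r} {o}" for s, r, o in triples if entity in (s, o)]

# --- Vector side: toy bag-of-words similarity over text chunks ---
chunks = [
    "The acquisition of Widget Ltd closed in Q3.",
    "Acme Corp reported record revenue last year.",
]

def sim(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def vector_context(query: str, k: int = 1) -> list[str]:
    return sorted(chunks, key=lambda c: sim(query, c), reverse=True)[:k]

def hybrid_context(query: str, entity: str) -> list[str]:
    # Union of both retrieval channels; a real system would dedupe and rerank.
    return graph_context(entity) + vector_context(query)

print(hybrid_context("When did the Widget acquisition close?", "Widget Ltd"))
```

The graph channel supplies facts that are never spelled out in any single chunk, which is exactly where vector-only RAG tends to struggle.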
---
🌟 Pushing Boundaries of AI: My Multi-modal RAG-Based LLM Project 🌟

I'm excited to share my project: Multi-modal RAG (Retrieval-Augmented Generation) Based LLM for Information Retrieval! 🚀

🎯 Project Highlights: This system is designed to retrieve and generate accurate answers from unstructured data, seamlessly combining text and visual modalities to provide a highly interactive user experience.

💡 Key Features:
✅ Multi-modal Capabilities: Handles both text and visual inputs/outputs, making it adaptable to diverse datasets.
✅ Advanced Query Handling: Integrates AI agents for intelligent, context-aware query processing. Implements a "Tree of Thought" agent for breaking queries into sub-questions. Employs a cross-encoder for reranking, ensuring only the most relevant results are presented (a toy sketch of this step follows below).
✅ Conversational Interaction: Supports complex queries like multi-step reasoning and calculations (e.g., averaging discussed data points).
✅ Efficient Retrieval: Built on a robust RAG framework, it uses state-of-the-art vector databases like Weaviate, Pinecone, and LanceDB for rapid and accurate retrieval.
✅ Feedback Integration: Enhances performance by learning from user feedback using LangSmith.
✅ Streamlined Interface: Developed with Streamlit for a simple yet powerful user experience.

🛠️ Tech Stack:
- LLM: Mistral-7B for high-quality natural language understanding.
- Vector Databases: Weaviate, Pinecone, LanceDB for scalable data handling.
- Frameworks: LangGraph, Streamlit, and HuggingFace tools for seamless integration.

✨ What Makes This Unique: This project bridges the gap between AI and real-world usability, offering a multi-modal, context-aware retrieval system capable of handling unstructured data with exceptional accuracy. It's a step toward building smarter, more intuitive AI solutions for diverse applications.

This journey has been incredibly rewarding, and I'm excited to continue exploring the endless possibilities of AI and RAG systems. I'd love to hear your thoughts, ideas, or experiences with similar technologies!

#AI #RAG #LLM #MultiModal #InformationRetrieval #DataScience #Mistral #LangGraph #Streamlit #LearningAndGrowing
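A minimal sketch of the reranking step mentioned above: a vector search returns candidates, then a cross-encoder rescores each (query, passage) pair so only the most relevant passages reach the LLM. Assumes `pip install sentence-transformers`; the model name is a commonly used reranker checkpoint and the candidate passages are illustrative.

```python
from sentence_transformers import CrossEncoder

# Pretrained reranker; scores each (query, passage) pair jointly.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "What was the average revenue across the discussed quarters?"
candidates = [  # e.g., top-k hits from Weaviate/Pinecone/LanceDB
    "Q1 revenue was $1.2M and Q2 revenue was $1.4M.",
    "The offsite was held in March.",
    "Q3 revenue came in at $1.6M.",
]

scores = reranker.predict([(query, c) for c in candidates])
reranked = [c for _, c in sorted(zip(scores, candidates), reverse=True)]
print(reranked[:2])  # keep only the best passages for the LLM context
```

Cross-encoders are slower than the bi-encoders used for the initial vector search, which is why they are applied only to the short candidate list rather than the whole corpus.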
---
Day 27 - An interesting approach to agentic workflows, whose manual design is labor-intensive and limits the broader application of LLMs. AFLOW reframes workflow optimization as a search problem over code-represented workflows 🖥️, offering a more flexible and comprehensive approach than previous methods that relied on manual setup 🤖 or limited search spaces.

✨ Key Innovations:
- Code Representation: AFLOW uses code to represent workflows. This allows precise control over what the LLMs do, includes conditional logic, and helps integrate complex structures like graphs. 🖥️
- Operators: AFLOW introduces "operators," groups of reusable nodes for common tasks such as Ensemble or Review & Revise. Operators make workflow creation simpler and improve search efficiency. 🔄
- Monte Carlo Tree Search (MCTS): AFLOW uses a modified MCTS algorithm to explore different workflow setups. This process evaluates workflows, learns from the results, and improves them. The tree structure keeps track of past experience for better choices. 🌳
- Soft Mixed-Probability Selection: This method balances exploration and exploitation by mixing a uniform distribution with a score-based one, ensuring a wide range of options while focusing on successful workflows (a toy sketch follows the paper link below). ⚖️
- LLM-Driven Expansion: An LLM helps create new workflows by modifying existing ones and their prompts, using insights from past changes and performance data to guide its improvements. 📈

Results and implications: AFLOW was tested on six benchmark datasets across question answering, code generation, and mathematics. The results show that AFLOW creates workflows that exceed those of traditional methods and other automated systems, allowing smaller LLMs to perform as well as or better than larger ones, leading to cost savings 💰. By reducing the need for human expertise, AFLOW simplifies workflow design ✨, making LLM technology more accessible and scalable. Its ability to find cost-effective workflows could promote wider use of LLMs in real-life applications 🌍.

Notable caveats: AFLOW mainly focuses on reasoning tasks with clear evaluation methods. 🧠 Extending it to more complex tasks, especially those involving subjective judgment or changing environments, would be valuable future research. 🔍 Incorporating reinforcement learning could also improve workflow optimization and yield more flexible AI systems. 💡

Paper link: https://lnkd.in/gbbCXZbC

#GenAI #0to100xengineer
AFlow: Automating Agentic Workflow Generation (arxiv.org)
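A toy sketch of the soft mixed-probability selection described above: candidate workflows are sampled from a blend of a uniform distribution (exploration) and a score-proportional distribution (exploitation). The workflow names, scores, and mixing weight are illustrative assumptions, not the paper's exact formulation.

```python
import random

def soft_mixed_select(workflows: list[str], scores: list[float], lam: float = 0.5) -> str:
    # lam = 1.0 is pure exploration (uniform); lam = 0.0 is pure exploitation.
    n = len(workflows)
    total = sum(scores)
    uniform = [1.0 / n] * n
    score_based = [s / total for s in scores] if total > 0 else uniform
    mixed = [lam * u + (1 - lam) * p for u, p in zip(uniform, score_based)]
    return random.choices(workflows, weights=mixed, k=1)[0]

workflows = ["direct-answer", "ensemble-of-3", "review-and-revise"]
scores = [0.62, 0.78, 0.71]  # e.g., benchmark accuracy of each candidate workflow
print(soft_mixed_select(workflows, scores))
```

The mixing keeps weak-but-unexplored workflows in play while still steering the search toward configurations that have already scored well.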