Pavel Suchmann’s Post

currently: Startup ML adventures ... formerly: Principal Software Engineer at Twilio Inc.

# RAG is the concept, not a product

I thank Sherman for showcasing the work of our ML team at Nyota AI: https://lnkd.in/epX965Ch

His illustrative diagram captures well what RAG is all about. It is now up to me to build on his introduction and delve a bit deeper under the hood.

The "RAG" label hides a wide variety of algorithmic approaches and technologies whose main common ground is the idea of in-context learning. The "A" in the acronym stands for Augmentation: the input of the Large Language Model (LLM) is augmented, in contrast with a "simple" LLM invocation where everything the model "knows" comes from pretraining. In other words, in the RAG case the LLM's input contains not only the user's question but also other useful information, which we assume makes the model's job easier.

That is why RAG holds two important promises:

1. Personalization / an alternative to fine-tuning: unlike foundation models trained by providers on large volumes of publicly available and synthetic data (plus RLHF), RAG lets the model use your local data directly, without extensive data preparation or expensive and lengthy fine-tuning.
2. Reduction of hallucinations: by finding facts and answers in your data, the model does not have to invent them uninformed.

These promises come with limitations:

- The retriever must obtain relevant context and preferably not present irrelevant information (focus, noise reduction).
- The LLM must be able to process a larger amount of presented information through the prompt and still generate a quality response.

Both represent significant challenges: RAG is only as good as its critical components. It is also easy to see that overall performance is largely influenced by the application's needs. The quantity, nature, and structure of the stored documents, the way they are preprocessed, the models used, and the prompt engineering all mean that there is no one-size-fits-all solution. Instead, attention must be paid to details, as they will determine how well the system works.
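
To make the augmentation step concrete, here is a minimal sketch in Python. It is not how our system is built: the bag-of-words cosine scoring stands in for whatever retriever an application actually uses, and the function names (`retrieve`, `build_prompt`), the `min_score` threshold, the prompt wording, and the sample documents are all hypothetical. It only illustrates the two points above: the retriever should pass on relevant context and drop noise, and the LLM receives the question together with that context.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=2, min_score=0.1):
    """Toy retriever (assumption: real systems typically use embeddings).

    Returns at most top_k documents, best match first, and drops anything
    scoring below min_score so irrelevant text never reaches the prompt.
    """
    query = Counter(tokenize(question))
    scored = [(cosine(query, Counter(tokenize(d))), d) for d in documents]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(question, context_docs):
    """Augment the user's question with the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


# Hypothetical local data, for illustration only.
documents = [
    "The refund window for annual plans is 30 days.",
    "Our office cafeteria serves lunch from noon.",
    "Refund requests are handled by the billing team.",
]
question = "What is the refund window for annual plans?"

# The resulting string is what would be sent to the LLM
# instead of the bare question.
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)
```

Even in this toy version the details matter: raising `min_score` trades recall for focus, and `top_k` bounds how much context the model has to process.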
