What's the hype surrounding Gen AI, LLMs, RAG and so on? It's crucial not to overlook this [Beginner]

The purpose of this article is to provide you with a foundational understanding of Gen AI and its associated terminology. By doing so, you'll be equipped to explore solutions independently and contribute to this groundbreaking transformation.

Before I go into the specifics of Gen AI, LLMs, Embedding, Vector Search, RAG etc., let's try to understand a bit about AI and ML first, along with a brief history, so that you don't feel alienated.

Background:

AI:

Artificial intelligence is a general field with a very broad scope, including Language Processing, Computer Vision, Summarisation etc.

Machine Learning: (check this course)

Machine learning is the branch of AI that covers the statistical part of artificial intelligence. It teaches a computer to solve problems by looking at thousands of examples, learning from them, and then using that experience to solve the same problem in new situations.

Machine Learning


Deep Learning and Neural Network: (check this coursera course)

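Deep learning is a specialised field of ML in which the computer learns from data through many layers of a neural network and makes intelligent decisions on its own. It involves a deeper level of automation than most ML algorithms, because the model learns the useful features itself instead of having them hand-crafted.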

Neural Network: As the name suggests, it is a network of neurons, but here the neurons are mathematical functions represented by the nodes in the diagrams below. All a neuron does is take an input, compute its function and output the result (a prediction, an estimate etc.).

Let's try to understand this in simple terms. If y = {1, 2, 3, 4} for x in {1, 2, 3, 4}, then the function is the linear equation y = x.

If y = {0, 3, 8, 15, ...} for x in {1, 2, 3, 4, ...}, then the function is the quadratic equation y = x^2 - 1. The input and output can also be related by an equation of any other order.

A simple model representation


One good thing about NNs is that, given enough data about the input (x) and the output (y), they are remarkably good at figuring out the function that accurately maps x -> y. In other words, if you train on enough data, the NN can find the best fit, whether that is a linear, quadratic or polynomial equation of a particular order.
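To make the idea concrete, here is a minimal sketch of a single "neuron" learning the y = x example from above. It assumes the simplest possible model, y = w * x + b, trained with plain gradient descent; the function name, learning rate and step count are illustrative choices, not anything a real framework prescribes.

```python
# Toy example: a single "neuron" y = w * x + b learns y = x from four points.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.0, 4.0]


def train_linear_neuron(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # start with no knowledge of the relationship
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Nudge the parameters in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train_linear_neuron(xs, ys)
print(f"learned w={w:.3f}, b={b:.3f}")
```

After training, w ends up very close to 1 and b very close to 0, i.e. the neuron has recovered y = x purely from examples. Real networks stack many such neurons with non-linear functions in between, which is what lets them fit far more complex relationships.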

Real-world problems are not as simple as that, but this should give you a layman's understanding.

A simple diagram to show various dimensions of NN


Evolution of AI

Evolution of AI Landscape



Generative AI

Before Gen AI, we had ML models, as described in the earlier paragraphs, that help you make predictions or forecasts based on existing data (the historical data you trained your model on). Generative AI, in contrast, learns the patterns in the training data and generates new content that resembles that data but is not an exact replica of it.

Generative AI is used in applications such as image generation, video description, text generation, music composition and content creation.

You now have the power of AI in your hands. Instead of Googling things, try out these tools to get familiar with it on the go.

  • Krutrim: open it from the Ola app menu options and start playing.
  • Meta AI: available in the WhatsApp app, right at your fingertips.
  • Microsoft Copilot.
  • And then you have ChatGPT.


Let's now talk about the elephant in the room: LLMs. What are they? What can they do? Check out the course Startup-School-AI to get familiar with Gen AI and all the terminology I am listing here. The hands-on exercises are based on Google's Gemini model, but you can still pick up the theoretical knowledge and watch the demos to understand things in a bit more detail.

  • LLMs
  • Prompt Engineering
  • Embedding & Vector Search
  • RAG - Retrieval Augmented Generation


LLMs:

LLMs (Large Language Models) are models trained on huge amounts of data, which enables them to learn the statistical patterns and semantic relationships within language. Once tested and fine-tuned, they are released for the outside world to use. Given a sequence of text (a sentence, a document etc.), what an LLM does is predict the next token in that sequence. LLMs are typically based on deep learning architectures. A simple diagram below shows how we arrive at an LLM after training. Examples include OpenAI's GPT-3 and GPT-4, Google's BERT, LaMDA, PaLM and Gemini, and Anthropic's Claude.
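As a rough illustration of "predict the next token", here is a toy sketch that simply counts which word follows which in a tiny corpus. This is a bigram counter, not how real LLMs work internally (they use deep neural networks over sub-word tokens), and the corpus and function names are made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real LLMs train on vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish ."


def train_bigram_model(text):
    """Count how often each token follows each other token."""
    tokens = text.split()
    follow_counts = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follow_counts[current][nxt] += 1
    return follow_counts


def predict_next(model, token):
    """Return the most frequent follower of `token`, or None if unseen."""
    followers = model.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

The principle carries over: the model has learned statistics from its training text and uses them to guess what comes next. An LLM does the same with a far richer notion of context than just the previous word.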


These LLMs can now be used to create different solutions. From the outside these look like separate apps, but under the hood they may use the same LLM, e.g. GitHub Copilot, Microsoft 365 Copilot etc.

Multi-modality: Here models can process and understand data from various sources, such as text, images, audio, video and sensor data, instead of data of one specific type. Traditional language models, like the earlier GPT models, primarily focus on processing text. Multi-modal LLMs extend this capability to incorporate and understand other forms of data, such as images, audio, video or structured data.

Prompt Engineering:

Prompt: The text you feed to the model is called the prompt.

Prompt engineering is the art and science of figuring out what text to feed your language model to nudge it to behave in the desired way. You might have an amazing algorithm, but if your prompts are not good enough, you can't get the full potential out of that algorithm.

A prompt can be of the following types:

  • Question input: What is the best veg restaurant in JP Nagar, Bengaluru?
  • Task input: Give me a list of things that I should carry when planning a hike to the base camp of Mt Everest.
  • Entity input: Classify the following as large or small: Ant, Tiger.
  • Completion input: Some strategies to adapt to new things in life include ....

Context: On top of that, we can also provide some context around the input.

You can add contextual information to your prompt when you need to give the model information, or when you want to restrict the response to only what is within the prompt.
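A minimal sketch of how input and context can be combined into one prompt is shown here. The function name, the instruction wording and the baggage example are all hypothetical; there is no single standard prompt layout.

```python
def build_prompt(question, context=None):
    """Combine an optional context block with the user's input."""
    if context is None:
        # No context: the prompt is just the input itself.
        return question
    # With context: tell the model to stay within the supplied information.
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "What is the baggage allowance?",
    context="Economy passengers may check in one bag of up to 23 kg.",
)
print(prompt)
```

The resulting text is what would be sent to the model. Changing the instruction line is an easy way to experiment with how strictly the model sticks to the context.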

In practice the line between the two is loose, and the terms input and context are often used interchangeably.


Embedding & Vector Search:

Embeddings are part of your daily internet usage, even though you may not be aware of it, e.g. Google, Spotify, Facebook, Instagram, YouTube recommendations etc. Almost all the big tech companies use embeddings at their core, but they have not yet penetrated IT services, where more than 90% of the work still relies on tabular data. That's why knowing about embeddings is super important, even for business people.

Technically, embeddings are lists of real numbers (aka vectors) generated by AI or deep learning models (such as LLMs). An embedding is like the meaning of an entire text represented as a single point in an n-dimensional space.

E.g. the text "What are the books related to Human Behaviour?" can be represented by a vector such as [0.3, 0.1, 0.25, ...] in vector space.

Vector/embedding space: You can break the text from your file into chunks and compute the embedding for each chunk. When you map these out, they form an embedding space in which the embedding model places chunks of text with similar meaning close together. Basically, it is a map of meaning.
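The chunking step can be sketched as follows. This is one simple strategy (fixed-size word windows with a small overlap) chosen for illustration; the function name and default sizes are assumptions, and real pipelines often split by sentences, paragraphs or tokens instead.

```python
def chunk_words(text, chunk_size=5, overlap=2):
    """Split text into chunks of `chunk_size` words sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


for chunk in chunk_words("a b c d e f g h"):
    print(chunk)
```

Each chunk would then be passed to an embedding model to get its vector. The overlap keeps a sentence that straddles a boundary from being cut off from its neighbours entirely.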

You can go to https://www.nomic.ai/ and play around with real visualisation of your data.

Use cases for embedding:

  • Semantic Search
  • Recommendation
  • Clustering, Anomaly detection, Sentiment Analysis etc


Vector Search:

Embeddings are vectors, and you can calculate the similarity between two embeddings using many different metrics. Some popular ones are:

  • L2/Euclidean Distance
  • Cosine Similarity
  • Dot Product
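The three metrics listed above can be written in a few lines of plain Python. This is a sketch for small lists of floats; the function names are my own, and the cosine version assumes neither vector is all zeros.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more aligned."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line (L2) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors (assumes non-zero vectors)."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


a = [1.0, 0.0]
b = [0.0, 1.0]
c = [2.0, 0.0]
print(dot_product(a, c), euclidean_distance(a, b), cosine_similarity(a, c))
```

Note how a and c point in the same direction, so their cosine similarity is 1 even though their lengths differ. Cosine ignores magnitude, while the dot product and Euclidean distance do not.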

This is what happens when you run a query: the query is first converted into a vector, and that vector is then searched against the vector DB where you have stored all the source information.
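Here is a minimal sketch of that search step, using a plain Python list as a stand-in for the vector DB. The three-dimensional vectors are hand-written for illustration; in a real system they would come from an embedding model and have hundreds or thousands of dimensions.

```python
import math

# Hypothetical hand-written 3-dimensional "embeddings"; a real system would
# get these from an embedding model and keep them in a vector database.
vector_db = [
    ("Books about human behaviour", [0.9, 0.1, 0.0]),
    ("Cricket match highlights", [0.0, 0.2, 0.9]),
    ("Psychology of decision making", [0.8, 0.3, 0.1]),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, db, top_k=2):
    """Return the texts of the `top_k` entries most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in db]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


query_vector = [1.0, 0.2, 0.0]  # pretend embedding of the user's query
results = search(query_vector, vector_db)
print(results)
```

This brute-force version compares the query with every stored vector. Dedicated vector databases use approximate nearest-neighbour indexes to keep this fast over millions of entries.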


RAG - Retrieval Augmented Generation:

Let's first understand why we need RAG.

Grounding: Connecting abstract concepts or language representations to real-world knowledge or experiences. In practice, this means integrating LLMs or AI chatbots with existing IT systems, databases and business data.

The problem with LLMs is hallucination (aka the grounding problem):

A hallucination is when the model gives a result that is not relevant to the query at all, or is simply wrong, yet the model is 100% confident that it is the correct answer.

LLMs are phenomenal for knowledge generation and reasoning. They are pre-trained on large amounts of public data. But LLMs can only understand the information:

  • that they are trained on
  • that they are explicitly given in prompt


LLMs don't have the capability to ask for more information on their own, so they may need some outside input. Let's look at some naive solutions for this.

  • Fine-tuning: It is expensive, and retraining on every new piece of data is a hefty task.
  • Manual checks: We can't verify every output of the LLM before sending it to the user. It is not realistic, and humans can make errors too.
  • Prompt engineering: You can improve your prompt by providing additional information, but there is a token limit for your query, so you can't put everything in. There is also a trade-off between performance, latency and cost.

So we need a different approach then. Here comes RAG.

RAG allows users to provide their own data as context, so that answers are specific to their use case rather than drawn from the entire world's data. E.g. Company A (an aviation company) and Company B (an e-commerce company) choose the same LLM (say GPT-4) to build a Q&A app for their users. With RAG, each can bring its own data into a vector DB and then augment the user's query with this additional information before sending the final query to the LLM (the generator) to come up with the correct answers.

RAG


Given the prompt, the LLM extracts the information that will be sent to the retriever. The retriever works on the data source that you provide (your internal data stored in a vector DB), which limits the search scope instead of searching the whole world's data.
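The retrieve-then-augment flow can be sketched end to end as below. Everything here is a simplified assumption: the documents are invented, `embed` is a toy word-count stand-in for a real embedding model, a Python list plays the role of the vector DB, and the final call to the LLM is left out.

```python
import math
from collections import Counter

# Hypothetical internal documents for an aviation company's Q&A app.
documents = [
    "Economy passengers may check in one bag of up to 23 kg.",
    "Flights can be rescheduled free of charge up to 24 hours before departure.",
    "The lounge is open to business class passengers only.",
]


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    words = text.lower().replace(".", " ").replace("?", " ").split()
    return Counter(words)


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, top_k=1):
    """Retriever: return the `top_k` documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine_similarity(q, embed(d)), reverse=True)
    return ranked[:top_k]


def augment(query, retrieved):
    """Augment the user's query with the retrieved context."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"


query = "How many kg can I check in?"
retrieved = retrieve(query, documents)
prompt = augment(query, retrieved)
# `prompt` would now be sent to the LLM (the generator); that call is omitted.
print(prompt)
```

The baggage document is retrieved because it shares the most words with the query, and it travels to the model alongside the question. Swapping the toy `embed` for a real embedding model is what makes retrieval work on meaning rather than exact word overlap.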

Here is a quick and nice explanation of RAG, embeddings and more by John Savill.

I trust this overview provides you with a broad understanding of numerous new terminologies in the AI and Gen AI domains. Your feedback or comments are greatly appreciated.

Happy Reading - Gaurav

#GenAI #ArtificialIntelligence #MachineLearning #DeepLearning #Multimodality #Vectorsearch #NaturalLanguageProcessing #NLP #LanguageModel #LLM #RAG #Embedding #WordEmbedding #TextEmbedding #NeuralNetworks #TechTrends #Innovation #FutureTech #DataScience #Automation #EmergingTech #SmartTechnology #AIResearch #AIApplications #AIinBusiness #AIinHealthcare #AIinFinance #AIinEducation #EthicalAI #ResponsibleAI #TechEthics #DigitalTransformation
