Class 30 - CHATBOT FOR DOCUMENTS Notes from the AI Basic Course by Irfan Malik & Dr Sheraz Naseer (Xeven Solutions)

Hamza Nadeem

Founder & CEO H-Tech | AI Enthusiastic

Published Apr 8, 2024

Today we are moving on to LangChain.

Give yourself margin and breathing space, focus on one niche, and leave the rest up to ALLAH.

Direction is very important in life.

If you believe in ALLAH and keep working in the same direction, ALLAH will make paths for you.

Perfection comes with time.

Successful people are those who make their decisions and stand by them.

LangChain is like a bridge between different LLMs.

We will build a bot with this technology.

Document GPT: we are going to build a GPT-style chatbot that works over documents.

Functionality:

1- The user can upload a document.

2- The user can ask any question about the document.

3- The user can generate a summary of the document.

Required Tools:

1- LLM (OpenAI)

2- LangChain

3- Vector database

4- Streamlit

We will make you a practitioner; becoming a scientist is up to you.

LangChain is a Python package.

LangChain was first released as an open-source project in October 2022.

We have to change our curriculum from time to time. Don't fix a four-year syllabus; for practical courses, revise it every six months.

A client will ask you one question:

What is unique about your product or service?

LangChain is like a bridge that connects data sources with LLMs.

It is used for:

1- Chatbots

2- Answering questions using sources

3- Data augmentation

A vector database works like an ordinary database, but it stores embedding vectors and looks them up by similarity.

Streamlit is a Python package for building front ends.

Data talks to you if you develop the ability to listen.

OpenAI:

It provides the LLM (Large Language Model) we will use.

To avoid exceeding the model's token limit, split the data yourself or use LangChain's splitter functions.

A loader reads the data; which loader you use depends on the file type.
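
The idea of picking a loader by file type can be sketched without any library. This is only an illustration, not LangChain's loader API: `load_txt`, `load_csv`, `LOADERS`, and `load_document` are hypothetical names, and only `.txt` and `.csv` files are handled.

```python
import csv
from pathlib import Path


def load_txt(path: Path) -> str:
    """Read a plain-text file as one string."""
    return path.read_text(encoding="utf-8")


def load_csv(path: Path) -> str:
    """Read a CSV file and join each row's cells into one line of text."""
    with path.open(newline="", encoding="utf-8") as f:
        return "\n".join(" ".join(row) for row in csv.reader(f))


# Hypothetical registry: file extension -> loader function.
LOADERS = {".txt": load_txt, ".csv": load_csv}


def load_document(filename: str) -> str:
    """Pick the loader that matches the file extension and run it."""
    path = Path(filename)
    loader = LOADERS.get(path.suffix.lower())
    if loader is None:
        raise ValueError(f"No loader registered for {path.suffix!r} files")
    return loader(path)
```

Supporting another file type in this sketch would mean writing one more function and registering it in `LOADERS`.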

Suppose that after loading we find the data contains about 30k words.

Then we use splitters to divide the data into smaller chunks.
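
A splitter can be sketched in a few lines. The version below is a hand-rolled, word-based splitter with overlap, not LangChain's implementation; `split_into_chunks`, `chunk_size`, and `overlap` are illustrative names, and the sizes are arbitrary.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words, so text near a chunk
    boundary keeps some of its surrounding context.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


# A stand-in for a 30k-word document.
document = " ".join(f"word{i}" for i in range(30_000))
chunks = split_into_chunks(document, chunk_size=500, overlap=50)
print(len(chunks))  # 67
```

Each chunk is now small enough to embed on its own, and no chunk exceeds 500 words.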

How our Document GPT will work:

1- Split the document into smaller chunks.

2- Convert the text chunks into embeddings.

3- Perform a similarity search on the embeddings.

4- Generate answers to questions using an LLM.
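
These steps can be sketched end to end with standard-library Python only. Everything here is a toy stand-in: word counts play the role of embeddings, a linear scan plays the role of the vector database, and the last step only builds the prompt that would be sent to an LLM (no API call is made). All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int) -> list[str]:
    """Split the document into chunks of chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding: represent text as lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Similarity search: return the k chunks closest to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Stub for the LLM step: assemble the prompt an LLM would receive."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


document = (
    "Streamlit builds the front end for us. "
    "FAISS stores embedding vectors on this machine. "
    "LangChain connects data sources with the LLM."
)
question = "Where are the embedding vectors stored?"

chunks = split_text(document, chunk_size=7)  # each sentence here has 7 words
best = retrieve(question, chunks)
print(build_prompt(question, best))
```

Only the best-matching chunk reaches the prompt, which is how the pipeline stays under the token limit even for long documents.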

An embedding is a vector of numbers.

In technology, don't ignore the small details; skipping them leaves gaps in your understanding that will hurt you in the long run.

Vector Database:

Understand this concept.

In client meetings, your technical vocabulary matters a lot.

Avoid information overload.

Create vectors from the split data; these are called embedding vectors.

Chunks to Embeddings:

Embeddings are numerical representations that capture the semantic essence of words, phrases, or sentences.
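
To see what capturing semantic essence means in practice, the sketch below compares a few hand-made 3-dimensional vectors using cosine similarity. The numbers are invented purely for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings": related words are given vectors pointing the same way.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

related = cosine_similarity(toy_embeddings["cat"], toy_embeddings["kitten"])
unrelated = cosine_similarity(toy_embeddings["cat"], toy_embeddings["invoice"])
print(round(related, 2), round(unrelated, 2))  # close to 1 vs. much lower
```

A score near 1 means the two vectors point in almost the same direction, which is what a similarity search looks for.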

Embedding Models:

They take words and turn them into vectors.

Embedding model providers: Hugging Face and OpenAI.

Hugging Face models are open source.

OpenAI models are paid.

You can do a lot with Hugging Face: it can cover 80-90% of a typical Upwork project. Beyond that you will need some customization, and by now you should be competent enough to do it.

Now you can start your journey.

Vector Databases:

FAISS (locally managed)

Elasticsearch (locally managed)

Chroma DB (locally managed)

Qdrant (managed) {Free or Paid}

Pinecone (managed) {Paid}
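
Conceptually, every store in this list does the same two things: it keeps vectors alongside their text and returns the nearest ones for a query vector. The sketch below is a brute-force, in-memory illustration of that idea; `TinyVectorStore` is a hypothetical class, not the API of FAISS, Chroma, Qdrant, or Pinecone, and the 2-dimensional vectors are made up.

```python
import math


class TinyVectorStore:
    """Keeps (vector, text) pairs in memory and searches them by brute force."""

    def __init__(self) -> None:
        self._items: list[tuple[list[float], str]] = []

    @staticmethod
    def _normalize(vector: list[float]) -> list[float]:
        """Scale a vector to unit length so a dot product equals cosine similarity."""
        length = math.sqrt(sum(x * x for x in vector))
        if length == 0:
            raise ValueError("cannot store or query a zero vector")
        return [x / length for x in vector]

    def add(self, vector: list[float], text: str) -> None:
        """Store a unit-length copy of the vector together with its text."""
        self._items.append((self._normalize(vector), text))

    def search(self, query: list[float], k: int = 1) -> list[tuple[float, str]]:
        """Return the k stored texts with the highest cosine similarity."""
        q = self._normalize(query)
        scored = [(sum(a * b for a, b in zip(q, vec)), text) for vec, text in self._items]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


store = TinyVectorStore()
store.add([1.0, 0.0], "chunk about pricing")
store.add([0.0, 1.0], "chunk about refunds")
store.add([0.7, 0.7], "chunk about pricing and refunds")

results = store.search([0.9, 0.1], k=2)
print([text for _, text in results])
```

A linear scan like this is fine for a few thousand chunks; dedicated vector databases typically add indexing so that search stays fast on much larger collections.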

#AI #artificialintelligence #datascience #irfanmalik #drsheraz #xevensolutions #openai #chatbot #streamlit #hamzanadeem
