Complete RAG Crash Course With Langchain In 2 Hours

Krish Naik

[00:00]

Hello all, my name is Krish Naik and I am

[00:03]

super excited to announce this amazing

[00:05]

crash course on RAG, that is, retrieval

[00:07]

augmented generation. In this

[00:10]

specific crash course it'll be somewhere

[00:12]

around 2.5 to 3 hours but we are going

[00:14]

to discuss everything that is related to

[00:17]

RAG completely from scratch. We'll

[00:20]

be talking about the entire pipeline

[00:22]

from data ingestion to retrieval

[00:24]

pipeline to output generation: how to

[00:26]

use LLM models, how to use embedding

[00:29]

models and, along with this,

[00:31]

what the right strategy is for

[00:32]

chunking, and many more things.

[00:35]

So we will be deep diving into

[00:38]

both the theoretical understanding along

[00:40]

with the practical implementation and we

[00:42]

will initially go ahead step by step

[00:44]

we'll start with the basic

[00:45]

implementation and then as we go ahead

[00:47]

in the advanced section we'll also

[00:48]

implement modular coding, right? The

[00:51]

main aim of the modular coding is to

[00:53]

link the entire pipeline in a way so

[00:55]

that you should be able to understand

[00:56]

how RAG actually works and also

[00:58]

implement it in your company use cases.

[01:01]

Let me tell you one very important

[01:03]

thing. Around 90% of the use cases that are

[01:05]

currently being worked on across

[01:07]

companies are specifically related to

[01:09]

RAG. So this crash course will be an

[01:11]

amazing one for all of you. We'll

[01:14]

keep a simple like target of a thousand;

[01:16]

try to complete it as soon as possible,

[01:18]

and we'll also keep a

[01:20]

comments target of 500. So

[01:23]

please try to complete it and yes go

[01:25]

ahead and enjoy this particular crash

[01:26]

course. Thank you. So this is a simple

[01:29]

definition that I have put up over

[01:32]

here, and in this definition, first of

[01:35]

all we'll try to understand RAG. Okay.

[01:38]

So first of all let's go through the

[01:39]

definition and then I will give you a

[01:41]

brief idea what exactly RAG is all about,

[01:44]

you know. So here you can clearly see

[01:46]

that

[01:48]

RAG is the process of optimizing the

[01:51]

output of a large language model. Okay.

[01:55]

So it references an authoritative

[01:58]

knowledge base outside of its training

[02:01]

data sources before generating a

[02:04]

response. LLMs are trained on vast

[02:08]

volumes of data, as we all know, and use

[02:10]

billions of parameters to generate

[02:12]

original output for tasks like question

[02:15]

answering, translating, and completing

[02:17]

sentences. RAG extends the already

[02:20]

powerful capabilities of LLMs to specific

[02:22]

domains or an organization's internal

[02:25]

knowledge base, all without the need to

[02:28]

retrain the model. Okay, it is a

[02:30]

cost-effective approach to improving LLM

[02:32]

output so it stays relevant, accurate, and

[02:35]

useful in various contexts. So this is

[02:36]

just a basic definition. You can refer

[02:38]

to this particular definition. So guys,

[02:40]

now let's go ahead and understand about

[02:42]

RAG. So let's consider that I have a

[02:46]

generative AI application. And as you

[02:48]

all know in a generative AI application,

[02:50]

usually let's say that I have an LLM. So

[02:52]

this is my LLM. Now usually whenever we

[02:55]

have an LLM, what happens is that, let's

[02:58]

consider that I have a user;

[03:01]

the user is asking a query. So this is

[03:05]

my query from the user, and before it is

[03:09]

sent to the LLM we do add a prompt right

[03:13]

we do add a prompt and this prompt is

[03:16]

just like an instruction to the LLM like

[03:18]

how the LLM should work okay and then

[03:22]

based on this we actually get an output

[03:25]

now this is a simple generative AI

[03:27]

application wherein the LLM is used to

[03:30]

generate the content
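
The plain "instruction plus query" flow described here can be sketched as follows. This is a minimal illustration only: the function name `build_prompt` and the instruction text are made up for this example, and the actual call to an LLM is left out.

```python
def build_prompt(instruction: str, query: str) -> str:
    """Combine a fixed instruction with the user's query into one prompt string."""
    return f"{instruction.strip()}\n\nQuestion: {query.strip()}\nAnswer:"


# Hypothetical instruction; in a real application this string is what you
# would send to the LLM, which then generates the content.
instruction = "You are a helpful assistant. Answer the question concisely."
prompt = build_prompt(instruction, "What is retrieval augmented generation?")
print(prompt)
```

Note that nothing in this prompt comes from outside the model, so the answer can only draw on whatever the LLM learned during training.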

[03:34]

Okay, generate the content. So obviously

[03:38]

by using this specific technique we give

[03:40]

a query and this LLM you know that it

[03:43]

has been trained on a huge amount of data,

[03:47]

okay, different kinds of data that are

[03:49]

available on the internet, and based on

[03:52]

this it will be able to generate the

[03:53]

output. One of the disadvantages of this,

[03:58]

let me talk about the disadvantage of

[03:59]

this particular approach. As you know

[04:02]

that every LLM that is trained, you

[04:04]

know, it will be trained on a specific

[04:07]

set of data. So let's say right now it

[04:09]

is 31st August. Okay, 31st August.

[04:15]

Let's say this is my LLM model and this

[04:17]

is basically GPT-5,

[04:19]

which is the recent model from OpenAI.

[04:22]

Now, as you know, when this model was

[04:24]

launched, it may have been trained

[04:27]

on data only up to 1st

[04:31]

August. Okay. So this LLM will not have

[04:35]

any idea what has basically happened in

[04:38]

the world between 1st and 31st

[04:40]

August. Right? And let's say if I go

[04:43]

ahead and ask a specific question to the

[04:45]

LLM about something between these specific dates,

[04:49]

any kind of event, the LLM may

[04:52]

start hallucinating. So one of the major

[04:55]

disadvantages of only using the LLM is

[04:59]

that it will hallucinate. Okay. When we

[05:02]

say hallucinating what does this

[05:04]

basically mean? It means that even

[05:07]

though it does not have knowledge of

[05:08]

what happened between 1st August and

[05:10]

31st August, when we

[05:13]

ask a question about it the LLM will try to

[05:16]

generate its own answer, because it does

[05:19]

not want to look like a fool. Okay, that

[05:22]

is the best example. It does not want to

[05:24]

look like a fool. So it will try to

[05:26]

generate some answers and it will make

[05:28]

sure that it'll show you an answer

[05:31]

that you may well believe.

[05:33]

That is how it will be written, you know,

[05:35]

in terms of the output that we get. So

[05:38]

usually this condition is basically

[05:39]

called hallucination. Okay, so this is

[05:42]

one of the major disadvantages. The second

[05:45]

disadvantage that you have so let's say

[05:47]

that I'm using this LLM and you know

[05:50]

this LLM has been trained on a huge

[05:51]

amount of data. Now what happens is that

[05:55]

I'm running a startup,

[05:57]

let's say. Now in my startup I'm solving

[06:00]

a specific use case and I have some data

[06:05]

which I need to use

[06:07]

along with my LLM. Okay.

[06:10]

So let's say that I have some other data

[06:12]

like, you know, the policies of my

[06:17]

company: I have HR policies,

[06:20]

I have finance policies, you know, and

[06:24]

all these policies will not be available

[06:26]

publicly,

[06:29]

because it is my startup, so all this

[06:32]

data is protected. Now I also want

[06:34]

to use this specific data and probably

[06:36]

create a chatbot. Okay, now how do I do

[06:39]

this? Now one way is that many people

[06:41]

will say, hey Krish, we can take this

[06:43]

particular data and we can fine-tune the

[06:46]

model

[06:48]

right we can simply fine-tune the model

[06:51]

yes this is a very good solution but

[06:54]

understand fine-tuning a model is a very

[06:57]

expensive process very tedious process

[07:00]

because this LLM whichever LLM we are

[07:02]

using, it has billions of parameters, and

[07:04]

tweaking these billions of parameters

[07:06]

usually takes a lot of time. Right? So

[07:10]

obviously this is a solution but this is

[07:12]

a very expensive solution. Okay. Now do

[07:16]

we have any other way? And

[07:19]

remember, all these policies and all

[07:21]

this data will also keep on getting

[07:23]

updated as we run the startup. Right? So

[07:28]

we cannot just go ahead and

[07:29]

fine-tune it every time; we cannot

[07:31]

fine-tune it every day. Right? So we should try to

[07:33]

find a solution: how do we

[07:35]

avoid this? This can again be

[07:38]

addressed with the help of RAG.

[07:43]

Right, now how it will be addressed with

[07:45]

the help of RAG, I will talk about.

[07:46]

Okay, so here, instead of fine-tuning, I'm

[07:50]

saying that, hey, I will go ahead and

[07:51]

implement RAG. Now you'll understand this

[07:54]

only when we understand the pipeline of

[07:56]

RAG, which I will discuss in this

[07:57]

specific video. Okay, now these are the

[08:01]

two major disadvantages that you see

[08:04]

right over here, and yes, there are some

[08:07]

more disadvantages which we'll deep

[08:09]

dive into as we go ahead. Okay. Now what

[08:12]

happens

[08:14]

if we use RAG, and how are we

[08:16]

preventing it? See, RAG, as the definition

[08:18]

says, is a process of

[08:20]

optimizing the output of a large

[08:22]

language model so that it references an

[08:24]

authoritative knowledge base outside of

[08:27]

its training data. Now how do we solve

[08:30]

this hallucination and this data problem that

[08:33]

we have. Okay. So let me just go ahead

[08:35]

and draw the diagram again. Okay. So

[08:37]

here is my LLM. Okay. And here is my

[08:40]

query. So let's say that I am coming

[08:44]

up with a user query. So let's consider

[08:46]

it over here. Okay. And here I'm drawing

[08:50]

a user. Okay. And this user

[08:55]

will first of all give a query.

[09:00]

Okay. Now what happens is that there

[09:03]

will be two important pipelines that

[09:05]

will be created. As I said over here we

[09:09]

are trying to optimize the output of a

[09:12]

large language model so that it references

[09:15]

an authoritative knowledge base outside of

[09:18]

its training data sources. So as you all

[09:20]

know this is my LLM right? This LLM is

[09:23]

already trained on a huge amount of

[09:24]

data. Now along with this I will be

[09:27]

having an external

[09:30]

database, and this database we basically

[09:33]

call a vector database. Okay, an external

[09:37]

vector database. Now you know that

[09:39]

this LLM is already trained on some

[09:42]

amount of data, and for any additional data,

[09:44]

let's say my startup data, my policies, HR,

[09:47]

finance, whatever data is there, we will

[09:50]

try to create a data ingestion pipeline

[09:54]

over here,

[09:56]

a data ingestion pipeline over here. Now

[10:00]

what will this data ingestion

[10:02]

pipeline be? So let's say I have my data;

[10:05]

from this data we will do some kind of

[10:09]

parsing

[10:11]

and from this parsing we will do

[10:14]

embeddings

[10:17]

embeddings and then we finally store it

[10:20]

into the vector store. Okay. Now

[10:22]

whenever we talk about the specific data

[10:24]

this data can be in any format. It can

[10:27]

be in PDF format. It can be in HTML

[10:30]

format. It can be in Excel format. It

[10:33]

can be even in SQL database format or

[10:36]

unstructured format. Any format. So what

[10:39]

we do initially we take this data and we

[10:42]

do data parsing. Now here data parsing

[10:44]

is a very important step. I think if you

[10:49]

crack this step then developing a RAG

[10:52]

application becomes very easy. Data

[10:54]

parsing is all about how do you read the

[10:57]

unstructured data or the structured data

[10:59]

that is present inside this and how do

[11:03]

you chunk this data right? How do you

[11:07]

chunk? How do you divide the specific

[11:09]

data into chunks? Chunking is very

[11:11]

important because you need to save this

[11:13]

data inside some kind of vector store.

[11:16]

This is nothing but the vector store or

[11:18]

vector DB. Okay. Now a vector store or

[11:21]

vector DB is simply a store that will

[11:23]

actually help you save vectors inside

[11:26]

it. Okay. So once you do the chunking,

[11:29]

after doing the chunking you pass it to

[11:31]

the embedding models. Now here in the

[11:33]

embedding models you basically convert

[11:36]

text to vectors.
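
To make "text to vectors" concrete, here is a minimal sketch. It assumes a toy word-count embedding over a tiny hand-picked vocabulary, which is not how real embedding models work (they produce dense learned vectors); the function names, vocabulary and sample sentences are all made up for illustration. Cosine similarity is then used to compare the vectors.

```python
import math
import re
from collections import Counter


def embed(text: str, vocabulary: list[str]) -> list[float]:
    """Toy embedding: count how often each vocabulary word occurs in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vocabulary and texts, chosen only for this example.
vocabulary = ["leave", "policy", "salary", "finance", "days"]
leave_chunk = embed("Employees get 20 days of paid leave per the leave policy.", vocabulary)
finance_chunk = embed("The finance team processes salary on the last day.", vocabulary)
query = embed("What is the leave policy?", vocabulary)

# The leave-related chunk scores higher against the query than the finance chunk.
print(cosine_similarity(query, leave_chunk))
print(cosine_similarity(query, finance_chunk))
```

Once text is numeric like this, "find the most relevant chunk" becomes "find the stored vector with the highest similarity to the query vector".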

[11:39]

Okay, a vector is just a numerical

[11:42]

representation of text, so that you will

[11:46]

be able to apply algorithms like

[11:49]

similarity search with cosine similarity,

[11:51]

techniques that are already available,

[11:54]

right? With these, similar results

[11:57]

based on a specific query can be

[11:58]

retrieved from this particular

[12:00]

database. Okay, so here whenever I talk

[12:03]

about vector DB, this is my vector DB or

[12:06]

vector store. Here we are storing

[12:08]

embeddings. Okay. And these embeddings

[12:10]

will be computed for every chunk.

[12:13]

Embedding is nothing but

[12:15]

converting text into vectors. Here

[12:19]

we can use different

[12:20]

embedding models, like Google Gemini models. We

[12:23]

can use OpenAI embedding models. We can

[12:25]

use Hugging Face embedding models, and

[12:26]

each and every embedding model comes

[12:29]

with a different cost, and there

[12:32]

are also open source embedding models

[12:33]

which will actually help you to convert

[12:34]

the text into vectors. Now this is one

[12:37]

specific pipeline which we call the

[12:39]

data ingestion pipeline. At the end of

[12:41]

the data ingestion pipeline, you are

[12:43]

able to store the text as vectors

[12:46]

inside your vector DB. Now how is RAG

[12:51]

different from the previous one, right?

[12:53]

So initially you had this data ingestion

[12:54]

pipeline where you are converting all

[12:56]

your data into vectors, right? And this

[13:00]

data is specifically for this particular

[13:02]

startup. And now I have created a

[13:05]

knowledge base. So this is my knowledge

[13:08]

base. External knowledge base or

[13:11]

internal knowledge base whatever

[13:12]

knowledge base I have. And this

[13:14]

knowledge base is not known to this

[13:16]

LLM. Right? Yes, some amount of

[13:18]

information may be available but not the

[13:20]

entire part. Now

[13:24]

see the definition. It is a process of

[13:25]

optimizing the output of a large

[13:27]

language model so that it references an

[13:29]

authoritative knowledge base outside of

[13:32]

its training data. Now what will happen

[13:34]

when the user gives a query? Now this query,

[13:37]

instead of directly going to the LLM,

[13:39]

will go to this vector database, right?

[13:43]

And before going here we also need to go

[13:45]

ahead and apply embedding, right, because

[13:48]

this query must be converted into

[13:52]

a vector. Why do we need to convert it

[13:55]

into a vector? So that when we are hitting

[13:58]

the vector DB with this query, a

[14:00]

similarity search is basically applied

[14:03]

and based on this we get

[14:07]

some kind of

[14:09]

context

[14:11]

we get some information from the vector

[14:13]

DB and now whatever query I'm asking

[14:16]

okay if I ask hey what is the leave

[14:19]

policy of my company

[14:22]

right now what will happen first of all

[14:24]

it will go to the vector store it will

[14:27]

gather all the related information that

[14:29]

is available over here and that

[14:31]

information, when it is sent to the

[14:32]

LLM, is called the context.

[14:35]

Now along with this context, we go

[14:38]

ahead and write a specific prompt.

[14:42]

Now this prompt is an instruction to the

[14:44]

LLM and it says that you can use this

[14:47]

context to answer the question and

[14:49]

finally you get an output.

[14:53]

This is the entire pipeline. This

[14:56]

pipeline is basically called the

[14:58]

retrieval pipeline.

[15:01]

Retrieval pipeline. And this is a very

[15:03]

good example of a traditional RAG.
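
The retrieval pipeline just described (embed the query, run a similarity search against the stored vectors, pass the retrieved context to the LLM with a prompt) can be sketched end to end. This is a sketch under stated assumptions: the word-count `embed` function stands in for a real embedding model, a plain Python list stands in for the vector DB, the policy sentences are invented, and the final LLM call is omitted.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedding: a sparse word-count vector (a real system calls an embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, store: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Retrieval: embed the query and return the k most similar stored chunks."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def augment(query: str, context: list[str]) -> str:
    """Augmentation: build a prompt telling the LLM to answer from the retrieved context."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Use only this context to answer the question.\n\n"
        f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical company policies; the list of (text, vector) pairs plays the role of the vector DB.
chunks = [
    "Leave policy: employees receive 20 days of paid leave per year.",
    "Finance policy: expense claims must be filed within 30 days.",
    "HR policy: probation lasts three months.",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

query = "What is the leave policy of my company?"
context = retrieve(query, store, k=1)
prompt = augment(query, context)
print(prompt)  # generation would be the LLM's answer to this prompt
```

The three function boundaries mirror the name: `retrieve` is retrieval, `augment` is augmentation, and sending `prompt` to an LLM would be generation.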

[15:08]

Now you may be thinking, Krish, what about

[15:10]

other types of RAG? Don't worry,

[15:12]

I will explain it completely,

[15:14]

from basic to advanced, with implementation,

[15:16]

each and everything, because later on

[15:18]

we'll be discussing agentic RAG.

[15:20]

We'll be discussing how agentic RAG

[15:22]

actually works, each and everything. But I

[15:24]

hope you got an idea with respect to

[15:26]

this. Now here you will see less of

[15:29]

this particular problem. That is,

[15:31]

you'll not completely remove

[15:32]

hallucination, but you reduce some amount of

[15:34]

it. If any query that is

[15:36]

asked is related to the data that is

[15:38]

present in the vector DB, I will

[15:40]

definitely get some kind of context and

[15:43]

my LLM will give me the output. But let's

[15:47]

say that data is not present

[15:48]

over here; then the LLM can still hallucinate, right?

[15:51]

But here we are reducing this. See, one good

[15:54]

example that you can look at is

[15:56]

Perplexity.

[15:58]

Perplexity is nothing but an application based on

[15:58]

RAG. It is developed around

[16:00]

RAG. Okay. It is

[16:05]

a kind of RAG application. In

[16:09]

Perplexity you are connected to various

[16:10]

retrievers. You are connected to tools.

[16:13]

You are connected to web search,

[16:17]

right, and then the LLM is summarizing the

[16:20]

retrieved output and giving it to you. Right? And

[16:22]

it also uses various LLMs itself. I'm

[16:27]

also planning to mostly start a startup

[16:30]

soon enough within couple of weeks I

[16:33]

guess and the kind of application that

[16:35]

I'm developing is a RAG application only

[16:38]

and it solves a very good problem for a

[16:40]

developer. Okay. So that is the reason

[16:42]

I'm not even able to upload a lot of

[16:45]

videos because I'm pretty much involved

[16:47]

in those startups and working and

[16:49]

developing a product that India can

[16:51]

definitely remember. Okay. And this is

[16:54]

how

[16:56]

you know this is this is this is how

[16:58]

things are and you can basically see how

[17:00]

good uh you know the pipeline actually

[17:04]

works and this is basically a

[17:05]

traditional RAG. Now you may be

[17:07]

thinking what all things we'll be

[17:08]

discussing. Okay, fine, we have discussed

[17:09]

a traditional RAG; in the future

[17:11]

classes, what coding will we be doing? Okay,

[17:13]

so let's go ahead and talk about it. As

[17:16]

I said two important pipelines we'll go

[17:18]

ahead and create: one is a data ingestion

[17:20]

pipeline and one is a retrieval

[17:22]

pipeline. Okay. Now in the data

[17:25]

ingestion pipeline you'll be seeing that

[17:28]

we will be performing data ingestion.

[17:30]

Along with the data ingestion we will go

[17:32]

ahead and do data parsing. Then we'll

[17:34]

perform embeddings. Then we will

[17:38]

store everything into the vector store.

[17:40]

Then we will create a retriever for

[17:42]

this. And whenever a user asks any

[17:45]

query, it will be able to give the

[17:47]

context to the LLM. And then finally we

[17:49]

will be generating the output. So here

[17:52]

this is retrieval. This is augmentation,

[17:56]

right? This is augmentation over here.

[17:58]

Augmentation basically means what?

[18:00]

You're giving a context to the LLM along

[18:02]

with the prompt to generate the output.

[18:04]

Right? So this is basically called

[18:05]

augmentation, and finally you're

[18:07]

generating the output right which is

[18:09]

nothing but generation. So here you are

[18:11]

basically generating. Now

[18:16]

in the next session how we are going to

[18:18]

implement it. First of all I will show

[18:20]

you how to perform these two steps in a

[18:24]

very efficient way. Okay, sorry, not these

[18:27]

two steps. I will show you how we can

[18:29]

perform all these steps, right: data ingestion,

[18:32]

data parsing and embedding. Here we are

[18:34]

going to consider different

[18:35]

files like PDF, HTML.

[18:39]

Okay. PDF, HTML, you can consider

[18:42]

Excel, you can consider SQL database,

[18:44]

you can consider any kind of files. Then

[18:46]

we'll do document parsing and we will

[18:49]

try to convert this into documents. So

[18:50]

a document is an amazing data structure

[18:53]

which you can basically use, and you

[18:56]

can even parse into it, do the chunking, and

[18:58]

store it in the

[19:00]

vector store. Then we'll perform

[19:02]

embeddings. Here we will use both open

[19:04]

source

[19:06]

and paid embeddings

[19:08]

for the same okay and then finally we go

[19:10]

to the vector store then based on a user

[19:13]

query how do we go ahead and apply the

[19:15]

same embeddings we are going to see that

[19:17]

okay and then finally we'll be

[19:19]

developing this. So mostly

[19:22]

I'm focusing more on making bigger

[19:24]

videos so that you don't just follow a

[19:26]

playlist. Okay, I want to basically

[19:28]

cover a lot of stuff in one video so

[19:31]

that uh you should also be able to

[19:34]

efficiently cover it instead of covering

[19:35]

50 different videos. Right now when we

[19:38]

are doing data ingestion and data

[19:39]

parsing right there are various

[19:41]

techniques. See we are going to see

[19:43]

about optimization.

[19:45]

We are going to see about various

[19:46]

chunking strategies, context

[19:48]

engineering; all these kinds of topics

[19:50]

will be coming up when we talk about

[19:52]

data parsing: what is a semantic

[19:55]

chunker, how do we go ahead and

[19:57]

do the chunking with those strategies, and

[19:59]

all of that we'll try to discuss as

[20:01]

we go ahead. But I hope you got a very

[20:03]

good idea about what exactly

[20:04]

RAG is. Hello guys, so we are going to

[20:07]

continue the discussion with respect to

[20:08]

RAG. Till now we have understood

[20:12]

what RAG is, then what main

[20:15]

drawbacks we are fixing with RAG, and

[20:17]

along with that we have also understood

[20:19]

how the RAG pipeline looks, right? It usually

[20:22]

consists of two important pipelines: one

[20:23]

is the data ingestion pipeline and one

[20:26]

is the retrieval pipeline, which includes

[20:27]

these two boxes. Okay, now we are going to go

[20:30]

ahead with some kind of practical

[20:32]

implementation

[20:34]

now the major thing that usually comes

[20:37]

in my mind right whenever we go ahead

[20:39]

and start any new series that is how

[20:42]

should we cover a specific topic you

[20:45]

know so that we can understand the

[20:46]

coding from basics and we move towards

[20:49]

modular coding so that is how I'm going

[20:52]

to implement this entire pipeline

[20:54]

initially we will go ahead with some

[20:56]

basic code we'll try to understand the

[20:58]

fundamentals and then we will start

[21:01]

writing more complex code we'll be using

[21:03]

modular coding also so initially we will

[21:06]

write all the code in a Jupyter notebook,

[21:08]

then we'll increase the complexity: we'll

[21:10]

write code in terms of classes and

[21:13]

reusability, and then we'll try to see

[21:15]

how we can actually create the

[21:17]

pipeline. So that is how the agenda will

[21:20]

probably go as we go ahead, right.

[21:22]

So two important things that we'll think

[21:24]

about. The first important thing is to

[21:26]

understand about the document structure.

[21:29]

Now whenever we work with any external

[21:32]

knowledge base, any data that needs

[21:35]

to be fed into the vector DB, you

[21:38]

definitely need to know about this

[21:39]

document structure. Why? Because inside

[21:43]

this data ingestion pipeline the first

[21:44]

step is data ingestion. Now whenever we

[21:47]

talk about data ingestion, here we can

[21:48]

have any kind of files, right? We can have

[21:50]

PDF files, HTML files, DB files, Excel

[21:53]

files. Our main aim is to read all this

[21:56]

file content and

[21:59]

convert it into a structure wherein we can

[22:02]

additionally apply

[22:04]

strategies like chunking and embedding, and

[22:06]

store it into the vector DB. That is

[22:08]

what this entire pipeline is all about.

[22:10]

So for that you really need to

[22:12]

understand this document structure. So

[22:14]

if you see this diagram, right, since

[22:17]

these two are the main topics that we

[22:19]

are going to cover in this particular

[22:20]

video initially we will go ahead with

[22:22]

document structure understanding this

[22:24]

and then we'll try to build our complete

[22:26]

RAG pipeline. In our complete RAG

[22:28]

pipeline we have two important steps: one

[22:31]

is the data ingestion pipeline and the

[22:33]

other one is the query retrieval

[22:35]

pipeline. Now whenever we talk about the

[22:38]

data ingestion pipeline, let's talk

[22:40]

about this in complete depth, right? So

[22:42]

initially you have this data ingestion

[22:43]

pipeline. Right? In the data ingestion

[22:45]

pipeline, the first step is data

[22:47]

ingestion. That basically means, let's

[22:49]

say, that you may have different

[22:51]

kinds of files like PDF, HTML, right?

[22:55]

Excel, you may have a DB file, you may

[22:59]

have an unstructured file, any kind of file

[23:01]

format. So in data ingestion our

[23:04]

main strategy is how to proceed

[23:06]

with reading this particular file, how

[23:09]

to perform data parsing.

[23:12]

How to perform data parsing

[23:15]

and then finally how to convert this

[23:18]

into a document structure.
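
As a rough picture of what such a document structure holds, here is a standard-library stand-in. The field names `page_content` and `metadata` mirror the ones used by LangChain's `Document` class, but this dataclass is only an illustrative sketch, and the file name, page number and text are hypothetical values.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    """Stand-in for a parsed document: the text plus metadata about where it came from."""
    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


# Hypothetical example values; a real loader would fill these in from the file it reads.
doc = Document(
    page_content="Employees receive 20 days of paid leave per year.",
    metadata={"source": "hr_policy.pdf", "page": 3, "file_type": "pdf"},
)

print(doc.page_content)
print(doc.metadata["source"], doc.metadata["page"])
```

Keeping the metadata next to the content means that every chunk produced later can still be traced back to its source file and page.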

[23:23]

Document structure. So that is the

[23:24]

reason in this video right as I said

[23:28]

we're going to first of all understand

[23:29]

about document structure. how to build

[23:31]

this document structure, what is

[23:33]

metadata? Now, inside this document

[23:35]

structure, uh you will be learning about

[23:37]

important components like metadata.

[23:40]

You'll be learning about content. You'll

[23:42]

be learning about how the structure of

[23:45]

the metadata is laid out, each and everything,

[23:47]

right? So, we will be covering

[23:50]

completely in depth like how these

[23:52]

things actually work. Okay? Once you

[23:55]

understand this, note that this data

[23:58]

parsing is a really, really important step,

[24:00]

because later in the

[24:03]

retrieval pipeline, that is, the query

[24:05]

retrieval pipeline, based on this parsing

[24:07]

retrieval can become much more efficient, right?

[24:10]

You'll be able to get results that are much

[24:12]

more accurate. So that

[24:14]

is the reason you need to really focus

[24:16]

on the data parsing now after doing the

[24:18]

data parsing the next step usually is

[24:21]

something called chunking, right? So

[24:24]

here in chunking we convert this

[24:28]

entire data into multiple chunks.

[24:32]

So these chunks are like, let's say, this is

[24:35]

my chunk one this is my chunk two this

[24:39]

is my chunk three this is my chunk four

[24:44]

okay then as we go ahead after applying

[24:48]

chunking. So chunking basically means

[24:50]

and why do we apply chunking? Chunking

[24:52]

strategy is very simple. Whatever

[24:54]

documents we have, we are just dividing

[24:56]

this into smaller parts or smaller

[24:58]

chunks. The reason we do this is that,

[25:02]

whenever we consider any

[25:05]

LLM or any embedding model,

[25:08]

and here the next step is all

[25:11]

about embeddings, okay, in embedding,

[25:14]

every model has

[25:17]

a fixed maximum context size. Okay.
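
A minimal sketch of dividing text into smaller chunks so that each piece stays under a size limit. Two assumptions are made here that go beyond what is said above: sizes are measured in characters (real context limits are counted in tokens), and neighbouring chunks overlap by a few characters, which is a common but optional choice. The function name and the sample text are illustrative.

```python
def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, which keeps some
    context across each boundary.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


# Hypothetical document text, used only to show the splitting.
sample = (
    "Employees receive 20 days of paid leave per year. "
    "Expense claims must be filed within 30 days. "
    "Probation lasts three months."
)
chunks = chunk_text(sample, chunk_size=40, overlap=10)
for number, chunk in enumerate(chunks, start=1):
    print(number, repr(chunk))
```

Each chunk can then be embedded on its own, which is what keeps every input within the embedding model's limit.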

[25:21]

Let's say I take a complete

[25:23]

100-page PDF and I directly try to give it

[25:26]

to a model for performing the

[25:28]

embeddings, that is, if I give it directly

[25:30]

to an embedding model for performing the

[25:32]

embeddings, and embedding basically means

[25:33]

you convert text to vectors, it will not

[25:38]

be possible. It will typically say that, hey,

[25:39]

you are providing more data

[25:42]

than the context size allows, and it will not

[25:44]

be possible to convert the text

[25:46]

into vectors. So within the limit of the

[25:49]

context size you really need to give the

[25:51]

data and this is for both embedding

[25:53]

models and even in the later stages

[25:55]

whenever we use any kind of LLM model

[25:57]

because for every LLM model there is a

[25:59]

fixed context size. Yes, different LLMs

[26:02]

may have different

[26:03]

context sizes. So that is the reason, and

[26:05]

it is always a good strategy that we try

[26:07]

to divide our data into chunks so that

[26:10]

they fit, and so that in the

[26:12]

later stages we'll be able to

[26:13]

efficiently put them into the vector

[26:15]

database, which is this. So after

[26:17]

chunking for every chunk we go ahead and

[26:20]

apply embeddings. Okay. So we go ahead

[26:23]

and apply embeddings and from the

[26:25]

embeddings we finally store that into

[26:27]

our vector DB. Now inside this vector DB

[26:30]

all this will be stored in the form of

[26:32]

vectors. Like let's say this is my

[26:34]

record one record two record three

[26:37]

record four like that right so this is

[26:40]

one record two record this is my third

[26:42]

record then fourth record fifth record

[26:44]

this you have right now from this

[26:46]

particular vector DB you will definitely

[26:50]

be able to apply any kind of similarity

[26:52]

search. Now in this

[26:57]

specific video what we are going to do

[26:58]

is that I will be using one of these files

[27:02]

and I'll create this entire pipeline.

[27:05]

Okay, I will I'll just create this

[27:07]

entire pipeline and you also need to

[27:11]

probably work along with me later on.

[27:14]

For any other files, I will give you an

[27:16]

assignment. Okay, I will show you with

[27:18]

a couple of files. Let's say I'll take a PDF

[27:20]

file and I'll show you this entire data

[27:22]

ingestion. Then what you do is that, as

[27:24]

an assignment, you use any of the other

[27:27]

file formats, let's say Excel, CSV,

[27:29]

whatever file format you want and you

[27:31]

try to complete the same pipeline. Okay.

[27:34]

So that is what is my strategy and

[27:36]

please make sure to complete the

[27:38]

assignment also and we will go step by

[27:40]

step completely from scratch so that

[27:41]

everybody will be able to follow. So

[27:44]

first of all I will go ahead and open my

[27:46]

empty folder and in this remember I will

[27:49]

be using LangChain, and this is just

[27:51]

a traditional RAG right now; in the later

[27:54]

stages we will move towards agentic RAG.

[27:56]

So from this particular command I will

[27:58]

just go ahead and open my command

[27:59]

prompt. I will open my VS code. So let

[28:03]

me quickly go ahead and open the VS

[28:05]

code. Now from the VS code the next step

[28:08]

will be that I will

[28:11]

quickly open my terminal

[28:15]

terminal and let me just go ahead and

[28:17]

write uv uh I'll just go ahead and

[28:20]

initialize this particular workspace as

[28:22]

my repository. So yt rag is my

[28:24]

workspace. Now I will just go ahead and

[28:28]

also go ahead and create my environment.

[28:31]

So if you're using uv package so you can

[28:33]

just write uv venv. So my Python 3.13.2

[28:37]

will be the recent uh Python version

[28:39]

that I'm specifically using for this

[28:41]

particular project. And then I will go

[28:44]

ahead and activate this

[28:45]

particular environment. Okay, perfect.

[28:48]

Till here we are good enough. Now I will

[28:50]

go ahead and create my requirement.txt.

[28:54]

Now in this requirement.txt, let

[28:56]

me quickly go ahead and install some of

[28:58]

the packages like langchain,

[29:01]

langchain-core,

[29:03]

and langchain-community.

[29:08]

All these things are there. Let me

[29:11]

quickly go ahead and install these

[29:13]

packages. So uv add -r

[29:18]

requirement.txt. Okay.

[29:23]

So this is done and along with this I

[29:26]

will also go ahead and install some of

[29:27]

the libraries like pypdf and

[29:32]

pymupdf. Okay, so these are all

[29:34]

libraries I'll be using. I'll talk about

[29:36]

why I'm using pypdf and pymupdf, right.

[29:39]

This is specifically to read my PDF

[29:42]

documents. So one example that I'm

[29:43]

actually going to show you is with

[29:45]

respect to PDF and then you should also

[29:48]

try to create the same pipeline with the

[29:50]

help of any other data formats. Okay,

[29:53]

data format types like, let's say, it

[29:55]

can be JSON, it can be

[29:57]

anything as such. So my

[30:00]

requirement.txt is filled. Now what I will do is

[30:02]

that I'll quickly go ahead and create my

[30:03]

data folder and here I will also go

[30:06]

ahead and create my notebook folder

[30:09]

quickly so that I can start working on

[30:11]

it and then along with this I will also

[30:14]

go ahead and run uv add ipykernel. Okay,

[30:17]

so that I will be able to work along

[30:19]

with my Jupyter notebook. So ipykernel

[30:22]

has got executed. Now quickly I will

[30:25]

first of all start with my Jupyter

[30:27]

notebook, and the first thing that I

[30:29]

told you is related to the document data

[30:31]

structure, right? What is a

[30:32]

document, and how can a document be

[30:36]

very helpful when you are using it in

[30:38]

the data

[30:40]

ingestion pipeline? Okay, so I'll quickly

[30:42]

select my kernel

[30:45]

and for all these things you really need to

[30:47]

be good at the Python programming language.

[30:49]

See, there is no way that you

[30:52]

can skip the Python programming

[30:53]

language. So my suggestion

[30:54]

would be never do that. Okay. So Python

[30:57]

is a must, and this time I'm going to

[30:59]

use some more advanced coding and it

[31:01]

will not be possible for me to write

[31:03]

line by line. So definitely I'll go a

[31:05]

little bit fast in order to explain it to

[31:07]

you. Okay.

[31:08]

Now as I told you if I go back over here

[31:12]

in the data ingestion, our main aim is to

[31:14]

load some data apply some chunking then

[31:17]

convert into embeddings and finally

[31:19]

store it into the vector DB. That is

[31:21]

what my entire data ingestion pipeline

[31:23]

is all about. Right? For understanding

[31:25]

this, we need to understand a document

[31:27]

structure because all this chunking that

[31:29]

is done, you know, the final output will

[31:31]

be documents. Now, what exactly is a

[31:34]

document data structure? So here I will

[31:37]

go ahead and write what exactly is a

[31:39]

document data structure. So for this I

[31:42]

will go ahead and import from LangChain,

[31:46]

or, to show you this, I will be

[31:50]

showing you some kind of uh file so that

[31:54]

you'll be able to understand it. Okay,

[31:56]

let me put this file over here.

[32:00]

Okay, I have some file over here and

[32:02]

then we'll try to understand. Okay, what

[32:04]

exactly is a document structure? See

[32:06]

LangChain document structure. So a

[32:08]

LangChain Document is a kind of data

[32:12]

structure which will be able to save

[32:15]

some data in some format where we have

[32:18]

two important things. One is the page

[32:20]

content and one is the metadata.
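
As a small illustration of the two fields just named, here is a sketch that builds one document by hand. It assumes `langchain-core` is installed; if it is not, the fallback dataclass below is only a stand-in with the same two attributes so the sketch still runs. The metadata values are illustrative, and the date is a placeholder.

```python
try:
    from langchain_core.documents import Document
except ImportError:
    # Fallback stand-in (NOT the LangChain class) so the sketch runs without LangChain.
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


doc = Document(
    page_content="This is the main text content I'm using to create RAG",
    metadata={
        "source": "example.txt",
        "pages": 1,
        "author": "Krish Naik",
        "date_created": "2025-01-01",  # placeholder value for illustration
    },
)

print(doc.page_content)  # the text that will later be chunked and embedded
print(doc.metadata)      # extra information that can be used for filtering
```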

[32:23]

The page content will basically have the

[32:27]

content that is present inside that

[32:29]

particular file. Okay. So if you are

[32:30]

reading the file inside my page content

[32:34]

all those details, all the content that

[32:37]

is present inside the file will be

[32:38]

available over here and metadata will be

[32:41]

some more additional information of the

[32:43]

file like it can be the file name it can

[32:46]

be how many pages there are,

[32:47]

what the timestamp of the file is,

[32:49]

each and everything. So this way

[32:52]

whenever you read any kind of data and

[32:54]

you convert it, right, into a document

[32:55]

data structure this format will be very

[32:58]

very important because at the end of the

[33:00]

day we will be doing the embedding on

[33:03]

this particular data and pushing it into

[33:05]

the vector DB and when we do that

[33:07]

specific task pushing it to the vector

[33:10]

DB we will be able to apply different

[33:12]

different uh algorithms like similarity

[33:15]

search cosine similarity and we'll be

[33:17]

able to retrieve the results. So here

[33:20]

you can see that all the information

[33:21]

regarding this is given over here. So

[33:24]

usually the LangChain document structure

[33:26]

has two important core components. One

[33:28]

is page_content and one is

[33:30]

metadata. And here page content will be

[33:32]

the actual text content. It

[33:36]

will be very handy with research

[33:37]

papers, if you want to create a

[33:39]

RAG application on research papers or

[33:41]

product manuals. So you can specifically

[33:43]

use this. In LangChain you definitely

[33:46]

have different different loaders. Okay,

[33:49]

loaders like you have something like PDF

[33:51]

loader, you have CSV loader, you have

[33:53]

web-based loader, you have directory

[33:54]

loader. Now see all these loaders what

[33:56]

they do is this: the PDF loader will be

[33:59]

used to load the PDF files and once it

[34:02]

loads the PDF file right it will be

[34:04]

giving you the output of the documents

[34:06]

in the form of a document structure.

[34:09]

Okay, I will show you practically also

[34:11]

why I'm specifically saying and

[34:12]

stressing on this. Okay, it will

[34:14]

definitely give you all the output in

[34:16]

the form of a document structure.

[34:18]

Similarly, in the case of CSV loader,

[34:20]

here we are giving the CSV file, but it

[34:22]

will try to convert the entire content

[34:24]

that is present inside that CSV into a

[34:26]

document data structure. Similarly, with

[34:28]

respect to the web-based loader and directory

[34:30]

loader. Similarly, there are so many

[34:32]

different different loaders over here,

[34:34]

right? You can use any of these

[34:36]

loaders to load the data, and

[34:38]

at the end of the day uh this loader

[34:41]

will finally give you the output in the

[34:43]

form of document structure. Okay. So I

[34:47]

hope you got an idea about what exactly

[34:48]

is document structure itself. Okay. So

[34:51]

now quickly what I will do I will go

[34:53]

ahead and uh start explaining you about

[34:56]

like how we can start with the document

[34:58]

structure. So for the document we need

[35:00]

to import from langchain,

[35:03]

langchain

[35:05]

dot, there's something called text

[35:07]

splitter and, sorry, langchain_core. It

[35:11]

is present inside langchain_core.documents:

[35:15]

import Document. Okay, now this Document,

[35:19]

you will be able to see that if you just

[35:21]

hover over here, you'll be able to see the description: a

[35:23]

class for storing a piece of text and

[35:25]

associated metadata okay now if you

[35:30]

really want to understand a document

[35:31]

structure so first of all I will go

[35:33]

ahead create one document let's say

[35:35]

manually I'll go ahead and create so I

[35:37]

will use this document and inside this

[35:39]

we will be using two parameters one is

[35:41]

the page content let's say this page

[35:43]

content I'm writing this is the main

[35:46]

text content

[35:48]

uh content

[35:50]

I'm using to create RAG. Okay, so

[35:55]

I've just written some

[35:58]

basic content over here let's consider

[36:00]

that this particular content is coming

[36:02]

from a .txt file. Okay, but along with

[36:05]

this content, if you really want to

[36:07]

improve the search query retrieval from

[36:09]

the vector DB, you need to also go ahead

[36:11]

and write metadata. So the second

[36:13]

parameter that you'll be able to see is

[36:15]

something called as metadata. Now inside

[36:18]

this metadata, you can write different

[36:20]

different information because at the end

[36:21]

of the day this is text. You can write

[36:23]

like okay fine this is my source. The

[36:25]

source is basically coming from

[36:27]

example.txt file. Okay. Then let's say

[36:30]

the number of pages is equal to one.

[36:34]

Okay. Total number of pages are like

[36:36]

one. Uh I can also go ahead and write

[36:38]

some more information like okay who is

[36:40]

the author for this? Author is nothing

[36:42]

but Krish Naik. So these are the

[36:44]

additional details that you'll be able

[36:46]

to see it. Okay fine. Let's go ahead and

[36:48]

write date created. So date created.

[36:52]

Right. Date created. And here I can go

[36:54]

ahead and write 24 -01 - 0 like it's

[36:58]

like first 2024 or first 2025. Now why

[37:02]

all this metadata will be really, really

[37:04]

important because once we consider this

[37:06]

document right once we do the chunking

[37:09]

once we do the embedding and once we

[37:11]

store into the vector DB when you're

[37:12]

doing the similarity search you can also

[37:15]

apply filters that is the most important

[37:18]

thing of this and when you apply filters

[37:20]

let's say that I am applying a filter uh

[37:22]

I'm searching what is the main text

[37:24]

content for building the rag some

[37:26]

information is there let's say there's

[37:28]

some information related to the rag if I

[37:30]

ask that particular question and I say

[37:32]

by author Krish Naik, I just add that

[37:34]

particular filter then it knows from

[37:36]

which document to probably pick up

[37:39]

because it is going to apply a filter by

[37:41]

using the name of author right and that

[37:44]

is why this metadata will definitely

[37:46]

play a very important role now if I just

[37:48]

go ahead and execute this doc you'll be

[37:51]

able to see that fine I'm getting this

[37:53]

particular document here you can see

[37:54]

metadata is there and as you go ahead

[37:56]

you'll also be able to see page_content

[37:59]

right so these are the two main

[38:01]

important parameters with respect to

[38:03]

this which everybody can probably go

[38:05]

ahead and use it. Okay. Now I hope you

[38:08]

got a very clear idea about it. Uh now

[38:11]

what I'll do I will just go ahead and

[38:12]

create a simple simple create a simple

[38:17]

txt file. Okay. Now for creating a

[38:21]

simple txt file what I will do I will

[38:23]

just go ahead and import OS. Okay. And

[38:26]

I'm saying os.makedirs with data/

[38:29]

text_files. So I'm trying to create this

[38:31]

particular folder; inside this data folder I'm

[38:33]

creating this particular folder name

[38:35]

Okay, and if it already exists I'll say

[38:37]

that don't do anything right so as soon

[38:39]

as I go ahead and execute it you'll be

[38:40]

able to see that okay it is going inside

[38:43]

the notebook folder. I'll remove this and

[38:46]

let me go ahead and write ../

[38:48]

in front. Let's see. Now you can see over

[38:50]

here text_files is present. Okay, so the text_files

[38:53]

folder, I've just created that inside this.

[38:56]

now let me go ahead and manually create

[38:58]

a text file with the help of Python

[38:59]

code. Okay. So I will just go ahead and

[39:02]

use a Python code. See guys, these are

[39:04]

all our basic Python code. I don't want

[39:06]

to write each and every line of code and

[39:08]

make it very very big. Our main aim

[39:10]

should be to understand concepts

[39:12]

quickly, see multiple use cases, and

[39:14]

then try to implement this. Okay. So now

[39:18]

you will be able to see I have created

[39:19]

this simple text. I've given the file

[39:21]

name something like this. So let me go

[39:23]

ahead and write this:

[39:25]

data/text_files/python_intro.txt. And this is some

[39:29]

content that is present inside that

[39:31]

particular key name. Okay.

[39:33]

So this is my file name. You can see

[39:35]

this key is my file name. And then

[39:38]

here I have specifically my Python

[39:40]

content. Okay. Here I'm saying for filepath,

[39:43]

content in sample_texts.items(). I'm

[39:46]

telling it to open the file name. I'm

[39:48]

saying that write the content. Okay. So

[39:50]

this file path is nothing but my file

[39:53]

name. Okay. So if file is not there, it

[39:55]

will try to create python_intro.txt.

[39:59]

So now if I go ahead and execute this.

[40:01]

So it is telling me there is no such directory. Okay,

[40:04]

let me just go ahead and create one

[40:05]

file. Okay, python intro

[40:09]

um text file. Okay, I have to give the

[40:11]

path because there are two files that is

[40:13]

over here. One is okay, one file is also

[40:16]

over here. Okay, so I'll just go ahead

[40:18]

and write dot. Okay. So now here you can

[40:21]

see my sample files have been created:

[40:23]

machine_learning.txt

[40:25]

and python_intro.txt.
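
A sketch of the file-creation step described above: make the folder if it does not exist, then write each entry of a dictionary to its own file. The helper name `write_sample_files` and the sample sentences are assumptions for illustration. The video writes under `../data/text_files`; a temporary directory is used here so the sketch can run anywhere.

```python
import os
import tempfile


def write_sample_files(base_dir, sample_texts):
    """Create base_dir if needed and write one file per (filename, content) pair."""
    os.makedirs(base_dir, exist_ok=True)  # do nothing if the folder already exists
    written = []
    for filename, content in sample_texts.items():
        filepath = os.path.join(base_dir, filename)
        with open(filepath, "w", encoding="utf-8") as f:
            f.write(content)
        written.append(filepath)
    return written


# Illustrative content only; the actual text used in the video is not shown here.
sample_texts = {
    "python_intro.txt": "Python is a high-level programming language.",
    "machine_learning.txt": "Machine learning lets systems learn from data.",
}

base_dir = os.path.join(tempfile.mkdtemp(), "data", "text_files")
paths = write_sample_files(base_dir, sample_texts)
print(sorted(os.path.basename(p) for p in paths))
```

Passing `exist_ok=True` is what makes the cell safe to run more than once.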

[40:28]

Now what I will do see I've created some

[40:31]

sample file. I could have also manually

[40:32]

created it instead of doing the code.

[40:35]

Okay. But I really wanted to show you

[40:36]

all the things. Now what I will do I

[40:38]

will show you how to read this

[40:40]

particular text using TextLoader. So

[40:43]

one of the loaders that is present inside

[40:45]

LangChain is something called text

[40:47]

loader. So here I will go ahead and

[40:49]

write from

[40:52]

langchain.document_loaders import TextLoader.

[40:56]

Okay, TextLoader. So here we have

[40:59]

imported TextLoader, and along with

[41:02]

this uh see if you don't want to also

[41:04]

use this if I execute this this is also

[41:07]

there before if I talk about it right

[41:10]

LangChain keeps on changing its

[41:12]

library here and there. So there we used

[41:15]

to use

[41:18]

langchain_community.document_loaders. This also we used to use: import

[41:20]

TextLoader.

[41:22]

So any of them you can actually use

[41:24]

as long as you don't get a deprecation

[41:26]

warning. Okay. Now the question is that

[41:28]

how do we go ahead and read the text. So

[41:31]

I'll write loader

[41:33]

is equal to, and I will initialize

[41:35]

TextLoader. Let's give the path. The

[41:37]

path is nothing but parent folder. We go

[41:40]

to the parent folder:

[41:44]

../data/text_files/python_intro.txt.

[41:46]

So here I have actually given

[41:50]

my file name whatever file name we have

[41:51]

actually created and we can also go

[41:53]

ahead and use encoding UTF-8. Okay,

[41:56]

encoding

[41:58]

UTF-8.

[42:01]

So once I do this okay and now once I go

[42:04]

ahead and read this loader now what it

[42:06]

is giving it is giving me an object of

[42:10]

TextLoader. Now, in order to get

[42:12]

the content inside this, I will be using

[42:14]

loader.load(). Okay. And here you'll

[42:18]

be able to see that I will be getting

[42:19]

the document.
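
To make the loader's output concrete, here is a hand-rolled approximation of the step just described: read one text file and return a list holding a single document whose metadata records the source path. This is a sketch, not the library's implementation; in the video the actual call is `TextLoader(path, encoding="utf-8")` followed by `loader.load()`. The helper name `load_text_file` is hypothetical, and the fallback dataclass is a stand-in used only when `langchain-core` is not installed.

```python
import os
import tempfile

try:
    from langchain_core.documents import Document
except ImportError:
    # Fallback stand-in (NOT the LangChain class) so the sketch runs without LangChain.
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def load_text_file(path, encoding="utf-8"):
    """Read one text file and wrap it in a one-element list of Document objects."""
    with open(path, encoding=encoding) as f:
        text = f.read()
    return [Document(page_content=text, metadata={"source": path})]


# Create a small sample file in a temporary directory so the sketch is self-contained.
sample_path = os.path.join(tempfile.mkdtemp(), "python_intro.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("Python is a high-level programming language.")

documents = load_text_file(sample_path)
print(len(documents))
print(documents[0].metadata)
print(documents[0].page_content)
```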

[42:22]

Okay.

[42:24]

Now let's go ahead and print the

[42:26]

document. So I will write print

[42:28]

document. So let's say this is my

[42:30]

document. I'm going to print it. So here

[42:32]

you can see in the document you are

[42:33]

getting metadata. You're getting the

[42:35]

entire information and this is your page

[42:37]

content. Now this is what it is doing,

[42:39]

right? This TextLoader is by default

[42:42]

giving you the data in the document

[42:44]

structure as soon as it reads the file. And

[42:46]

here the best part is that you can also

[42:48]

see some of the metadata information has

[42:50]

also got updated like what is the source

[42:53]

right? You can still go ahead and

[42:55]

manually change more information inside

[42:57]

the metadata but by default the best

[43:00]

part is that whenever you're using

[43:02]

all these libraries, it will also be able

[43:05]

to give you the content in the document

[43:07]

structure which is really really good

[43:09]

because in the document structure you

[43:10]

have two important things. one is the

[43:13]

metadata and one is the page content. So

[43:16]

this is with respect to TextLoader,

[43:17]

right? I have just read the file with TextLoader

[43:19]

and I'm able to get this in this way.

[43:22]

Okay. Now one more way what I will do I

[43:25]

will show you with the help of directory

[43:26]

loader like if I have all the important

[43:31]

files in my directory. Can I read it

[43:34]

like that also or not? Okay. So for

[43:36]

doing this let's use uh one more library

[43:39]

which is called DirectoryLoader.

[43:42]

Right. So here you can see: from

[43:44]

langchain_community.document_loaders

[43:45]

import DirectoryLoader. Now inside my

[43:48]

DirectoryLoader you can see that I'm

[43:49]

giving this particular path. Again, this

[43:51]

path should start from the parent folder,

[43:54]

and here I have given the pattern to

[43:56]

match see this function basically you

[43:59]

can give a pattern to match all the

[44:01]

files. Then you can use loader_cls.

[44:03]

loader_cls basically means which kind of file

[44:06]

you are planning to load if it is a PDF

[44:08]

one you can directly go ahead and use

[44:09]

PDF okay so what I can actually do is

[44:12]

that I can also go ahead and insert PDF

[44:14]

files over here. I can also provide this

[44:17]

in the form of list so that it will be

[44:19]

able to read both the content. Okay. So

[44:22]

once I go ahead and execute this, you

[44:23]

can see here also I'm using the encoding

[44:26]

and all these things. And here you can

[44:27]

see uh once I go ahead and write

[44:29]

directory

[44:32]

loader

[44:34]

dot load okay and here you will be able

[44:38]

to see documents.

[44:41]

Okay. And then now if you just go ahead

[44:43]

and print the documents you should be

[44:45]

able to see this. Okay. I'm getting an

[44:47]

error: to log the progress, please install

[44:49]

tqdm (pip install tqdm). Okay. So here we have

[44:52]

enabled the parameter: show_progress is

[44:54]

equal to True. Let me make it False.

[44:56]

So that I don't need to probably go

[44:57]

ahead and install this. Now here clearly

[44:59]

you can see that there were two .txt

[45:01]

files. I got two documents. Yes. Now

[45:04]

further you can do chunking and all

[45:06]

right based on the number of documents

[45:08]

over there I was able to get it. Right.
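
The directory step can be approximated by hand as well: match files against a glob pattern and turn each match into one document. This is a sketch of the idea, not the library's code; in the video the work is done by `DirectoryLoader` with a glob pattern, a loader class, and `show_progress`. The helper name `load_text_directory` is hypothetical, and the fallback dataclass is a stand-in used only when `langchain-core` is not installed.

```python
import tempfile
from pathlib import Path

try:
    from langchain_core.documents import Document
except ImportError:
    # Fallback stand-in (NOT the LangChain class) so the sketch runs without LangChain.
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def load_text_directory(directory, pattern="**/*.txt", encoding="utf-8"):
    """Return one Document per file under `directory` that matches `pattern`."""
    documents = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            documents.append(
                Document(
                    page_content=path.read_text(encoding=encoding),
                    metadata={"source": str(path)},
                )
            )
    return documents


# Two .txt files and one non-matching file in a temporary directory.
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "python_intro.txt").write_text("Python is a programming language.", encoding="utf-8")
(tmp_dir / "machine_learning.txt").write_text("Machine learning learns from data.", encoding="utf-8")
(tmp_dir / "notes.md").write_text("This file does not match the pattern.", encoding="utf-8")

documents = load_text_directory(tmp_dir)
print(len(documents))
for doc in documents:
    print(doc.metadata["source"])
```

Two matching files give two documents, which mirrors the result described above.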

[45:11]

So this is the most amazing part uh

[45:14]

about this. Now what I will uh quickly

[45:16]

do is that let me go ahead and create uh

[45:19]

a PDF file also. Okay. So here I have

[45:22]

some examples of the PDF file. Okay. So

[45:25]

let me quickly go ahead and copy this

[45:28]

and paste it over here. Reveal explorer

[45:32]

data. I have text files. I have PDF

[45:35]

files. Now inside this PDF file now my

[45:37]

main aim is to read both the text and

[45:39]

PDF files. Let's see. So here I have

[45:42]

attention.pdf, this PDF, this PDF. Okay,

[45:44]

so this is my one document. Okay, let me

[45:47]

go ahead and write the same code. Copy

[45:49]

and paste it over here. And this will

[45:51]

basically be for the PDFs. So for PDF I

[45:54]

will be having from langchain,

[45:57]

langchain_core.document_loaders import

[46:03]

PyPDF.

[46:06]

I think PyPDF is not available over

[46:08]

here. Let's see where is this specific

[46:10]

library. I'm just checking out the

[46:12]

documentation. PyPDF. Oh yeah, it

[46:15]

should be there. So it should be here,

[46:18]

inside langchain_community.document_loaders.

[46:20]

I have two different types of

[46:22]

loaders: PyPDF and PyMuPDF. PyMuPDF

[46:25]

is better when compared to PyPDF. You

[46:28]

can see PyPDF says: load and parse a

[46:30]

PDF file using the pypdf library. And

[46:33]

similarly, if you go ahead and see PyMuPDF,

[46:35]

it loads and parses a PDF file using

[46:38]

this provides method to load this this

[46:40]

this is there all the information you

[46:41]

can see the differences

[46:43]

which one is better which one is not

[46:45]

better in the later stages. Okay now

[46:47]

what I'm doing is that I will give the

[46:49]

path over here. So from data / data and

[46:53]

here you can see the path is nothing but

[46:55]

PDF

[46:57]

here I will go ahead and write PDF

[47:00]

instead of writing TextLoader I will go

[47:02]

ahead and write PyMuPDFLoader. Let's go ahead

[47:04]

and use PyMuPDFLoader. I can also include

[47:07]

encoding in this and here what I will do

[47:11]

I will quickly write PDF documents is

[47:16]

equal to directory loader dot load

[47:20]

Okay. And then if I just go ahead and

[47:23]

see PDF documents, you should be able to

[47:25]

see there are so many different PDFs.

[47:27]

Okay. I'm getting an error: get_text

[47:30]

got an unexpected argument. Okay. Let's

[47:33]

remove this. I will not be requiring

[47:35]

anything. We don't need to apply any

[47:36]

encoding by default. Okay. So here you

[47:39]

can see I have got all my documents.

[47:41]

Yes. So how many different files were

[47:44]

there inside PDF folder? One is

[47:45]

attention.pdf, embedding.pdf, object

[47:48]

detection. These are some of the

[47:49]

research papers, and with respect to this

[47:51]

all we are able to see this and now the

[47:53]

best part is that when you're using

[47:55]

PyMuPDF, here the metadata information is

[47:57]

completely different. See: creation date,

[48:00]

source, file path, total pages,

[48:04]

right format see total pages is 15 for

[48:07]

the first one then 27 then 21 see you

[48:10]

can see it so beautifully it is there

[48:13]

see I have also created some of the PDFs

[48:16]

there also you'll be able to see some

[48:17]

kind of author's name also right

[48:21]

it tries to bring up all the entire

[48:23]

source information and this is your page

[48:24]

content right so beautifully you are

[48:27]

able to see the entire content quickly

[48:29]

right so that is what this all PDF is

[48:33]

all about and here at the end of the day

[48:35]

even though we use these specific

[48:36]

libraries we are getting this in the

[48:39]

form of a document structure it is a

[48:41]

list of documents so if I go ahead and

[48:43]

say what is type of PDF document of zero

[48:47]

You'll be able to see okay it is of a

[48:49]

document type right now that is the most

[48:53]

important thing if you now see that we

[48:55]

have understood about document structure

[48:58]

we know how to read PDF and txt now

[49:01]

don't you think you can actually easily

[49:03]

find out how to probably go ahead and

[49:05]

read Excel, a DB, any kind of file, and

[49:08]

this is the task that you really need to

[49:09]

do. How will you do it? Just go to

[49:11]

the LangChain document loaders, right, and you

[49:15]

will be able to find out everything over

[49:17]

here. Just go ahead and try it out. Try

[49:20]

it out. Try it out. Try to see if the

[49:22]

document structure that you're getting

[49:23]

is good or not. So here there are so

[49:25]

many different things you can go just go

[49:27]

ahead and try it out. If you want to load from

[49:29]

AWS S3, from an AWS S3

[49:32]

directory go ahead and just install this

[49:34]

particular library give this but before

[49:36]

that you have to do the authentication

[49:37]

and all right. Once you do this and uh

[49:40]

once you're able to do it, you can use

[49:42]

any kind of document loaders as you add

[49:45]

but at the end of the day what is what

[49:47]

is the best thing about this at the end

[49:50]

of the day you are able to convert

[49:52]

everything into a document data

[49:53]

structure right now if you see with

[49:56]

respect to data ingestion, here you have

[49:58]

actually completed now the next step is

[49:59]

that I will move towards chunking okay

[50:02]

I'll move and show you how the chunking

[50:04]

can be specifically done what are the

[50:06]

different ways of chunking um that you

[50:08]

can actually do you know and then

[50:10]

finally we'll see that how we can even

[50:12]

convert into embeddings we'll try to use

[50:14]

an open source embeddings for this and

[50:15]

then finally a vector DB so yes I hope

[50:18]

you have understood about the data

[50:20]

ingestion part. Now let's move towards

[50:21]

the chunking part where we will

[50:23]

understand uh how we can actually

[50:25]

perform chunking, and I have also told

[50:27]

you what is the importance of chunking

[50:30]

so guys till now we have already

[50:31]

discussed about the entire document

[50:33]

structure and uh I've also shown you how

[50:36]

with the help of PyPDFLoader and

[50:39]

PyMuPDFLoader, and how with the help of

[50:42]

TextLoader you will be able to read the txt

[50:44]

file and PDF file. All the other files

[50:46]

again you can go ahead and see the

[50:48]

LangChain documentation. You have different

[50:50]

different document loaders which I have

[50:51]

already discussed right and these are

[50:53]

some of the document loaders that you

[50:55]

can specifically use uh which I have

[50:57]

already shown you um from the

[51:00]

documentation page. Now we are going to go

[51:02]

ahead one step ahead you know um because

[51:05]

we have just started with this we

[51:07]

understood about data parsing and we

[51:09]

were able to create the document

[51:10]

structure itself now I really want to

[51:13]

probably go ahead and do the chunking

[51:15]

uh then after the chunking I also want

[51:18]

to probably go ahead and do the

[51:20]

embedding, and finally whatever text is

[51:23]

converted to vectors, these

[51:26]

vectors will be stored in some kind of

[51:28]

vector store DB okay so let's go ahead

[51:30]

and start building this entire pipeline

[51:32]

Okay, so this pipeline we will

[51:35]

build from the start; we'll start from

[51:36]

complete basics, since this entire RAG

[51:38]

series we are learning from basic stuff

[51:41]

right so definitely you'll love it

[51:43]

you'll love the explanation of

[51:45]

what I'm doing you know so here uh what

[51:47]

I will do I will go ahead and create one

[51:48]

more file quickly and I'll say hey this

[51:51]

is nothing but PDF loader .ipynb. Okay, and

[51:56]

uh here I will go ahead and select my

[51:57]

kernel this is my kernel and let's go

[52:00]

ahead and start the entire RAG pipeline,

[52:04]

and this pipeline is nothing but the data

[52:07]

ingestion to vector DB pipeline. Okay,

[52:11]

vector DB pipeline we are going to go

[52:13]

ahead and build this quickly.

[52:16]

So, uh first step as you know that I

[52:20]

already have one data folder over here.

[52:23]

So, this is what is my data folder and I

[52:26]

definitely have a lot of PDF files

[52:27]

inside this PDF folder itself.

[52:30]

So first thing first uh what I will do I

[52:32]

will go ahead and create a function you

[52:35]

know uh saying that uh where in I will

[52:39]

try to read all the documents from this

[52:42]

and I will try to uh read the data

[52:44]

inside this particular document that is

[52:46]

PDF file, and then we may use

[52:49]

PyPDFLoader and then finally

[52:52]

convert that into a document. Okay. So

[52:54]

for this what I will do I will quickly

[52:56]

go ahead and create a function and this

[52:58]

function will be nothing but uh this is

[53:01]

a markdown. Let me just go ahead and

[53:02]

make a code cell. So uh before I go

[53:05]

ahead, I want to import all the

[53:08]

important libraries that are available.

[53:11]

Uh some of the libraries that I will be

[53:13]

noting down over here is nothing but

[53:15]

import OS. Then you have something

[53:17]

called langchain_community,

[53:20]

langchain_community.document_loaders. I'm

[53:23]

using PyPDFLoader and all. Then you

[53:25]

also have langchain.text_splitter and

[53:28]

RecursiveCharacterTextSplitter. Okay, so

[53:30]

otherwise, instead of writing in a new

[53:33]

file I will let's go ahead and use okay

[53:35]

this file is fine so I will just go

[53:37]

ahead and execute this. I don't

[53:38]

require the path library. So once I

[53:41]

execute this these all libraries will

[53:44]

get executed now we will be able to use

[53:47]

this. Now since my first step is related

[53:50]

to data ingestion. Now whenever I really

[53:53]

want to specifically do data ingestion,

[53:55]

what I will do is that I will try to

[53:56]

read all the PDFs. So we will read all

[54:00]

the PDFs

[54:02]

inside the directory. Okay, directory.

[54:06]

Now guys, uh you need to have some

[54:08]

knowledge with respect to coding. So

[54:11]

otherwise if I keep on writing line by

[54:13]

line, it'll definitely take a lot of

[54:14]

time. So here we are going to create a

[54:17]

function which is called

[54:19]

process_all_pdfs. Here we need to give the PDF

[54:21]

directory. Once you give the PDF

[54:24]

directory uh we will probably go ahead

[54:27]

and take the path. So for this also I

[54:29]

will be requiring the path library over

[54:31]

here. So once we get the path based on

[54:34]

the workspace location here we are going

[54:36]

to get the PDF directory path. Then

[54:38]

we'll list them all: we'll go ahead and

[54:40]

apply this regular expression to get all

[54:42]

the PDF files. Then here I'm printing

[54:45]

what is the length of the PDF file and

[54:47]

we are processing every PDF file. So

[54:49]

here you can see that I'm using

[54:51]

PyPDFLoader with str of the PDF file, whatever the

[54:54]

file name is. Then I'm doing documents is

[54:55]

equal to loader.load(). Here I get the

[54:58]

document okay here what I'm doing I'm

[55:00]

adding some more information related to

[55:02]

metadata so here you can see doc

[55:04]

metadata of source file I'm giving the

[55:06]

pdf file name I'm also saying that hey

[55:09]

what is the metadata file type so this

[55:11]

is my new keys inside my metadata to

[55:13]

put some more additional

[55:14]

information and finally you get a PDF

[55:17]

I'm just mentioning some more metadata

[55:19]

information so along with this I've put

[55:21]

up this metadata information like file

[55:23]

type source file now you can add keep on

[55:25]

adding any number of metadata

[55:27]

information like you want right and once

[55:29]

we read this entire documents we are

[55:31]

going to go ahead and store in this

[55:33]

particular variable that is called as

[55:34]

all documents which is nothing but it is

[55:36]

a list; it starts as an empty

[55:39]

list okay so once we do this here we'll

[55:41]

be able to see it is returning this all

[55:43]

documents so this function what it does

[55:45]

is that from inside a folder it reads

[55:48]

all the PDF files, it reads

[55:52]

the content inside this it adds this

[55:54]

kind of metadata information and finally

[55:56]

it is basically storing in this

[55:58]

particular variable. Okay. Now we call

[56:00]

this particular function process all

[56:02]

PDFs. I'm giving the data folder over

[56:04]

here. So once I execute this you'll be

[56:06]

able to see that it has found out four

[56:08]

PDF files, and attention.pdf had 15

[56:11]

pages. My embedding PDF had 27 pages and

[56:15]

object detection PDF had 21 pages. And

[56:18]

this is proposal one page. Okay. So all

[56:21]

the information I have it over here. Now

[56:23]

if I go ahead and check my all

[56:27]

documents.
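
A minimal sketch of the loading function described above. The import path `langchain_community.document_loaders`, the `**/*.pdf` glob pattern, and the optional `loader_factory` argument are assumptions for illustration (the last one only exists so the function can be tried without LangChain installed); the metadata keys `source_file` and `file_type` follow the walkthrough.

```python
from pathlib import Path


def process_all_pdfs(pdf_directory, loader_factory=None):
    """Load every PDF under pdf_directory and tag each page with extra metadata."""
    if loader_factory is None:
        # Imported lazily; assumed to be the loader used in the video.
        from langchain_community.document_loaders import PyPDFLoader
        loader_factory = PyPDFLoader

    all_documents = []
    pdf_files = sorted(Path(pdf_directory).glob("**/*.pdf"))
    print(f"Found {len(pdf_files)} PDF files to process")

    for pdf_file in pdf_files:
        loader = loader_factory(str(pdf_file))
        documents = loader.load()  # PyPDFLoader returns one document per page
        for doc in documents:
            # Extra keys added on top of whatever the loader already provides.
            doc.metadata["source_file"] = pdf_file.name
            doc.metadata["file_type"] = "pdf"
        all_documents.extend(documents)
        print(f"  {pdf_file.name}: {len(documents)} pages")

    return all_documents
```

Calling it would look something like `all_pdf_documents = process_all_pdfs("../data")`, where the folder path is a placeholder for wherever the PDFs live.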

[56:28]

So if I go ahead and check just this

[56:30]

particular variable, all_pdf_documents,

[56:33]

you should be able to see that this is

[56:36]

my list of documents right and the best

[56:38]

part is that for every PDF you'll be

[56:40]

able to see by default some of the

[56:41]

metadata information along with this you

[56:43]

can see there is an author metadata

[56:45]

keywords, moddate, all this modified

[56:48]

date right all these information are

[56:50]

basically present in the metadata

[56:52]

information now here what we have added

[56:54]

we have added source along with the

[56:56]

source you can see total

[56:58]

pages is also there, and source file is

[57:00]

also added and these are my text which

[57:03]

is present inside my page content right

[57:06]

so for every PDF whatever is the

[57:08]

size of the document, we

[57:10]

are able to read it. Now this is a

[57:12]

step that we have done right now we have

[57:15]

to go to the next step and perform the

[57:17]

chunking now how do I go ahead and

[57:19]

perform the chunking now I have my all

[57:21]

my list of documents so what I will do I

[57:23]

will just go ahead and quickly create a

[57:25]

function

[57:26]

and this will be specifically text

[57:29]

splitting

[57:32]

get into chunks. Okay, chunks I have

[57:35]

over here. Right. So, first of all, I

[57:37]

will go ahead and create a function

[57:38]

which is called as split documents.

[57:40]

Split documents. And inside this

[57:43]

documents, I will be giving my

[57:45]

parameters. The first parameter is

[57:47]

nothing but documents. Then I have my

[57:50]

chunk size is equal to 1,000. Then I have

[57:55]

chunk underscore

[57:56]

overlap is equal to 200. Okay. So I have

[58:01]

given all these things. Now you know how

[58:02]

to do the chunking. It is very simple.

[58:05]

You go ahead and directly use the

[58:07]

RecursiveCharacterTextSplitter.

[58:09]

And for this we we definitely require

[58:12]

RecursiveCharacterTextSplitter, which we have

[58:13]

already imported I think right. So on

[58:16]

the top you'll be able to see that we

[58:17]

have imported this which is present in

[58:19]

langchain.text_splitter.

[58:20]

So inside we are taking this text

[58:22]

splitter which is nothing but recursive

[58:23]

character text splitter. Now this will

[58:25]

recursively split all the documents

[58:28]

based on the chunk size that is 1,000

[58:30]

chunk overlap 200. Chunk overlap

[58:32]

basically means some amount of text will

[58:35]

be shared between two

[58:37]

consecutive chunks, right, when we are

[58:38]

doing the splitting. And uh here you can

[58:41]

see we are also using separators right

[58:43]

this is just like an empty space like a

[58:45]

blank uh sorry this is an empty space

[58:48]

this is one more separator this is a new

[58:50]

line separator now you tell me in the

[58:51]

comment section what separator is this

[58:53]

okay so we can use different different

[58:55]

separators you can also use comma um

[58:58]

we'll be seeing different types of

[58:59]

chunking strategies in the later stages

[59:01]

but let's let's start creating this one

[59:04]

pipeline then you'll be getting a clear

[59:06]

idea about it like how this entire

[59:08]

pipeline works Okay, then you have this

[59:10]

text splitter. Uh once you uh

[59:13]

specifically have this text splitter,

[59:14]

you can actually use this to do the

[59:16]

splitting. Right. So now what I will do,

[59:18]

I will create a variable inside this and

[59:22]

I will write text_splitter.split_documents.

[59:23]

So we are using the split

[59:25]

documents and we are giving the

[59:26]

documents and these all are the default

[59:28]

parameters that we are giving over here.
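
A hedged sketch of the splitting function just described. The separator list shown is LangChain's default for `RecursiveCharacterTextSplitter` and is assumed to match what is on screen; the optional `splitter` argument is an addition for illustration so the function can be exercised with a stand-in splitter.

```python
def split_documents(documents, chunk_size=1000, chunk_overlap=200, splitter=None):
    """Split page-level documents into smaller, overlapping chunks."""
    if splitter is None:
        # Lazy import; the class lives in different packages depending on
        # the installed LangChain version, so both locations are tried.
        try:
            from langchain_text_splitters import RecursiveCharacterTextSplitter
        except ImportError:
            from langchain.text_splitter import RecursiveCharacterTextSplitter
        splitter = RecursiveCharacterTextSplitter(
            chunk_size=chunk_size,
            chunk_overlap=chunk_overlap,
            length_function=len,
            separators=["\n\n", "\n", " ", ""],
        )

    split_docs = splitter.split_documents(documents)
    print(f"Split {len(documents)} documents into {len(split_docs)} chunks")

    if split_docs:
        # Preview the first 200 characters and the metadata of one chunk.
        print(f"Content: {split_docs[0].page_content[:200]}...")
        print(f"Metadata: {split_docs[0].metadata}")
    return split_docs
```

The splitter tries the separators in order, so it prefers to break on paragraph boundaries, then single newlines, then spaces, and only falls back to individual characters when nothing else fits within `chunk_size`.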

[59:29]

Now once we do the split, you'll also be

[59:31]

able to see what is the page content.

[59:33]

I'll just try to display 200 characters

[59:35]

from the page content and you can also

[59:37]

see the metadata right so once we go

[59:39]

ahead and execute this this is going to

[59:41]

return the entire split documents now

[59:44]

let's go ahead and use this split let's

[59:47]

say here I'm just going to go ahead and

[59:49]

get all my chunks I will be using this

[59:51]

function split documents and let's give

[59:55]

the documents here we are going to give

[59:56]

the list of documents right uh like uh

[60:00]

what are the list of documents so list

[60:02]

of documents is nothing but all_pdf_documents.

[60:03]

So I will give it over here

[60:06]

and let's see the chunks. Okay. So now

[60:09]

if I go ahead and just go ahead and

[60:11]

print the chunks, you should be able to

[60:13]

see that my all my data is basically

[60:15]

chunked, right? And uh you can see that

[60:18]

we have split 64 documents into 359

[60:21]

chunks. So these are all my chunks that

[60:23]

we have done it, right? That basically

[60:25]

means we have converted all our text

[60:27]

into smaller chunks, right? Based on the

[60:30]

uh chunk size and the overlap. So like

[60:33]

this kind of chunks we have how much 359

[60:35]

I guess how much it is 359. Initially we

[60:38]

had only 64 documents right for every

[60:40]

page there will be a separate document

[60:42]

structure. Perfect. So we have done this

[60:46]

and uh we have done the splitting part.
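
To make the role of chunk size and overlap concrete, here is a deliberately simplified, pure-Python illustration. It is not LangChain's algorithm (which splits on separators first); the function name `sliding_chunks` is hypothetical, and it only shows how consecutive chunks share `chunk_overlap` characters.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

With a chunk size of 1,000 and an overlap of 200, each window in this sketch would start 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next.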

[60:48]

Now let's go to the next step. The next

[60:50]

step will be quite interesting because

[60:53]

now if you see from this particular

[60:56]

pipeline right what are we doing right?

[60:58]

So here we have done the chunking but

[61:00]

these two are the most important steps.

[61:02]

One is the embedding right we need to

[61:05]

perform some kind of embeddings over

[61:06]

here right embedding uh generation

[61:09]

embedding generation and vector store DB

[61:11]

right embedding you can use any kind of

[61:14]

models but I will try to focus on using

[61:16]

open source model so that everybody will

[61:18]

be able to just try it out you know uh

[61:21]

for this what I will do I will just try

[61:23]

to use some kind of modular coding so I

[61:25]

will try to create some classes you know

[61:27]

for embedding I will create a separate

[61:28]

class and inside this we will try to

[61:31]

define different different function

[61:32]

Because in embedding uh you know that

[61:34]

you are converting text into vectors

[61:36]

right so for converting text into

[61:38]

vectors I may define different functions

[61:40]

like loading the model generating

[61:42]

embeddings you know that kind of and in

[61:44]

vector DB like again we'll try to create

[61:46]

this as a separate class. So let's go

[61:49]

ahead and probably go ahead and discuss

[61:51]

about this uh wherein we work on the

[61:54]

embedding part

[61:57]

quickly let's go ahead and see the

[61:59]

embedding part. So for the embedding I

[62:01]

will just go ahead and write a markdown.

[62:04]

So let me quickly write embedding and

[62:07]

vector store DB right. So we are going

[62:10]

to specifically go ahead and implement

[62:11]

these two important modules. Now first

[62:13]

of all, what I do is that I

[62:16]

definitely require some kind of

[62:17]

libraries over here right for

[62:19]

embeddings. So for embedding uh we are

[62:21]

going to use sentence transformer. uh we

[62:23]

are going to use a model that is

[62:25]

available in hugging face and for that I

[62:27]

will be using the sentence transformers

[62:29]

library along with this uh I also want

[62:32]

to use some kind of uh you know vector

[62:36]

store so this is the vector store I may

[62:39]

use, that is faiss-cpu. You can use FAISS

[62:42]

or you can also go ahead and use ChromaDB

[62:44]

so these are some very good open-source

[62:46]

vector stores that are available. Um, now

[62:50]

these all libraries will be more than

[62:51]

sufficient to get started with. So

[62:53]

quickly let me go ahead and install it.

[62:55]

So I will write uv add -r

[62:57]

requirement.txt.

[63:00]

So once I do the installation you'll be

[63:02]

able to see that.

[63:04]

Okay the installation will get

[63:06]

completed.

[63:09]

So once the installation gets completed

[63:10]

it'll take some amount of time because

[63:12]

we are loading the entire transformers.

[63:14]

So here you can see that quickly it has

[63:16]

got installed. Now I'll go again back to

[63:18]

over here. Now once I go over here what

[63:21]

is the first step that I'm actually

[63:22]

going to do is that I will quickly go

[63:24]

ahead and import some of the libraries

[63:26]

that I require like this right so I'm

[63:28]

importing numpy. From sentence_transformers

[63:30]

I'm importing SentenceTransformer;

[63:32]

my embedding model, right,

[63:35]

will be available inside this then I'm

[63:37]

importing chromadb then uh we also

[63:40]

importing the settings from this we are

[63:42]

importing uuid. The reason for creating

[63:44]

this UUID is that every record

[63:47]

that we specifically insert into the

[63:49]

vector DB will have some kind of ID

[63:51]

over there we'll generate that then

[63:54]

along with this we will also be

[63:55]

importing List, Dict, Any and Tuple,

[63:57]

and uh since we are going to apply

[63:59]

cosine similarity while doing the

[64:00]

retrieval from the vector db I also will

[64:02]

be importing this and this is available

[64:04]

in scikit-learn, so let's quickly execute

[64:06]

this okay and till then I will go ahead

[64:10]

and create more number of cells now as I

[64:13]

said for embedding I will go ahead and

[64:16]

write one different class So I will say

[64:19]

embedding manager. So this will be

[64:22]

responsible in doing the embedding part.

[64:24]

So first first thing is that once I am

[64:27]

creating this uh for every class that we

[64:30]

specifically create, we need to write an

[64:31]

init function. Okay. So init. So this is

[64:35]

my constructor you'll be seeing that it

[64:37]

handles document embedding generation

[64:38]

using transformer. Here we are

[64:40]

initializing the embedding manager and

[64:43]

the model name that we are giving is

[64:44]

all-MiniLM-L6-v2. So this is available uh

[64:49]

in uh hugging face this specific model

[64:52]

all-MiniLM-L6-v2, and this is responsible

[64:55]

for specifically converting text into

[64:57]

vectors, and you get 384

[65:00]

dimensions. Okay. Then uh we initialize

[65:02]

the embedding manager. Then model name

[65:05]

is nothing but the Hugging Face model name

[65:07]

for sentence embeddings. We are going to

[65:08]

use this. Okay. So here we are

[65:10]

initializing the model name. Uh we are

[65:13]

saying self.model is equal to None.

[65:15]

Okay. Because here uh later on we'll

[65:17]

initialize this value. This function is

[65:19]

very important load model. So that

[65:21]

basically means my next function will be

[65:23]

load model. And this model work is very

[65:25]

simple. This function work is very

[65:27]

simple. It is going to load this model

[65:29]

that is all-MiniLM-L6-v2. Okay. So I will

[65:32]

create another function which is nothing

[65:33]

but underscore load model. Why we write

[65:35]

underscore? Uh this is just like a

[65:37]

protected function. Uh if you know about

[65:39]

classes, we use something called as a

[65:42]

protected function. And by convention, this

[65:44]

protected function is meant to be used within this class

[65:45]

only; Python does not enforce it. So here uh

[65:47]

what we are doing we using the sentence

[65:49]

transformer and whatever model name we

[65:51]

have we are loading it. Okay we are

[65:54]

loading it. So self.model is equal to

[65:56]

SentenceTransformer of self.model_name. Then

[65:58]

this model will be loaded, and here

[66:00]

you'll also be able to get the

[66:01]

dimension. For that we use a function

[66:03]

called get_sentence_embedding_dimension,

[66:05]

and for this model it will be

[66:08]

384 dimensions. Okay,

[66:10]

that basically means every text will be

[66:12]

converted into 384 dimensions. So once

[66:15]

we have this init function, we have the

[66:16]

load model. Now one more function that

[66:18]

we require is generate embeddings,

[66:20]

right? So here uh you'll be able to see

[66:23]

that I will be seeing this generate

[66:26]

embeddings function. Okay. So generate

[66:29]

embedding is nothing but it takes the

[66:32]

text that is nothing but list of string

[66:34]

and it returns a numpy array. Okay. So

[66:37]

here it generates the embedding for list

[66:38]

of text very simple. So here what we are

[66:41]

doing: we are basically using

[66:42]

self.model.encode, the function that

[66:45]

we have to use on text whatever text

[66:47]

list of text we give and we also giving

[66:49]

show_progress_bar is equal to True so

[66:51]

that we should be able to see the

[66:52]

progress bar and we return the

[66:53]

embeddings. Okay. Now generate embedding

[66:56]

is one function. Load model is one

[66:58]

function. We have also used get

[67:00]

sentence embedding dimension just to get

[67:02]

the dimension. Okay. Now for this you

[67:05]

can either get I can you can either

[67:07]

create this particular function or you

[67:08]

can also remove this it is not necessary

[67:10]

but what I did is that to show you much

[67:13]

more in a better way we will create this

[67:15]

function get sentence embedding

[67:16]

dimension. So here is my get embedding

[67:19]

dimension self. So here what we are

[67:21]

doing: we just return the model's

[67:23]

get_sentence_embedding_dimension(). See, instead of

[67:25]

doing like this also I can write like

[67:26]

this only over here. Okay I can just

[67:29]

quickly write this particular function

[67:32]

over here. Okay. So sometime it is not

[67:34]

required you can also. So I will just go

[67:36]

ahead and remove it if you want. Okay I

[67:38]

will just remove it. Perfect. So I have

[67:42]

these two three important function. Now

[67:44]

we can initialize the embeddings. Okay.

[67:49]

Uh sorry we can initialize the embedding

[67:51]

manager. So here I will write embedding

[67:55]

manager is equal to embedding

[68:00]

manager.
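
A minimal sketch of the embedding class as described so far. The method names `_load_model` and `generate_embeddings` follow the walkthrough; the optional `model` argument is an addition for illustration, so the class can be tried with a stand-in model instead of downloading weights.

```python
class EmbeddingManager:
    """Handles document embedding generation with a SentenceTransformer model."""

    def __init__(self, model_name="all-MiniLM-L6-v2", model=None):
        self.model_name = model_name
        self.model = model  # normally None here; filled in by _load_model()
        if self.model is None:
            self._load_model()

    def _load_model(self):
        # The leading underscore marks this as internal by convention only.
        from sentence_transformers import SentenceTransformer  # lazy import
        print(f"Loading embedding model: {self.model_name}")
        self.model = SentenceTransformer(self.model_name)
        dim = self.model.get_sentence_embedding_dimension()
        print(f"Model loaded successfully. Embedding dimension: {dim}")

    def generate_embeddings(self, texts):
        """Return one embedding vector per input string."""
        if self.model is None:
            raise ValueError("Model not loaded")
        # With a real SentenceTransformer this returns a NumPy array
        # of shape (len(texts), embedding_dimension).
        return self.model.encode(texts, show_progress_bar=True)
```

Creating the object with `embedding_manager = EmbeddingManager()` runs the constructor, which in turn loads the model, so the load messages appear as soon as the cell is executed.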

[68:03]

So I hope this is the class name

[68:06]

should not be underscore it should be

[68:08]

like this. Okay now once I go ahead and

[68:10]

write this and once I execute it this

[68:12]

will just go ahead and initialize the

[68:15]

constructor. Right. So here you can see

[68:17]

it is loading the embedding model

[68:18]

all-MiniLM-L6-v2. Model loaded successfully

[68:22]

and here you can see the dimension is

[68:24]

384 right so it has been loaded so when

[68:27]

we calling this particular function this

[68:28]

is basically getting loaded right so my

[68:31]

embedding manager now has the model

[68:32]

information over here great so I have my

[68:36]

model ready so if you see from this

[68:38]

particular graph this entire class has

[68:41]

been created now we go to the next step

[68:43]

and create this specific class that

[68:44]

basically means over here we have our

[68:46]

model embedding ready we just need to

[68:48]

use it. Now, similarly, we'll go ahead

[68:50]

and create it for the vector store also.

[68:52]

Okay, vector store is just like a vector

[68:54]

database where you can store all the

[68:56]

vectors that have been produced by the

[68:58]

embedding model, so that you

[69:00]

can apply any kind of similarity search

[69:02]

into it. Right? So, first of all, let me

[69:05]

quickly go ahead and define a class for

[69:08]

this also. So, here I will go ahead and

[69:12]

write vector store. Okay, vector store.

[69:17]

Uh remember guys the code that I'm

[69:20]

showing you is very simple if you just

[69:21]

see you need to have some coding

[69:24]

knowledge if you really want to become

[69:26]

better in rag. Okay now we'll go to the

[69:29]

next step with respect to the vector

[69:30]

store. Now in the vector store we are

[69:32]

creating a class vector store. Again

[69:34]

here we are using a init method. We are

[69:37]

giving a collection name. What should be

[69:38]

the collection name for the vector store

[69:40]

itself. And uh here the collection name

[69:43]

we giving it as PDF documents. We are

[69:45]

also giving the persistent directory

[69:47]

which will be this particular directory

[69:49]

that is inside my data folder.

[69:51]

Persistent directory means whatever

[69:52]

vector store is basically created we are

[69:54]

going to save it on the hard disk.

[69:56]

So here uh first of all I'm giving the

[69:58]

collection name, I'm giving the persist

[69:59]

directory, and the collection is None:

[70:01]

self.collection is equal to None. Okay. And

[70:03]

then we are initializing the store. Now

[70:05]

whenever we initialize the store that

[70:07]

basically means this function will be

[70:09]

initializing the vector store itself.

[70:11]

Right. So for this we need to create

[70:13]

another function again and see the code.

[70:16]

Okay, just observe the code. Here we are

[70:17]

initializing the ChromaDB client and

[70:19]

collection. So here we have written

[70:20]

os.makedirs of self.persistent

[70:22]

directory whatever directory path is

[70:24]

there. If it already exist we are just

[70:26]

going to keep it like that otherwise it

[70:28]

is going to create a new directory. Then

[70:30]

we create a client self.client wherein

[70:33]

we are using the chromadb.PersistentClient

[70:35]

function and we are given the persistent

[70:37]

directory over here. So what it is going

[70:38]

to do? It is basically going to create a

[70:40]

client which will be having a reference

[70:43]

to the Chroma vector store. Okay. Then we

[70:46]

go ahead and create a collection. So

[70:48]

here we write self.collection. Then

[70:50]

self.client.get_or_create_collection.

[70:52]

We're giving the collection

[70:53]

name and we're giving some metadata

[70:55]

information like what is the collection

[70:56]

information. And here we basically

[70:59]

create a collection uh collection

[71:01]

basically means it's just like uh where

[71:03]

we are going to store the uh vector uh

[71:05]

where we are going to store the uh

[71:07]

vectors inside my vector store. So it'll

[71:10]

be stored inside this particular

[71:11]

collection name. Then we are

[71:13]

printing the collection

[71:15]

name and the collection count. Okay. So as

[71:18]

soon as we execute this that basically

[71:20]

means my ChromaDB client will be ready and

[71:22]

my collection will be created. Okay. Now

[71:24]

the next function is that usually

[71:26]

whenever we create a collection we need

[71:28]

to add the documents right. So for

[71:30]

documents we will be creating another

[71:32]

function. So quickly let's go ahead and

[71:35]

create this because whenever I have a

[71:37]

document I will go ahead and add it to this

[71:39]

particular collection. Okay. So here you

[71:42]

can see I've created another function

[71:43]

which is called add_documents. Here we

[71:45]

give the list of document. We apply the

[71:47]

embeddings.

[71:48]

Very simple add documents and the

[71:50]

embeddings to the vector store. And here

[71:52]

you can see if length of documents is

[71:53]

not equal to length of embeddings. Here

[71:55]

you can actually see this. Now we are

[71:57]

preparing the data for ChromaDB. We

[71:59]

require ids, metadata, document text and

[72:01]

embedding list. So now whatever

[72:04]

documents I have over here. Whatever

[72:06]

documents I'm getting, I will be zipping

[72:09]

it, which means I'm creating tuples with

[72:11]

embeddings, and then I am creating a UUID.

[72:14]

Why do I require a UUID? Because it's

[72:17]

just like a id for a specific record,

[72:21]

right? And that will be my doc id. Okay,

[72:24]

doc id variable and I'm appending it

[72:25]

over there. Then we are preparing the

[72:28]

metadata. Whatever doc metadata we get.

[72:31]

Remember we are iterating through this

[72:32]

documents. So we have all the

[72:34]

information. So that all metadata we are

[72:36]

putting it over here. doc_index, content_length:

[72:39]

we are just adding some more

[72:40]

metadata information to put it inside my

[72:43]

vector db. Then we get the document

[72:46]

content from doc.page_content.

[72:49]

And we also get the embedding where we

[72:51]

are converting this embedding to list.

[72:53]

Okay. See two information is basically

[72:55]

required right over here. If you see uh

[72:58]

from this particular function one is

[73:00]

embedding which is my np.ndarray, right,

[73:02]

and this embedding is coming from where

[73:04]

from the previous function right

[73:06]

generate embeddings where we have done

[73:07]

it. So it's all linkage. See the reason

[73:10]

of creating this particular in the form

[73:12]

of class because I want to link each and

[73:14]

every pipeline right. So here we are

[73:16]

writing embeddings_list.append of

[73:17]

embedding.tolist(). So we have the

[73:19]

page content, we have this list. So what

[73:21]

I'm doing I'm adding that entirely in

[73:24]

the collection. So for this we require

[73:26]

ids, we require the embeddings list, we

[73:28]

require metadata, we require document

[73:30]

text. So whatever we have prepared,

[73:33]

we're just adding it over here based on

[73:35]

the parameters, right? And finally

[73:37]

you'll be able to see how many

[73:38]

number of documents has been inserted.
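
A hedged sketch of the vector store class walked through above, using ChromaDB's `PersistentClient`, `get_or_create_collection`, `add` and `count`. The default directory `../data/vector_store`, the collection description text, the ID format, and the optional `client` argument are assumptions for illustration; the video's exact values may differ.

```python
import os
import uuid


class VectorStore:
    """Manages document embeddings in a persistent ChromaDB collection."""

    def __init__(self, collection_name="pdf_documents",
                 persist_directory="../data/vector_store", client=None):
        self.collection_name = collection_name
        self.persist_directory = persist_directory
        self.client = client  # normally None; created in _initialize_store()
        self.collection = None
        self._initialize_store()

    def _initialize_store(self):
        if self.client is None:
            import chromadb  # lazy import; only needed for a real store
            os.makedirs(self.persist_directory, exist_ok=True)
            self.client = chromadb.PersistentClient(path=self.persist_directory)
        self.collection = self.client.get_or_create_collection(
            name=self.collection_name,
            metadata={"description": "PDF document embeddings for RAG"},
        )
        print(f"Collection: {self.collection_name}, "
              f"existing documents: {self.collection.count()}")

    def add_documents(self, documents, embeddings):
        if len(documents) != len(embeddings):
            raise ValueError("Number of documents must match number of embeddings")

        ids, metadatas, texts, vectors = [], [], [], []
        for i, (doc, embedding) in enumerate(zip(documents, embeddings)):
            ids.append(f"doc_{uuid.uuid4().hex[:8]}_{i}")  # unique record ID
            metadata = dict(doc.metadata)
            metadata["doc_index"] = i
            metadata["content_length"] = len(doc.page_content)
            metadatas.append(metadata)
            texts.append(doc.page_content)
            # NumPy rows are converted to plain lists, as in the walkthrough.
            vectors.append(embedding.tolist() if hasattr(embedding, "tolist")
                           else list(embedding))

        self.collection.add(ids=ids, embeddings=vectors,
                            metadatas=metadatas, documents=texts)
        print(f"Total documents in collection: {self.collection.count()}")
```

Because the client is persistent, the collection is written under the given directory and can be reopened in a later session with the same path and collection name.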

[73:40]

Now quickly let's go ahead and

[73:43]

initialize.

[73:47]

Let's go ahead and initialize my vector

[73:49]

store. So I'll write vector store is

[73:51]

equal to

[73:55]

uh vector

[73:57]

store and I'll initialize this. Okay. So

[74:01]

quickly I will go ahead and write vector

[74:03]

store. So now this is basically going to

[74:06]

initialize the entire vector store

[74:08]

itself. Right. So here you can see this

[74:10]

is my collection name and existing

[74:12]

document in collection is zero since we

[74:14]

did not add any number of records. Okay.

[74:17]

Now, if we want to add any number of

[74:19]

records, we have to call this function

[74:22]

add documents, right? So, let's uh go

[74:25]

ahead and do that and let's call it.

[74:26]

Okay. Now, first of all, uh you know

[74:29]

that I have already done the splitting

[74:30]

of the chunks, right? So, here if you go

[74:33]

ahead and see this, this is my split

[74:36]

chunks, right? Uh sorry, that was the

[74:38]

variable. Let's see which variable it

[74:40]

has got saved. Okay, it should be

[74:42]

chunks,

[74:44]

right? So these are my chunks right

[74:47]

now chunks what I am actually going to

[74:49]

do is that I will extract all the text

[74:52]

from that particular chunk and we'll

[74:54]

generate an embedding. Okay. So for that

[74:56]

what I will do I will say I will put a

[74:59]

list comprehension. So here now let's

[75:03]

convert

[75:05]

the

[75:08]

text to embeddings. Okay we're going to

[75:11]

go ahead and do this. And here we are

[75:13]

basically going to write

[75:16]

chunks.

[75:18]

First of all, I'll iterate. Okay, I will

[75:21]

say that hey for doc in chunks.

[75:25]

Okay, and we are just going to take this

[75:28]

doc.page_content. Okay, so we are

[75:31]

going to take all this page content and

[75:34]

basically go ahead and create my text

[75:37]

variable. Okay. So once I go ahead

[75:39]

and do this, you should be able to see

[75:41]

this is my text, right? All the text

[75:44]

that I have and this text I will pass it

[75:46]

to my embedding manager, right?

[75:49]

Embedding manager which I have actually

[75:51]

created. So what I will do quickly, I

[75:53]

will just go ahead and execute this once

[75:55]

again. I have all my text.

[75:58]

Okay, I have all my text. Now from this

[76:01]

we will go ahead and generate the

[76:04]

embeddings. Now once we generate the

[76:06]

embedding how do we generate the

[76:07]

embeddings very simple we use this

[76:11]

embedding manager which object we have

[76:13]

actually created what object we have

[76:15]

created earlier if you see over here

[76:18]

this is my embedding manager right so we

[76:20]

are using this

[76:22]

embedding_manager.generate_embeddings, and here I have to

[76:23]

give the text in the form of a list list

[76:26]

of strings right so here quickly I will

[76:29]

call this particular function dot uh dot

[76:34]

generate generate

[76:38]

generate

[76:40]

underscore

[76:42]

embeddings. Okay.

[76:44]

And here you will be able to see that

[76:46]

I'll be giving my text. Then let's store

[76:51]

it in the vector database. So after

[76:55]

we convert that into an embedding, we

[76:57]

store everything in the vector database.
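
The two steps just described (extract the text of each chunk, embed it, then store chunks and vectors together) can be sketched as one small helper. The function name `embed_and_store` is hypothetical; it only assumes the embedding object has `generate_embeddings` and the store has `add_documents`, as in the walkthrough.

```python
def embed_and_store(chunks, embedding_manager, vector_store):
    """Embed the text of every chunk and store chunks plus vectors."""
    # List comprehension: pull the raw text out of each chunk.
    texts = [doc.page_content for doc in chunks]

    # One embedding vector per text, in the same order as the chunks.
    embeddings = embedding_manager.generate_embeddings(texts)

    # The store receives the full chunk objects (text + metadata) and vectors.
    vector_store.add_documents(chunks, embeddings)
    return len(texts)
```

Passing the whole chunk objects rather than only the text matters here, because the store also saves each chunk's metadata alongside its vector.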

[76:58]

Right? So here I will use vector store.

[77:02]

vector store the variable that we have

[77:05]

created dot add

[77:08]

documents and this is a small letter add

[77:13]

documents this is a function that we

[77:15]

have used and inside this if you

[77:16]

remember we have to give our

[77:19]

we have to give our entire

[77:23]

chunks

[77:25]

okay whatever embeddings we are

[77:28]

specifically applying okay so once we do

[77:31]

this

[77:33]

You can see this embeddings whatever we

[77:34]

have got and the chunks the documents

[77:37]

the entire documents we're going to do

[77:38]

this okay so let's quickly execute this

[77:41]

and I think now my embedding will happen

[77:43]

now you can see that for 359 text this

[77:46]

is happening and it has got converted

[77:48]

into so many number of batches

[77:51]

uh vector store is not defined why it is

[77:53]

not defined let's see what I have

[77:55]

defined over there okay it should be

[77:56]

vector store

[77:58]

so this should be the spelling of my

[78:00]

vector store instead of that. Okay. So

[78:03]

now let me quickly go ahead and execute

[78:05]

this. Now inside that same vector store

[78:08]

it'll get it'll get executed. Okay

[78:13]

perfect. Now you can see that the total

[78:15]

document in the collection is 359. So if

[78:17]

you see over here, uh, inside my

[78:22]

notebook folder, inside my data folder, here

[78:24]

there is something called as vector

[78:25]

store, and we have persisted it

[78:27]

over here, right? So persistent basically

[78:29]

means that now it is saved on

[78:32]

the hard disk. We can just

[78:34]

load it from disk and we can probably

[78:36]

go ahead and execute anything as such.

[78:38]

Okay. Now perfect. Now you can see that

[78:41]

we have completed this entire pipeline.

[78:43]

Now we have all the data available over

[78:46]

here in the vector store DB right in the

[78:48]

form of vectors.

[78:51]

But now the main thing is that how do we

[78:53]

perform the retrieval? Because retrieval

[78:55]

see in retrieval what happens is that

[78:58]

whenever we have a user query we have to

[79:02]

take this query we have to convert that

[79:04]

into embeddings again okay and then we

[79:09]

basically go ahead and hit the vector

[79:10]

store in the form of a retriever and

[79:12]

then only we get the context. So in our

[79:15]

example first of all we'll try to get

[79:17]

till here. Okay, we have a user query.

[79:20]

We convert that query into embeddings.

[79:23]

Then we hit this particular vector store

[79:25]

and we get the context. So let's go

[79:26]

ahead and create this specific pipeline

[79:28]

now. Okay. And for this pipeline, we

[79:31]

will try to create a rag retriever.

[79:33]

Okay. So we will try to create a rag

[79:35]

retriever. So let's quickly go ahead and

[79:38]

do that particular thing. Till now we

[79:40]

have created all the amazing pipelines.

[79:42]

We have created this embedding manager.

[79:45]

Now we also have this vector store. Now

[79:47]

what I will do is that I'll create

[79:48]

another pipeline which will be a rag

[79:50]

retriever. Okay, just to get the

[79:52]

specific context. So let's go ahead and

[79:54]

discuss about that. So guys, now let's

[79:56]

go ahead and create the rag retriever

[79:59]

pipeline. So first of all, what we are

[80:00]

going to do is that I will go ahead and

[80:02]

create a class which is called as rag

[80:04]

retriever. Now this rag retriever class

[80:07]

you will be able to see that it handles

[80:09]

query based retrieval from the vector

[80:11]

store. So inside the constructor we will

[80:14]

be giving two important parameters.

[80:16]

One is the vector store and one is the

[80:19]

embedding manager. And if you remember

[80:21]

we have created both this. We have

[80:23]

created the embedding manager. We have

[80:24]

created the vector store manager. Right

[80:27]

now after giving this we will be

[80:29]

initializing two instance variables, that is

[80:32]

vector store and embedding manager and

[80:33]

we'll be assigning with this. Now

[80:37]

whenever we create a retriever one thing

[80:39]

you really need to understand this

[80:40]

retriever is actually built on the top

[80:42]

of a vector store and retriever is

[80:45]

nothing but it is a simple interface

[80:47]

based on whatever query we get this

[80:49]

retriever is just going to give you the

[80:50]

response back. Okay and this retriever

[80:53]

is basically a kind of interface which

[80:55]

is connected to the vector store

[80:56]

as such. Okay. Now uh the next step that

[81:00]

we are going to create is another

[81:02]

function which will be called as

[81:03]

retrieve function. Now this is really

[81:05]

important because this retrieve function

[81:07]

main work is to retrieve based on a

[81:11]

specific query. So let me go ahead and

[81:13]

define the specific function.

[81:16]

Now this function again see to write it

[81:18]

will definitely take a lot of time. So

[81:20]

we will try to understand this

[81:21]

particular function. Okay. So here a

[81:24]

retrieve function you can see we are

[81:25]

giving the query, we are giving top k

[81:27]

results, how many top k results we

[81:29]

want and there is also a threshold

[81:31]

value. By default it is 0.0, and

[81:34]

this function is basically going to

[81:35]

return a list of results. Okay, so here

[81:38]

you can see: retrieve relevant documents

[81:40]

for a query arguments are the search

[81:43]

query, top K documents and score

[81:45]

threshold and it returns a list of

[81:47]

dictionaries containing the retrieved

[81:48]

documents and metadata. At the end of

[81:51]

the day this function is actually going to help

[81:53]

us to get this specific context.

[81:56]

So you'll be able to see over here we

[81:59]

are using that same self embedding

[82:01]

manager and we are calling this generate

[82:03]

embedding function. Now if you remember

[82:05]

this generate embedding function is

[82:07]

already defined in my embedding manager

[82:09]

right. So if I go on the top so here is

[82:13]

my generate embedding function and this

[82:15]

is nothing but this is basically uh

[82:17]

you're just using model.encode and

[82:19]

you're giving the text and it is

[82:20]

converting into embeddings. Yeah. So

[82:23]

that is the reason we are basically

[82:25]

using this because at the end of the day

[82:27]

first of all whenever we get a query

[82:31]

right so let me go down over here inside

[82:34]

this retrieve whenever we give this

[82:35]

query first the query needs to be

[82:38]

converted into an embedding, right? So

[82:40]

this query that is given we need to

[82:42]

apply embedding for this also so that we

[82:44]

can do a similarity search in the

[82:47]

retriever itself right so the first the

[82:49]

query is basically converted into a

[82:51]

vector by the help of embedding manager

[82:54]

dot generate embeddings function.
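To make this step concrete, here is a minimal sketch of why the query has to go through the same embedding step as the documents. The toy_embed function and its tiny vocabulary are hypothetical stand-ins for the real embedding model (the walkthrough uses model.encode inside the embedding manager); only the flow is the point, not the embedding quality.

```python
import math

# Hypothetical stand-in for the real embedding model: counts how often each
# vocabulary word appears in the text. A real pipeline would call the
# embedding manager instead.
VOCAB = ["attention", "query", "vector", "store", "chunk"]


def toy_embed(text):
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


documents = [
    "attention maps a query to an output",
    "a vector store keeps every chunk",
]
# Documents and query must be embedded by the same function,
# otherwise their vectors are not comparable.
doc_vectors = [toy_embed(doc) for doc in documents]
query_vector = toy_embed("what is attention")

scores = [cosine_similarity(query_vector, vec) for vec in doc_vectors]
best_index = max(range(len(documents)), key=lambda i: scores[i])
print(documents[best_index])
```

Because both sides live in the same vector space, a plain similarity comparison is enough to rank the stored chunks against the query.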

[82:57]

Then we are going to use the vector

[82:59]

store dot collection and we are going to

[83:01]

use this dot query and here we are going

[83:04]

to give our query embedding which is

[83:06]

nothing but this embedding in the form

[83:07]

of a list and then we are also going to

[83:10]

give the top results. So by using this

[83:12]

this is basically going to hit the

[83:14]

vector DB, whichever vector DB we have

[83:16]

initialized and it is going to give you

[83:19]

the results. Once you get the results,

[83:22]

the results internally there will be a

[83:23]

key which is called as documents. Okay,

[83:26]

you can get document information, the

[83:28]

metadata information, the distance

[83:31]

information and some of the ids

[83:32]

information. So all the specific

[83:34]

information we are using it and here you

[83:38]

can see very similarly what we are doing

[83:40]

we are using all these parameters like

[83:42]

ID, documents, metadata and distance. We

[83:45]

are zipping it. Zipping it basically

[83:47]

means we are just trying to create a

[83:48]

tuple over here and then for every

[83:52]

values we are just trying to calculate

[83:54]

the similarity, right? One minus distance. One

[83:57]

minus distance will basically give you

[83:58]

the similarity score like how similar

[84:01]

those text data is basically coming up

[84:04]

out of this vector store. So we are

[84:06]

creating the similarity score and if the

[84:08]

similarity score is greater than the

[84:10]

threshold then what we do we basically

[84:12]

add this inside my text context

[84:14]

documents and context documents is

[84:16]

basically created in this particular

[84:18]

variable, which is nothing but retrieved

[84:20]

docs, which we have kept empty over

[84:22]

here. Okay. So all the information we

[84:25]

are just trying to add it over here so

[84:27]

that we'll be able to see it. Okay. And

[84:28]

finally we return those retrieved docs. So

[84:31]

if you see, step by step, we're not doing

[84:33]

anything very complex:

[84:36]

we are getting the user query we're

[84:37]

converting this into embeddings we are

[84:40]

hitting the vector store right then we

[84:42]

are getting the response okay once we

[84:44]

get the specific response that context

[84:46]

we are putting it in the form of a list

[84:49]

if you just go ahead and see the code

[84:50]

that is how things are happening okay so

[84:53]

this is one of the very important

[84:55]

function uh that you'll be able to see

[84:58]

now here what I can do is that I can

[85:00]

quickly go ahead and create a variable

[85:02]

called as rag retriever and I can call

[85:06]

this same class.
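The retriever described above can be sketched roughly as follows. This is a sketch under stated assumptions, not the exact code from the video: FakeEmbeddingManager, FakeCollection and FakeVectorStore are hypothetical stand-ins so the example runs without a real vector DB, the nested-list result layout follows the keys named in the walkthrough (ids, documents, metadatas, distances), and the default top_k=5 is an assumption.

```python
from typing import Any, Dict, List


class FakeEmbeddingManager:
    """Hypothetical stand-in: returns one small vector per input text."""

    def generate_embeddings(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(text)), 1.0] for text in texts]


class FakeCollection:
    """Hypothetical stand-in returning canned results, one inner list per query."""

    def query(self, query_embeddings, n_results):
        return {
            "ids": [["doc_0", "doc_1"][:n_results]],
            "documents": [["Attention maps a query...", "Unrelated text"][:n_results]],
            "metadatas": [[{"page": 3}, {"page": 9}][:n_results]],
            "distances": [[0.2, 0.9][:n_results]],
        }


class FakeVectorStore:
    def __init__(self):
        self.collection = FakeCollection()


class RAGRetriever:
    """Handles query-based retrieval from the vector store."""

    def __init__(self, vector_store, embedding_manager):
        self.vector_store = vector_store
        self.embedding_manager = embedding_manager

    def retrieve(self, query: str, top_k: int = 5,
                 score_threshold: float = 0.0) -> List[Dict[str, Any]]:
        # 1. Convert the query text into an embedding.
        query_embedding = self.embedding_manager.generate_embeddings([query])[0]

        # 2. Ask the vector store for the nearest stored chunks.
        results = self.vector_store.collection.query(
            query_embeddings=[list(query_embedding)],
            n_results=top_k,
        )

        # 3. Zip the parallel lists and keep hits that reach the threshold.
        retrieved_docs = []
        if results["documents"] and results["documents"][0]:
            rows = zip(results["ids"][0], results["documents"][0],
                       results["metadatas"][0], results["distances"][0])
            for doc_id, document, metadata, distance in rows:
                similarity_score = 1 - distance
                if similarity_score >= score_threshold:
                    retrieved_docs.append({
                        "id": doc_id,
                        "content": document,
                        "metadata": metadata,
                        "similarity_score": similarity_score,
                        "distance": distance,
                    })
        return retrieved_docs


rag_retriever = RAGRetriever(FakeVectorStore(), FakeEmbeddingManager())
hits = rag_retriever.retrieve("What is attention?", top_k=2, score_threshold=0.5)
print(hits)
```

With the canned distances 0.2 and 0.9, only the first chunk reaches a threshold of 0.5, so the second one is dropped before the list is returned.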

[85:09]

So if you see over here I will use this

[85:10]

same rag retriever over here

[85:14]

and let's give our vector store vector

[85:18]

store which I've defined it earlier

[85:20]

which is my vector store manager and

[85:22]

then my embedding manager.

[85:25]

Once I do this I should be able to see

[85:28]

this. Okay. uh it should be vector store

[85:30]

file right so now you'll be able to see

[85:34]

this is my rag retriever

[85:37]

rag retriever it is an object of this

[85:38]

now if I call this particular function

[85:41]

with a query right I can call dot

[85:43]

retrieve with a query so let's go ahead

[85:45]

and do this okay so here I will write

[85:48]

rag

[85:50]

retriever dot query sorry dot

[85:56]

retrieve is my function

[85:59]

Okay. So here you can see quickly this

[86:02]

is my function retrieve right and I need

[86:04]

to give a query. Now let's test for a

[86:07]

specific query. I'll say hey what is

[86:11]

attention is all you need because I know

[86:15]

inside my data there is a PDF file which

[86:19]

is called as attention or I have also

[86:22]

created some kind of proposal over here

[86:24]

embedding some files are there. So we'll

[86:26]

try to execute this. So here you can see

[86:30]

as soon as I asked what is attention is

[86:32]

all you need. Now it is giving me the

[86:34]

top K for all it is printing all the

[86:36]

information and it is generated

[86:38]

embedding for one text. Right? And the

[86:40]

shape is (1, 384) because I have used

[86:43]

the embedding model that is called

[86:45]

all-MiniLM-L6-v2, which creates 384 dimensions. Now

[86:49]

once we go ahead and apply this

[86:51]

particular function right this function

[86:53]

it is basically getting the results over

[86:55]

here and we are printing that same thing

[86:57]

right, and at the end of the day we

[87:00]

can also go ahead and return this

[87:01]

retrieve docs okay so in short this is

[87:05]

basically this function is going to give

[87:06]

me all the retrieve docs so this is the

[87:07]

retrieve docs you can see content

[87:09]

metadata author so these are my context

[87:12]

information so here you can see

[87:14]

attention function can be described as a

[87:16]

mapping a query as a set of this one and

[87:18]

this entire thing is basically

[87:19]

the context. So from this particular

[87:22]

diagram here you can see easily we are

[87:24]

able to get the context right and this

[87:26]

is nothing but this is your context. Now

[87:28]

let's try some more things. Okay I will

[87:31]

just go ahead and open some PDF. Okay.

[87:34]

[87:36]

this is some very new research paper

[87:39]

embedding technical report. Okay. Uh

[87:41]

we'll search for any topic over here. Uh

[87:44]

embedding model training. I'll just go

[87:46]

ahead and search for unified multitask

[87:48]

learning framework. Okay, because this

[87:50]

information also we have put it over

[87:51]

there. So here I'll go ahead and create

[87:55]

one more this one and I will copy this

[87:58]

entire code. Okay, quickly

[88:02]

and this is the query that I'm actually

[88:04]

going to give that is nothing but

[88:06]

unified

[88:10]

multitask learning framework. So

[88:12]

if I go ahead and execute this you can

[88:14]

see that I'm able to get this and then

[88:16]

you can see content benchmark ranking

[88:19]

over on both the leaders effective of

[88:21]

our approach. So we are able to get the

[88:23]

response very quickly, right?

[88:25]

and this response is basically coming

[88:26]

from the vector store right in a very

[88:30]

similar way very easy way uh we are able

[88:33]

to get the specific response over here

[88:35]

right and let me tell you right this is

[88:39]

the easiest way, like how things are

[88:41]

basically happening over here right now

[88:44]

uh what we can do is that see if you

[88:46]

know if you have created all these

[88:48]

things right till here you have created

[88:50]

now the further step is that you have to

[88:52]

just integrate LLM with the uh with this

[88:55]

specific context. Okay. Now for this LLM

[88:58]

with this specific context, what you can

[89:00]

do is that you can directly take this

[89:02]

particular context and give it to the

[89:03]

LLM and that is what we are going to see

[89:05]

in the next video. But in this

[89:07]

particular video, we saw the entire

[89:09]

thing, the complete RAG pipeline from

[89:11]

data ingestion to the vector DB

[89:13]

pipeline. Right now you can go ahead and

[89:15]

write any kind of queries and definitely

[89:18]

with all these information here you can

[89:19]

see similarity score is also coming up

[89:22]

right distance is also basically coming

[89:23]

up all the information you're putting it

[89:25]

over here and we have also used modular

[89:27]

coding right now in the next step what

[89:29]

I'll do I will take this vector store

[89:32]

and uh we will go ahead with the next

[89:34]

integration that is llm and output which

[89:36]

I will say it as a retrieval pipeline

[89:39]

but this entire data ingestion pipeline

[89:40]

with this uh query retrieval we have

[89:44]

actually created. Now the next two steps

[89:46]

will be this one, and after doing this we

[89:48]

will try to convert the same code

[89:50]

whatever code we have

[89:52]

basically written over here in the form

[89:54]

of modular coding right we'll try to see

[89:56]

that how we can put this inside our

[89:59]

source folder so here what I will do

[90:01]

we'll quickly create a source folder and

[90:04]

inside the source folder I will show you

[90:07]

that how we can take this entire

[90:09]

pipeline and how we can actually create

[90:11]

it in such a way that we have a kind of

[90:15]

pipeline over here right pipeline

[90:17]

basically means from data ingestion to

[90:19]

vector embedding how in a sequential way

[90:21]

we can actually go ahead and call it.

[90:23]

Hello guys so we are going to continue

[90:25]

the discussion with respect to rag. Uh

[90:27]

till now we have already discussed about

[90:29]

the entire data ingestion pipeline and

[90:32]

with the help of user query you know we

[90:34]

are also able to retrieve the context.

[90:37]

uh we have completely implemented this

[90:39]

first pipeline that is called as data

[90:41]

ingestion pipeline where we did the data

[90:43]

ingestion. We did the chunking, then

[90:46]

we converted the text into vectors and

[90:48]

after that you know uh we were able to

[90:51]

probably store everything inside a

[90:53]

vector DB and we also persisted in the

[90:56]

local directory so that we can always

[90:58]

read whenever we definitely want okay

[91:00]

based on a specific query. Now we are

[91:02]

going to go towards the second pipeline

[91:04]

that is the query retrieval pipeline

[91:07]

wherein we are also going to use LLM

[91:09]

with it. Okay. So here we are going to

[91:11]

specifically use LLM models and this LLM

[91:14]

models will actually help us to generate

[91:17]

a summarized output. Okay. In the rag.

[91:21]

So the entire pipeline will look

[91:22]

something like this. And uh when we talk

[91:25]

about this query retrieval pipeline, we

[91:28]

are specifically talking about something

[91:30]

called as augmented generation. Okay.

[91:34]

See, RAG basically means

[91:36]

retrieval augmented generation. And this

[91:39]

augmented generation how does it

[91:41]

specifically work? Okay. So let's

[91:43]

consider that this vector DB is already

[91:46]

ready and you know that how did I create

[91:49]

this particular vector DB? By following

[91:51]

this particular pipeline, right?

[91:54]

Now once we follow this pipeline the

[91:57]

data is stored inside the vector DB. Now

[92:00]

whenever a user gives a new query okay

[92:04]

it has a new query related to the

[92:06]

documents that are already ingested

[92:08]

inside the vector DB then what we do we

[92:11]

take up this query we apply the same

[92:13]

embedding and in this particular

[92:16]

embedding what we do we convert the

[92:18]

query to vectors

[92:21]

right and then from this particular

[92:23]

embedding we hit the vector DB we get

[92:26]

the context and then whatever context we

[92:30]

get along with the prompt engineering

[92:32]

like basically with a simple prompt we

[92:36]

give that instruction to the LLM right

[92:38]

so prompt is just like an instruction to

[92:39]

the LLM like how the LLM should

[92:41]

basically work now once we are doing

[92:44]

this right this this step is basically

[92:47]

called as augmentation

[92:50]

okay this step is basically called as

[92:52]

augmentation wherein we are giving we

[92:55]

are taking the context and along with

[92:56]

that we are also combining it with a

[92:58]

specific prompt

[92:59]

And finally you'll be able to see that

[93:01]

we'll generate the output from the LLM.

[93:03]

And this step is nothing but generation

[93:07]

right this is the retrieval step. So

[93:10]

here I have my retrieval step wherein we

[93:13]

are giving a query we're converting that

[93:15]

into vectors and we are hitting the vector

[93:16]

DB. So you really need to understand the

[93:19]

entire concept with respect to RAG.
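The three steps just described (retrieval, augmentation, generation) can be sketched as three small functions. Everything here is a simplified illustration: the keyword-overlap retrieve function is a toy stand-in for the embedding search against the vector DB, and echo_llm is a hypothetical placeholder for a real LLM call.

```python
def retrieve(query, knowledge_base):
    """Retrieval: pick stored chunks that share a word with the query.
    (Keyword overlap is a toy stand-in for embedding similarity search.)"""
    query_words = set(query.lower().split())
    return [chunk for chunk in knowledge_base
            if query_words & set(chunk.lower().split())]


def augment(query, chunks):
    """Augmentation: combine the retrieved context with an instruction."""
    context = "\n\n".join(chunks)
    return (
        "Use the context to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def generate(prompt, llm):
    """Generation: hand the augmented prompt to the LLM."""
    return llm(prompt)


def echo_llm(prompt):
    # Hypothetical stand-in for a real LLM call.
    return f"(answer based on {len(prompt)} prompt characters)"


knowledge_base = [
    "Attention maps a query to an output.",
    "Chunking splits documents.",
]
query = "what is attention"
chunks = retrieve(query, knowledge_base)
prompt = augment(query, chunks)
print(generate(prompt, echo_llm))
```

Keeping the three steps as separate functions mirrors the diagram: each one can be swapped out (a different vector DB, a different prompt, a different LLM) without touching the other two.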

[93:21]

Okay. So let's go ahead and implement

[93:24]

this entire retrieval uh query retrieval

[93:26]

pipeline along with the LLMs. Okay. Now

[93:28]

here we are also going to go ahead and set

[93:30]

up the LLM. So guys, now let's go ahead

[93:32]

and implement this uh with the help of

[93:35]

practical implementation. So here we are

[93:36]

going to integrate vector DB context

[93:38]

pipeline with LLM output. As suggested,

[93:42]

we are going to implement the augmentation

[93:44]

and generation. Now first of all

[93:46]

what we are going to do is that I'm going to

[93:48]

use my Groq API key. Okay, so I

[93:50]

have updated the Groq API key over here

[93:52]

in the .env file and, you know, here we

[93:56]

are going to probably go ahead and

[93:58]

create a simple rag pipeline. Okay, uh

[94:03]

with the Groq LLM. Okay, so first of all

[94:06]

what we are going to do is that uh again

[94:09]

uh if you remember in our

[94:11]

requirements.txt we will go ahead and

[94:13]

add these two libraries, which are

[94:15]

as langin-g

[94:17]

gro and then you have python.nv PNB okay

[94:20]

and then after this uh we will go ahead

[94:23]

and uh you know quickly initialize from

[94:25]

langchain_groq

[94:27]

import ChatGroq. Okay, along with

[94:30]

this I'm also going to go ahead and

[94:31]

import os, then from dotenv I'm going to use

[94:35]

load_dotenv so that we load

[94:39]

the entire environment variables then

[94:42]

the next thing is that we will go ahead

[94:43]

and initialize the Groq LLM and set your

[94:47]

Groq API key environment variable inside this.

[94:49]

Okay. And in order to do this again here

[94:52]

you'll be able to see that I'm using the Groq

[94:54]

API key from os.getenv, something like this.
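Reading the key from the environment can be sketched with the standard library alone. The variable name GROQ_API_KEY and the helper read_api_key are assumptions made for illustration; in the walkthrough the value comes from the .env file, which is loaded before this lookup happens.

```python
import os


def read_api_key(variable_name="GROQ_API_KEY"):
    """Return the key from the environment, failing loudly if it is unset."""
    key = os.getenv(variable_name)
    if not key:
        raise RuntimeError(f"{variable_name} is not set; check the .env file")
    return key


# Demo only: set a placeholder so the sketch runs on its own.
os.environ.setdefault("GROQ_API_KEY", "placeholder-key")
groq_api_key = read_api_key()
print(len(groq_api_key) > 0)
```

Failing early with a clear message makes a missing or misnamed variable easy to spot, which is the usual reason the lookup appears "not to work" the first time.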

[94:57]

Okay. If you just go ahead and call this

[94:59]

sometime uh my suggestion would be that

[95:01]

you don't directly call getenv.

[95:04]

Initially you can directly test it by

[95:06]

pasting the environment keys directly

[95:09]

over here. Okay. So here I will go ahead

[95:11]

and paste it. Otherwise you go ahead and

[95:14]

replace it. Just for testing purpose I'm

[95:16]

actually doing this. Now we'll go ahead

[95:17]

and initialize our LLM model, ChatGroq,

[95:20]

and here I will use groq_api_key is

[95:23]

equal to API, sorry, groq_api_key. Okay, and

[95:29]

then model name is Gemma 2, temperature I

[95:31]

will select it as 0.1 and maximum number

[95:34]

of tokens it will generate is 1024 okay

[95:36]

so this is my LLM. We have initialized the

[95:38]

Groq LLM. Now the second thing is that we

[95:41]

will quickly go ahead and create a

[95:44]

simple RAG function and this is

[95:49]

going to integrate everything from

[95:51]

retrieve context plus generate response

[95:54]

and if you remember guys here is my

[95:56]

retriever class from before. In the previous

[95:59]

session we have already seen how

[96:01]

this rag retriever was actually created

[96:02]

we created a class for that okay so here

[96:05]

uh we are going to probably take two

[96:07]

different parameters inside this we'll

[96:09]

first of all define a function called as

[96:11]

rag_simple and then here we are going to

[96:14]

go ahead and give our query

[96:16]

Then we are going to go ahead and give

[96:18]

our retriever

[96:20]

llm

[96:24]

top k is equal to three. Okay.

[96:28]

And then uh over here quickly let's go

[96:31]

ahead and first of all retrieve the

[96:34]

context. Yeah. So we're going to

[96:37]

retrieve the context. So here I'm going

[96:38]

to write results is equal to retriever dot

[96:42]

retrieve query. So here you have this

[96:44]

query and top_k is equal to top_k. Okay. And

[96:48]

then uh we are just going to get the

[96:50]

context or I'll go ahead and define my

[96:52]

context. Inside this context I will say

[96:54]

that hey whatever information I'm

[96:56]

getting from my results right just go

[96:59]

ahead and combine everything and put it

[97:02]

inside this. Right? So here I'm saying

[97:05]

that hey for doc in results whatever

[97:07]

content I'm getting I'm going to join it

[97:09]

with a uh double new line over here. If

[97:12]

results are empty, we are just

[97:14]

going to keep it as empty. So this is my

[97:16]

context over here, right? then uh I can

[97:19]

still go ahead and write one more

[97:21]

condition saying that hey if not context

[97:25]

okay we just going to go ahead and

[97:27]

return saying that no relevant context

[97:32]

found, okay, to answer the question. And

[97:36]

then we are going to generate the answer

[97:41]

using the Groq LLM. Okay, and now I'm just

[97:46]

going to go ahead and define prompt

[97:48]

obviously I require a prompt. If you

[97:50]

remember here I can again use a prompt

[97:55]

template also I can directly use a

[97:56]

prompt over here. So here with respect

[97:58]

to the prompt I will give a query saying

[98:01]

that hey this is what you really need to

[98:03]

do. You need to go ahead and answer this

[98:05]

specific question and you should

[98:07]

probably get a response for that. Right?

[98:10]

So here what I will do I will quickly go

[98:11]

ahead and paste it. Use the following

[98:13]

context. So here you can see use the

[98:15]

following context to answer the question

[98:17]

question concisely. Okay. And here

[98:21]

what we can basically do is that we can

[98:23]

just go ahead and um do one thing on

[98:26]

over here quickly. I'll say just put

[98:30]

tab. Okay. So use the following context

[98:32]

to answer the question uh precisely or

[98:35]

concisely. So here I have given the

[98:36]

context. Here I've given the query.

[98:38]

Okay. Now the next thing after this is

[98:40]

that we will go ahead and create a

[98:42]

response. So response is equal to this

[98:44]

time we are going to use llm dot invoke.

[98:47]

Okay. And here uh let's go ahead and put

[98:52]

something like prompt dot format.

[98:56]

And here we are going to write context

[98:59]

is equal to context

[99:02]

and here you have query is equal to

[99:05]

query whatever query I have. Okay. And

[99:08]

then we go ahead and return the response

[99:12]

dot content.
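Put together, the function dictated above looks roughly like this. The function body follows the steps in the walkthrough; StubRetriever and StubLLM are hypothetical stand-ins (not the real retriever or the Groq model) so the sketch can run without a vector DB or an API key, and the exact prompt wording beyond the first sentence is an assumption.

```python
from types import SimpleNamespace


def rag_simple(query, retriever, llm, top_k=3):
    # Retrieve the context chunks for this query.
    results = retriever.retrieve(query, top_k=top_k)
    context = "\n\n".join(doc["content"] for doc in results) if results else ""
    if not context:
        return "No relevant context found to answer the question."

    # Build the prompt and generate the answer.
    prompt = """Use the following context to answer the question concisely.
Context:
{context}

Question: {query}

Answer:"""
    response = llm.invoke(prompt.format(context=context, query=query))
    return response.content


class StubRetriever:
    """Hypothetical retriever that returns canned chunks."""

    def __init__(self, docs):
        self.docs = docs

    def retrieve(self, query, top_k=3):
        return self.docs[:top_k]


class StubLLM:
    """Hypothetical LLM whose invoke() returns an object with .content."""

    def invoke(self, prompt):
        return SimpleNamespace(content=f"stub answer ({len(prompt)} prompt characters)")


docs = [{"content": "An attention function maps a query to an output."}]
answer = rag_simple("What is attention mechanism?", StubRetriever(docs), StubLLM())
print(answer)
```

Any object with a retrieve method and any object with an invoke method returning something with a content attribute can be passed in, which is what makes the function easy to test in isolation.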

[99:15]

So once we do this uh then we can

[99:18]

specifically call this particular

[99:19]

function. Okay. So now what we are going

[99:21]

to do is that I will just go ahead and

[99:23]

write answer is equal to rag_simple and

[99:28]

let's say I go ahead and ask a question.

[99:32]

What is attention mechanism?

[99:36]

Okay. And here I need to give my rag

[99:38]

retriever along with the llm and then we

[99:41]

can go ahead and print the answer.

[99:45]

Okay. So here you can see attention

[99:47]

mechanism is a function that maps a

[99:49]

query in this right and we are able to

[99:51]

get the answer over here. This is really

[99:53]

good. See a very simple pipeline where I

[99:56]

have initialized my LLM model. I've

[99:58]

defined a function and then this

[100:00]

function what it is doing first of all

[100:01]

it is hitting the rag retriever retrieve

[100:04]

function. It is getting the context. it

[100:06]

is combining the context and along with

[100:07]

the prompt we are hitting the llm. So if

[100:09]

you remember, we are just

[100:11]

following this entire process and

[100:12]

generating a proper output right if that

[100:15]

particular output is available inside

[100:17]

the uh vector DB right now guys uh what

[100:21]

we are going to do is that we are going

[100:22]

to enhance the RAG pipeline, the simple

[100:25]

RAG pipeline that we have created over

[100:26]

here okay we'll enhance in such a way

[100:28]

that it will have more amazing features

[100:31]

in it okay so now we're going to go

[100:33]

ahead and create an amazing enhanced

[100:36]

RAG pipeline and this is the code. So

[100:38]

now you can see over here we have a

[100:39]

function called rag_advanced. I'm

[100:41]

giving a query, retriever, llm, top_k

[100:44]

elements like how many we want minimum

[100:46]

score, return_context is equal to False.
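A function with this signature might be sketched as follows. This is an illustration under stated assumptions, not the code shown on screen: the default values for top_k and min_score, the metadata key names source_file and page, the file name attention.pdf, and the choice of the best similarity score as the confidence are all assumptions, and StubRetriever and StubLLM are hypothetical stand-ins.

```python
from types import SimpleNamespace


def rag_advanced(query, retriever, llm, top_k=5, min_score=0.2,
                 return_context=False):
    # top_k=5 and min_score=0.2 are assumed defaults.
    results = retriever.retrieve(query, top_k=top_k, score_threshold=min_score)
    if not results:
        return {"answer": "No relevant context found.", "sources": [],
                "confidence": 0.0, "context": ""}

    context = "\n\n".join(doc["content"] for doc in results)
    sources = [
        {
            "source": doc["metadata"].get("source_file", "unknown"),
            "page": doc["metadata"].get("page", "unknown"),
            "score": doc["similarity_score"],
            "preview": doc["content"][:300],  # at most 300 characters
        }
        for doc in results
    ]
    # Assumption: report the best similarity score as the confidence.
    confidence = max(doc["similarity_score"] for doc in results)

    prompt = (
        "Use the following context to answer the question concisely.\n"
        f"Context:\n{context}\n\nQuestion: {query}\n\nAnswer:"
    )
    response = llm.invoke(prompt)

    output = {"answer": response.content, "sources": sources,
              "confidence": confidence}
    if return_context:
        output["context"] = context
    return output


class StubRetriever:
    """Hypothetical retriever returning one canned hit."""

    def retrieve(self, query, top_k=5, score_threshold=0.0):
        hits = [{
            "content": "An attention function maps a query to an output.",
            "metadata": {"source_file": "attention.pdf", "page": 3},
            "similarity_score": 0.8,
        }]
        return [h for h in hits if h["similarity_score"] >= score_threshold][:top_k]


class StubLLM:
    """Hypothetical LLM whose invoke() returns an object with .content."""

    def invoke(self, prompt):
        return SimpleNamespace(content="stub answer")


result = rag_advanced("What is attention mechanism?", StubRetriever(), StubLLM(),
                      top_k=3, min_score=0.1, return_context=True)
print(result["answer"], result["confidence"], result["sources"][0]["page"])
```

Returning a dictionary instead of a bare string is the main design change from the simple version: the caller can show the answer and still inspect where it came from.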

[100:48]

So here you can see that um before we

[100:51]

were simply like we were just combining

[100:53]

the context we are putting the

[100:55]

information in the prompt and we were

[100:56]

probably generating the response. In

[100:58]

this what we will do is that here we are

[101:01]

going to generate this entire pipeline

[101:03]

with some more additional features like

[101:05]

what all additional features we'll be

[101:07]

requiring. See here we are directly

[101:09]

getting the answers right but we do not

[101:12]

have much information about the source

[101:14]

about the context over here right. So

[101:16]

here what we are doing we will return

[101:17]

answers sources confidence score

[101:20]

and optionally the full context.

[101:22]

okay so first of all again the code will

[101:24]

be similar where we are retrieving the

[101:26]

context so this becomes my context when

[101:28]

we are retrieving it from retriever

[101:29]

retrieve and then uh I have written if

[101:32]

not results if results are empty we are

[101:34]

saying that no relevant context found

[101:36]

and here we are giving sources is blank

[101:38]

confidence is 0.0 and context is

[101:40]

blank. This context is basically coming

[101:42]

from the vector DB. Let's say that if we

[101:44]

are getting some kind of results over

[101:46]

here, we are combining all those results

[101:47]

and we are preparing the context over

[101:49]

here and then we are adding sources. See

[101:52]

this sources which is the list here we

[101:53]

are adding metadata information source

[101:56]

file right and along with that you can

[101:58]

see metadata page number from which page

[102:00]

number you are able to get then what is

[102:02]

the similarity score and here what I

[102:04]

will do is that I'll just try to go

[102:06]

ahead and you know display at least 300

[102:10]

um length of the content right so up to

[102:12]

300 characters we'll try to display and

[102:15]

then we are going through each and every

[102:16]

docs that is available inside this

[102:18]

results then we are going to calculate

[102:19]

the confidence uh we are actually

[102:22]

getting that information in this doc

[102:23]

similarity score. Here is my prompt. In

[102:27]

this prompt we are giving context query

[102:29]

each and everything and we are invoking

[102:31]

it and the output will be in this

[102:32]

format. So let's now go ahead and

[102:34]

execute this rag_advanced function. Here

[102:37]

I've given all the information like I've

[102:39]

asked what is the attention mechanism?

[102:41]

The rag retriever, like rag_retriever,

[102:44]

I've given over here, then llm, return_context

[102:46]

is equal to True, minimum score, all these

[102:48]

things are given, right? So now I'll go

[102:50]

ahead and execute this. Now as soon as I

[102:52]

ask what is attention mechanism here

[102:54]

you'll be able to see that I'm getting

[102:55]

this particular information right and it

[102:57]

is also giving me the source information

[102:59]

which page number, what is the

[103:00]

score and what is the preview

[103:02]

information along with that here is my

[103:04]

final information that you can see right

[103:07]

where we are displaying the first 300

[103:09]

characters let's say that I go ahead and

[103:12]

change my question okay I I ask

[103:15]

something else I'll say hey u attention

[103:18]

mechanism was one of the thing but if I

[103:20]

go ahead see my data, my PDFs. Okay, I

[103:25]

will go ahead and ask something else.

[103:27]

Okay, let's see what I can ask. So, I'll

[103:29]

go to embeddings PDF. I'll say okay. And

[103:33]

then let me search something else,

[103:35]

right? I will say hard negative. I'll

[103:38]

ask this question hard negative mining

[103:41]

techniques. Okay, so I will go to my

[103:46]

question over here.

[103:50]

hard

[103:51]

negative

[103:53]

mining techniques.

[103:56]

Okay.

[104:00]

And I'll go ahead and search this thing

[104:03]

from my vector retriever. So here you

[104:05]

can see that I'm able to get this entire

[104:06]

information. The test is several

[104:08]

hardcand

[104:10]

embeddings NV retriever all these

[104:13]

information and again you can see that

[104:14]

embedding.pdf, page 4. I'm able to see

[104:17]

all the information along with the

[104:18]

context right so this is uh really

[104:21]

amazing and here we have just created an

[104:23]

enhanced RAG pipeline. Why do we call this an

[104:24]

enhanced RAG pipeline? Because here we are

[104:27]

providing information related to answers

[104:30]

we are providing information related to

[104:32]

confidence score and each and everything

[104:34]

now let me just show you one more

[104:36]

amazing way and this is also an advanced

[104:39]

RAG pipeline but this time I will tell

[104:41]

you to probably go through this

[104:42]

particular code and tell me so here what

[104:44]

are we doing? We're doing streaming,

[104:45]

citation, history and summarization. So

[104:48]

all these things we have included over

[104:49]

here and uh you can just go and search

[104:52]

for this and you can see the answer.

[104:53]

Okay, final answer: no relevant context found,

[104:56]

because that question may not be there.
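One possible reason for an empty result, besides the question not being covered, is the distance metric. This is an assumption, not something the video confirms. The 1 - distance score only behaves like a cosine similarity when the store returns cosine distance. If it returns squared Euclidean distance instead, then for unit-length vectors the distance equals 2 × (1 − cosine similarity), so 1 - distance drops below zero for moderately similar chunks and any positive threshold filters them out. A small check in plain Python:

```python
import math


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def cosine_similarity(a, b):
    # Inputs are assumed to be unit-length already.
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = normalize([1.0, 3.0])
b = normalize([1.0, 0.0])

cos = cosine_similarity(a, b)
distance_l2 = squared_l2(a, b)

# Same pair of vectors, two different "1 - distance" scores.
score_from_cosine_distance = 1 - (1 - cos)
score_from_squared_l2 = 1 - distance_l2
print(round(score_from_cosine_distance, 3), round(score_from_squared_l2, 3))
```

If the scores printed by the retriever look negative or unexpectedly small, checking which distance the collection was created with is a reasonable first step before changing the threshold.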

[104:58]

Okay, I will just or let me just change

[105:01]

this minimum score to 0.1. I think we

[105:03]

should be able to get something. Still

[105:05]

nothing. Uh let me change the

[105:09]

question. Let's say hard negative mining

[105:11]

techniques. And here we are just going

[105:14]

to go ahead and display this particular

[105:16]

output. Okay. So now you just go ahead

[105:19]

and explore this. Okay. I'll keep this

[105:21]

for you at least see some kind of

[105:23]

coding. Okay. So here uh we are not able

[105:25]

to get anything as such. Uh let's see

[105:28]

advanced rack query hard query to top

[105:30]

querying summarize equal to true. Uh no

[105:34]

relevant this one. Let's say that I go

[105:36]

ahead and ask what is

[105:39]

what is

[105:41]

attention

[105:43]

is all you need. Okay, I'll go ahead and

[105:47]

execute it. So here you can see that I'm

[105:50]

able to see all these particular answers

[105:51]

over here. Right. Yeah, for some of the

[105:54]

queries it is not giving results;

[105:57]

there may be some problem with respect

[105:59]

to the context size but it's okay. You

[106:01]

can try out with different different

[106:02]

things. If it if something is not coming

[106:04]

then we'll try to optimize that also as

[106:06]

we go ahead we'll try to see this. So

[106:08]

here we have seen three amazing RAG

[106:10]

pipelines. One was a simple RAG

[106:11]

pipeline. Here was an enhanced RAG

[106:13]

pipeline. And here uh in the last one we

[106:16]

have made sure to put streaming citation

[106:18]

and history and summarization with all

[106:19]

this kind of information over here. You

[106:21]

just go ahead and check it out all the

[106:23]

information and just see the code. I

[106:25]

think you should be able to understand

[106:26]

it. So overall uh if you see I hope you

[106:30]

were able to understand this particular

[106:32]

video

[106:34]

and yeah, this was about the RAG

[106:36]

pipeline. Now in the upcoming videos

[106:37]

what we will do is that we will try to

[106:40]

create some modular coding because see

[106:42]

here everything is basically

[106:45]

created in one .ipynb file. So guys, now it's

[106:48]

time that we implement the entire RAG

[106:51]

pipeline in the form of a modular

[106:53]

structure. Already in our notebook we

[106:55]

have seen the PDF loader .ipynb file,

[106:57]

you know, wherein we discussed how to

[107:00]

probably go ahead and create the entire

[107:01]

data ingestion and how to probably store

[107:04]

all the information into the vector DB

[107:06]

and finally you're also able to make the

[107:08]

query right along with that uh I have

[107:10]

also shown you how to work with

[107:11]

Typesense, which was an open-source

[107:13]

vector store itself which was also again

[107:16]

amazing for searching anything in a

[107:19]

quicker way right now all the kind of

[107:22]

implementation that we have done what we

[107:23]

are going to do is that I'll try to show

[107:25]

you how in a modular way you can go

[107:27]

ahead and integrate this in a form of a

[107:28]

pipeline. Okay. So already we have this

[107:31]

source folder. Now inside this source

[107:32]

folder, what I am actually going to do

[107:34]

is that I'll go ahead and create

[107:35]

my __init__.py

[107:39]

file. And after creating this particular

[107:42]

file, what is the next step is that I

[107:44]

will go ahead and create all my

[107:46]

components important components that

[107:48]

will be required in order to create your

[107:51]

uh RAG pipeline. The first important

[107:53]

component is nothing but data

[107:56]

loader. Right? The data_loader.py file.

[108:00]

Right? So this will be my first

[108:02]

component because initially we need to

[108:04]

load the document. We need to do the

[108:05]

chunking and then we need to probably go

[108:07]

ahead and store it into the vector

[108:09]

store. Right? So inside my data loader

[108:11]

you know I I will just try to go ahead

[108:13]

and read all the documents uh that is

[108:16]

actually required. Okay. Then uh after

[108:19]

this uh the next step should be your

[108:21]

vector store. Right? Now the vector

[108:23]

store what vector store we are basically

[108:25]

going to use. Uh so for that I will be

[108:27]

creating my another file. So here inside

[108:29]

my source I will go ahead and create one

[108:32]

more file which is called

[108:34]

vectorstore.py. Okay. So this is my next file

[108:38]

that is basically created. Okay. uh

[108:40]

along with this uh while while actually

[108:43]

inserting anything into the vector store

[108:45]

I also need to probably go ahead and do

[108:46]

some kind of embeddings right and uh I

[108:49]

will try to show you some open source

[108:51]

embeddings that we are going to use. So

[108:52]

for that I'll be creating my

[108:55]

embedding.py file and finally uh the last file

[108:58]

that I really want to create is

[108:59]

something called search.py. Now my

[109:01]

entire RAG pipeline needs to be

[109:03]

integrated in such a way that there

[109:05]

should be a linkage between all the

[109:07]

specific files. Now the first case is

[109:10]

that I will go ahead and start working

[109:12]

on data loader. Now you know data loader

[109:14]

work is nothing but it should be reading

[109:16]

this particular data. Okay, it can be

[109:18]

from any source itself. Um we will try

[109:21]

to read this specific data itself.

[109:23]

Right? So for this what I'm actually

[109:24]

going to do is that I'll go ahead and

[109:26]

import some of the libraries. So quickly

[109:29]

I will go ahead and import these all

[109:31]

libraries like uh PyPDFLoader,

[109:34]

TextLoader and all. Okay. So I'll start

[109:36]

working on this because I need to form a

[109:38]

pipeline itself right. So inside this

[109:40]

particular file my main code should be

[109:42]

in such a way that I will go ahead and

[109:45]

read all the documents let it be of a

[109:47]

PDF, text file or CSV. Okay here I'm

[109:50]

also going to give you some of the

[109:51]

assignments because uh in this entire

[109:53]

series of videos we have discussed about

[109:55]

this. Okay. So quickly what I'm actually

[109:58]

going to do is that I will go ahead and

[109:59]

create one function which is basically

[110:02]

called load_all_documents. Now see

[110:05]

this. Okay. So here I'm just going to go

[110:07]

ahead and write this function. Now

[110:09]

please have a look onto this particular

[110:11]

function. This function

[110:13]

definition is load_all_documents.

[110:17]

I'm giving it the data directory.

[110:19]

This should be in the form of string

[110:20]

format and it is returning list right

[110:23]

list of anything right of any kind of

[110:25]

data type. Now the main important thing

[110:28]

about this function is that it loads all

[110:29]

supported files from the data directory

[110:31]

and converts them to the LangChain document data

[110:33]

structure because as soon as we read any

[110:36]

kind of data like PDF, CSV, TXT, right?

[110:39]

We need to probably go ahead and convert

[110:40]

that into a LangChain document structure,

[110:43]

then only we'll be able to apply the

[110:44]

chunking. Okay. So here you can actually

[110:47]

see that I have used data path uh of the

[110:51]

data directory itself. the data

[110:53]

directory I will be giving in the

[110:54]

runtime and obviously by just seeing

[110:56]

this the data directory is nothing but

[110:58]

data itself. Okay. Now this is the code

[111:01]

specifically to read all the PDF files.

[111:04]

Okay. So here I have created a list

[111:06]

documents which will be storing all the

[111:08]

documents itself. Uh here we have used

[111:11]

the data_path.glob function and here

[111:15]

I have used this pattern this kind of

[111:17]

wildcard pattern to match all the PDF

[111:19]

files. So what it will do is that inside

[111:22]

this data directory it will start

[111:24]

looking for all the PDF files. So inside

[111:26]

this you know that in the inside my PDF

[111:28]

folder there are some PDF files. So it

[111:30]

is going to go ahead and read all these

[111:31]

particular PDF files. Okay. So once it

[111:34]

reads the PDF files uh we will be having

[111:36]

those PDF files over here in the form of

[111:38]

a list. Okay. Then what we are doing we

[111:41]

are writing for pdf_file in pdf_files. We

[111:43]

are going through every PDF and then we

[111:45]

are using PyPDFLoader to read the

[111:48]

content inside this and we are using

[111:50]

loader.load and finally I get all the

[111:52]

information over here and we are going

[111:54]

to extend the documents list with it. Now this is

[111:57]

just an example of PDF files right now.

[111:59]

Same thing you can also do over here for

[112:02]

text files. Okay, text files. You can

[112:05]

also do it for CSV files. Right? See

[112:08]

similar kind of code is basically

[112:10]

suggested by GitHub copilot. But I

[112:12]

really want to give you an assignment.

[112:14]

Okay. So this will be for CSV file. This

[112:16]

can be for SQL files. Any kind of files

[112:19]

that you really want to work with. you

[112:21]

can go ahead and write that particular

[112:23]

code and keep on appending inside this

[112:26]

particular documents. Okay. So as soon

[112:28]

as you do that automatically you'll be

[112:30]

able to do this specific stuff and

[112:32]

you'll be able to get all the documents.
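The loading step described above can be sketched roughly as follows. This is a minimal stand-in under stated assumptions, not the code shown in the video: it uses only the standard library, the `read_text_file` helper and the `LOADERS` table are hypothetical names, and documents are plain dicts rather than LangChain Document objects. In the video the PDF branch uses PyPDFLoader and `loader.load()`; adding CSV or SQL support (the assignment) would mean registering one more loader per file type.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List

DocumentLike = Dict[str, Any]


def read_text_file(path: Path) -> List[DocumentLike]:
    """Hypothetical stand-in loader: one document-like dict per text file."""
    return [{
        "page_content": path.read_text(encoding="utf-8"),
        "metadata": {"source": str(path)},
    }]


# Maps a file suffix to the loader that handles it (assumed layout).
# A PDF or CSV loader would be registered here in the same way.
LOADERS: Dict[str, Callable[[Path], List[DocumentLike]]] = {
    ".txt": read_text_file,
}


def load_all_documents(data_dir: str, loaders=LOADERS) -> List[DocumentLike]:
    """Recursively find every supported file under data_dir and load it."""
    data_path = Path(data_dir).resolve()
    documents: List[DocumentLike] = []
    for suffix, loader in loaders.items():
        # "**/*.txt" matches files in data_dir and in all of its subfolders.
        for file_path in sorted(data_path.glob(f"**/*{suffix}")):
            documents.extend(loader(file_path))
    return documents
```

Keeping the loaders in a table means the function body does not change when a new file type is added, and the function returns the collected list so the caller can print or chunk it.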

[112:34]

Okay. Now what I will do just to test it

[112:37]

out whether my PDF files is working fine

[112:39]

or not. I will just go ahead and create

[112:41]

one app.py file over here. Okay. Now

[112:44]

inside this app.py file let me go ahead

[112:47]

and import some of the libraries. So

[112:49]

first of all I need to read everything

[112:51]

over here right. So I have written from

[112:54]

src.data_loader import

[112:56]

load_all_documents. So this load_all_documents is

[112:58]

nothing but this is the same function

[112:59]

that is present inside my

[113:01]

data_loader.py. Okay. And then from src.vectorstore,

[113:04]

FaissVectorStore and

[113:07]

RAGSearch I will create in the later

[113:08]

stages. So right now I'll remove this.

[113:10]

Okay. Now let's try to test the example.

[113:13]

So example usage I will write if

[113:17]

__name__ ==

[113:19]

"__main__", okay, and then here I will go ahead

[113:22]

and write docs is equal to

[113:25]

load_all_documents and I'll give my data folder

[113:28]

okay data folder then what I can

[113:31]

actually do is that I can just go ahead

[113:33]

and print my docs okay

[113:37]

if you see inside this data loader what

[113:39]

this is returning right now it is not

[113:41]

returning anything so what you can

[113:43]

actually do do is that from here so here

[113:45]

what we are going to do is that we are

[113:47]

going to return the specific documents

[113:48]

over here so that we should be able to

[113:50]

print that particular documents over

[113:51]

here right now what I am quickly going

[113:54]

to do is that I will just go ahead and

[113:56]

write open command prompt okay and here

[114:00]

I'm going to go ahead and write python

[114:03]

app.py. Now let's see whether it'll be

[114:06]

able to read the uh pdf files or not now

[114:09]

here you can see it has found four pdf

[114:11]

files, all the PDF file paths are over here

[114:14]

and you are able to see that it is also

[114:16]

able to see all the content that is

[114:18]

available inside that particular

[114:19]

documents which is good right and this

[114:22]

is basically in the form of a document

[114:24]

data structure I guess yeah so all the

[114:26]

information is basically happening so

[114:28]

that basically means so clearly I can

[114:31]

see something really amazing over here

[114:33]

is that my entire data the PDF code that

[114:37]

we have written is working absolutely

[114:39]

fine okay now uh comes the next step.

[114:42]

Now the next step you should probably

[114:44]

start thinking whether we should

[114:46]

basically go ahead and work with

[114:47]

embedding, and also do the chunking and

[114:50]

all right so here uh I will go ahead and

[114:52]

start working on embedding now inside my

[114:55]

embedding what we are going to do is

[114:57]

that I'll be importing these libraries

[114:59]

now these all are same thing repeated

[115:01]

but here I'm using classes and function

[115:04]

definition so here you can see that

[115:06]

after reading all the documents after

[115:08]

loading all the documents I'm going to

[115:10]

use SentenceTransformer and

[115:11]

RecursiveCharacterTextSplitter, and here you can

[115:14]

see I've defined a function uh class

[115:16]

called EmbeddingPipeline, right? The

[115:18]

model that I'm going to use is

[115:20]

all-MiniLM-L6-v2, chunk size is nothing but

[115:24]

1,000 and chunk overlap is nothing but

[115:26]

200. Then here we are writing

[115:30]

self.chunk_size, self.chunk_overlap and

[115:32]

then we are also initializing the

[115:34]

SentenceTransformer. Now in the next

[115:36]

function that we are going to go ahead

[115:38]

and do is nothing but uh we are going to

[115:41]

go ahead and create a function which is

[115:42]

called chunk_documents. Now inside

[115:45]

this chunk_documents we are giving the

[115:48]

documents which can be a list of any

[115:49]

documents. Here we are applying

[115:51]

RecursiveCharacterTextSplitter based

[115:53]

on all these values that we have

[115:54]

initialized. Along with this we have

[115:57]

also used different different separators

[115:59]

if you're interested or you can directly

[116:00]

use this blank separator. Okay. Then you

[116:04]

can see that I am also using the

[116:06]

splitter.split_documents over here

[116:09]

and then you will be able to see the

[116:10]

remaining chunks over here itself. Okay.
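To make the chunk size and chunk overlap settings concrete, here is a simplified sketch. It is not RecursiveCharacterTextSplitter: it cuts on fixed character positions only and ignores separators, and the function name `chunk_text` is an assumption. It only illustrates how a window of 1,000 characters with an overlap of 200 moves through a text.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

With the defaults, each window starts 800 characters after the previous one, so the last 200 characters of one chunk are repeated at the start of the next. The real splitter additionally tries to break on paragraph, line and word boundaries before falling back to a hard cut.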

[116:13]

Now this is for uh any document that I

[116:15]

pass inside this particular function

[116:17]

right but one thing is very important is

[116:20]

that because after the chunking is done

[116:23]

right you need to also convert that

[116:24]

chunks into vectors with the help of

[116:26]

this particular model. So for that I

[116:29]

will be creating one more function which

[116:30]

is called embed_chunks, right? So

[116:33]

here what I will be doing is that I'll

[116:36]

create this particular function called

[116:37]

embed_chunks. Here we will take these

[116:39]

chunks. So what happens is that first

[116:42]

load_all_documents will be called,

[116:43]

right? After that chunk_documents

[116:45]

will be called wherein all these

[116:47]

documents will be chunked. Then all the

[116:49]

chunks will be passed through our model

[116:52]

to probably convert that into a vector

[116:55]

embeddings. Right? So here you'll be

[116:57]

able to see self.model.encode.

[116:59]

So show_progress_bar is equal to True.

[117:01]

Right? So here what we are doing we are

[117:03]

reading all the page content and we are

[117:05]

performing the embeddings and finally we

[117:07]

return the embeddings over here right so

[117:09]

this is what we are actually doing right

[117:11]

so two important functions: one is

[117:13]

chunk_documents and one is embed_chunks, inside

[117:15]

a class called EmbeddingPipeline. Now

[117:18]

the same thing you can go ahead and test

[117:19]

it in your app.py, right? So in app.py

[117:23]

what you are going to do is that here um

[117:25]

I will just go ahead and

[117:28]

go ahead and

[117:30]

just a Okay, let me go ahead and

[117:33]

initialize just a second uh the

[117:36]

embedding pipeline. Okay, so here what I

[117:39]

will do, I will go ahead and write from

[117:42]

from src

[117:44]

dot

[117:46]

embedding import EmbeddingPipeline.

[117:48]

Right? And once you do this, I will go

[117:50]

ahead and initialize the embedding

[117:52]

pipeline. Okay? And then I will just go

[117:56]

ahead and give this right. So this

[117:58]

basically becomes my vectors

[118:03]

sorry, embed_chunks, it is there, right? So

[118:05]

before embed_chunks I need to chunk

[118:08]

the documents. I also did not call

[118:10]

chunk_documents, so let's first of all

[118:11]

call chunk_documents over here

[118:16]

okay and then this will basically be my

[118:19]

chunks

[118:21]

and finally you can also go ahead and

[118:24]

write over here my chunk vectors,

[118:30]

chunk_vectors is equal to, and here uh

[118:34]

you can go ahead and use the same

[118:37]

embedding pipeline's embed_chunks,

[118:40]

right and finally you can go ahead and

[118:42]

print

[118:44]

the chunk vectors. So once you do this

[118:47]

that basically means you'll be able to

[118:49]

understand whether the chunking is

[118:50]

happening or not. So let's quickly run

[118:52]

this particular file again. And now you

[118:55]

should be able to see the chunking that

[118:57]

may be happening over here. Okay. So

[119:00]

it'll take some amount of time because

[119:02]

it is going to load all the documents

[119:04]

again. Okay. And then the chunk_documents

[119:06]

function is going to get applied over

[119:08]

here. What chunk_documents does

[119:10]

is that it is just going to apply

[119:12]

RecursiveCharacterTextSplitter on

[119:14]

every document that we specifically

[119:16]

give. Right? And once we do that you'll

[119:18]

be able to see that it is loading. You

[119:20]

can see all the things are happening

[119:22]

over here. One PDF with 21

[119:25]

pages is over here, with respect to

[119:28]

this proposal. It loaded the embedding model and

[119:31]

split the 64 documents I got into uh 359

[119:35]

chunks you know and then we basically go

[119:38]

ahead and store this. Now the next step

[119:40]

is that after this uh I will try to

[119:42]

create a vector store and uh we will try

[119:45]

to save those embeddings also. Okay. So

[119:49]

here you can see all the chunks is uh

[119:51]

vectors are visible over here right. So

[119:53]

this is really really good. So just just

[119:56]

imagine right in a pipeline it is

[119:58]

specifically working one by one right it

[120:00]

is it is working over here and that's

[120:02]

that's the best part out here right now

[120:05]

the next step is that what I will do is

[120:07]

that I will try to create some more

[120:10]

functions uh which can be for save and

[120:14]

load uh like if I want to save this

[120:16]

entire chunks how do I go ahead and save

[120:18]

it you know u what do I save it each and

[120:22]

every information that you'll be able to

[120:23]

see over here Okay. Now,

[120:26]

uh this was about uh the two important

[120:30]

pipeline pieces, which are basically

[120:32]

load_all_documents and uh EmbeddingPipeline

[120:35]

with uh two important functions. One is

[120:37]

chunk_documents and one is embed_chunks.
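The embedding step can be sketched as below, with loud assumptions. The video uses a SentenceTransformer model (all-MiniLM-L6-v2) and its `encode` method; to stay runnable without any download, this sketch swaps in `toy_encode`, a hypothetical hashed bag-of-words function that is not a real embedding model. Chunks are plain dicts with a `page_content` key instead of LangChain Document objects.

```python
import math
import zlib
from typing import Any, Callable, Dict, List, Sequence

Vector = List[float]


def toy_encode(texts: Sequence[str], dim: int = 8) -> List[Vector]:
    """Toy stand-in for a real model: hashed bag of words, L2-normalised."""
    vectors: List[Vector] = []
    for text in texts:
        vec = [0.0] * dim
        for word in text.lower().split():
            # crc32 is deterministic across runs, unlike the built-in hash().
            vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


class EmbeddingPipeline:
    """Holds an encoder and turns chunks into one vector per chunk."""

    def __init__(self, encode: Callable[[Sequence[str]], List[Vector]] = toy_encode) -> None:
        self.encode = encode

    def embed_chunks(self, chunks: Sequence[Dict[str, Any]]) -> List[Vector]:
        # Only the text of each chunk is embedded; metadata is kept elsewhere.
        texts = [chunk["page_content"] for chunk in chunks]
        return self.encode(texts)
```

Passing the encoder in as a callable is a design choice of this sketch: a real model's encode function could be handed to the same class without changing `embed_chunks`.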

[120:39]

So guys, now the next step is that what

[120:41]

we are going to do is that now already

[120:42]

we have created this embedding pipeline,

[120:44]

right? Now let me do one thing because

[120:46]

after performing the embedding, we also

[120:48]

need to store it in some kind of vector

[120:49]

store and it should be persisted in any

[120:51]

kind of directory or in cloud. Right? So

[120:53]

for this I will start working on this

[120:55]

vectorstore.py file and here I'm going

[120:58]

to use some code. Now you can see what

[121:00]

all things I'm actually using. So I'm

[121:02]

using SentenceTransformer and

[121:04]

EmbeddingPipeline over here. FaissVectorStore

[121:06]

is the class name that we are

[121:09]

going to use. Uh I'm going to

[121:10]

specifically use FAISS. Uh here we are

[121:13]

going to use the same model, all-MiniLM-L6-v2,

[121:15]

chunk size, everything is over here.

[121:17]

And uh we are also making some kind of

[121:20]

directories. The persistent directory,

[121:22]

faiss_store, should be the name and

[121:24]

then here you'll be able to see I'm

[121:25]

initializing the embedding model

[121:26]

SentenceTransformer and all. Now the

[121:28]

first step is

[121:30]

build_from_documents. Now see here uh the same code

[121:32]

we will go ahead and write what we had

[121:34]

written in embedding pipeline right so

[121:36]

here we are initializing EmbeddingPipeline

[121:37]

with the model self.embedding_model and the

[121:39]

chunk size, and I've called

[121:41]

chunk_documents and embed_chunks.

[121:44]

I've got the metadata and I'm adding all

[121:46]

these embeddings inside my vector store,

[121:48]

and then I use self.save. What is

[121:51]

this self.save? save is a function

[121:53]

which is going to save all the vectors

[121:55]

inside these index and pickle files.

[121:57]

Right? So metadata is basically getting

[121:59]

saved in a pickle file and faiss.index

[122:02]

will basically be my vector store which

[122:03]

will be in the persistent directory. So

[122:05]

that is the reason I have written

[122:06]

faiss.write_index with self.index and the FAISS path,

[122:10]

right, with open on the metadata path, this and all

[122:13]

information is there right. So this same

[122:15]

method is basically there, the add_embeddings

[122:17]

method is over here. add_embeddings is

[122:18]

nothing but it is basically taking the embeddings, it

[122:20]

is adding them to an IndexFlatL2. So

[122:23]

these are some basic things when you

[122:25]

actually work on this. Along with that

[122:27]

I've also created two more functions, load

[122:29]

and search. Load and search what it does

[122:32]

is that it will actually allow you to

[122:34]

load the FAISS index, the vector store.

[122:37]

Okay. And it will uh load the metadata in

[122:40]

read-binary mode, and then with the help of

[122:42]

search and query you should be able to

[122:44]

ask any kind of queries that you have.

[122:46]

Right. You can also use this query

[122:48]

method. Uh here you can see we have

[122:49]

written self.model.encode with

[122:51]

respect to the query text, astype

[122:54]

float32, and with the help of search

[122:56]

you'll be able to get the output. Okay.
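What a flat L2 index does can be sketched in pure Python, as a rough illustration rather than the FAISS-based class from the video. The class name `FlatL2Store`, the single `store.pkl` file and the dict layout are assumptions made for this sketch; the video's version keeps the index and the metadata in two separate files. The search itself follows the same idea as a flat index: compare the query against every stored vector and keep the closest ones.

```python
import pickle
from pathlib import Path
from typing import Any, Dict, List, Sequence


class FlatL2Store:
    """Toy stand-in for an exhaustive L2 index plus pickled metadata."""

    def __init__(self, persist_dir: str) -> None:
        self.persist_dir = Path(persist_dir)
        self.vectors: List[List[float]] = []
        self.metadata: List[Dict[str, Any]] = []

    def add_embeddings(self, vectors: Sequence[Sequence[float]],
                       metadata: Sequence[Dict[str, Any]]) -> None:
        if len(vectors) != len(metadata):
            raise ValueError("need exactly one metadata entry per vector")
        self.vectors.extend([float(x) for x in v] for v in vectors)
        self.metadata.extend(metadata)

    def search(self, query_vector: Sequence[float], top_k: int = 3) -> List[Dict[str, Any]]:
        # Squared L2 distance to every stored vector, smallest first.
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(vec, query_vector)), idx)
            for idx, vec in enumerate(self.vectors)
        )
        return [
            {"index": idx, "distance": dist, "metadata": self.metadata[idx]}
            for dist, idx in scored[:top_k]
        ]

    def save(self) -> None:
        self.persist_dir.mkdir(parents=True, exist_ok=True)
        with open(self.persist_dir / "store.pkl", "wb") as f:
            pickle.dump({"vectors": self.vectors, "metadata": self.metadata}, f)

    def load(self) -> None:
        # "rb" is the read-binary mode mentioned above.
        with open(self.persist_dir / "store.pkl", "rb") as f:
            data = pickle.load(f)
        self.vectors, self.metadata = data["vectors"], data["metadata"]
```

A query method would sit on top of this: embed the query text with the same model used for the chunks, then pass the resulting vector to `search`. Pickle files should only be loaded from a directory you trust.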

[122:59]

So this was about my vector store. Now

[123:01]

in app.py what I am actually going

[123:03]

to do I will just go ahead and make some

[123:05]

changes. Okay. Now what what are the

[123:07]

changes that I will be making? Okay.

[123:09]

Instead of calling these two, okay, I

[123:11]

will just go ahead and write store is

[123:14]

equal to

[123:16]

first of all let me go ahead and

[123:17]

initialize this FaissVectorStore. So

[123:20]

from src.vectorstore, FaissVectorStore

[123:23]

here okay and here I will go ahead and

[123:26]

initialize this

[123:28]

and let me go ahead and give the path

[123:30]

name. The path name is faiss_store.

[123:34]

Okay. Now initially if this path is

[123:37]

there then it is fine. Otherwise it'll

[123:39]

go ahead and I'll just go ahead and

[123:41]

write store.build_from_documents with all

[123:43]

the docs. That's it. Now if I do this it

[123:47]

is just going to go ahead and for the

[123:49]

first time it is going to build it. Okay

[123:52]

it is going to build it. So let's see

[123:53]

whether it'll be able to build it or

[123:55]

not. So here I'm going to clear the

[123:58]

screen. python app.py

[124:03]

let's quickly see this

[124:07]

now it is going to read first of all it

[124:09]

is going to read it then this is fine

[124:11]

loading perfect load all the PDF files

[124:14]

perfect now the chunking will happen

[124:16]

automatically and it'll save it in the

[124:18]

vector store inside that particular

[124:20]

folder, that is faiss_store. Let's see

[124:23]

now it is generating 359 chunks

[124:26]

all the steps are almost same what we

[124:28]

have discussed from starting but this is

[124:30]

A very super cool way of building

[124:32]

something. Right? Now you can see it saved the

[124:34]

FAISS index and metadata to faiss_store,

[124:36]

the vector store also. So here you can see

[124:38]

faiss_store is there, faiss.index and

[124:40]

metadata.pickle, right? Now we need not

[124:43]

run it each and every time right uh

[124:45]

because uh once we have this right from

[124:47]

the next time what we can do instead of

[124:49]

always building unless and until you

[124:51]

have new documents, I can also go ahead

[124:53]

and write store.load

[124:55]

okay if I go ahead and write store.load.

[124:58]

Okay, I should be able to print anything

[125:02]

that I want, right? Let's say I will go

[125:04]

ahead and print something like this. I

[125:06]

can use the same query method that we

[125:09]

had. What is attention mechanism? top_k

[125:11]

is equal to three. Right? So once I do

[125:13]

this, you should be and this time I

[125:16]

don't think so we need to also read any

[125:18]

kind of documents also over here. Right?

[125:20]

So I'll comment it out over here. This

[125:22]

also you can uncomment it if you really

[125:24]

want to or you can also give another

[125:26]

conditions. Now what it'll do, it'll

[125:28]

directly go ahead and read from the

[125:29]

vector store. It'll pick it from the

[125:31]

persistent directory and it'll give you

[125:33]

the output. Let's see.
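The build-once, load-afterwards pattern described here can be sketched as a small helper. This is an illustration under assumptions: the function name `build_or_load` is hypothetical, the two file names are guesses based on what is shown on screen, and the `build` and `load` callables stand for something like the store's build_from_documents and load methods.

```python
from pathlib import Path
from typing import Callable, Sequence


def build_or_load(
    persist_dir: str,
    build: Callable[[], None],
    load: Callable[[], None],
    required_files: Sequence[str] = ("faiss.index", "metadata.pkl"),  # assumed names
) -> str:
    """Load the persisted store if all its files exist, otherwise build it."""
    directory = Path(persist_dir)
    if all((directory / name).exists() for name in required_files):
        load()   # fast path: no document loading, chunking or embedding
        return "loaded"
    build()      # first run, or the store was deleted
    return "built"
```

This only checks that the files exist. It does not detect new or changed documents, so when the data folder changes the store still has to be rebuilt explicitly.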

[125:36]

So from the faiss_store, it'll go ahead

[125:38]

and pick it up. And here you go. Here

[125:40]

you get the answer clearly, right? See

[125:43]

loading embedding models. This is there

[125:45]

loading FAISS index and metadata. What is

[125:47]

attention mechanism? All the information

[125:49]

is over here. And this is the output

[125:52]

that you are able to get. Right.

[125:54]

Perfect. This this is what exactly uh I

[125:58]

was actually talking about. But the best

[126:00]

part is that we have created this in the

[126:02]

form of a pipeline. You have data

[126:03]

loader, you have embedding, you have

[126:04]

vector store. Now for search what you

[126:06]

can do is that you can integrate any

[126:08]

LLMs over here. Right? So for this also

[126:11]

I have written the code. Again I don't

[126:12]

want to discuss it step by step line by

[126:14]

line. So that it'll be again taking a

[126:18]

lot amount of time to complete this.

[126:20]

Right? So here I have my load_dotenv.

[126:23]

You can just go ahead and load all these

[126:24]

things. Groq API key is given over here.

[126:27]

You can use it or you can use your own

[126:29]

Groq API key. It's fine. Okay. And then

[126:32]

we are doing the search, right? Wherein

[126:34]

we are using this vectorstore.query,

[126:36]

getting all the documents getting all

[126:37]

the metadata and then we're giving some

[126:40]

prompt and we are invoking it along with

[126:42]

the LLM. So once we do this, it is

[126:44]

superbly easy to execute this. Anyhow,

[126:47]

you can do the research because I have

[126:49]

discussed all these things in my Jupyter

[126:51]

notebook, right? Uh now what I will do

[126:53]

in my app.py I'll see what changes

[126:56]

needed to be added and uh what I will do

[126:59]

is that I will first of all import RAGSearch,

[127:02]

again from src.search

[127:04]

import RAGSearch, and then I will go

[127:07]

ahead and initialize like this right and

[127:10]

now I don't even require this okay now

[127:14]

let's see whether it'll be able to give

[127:17]

the summary or not it is loading from

[127:20]

the vector store now I'm asking the

[127:22]

question. search_and_summarize, this is

[127:24]

the function here. What we do? We first

[127:26]

of all do the query from the vector

[127:28]

store that we were usually doing before.

[127:30]

Then we give a prompt and then finally

[127:32]

LLM will be able to give the output. So,

[127:35]

so here you can see if my LLM is fine

[127:38]

then I think I should be able to get an

[127:39]

answer. So here you can see all the

[127:41]

output is basically over here.
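The three steps just described (query the vector store, build a prompt, call the LLM) can be sketched as follows. Everything here is a hedged illustration: the `retrieve` and `llm` callables, the prompt wording and the result layout with a `text` field in the metadata are assumptions for this sketch, not the exact code or Groq client call from the video.

```python
from typing import Any, Callable, Dict, List


def build_prompt(query: str, contexts: List[str]) -> str:
    """Put the retrieved chunks in front of the question."""
    context = "\n\n".join(contexts)
    return (
        f"Summarize the following context for the query: '{query}'\n\n"
        f"Context:\n{context}\n\nSummary:"
    )


def search_and_summarize(
    query: str,
    retrieve: Callable[[str, int], List[Dict[str, Any]]],  # e.g. a vector store query
    llm: Callable[[str], str],                              # e.g. a chat model call
    top_k: int = 3,
) -> str:
    results = retrieve(query, top_k)
    # Keep only results that actually carry some text in their metadata.
    texts = [r.get("metadata", {}).get("text", "") for r in results]
    texts = [t for t in texts if t]
    if not texts:
        return "No relevant documents found."
    return llm(build_prompt(query, texts))
```

Because the retriever and the model are passed in, the same function works with any vector store or LLM provider, and it can be checked with simple fake callables before an API key is involved.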

[127:44]

So this was a complete idea or a kind of

[127:47]

crash course that I really wanted to

[127:49]

give on the entire uh RAG. RAG is one of

[127:53]

the most important use cases. That is

[127:55]

what I always believe. Most of the

[127:58]

companies are specifically building RAG

[127:59]

applications. So I think this is really

[128:01]

really important and super cool topic. I

[128:04]

hope you like this particular video.

[128:05]

This was it from my side. I'll see you

[128:06]

on the next video. Thank you. Take care.

Raw transcript

Full transcript without timestamps

Hello all, my name is Krishna and I am super excited to announce this amazing crash course on rag that is retrieval augmented generation. Uh in this specific crash course it'll be somewhere around 2.5 to 3 hours but we are going to discuss everything that is related to rack completely from scratch. Uh we'll be talking about the entire pipeline from data injection to retrieval pipeline to output generation. how to use LLM models, how to use embedding models in this uh along with this uh what should be the right strategy of using chunkings and many more things right so we will be deep diving into both the theoretical understanding along with the practical implementation and we will initially go ahead step by step we'll start with the basic implementation and then as we go ahead in the advanced section we'll also implement the modular coding right the main aim of the modular coding is to link the entire pipeline in a way so that you should be able to understand how rag actually works and also implement it in your company use cases. Let me tell you one very important thing. 90%age of the use cases that are currently been worked in all the companies are specifically related to rag. So this crash course will be an amazing one for you all of you. We'll keep a simple like target of thousand uh try to complete it as soon as possible and we'll also keep a like target to some uh comments target of 500. So please try to complete it and yes go ahead and enjoy this particular crash course. Thank you. So this is a simple definition that uh I have put up over here and uh in this definition first of all we'll try to understand rag. Okay. So first of all let's go through the definition and then I will give you a brief idea what exactly rag is all about you know. So here you can clearly see that rag is the process of optimizing the output of a large language model. Okay. 
So it references an authorative knowledge base outside of his training data set source before get generating a response. LLMs are trained on vast volume of data as we all know and use billions of parameters to generally original output for task like question answering, translating and completing sentences. Rag extends the already powerful capabilities of LLM to specific domain or an organizational internal knowledge base all without the need to retrain the model. Okay, it is cost effective approach to improve LLM output. So it's relevant, accurate and useful in various context. So this is just a basic definition. You can refer to this particular definition. So guys, now let's go ahead and understand about rag. So let's consider that I have a generative AI application. And as you all know in a generative AI application, usually let's say that I have an LLM. So this is my LLM. Now usually whenever we have a LLM what happens is that let's consider that I have a user a user is asking a query. So this is a my query from the user and before it is sent to the LLM we do add a prompt right we do add a prompt and this prompt is just like an instruction to the LLM like how the LLM should work okay and then based on this we actually get an output now this is a simple generative AI application wherein the LLM is used to generate the content Okay, generate the content. So obviously by using this specific technique we give a query and this LLM you know that it has been trained with billions of data okay different kind of data that is available in the internet and based on this it will be able to generate the output. One of the disadvantage of this, let me talk about the disadvantage of this particular approach. As you know that every LLM that is trained, you know, it will be trained for a specific set of data. So let's say right now it is 31st August. Okay, 31st August. Let's say this is my LLM model and this is basically GPT5 which is the recent model from OpenAI. 
Now as you know that when this model was launched this model may be trained by may be trained with data till 1st August. Okay. So this LLM will not have any idea what has basically happened in the current world between 1st to 31st August. Right? And let's say if I go ahead and ask a specific question to the LLM which is between this specific dates for any kind of events the LLM will start hallucinating. So one of the major disadvantages of only using the LLM is that it will hallucinate. Okay. When we say hallucinating what does this basically mean? It means that even though it does not have the knowledge what has happened between 1st August to 31st August any events even though we ask any question the LLM will try to generate it own answer because it does not want to look like a fool. Okay, that is the best example. It does not want to look like a fool. So it will try to generate some answers and it will make sure that it'll it'll show you answer that you may also have to believe it. that is how it will be written you know in terms of the output that we get so usually this condition is basically called as hallucinating okay so this is one of the major disadvantage the second disadvantage that you have so let's say that I'm using this LLM and you know this LLM has been trained with huge amount of data now what happens is that I'm running a startup let's say now in my startup I'm solving a specific use case and I have some data which again I need to use this particular data along with my LLM. Okay. 
So let's say that I have some other data like you know um policies policies of my company I have HR policies of my company I have finance policies you know and this policies all will not be available in the it will not be available publicly because it is my startup so these all data has been protected now I also want to use this specific data and probably create a chatbot okay now how do I do this Now one way is that many people will say hey kish we can take this particular data and we can fine-tune the model right we can simply fine-tune the model yes this is a very good solution but understand fine-tuning a model is a very expensive process very tedious process because this LLM whichever LLM we are using it has billions of parameter and tweaking this billions of parameter usually takes a lot of time Right? So obviously this is a solution but this is a very expensive solution. Okay. Now do we have any other way? Any other way and remember these all policies and these all data will also keep on getting updated as we run the startup. Right? So every time we cannot just go ahead and finetune it like every day we not fine-tune it. Right? So we should try to find out a solution like how do we prevent this? So this can again be prevented with the help of rag right now how it will be prevented with the help of rag I will talk about it okay so here instead of fine-tuning I'm saying that hey I will go ahead and implement the rag now you'll understand only when we understand the pipeline of the rag which I will discuss in this specific video okay now these are the major two disadvantages that you see right over here and yes they are some more disadvantages which we'll just deep dive more as we go ahead. Okay. Now what happens in uh if we use rag and how we are preventing it. See rag is nothing but it is it is saying that is a process of optimizing the output of a large language model. So it references an authorative knowledge base outside of his training data. 
Now how do we solve hallucination and the other problem we have? Let me draw the diagram again. Here is my LLM, and here is my query. Let's say I am coming up with a user query; I'm drawing a user here, and this user will first of all give a query. Now there will be two important pipelines that we create. As I said, we are trying to optimize the output of a large language model so that it references an authoritative knowledge base outside of its training data. As you all know, this LLM is already trained on a huge amount of data. Along with it, I will have an external database, which we call a vector database. So the LLM is already trained on some amount of data, and for any additional data, say my startup data, my HR and finance policies, whatever data is there, we will create a data ingestion pipeline. What does this data ingestion pipeline look like? Let's say I have my data. From this data we do some kind of parsing, after parsing we create embeddings, and then we finally store them in the vector store. This data can be in any format: PDF, HTML, Excel, even a SQL database, or some unstructured format. So initially we take this data and do data parsing. Data parsing is a very important step; I think if you crack this step, developing a RAG application becomes very easy. Data parsing is all about how you read the unstructured or structured data that is present, and how you chunk this data. How do you chunk?
How do you divide the data into chunks? Chunking is very important because you need to save this data inside some kind of vector store, also called a vector DB. A vector store or vector DB is simply a database that lets you save vectors. So after doing the chunking, you pass the chunks to an embedding model. The embedding model converts text to vectors. Vectors are just a numerical representation of text, so that you can apply techniques such as similarity search with cosine similarity, which are already available, and retrieve similar results for a given query from these databases. So whenever I talk about the vector DB or vector store, this is where we store embeddings, and the embedding is applied to every chunk. Embedding simply means converting text into vectors. Here we can use different embedding models: Google Gemini models, OpenAI embedding models, Hugging Face embedding models. Each embedding model comes at a different cost, and there are also open-source embedding models that will convert text into vectors. This is the first pipeline, which we call the data ingestion pipeline. At the end of the data ingestion pipeline, your text is stored as vectors inside your vector DB. Now, how is RAG different from what we had before? First you have this data ingestion pipeline, where you convert all your data into vectors, and this data is specific to this startup. So now I have created a knowledge base. External knowledge base or internal knowledge base, whatever it is, this is my knowledge base, and this knowledge base does not exist inside the LLM. Right?
Yes, some amount of that information may be available to the LLM, but not all of it. Now see the definition again: RAG is a process of optimizing the output of a large language model so that it references an authoritative knowledge base outside of its training data. So what happens when the user gives a query? This query, instead of going directly to the LLM, goes to the vector database, and before that we also need to apply the embedding, because the query has to be converted into a vector. Why? So that when we hit the vector DB with this query, a similarity search is applied, and based on it we get some context, some information, from the vector DB. So if I ask, "What is the leave policy of my company?", first the query goes to the vector store, which gathers all the related information available there. That information, when it is sent to the LLM, is called the context. We use this context along with a prompt that we write. The prompt is an instruction to the LLM that says: you can use this context to answer the question. And finally you get an output. This entire pipeline is called the retrieval pipeline, and it is a very good example of a traditional RAG. Now you may be thinking, "Krish, what about other types of RAG?" Don't worry, I will explain everything from basic to advanced, with implementation, because later on we'll be discussing agentic RAG and how agentic RAG actually works. But I hope you got an idea of this.
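To make the retrieval step a little more concrete, here is a rough sketch in plain Python. It is not the code from this course: the embed function is a toy bag-of-words stand-in for a real embedding model, the three policy sentences are made-up data, and the vector store is just a list. The only point is to show the flow of embedding the query, running a cosine similarity search, and getting back the context.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. A real pipeline would call an embedding model."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical knowledge base: chunks that the ingestion pipeline already embedded.
chunks = [
    "employees get 20 days of paid leave every year",
    "expense reports must be filed within 30 days",
    "the office is closed on public holidays",
]
vector_store = [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(query: str, k: int = 1) -> list[str]:
    """Embed the query with the same function and return the k most similar chunks."""
    query_vec = embed(query)
    ranked = sorted(
        vector_store,
        key=lambda record: cosine_similarity(query_vec, record[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:k]]


context = retrieve("how many days of paid leave do employees get")
print(context)
```

Notice that the query goes through the same embed function as the stored chunks; that is the part that carries over to a real embedding model and a real vector DB.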
With this setup you will see much less of this problem. You will not completely remove hallucination, but you will reduce it: if a query is related to data that is present in the vector DB, I will definitely get some context, and my LLM will give me the output based on it. If that data is not present, the LLM can still hallucinate. One good example you can look at is Perplexity. Perplexity is based on RAG; it is a kind of RAG application. In Perplexity you are connected to various retrievers, to tools, and to web search, and then the LLM summarizes the results and gives you the output. It also uses various LLMs. I'm also planning to launch a startup soon, within a couple of weeks I guess, and the application I'm developing is a RAG application that solves a very good problem for developers. That is the reason I'm not able to upload a lot of videos: I'm heavily involved in the startup, working on a product that India can definitely remember. So this is how things are, and you can see how well the pipeline works. This is a traditional RAG. Now you may be thinking about what we'll be discussing. We have discussed traditional RAG; in the future classes, what coding will we be doing? Let's talk about it. As I said, we'll create two important pipelines: one is the data ingestion pipeline and one is the retrieval pipeline. In the data ingestion pipeline, we will perform data ingestion, and along with it we will do data parsing.
Then we'll perform embeddings, and then we will store everything in the vector store. Then we will create a retriever on top of it, and whenever a user asks a query, the retriever will give the context to the LLM, and finally we will generate the output. So this part is retrieval, and this part is augmentation. What does augmentation mean? You are giving context to the LLM along with the prompt to generate the output. That is augmentation. And finally you generate the output, which is generation. Now, how are we going to implement this in the next session? First of all I will show you how to perform all these steps, data ingestion, data parsing, and embedding, in an efficient way. We are going to consider different kinds of files, like PDF and HTML; you can also consider Excel, a SQL database, any kind of file. Then we'll do document parsing and convert the content into documents. Document is an amazing data structure that you can parse, chunk, and store in the vector store. Then we'll perform embeddings; here we will use both open-source and paid embedding models. Then finally we go to the vector store, and we'll see how to apply the same embedding to a user query. And then finally we'll develop the rest. I'm focusing more on making longer videos so that you don't just follow a playlist. I want to cover a lot in one video so that you can get through it efficiently instead of going through 50 different videos.
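The augmentation step described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: build_prompt is a hypothetical helper, the prompt wording is made up, and the call to an actual LLM (the generation step) is left out on purpose.

```python
def build_prompt(context_chunks: list[str], question: str) -> str:
    """Augmentation: combine the retrieved context with an instruction and the user question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Use only the context below to answer the question. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Pretend these chunks came back from the retriever.
retrieved = ["Employees get 20 days of paid leave every year."]
prompt = build_prompt(retrieved, "What is the leave policy of my company?")
print(prompt)  # this string is what would be sent to the LLM for generation
```

Telling the model to say it does not know when the context has no answer is one common way to reduce hallucination further; it is a design choice of this sketch, not something the pipeline enforces by itself.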
When we do data ingestion and data parsing, there are various techniques. We are going to look at optimization, various chunking strategies, and context engineering; all these topics come up when we talk about data parsing. What is a semantic chunker, how do we do the chunking with those strategies, everything we'll discuss as we go ahead. But I hope you got a good idea of what exactly RAG is.
Hello guys, we are going to continue the discussion on RAG. Till now we have understood what RAG is and what main drawbacks we are fixing with RAG, and along with that we have also understood what the RAG pipeline looks like. It usually consists of two important pipelines: one is the data ingestion pipeline and one is the retrieval pipeline, which includes these two boxes. Now we are going to go ahead with some practical implementation. The major thing that comes to my mind whenever we start a new series is how we should cover a topic so that we understand the coding from the basics and then move towards modular coding. That is how I'm going to implement this entire pipeline. Initially we will go ahead with some basic code and understand the fundamentals, and then we will start writing more complex code with modular coding. So initially we will write all the code in a Jupyter notebook, then we'll increase the complexity and write code in terms of classes and reusability, and then we'll see how we can create the full pipeline. That is how the agenda will go. There are two important things to think about, and the first important thing is to understand the document structure.
Whenever we work with any external knowledge base, any data that needs to be fed into the vector DB, you definitely need to know about this document structure. Why? Because inside the data ingestion pipeline, the first step is data ingestion, and there we can have any kind of file: PDF files, HTML files, DB files, Excel files. Our main aim is to read the content of all these files and convert it into a structure on which we can then apply strategies like chunking and embedding, and store the result in the vector DB. That is what this entire pipeline is about, and for that you really need to understand the document structure. If you see this diagram, these two are the main topics we are going to cover in this video: first we will understand the document structure, and then we'll build our complete RAG pipeline. In the complete RAG pipeline we have two important steps: one is the data ingestion pipeline and the other is the query retrieval pipeline. Let's talk about the data ingestion pipeline in depth. In the data ingestion pipeline, the first step is data ingestion. Let's say you have different kinds of files: PDF, HTML, Excel, a DB file, an unstructured file, any kind of file format. In data ingestion, our main questions are how to read these files, how to perform data parsing, and finally how to convert the result into a document structure. That is the reason that in this video, as I said, we first understand the document structure: how to build this document structure, and what is metadata?
Inside this document structure you will learn about important components like metadata and content, and how the metadata is structured. We will cover in depth how these things work. Data parsing is a really important step, because later, in the query retrieval pipeline, good parsing makes retrieval much more efficient and the results much more accurate. That is why you need to really focus on data parsing. After data parsing, the next step is usually chunking. In chunking we convert the entire data into multiple chunks: let's say this is chunk one, chunk two, chunk three, chunk four. Why do we apply chunking? The chunking strategy is very simple: whatever documents we have, we divide them into smaller parts, or chunks. The reason is that every LLM and every embedding model has a fixed context size. The next step here is embeddings, and embedding means converting text to vectors. Let's say I take a complete 100-page PDF and try to give it directly to an embedding model. It will not be possible. The model will say that you are providing more data than the context size allows, so it cannot convert that text into vectors.
So you need to give the data within the limit of the context size. This holds for embedding models and also in the later stages whenever we use any LLM, because every LLM has a fixed context size, though different LLMs may have different context sizes. That is why it is always a good strategy to divide our data into chunks, so that in the later stages we can efficiently put them into the vector database. After chunking, we apply embeddings to every chunk, and from the embeddings we finally store the vectors in our vector DB. Inside the vector DB everything is stored in the form of vectors: this is record one, record two, record three, record four, record five, and so on. On this vector DB you can then apply any kind of similarity search. In this video, I will take one of these file types and create this entire pipeline, and you should work along with me. For the other file types, I will give you an assignment. I will show you with a couple of files; let's say I'll take a PDF file and show you the entire data ingestion. Then, as an assignment, you use any other file format, say Excel or CSV, whatever you want, and try to complete the same pipeline. That is my strategy, so please make sure to complete the assignment. We will go step by step, completely from scratch, so that everybody will be able to follow.
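As a small illustration of the chunking idea above, here is a minimal sketch of fixed-size chunking in plain Python. Treat it as one possible strategy under assumptions: sizes are measured in characters (real context limits are counted in tokens), the overlap between neighbouring chunks is a common practice that was not discussed above, and chunk_text is a hypothetical helper, not a LangChain function.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into pieces of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# A made-up 1000-character document standing in for parsed file content.
document_text = "RAG combines retrieval with generation. " * 25

chunks = chunk_text(document_text, chunk_size=200, overlap=20)
for number, chunk in enumerate(chunks, start=1):
    print(f"chunk {number}: {len(chunk)} characters")
```

Each chunk would then be passed to the embedding model and stored as one record in the vector DB.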
So first of all I will open an empty folder. Remember, I will be using LangChain, and right now this is just a traditional RAG; in the later stages we will move towards agentic RAG. From this folder I will open my command prompt and then VS Code. From VS Code the next step is to open the terminal and initialize this workspace as my repository with uv init. So yt rag is my workspace. Next I will create my environment. If you're using the uv package manager, you can write uv venv. Python 3.13.2 is the recent Python version I'm using for this project. Then I will activate this environment. Perfect, till here we are good. Now I will create my requirements.txt, and in it I'll list some of the packages to install: langchain, langchain-core, langchain-community. Let me install these packages with uv add -r requirements.txt. This is done. Along with this I will also install libraries such as pypdf and pymupdf. I'll talk about why I'm using pypdf and pymupdf: they are specifically for reading my PDF documents. The one example I'm going to show you is with PDF, and then you should try to create the same pipeline with other data formats, for example JSON or anything else. So my requirements.txt is filled.
Now I'll quickly create my data folder and also my notebook folder so that I can start working, and along with this I will run uv add ipykernel so that I can work with my Jupyter notebook. So ipykernel is installed. Now I will start with my Jupyter notebook, and the first thing, as I told you, is the document data structure: what a document is, and how a document can be very helpful in the data ingestion pipeline. I'll quickly select my kernel. For all these things you really need to be good at the Python programming language. You cannot skip Python, so my suggestion is: never do that. Python is a must. This time I'm going to use some more advanced code, and it will not be possible for me to write it line by line, so I'll go a little faster while explaining. As I told you, in data ingestion our main aim is to load some data, apply chunking, convert the chunks into embeddings, and finally store them in the vector DB. That is what the entire data ingestion pipeline is about. To understand this we need to understand the document structure, because the final output of all this chunking will be documents. Now, what exactly is a document data structure? I will write that down as a heading here. For this I will import from LangChain, and to show you this I will also bring in a file so that you can follow along. Let me put this file over here. I have a file here, and with it we'll try to understand: what exactly is a document structure?
See, the LangChain document structure. A LangChain Document is a data structure that saves data in a format with two important parts: the page content and the metadata. The page content holds the content that is present inside the file; when you read a file, all the content of that file is available there. The metadata holds additional information about the file, such as the file name, the number of pages, the timestamp of the file, and so on. Whenever you read any kind of data and convert it into a document data structure, this format is very important, because at the end of the day we will compute embeddings on this data and push them into the vector DB, and once they are in the vector DB we can apply techniques like similarity search with cosine similarity and retrieve the results. You can see that all the information about this is given here. The LangChain document structure has two core components: page_content and metadata. The page content is the actual text content, and it is very handy if you want to create a RAG application over research papers or product manuals. In LangChain you also have different loaders: a PDF loader, a CSV loader, a web-based loader, a directory loader. What do these loaders do? The PDF loader is used to load PDF files, and once it loads a PDF file, it gives you the output in the form of a document structure.
I will show you practically why I'm stressing this. The loader will give you all the output in the form of a document structure. Similarly, with the CSV loader, we give it a CSV file and it converts the content of that CSV into the document data structure. The same goes for the web-based loader and the directory loader. There are many different loaders, and you can use any of them to load the data; at the end of the day the loader gives you the output in the form of a document structure. I hope you got an idea of what the document structure is. Now let me start showing you how to work with it. For the document we need an import from LangChain. There is also something called a text splitter in langchain, but, sorry, Document is present inside langchain_core: from langchain_core.documents import Document. If you hover over Document, you'll see the description: a class for storing a piece of text and associated metadata. To really understand the document structure, I will first create one document manually. I will use Document, and inside it we use two parameters. One is page_content; let's say I write "this is the main text content I'm using to create RAG". I've just written some basic content here; let's consider that this content is coming from a txt file. But along with this content, if you want to improve retrieval from the vector DB, you also need to write metadata. So the second parameter is metadata.
Inside this metadata you can write different pieces of information, because at the end of the day this is just text. You can write: this is my source, and the source is example.txt. Then let's say the total number of pages is one. I can also add more information, like who the author is: the author is Krish Naik. Let's also add a date_created field with a date, something like the first of a month in 2024 or 2025. Why is all this metadata really important? Once we take this document, do the chunking, do the embedding, and store it in the vector DB, then when you do the similarity search you can also apply filters. That is the most important part. Let's say I'm searching for "what is the main text content for building the RAG", and there is some information related to RAG. If I ask that question and add a filter, by author Krish Naik, then the search knows which document to pick from, because it applies a filter on the author name. That is why metadata plays a very important role. Now if I execute doc, you can see the document: the metadata is there, and you can also see page_content. These are the two main parameters, which everybody can use. I hope you got a very clear idea about it. Now what I'll do is create a simple txt file.
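Before moving on to the text file, here is a small sketch of the document and metadata filter idea described above. It imports Document from langchain_core when that package is installed and otherwise falls back to a tiny stand-in class with the same two fields. The second document, the date value, and the filter_by_metadata helper are all made up for illustration; a real vector DB applies such filters inside its own search call.

```python
try:
    # The class used in this walkthrough, if LangChain is installed.
    from langchain_core.documents import Document
except ImportError:
    from dataclasses import dataclass, field

    @dataclass
    class Document:
        """Minimal stand-in exposing the same two fields as LangChain's Document."""
        page_content: str
        metadata: dict = field(default_factory=dict)


docs = [
    Document(
        page_content="This is the main text content I am using to create RAG.",
        metadata={
            "source": "example.txt",
            "pages": 1,
            "author": "Krish Naik",
            "date_created": "2025-01-01",  # illustrative value only
        },
    ),
    Document(
        page_content="Notes on a different topic.",
        metadata={"source": "other.txt", "pages": 3, "author": "Someone Else"},
    ),
]


def filter_by_metadata(documents, **criteria):
    """Hypothetical helper: keep documents whose metadata matches every key=value pair."""
    return [
        doc for doc in documents
        if all(doc.metadata.get(key) == value for key, value in criteria.items())
    ]


matches = filter_by_metadata(docs, author="Krish Naik")
print(matches[0].metadata["source"], "->", matches[0].page_content)
```

In a real pipeline the filter narrows down the candidate documents and the similarity search ranks what is left.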
To create a simple txt file, I will import os and call os.makedirs with data/text_files. So I'm creating this folder, and if it already exists, I tell it not to do anything. As soon as I execute it, you can see that the folder was created inside the notebook folder. I'll remove that and write ../ in front of the path. Now you can see that text_files is present under data. Next, let me create the text files with Python code. See guys, this is all basic Python code. I don't want to type out each and every line and make the video very long; our aim should be to understand the concepts, quickly look at multiple use cases, and then implement them. Here you can see I have created sample_texts. Each key is a file name, such as ../data/text_files/python_intro.txt, and each value is the content for that file; here I have my Python content. Then I say: for file_path, content in sample_texts.items(), open the file and write the content. The file path is just my file name, so if the file is not there, it will create python_intro.txt. Now if I execute this, it says no such directory. Let me check the path: python_intro, text_files. I have to fix the path, because there are two entries here. So I'll add the missing dot.
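The file-creation step can be sketched roughly as follows. Two things here are assumptions: the file contents are placeholders rather than the text used in the video, and the files are written under a temporary directory so the snippet runs anywhere, whereas the walkthrough writes to ../data/text_files relative to the notebook folder.

```python
import os
import tempfile

# Assumption: use a throwaway base directory instead of ../data so this runs anywhere.
base_dir = tempfile.mkdtemp()
text_dir = os.path.join(base_dir, "data", "text_files")

# exist_ok=True means "do nothing if the folder is already there".
os.makedirs(text_dir, exist_ok=True)

# Keys are file paths, values are the text to write (placeholder content).
sample_texts = {
    os.path.join(text_dir, "python_intro.txt"): "Python is a high-level programming language.",
    os.path.join(text_dir, "machine_learning.txt"): "Machine learning lets systems learn from data.",
}

for file_path, content in sample_texts.items():
    # Mode "w" creates the file if it does not exist yet.
    with open(file_path, "w", encoding="utf-8") as f:
        f.write(content)

created = sorted(os.listdir(text_dir))
print(created)
```

The "no such directory" error mentioned above is what you get when the path in the dictionary keys does not match the folder that os.makedirs created, which is why building both from the same variable helps.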
Now you can see that my sample files have been created: machine_learning.txt and python_intro.txt. I could have created these sample files manually instead of with code, but I wanted to show you everything. Now I will show you how to read this text using TextLoader. One of the loaders present in LangChain is TextLoader. So I write from langchain.document_loaders import TextLoader. We have imported TextLoader. And if you don't want to use this import, there is another one: LangChain keeps moving things around in its library, so we also use from langchain_community.document_loaders import TextLoader. You can use either of them unless you get a deprecation warning. Now, how do we read the text? I write loader = TextLoader(...) and give the path: we go to the parent folder, then data/text_files/python_intro.txt. So I have given the name of the file we created, and we can also pass encoding as UTF-8. Once I do this and print loader, what I get is a TextLoader object. To get the content, I use loader.load(), and that gives me the documents. Let's print them: print(document). Here you can see that in the document you get the metadata, all the information, and this is your page content. Now this is what it is doing, right?
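To see what the loader is doing conceptually, here is a plain-Python stand-in. It is not LangChain's implementation: load_text_file and the small Document class are hypothetical, written only to show that loading a text file amounts to reading it and wrapping the text plus a source entry in a document, returned as a list. The sample file is created in a temporary folder so the snippet is self-contained.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in with the same two fields as LangChain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_file(path: str, encoding: str = "utf-8") -> list[Document]:
    """Hypothetical helper: read one file and wrap it as a single Document in a list."""
    with open(path, encoding=encoding) as f:
        text = f.read()
    return [Document(page_content=text, metadata={"source": path})]


# Create a small sample file so there is something to load.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "python_intro.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Python is a popular programming language.")

documents = load_text_file(path)
print(documents[0].metadata)
print(documents[0].page_content)
```

With LangChain installed, TextLoader(path, encoding="utf-8").load() plays the role of load_text_file here.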
TextLoader by default gives you the data in the document structure as soon as it reads the file. And the best part is that some of the metadata is filled in for you, such as the source. You can still manually add more information to the metadata, but by default, whenever you use these libraries, you get the content in the document structure, which is really good, because the document structure has the two important things: the metadata and the page content. So that was TextLoader: I have read the file with it and got the result this way. Now one more way: I will show you DirectoryLoader. If I have all the important files in my directory, can I read them all at once? For this, let's use one more class, called DirectoryLoader. Here you can see from langchain_community.document_loaders import DirectoryLoader. In DirectoryLoader I give the directory path, which again should start from the parent folder. Here I have given the pattern to match: this class lets you give a glob pattern to match the files. Then you can set loader_cls, which says which loader to use for the files you are planning to load; if they are PDFs, you can directly use a PDF loader. So I can also put PDF files in here, and I can provide the patterns in the form of a list so that it reads both kinds of content. You can see that here too I'm passing the encoding and so on. Then I call load() on the directory loader, and you will be able to see the documents.
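The directory idea can be sketched the same way in plain Python. Again this is a stand-in under assumptions, not LangChain's code: load_directory and Document are hypothetical, the three sample files are made up, and the glob pattern plays the role of the pattern passed to the directory loader.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Stand-in with the same two fields as LangChain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_directory(directory: str, pattern: str = "**/*.txt", encoding: str = "utf-8") -> list[Document]:
    """Hypothetical helper: one Document per file that matches the glob pattern."""
    docs = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            docs.append(
                Document(
                    page_content=path.read_text(encoding=encoding),
                    metadata={"source": str(path)},
                )
            )
    return docs


# Sample folder: two .txt files that match the pattern and one .md file that does not.
root = Path(tempfile.mkdtemp())
(root / "python_intro.txt").write_text("Python basics.", encoding="utf-8")
(root / "machine_learning.txt").write_text("ML basics.", encoding="utf-8")
(root / "notes.md").write_text("Not matched by the pattern.", encoding="utf-8")

documents = load_directory(str(root))
print(len(documents), "documents loaded")
```

The number of documents equals the number of files that matched the pattern, which is the behaviour to expect from a directory loader over plain text files.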
Now if you print the documents, you should be able to see them. Okay, I'm getting an error: to log the progress, please run pip install tqdm. That is because we enabled show_progress=True. Let me set it to False so that I don't need to install anything. Now you can clearly see that there were two .txt files and I got two documents. Later you can do chunking and everything else on top of these documents. Next, let me bring in some PDF files as well. I have some example PDFs, so let me copy them and paste them into the data folder. Now inside data I have the text files and the PDF files, and my aim is to read both. Here I have the attention PDF and a couple of others. Let me copy the same code and paste it here; this one will be for the PDFs. For PDFs I was going to import the PyPDF loader from langchain_core.document_loaders, but it is not available there. Checking the documentation, it lives in langchain_community.document_loaders, and there are two loaders: PyPDFLoader and PyMuPDFLoader. PyMuPDFLoader is better when compared to PyPDFLoader. The documentation says PyPDFLoader loads and parses a PDF file using the pypdf library, and PyMuPDFLoader loads and parses a PDF file using PyMuPDF; you can compare which one works better for you at a later stage. Now I will give the path over here. 
So the path is the PDF folder inside data, and in the pattern I write pdf instead of txt. Instead of TextLoader I will use PyMuPDFLoader. I'll leave the encoding argument in for now, and then I write pdf_documents = the directory loader's load(). If I look at pdf_documents... okay, I'm getting an error: get_text() got an unexpected argument. Let's remove the encoding; we don't need to apply any encoding here. Now you can see I have got all my documents. How many files were in the PDF folder? The attention PDF, the embedding PDF and the object detection PDF. These are research papers. And the best part of using PyMuPDF is that the metadata is much richer: creation date, source, file path, total pages, format. See, total pages is 15 for the first one, then 27, then 21. For PDFs I created myself you will also see things like the author's name. It tries to bring in all the source information, and then you have the page content. At the end of the day, whichever of these libraries we use, we get the result in the Document structure, as a list of documents. If I check type(pdf_documents[0]), you can see it is a Document. That is the most important thing. Now that we understand the Document structure and know how to read PDF and .txt files, don't you think you can easily work out how to read Excel, database, or any other kind of 
files? That is the task for you. How will you do it? Just go to the LangChain document loaders page and you will find everything there. Try the loaders out and check whether the Document structure you get back is good. If you want to load from AWS S3, or from an S3 directory, install the corresponding library and use its loader, but you have to set up authentication first. Once that is done you can use whichever document loader you need, and the best thing is that in the end everything is converted into the Document data structure. With that, the data ingestion step is complete. Next I will move on to chunking and show you how it is done and the different ways of doing it. Then we will see how to convert the chunks into embeddings, using an open-source embedding model, and finally store them in a vector DB. I hope you have understood the data ingestion part; I have already told you why chunking is important, so let's move on to it. So guys, till now we have discussed the Document structure, and I have shown you how to read .txt and PDF files with TextLoader, PyPDFLoader and PyMuPDFLoader. 
For all other file types, you can look at the LangChain documentation, which lists the different document loaders I showed you earlier. Now we are going to go one step further. So far we have parsed the data and created the Document structure. Next I want to do the chunking, then the embedding, and finally the vectors produced from the text will be stored in a vector store. So let's start building this entire pipeline, beginning from the complete basics, since in this RAG series we are learning everything from scratch. I will create one more notebook for the PDF loader (an .ipynb file), select my kernel, and start the RAG pipeline, which here is the data ingestion to vector DB pipeline. As you know, I already have a data folder, and inside it a PDF folder with several PDF files. So the first thing I will do is create a function that reads all the documents in this folder, reads the content of each PDF file using PyPDFLoader, and converts it into Documents. Okay. 
Let me make this a code cell instead of markdown. Before going further I want to import the libraries I need: os, the PDF loaders such as PyPDFLoader from langchain_community.document_loaders, and RecursiveCharacterTextSplitter from langchain.text_splitter. I'll just use this same file and execute the cell so that all these imports are available. My first step is data ingestion, which here means reading all the PDFs inside the directory. Now guys, you need some coding knowledge here, because if I write this line by line it will take a lot of time. We are going to create a function called process_all_pdfs, which takes the PDF directory. Inside it we build a path for the PDF directory relative to the workspace location, so I do need the Path import after all. Then we apply a glob pattern to collect all the PDF files under that directory, print how many PDF files were found, and process each file in turn. 
For each file I create PyPDFLoader(str(pdf_file)) and call documents = loader.load() to get the documents. Then I add some extra metadata: for every doc, doc.metadata['source_file'] is set to the PDF file name and doc.metadata['file_type'] is set to 'pdf'. These are new keys in the metadata, and you can keep adding as many as you like. The documents are then added to all_documents, which starts out as an empty list, and at the end the function returns all_documents. So this function reads every PDF file in a folder, reads its content, adds this metadata, and collects everything in one list. Now we call process_all_pdfs and pass the data folder. When I execute it, you can see it found four PDF files: the attention PDF had 15 pages, the embedding PDF had 27 pages, the object detection PDF had 21 pages, and the proposal PDF had one page. Now let me check the loaded documents. 
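The shape of that function can be sketched as follows. To keep the sketch runnable without any PDF library, the per-file loader is injected as a parameter: `fake_load_pdf` is a hypothetical placeholder that pretends every PDF has two pages, where the video uses PyPDFLoader. The `Document` dataclass is likewise a stand-in.

```python
from dataclasses import dataclass, field
from pathlib import Path
import tempfile


@dataclass
class Document:
    """Hypothetical stand-in for LangChain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def fake_load_pdf(path):
    """Hypothetical loader: pretends every PDF has two pages."""
    name = Path(path).name
    return [Document(f"page {n} of {name}", {"source": path, "page": n}) for n in range(2)]


def process_all_pdfs(pdf_directory, load_pdf=fake_load_pdf):
    """Load every PDF under a directory and tag each page with extra metadata."""
    all_documents = []
    pdf_files = sorted(Path(pdf_directory).glob("**/*.pdf"))
    print(f"Found {len(pdf_files)} PDF files to process")
    for pdf_file in pdf_files:
        documents = load_pdf(str(pdf_file))
        for doc in documents:
            # Extra keys added on top of whatever the loader already provides.
            doc.metadata["source_file"] = pdf_file.name
            doc.metadata["file_type"] = "pdf"
        all_documents.extend(documents)
        print(f"Loaded {len(documents)} pages from {pdf_file.name}")
    return all_documents


# Usage with empty placeholder files; a real run would point at actual PDFs.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("attention.pdf", "embeddings.pdf"):
        (Path(tmp) / name).write_bytes(b"")
    all_pdf_documents = process_all_pdfs(tmp)
```

Injecting the loader also makes it easy to swap one PDF loader for another without touching the directory-walking logic.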
If I look at the all_pdf_documents variable, you can see it is a list of Documents. For every PDF you get some metadata by default: author, keywords, modification date and so on. Along with the source and total pages, you can see the source_file and file_type keys we added, and the text itself is in page_content. So whatever the size of the PDF, we are able to read it. That step is done. The next step is chunking. I have my list of documents, so I will create a function for splitting text into chunks, called split_documents. Its first parameter is documents, then chunk_size=1000, then chunk_overlap=200. The chunking itself is simple: we use RecursiveCharacterTextSplitter, which we already imported at the top from langchain.text_splitter. Inside the function we create a text_splitter, a RecursiveCharacterTextSplitter that recursively splits the documents using a chunk size of 1,000 and a chunk overlap of 200. 
Chunk overlap means that some amount of text is shared between two consecutive chunks when we do the splitting. Here you can see we are also passing separators: this one is a blank space, this one is a new line separator, and you tell me in the comment section what this other separator is. We can use different separators, a comma for example. We will see different chunking strategies later, but let's first build this one pipeline so that you get a clear idea of how it works end to end. Once we have the text splitter, we call text_splitter.split_documents(documents); the other parameters keep the defaults we set. After the split, I print the first 200 characters of the page content of a chunk along with its metadata, and the function returns the split documents. Now let's use it: chunks = split_documents(all_pdf_documents). If I print the chunks, you can see all my data has been chunked: we split 64 documents into 359 chunks. 
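To see what chunk size and chunk overlap mean in practice, here is a simplified sliding-window splitter. It is deliberately not the recursive, separator-aware algorithm that RecursiveCharacterTextSplitter uses; `split_text` is a hypothetical helper that only illustrates how a 1,000-character window with a 200-character overlap moves across a text.

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Usage: a 2,500-character text of repeating digits.
text = "".join(str(i % 10) for i in range(2500))
chunks = split_text(text)

print([len(chunk) for chunk in chunks])
print(chunks[0][-200:] == chunks[1][:200])
```

With these numbers each new chunk starts 800 characters after the previous one, so the last 200 characters of one chunk are repeated at the start of the next.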
That means we have converted all our text into smaller chunks based on the chunk size and the overlap: 359 chunks in total. Initially we had only 64 documents, because every page gets its own Document. Perfect, the splitting part is done. The next step is quite interesting. If you look at the pipeline, we have done the chunking, but the two most important steps remain: embedding generation and the vector store. For embeddings you can use any model, but I will focus on an open-source model so that everybody can try it out. I will also use modular coding here and create classes. For embeddings I will create a separate class with different functions, because embedding means converting text into vectors, so I may define functions like loading the model and generating embeddings. The vector DB will be a separate class as well. So let me write a markdown heading: embedding and vector store DB. These are the two modules we are going to implement. First of all I need some libraries. For embeddings we are going to use sentence transformers. 
We are going to use a model that is available on Hugging Face, and for that I will use the sentence-transformers library. Along with this I also want a vector store. The one listed here is faiss-cpu; you can use FAISS, or you can use ChromaDB. Both are very good open-source vector stores, and these libraries are more than sufficient to get started. Let me install them with uv add -r requirements.txt. The installation takes some time because it pulls in the entire transformers stack, and here you can see it has finished. Back in the notebook, the first step is to import what I need: numpy; SentenceTransformer from sentence_transformers, which gives us the embedding model; chromadb and its Settings; and uuid, because every record we insert into the vector DB needs an ID and we will generate it ourselves. I am also importing List, Dict, Any and Tuple, and since we are going to apply cosine similarity during retrieval from the vector DB, I am importing that from scikit-learn as well. Let's execute this, and meanwhile I will create a few more cells. As I said, for embeddings I will write a separate class called EmbeddingManager, which is responsible for the embedding part. 
For every class we create, we write an __init__ function; this is my constructor. The class handles document embedding generation using SentenceTransformer. In the constructor we take the model name, which here is all-MiniLM-L6-v2. This model is available on Hugging Face, it converts text into vectors, and it gives you 384 dimensions. So model_name is the Hugging Face model name for sentence embeddings. We store the model name and set self.model = None, because it gets its value in the next function, _load_model. This function is very important, and its job is very simple: it loads the all-MiniLM-L6-v2 model. Why the leading underscore? It marks the method as protected: by convention it is meant to be used only inside the class, although Python does not enforce this. Inside it we write self.model = SentenceTransformer(self.model_name), so the model gets loaded, and we also print the dimension using get_sentence_embedding_dimension(), which for this model is 384. That means every text will be converted into a 384-dimensional vector. So we have the __init__ function and _load_model. 
One more function we need is generate_embeddings. It takes texts, a list of strings, and returns a numpy array: it generates the embeddings for a list of texts. All it does is call self.model.encode on the list of texts, with show_progress_bar=True so that we can see the progress bar, and return the embeddings. So generate_embeddings is one function and _load_model is another, and we have used get_sentence_embedding_dimension() just to read the dimension. You could also wrap that in its own function, get_embedding_dimension(self), which simply returns self.model.get_sentence_embedding_dimension(). It is not necessary, since the same call can be written inline, so I will remove it. Perfect, I have these important functions. Now we can initialize the embedding manager: embedding_manager = EmbeddingManager(). Note that the class name has no underscore. When I execute this, it runs the constructor, and you can see it is loading the embedding model. 
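The class layout described above (constructor, a private load step, and a generate_embeddings method) can be sketched without downloading any model. In this sketch the real sentence-transformers model is replaced by a toy hashing function, so the vectors carry no semantic meaning; `ToyEmbeddingManager`, `_hash_encode` and the model name string are all hypothetical and exist only to show the structure and the fixed 384-dimensional output.

```python
import hashlib
import math


class ToyEmbeddingManager:
    """Same layout as the class in the video, with a toy model swapped in."""

    def __init__(self, model_name="toy-hash-model", dimension=384):
        self.model_name = model_name
        self.dimension = dimension
        self.model = None
        self._load_model()

    def _load_model(self):
        # The video loads SentenceTransformer(self.model_name) at this point.
        # The toy version just points at a local hashing function.
        self.model = self._hash_encode

    def _hash_encode(self, text):
        """Map each token to one of `dimension` buckets, then L2-normalize."""
        vector = [0.0] * self.dimension
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vector[int.from_bytes(digest[:4], "big") % self.dimension] += 1.0
        norm = math.sqrt(sum(v * v for v in vector))
        return [v / norm for v in vector] if norm else vector

    def generate_embeddings(self, texts):
        """Return one fixed-length vector per input text."""
        if self.model is None:
            raise ValueError("Model not loaded")
        return [self.model(text) for text in texts]


embedding_manager = ToyEmbeddingManager()
embeddings = embedding_manager.generate_embeddings(
    ["retrieval augmented generation", "vector databases store embeddings"]
)
print(len(embeddings), len(embeddings[0]))
```

Whatever the input length, every text comes back as a vector of the same dimension, which is the property the vector store relies on.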
The output shows that the all-MiniLM-L6-v2 model loaded successfully and that the dimension is 384. So creating the object loads the model, and my embedding manager now holds it. Great, the model is ready. If you look at the diagram, this entire class is done, and the embedding model just needs to be used. Now we move to the next step and create a similar class for the vector store. The vector store is the vector database where you store all the vectors produced by the embedding layer, so that you can run similarity search on them. So let me define a class called VectorStore. Remember guys, the code I am showing you is simple, but you need some coding knowledge if you really want to get better at RAG. In the VectorStore class we again have an __init__ method. We pass a collection name, which here is pdf_documents, and a persist directory, which is a folder inside my data folder. Persist directory means that whatever vector store gets created will be saved to the hard disk. So in the constructor I store the collection name and the persist directory, set self.collection = None, and then call a function that initializes the store. 
So we need another function for that; just observe the code. Here we initialize the ChromaDB client and the collection. First we call os.makedirs on self.persist_directory: if the directory already exists we leave it as it is, otherwise a new one is created. Then we create the client, self.client, using chromadb.PersistentClient and passing the persist directory. This gives us a client that holds a reference to the Chroma vector store. Then we create the collection: self.collection = self.client.get_or_create_collection(...), passing the collection name and some metadata describing the collection. A collection is simply where the vectors are stored inside the vector store, under that collection name. Finally we print the collection name and the collection count. As soon as we execute this, the ChromaDB client is ready and the collection is created. Now, once we have a collection we need to add documents to it, so we create another function called add_documents. It takes the list of documents and the embeddings and adds them to the vector store. The first thing it does is check whether the number of documents matches the number of embeddings, that is, whether len(documents) != len(embeddings), and that case is handled right at the start. 
Now we prepare the data for ChromaDB. We need ids, metadatas, document texts and an embeddings list. I zip the documents with the embeddings, so each iteration gives me a tuple of one document and its embedding, and for each one I create a UUID. Why a UUID? Because it acts as the ID for that specific record. That becomes my doc_id, and I append it to the ids list. Then we prepare the metadata: we take whatever doc.metadata already has, since we are iterating over the documents, and add a bit more, namely doc_index and content_length, before putting it into the vector DB. Then we take the document content from doc.page_content, and we take the embedding and convert it to a list. Two inputs are required by this function: the documents, and the embeddings, which is an np.ndarray coming from the earlier generate_embeddings function. It is all linked. That is the reason for writing this as classes: I want to link each stage of the pipeline. So we write embeddings_list.append(embedding.tolist()). Now we have the ids, the embeddings list, the metadatas and the document texts, and we add all of them to the collection through the matching parameters. Finally it prints how many documents were inserted. Now let's initialize the vector store by creating a VectorStore object and assigning it to a variable. 
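The data preparation in add_documents can be sketched independently of ChromaDB. In this sketch `InMemoryCollection` is a hypothetical stand-in whose `add` method takes the same four keyword arguments named above (ids, embeddings, metadatas, documents); the ID format and the use of ValueError for the length check are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for LangChain's Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class InMemoryCollection:
    """Hypothetical stand-in for a vector store collection."""

    def __init__(self):
        self.records = {}

    def add(self, ids, embeddings, metadatas, documents):
        for doc_id, emb, meta, text in zip(ids, embeddings, metadatas, documents):
            self.records[doc_id] = {"embedding": emb, "metadata": meta, "document": text}

    def count(self):
        return len(self.records)


def add_documents(collection, documents, embeddings):
    """Build parallel lists of ids, metadata, texts and vectors, then insert them."""
    if len(documents) != len(embeddings):
        raise ValueError("Number of documents must match number of embeddings")
    ids, metadatas, texts, embeddings_list = [], [], [], []
    for i, (doc, embedding) in enumerate(zip(documents, embeddings)):
        ids.append(f"doc_{uuid.uuid4().hex[:8]}_{i}")  # unique ID per record
        metadata = dict(doc.metadata)                   # copy, then enrich
        metadata["doc_index"] = i
        metadata["content_length"] = len(doc.page_content)
        metadatas.append(metadata)
        texts.append(doc.page_content)
        embeddings_list.append(list(embedding))         # plain list, like .tolist()
    collection.add(ids=ids, embeddings=embeddings_list, metadatas=metadatas, documents=texts)
    return ids


collection = InMemoryCollection()
docs = [Document("first chunk", {"source_file": "a.pdf"}), Document("second chunk", {"source_file": "a.pdf"})]
ids = add_documents(collection, docs, [[0.1, 0.2], [0.3, 0.4]])
print(collection.count())
```

Keeping the four lists index-aligned is the whole contract: record i in each list must describe the same chunk.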
This initializes the entire vector store. You can see the collection name, and the existing document count in the collection is zero, since we have not added any records yet. To add records we have to call the add_documents function, so let's do that. You know I have already split the documents into chunks, and the variable they were saved in is chunks. What I am going to do is extract the text from every chunk and generate embeddings for it. For that I use a list comprehension: texts = [doc.page_content for doc in chunks]. So I iterate over the chunks, take each doc.page_content, and collect them in the texts variable. Once I execute this, texts holds all the text I have, and this is what I pass to the embedding manager I created earlier. From this we will generate the embeddings. 
How do we generate the embeddings? Very simple: we use the embedding manager object we created earlier and call its generate_embeddings function, passing texts, which is a list of strings. After converting the text into embeddings, we store everything in the vector database. For that I take the vector store variable we created and call add_documents on it, all lowercase, and pass it the chunks, which are the full documents, together with the embeddings we just generated. Let's execute this. You can see the embedding running for the 359 texts, processed in a number of batches. Then I get an error: the vector store variable is not defined. Why? Let me check what I defined above. Okay, the variable was spelled differently there, so I fix the spelling and execute again. Perfect. Now the total number of documents in the collection is 359. And if you look inside my data folder, next to the notebook, there is now something called vector store, because we persisted it there. Persisted means it is now saved on the hard disk. 
We can simply load it from disk and run whatever we need. Perfect. With that, the entire ingestion pipeline is complete: all the data is now available in the vector store in the form of vectors. The main question now is how we perform retrieval. In retrieval, whenever we have a user query, we take that query, convert it into embeddings again, and then hit the vector store through a retriever. Only then do we get the context. So in our example we will first get this far: we have a user query, we convert that query into embeddings, we hit the vector store, and we get the context. For this pipeline we will create a RAG retriever. Till now we have created the embedding manager and the vector store; now I'll create another piece, a RAG retriever, just to get the relevant context. So guys, let's create the RAG retriever pipeline. First, I create a class called RAGRetriever, which handles query-based retrieval from the vector store. The constructor takes two important parameters: the vector store and the embedding manager. If you remember, we have already created both: the embedding manager and the vector store manager.
After that, we initialize two instance variables, vector_store and embedding_manager, and assign these parameters to them. One thing you really need to understand: a retriever is built on top of a vector store. It is nothing but a simple interface connected to the vector store: given a query, the retriever gives you the matching results back. The next step is to create another function called retrieve. This is really important, because its main job is to retrieve documents for a specific query. Writing it out live would take a lot of time, so let's walk through the function instead. The retrieve function takes the query, top_k, which is how many top results we want, and a score threshold, which is 0.0 by default, and it returns a list of results. The docstring says: retrieve relevant documents for a query; the arguments are the search query, the top k documents, and the score threshold; and it returns a list of dictionaries containing the retrieved documents and their metadata. At the end of the day, this function helps us get the context. You can see that we use the same self.embedding_manager and call its generate_embeddings function. If you remember, generate_embeddings is already defined in my embedding manager. If I scroll to the top, here it is: it simply calls model.encode on the text and converts it into embeddings.
That is why we use it here. Let me go back down to retrieve. Whenever we get a query, it first needs to be converted into an embedding so that we can do a similarity search in the retriever. So first the query is converted into a vector with embedding_manager.generate_embeddings. Then we use vector_store.collection and call query on it, passing the query embedding as a list along with the number of top results. This hits whichever vector DB we initialized and gives us the results. Inside the results there is a key called documents, and you can also get the metadata, the distances, and the ids. We use all of these: we zip the ids, documents, metadata, and distances, which means we pair them up into tuples, one per result. For every result we compute 1 minus the distance, which gives the similarity score, that is, how similar the stored text is to the query. If the similarity score clears the threshold, we add the result to our context documents, which are collected in the retrieved_docs variable that we initialized as an empty list.
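The post-processing half of retrieve, zipping the result fields and filtering on 1 minus distance, can be sketched like this. It is a minimal sketch, not the code from the video: the function name to_retrieved_docs and the output keys are hypothetical, and the nested-list result layout (one inner list per query) is assumed to follow the ChromaDB query result format.

```python
def to_retrieved_docs(results, score_threshold=0.0):
    """Turn a ChromaDB-style query result into a filtered list of dicts."""
    retrieved_docs = []
    # Results are nested: one inner list per query; we sent a single query.
    if not results.get("documents") or not results["documents"][0]:
        return retrieved_docs

    rows = zip(
        results["ids"][0],
        results["documents"][0],
        results["metadatas"][0],
        results["distances"][0],
    )
    for rank, (doc_id, content, metadata, distance) in enumerate(rows, start=1):
        similarity = 1 - distance  # smaller distance means more similar
        if similarity >= score_threshold:
            retrieved_docs.append({
                "id": doc_id,
                "content": content,
                "metadata": metadata,
                "similarity_score": similarity,
                "distance": distance,
                "rank": rank,
            })
    return retrieved_docs


# Hand-written sample in the assumed result layout.
sample = {
    "ids": [["a", "b", "c"]],
    "documents": [["attention text", "embedding text", "unrelated text"]],
    "metadatas": [[{"page": 1}, {"page": 2}, {"page": 3}]],
    "distances": [[0.2, 0.5, 0.9]],
}
docs = to_retrieved_docs(sample, score_threshold=0.4)
print([d["id"] for d in docs])
```

Note that 1 minus distance only reads as a similarity when the collection uses a distance such as cosine distance; that is an assumption of this sketch. The sketch also treats a score equal to the threshold as a match, so the default threshold of 0.0 keeps results whose similarity is exactly zero.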
We add all of that information so that we can inspect it, and finally we return retrieved_docs. Step by step, we are not doing anything very complex: we get the user query, convert it into embeddings, hit the vector store, get the response, and put that context into a list. If you read the code, that is exactly how it happens. So this is one of the very important functions. Now I can create a variable called rag_retriever by instantiating this class. I pass it the vector store I defined earlier, which is my vector store manager, and then my embedding manager. Okay, it should be vector_store, so I fix the name. Now rag_retriever is an object of this class, and I can call its retrieve function with a query. So I write rag_retriever.query, sorry, rag_retriever.retrieve, and give it a query. Let's test a specific query: what is attention is all you need? I know my data contains a PDF called attention, and there are also a proposal and some embedding files. As soon as I ask what is attention is all you need, it prints the top k value and the rest of the log, and it shows that it generated an embedding for one text.
The embedding shape is (1, 384), because I used the all-MiniLM-L6-v2 embedding model, which produces 384-dimensional vectors. When we call this function, it gets the results and prints them, and at the end it returns retrieved_docs. So in short, this function gives me all the retrieved docs. Here they are: you can see the content, the metadata, the author. This is my context information. You can see the text: an attention function can be described as mapping a query and a set of key-value pairs, and so on. This entire thing is the context. So, going back to the diagram, you can see that we are easily able to get the context. Now let's try a few more things. I will open one of the PDFs, a very recent research paper, an embedding technical report. We'll search for a topic from it. Under embedding model training, I'll pick unified multi-task learning framework, because we have ingested this document as well. I add one more cell, copy the same code, and give the query unified multi-task learning framework. When I execute it, you can see that I get results back, with content about benchmark rankings and the effectiveness of the approach.
So we get the response very quickly, and it comes straight from the vector store. Let me tell you, this is the easiest way to see how things work. If you have built everything up to here, the next step is simply to integrate an LLM with this context: you take the retrieved context and give it to the LLM, and that is what we will see in the next video. In this video we saw the complete RAG pipeline from data ingestion to the vector DB. You can now write any kind of query. You can see that the similarity score and the distance are also returned along with the rest of the information, and we have used modular coding as well. In the next step I will take this vector store and move on to the next integration, the LLM and the output, which I call the retrieval pipeline. But the entire data ingestion pipeline, together with query retrieval, is now built.
Those are the next two steps. After that, we will convert the code we have written here into modular code and see how to put it inside our source folder. We'll create a source folder, and I will show you how to turn this whole flow into a pipeline, meaning that everything from data ingestion to vector embedding can be called in sequence. Hello guys, we are going to continue our discussion of RAG. Till now we have covered the entire data ingestion pipeline, and with a user query we are also able to retrieve the context. We have completely implemented the first pipeline, the data ingestion pipeline: we ingested the data, did the chunking, converted the text into vectors, stored everything inside a vector DB, and persisted it to a local directory so that we can read it back whenever we want for a given query. Now we move to the second pipeline, the query retrieval pipeline, where we also bring in an LLM. The LLM will help us generate a summarized output in RAG. The entire pipeline will look something like this. When we talk about the query retrieval pipeline, we are really talking about augmented generation. RAG stands for retrieval augmented generation. So how does the augmented generation part work?
Let's assume the vector DB is already ready. You know how I created it: by following the ingestion pipeline, after which the data is stored inside the vector DB. Now, whenever a user gives a new query related to the documents already ingested into the vector DB, we take that query and apply the same embedding model to convert it into a vector. With that vector we hit the vector DB and get the context. Then we give the context to the LLM along with a simple prompt. The prompt is just an instruction that tells the LLM how it should behave. This step, where we take the context and combine it with a prompt, is called augmentation. Finally, we generate the output from the LLM, and that step is generation. The step before both of them is retrieval, where we take a query, convert it into a vector, and hit the vector DB. You really need to understand these concepts to understand RAG. So let's implement the entire query retrieval pipeline along with the LLM, and we will also set up the LLM here. So guys, let's do the practical implementation. We are going to integrate the vector DB context pipeline with the LLM output, that is, implement the augmentation and generation steps. First of all, I am going to use my Groq API key.
I have updated the Groq API key in the .env file, and we are going to create a simple RAG pipeline with the Groq LLM. First, if you remember our requirements.txt, we need two libraries there: langchain-groq and python-dotenv. Then we write from langchain_groq import ChatGroq. Along with this I also import os, and from dotenv I import load_dotenv so that we load all the environment variables. Next we initialize the Groq LLM and set the Groq API key. Here you can see that I read the key with os.getenv. My suggestion: if reading it with os.getenv gives you trouble at first, you can test by pasting the key directly here. So I paste it here for now; you replace it with your own. I am doing this only for testing purposes.
Now we initialize our LLM with ChatGroq. I pass groq_api_key equal to the Groq API key, the model name is a Gemma 2 model, the temperature is 0.1, and the maximum number of tokens to generate is 1024. So that is my LLM: the Groq LLM is initialized. The second thing is to create a simple RAG function that ties everything together: retrieve the context, then generate the response. If you remember, guys, we already have our retriever from the previous session, where we created the RAGRetriever class. So we define a function called rag_simple that takes the query, the retriever, the llm, and top_k equal to 3. First we retrieve the context: results is equal to retriever.retrieve with the query and top_k equal to top_k. Then I define my context: whatever content I get in the results, I combine it all into one string. For each doc in results I take its content and join the pieces with a double newline. If the results are empty, the context stays an empty string. So that is my context.
Then I add one more condition: if not context, we return a message saying no relevant context found to answer the question. Otherwise we generate the answer using the Groq LLM. For that I obviously need a prompt. I could use a prompt template again, or I can write the prompt directly here. The prompt tells the LLM what it needs to do: answer this specific question from the context. So I paste it in: use the following context to answer the question concisely. I'll just fix the indentation with a tab. The prompt contains the context and the query. Next we create the response: response is equal to llm.invoke, and inside it we call prompt.format with context equal to context and query equal to query. Then we return response.content. Once this is done, we can call the function. I write answer is equal to rag_simple and ask a question: what is attention mechanism? I pass my rag_retriever along with the llm, and then we print the answer.
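The rag_simple function walked through above can be sketched as follows. The prompt wording and the retriever and llm interfaces (retrieve returning dicts with a content key, invoke returning an object with a content attribute) follow the narration; FakeRetriever and FakeLLM are hypothetical stubs so the sketch runs without an API key, and the real pipeline would pass the RAG retriever and the Groq LLM instead.

```python
from types import SimpleNamespace

PROMPT = """Use the following context to answer the question concisely.

Context:
{context}

Question: {query}

Answer:"""


def rag_simple(query, retriever, llm, top_k=3):
    # Retrieval: fetch the most relevant chunks for the query.
    results = retriever.retrieve(query, top_k=top_k)
    context = "\n\n".join(doc["content"] for doc in results) if results else ""
    if not context:
        return "No relevant context found to answer the question."
    # Augmentation: combine context and query in one prompt.
    prompt = PROMPT.format(context=context, query=query)
    # Generation: the LLM produces the final answer.
    response = llm.invoke(prompt)
    return response.content


class FakeRetriever:
    """Hypothetical stub standing in for the RAG retriever."""

    def __init__(self, docs):
        self.docs = docs

    def retrieve(self, query, top_k=3):
        return self.docs[:top_k]


class FakeLLM:
    """Hypothetical stub standing in for a chat model."""

    def invoke(self, prompt):
        return SimpleNamespace(content=f"(stub answer from a {len(prompt)}-character prompt)")


docs = [{"content": "An attention function maps a query and a set of key-value pairs to an output."}]
answer = rag_simple("What is attention mechanism?", FakeRetriever(docs), FakeLLM())
print(answer)
print(rag_simple("What is attention mechanism?", FakeRetriever([]), FakeLLM()))
```

Because the function only relies on retrieve and invoke, any retriever or chat model that exposes those two methods can be swapped in without changing rag_simple itself.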
You can see the answer: the attention mechanism is a function that maps a query, and so on. This is really good. It is a very simple pipeline: I initialized my LLM, defined a function, and that function first calls the retrieve function of the RAG retriever, gets the context, combines it, and hits the LLM with the context and the prompt. If you remember the diagram, we are following exactly that process, and we get a proper output as long as the relevant information is available in the vector DB. Now, guys, we are going to enhance the simple RAG pipeline we just created so that it has more features. Here is the code for the enhanced RAG pipeline. We have a function called rag_advanced, which takes the query, retriever, llm, top_k for how many results we want, a minimum score, and return_context equal to False. Before, we were simply combining the context, putting it in the prompt, and generating the response. Here we build the same pipeline with some additional features. Which ones? In the simple version we get the answer directly, but we do not get much information about the sources or the context behind it.
So here we return the answer, the sources, a confidence score, and optionally the full context. The first part of the code is similar: we retrieve the results with retriever.retrieve. Then I have written if not results: if the results are empty, we return no relevant context found, with sources as an empty list, confidence as 0.0, and context as an empty string. The context comes from the vector DB. If we do get results, we combine them to prepare the context, and then we build the sources. Sources is a list, and for each doc in the results we add the source file from the metadata, the page number it came from, the similarity score, and a preview of the content, where I display up to 300 characters. Then we calculate the confidence, which we get from the similarity score of the docs. Here is my prompt: we give it the context and the query, invoke the LLM, and return the output in this format. Now let's execute the rag_advanced function. I have passed everything in: the question what is the attention mechanism, the rag_retriever, the llm, return_context equal to True, the minimum score, and so on. Let me execute this.
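A sketch of rag_advanced along the lines described above. Several details are assumptions rather than the code from the video: the default values of top_k and min_score, the metadata keys source_file and page, the prompt wording, and taking the confidence as the highest similarity score among the results. FakeRetriever and FakeLLM are hypothetical stubs.

```python
from types import SimpleNamespace


def rag_advanced(query, retriever, llm, top_k=5, min_score=0.2, return_context=False):
    results = retriever.retrieve(query, top_k=top_k, score_threshold=min_score)
    if not results:
        return {"answer": "No relevant context found.", "sources": [],
                "confidence": 0.0, "context": ""}

    context = "\n\n".join(doc["content"] for doc in results)
    sources = [
        {
            "source": doc["metadata"].get("source_file", "unknown"),  # assumed key
            "page": doc["metadata"].get("page", "unknown"),           # assumed key
            "score": doc["similarity_score"],
            "preview": doc["content"][:300],  # show at most 300 characters
        }
        for doc in results
    ]
    # Assumption: confidence is the best similarity score among the results.
    confidence = max(doc["similarity_score"] for doc in results)

    prompt = (
        "Use the following context to answer the question concisely.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\n\nAnswer:"
    )
    response = llm.invoke(prompt)

    output = {"answer": response.content, "sources": sources, "confidence": confidence}
    if return_context:
        output["context"] = context
    return output


class FakeRetriever:
    """Hypothetical stub: filters canned docs by score, like a real retriever would."""

    def __init__(self, docs):
        self.docs = docs

    def retrieve(self, query, top_k=5, score_threshold=0.0):
        hits = [d for d in self.docs if d["similarity_score"] >= score_threshold]
        return hits[:top_k]


class FakeLLM:
    """Hypothetical stub standing in for a chat model."""

    def invoke(self, prompt):
        return SimpleNamespace(content="(stub answer)")


docs = [
    {"content": "Attention maps a query to an output.", "similarity_score": 0.8,
     "metadata": {"source_file": "attention.pdf", "page": 2}},
    {"content": "Embeddings are vectors.", "similarity_score": 0.5,
     "metadata": {"source_file": "embeddings.pdf", "page": 4}},
]
result = rag_advanced("What is attention mechanism?", FakeRetriever(docs), FakeLLM(),
                      top_k=3, min_score=0.1, return_context=True)
print(result["answer"], result["confidence"], [s["source"] for s in result["sources"]])
```

Returning a dictionary instead of a bare string is what lets the caller show where an answer came from and how strongly the retrieved chunks matched the query.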
As soon as I ask what is attention mechanism, you can see that I get the answer, and it also gives me the source information: the page number, the score, and the preview. Along with that, here is the final information where we display the first 300 characters. Now let me change my question. Attention mechanism was one topic, so let me look at my data, my PDFs, and ask something else. I'll go to the embeddings PDF and search for something there. I will ask about hard negative mining techniques. So I put hard negative mining techniques as my question and search through my retriever. You can see that I get the whole answer back; the text mentions hard negative mining and NV-Retriever, and again I can see embedding.pdf, page 4, along with the context. This is really amazing. We have just created an enhanced RAG pipeline. Why do we call it enhanced? Because it returns the answer together with the sources, the confidence score, and so on. Now let me show you one more variant, which is also an advanced RAG pipeline, but this time I will ask you to go through the code yourself. What are we doing here? Streaming, citations, history, and summarization are all included. You can run a query and see the answer. Okay, the final answer is no relevant context found, maybe because that question is not covered. Let me change the minimum score to 0.1.
I think we should get something now. Still nothing. Let me change the question to hard negative mining techniques and display the output. I'll leave this one for you to explore, so that you read through at least some of the code. Okay, we are still not getting anything. Let's look at the call: the advanced RAG query, the top k value, summarize equal to True. No relevant context again. Let me ask what is attention is all you need and execute it. Now you can see the answers coming through. Yes, for some queries it returns nothing; there may be a problem with the context size, but that's okay. Try it with different queries. If something does not come through, we will optimize that as we go ahead. So here we have seen three RAG pipelines: a simple RAG pipeline, an enhanced RAG pipeline, and a last one with streaming, citations, history, and summarization. Go through the code and check it out; I think you will be able to understand it. I hope you were able to follow this video. That was the RAG pipeline. In the upcoming videos we will write modular code, because here everything lives in a single .ipynb file. So guys, now it's time to implement the entire RAG pipeline as a modular structure.
In our notebook, pdf_loader.ipynb, we have already discussed how to build the entire data ingestion step, how to store everything in the vector DB, and finally how to run a query. Along with that I have also shown you how to work with Typesense, an open-source vector store that is great for fast search. Now I will show you how to take everything we have implemented and integrate it as a pipeline in a modular way. We already have the source folder. Inside it I create an __init__.py file. The next step is to create all the important components required for the RAG pipeline. The first component is the data loader, data_loader.py. It comes first because we initially need to load the documents, do the chunking, and then store everything in the vector store. So in my data loader I will read all the required documents. The next step after that is the vector store. Which vector store are we going to use? For that I create another file inside source called vector_store.py. So that is my next file.
Along with this, while inserting anything into the vector store I also need to generate embeddings, and I will show you some open-source embeddings for that. So I create embedding.py, and the last file I want is search.py. The entire RAG pipeline needs to be integrated so that all of these files are linked together. First I will work on the data loader. Its job is simply to read the data, which can come from any source. For this I import some libraries, such as PyPDFLoader, TextLoader, and so on. I start here because I need to form a pipeline. The main code in this file should read all the documents, whether they are PDF, text, or CSV. I am also going to give you some assignments here, because we have covered these loaders throughout this series of videos. So I create a function called load_all_documents. Please have a look at it. The function is load_all_documents; it takes the data directory as a string and returns a list, a list of any data type.
The important thing about this function is that it loads all supported files from the data directory and converts them into the LangChain Document data structure. As soon as we read any kind of data, PDF, CSV, or TXT, we need to convert it into LangChain Documents; only then can we apply chunking. You can see that I build data_path from the data directory. I pass the data directory at runtime, and here it is simply the data folder. This is the code that reads all the PDF files. I have created a list called documents, which will store all the documents. Then I call data_path.glob with a glob pattern that matches all the PDF files, so it searches the data directory, including my PDF folder, for every PDF file, and we get those files back as a list. Then we write for pdf_file in pdf_files: we go through every PDF, use PyPDFLoader to read its content with loader.load, and extend documents with what we get. This is just the PDF example. You can do the same for text files and for CSV files. GitHub Copilot is already suggesting similar code, but I really want to give this to you as an assignment. It can be CSV files, SQL files, any kind of file you want to work with.
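The file discovery part of load_all_documents can be sketched with the standard library alone. In this sketch the loaders argument is a hypothetical addition that maps a glob pattern to a loading function, which makes it easy to add more file types for the assignment. The video uses PyPDFLoader for PDFs; here a plain-text loader that returns simple dictionaries stands in so the example runs without LangChain installed.

```python
from pathlib import Path
import tempfile


def load_all_documents(data_dir, loaders):
    """Load every file under data_dir that matches one of the glob patterns.

    loaders maps a glob pattern (for example "**/*.pdf") to a function that
    takes a Path and returns a list of documents.
    """
    data_path = Path(data_dir).resolve()
    documents = []
    for pattern, load in loaders.items():
        for file_path in sorted(data_path.glob(pattern)):  # "**" also searches subfolders
            try:
                documents.extend(load(file_path))
            except Exception as exc:
                # Skip unreadable files instead of stopping the whole run.
                print(f"Failed to load {file_path}: {exc}")
    return documents


def load_text(path):
    """Stand-in loader: returns one dict shaped like a document."""
    return [{"page_content": path.read_text(encoding="utf-8"),
             "metadata": {"source": str(path)}}]


# With LangChain installed, a PDF entry could look like this (not run here):
#   loaders = {"**/*.pdf": lambda p: PyPDFLoader(str(p)).load()}

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "notes").mkdir()
    (Path(tmp) / "notes" / "a.txt").write_text("hello", encoding="utf-8")
    (Path(tmp) / "b.txt").write_text("world", encoding="utf-8")
    (Path(tmp) / "skip.md").write_text("ignored", encoding="utf-8")
    docs = load_all_documents(tmp, {"**/*.txt": load_text})

print(f"Loaded {len(docs)} documents")
```

Unlike the first version of the function in the video, this sketch returns the documents list, which is what app.py needs in order to print it.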
You can go ahead and write that code and keep appending to this documents list. As soon as you do that, you'll get all the documents, whatever their type.

Now, just to test whether the PDF loading is working fine or not, I will create an app.py file. Inside app.py, let me import some libraries. First of all I need to read everything, so I have written from src.data_loader import load_all_documents. This load_all_documents is the same function that is present inside my data_loader.py. The imports for the vector store and RAG search I will add in the later stages, so right now I'll remove them.

Now let's test the example. For the example usage I will write if __name__ == "__main__", then documents = load_all_documents with my data folder, and then I can print the docs. If you look inside the data loader, right now the function is not returning anything, so we are going to return the documents from it so that we can print them here.

Now I will open the command prompt and run python app.py. Let's see whether it is able to read the PDF files or not. Here you can see it has found four PDF files, all the PDF file paths are listed here, and it is also able to read all the content that is available inside those documents, which is good. And this is in the form of a Document data structure, I guess. Yeah, so all
the information is coming through. So clearly, the PDF code that we have written is working absolutely fine.

Now comes the next step: we should work on the embeddings, and for that we also need the chunking. So I will start working on embedding.py. Inside it I'll be importing these libraries. All of this is the same thing repeated from before, but here I'm using classes and function definitions. After loading all the documents, I'm going to use SentenceTransformer and RecursiveCharacterTextSplitter, and you can see I've defined a class called EmbeddingPipeline. The model that I'm going to use is all-MiniLM-L6-v2, the chunk size is 1,000 and the chunk overlap is 200. Then we set self.chunk_size and self.chunk_overlap, and we also initialize the SentenceTransformer.

The next function that we are going to create is called chunk_documents. To chunk_documents we give the documents, which can be a list of any documents. Here we apply RecursiveCharacterTextSplitter based on the values that we have initialized. Along with this we have also passed several different separators, if you're interested, or you can directly use the blank separator. Then you can see that I am calling splitter.split_documents, and it gives back the resulting chunks.
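To make the chunk size and chunk overlap values concrete, here is a deliberately simplified sketch. It is not RecursiveCharacterTextSplitter: it cuts fixed character windows and ignores separators, whereas the real splitter tries to break on paragraph, line and word boundaries first. The function name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so text near a
    boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks
```

With the values from the video, each new chunk starts 800 characters after the previous one, which is why the number of chunks is larger than the text length divided by 1,000.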
Now, this works for any document that I pass to this function. But one thing is very important: after the chunking is done, you also need to convert those chunks into vectors with the help of this model. For that I will create one more function, called embed_chunks, which takes the chunks. So what happens is that first load_all_documents is called; after that chunk_documents is called, wherein all these documents are chunked; then all the chunks are passed through our model to convert them into vector embeddings. Here you can see self.model.encode with show_progress_bar=True. We read the page content of every chunk, compute the embeddings, and finally return the embeddings. So there are two important functions, chunk_documents and embed_chunks, inside a class called EmbeddingPipeline.

Now you can test the same thing in your app.py. In app.py, just a second, let me initialize the embedding pipeline. I will write from src.embedding import EmbeddingPipeline, and once that is done I will initialize the embedding pipeline and then call it like this.
So this becomes my vectors... sorry, this is embed_chunks, and before embed_chunks I need to chunk the documents; I did not call chunk_documents yet. So let's first call chunk_documents here, and its result will be my chunks. Finally you can write chunk_vectors = and use the same embedding pipeline's embed_chunks, and then print chunk_vectors. Once you do this, you'll be able to see whether the chunking is happening or not.

So let's quickly run this file again. It will take some time because it is going to load all the documents again, and then the chunk_documents function is applied, which just applies RecursiveCharacterTextSplitter to every document that we give it. You can see everything happening in the output: one PDF with 21 pages is loaded here, the proposal one; then the embedding model is loaded, and the 64 documents I got are split into 359 chunks, and then we go ahead and store this. The next step after this is to create a vector store and save those embeddings as well. Here you can see all the chunk vectors are printed, so this is really good.
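The embedding step can be sketched as below. This is a sketch under assumptions: `load_model` and the `Encoder` protocol are hypothetical names, and the function is written as a free function rather than a method of EmbeddingPipeline so that any object with an `encode` method can be passed in. `SentenceTransformer.encode` does accept `show_progress_bar`, and all-MiniLM-L6-v2 produces 384-dimensional vectors.

```python
from typing import Any, Protocol, Sequence


class Encoder(Protocol):
    """Anything with a SentenceTransformer-style encode method."""

    def encode(self, sentences: Sequence[str], **kwargs: Any) -> Any: ...


def load_model(name: str = "all-MiniLM-L6-v2") -> Encoder:
    """Load the embedding model (downloads it on first use)."""
    # Lazy import so this module imports without sentence-transformers.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(name)


def embed_chunks(chunks: Sequence[Any], model: Encoder) -> Any:
    """Embed the page_content of every chunk with the given model."""
    # Each chunk is expected to look like a LangChain Document.
    texts = [chunk.page_content for chunk in chunks]
    return model.encode(texts, show_progress_bar=True)
```

In app.py the order stays the same as in the video: load the documents, chunk them, then pass the chunks to `embed_chunks(chunks, load_model())`.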
Just imagine: in a pipeline, it is working step by step, one stage after another, and that's the best part here. Now, the next step is that I will create some more functions for save and load: if I want to save all these chunks, how do I save them, and what exactly do I save? So far this was about the two important pieces of the pipeline, load_all_documents and the EmbeddingPipeline with its two important functions, chunk_documents and embed_chunks.

So guys, we have already created this embedding pipeline. After performing the embedding, we also need to store the vectors in some kind of vector store, and it should be persisted in a directory or in the cloud. For this I will start working on the vectorstore.py file, and here I'm going to use some code. You can see what all I'm using: SentenceTransformer and the EmbeddingPipeline. FaissVectorStore is the class name that we are going to use, since I'm specifically going to use FAISS. Here we use the same model, all-MiniLM-L6-v2, and the chunk size and everything else are here too. And we are also making a directory,
the persistent directory, which should be named faiss_store. Then here you'll see I'm initializing the embedding model, the SentenceTransformer, and so on.

The first method is build_from_documents. Here we write the same code that we had written with the embedding pipeline: we initialize EmbeddingPipeline with the model name, self.embedding_model, and the chunk size, call chunk_documents and embed_chunks, collect the metadata, and add all these embeddings to the vector store. After that I call self.save. What is self.save? It is a function that saves everything into an index file and a pickle file. The metadata gets saved in the pickle file, and faiss.index will be my vector store, both in the persistent directory. That is the reason I have written faiss.write_index with self.index and the FAISS path, and with open on the metadata path, and so on. The add_embeddings method is here as well; it simply adds the vectors to an IndexFlatL2 index. These are some basics when you work with FAISS.

Along with that I've also created two more functions, load and search. load allows you to load faiss.index, the vector store, and it opens the metadata file in read-binary mode. Then with the help of search and query you can ask any kind of query that you have. In the query method you can see we have written self.model.encode on the query text, followed by astype float32, and with the help of search you get the output.

So this was about my vector store. Now, in app.py I am going to make some changes. What are the changes that I will be making?
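To show what a flat L2 index with pickled metadata does, here is a pure-Python stand-in. It is not FAISS and not the class from the video: `FlatL2Store`, `store.pkl` and the default directory name are hypothetical, and the vectors are pickled together with the metadata instead of going into a separate index file. With FAISS itself the matching calls are `faiss.IndexFlatL2(dim)`, `index.add`, `index.search`, `faiss.write_index` and `faiss.read_index`, all of which expect float32 arrays, which is why the query is cast with astype float32.

```python
import os
import pickle
from typing import Any, Dict, List, Sequence, Tuple


class FlatL2Store:
    """Exact nearest-neighbour store: keeps every vector and scans them all."""

    def __init__(self, persist_dir: str = "vector_store") -> None:
        self.persist_dir = persist_dir
        self.vectors: List[List[float]] = []
        self.metadata: List[Dict[str, Any]] = []

    def add_embeddings(self, vectors: Sequence[Sequence[float]],
                       metadatas: Sequence[Dict[str, Any]]) -> None:
        if len(vectors) != len(metadatas):
            raise ValueError("need exactly one metadata dict per vector")
        self.vectors.extend([float(x) for x in vec] for vec in vectors)
        self.metadata.extend(metadatas)

    def search(self, query: Sequence[float],
               top_k: int = 3) -> List[Tuple[float, Dict[str, Any]]]:
        # Squared L2 distance from the query to every stored vector.
        scored = [
            (sum((a - b) ** 2 for a, b in zip(vec, query)), i)
            for i, vec in enumerate(self.vectors)
        ]
        scored.sort()  # smallest distance first; ties keep insertion order
        return [(dist, self.metadata[i]) for dist, i in scored[:top_k]]

    def _path(self) -> str:
        return os.path.join(self.persist_dir, "store.pkl")

    def save(self) -> None:
        os.makedirs(self.persist_dir, exist_ok=True)
        with open(self._path(), "wb") as f:  # write-binary mode for pickle
            pickle.dump({"vectors": self.vectors, "metadata": self.metadata}, f)

    def load(self) -> None:
        with open(self._path(), "rb") as f:  # read-binary mode for pickle
            data = pickle.load(f)
        self.vectors, self.metadata = data["vectors"], data["metadata"]
```

A flat index compares the query against every stored vector, so results are exact; that is fine for a few hundred chunks like the 359 here, while larger collections usually move to an approximate index.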
Instead of calling these two, I will write store =, but first let me import and initialize FaissVectorStore, from src.vectorstore import FaissVectorStore. I will initialize it and give the path name, which is faiss_store. Now, if this path already exists then it is fine; otherwise I'll write store.build_from_documents with all the docs. That's it. If I do this, then for the first time it is going to build the store. So let's see whether it is able to build it or not.

Here I'm going to clear the screen and run python app.py. Let's quickly see this. First of all it reads the documents; this is fine, it loads all the PDF files, perfect. Now the chunking happens automatically, and it saves the result in the vector store inside that folder, faiss_store. Now it is generating 359 chunks. All the steps are almost the same as what we have discussed from the start, but this is a very cool way of building something. Now you can see the message that the FAISS index and metadata were saved to faiss_store, and here the faiss_store folder is there with faiss.index and metadata.pickle.

Now, we need not run the build each and every time. Once we have this, from the next time, instead of always building, unless you have new documents, I can write store.load. If I write store.load, I should be able to query anything that I want. Let's say I print something like this, using the same query method that we had: "What is attention mechanism?" with top_k equal to 3. And this time I don't think we need to read any documents here, so I'll comment that out.
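The "build the first time, load afterwards" decision can be written as a small helper instead of commenting lines in and out. This is a sketch: `build_or_load` and its parameters are hypothetical, and the check simply looks for a marker file in the persistent directory.

```python
import os
from typing import Callable, TypeVar

T = TypeVar("T")


def build_or_load(persist_dir: str, marker_file: str,
                  build: Callable[[], T], load: Callable[[], T]) -> T:
    """Call load() if the saved index file exists, otherwise call build()."""
    if os.path.exists(os.path.join(persist_dir, marker_file)):
        return load()   # index already on disk: skip reading and embedding
    return build()      # first run: read, chunk, embed, and save
```

With the store from the video this might be called as `build_or_load("faiss_store", "faiss.index", lambda: store.build_from_documents(docs), store.load)`, assuming those method names match the transcribed ones. Deleting the folder, or calling the build step directly, forces a rebuild when new documents arrive.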
You can also uncomment this if you really want to, or you can add another condition. Now what it'll do is read directly from the vector store: it'll pick it up from the persistent directory and give you the output. Let's see. It picks it up from faiss_store, and here you go, you get the answer clearly. See: loading the embedding model, then loading the FAISS index and metadata, then "What is attention mechanism?", and all the information is here. This is the output that you get. Perfect, this is exactly what I was talking about. But the best part is that we have created this in the form of a pipeline: you have the data loader, you have the embedding, you have the vector store.

Now, for search, you can integrate any LLM here. For this I have also written the code. Again, I don't want to discuss it step by step, line by line, because it would take a lot of time to complete. Here I have my load_dotenv, so you can load all these settings. The Groq API key is given here; you can use it, or you can use your own Groq API key, either is fine. Then we do the search, wherein we call the vector store's query, get all the documents and all the metadata, then build a prompt and invoke the LLM with it. Once we do this, it is very easy to execute. Anyhow, you can explore this yourself, because I have discussed all these things in my Jupyter notebook.
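The search step (query the vector store, collect the text from the metadata, build a prompt, invoke the LLM) can be sketched as below. Everything here is an assumption rather than the code from the video: the prompt wording, the result shape (`{"metadata": {"text": ...}}`), and the `retrieve` and `llm` callables, which are injected so that the function does not depend on a particular vector store or provider.

```python
from typing import Any, Callable, Dict, List


def build_prompt(query: str, contexts: List[str]) -> str:
    """Put the retrieved chunks and the user query into one prompt."""
    context = "\n\n".join(contexts)
    return (
        "Summarize the following context to answer the query.\n\n"
        f"Query: {query}\n\nContext:\n{context}\n\nSummary:"
    )


def search_and_summarize(
    query: str,
    retrieve: Callable[[str, int], List[Dict[str, Any]]],
    llm: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve top_k chunks for the query and ask the LLM to summarize them."""
    results = retrieve(query, top_k)
    # Keep only results that actually carry some text in their metadata.
    contexts = [(r.get("metadata") or {}).get("text", "") for r in results]
    contexts = [c for c in contexts if c]
    if not contexts:
        return "No relevant documents found."
    return llm(build_prompt(query, contexts))
```

With Groq, `llm` could wrap a `ChatGroq` object from `langchain_groq` and return the `content` of the message that `invoke` gives back; the API key is read from the environment after `load_dotenv()` has run.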
Now, in my app.py, I'll see what changes need to be made. First of all I will import RAGSearch, from src.search import RAGSearch, and then initialize it like this. And now I don't even require this part. Let's see whether it is able to give the summary or not. It is loading from the vector store, and now I'm asking the question through search_and_summarize. This is the function; what do we do here? First we run the query against the vector store, as we were doing before. Then we build a prompt, and finally the LLM gives the output. So if my LLM is set up fine, I should get an answer. And here you can see all the output.

So this was the complete idea, a kind of crash course, that I really wanted to give on the entire RAG pipeline. RAG is one of the most important use cases; that is what I always believe. Most companies are building RAG applications, so I think this is a really important and super cool topic. I hope you liked this video. That was it from my side. I'll see you in the next video. Thank you. Take care.
