Introduction to Data Structures
Data structures are essential building blocks in computer science, facilitating efficient data storage and organization. This comprehensive guide covers fundamental structures including arrays, linked lists, stacks, queues, trees, and graphs, with implementation insights and real-world applications. For a foundational overview, see Introduction to Data Structures and Algorithms.
Arrays and Lists
- Arrays store elements in contiguous memory with fast indexed access.
- Linked lists comprise nodes linked via pointers, allowing dynamic size but slower access.
- Dynamic lists handle insertions and deletions but may involve memory copying or pointer adjustments.
Stack Data Structure
- Stack is a Last-In-First-Out (LIFO) collection with operations:
  - Push: insert element at the top.
  - Pop: remove element from the top.
- Common uses: function call management, recursion, undo operations, expression evaluation.
- Implementations include array-based and linked-list-based stacks (see the sketch below).
- Time complexity for push/pop/top operations: O(1).
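As an illustration of the array-based variant mentioned above, here is a minimal C sketch (the names Stack, push, and pop are illustrative, not taken from this guide); it keeps only a top index and checks for overflow and underflow.

```c
#include <stdio.h>

#define MAX 100           /* fixed capacity for this simple sketch */

typedef struct {
    int data[MAX];
    int top;              /* index of the top element, -1 when empty */
} Stack;

void stack_init(Stack *s) { s->top = -1; }

int push(Stack *s, int x) {            /* O(1) */
    if (s->top == MAX - 1) return 0;   /* overflow */
    s->data[++s->top] = x;
    return 1;
}

int pop(Stack *s, int *out) {          /* O(1) */
    if (s->top == -1) return 0;        /* underflow */
    *out = s->data[s->top--];
    return 1;
}

int main(void) {
    Stack s; stack_init(&s);
    push(&s, 1); push(&s, 2); push(&s, 3);
    int x;
    while (pop(&s, &x)) printf("%d ", x);   /* prints 3 2 1 (LIFO) */
    return 0;
}
```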
Queue Data Structure
- Queue is a First-In-First-Out (FIFO) collection with operations:
  - Enqueue: insert element at the rear/tail.
  - Dequeue: remove element from the front/head.
- Common uses: resource scheduling, breadth-first search (BFS).
- Implementations using arrays (circular buffers) and linked lists (see the sketch below).
- Time complexity for enqueue/dequeue/front operations: O(1).
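Likewise, a hedged C sketch of the circular-buffer implementation (Queue, enqueue, and dequeue are illustrative names); tracking a count alongside the front index keeps the full/empty cases unambiguous.

```c
#include <stdio.h>

#define CAP 100

typedef struct {
    int data[CAP];
    int front;            /* index of the front element */
    int count;            /* number of stored elements  */
} Queue;

void queue_init(Queue *q) { q->front = 0; q->count = 0; }

int enqueue(Queue *q, int x) {                 /* O(1) */
    if (q->count == CAP) return 0;             /* full */
    q->data[(q->front + q->count) % CAP] = x;  /* rear index wraps around */
    q->count++;
    return 1;
}

int dequeue(Queue *q, int *out) {              /* O(1) */
    if (q->count == 0) return 0;               /* empty */
    *out = q->data[q->front];
    q->front = (q->front + 1) % CAP;
    q->count--;
    return 1;
}

int main(void) {
    Queue q; queue_init(&q);
    enqueue(&q, 1); enqueue(&q, 2); enqueue(&q, 3);
    int x;
    while (dequeue(&q, &x)) printf("%d ", x);  /* prints 1 2 3 (FIFO) */
    return 0;
}
```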
Linked Lists
- Singly linked lists allow efficient insertions/removals at the head.
- Doubly linked lists maintain bi-directional links, facilitating easier removals and reverse traversals.
- Circular linked lists connect the tail back to the head for continuous traversal.
Trees
- Trees are hierarchical, non-linear structures with nodes containing data and pointers to children.
- Binary trees: each node has up to two children (left and right).
- Binary Search Trees (BST): binary trees in which every node's left subtree holds values less than or equal to the node and its right subtree holds greater values.
- Key operations in BST:
  - Search, insertion, deletion with average time complexity O(log n), worst-case O(n).
- Tree traversals (see the sketch below):
  - Pre-order: Root, Left, Right
  - In-order: Left, Root, Right
  - Post-order: Left, Right, Root
  - Level-order: breadth-first, visiting nodes level-wise.
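To make the BST and traversal bullets concrete, here is a small C sketch of insertion and an in-order traversal, which visits a BST's keys in sorted order; bst_insert and inorder are illustrative names and error handling is omitted.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct TreeNode {
    int key;
    struct TreeNode *left, *right;
} TreeNode;

/* Insert a key; duplicates (<=) go to the left, larger keys to the right. */
TreeNode *bst_insert(TreeNode *root, int key) {
    if (root == NULL) {
        TreeNode *n = malloc(sizeof(TreeNode));
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key <= root->key) root->left = bst_insert(root->left, key);
    else                  root->right = bst_insert(root->right, key);
    return root;
}

/* In-order traversal: Left, Root, Right -> keys come out sorted. */
void inorder(const TreeNode *root) {
    if (root == NULL) return;
    inorder(root->left);
    printf("%d ", root->key);
    inorder(root->right);
}

int main(void) {
    TreeNode *root = NULL;
    int keys[] = {5, 2, 8, 1, 4};
    for (int i = 0; i < 5; i++) root = bst_insert(root, keys[i]);
    inorder(root);   /* prints 1 2 4 5 8 */
    printf("\n");
    return 0;
}
```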
Graphs
- Graphs are collections of vertices (nodes) connected by edges.
- Edges may be:
  - Directed (unidirectional) or undirected (bidirectional).
  - Weighted or unweighted.
- Graph terminology includes:
  - Path: sequence of nodes connected by edges.
  - Cycle: a closed path with no repeated nodes or edges except the start/end node.
  - Connectivity: strongly connected (directed), connected (undirected).
- Graph representations (see the sketch below):
  - Edge list: simple but inefficient for adjacency queries.
  - Adjacency matrix: fast adjacency queries, O(1) time, but high memory usage, O(V²).
  - Adjacency list: memory efficient for sparse graphs, stores neighbors per vertex; adjacency queries O(k) where k is the node degree. For more on algorithms and data structures involving graphs, check Comprehensive Overview of Algorithms and Data Structures Course.
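As a rough C sketch of the adjacency-list representation (names such as add_edge are illustrative), each vertex keeps a singly linked list of its neighbors, so adding an edge is O(1) and listing a vertex's neighbors is O(k):

```c
#include <stdio.h>
#include <stdlib.h>

#define V 4   /* number of vertices in this small example */

typedef struct AdjNode {
    int vertex;
    struct AdjNode *next;
} AdjNode;

AdjNode *adj[V];   /* adj[u] is the head of u's neighbor list (NULL initially) */

/* Add a directed edge u -> v; call twice (u->v and v->u) for an undirected edge. */
void add_edge(int u, int v) {
    AdjNode *n = malloc(sizeof(AdjNode));
    n->vertex = v;
    n->next = adj[u];   /* prepend: O(1) */
    adj[u] = n;
}

int main(void) {
    add_edge(0, 1); add_edge(1, 0);   /* undirected edge 0-1 */
    add_edge(0, 2); add_edge(2, 0);   /* undirected edge 0-2 */
    add_edge(2, 3); add_edge(3, 2);   /* undirected edge 2-3 */
    for (int u = 0; u < V; u++) {
        printf("%d:", u);
        for (AdjNode *p = adj[u]; p != NULL; p = p->next)
            printf(" %d", p->vertex);
        printf("\n");
    }
    return 0;
}
```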
Summary
Efficient data storage and retrieval depend on appropriate data structure selection based on data size, operation frequency, and memory constraints. Arrays excel in indexed access, linked lists in dynamic sizing, stacks and queues in order-constrained access, trees in hierarchical organization, and graphs in modeling complex relationships.
This guide provides foundational understanding for designing and implementing efficient algorithms for software development and computational problem solving. For deeper insights and programming perspectives, refer to Understanding Data Structures Through C Language: A Comprehensive Guide, and for a broader language perspective, see Comprehensive Overview of Data Structures and Algorithms Using Python.
In this lesson and in this series of lessons, we will introduce you to the concept of data structures. Data structure is the most fundamental and building block concept in computer science and good knowledge of data
structures is a must to design and develop efficient software systems. Okay, so let's get started. We deal with data all the time and how we store, organize and group our data together matters. Let's pick up some examples from our day-to-day life
where organizing data in a particular structure helps us. We are able to search a word quickly and efficiently in a language dictionary because the words in the dictionary are sorted. What if the words in the dictionary were not sorted?
It would be impractical and impossible to search for a word among millions of words. So dictionary is organized as a sorted list of words. Let's pick up another example. If we have something like a city map, the data like position of a
landmark and road network connection. So all this data is organized in the form of geometries. We show the map data in the form of these geometries on a two dimensional plane. So map data needs to be structured like this so that we have
scales and directions and we are effectively able to search for a landmark and get route from one place to another. And I'll pick one more example for something like daily cash in and cash out statement of a business, what we
also call a cash book in accounts. It makes most sense to organize and store the data in the form of a tabular schema. It is very easy to aggregate data and extract information if the data is organized in these columns in these
tables. So different kind of structures are needed to organize different kind of data. Now computers work with all kind of data. Computers work with text, images, videos, relational data, geospatial data and pretty much any kind of data that we
have on this planet. How we store, organize and group data in computers matters because computers deal with really really large data and even with the computational power of machines, if we do not use the right kind of
structures, the right kind of logical structures, then our software systems will not be efficient. Formal definition of a data structure would be that a data structure is a way to store and organize data in a computer so that the
data can be used efficiently. When we study data structures as ways to store and organize data, we study them in two ways. So I'll say that we talk about data structures as one, we talk about them as mathematical and logical models. When we
talk about them as mathematical and logical models, we just look at an abstract view of them. We just look at from a high level what all features and what all operations define that particular data structure. Example of abstract view
from real world can be something like the abstract view of a device named television can be that it is an electrical device that can be turned on and off. It can receive signals for satellite programs and play the audio video of the
program and as long as I have a device like this, I do not bother how circuits are embedded to create this device or which company makes this device. So this is an abstract view. So when we study data structures as mathematical or
logical models, we just define their abstract view or in other words, we have a term for this, we define them as abstract data types. An example of abstract data type can be, I want to define something called a list that should be
able to store a group of elements of a particular data type and we should be able to read the elements by their position in the list and we should be also able to modify element at a particular position in the list. I would
say store a given number of elements of any data type. So we are just defining a model. Now we can implement this in a programming language in a number of ways. So this is a definition of an abstract data type. We also call abstract data
type as ADT and if you see, all the high-level languages already have a concrete implementation of such an ADT in the form of arrays. So arrays give us all these functionalities. So arrays are data types which are concrete
implementation. So the second way of talking about data structures is talking about their implementation. So implementations would be some concrete types and not an abstract data type. We can implement the same ADT in multiple ways
in the same language. For example in C or C++ we can implement this list ADT as a data structure named linked list and if you have not heard about it, we will be talking about them a lot. We will be talking about linked list a lot in the
coming lessons. Okay so let's define an abstract data type formally because this is one term that we will encounter quite often. Abstract data types are entities that are definitions of data and operation but do not have implementations.
So they do not have any implementation details. We will be talking about a lot of data structures in this course. We will be talking about them as abstract data types and we will also be looking at how to implement them. Some of the
data structures that we will talk about are arrays linked list, stack, queue, tree, graph and the list goes on. There are many more to study. So when we will study these data structures we will study their logical view. We will study what
operations are available to us with these data structures. We will study the cost of these operations mostly in terms of time and then definitely we will study the implementation in a programming language. So we will be studying all these
data structures in the coming lessons and this is all for this introductory lesson. Thanks for watching. In our previous lesson we introduced you to the concept of data structures and we saw how we can talk about data structures in
two ways. One as a mathematical and logical model that we also term as an abstract data type or ADT and then we also study data structures as concrete implementations. In this lesson we will study one simple data
structure. We will first define an abstract view of it. We will first define it as an abstract data type and then we will see the possible implementations and this data structure is list. List is a common real world entity. List is nothing
but a collection of objects of the same type. We can have a list of words, we can have a list of names or we can have a list of numbers. So let us first define list as an abstract data type. So when we define abstract data type we just
define the data that it will store and we define the operations available with the type and we do not go into the implementation details. Let us first define a very basic list. I want a list that can store a given number of elements of a
given data type. This would be a static list. The number of elements in the list will not change and we will know the number of elements before creating the list. We should be able to write or modify element at any position in the
list and of course we should be able to read element at a particular position in the list. So if I ask you for an implementation of such a list and you have taken a basic course in programming, a basic introductory course then you'll
be like hey I know this and array gives us all these features. All these operations are available with an array. We can create an array of any data type so let us say if we want to create a list of integers then we declare the array
type as integer and then we can give the size as a parameter in declaration. I can write or modify element at a particular position. The elements are A0, A1 and are accessed something like this. We all know about arrays and then we can also
read elements at a particular position. The element at ith position is accessed as AI. So array is a data structure that gives us implementation for this list. Now I want a list that should have many more features. I want it to handle more
scenarios for me. So I'll redefine this list here. I do not want a static list a static collection with a fixed size. I want a dynamic list that should grow as per my need. So the features of my list are that I'll call my list empty if there
are no elements in the list. I'll say the size of the list is 0 when it is empty and then I can insert an element into the list and I can insert an element at any position in the list and in an existing list. I can remove element from the list.
I can count the number of elements in the list and I should be able to read or write or rather read or modify element at a particular position in the list and I should also be able to specify the data type for the list. So I should be able to
while creating the list I should be able to say whether this is a list of integers or whether this is a list of string or float or whatever. Now I want a data structure which is implementation of this dynamic list. So how do I get it?
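The lesson answers this next by building the dynamic list on top of an array; as a hedged preview, a minimal C sketch of such an array-backed list might look like the following (the names IntList, list_add, list_insert_at and the doubling strategy mirror what is described, but details like error handling are simplified).

```c
#include <stdlib.h>

typedef struct {
    int *a;        /* underlying array                   */
    int  end;      /* index of last element, -1 if empty */
    int  capacity; /* current size of the array          */
} IntList;

void list_init(IntList *l, int capacity) {
    l->a = malloc(capacity * sizeof(int));
    l->end = -1;
    l->capacity = capacity;
}

/* When the array is full, switch to an array of double the size and copy. */
static void grow_if_full(IntList *l) {
    if (l->end + 1 == l->capacity) {
        l->capacity *= 2;
        l->a = realloc(l->a, l->capacity * sizeof(int));
    }
}

void list_add(IntList *l, int x) {       /* append at the end: O(1) amortized */
    grow_if_full(l);
    l->a[++l->end] = x;
}

void list_insert_at(IntList *l, int pos, int x) {   /* O(n): shift right */
    grow_if_full(l);
    for (int i = l->end; i >= pos; i--)
        l->a[i + 1] = l->a[i];
    l->a[pos] = x;
    l->end++;
}

void list_remove_at(IntList *l, int pos) {          /* O(n): shift left */
    for (int i = pos; i < l->end; i++)
        l->a[i] = l->a[i + 1];
    l->end--;
}
```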
Well actually we can implement such a dynamic list using arrays. It's just that we will have to write some more operations on top of arrays to provide for all these functionalities. So let us see how we can implement this
particular list using arrays. Let's for the sake of simplicity of design assume that the data type for the list is integer. So we are creating a list of dynamic list of integers. What we can do is to implement such a list we can
declare a really large array. We will define some max size and declare an array of this max size. Now as we know the elements in the array are indexed as a0, a1, a2 and we go on like this. So what I'll do is I'll define a variable that
will mark the end of the list in this array. So if the list is empty we can initialize this variable or we can set this variable as minus 1 because the lowest index possible is 0. So if end is minus 1 the list is empty. At any time
a part of the array will store the list. Okay so let's say initially when the list is empty this pointer end is pointing to index minus 1 which is not valid which does not exist. And now I insert an integer into this array and let's
say if we do not give the position at which the number is to be inserted the number is always inserted towards the tail of the list towards the end of the list. So the list will be like we will have an element at position 0 and now
end is index 0. So at any time this variable end marks the end of the list in this array. Now if I want to insert something in the list at a particular position, let's say I want to insert number 5 at index 2, then to accommodate 5 at this particular position we will have to shift all the elements one unit towards the right; we need to shift all the elements starting at index 2 towards the right. Okay I just inserted
some elements into the list; let me also write the function calls for these. Let's say we went in this order: we inserted 2, then we inserted 4, and then in the end we are inserting 5, and we will also give the position at which we
want to insert. So this insert with two arguments would be the call to insert element at a particular position. So after all these operations after all these insertions this is what the list will look like. This arrow here marks the end
of the list in the array. Now if I want to remove an element from a particular position let's say I make a call to something to the remove function I want to remove the element 2. So I'll pass the index 0 here I want to remove the element
at index 0. So to do so all these elements after index 0 will be shifted one unit towards the left or towards the lower indices and 2 will go away. Now this end variable here is being adjusted after each insertion that we are making. So after
this insertion end will be 0, after this 1, 2, 3 and so on, and after this remove end will be 4 again. Okay, looks like we pretty much have an implementation of this list on the left that is described as an abstract data type. We have a logic of calling the list empty when we have this variable end equal to minus 1. We can insert an element at a particular position in the list, we can remove an element; it's just that we have to perform some shifts in the array. We can count the number of elements in the list: it will be equal to the value in the variable end plus 1. We can read or modify an element at a position; well, this is an array, so we can definitely read or modify an element at a particular
position. If we wanted to choose the data type it was just choosing the array of that particular data type. Now this looks like a cool implementation except that we have one problem. We said that the array will be of some large size some
max size. But what is a good max size? The list can always grow to exhaust the array. There is no good max size. So we need to have a strategy for the scenario when the list will fill up the whole array. So what do
we do in that case? We need to keep that into our design. We cannot extend the same array. It is not possible to do so. So we will have to create a new array a larger array. So when the array is full we will create a new larger array and
copy all the elements from the previous array into the new array. And then we can free the memory for the previous array. Now the question is by how much should we increase the size of the new array? This whole operation of creating a new array
and copying all the elements from the previous array into the new array is costly in terms of time and definitely a good design would be to avoid such big cost. So the strategy that we choose is that each time the array is full we create
a new larger array of double the size of the previous array. And why this is the best strategy is something that we will not discuss in this lesson. So we will create a larger array of double size and copy elements from previous array into
this new array. This looks like a cool implementation. The study of data structures is not just about studying the operations and the implementation of these operations. It's also about analyzing the cost of these operations. So let us see what are
the costs in terms of time for all these operations that we have in the dynamic list. The access to any element in this dynamic list, if we want to access it using its index for read or write, will take constant time, because we have an array here. And in an array, elements are arranged in one contiguous block of memory; using the starting address or the base address of the block of memory and the index or the position of the element, we can calculate the address of that particular element and access it in constant time. Big O notation is used to describe the time complexity of operations; for constant time, the time complexity is written as Big O of 1. If we wanted to insert an element at the end of the array, the end of the list, then that again will be constant time, but if we would insert an element at a particular position in the
list then we will have to shift elements towards higher indices. In the worst case we will have to shift all the elements to the right when we will be inserting at the first position. So the time taken for insertion will be proportional to the
length of the list let's say the length of the list is n or in other words we will say that insertion will be Big O of n in terms of time complexity. If you do not know about Big O notation do not bother just understand that inserting an
element at a particular position will be a linear function in terms of the size of the list. Removing an element will again be Big O of n; time taken will be proportional to the current size of the list, where n is the size of the list here. Okay, now, inserting an element at the end: we just said that it will happen in constant time, but that is not so if the array is full, because then we will create a new array. Let's call inserting an element at the end adding an element. Adding an element will take constant time if the list is not full, but it will take time proportional to the size of the list, the size of the array, if the array is full. So adding in the worst case will be Big O of n again; as we said, when the list is full we create a new array of double the size of the previous array and then we copy the elements from the previous array into the new array. So let's see what the good things are with this kind of implementation. Well, the good thing is that we can access elements at any index in constant time, which is the property of the array, but if we have to insert some element in between or if we have to remove an element from the list then it is costly. If the list grows and shrinks a lot then we will also have to create a new array and do all this copying of elements from the previous array into the new array again and again. And one more problem is that a lot of the time a lot of the array would be unused; the memory there is of no use, so this kind of implementation, the use of an array as a dynamic list, is not efficient in terms of memory. This leads us to think: can we have a data structure that will give us a dynamic list and use the memory more efficiently? We have one data structure that gives us good utilization of the memory, and this data structure is linked list, and we will study about the linked lists in the next lesson. So that's it for this lesson, thanks for watching. In this lesson we will introduce you to the linked list data structure. In our previous lesson we tried to implement a dynamic list using arrays and we had some issues there. It was not most efficient in terms of memory usage, in terms of memory consumption. When we use arrays we have some limitations. To be able to understand linked list well we need to understand these limitations. So I am going to tell
you a simple story to help you understand this. Now let us say this is computer's memory and each partition here is one byte of memory. Now as we know each byte of memory has an address we are showing only a section of the memory that's why it is
extending towards the bottom and the top. Let's say the address increases from bottom to top. So if this byte is address 200 the next byte would be address 201 and next byte would be address 202 and so on. What I want to do is I want to draw
this memory from left to right horizontally instead of trying it from bottom to top like this. This looks better. Let's say this byte here is address 200 and as we go towards the right the address increases. So this is
like 201 and we go on like 202, 203 and so on. It doesn't really matter whether we show memory from bottom to top or left to right. These are just logical ways to look at the memory. So coming back to our story memory is a crucial resource and
all the applications keep asking for it. So Mr. Computer has given this job of managing the memory to one of his components to one of his guys who he calls the memory manager. Now this guy keeps track of what part of the memory
is free and what part of the memory is allocated and anyone who needs memory to store something needs to talk to this guy. Albert is our programmer and he is building an application. He needs to store some data in the memory so he
needs to talk to the memory manager. He can talk to the memory manager in a high level language like C. Let us say that he is using C to talk to the memory manager. First he wants to store an integer in the memory. So he communicates
this to memory manager by declaring an integer variable something like this. The memory manager sees this declaration and he says that okay you need to store an integer variable. So I need to give you four bytes of memory because integer
variable is stored in four bytes in a typical architecture and let us say in this architecture it is stored in four bytes. So the memory manager looks for four bytes of free space in the memory and assigns it or allocates it for
variable X. Address of a block of memory is the address of the first byte in the memory. So let us say this first byte of memory here is at address 217. So variable X is at address 217. So memory manager kind of communicates it back to Albert
that hey I have assigned address 217 for your variable X you can store whatever you want there and Albert can fill in any data into this variable. Now Albert needs to store a list of integers a list of numbers and he thinks that the
maximum number of integers in this list will be four. So he asks the memory manager for an integer array of size four named A. Now array is always stored in memory as one contiguous block of memory. So memory manager is like okay I need to
look for a block of memory of 16 bytes for this variable this array A. So the memory manager allocates this block starting address 201 and ending address 216 for this variable A which is an array of four integers. Because array is
stored as one contiguous block of memory, and the memory manager conveys the starting address of this block. Whenever Albert tries to access any of the elements in the array, let's say he tries to write some value at
the fourth element in the array which he accesses as A3. Albert's application knows where to write this particular value because it knows the base address the starting address of the block A the array A and from base address using the
index which is three here it calculates the address of A3. So it knows that A3 is at address 213. So to access any of the elements in the array the application takes constant time and this is one awesome thing about arrays that
irrespective of the size of the array, the application can access any of the elements in the array in constant time. Now let's say Albert uses this array of four integers to store his list. So I'll fill in some
values here at these positions let's say this is 8, this is 2, this is 6, this is 5, this is 4. Now Albert at some point feels that okay I need to have one more element in this list. Now he has declared an array of size 4 and he wants to add
a fifth element in the array. So he asks the memory manager that hey I want to extend my array A is it possible to do so I want to extend the same block and the memory manager is like when I allocate memory for an array I do not expect that
you will ask for an extension. So I use whatever memory is available adjacent to that block for other variables. In some cases I may extend the same block but in this case I have an element and a variable X next to your block so I cannot
give you an extension. So Albert is like what all options do I have? Memory manager is like you can tell me the new size and I can recreate a new block at some new address and we will have to copy all the elements from the previous block
to the new block. So Albert says that okay let's do it but the memory manager is like you still need to give me the size of the new block. Albert thinks that this time he'll give a really large size for the new array or the new block so
that it does not fill up. This new block, starting at address 224, is allocated. Albert asks the memory manager to free the previous block, and at some cost he has to copy all the elements, all the numbers, from the previous block into the
new block and now he can add one more element to this list, and he has kept his array large this time just in case he needs more numbers in the list. So the only option that Albert had was to create an entirely new block, an
entirely new array and Albert is still feeling bad because if the list is too small he is not using some part of the array and so memory is getting wasted and if the list again grows too much he will again have to create a new array a
new block and he will again have to copy all the elements from the previous block into the new block. Albert is desperately seeking a solution to this problem and the solution to this problem is a data structure named linked list. So let us
now try to understand linked list data structure and see how it solves Albert's problem. What Albert can do is that instead of asking the memory manager for an array which will be one large contiguous block of memory he can ask memory for
one unit of data at a time for one element at a time in a separate request. I am cleaning up the memory here once again let's say Albert wants to store this list of four integers in the memory. What if he requests memory for one
integer at a time. So first he pings memory manager for some memory to store number six memory manager will be like okay you need space to store an integer so you get this block of four bytes at address 204. So Albert can store number
six here. Now Albert makes another request, a separate request, for number five. Let's say he gets this block starting at address 217 for number five. Because he makes a separate request, he may or may not get memory adjacent to number six; there are higher
probabilities that he will not get an adjacent memory location. So similarly Albert makes separate requests for number four and two. So let's say he gets these two blocks at address 232 and 242 respectively for numbers four and two. So
as you can see, when Albert makes separate requests for each integer, instead of getting one contiguous block of memory he gets these disjoint, non-contiguous blocks of memory. So we need to store some more information here: we need to store the information that this is the first element in the list and this is the second element in this list. So we need to link these blocks together somehow. With an array it was very simple: we had one contiguous block of memory, so we knew
where a particular element is by calculating its address using the starting address of the block and the position of the element in the array but here we need to store the information that this is the first block which stores
the first element and this is the second block which stores the second element and so on. To link these blocks together and to store the information that this is the first block in the list and this is the second block in the list what we
can do is that we can store some extra information with each block. So what if we can have two parts in each block something like this and in one part of the block we store the data or the value and in the other part of the block we
store the address of the next block. In this example in the first block the address part would be 2 1 7 the address of the next block that stores 5 and in this next block or the second block the address part would be 2 3 2. In the block
at address 2 3 2 we will store the address 2 4 2, the address of the next block that stores number 2, and the block at 2 4 2 is the last block; there is no next block after this. So in the address part we can have the address as 0; 0 is an invalid address, and 0 can be used to mark that this is the end of the list, that there is no link to the next node or next block after this particular block. So Albert now actually has to request the memory manager for a block of memory that will store two variables: one an integer variable that will store the value of our element, and one a pointer variable that will store the address of the next block or the next node in the list. In C he can define a type named node like this; he will have two fields in the node, one to store the data, this field will be an integer, and one more field to store the address of the next node in the list. So Albert will ask for a node from the memory manager, and the memory manager will be like okay, you need a node that needs four bytes for an integer variable and four more bytes for the pointer variable that will store the address; a pointer variable also, in a typical architecture, is stored in
four bytes. So now memory manager gives us a block of eight bytes and we call this block a node. Now notice that the second field in the node structure is node star which means pointer to node so this field will only store an address
of the next node in the list. So if we store the list like this in the memory as these non-contiguous nodes connected to each other then this is a linked list data structure. Logical view of the linked list data structure will be
something like this. Data is stored in these nodes and each node stores the data as well as the link to the next node. So each node kind of points to the next node. The first node is also called the head node and the only information about
the list that we keep all the time is address of the head node or address of the first node. So address of the head node kind of gives us access to the complete list. The address in the last node is null or 0 which means that the
last node does not point to any other node. Now if we want to traverse the linked list the only way to do it is we start at the head and we go to the first guy and then we ask the first guy the address of the next guy address of the
next node and then we go to the next node and ask the address of the next node and this is the only way to access the elements in the linked list. If we want to insert a node in the linked list let's say we want to add number 3 at the end
of the linked list, then all we need to do is first create a node independently and separately; it will get some memory location. So we created this node with value 3. Now all we need to do
is fill the address properly adjust these links properly. So the address of this particular node will be filled in this node with value 2 and this node the address part can be null. So it is the last node it does not point to any other
node. Let's also show these nodes in the memory here. So I've written the address of each node in brown at top of these nodes and I've also filled in this address field of each node. Let's say the node for value 3 gets address 2 5 2. So this is
how things will be in the memory and this is how the logical view will be. The linked list is always identified by the address of the first node and unlike arrays we cannot access any of the elements in constant time. In the case of
arrays, using the starting address of the block of memory and the position of the element in the list, we could calculate the address of the element, but in this case we have to start at the head and we have to ask this element for the next
element and then ask the next element who is your next. It's like playing treasure hunt. You go to the first guy and then you get the address for the second guy and then you go to the second guy and you get the address for the
third guy. So the time taken to access elements will be proportional to the size of the list. Let's say the size of the list is n. There are n elements in the list. In the worst case to traverse the last element you will go through all
the elements. So time taken to access elements is proportional to n or in other words we say that this operation will cost us or rather the time complexity of this operation is big O of n. Insertion into the list. We can insert anywhere in
the list. We first need to create a node and just adjust these links properly. Let's say I want 10 at the third position in the list. So all we need to do is create a node, store the value 10 in the data part, something like this. Let's say we get the node 10 at address 310. So we will adjust the address field in the second node to point to this node with address 310, and this node will point to the node with value 4. Now to insert also we will have to traverse the list and go to that
particular position and so this will be big O of n again in terms of time complexity. The only thing is that the insertion will be a simple operation. We will not have to do all the shifts as we had to do in an array to insert
something in between. We had to shift all the elements by one position towards higher indices. Similarly, to delete something from this list will also be O(n). So we can see some good things about linked list. There is no waste of memory in the sense of memory sitting unused; we are using some extra memory to store the addresses, but we have the benefit that we create nodes as and when we want and we can also free the nodes as and when we want.
We do not have to guess the size of the list beforehand like in the case of arrays. Now we will discuss all the operations on linked list and the cost of these operations as well as comparison with arrays in our next lessons. We will also be implementing
linked list in C or C++. So this is all for a basic introduction to linked list. Thanks for watching. In our previous lesson we introduced you to linked list data structure and we saw how linked lists solve some of the problems that we have with arrays. So now the obvious question would be
which one is better, an array or a linked list. Well, there is no such thing as one data structure being better than another data structure. One data structure may be really good for one kind of requirement while another data structure can be really good for another kind of requirement.
So it all depends upon factors like what is the most frequent operation that you want to perform with the data structure or what is the size of the data and there can be other factors as well. So in this lesson we will compare these two data structures based on some parameters based on
the cost of operations that we have with these data structures. So all in all we will comparatively study the advantages and disadvantages and try to understand in which scenario we should use an array and in which scenario we should use a linked list. So I will draw two columns here one for array
and another for linked list and the first parameter that we want to talk about is the cost of accessing an element irrespective of the size of an array it takes constant time to access an element in the array. Now this is because an array is stored as one contiguous block of memory. So if we know the
starting address or the base address of this block of memory let us say what we have here is an integer array and the base address is 200 the first byte in this block is at address 200 then let's say if we want to calculate the address of element at index i then it will be equal to 200 plus i into
size of an integer in bytes. So size of an integer in bytes is typically 4 bytes so it will be 200 plus 4 into i. So if 0th element is at address 200 if we want to calculate the address for element at index 6 it will be 200 plus 6 into 4 which will be equal to 224. So knowing address of any
element in an array is just this calculation for our application in terms of big o notation constant time is also called big o of 1. So accessing an element in an array is big o of 1 in terms of time complexity. If you are not aware of big o notation check the description of this video for
a tutorial on time complexity analysis. Now in a linked list data is not stored in a contiguous block of memory. So if we have a linked list something like this let's say we have a linked list of integers here then we have multiple blocks of memory at different addresses. Each block in
the linked list is called a node and each node has two fields one to store the data and one to store the address of the next node. So we call the second field the link field. The only information that we keep with us about a linked list is the address of the first node which we also call
the head node and this is what we keep passing to all the functions also the address of the head node. To access an element in the linked list at a particular position we first need to start at the head node or the first node and at the first node we need to see the address of the second node
and then we go to the second node and see the address of the third node. In the worst case to access the last element in the list we will be traversing all the nodes in the list. In the average case we will be accessing the middle element maybe. So if n is the size of the linked list
that is, the number of elements in the linked list, then we will traverse n by two elements. So the time taken in the average case also is proportional to the number of elements in the linked list. So we can say that the time complexity in the average case is big O of n. So on this parameter, the cost of accessing an element, arrays score heavily over linked lists. So if you have a requirement where you want to access elements in the list all the time then definitely array is a better choice. Now the second parameter that we want to talk about is memory requirement or
memory usage. With an array we need to know the size of the array before creating it because array is created as one contiguous block of memory. So array is a fixed size. What we typically do is create a large enough array and some part of the array stores our list
and some part of the array is vacant or empty so that we can add more elements in the list. For example we have an array of seven integers here and we have only three integers in the list. Rest four positions are unused. There would be some garbage value there.
With a linked list, let's say we have this linked list of integers. There is no unused memory. We ask for memory for one node at a time, so we do not keep any reserved space, but we use extra memory for pointer variables. And this extra memory requirement for pointer
variables in a linked list cannot be ignored in a typical architecture let's say integer is stored in four bytes and pointer variable also takes four bytes. So if you see the memory requirement for this array of seven integers is 28 bytes and the memory requirement for this linked list would be
eight into three where eight is the size of each node four for integer and four bytes for the pointer variable. So this is also 24 bytes. If we add one more element to the list in the array we will just use one more position. While in linked list we will create one more node
and we'll take another eight bytes, so this will be 32 bytes. Linked list would fetch us a lot of advantage if the data part is large in size. In this case we had a linked list of integers, and an integer is only four bytes. What if we had a linked list in which the data part was
some complex type that took 16 bytes. So four bytes for the link and 16 bytes for the data each node would have been 20 bytes and array of seven elements for 16 bytes of data would be 16 byte for each element would be 112 bytes and linked list of four would be only 80 bytes.
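The byte counts in this comparison assume 4-byte integers and 4-byte pointers, the "typical architecture" the lesson refers to; on a modern 64-bit machine a pointer is usually 8 bytes and structs may be padded, so the exact numbers shift. A quick, hedged way to check the figures on your own machine is a sketch like this:

```c
#include <stdio.h>

struct int_node {          /* node of a linked list of int: data + pointer */
    int data;
    struct int_node *next;
};

int main(void) {
    /* With 4-byte int and 4-byte pointers this prints 4, 4 and 8;       */
    /* on a typical 64-bit machine it prints 4, 8 and 16 (with padding). */
    printf("sizeof(int)             = %zu\n", sizeof(int));
    printf("sizeof(void*)           = %zu\n", sizeof(void *));
    printf("sizeof(struct int_node) = %zu\n", sizeof(struct int_node));
    return 0;
}
```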
So it all depends. If the data part for the list takes a lot of memory linked list will definitely consume lot less memory. Otherwise it depends what strategy we are choosing to decide the size of the array. At any time how much array we keep unused. Now one more point with memory allocation
because arrays are created as one contiguous block of memory. Sometimes when we may want to create a really really large array then maybe memory may not be available as one large block but if we are using linked list memory may be available as multiple small blocks. So we will have this problem
of fragmentation in the memory. Sometimes we may get many small units of memory but may not get one large block of memory. This may be a rare phenomenon but this is a possibility. So this is also where linked list scores. Because arrays have fixed size once array gets filled and we
need more memory then there is no other option than to create a new array of larger size and copy the content from the previous array into the new array. So this is also one cost which is not there with linked list. So we need to keep these constraints and these requirements
in mind when we want to decide on one of these data structures for our requirement. Now the third parameter that we want to talk about is the cost of inserting an element in the list. Remember, when we are talking about arrays here we are also talking about the possible use
of array as dynamic list. So there can be three scenarios in insertion. First scenario will be when we want to insert an element at the beginning of the list. Let's say we want to insert number three at the beginning of the list. In the case of arrays we will have to shift each element
by one position towards the higher index. So the time taken will be proportional to the size of the list. So this will be big O of n. Let's say n is the size of the list. This will be big O of n in terms of time complexity. In the case of linked list inserting an element
in the beginning will mean only creating a new node and adjusting the head pointer and the link of this new node. So the time taken will not depend upon the size of the list. It will be constant. So for linked list inserting an element at the beginning is big O of one in terms of
time complexity. Inserting an element at the end, for an array: let's say we are talking about a dynamic array, a dynamic list in which we create a new array if it gets filled. If there is space in the array we just write to the next higher index of the list, so it will be constant time. So the time complexity is O(1) if the array is not full. If the array is full we will have to create a new array and copy all the previous content into the new array, which will take O(n) time, where n is the size of the list. In the case of linked list, inserting an element at the end will
mean traversing the whole list and then creating a new node and adjusting the links. So time taken will be proportional to n. I'll use this color coding for linked list. Here n is the number of elements in the list. Now the third case would be when we want to insert in the middle of the list
at some nth position or maybe some ith position. Again in the case of arrays we will have to shift elements. Now for the average case we may want to insert at the mid position in the array. So we will have to shift n by two elements where n is the number of elements in the list. So the time
taken is definitely proportional to n in the average case. So complexity will be big O of n. For linked list also we will have to traverse till that position and only then can we adjust the links; even though we will not have any shifting, we will have to traverse till that point, and in
the average case time taken will be proportional to n and the time complexity will be big O of n. If you can see deleting an element will also have these three scenarios and the time complexity for deleting for these three scenarios will also be the same. And the final point the final parameter
that I want to talk about is which one is easy to use and implement, and array definitely is a lot easier to use. Linked list implementation, especially in C or C++, is more prone to errors like segmentation faults and memory leaks; it takes good care to work with linked list. So this was arrays versus linked
list in our next lesson we will implement linked list in C or C++ we will get our hands dirty with some real code. So this is it for this lesson thanks for watching. In our previous lessons we described linked list we saw the cost of various operations in linked list and we also compared
linked list with arrays. So now let us implement linked list the implementation will be pretty similar in C and C++ there will be slight differences that we will discuss. The prerequisite for this lesson is that you should have a working knowledge of pointers in C C++ and you should also know the
concept of dynamic memory allocation. If you want to refresh any of these concepts check the description of this video for additional resources. Okay so let's get started. As we know in a linked list data is stored in multiple non-contiguous blocks of memory and we call each block of memory a
node in the linked list. So let me first draw a linked list here. So we have a linked list of integers here with three nodes as we know each node has two fields or two parts one to store the data and another to store the address of the next node what we can also call link to the next node.
So let's say the address of the first node is 200, the address of the second node is 100, and the address of the third node is 300 for this linked list. This is only a logical view of the linked list. So the address part of the first node will be 100, the address of the second node, and in the second node we will have 300 here. The address part of the last node will be null, which is only a synonym or macro for address 0. 0 is an invalid address; a pointer variable equal to 0 or null means that the pointer variable does not point to a valid memory location. As for the memory blocks,
the address of the memory block allocated to each of the nodes is totally random there is no relation it's not a guarantee that the addresses will be in increasing order or decreasing order or adjacent to each other. So that's why we need to keep these links.
Now the identity of the linked list that we always keep with us is the address of the first node what we also call the head node. So we keep another variable that will be of type pointer to node and this guy will store the address of the first node and we can name this pointer variable
whatever let's say this pointer variable is named a. The name of this particular pointer variable that points to the head node or the first node can also be interpreted as the name for the linked list also because this is the only identity of the linked list that we keep with us all the time.
So let us now see how this logical view can be mapped to a real program in C or C++. In our program, node will be a data type that will have two fields, one to store the data and another to store the address. In C we can define such a data type as a structure.
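Written out in C, such a node type looks roughly like this (the second field is named next here; the lesson notes that link is an equally good name):

```c
/* A node for a linked list of integers: data + address of the next node. */
struct node {
    int data;            /* the value stored in this node                      */
    struct node *next;   /* pointer to (address of) the next node; NULL in the */
                         /* last node                                          */
};
```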
So let's say we define a structure named node with two fields first field to store the data the type of the data here is integer. So this will be node for a linked list of integers. If we wanted a linked list of say double this data type would be double. The second field will be
pointer to node struct node star we can name this link or we can name this next or whatever. This is C style of declaring node star or pointer to node. If this was C++ we could simply write node star I would write it this way the C++ way it looks better to me. In our logical view here
this variable A is of type node star or pointer to node. Each of these three rectangles with two fields are of type node and this field in the node the first field is of type integer and the second field is of type pointer to node or node star. It is important to know which one is what
in the logical view; we should have this visualization before we go on to implement linked list. Okay, so let us now create this particular linked list of integers that we are showing here through our code. To be able to do so we will have to implement two operations: one to insert a node into the linked list and another to traverse the linked list. But before that, the first thing that we want to do is declare a pointer to the head node, a variable that will store the address of the
head node for the sake of clarity I'll write head node here. So I have declared a pointer to node named A. Initially when the list is empty when there is no element in the list this pointer should point nowhere so we write a statement something like A is equal to null to say the same. Now with
these two statements what we have done is we have created a pointer to node named A, and this pointer points nowhere, so the list is empty. Now let's say we want to insert a node in this list. So what we do is we first create a node; creating a node is nothing but creating a memory block
to store a node in C we use the function malloc to create a memory block as argument we pass the number of bytes that we want in the block so we say that give me a memory block that will be equal to the size of a node so this call to malloc will create a memory block. This is a dynamically
allocated memory memory allocated during runtime and the only way to work with this kind of memory is through reference to this memory location through pointers. Let us assume that this memory block assigned here is at address 200. Now malloc returns a void pointer that gives us the address of
assigned memory block so we need to collect it in some variable so let's say we create a variable named temp which is pointer to node so we can collect the return of malloc the address in this particular variable we will need a type casting here because malloc returns void pointer and we are
having temp as pointer to node so now we have created one node in the memory. Now what we need to do is fill in the data in this particular node and adjust the links which will mean writing the correct address in the variable a and the link field of this newly created node to do so we will
have to dereference this particular pointer variable that we just created. As we know, if we put an asterisk sign in front of the pointer variable we mean dereferencing it, to modify the value at that particular address. Now in this case we have a node which has two fields, so once we dereference, if we want to access each of the fields we need to put something like (*temp).data here to access the data and (*temp).link to write to the link field. So we will write a statement like this to fill in value 2 here, and we have this temp variable pointing to this right now, and the link part of this newly created node should be null because this is the first and the last node. And the final thing that we need to do is write the address of this newly created node in A, so we will write something like A is equal to temp. Okay, temp was there to temporarily store the address of the node till the time we had fixed all the links properly; we can now use temp for some other purpose, our linked list is intact, now it has one node. For these two lines that we have written here for dereferencing and writing the values into the new node there is an alternate syntax: instead of writing something like (*temp).data we could also write temp followed by this arrow and data; we will have two characters to make this arrow, one hyphen and one right angular bracket, so we can write something like this, and the same thing below we can write something like this. To create a memory block in C++ we can use malloc as well as the new operator, so in C++ it gets very simple: we could simply write node star temp is equal to new node, like this, and we would mean the same thing.
this is a lot cleaner and new operator is always preferred over malloc so if you're using c++ new is recommended so so far through our program we created an empty list by creating this pointer to the head node and assigning the value null to it initially then we created a node and we added
this first node in this linked list when the list is empty and we want to insert a node the logic is pretty straightforward when the list is not empty we may want to insert a node at the beginning of the list or we may want to insert a node at the end of the list or we may even want to insert a
node somewhere in the middle of the list at a particular position we will write separate functions and routines for these different kind of insertions and we will see running code in a compiler in our coming lessons let's just talk about the logic here in this whatever unstructured code I
have right now so I want to write a code to insert two more nodes each time at the end of the list we actually want to create the linked list with three nodes having values two four and six that was our initial example in the beginning okay so let us add two more nodes with values
four and six into this linked list at this stage in our code we already have a variable temp which is pointing to this particular node we will create a new node and use the same variable name to collect the address of this new node so we will write a statement like this so a new node is created
and temp now stores the address of this new node, which is located at address 100 here. Once again we can set the data and then, because this is going to be the last node, we need to set the link as null. Now all we need to do is build the link to this particular node: write the address of this
newly created node into the address field of this last node. To do so we will have to traverse the list, we will have to go to the end of the list. To do so we will write something like this: we can create a new variable temp1, which will be pointer to node, and initially we can point this variable to the head node by a statement like this. We can write a loop like this; now this is generic logic to reach the end of the list. It will not be so clear if we see this logic with only one node as we have in this example, so let's draw a list with multiple
nodes so we are pointing temp one to the first node here and if the link part of this node is null we are at the last node else we can move to the next node so temp one equal temp one dot link will get us to the next node and we will go on till we reach the last node
for the last node this particular condition temp one dot link not equal null will be false because the link part will be null and we will exit this while loop so this is our code logic for traversal of the list all the way till end if we want to print the elements in the list we
will write something like this and we will write print temp dot data inside this while loop but right now we want to insert at the end of the list and we are only traversing the list to reach the last node there is one more thing that I want to point out we are using this variable
temp one and initially storing the address in a we are not doing something like a equal a dot link and using the variable a itself to traverse the list because if we modify a we will lose on the address of the head node so a is never modified the address of the head node whatever
variable stores the address of the head node is never modified only these temporary variables are modified to traverse the list so finally after all this we will write a statement like temp one dot link is equal to temp temp one is pointing here so now this address part is updated and this link
is built so we have two nodes now in the list once again when we want to insert node with number six in this list we will have to create a new node by this logic then we will have to traverse the list by this logic so we will point temp one here first and then the loop will
move the temp one to the end let's say this new block is at address 300 so this last line finally will adjust the link of the node at address 100 to insert a node at the end there is one logic in these four lines if the list is empty and there is another logic in these remaining lines if the
list is not empty ideally we will be writing all this logic in a function we will do that in our coming lessons we will implement separate methods to print all the nodes in a linked list and to insert a node at the end we will implement a separate method to insert a
node at the beginning of the list and at a particular position in the list so this is all for this lesson thanks for watching in our previous lesson we saw how we can map the logical view of a linked list into a c or c++ program we saw how we can implement two basic operations one traversal of
the linked list and another inserting a node at the end of the linked list in this lesson we will see a running code that will insert a node at the beginning of the linked list so let's get started I will write a c program here the first thing that we want to do in our program is that we want
to define a node a node will be a structure in c it will have two fields want to store the data let's say we want to create a linked list of integers so our data type will be integer if we wanted to create a linked list of characters then our type would be character here so we will have another
field that will be a pointer to node that will store the address of the next node we can name this variable link, or some people also like to name this variable next because it sounds more intuitive; this variable will store the address of the next node in the linked list
in c whenever we have to declare node or pointer to node we will have to write struct node or struct node star in c++ we will have to write only node star and that's one difference okay so this is the definition of our node now to create a linked list we will have to create
a variable that will be pointer to node and that will store the address of the first node in the linked list what we also call the head node so I will create a pointer to node here struct node star we can name this variable whatever often for the sake of understanding
we name this variable head now I have declared this variable as a global variable I have not declared this variable inside any function and I'll come back to why I'm doing so now I'll write the main method this is the entry point to my program the first thing that I want to do is I want to
say head is equal to null which will mean that this pointer variable points nowhere so right now the list is empty so far what we have done here in our code is that we have created a global variable named head which is of type pointer to node and the value in this pointer variable is null
so so far the list is empty now what I want to do in my program is that I want to ask the user to input some numbers and I want to insert all these numbers into the linked list so I'll print something like how many numbers let's say the user wants to input n numbers so I'll collect this
number in this variable n and then I'll define another variable I to run the loop and so I'm running a loop here if it was c++ I could declare this integer variable right here inside the loop now I'll write a print statement like this and I'll define another variable x and each time
I'll take this variable x as input from the user and now I will insert this particular number x this particular integer x into the linked list by making a call to the method insert and then each time we insert we will print all the nodes in the linked list the value of all the nodes in
the linked list by calling a function named print there will be no argument to this function print of course we need to implement these two functions insert and print let me first write down the definition of these two functions so let us implement these two functions insert and print
let us first implement the function insert that will insert a node at the beginning of the linked list now in the insert function what we need to do is we first need to create a node in c we can create a node using malloc function we have talked about this earlier malloc returns a pointer to
the starting address of the memory block we are having to typecast because malloc returns a void pointer and we need a pointer to node, a variable that is pointer to node, and only if we dereference, using an asterisk sign, will we be able to access the fields of
the node so the data part will be x and we have an alternate syntax for this particular syntax we could simply write something like temp and this arrow and it will mean the same thing and this is more common with these two lines in the insert function all we are doing is we are
creating a node let's say we get this node and let's assume that the address that we get for this node is hundred now there is a variable temp where we are storing the address we can do one thing whenever we create a node we can set data to whatever we want to set and we can set the link field
initially to null and if needed we can modify the link field so I'll write one more statement temp.next is equal to null remember temp is a pointer variable here and we are de-referencing the pointer variable to modify the value at this particular node temp will also take some space in
the memory that's why I have shown this rectangular block for both the pointer variables head and temp and node has two parts one for the pointer variables and one for the data so this part the link part is null we can either write null here or we can write it like this it's the same thing logically it
means the same now if we want to insert this node in the beginning of the list there can be two scenarios one when the list is empty like in this case so the only thing that we need to do is we need to point head to this particular node instead of pointing to null so I will write a
statement like head is equal to temp and the value in head now will be address hundred and that's what we mean when we say a pointer variable points to a particular node we store the address of that node so this is our linked list after we insert the first node let us now see what we can do to
insert a node at the beginning if the list is not empty like what we have right now once again we can create a node fill in the value x here that is passed as argument initially we may set the link field as null and let's say this node gets address 115 in the memory and we have this variable temp
through which we are referencing this particular memory block now unlike the previous case if we just set head is equal to temp now this is not good enough because we also need to build this link we need to set the next or the link of the newly created node to whatever the previous head was
so what we can do is we can write something like if head is not equal to null or if the list is not empty first set temp dot next equal head so we first build this link the address here will be hundred and then we say head equal temp so we cut this link and point head to this newly created
node and this is our modified linked list after insertion of this second node at the beginning of the list now one final thing here this particular line the third line temp dot next equal null this is getting used only when the list is empty if you see when the list is empty head is already
null so we can avoid writing two statements we can simply write this one statement temp dot next equal head and this will also cover the scenario when the list is empty now the only thing remaining in this program to get this running is the implementation of this print function so let us
implement this print function now what I will do here is I'll create a local variable which is a pointer to node, named temp, and I need to write struct node here, I keep missing this, in c you need to write it like this, and I want to set this as the address of the head node so this global
variable has the address of the head node now i want to traverse the linked list so i will write a loop like this while temp is not equal to null i'll keep going to the next node using this statement temp is equal to temp dot next and at each stage i'll print the value in that node as temp dot data
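Putting together the pieces described in this lesson so far — the node definition, the global head, the insert at the beginning, and the print traversal, including the framing print statements mentioned just below — a consolidated sketch could look like this; the exact prompt strings are illustrative.

    #include <stdio.h>
    #include <stdlib.h>

    struct Node {
        int data;
        struct Node* next;
    };

    struct Node* head;                     /* global, so Insert and Print can both see it */

    void Insert(int x) {
        struct Node* temp = (struct Node*)malloc(sizeof(struct Node));
        temp->data = x;
        temp->next = head;                 /* also covers the empty list, since head is NULL then */
        head = temp;                       /* the new node becomes the head */
    }

    void Print() {
        struct Node* temp = head;          /* a temporary, so head itself is not modified */
        printf("List is:");
        while (temp != NULL) {
            printf(" %d", temp->data);
            temp = temp->next;
        }
        printf("\n");
    }

    int main(void) {
        head = NULL;                       /* empty list */
        int n, i, x;
        printf("How many numbers?\n");
        scanf("%d", &n);
        for (i = 0; i < n; i++) {
            printf("Enter number\n");
            scanf("%d", &x);
            Insert(x);                     /* insert at the beginning */
            Print();                       /* print the list after each insertion */
        }
        return 0;
    }
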
now I'll write two more print statements, one before this while loop and one after this while loop to print an end of line. Now why did we use a temporary variable? Because we do not want to modify head, because we will lose the reference of the first node, so first we collect the address
in head into another temporary variable and we are modifying the addresses in this temporary variable using temp is equal to temp dot next to traverse the list. Now let us run this program and see what happens. So this is asking how many numbers you want to insert in the list
let's say we want to insert five numbers initially the list is empty let's say the first number that we want to insert is two at each stage we are printing the list so the list is now two the first element and the last element is two we will insert another number the list is now five two five is
inserted at the beginning of the list again we inserted eight and eight is also inserted at the beginning of the list okay let's insert number one the list is now one eight five two and finally I inserted number 10 so the final list is 10 1 8 5 2 this seems to be working
now if we were writing this code in c++ we could have done a couple of things we could have written a class and organized the code in an object oriented manner we could also have used new operator in place of the malloc function and now coming back to the fact that we have declared this head
as global variable what if this was not a global variable this was declared inside this main function as a local variable so I'll remove this global declaration now this head will not be accessible in other functions so we need to pass address of the first node as argument to other functions
to both these functions print and insert so to this print method we will pass let's say we name this argument as head now we can name this argument as head or a or temp or whatever if we name this argument as head this head in print will be a local variable of print and will not be
this head in main these two heads will be different these two variables will be different when the main function calls print passing its head then the value in this particular head in the main function is copied to this another head in the print function so now in the print function we
need not use a temp variable; what we can do is we can use this variable head itself to traverse the list and this should be good, we are not modifying this head here in the main. similarly to the insert function we will have to pass the address of the first node and this head again is just a copy
this is again a local variable so after we modify the linked list the head in main method should also be modified there are two ways to do it one we can pass the pointer to node as return from this method so in the main method insert function will take another argument head and we will have
to collect the return into head again so that it is modified now this code will work fine whoops i forgot to write a return here return head and we can run this program like before we can give all the values and see that the list is building up correctly there was another way
of doing this instead of asking this insert function to return the address of head we could have passed this particular variable head by reference so we could have passed insert ampersand head head is already a pointer to node so in the insert function we will have to receive
pointer to pointer node star star and to avoid confusion let's name this variable something else this time let's name this pointer to head so to get head we will have to write something like we will have to dereference this particular variable and write asterisk pointer to head
everywhere and the return type will be void. Sometimes we want to name this local variable as head as well; it doesn't matter, but we will have to take care of using it properly. Now this code will also work, as you can see here we can insert nodes and this seems to be going well
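For the variant where head is a local variable of main and is passed by reference, a sketch could look like the following; the name pointerToHead follows the description above, and the print function simply receives a copy of the head address.

    #include <stdio.h>
    #include <stdlib.h>

    struct Node {
        int data;
        struct Node* next;
    };

    /* head lives in main here, so insert receives a pointer to that pointer */
    void Insert(struct Node** pointerToHead, int x) {
        struct Node* temp = (struct Node*)malloc(sizeof(struct Node));
        temp->data = x;
        temp->next = *pointerToHead;       /* dereference to reach main's head */
        *pointerToHead = temp;             /* main's head now points to the new node */
    }

    void Print(struct Node* head) {        /* this head is a local copy of main's head */
        while (head != NULL) {
            printf("%d ", head->data);
            head = head->next;             /* only the copy is modified */
        }
        printf("\n");
    }

    int main(void) {
        struct Node* head = NULL;          /* local variable, not global */
        Insert(&head, 2);                  /* pass the address of head */
        Insert(&head, 5);
        Print(head);                       /* prints: 5 2 */
        return 0;
    }
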
if you do not understand this concept of scope you can refer to the description of this video for additional resources so this was inserting a node at the beginning of the linked list thanks for watching in our previous lesson we had written code to insert a node at the beginning of the linked list
now in this lesson we will write program to insert a node at any given position in the linked list so let me first explain the problem in a logical view let's say we have a linked list of integers here there are three nodes in this linked list let us say they are at addresses 200 and 250
respectively in the memory and we have a variable head and that is pointer to node that stores the address of the first node in the list now let us say we number these nodes we number these positions on a one based index so this is the first node in the list and this is the second node and this
is the third node and we want to write a function insert that will take the data to be inserted in the list and the position at which we want to insert this particular data so we will be inserting a node at that particular position with this data there will be a couple of scenarios the list could
be empty so this variable head will be null or this argument being passed to the insert function the position n could be an invalid position for example 5 is an invalid position here for this linked list the maximum possible position at which we can insert a node in this list will
be 4 if we want to insert at position 1 we want to insert at the beginning and if we want to insert at position 4 we want to insert at end so our insert function should gracefully handle all these scenarios let us assume for the sake of simplicity for the sake of simplifying our
implementation that we always give a valid position we will always give a valid position so that we do not have to handle the error condition in case of invalid position the implementation logic for this function will be pretty straightforward
we will first create a node let's say in this example we want to insert a node with value 8 at third position in the list so i'll set the data here in the node the data part is 8 now to insert a node at the nth position we will first have to go to the n minus 1th node
in this case n is equal to 3 so we will go to the second node now the first thing that we will have to do is we will have to set the link field of this newly created node equal to the link field of this n minus 1th node so we will have to build this link
let's say the address that we get for this newly created node is 150 once we build this link we can break this link and set the link of this n minus 1th node as the address of this newly created node we may have special cases in our
implementation like the list may be empty or maybe we may want to insert a node at the beginning let's say we will take care of special cases if any in our actual implementation so now let's move on to implement this particular function in our program
in my c program the first thing that i need to do is i want to define a node so node will be a structure and we have seen this earlier so node has these two fields one data of type integer and another next of type pointer to node now to create a linked list the first thing that i need to create
is a pointer to node that will always store the address of the first node or the head node in the linked list so i will create struct node star let's name this variable head and once again i have created this variable as a global variable to understand linked list implementation we need
to understand what goes where what variable sits in what section of the memory and what is the scope of these variables what goes in the stack section of the memory and what goes in the heap section of the memory so this time as we write this code we will see what goes where in the main method
first i'll set this head as null to say that initially the list is empty so let us now see what has gone where so far in our program in what section of the memory and the memory that is allocated to our program or application is typically divided into these four parts or these four sections
we have talked about this in our lesson on dynamic memory allocation there is a link to our lesson on dynamic memory allocation in the description of this video i'll quickly talk about what these sections are one section of the applications memory is used to store all the instructions that need
to be executed another section is allocated to store the global variables that live for the entire lifetime of the program of the application now one section of the memory which is called stack is used to store all the information about function call executions to store all the local
variables and these three sections that we talked about are fixed in size their size is decided at compile time the last section that we call heap or free store is not fixed and we can request memory from the heap during runtime and that's what we do when we use malloc or new operator
now I have drawn these three sections of the memory stack heap and the section to store the global variables in our program we have declared a global variable named head which is a pointer to node so it will go and sit here and this variable is like anyone can access it; initially the value here
is null now in my program what I want to do is I first want to define two functions: insert, and this function should take two arguments, the data and the position at which I want to insert a node, and it will insert that data at that position in the list
and another function print that will simply print all the numbers in the linked list now in the main method I want to make a sequence of function calls first I want to insert number two the list is empty right now so I can only insert at position one so after this insert the list
will be having this one number this particular number two and let's say again i want to insert number three at position two so this will be our list after this insertion and i will make two more insertions and finally i'll print the list so this is my main method i could have also
asked a user to input a number and position but let's say we go this way this time now let us first implement insert i'll move this print above so the first thing that i want to do in this method is i want to create a node so i will make a call to malloc in c++ we can simply
write new Node in place of this call to malloc and this looks a lot cleaner, let's go the c++ way this time; now what I can do is I can first set the data field and set the link initially as null. I have named this variable temp one because I want to use another temp variable
in this function i'll come to that in a while now we first need to handle one special case when we want to insert at the head when we want to insert at the first position so if n is equal to one we simply want to set the link field of the newly created node as whatever
the existing head is and then adjust this variable to point to the new head which will be this newly created node and we will be done at this stage so we will not execute any further and return from this function if you can see this will work even when the list is empty because the head will be
null in that case i'll show a simulation in the memory in a while so hold on till then things will be pretty clear to you after that now for all other cases we will first need to go to the n-1th node as we had discussed in our logic initially so what i'll do is i'll create another
pointer to node name this variable temp two and we will start at the head and then we will run a loop and go to the n-1th node something like this we will run the loop n-2 times because right now we are pointing to head which is the first node so if we do this temp two equal temp two dot next
n-2 times we will be pointing temp two to the n-1th node and now the first thing that we need to do is set the next or the link field of the newly created node as the link field of this n-1th node and then we can adjust the link of this n-1th node to point to our newly created node and now
I've written this print here; we have used a temporary variable, a temporary pointer to node, initially pointed it to head, and we have traversed the whole list. okay so let us now run this program and see what happens we are getting this output which seems to be correct
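A sketch of this insert-at-position routine and the calls described in this lesson might look like the following; the lesson uses new in C++, here malloc keeps the sketch in plain C, and the last two insertions are only one possible choice that produces the list 4 5 2 3.

    #include <stdio.h>
    #include <stdlib.h>

    struct Node {
        int data;
        struct Node* next;
    };

    struct Node* head;                      /* global, as in this lesson */

    void Insert(int data, int n) {          /* insert data at position n, assumed valid, 1-based */
        struct Node* temp1 = (struct Node*)malloc(sizeof(struct Node));
        temp1->data = data;
        temp1->next = NULL;
        if (n == 1) {                       /* insert at the head; also works for an empty list */
            temp1->next = head;
            head = temp1;
            return;
        }
        struct Node* temp2 = head;
        for (int i = 0; i < n - 2; i++) {   /* walk to the (n-1)th node */
            temp2 = temp2->next;
        }
        temp1->next = temp2->next;          /* new node points to the old nth node */
        temp2->next = temp1;                /* (n-1)th node points to the new node */
    }

    void Print() {
        struct Node* temp = head;
        while (temp != NULL) {
            printf("%d ", temp->data);
            temp = temp->next;
        }
        printf("\n");
    }

    int main(void) {
        head = NULL;
        Insert(2, 1);                       /* list: 2 */
        Insert(3, 2);                       /* list: 2 3 */
        Insert(4, 1);                       /* list: 4 2 3   (one possible choice for the  */
        Insert(5, 2);                       /* list: 4 5 2 3  two remaining insertions)    */
        Print();                            /* prints: 4 5 2 3 */
        return 0;
    }
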
the list should be four five two three in this order now i have this code i'll run through this code and show you what's happening in the memory when the program starts execution initially the main method is invoked some part of the memory from the stack is allocated for execution of a
function all the local variables and the state of execution of this function is saved in this particular section we also call this stack frame of a function here in this main method we have not declared any local variable we just set head to null which we have already done here now the
next line is a call to function insert so the machine will set the execution of this particular method main on hold and go on to execute this call to insert so insert comes into this stack and insert has a couple of local variables, it has two arguments data and this variable n
this stack frame will be a little larger because we will have a couple of local variables and now we create this another local variable which is a pointer to node, temp1, and we use the new operator to create a memory block in the heap and this guy temp1 initially stores the address of this
memory block let's say this memory block is at address 150 so this guy stores the address 150 when we request some memory to store something on the heap using new or malloc we do not get a variable name and the only way to access it is through a pointer variable so this pointer variable
is the remote control here kind of so here when we say temp1 dot data is equal to this much through this pointer which is our remote we are going and writing this value to here and then we are saying temp1 dot next equal null so null is nothing but address 0 so we are writing
address 0 here so we have created a node and in our first call n is equal to 1 so we will come to this condition now we want to set temp1 dot next equal head temp1 dot next is this section this second field and this is already equal to head head is null here and this is
already null, null is nothing but 0; the reason setting temp1 dot next equal head works even for the empty-list case is that head would be null. and now we are saying head is equal to temp1 so head now points to this node because it stores address 150 like temp1
and in this first call to insert after this we will return so the execution of insert will finish and now the control returns to the main method we come to this line where we make another call to insert with different arguments this time we pass number 3 to be inserted at position 2
now once again memory in the stack frame will be allocated for this particular call to insert the stack frame allocation is corresponding to a particular call so each time the function execution finishes all the local variables are gone from the memory
now once again in this call we create a new node we keep the address initially in this temporary variable temp1 now let's say we get this node at address 100 this time now n is not equal to 1 we will move on to create another temporary variable temp2 now we are not creating a new node and
storing the address in temp2 here we are saying temp2 is initially equal to head so we store the address 150 so initially we make this guy point to the head node and now we want to run this loop and want to keep going to the next node until we reach n minus 1th node in this case n is equal to
2 so this loop will not execute this statement even once, the n minus 1th node is the first node itself. now we execute these two lines: the next of the newly created node will be set first to temp2 dot next; temp2 dot next is 0 only, so even after this assignment it will still be 0
and now we are setting temp2 dot next as temp1 so we are building this link and now this call to insert will finish so we go back to the main method so this is how things will happen for other calls also so after everything we have inserted when we will reach this print statement
in the main function our list will be something like this in the memory this is a little messy, I've chosen these addresses as per my convenience for the sake of example and now print will execute and once again I'm using a temp variable in print by now it should have been clear to you
why we use temp variable again and again and why this variable head that stores the address of the first node is so important now what if this head was not global what if we would have declared this head inside the main method we have talked about this in our previous lesson head will not
be accessible everywhere so in each call to these functions in each call to insert we will have to return some value from the function to update this head or we will have to pass this head by reference we have talked about this in our previous lesson so this is it for this lesson
in our next lesson we will see program to delete a node at a particular position in the list so thanks for watching in our previous lesson we wrote program to insert a node at nth position or a given position in a list in a linked list now in this lesson we will write a program to delete
a node at any given position in a linked list so once again i have drawn a linked list here we have four nodes in this list at addresses 100, 200, 150 and 250 respectively so this is my example of a linked list of integers and let's say we number the positions on a one based index
so this is the first node in the list and this is the second node this is the third node and this is the fourth node when we talk about deleting a node from the linked list we will have to do two things first we will have to fix the links so that the node is no more a part of the list
let's say in this case we want to delete the node at third position so we will go to the second node for nth node we will have to go to the n minus 1th node and we will have to set the link part of the n minus 1th node as the link of the nth node which will be the n plus 1th node so we will cut
this link and now this node at address 150 is not part of the linked list because when we will traverse the linked list we will go from address 100 to 200 and from 200 we will go to 250 this is one scenario for deletion in which we have a node before and a node after
there will be special cases like we may want to delete the node at the first position or the head itself in that case we will have to point head to the second node we will have to build this link now we will talk about all these special cases in our implementation let's first understand
the logic now fixing the links is not good enough because all that we do when we fix the links is that we detach the node from the linked list so that it is no more accessible but it is still occupying space in the memory as we know a node is allocated space from what we call the dynamic
memory or the heap section of the memory we have talked about this earlier in C or C plus plus we have to explicitly free this memory when we are done using it because it is not automatically deallocated and memory being a crucial resource we do not want to consume it unnecessarily when
we do not need it so the second thing that we will have to do is we will have to free the space that's being taken by the node and that's when the node will actually be deleted from the memory so let us now write code for this I'm writing my C program here the first thing that I have
done is I have defined a node which is a structure with two fields one to store data and another to store address of the next node so the second field is a pointer to node now to create a linked list we will have to first create a pointer to node a variable which is pointer to node and
that will store the address of the head node or the first node in the list and now I want to define three functions first insert function that will take some value some data to be inserted into the list and always insert this value at the end of the list then I want to define a print function
that will print all the elements in the list now we have defined this variable head as a global variable so it will be accessible to all these functions and the third function that I want to write is delete that will take the position end of the node to be deleted and delete the node at
that particular position we will go back to implementing these methods first I'll write the main method so in the main method first what I'll do is I'll set head as null so at this stage the list is empty and then I'll make a couple of calls to insert function to insert some integers
in the list so after this fourth insert the list will be two four six five because we are always inserting at the end of the list this insert function will insert the node at the end of the list now what I want to do in my main method is I want to ask a user for a position and I'll input this
position from the console and then I'll delete the node at this particular position and then I'll print the whole linked list now let's also make a call to print after all the inserts okay so this is what we want to do in our main method we want to insert four integers
in a linked list to create a list two four six five in this order and then I want to print the list then I want to input a number from the console and delete the node at that particular position now let us assume that we will always give a valid position and in my implementation also
I will not handle the error condition when position will not be valid we have seen implementation of insert and print earlier so I will not go into their implementation details what I'll do now is I'll implement delete function now in my delete function let's first
handle the case when there is a node before the node that we want to delete so we have an n minus one-th node what I'll do is I'll first create a temporary variable that is a pointer to node and point this to head and using this temporary variable we will go to n minus one-th node to go
to the n minus one-th node we will have to run a loop n minus two times and we will have to do something like this temp1 is equal to temp1.next now what I'll do is I'll create a variable to point to the nth node name this temp2 and this will be equal to temp1.next and now I can fix the link
I can say adjust the link section, the link part, of the n minus one-th node to point to the n plus one-th node which will be temp2.next. now our links are fixed and this variable temp2 stores the reference to the nth node so we can make a call to the free
function. now the free function deallocates whatever memory is allocated through malloc; if we were using c++ and had used the new operator, we should have said delete temp2. okay now we should be good, this much code will work for scenarios when we have an n minus one-th
node and even if there is no n plus one-th node, if the n plus one-th position is null, this will work for that scenario; I'll leave that as an exercise for you to validate. now we have not handled one special case, when we want to delete the head node, so if n is equal to one then what we want
to do is we just want to set head as temp1.next temp1 is right now equal to head and now head has moved on to point to the second node and temp1 still points to the first node so links are fixed and we can free the first node which is now detached from the linked list because head is now
pointing to the second node okay so this is our delete function I have missed one thing here for n not equal to one we should not execute this section of the code so either we put an else statement after this or what we can do is we can say return after we execute these statements
for this condition now this code should work if I've got everything right so let us now run this and see what happens I have already written the insert and print functions I'll come back to this main function this is my list 2 4 6 5 and I can enter any of the positions one two three or four
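For reference, a sketch of this delete routine, together with a minimal insert-at-end and print so it runs on its own, might look like the following; a valid position is assumed, as in the lesson.

    #include <stdio.h>
    #include <stdlib.h>

    struct Node {
        int data;
        struct Node* next;
    };

    struct Node* head;

    void Insert(int data) {                       /* always insert at the end, as in this lesson */
        struct Node* temp = (struct Node*)malloc(sizeof(struct Node));
        temp->data = data;
        temp->next = NULL;
        if (head == NULL) { head = temp; return; }
        struct Node* last = head;
        while (last->next != NULL) last = last->next;
        last->next = temp;
    }

    void Print() {
        for (struct Node* t = head; t != NULL; t = t->next) printf("%d ", t->data);
        printf("\n");
    }

    void Delete(int n) {                          /* delete the node at position n, assumed valid */
        struct Node* temp1 = head;
        if (n == 1) {
            head = temp1->next;                   /* head moves on to the second node */
            free(temp1);                          /* free the detached node (delete in C++) */
            return;
        }
        for (int i = 0; i < n - 2; i++)
            temp1 = temp1->next;                  /* temp1 now points to the (n-1)th node */
        struct Node* temp2 = temp1->next;         /* the nth node */
        temp1->next = temp2->next;                /* (n-1)th node now points to the (n+1)th */
        free(temp2);
    }

    int main(void) {
        head = NULL;
        Insert(2); Insert(4); Insert(6); Insert(5);   /* list: 2 4 6 5 */
        Print();
        int n;
        printf("Enter a position\n");
        scanf("%d", &n);
        Delete(n);
        Print();
        return 0;
    }
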
so let's first say we want to delete the head node and we are printing the list after deleting a particular node so the list now is 4 6 5 this seems to be correct let us run this again and this time I delete number five from position four the list is now 2 4 6 which is correct again
similarly if I enter position two the list is 2 6 5 which is correct so we seem to be good I'll quickly walk you through this code in the logical view to make things further clear let's say we first make a call to delete node from the first position that is we want to delete
the head node so in this code what we are doing is we are first creating a variable temp1 which is a pointer to node; initially temp1 is equal to head so it stores the address 100, so it points to the head node. now n is equal to 1 so we come to this instruction head is equal to temp1 dot next
actually this is temp1 arrow next but while reading we read this as temp1 dot next; this is nothing but syntactic sugar for the statement asterisk temp1, in brackets, dot next, so we are dereferencing this pointer variable to go to this node and then accessing the next
field of this node now we are saying head is equal to temp1 dot next so head is now 200 so we are building this link and breaking this link and now in the next line we say free temp1 so we want to free the memory which is being pointed to by this variable temp1
temp1 still points to this node at address 100 so this node now will be cleared from the memory and now we return so this function does not execute any further it finishes its execution once the function execution finishes temp1 which was a local variable also gets cleared from the
memory; head is a global variable so it does not get cleared. this is how we know the linked list, this is the identity of the linked list, this particular variable head. let's run through this code again and this time I want to delete the node at the third position in the list, I have drawn this
initial list so once again we create this variable temp1 we say that the address here is equal to 100 so it points to the head node or the first node and now n is not equal to 1 it is equal to 3 so we come to this particular loop n is equal to 3 so this loop will execute exactly once
this statement will execute exactly once so temp1 will now move to address 200 so temp1 is now pointing to the second node this is what we wanted to do we wanted temp1 to point to n minus 1th node n is 3 here now we create another variable another pointer to node temp2 and we set this guy as temp1
dot next temp1 dot next is 150 so we set this guy as 150 so this guy points to the nth node or the third node now in the next line we are saying that temp1 dot next this value which is 150 right now is now temp2 dot next address of the n plus 1th node or fourth node so this guy will
now be 250 so we are building this link and we are breaking this link so we have fixed the links and now finally we are saying that free the memory which is being pointed to by temp2 so now this third node the memory block will be deleted from the memory and once this function
execution finishes all the local variables temp1 and temp2 will be cleared and this is what the list will be this node at address 250 will now be the third node so this was deleting a node at a particular position in the linked list now we can also have a problem where we may want to delete
a node with a particular value now you can try implementing it in the coming lessons we will see more problems on linked list so thanks for watching in our lessons on linked list so far we have implemented some of the basic scenarios like inserting a node in linked list and deleting a node
from linked list in this lesson we will write code to reverse a linked list this is one of the most favorite interview questions and this is a really interesting problem so let me first define the problem let's say we have been given a linked list of integers like this so this is our input
we have four nodes in this linked list at addresses 100, 200, 150 and 250 respectively I always write these addresses in the logical view because it's really important that we visualize how things are in the memory and what is what like this first node that we also call the head node
is being pointed by this particular variable named head so this variable is basically storing the address of the head node now this variable is only a pointer this is not the head node itself and we do not have any other identity of the linked list except the address of the head node
so given a linked list like this if we have to reverse it and by reversing we do not mean moving around data like we cannot move five at address 100 two at address 250 and do something like this we actually have to adjust the links so our output should be something like this the head
pointer should point to this node at address 250 and we should go like 250 to 150 to 200 and this node at address 100 should have address 0 or null in each of these nodes this first field in red is the data part and the second field is the address part so this is what we will get when
we will reverse the list there are two approaches to solve this problem one is an iterative approach where we will be using a loop we will traverse through the list and at each step we will revert one of the links another solution is using recursion in this lesson we will try to understand
the iterative solution so coming back to our input list the iterative solution is relatively easier to understand what we can do is we can traverse the whole list and as we go to each node we can adjust the link part of that node to make it point to the previous node instead of the next
node so we will start at the first node at each step we want to reverse the link so we want to make the node point to the previous node instead of the next node for the first node there is no previous node so let's say the previous node is null and now we want to cut this particular link
and we want to build this particular link so we will simply change the address field to 0 and we have reversed the link part of this particular node and now we will go to the next node in the list we will come to this node of course the question would be how would we
go to the next node if we have broken this link here we will come back to that in our implementation details let's say we are able to traverse the list and go to each of the nodes at each step let's say we store all the relevant information to do that in some temporary variables now at this
node again we will reverse the link so the address part will be set as 100 here now we will go to the next node at address 150 once again to reverse the link we will set the address as 200 here so we will break this link and basically we are building this link and now we will go to address
250 the next node we will set the address 150 here so we will cut this link and build this link and finally when we have reached the last node we will adjust the address in this head variable to 250 so this particular variable this particular pointer
will point to this node at address 250 and our linked list is reversed now so let us implement this particular logic in a real c program I will redraw the original input list; in my C code I will define node as a structure like this, this is how we have defined a node
in all our previous lessons so there will be two fields one to store the data which will be of type integer and another to store the address of the next node we will name this field next and it will be of type pointer to node and let's say head is a global variable so head is a pointer to node
head is a variable which is a pointer to node and it is a global variable, so it is accessible to all the functions and we do not need to pass it around to functions. now all I want to do in my code is I want to write a reverse function that will reverse the linked list which is pointed to
by this particular pointer head as we said we will traverse the whole list and at each step we will modify the link field of the node to make it point to the previous node instead of the next node so how do we traverse the list we would traverse the list in our C code something like this
we will first take a variable which will be pointer to node, let's say we will name it temp, then first we will set temp to head; by saying this we will make temp point to the first node and then we will run a loop like this, we will say that while temp is not
equal to null take temp to the next address with a statement like temp is equal to temp dot next in our problem here we don't just have to traverse the list as we traverse the list we have to reverse the link so we have to set the address field of a particular node as the address of the
previous node instead of the next node now in a linked list we would always know the address of the next node but we would never know the address of the previous node so as we traverse the list we will have to keep track of the previous node in another variable so what I will do here is
something like this I will also declare a variable named previous and initially set it to null because for the first node or the head node the previous node is null and now in my loop we will have to update both these variables and the variable temp that will store the current node
and the variable previous that will store the address of the previous node and now in my loop I can do something like this at each step if temp is our current node as we are traversing the list then we will say that temp dot next is equal to previous so we will set the link part of the
current node as the address of the previous node in our example here at the first step we will say that temp dot next will be 0 null is nothing but address 0 so we will cut this link and we will build this link now we should be able to move temp to 200 now and we should be able to move
previous to 100 now in the next step but there is a problem as soon as we adjusted the link of this particular node at address 100 to make it point to null we lost the address of the next node so how do we move temp to this particular node at address 200 we cannot set temp equal temp
dot next now, if we set temp equal temp dot next now we will go to null, so this is a problem. so at each step in our iteration, before we set the link field of the current node to make it point to the previous node, we should store the address of the next node in
another temporary variable so what I'll do here is something like this first of all I want to name this particular variable temp as current to mean that this is the current node at any stage in my iteration so we initially set current to head and then we are running the loop as while current is
not equal to null and then I've also declared one more temporary pointer variable named next what I'll do at each step each step in my iteration inside the while loop is that first I'll say something like next is equal to current dot next so first I'll store the address of the next node
in this particular variable next so in our example here for the first node initially things will look something like this now we can set the link part of the current node as address of the previous node with a statement like this so when we will write the address 0 here initially we will break
this link and create this link we will not lose the information about the next node now we can redefine our previous and current so we will first move previous to current and then we will move current to next please note that this particular variable next is a local variable in the reverse
function and when we say something like current dot next we mean the link field in the node while when we say when we simply say next we mean this particular local pointer variable so they're different this is not current dot next actually this is current arrow next which is an alternate
syntax for asterisk current, in brackets, dot next; so we use the asterisk operator to dereference that address and then we access the next field, but for the sake of saying we just say current dot next. so with these two lines in our loop we are resetting our previous and current pointers
this is how we are traversing the list if you see in the next iteration current is 200 it is not equal to null null is 0 so we will go to this particular statement next is equal to current dot next so next we'll now store the address 150 and now we will say current dot next is equal to previous
so we will cut this link, previous is 100 right now so we will set 100 here, so basically we will build this particular link, and then we will first move previous to current and then move current to next and we will go on like this
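Collecting the steps described so far, a sketch of the iterative reverse might look like this, including the final head adjustment described next; head is kept global as described at the start of this lesson (the lesson later also shows a version that takes and returns the head), and the small insert and print helpers are only there so the sketch runs.

    #include <stdio.h>
    #include <stdlib.h>

    struct Node {
        int data;
        struct Node* next;
    };

    struct Node* head;                    /* global, as described in this lesson */

    void Reverse() {
        struct Node* current = head;
        struct Node* prev = NULL;         /* the head node has no previous node */
        struct Node* next;
        while (current != NULL) {
            next = current->next;         /* save the next node before breaking the link */
            current->next = prev;         /* reverse the link of the current node */
            prev = current;               /* move prev and current one step forward */
            current = next;
        }
        head = prev;                      /* prev ends at the last node, which is the new head */
    }

    void Insert(int data) {               /* insert at the end, only to build a test list */
        struct Node* temp = (struct Node*)malloc(sizeof(struct Node));
        temp->data = data;
        temp->next = NULL;
        if (head == NULL) { head = temp; return; }
        struct Node* t = head;
        while (t->next != NULL) t = t->next;
        t->next = temp;
    }

    void Print() {
        for (struct Node* t = head; t != NULL; t = t->next) printf("%d ", t->data);
        printf("\n");
    }

    int main(void) {
        head = NULL;
        Insert(2); Insert(4); Insert(6); Insert(8);
        Print();                          /* 2 4 6 8 */
        Reverse();
        Print();                          /* 8 6 4 2 */
        return 0;
    }
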
so finally we will reach a stage like this when current will be equal to null we will come out of the loop and when we will come out of the loop this particular variable previous this particular pointer previous will store the address of the last node and there is one more thing remaining
here we need to adjust this particular variable head this link at this stage does not exist and in my code I'll say head should now be equal to the address in variable previous so head is now 250 this is our new head and now our list is reversed there are a couple of things that I
want to point out here one thing is that we must see whether our implementation is working for all test cases so we must also verify it for special or corner test cases in this case corner test case will be when the list is empty in that case head will be null or when the list is having
only one node if you see this particular implementation will work for these two scenarios give it some time and you should be able to figure it out let's now run this code with complete implementation of all the functions to insert and print nodes in my code here I have
written reverse function to accept the address of the head node as argument and then return the address of the head node after modification of the list after reversal of the list and then i have written the main method in which i'm declaring head as a local variable
and then I'm making a couple of calls to the insert function; the insert function also takes two arguments, the address of the head node and the data to be inserted, and it returns back the address of the head node, which could either be modified or not modified
let's say we are inserting at the end of the list so initially our list will be two four six eight and then we are making a call to the print function which i have written to print the elements in the list and then i'm making a call to reverse and finally printing again
my logic of the reverse function remains the same except that i've changed the method signature and in the end i'm returning head which will return the address of the head node let's say we have written all the other functions insert and print correctly
these are the two functions insert and print so let's now run this code and see what happens before the list is reversed the output is two four six eight and after the list is reversed the output is eight six four two let us try this for the case when we have only one element in the list
so I'll comment out these three insert statements and this also seems to be working so this was reversal of a linked list through iteration in the next lesson we will write code to reverse a linked list using recursion so thanks for watching in our series on linked list so far we have implemented
some of the basic operations like insertion deletion and traversal now in this lesson we will write code to traverse and print the elements of a linked list using recursion prerequisite for this lesson is that you should understand recursion as a programming concept
recursive traversal of linked list actually helps us solve a couple of interesting problems but in this lesson we will keep it simple we will just traverse and print all the elements in linked list using recursion and we will write one simple variation to print all the elements in reversed
order using recursion we will actually not reverse the list we will just print the elements in reversed order so once again i have taken example of a linked list of integers here we have four nodes each rectangle here is a node it has two fields one to store the data and another to store the
address of the next node let's say we have four nodes at addresses 100 200 150 and 250 respectively and of course we will also have a variable that will store the address of the head node let's name this variable head programmatically in our c or c++ program a node will be defined something
like this we will have a structure with two fields one to store the data and another to store the address of the next node what we want to do in this particular lesson is that we want to write two functions first we want to write a function named print that will take address of a node as
argument we will pass this function the address of the head node so let's name this argument head and in this function we will use recursion to print the elements in the list so for this particular example here if we want to print a space separated list of all the elements our output will be
something like this and we also want to write another function named reverse print here also we will take the address of a node so we will pass this guy the address of the head node and in this function we will use recursion to print the elements in the list in reversed order so if we have to
print a space separated list for this example, our output will be something like this. so let's first implement the print function; in my c code here I'll declare the print function like this, it will take as argument the address of a node so the argument is of type pointer to node
initially we will pass the address of the head node we can name this argument head or we can name this argument p we can name it whatever but we must understand that this will be a local variable and let's not bother about other infrastructure in the code like how we would create a linked
list and how we would insert a node in the linked list let's assume that they are in place so let's keep the name of this particular argument p now recursion is a function calling itself so we have been passed the address of a node initially the head node so what we can do in our code is first
we can print the value at that particular node with a printf statement like this and then we can make another call to the print function and this time we will pass the address of the next node with a statement like this; this next field is also a pointer to node, so this will be the address
of the next node. there is one more important thing in recursion that we should never forget and it's the exit condition from recursion, we should not go on making recursive calls infinitely, so in this case if we go from the first node to the second node
and from the second node to the third node using recursion then finally at one stage p will be equal to null in one of the calls at this stage we can avoid making a recursive call we will exit we will show you a simulation of how things will happen in memory hold on for a while
so once we will reach the end of the list p will be equal to null and we will exit the recursion at that stage now i'll write the main method i've already written the insert function here so i'll declare a variable head as null in the main method so head will be a local variable
once again we could name this particular variable a or b or whatever just because this variable points to the head node or the first node in the list we named this variable as head and then we will insert some nodes in the linked list by making calls to the insert function
that takes the address of the head node as argument; initially head is set as null to say that the list is empty and there should be two arguments to the function insert, the address of the head node and the value that needs to be inserted. and why is it that this particular function insert
is returning a pointer it's because this head in the main method is a local variable and if we pass it to the function we just pass a copy of the address of the head node in this head which will be a local variable of insert function so this guy returns back the address of the modified head
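A sketch of an insert-at-end written in this return-the-head style might look like the following; the main here only builds the list 2 4 6 5 used in this lesson.

    #include <stdlib.h>

    struct Node {
        int data;
        struct Node* next;
    };

    /* head is received as a copy, so the (possibly new) head address is returned
       and collected back into main's head */
    struct Node* Insert(struct Node* head, int data) {
        struct Node* temp = (struct Node*)malloc(sizeof(struct Node));
        temp->data = data;
        temp->next = NULL;
        if (head == NULL) return temp;    /* empty list: the new node becomes the head */
        struct Node* t = head;
        while (t->next != NULL) t = t->next;
        t->next = temp;
        return head;                      /* head itself is unchanged in this case */
    }

    int main(void) {
        struct Node* head = NULL;         /* local variable, so Insert must hand it back */
        head = Insert(head, 2);
        head = Insert(head, 4);
        head = Insert(head, 6);
        head = Insert(head, 5);           /* list: 2 4 6 5 */
        return 0;
    }
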
so we can update it in the main function this function inserts a node at the end of the list so initially when head is null head will be modified in the insert function for other cases it will not be modified if we are inserting at the end so we will make four such calls to the insert
function to create a linked list of four integers two four six five and now we will make a call to print function and pass it the address of the head node let us now run this code and see what happens as you can see we have got this output two four six five the print function here in our
code which is a recursive implementation to print the list is working. now I'll make one slight change in the print function: instead of printing the value in the node and then making a recursive call, I'll first make a recursive call and then when the recursive call finishes
I'll print the value in the node and I'll not modify anything else in the code, the main method will remain the same, and if we run this code we can see that the elements in the list are printed in reversed order so we just implemented the reverse print function that we
had talked about. let us now analyze these two recursive implementations in a logical view; in our example here if we want to print this particular list we will do something like this: from the main method we will make a call to the print function passing it the address of the head node
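For reference, before the walkthrough, here is a sketch of both recursive routines discussed in this lesson; the list in main is hand-built with the example values 2 4 6 5 just to keep the sketch self-contained.

    #include <stdio.h>
    #include <stdlib.h>

    struct Node {
        int data;
        struct Node* next;
    };

    void Print(struct Node* p) {           /* p is a local copy of the address passed in */
        if (p == NULL) {                   /* exit condition: we have walked past the last node */
            printf("\n");
            return;
        }
        printf("%d ", p->data);            /* print this node first ... */
        Print(p->next);                    /* ... then recurse on the rest of the list */
    }

    void ReversePrint(struct Node* p) {
        if (p == NULL) return;             /* exit condition, nothing to print here */
        ReversePrint(p->next);             /* first recurse all the way to the end ... */
        printf("%d ", p->data);            /* ... then print while the calls return */
    }

    int main(void) {
        /* hand-built list 2 -> 4 -> 6 -> 5, as in the running example */
        struct Node* d = (struct Node*)malloc(sizeof(struct Node)); d->data = 5; d->next = NULL;
        struct Node* c = (struct Node*)malloc(sizeof(struct Node)); c->data = 6; c->next = d;
        struct Node* b = (struct Node*)malloc(sizeof(struct Node)); b->data = 4; b->next = c;
        struct Node* head = (struct Node*)malloc(sizeof(struct Node)); head->data = 2; head->next = b;
        Print(head);                       /* prints: 2 4 6 5 */
        ReversePrint(head);                /* prints: 5 6 4 2 */
        printf("\n");
        return 0;
    }
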
so initially this print function is being called with p equal hundred now in the execution of this function we will come here if p is equal to null null is address 0 and our argument is hundred so control will not go inside this if condition we will come here we will print p arrow data p arrow
data means that we will first dereference the address so we will go to the address hundred and then we will look at the data field there so on the console we will print the data field of data field at address hundred and now we will make a recursive call we will make a call to
print function passing it address p arrow next which is 200 and the execution of this particular call will not finish it will finish only after print 200 finishes we will come back to it now print 200 once again prints the data at address 200 and then makes a recursive call to print function
passing address 150 and we will go on like this in this call to print with address 250 we will first print the data and the address field the the value of p dot next p arrow next is 0 what we can also say null so we will make a call like this now for this call
with argument null we have reached the exit condition recursion will not grow any further so we will just print an end of line and return this particular structure that we have drawn here is called recursion tree so print null function call will finish and control will return back
to print 250 there is no statement after this particular recursive call finishes so we will simply exit this function call also and control will return back to print 150 and we will go on like this finally we will come back to the main method
if you want to see how the recursion will execute in the memory then i'll have to draw a diagram like this of the application's memory the memory that is allocated for the execution of a program has these two sections all the details of function call execution and the local variables
they are stored in the stack section of the memory and any memory that is allocated using the malloc function or the new operator in c++ they go into the heap section the memory for the nodes in a linked list is allocated from the heap so that's why these four nodes
in our example are sitting in the heap if you want to know in detail about stack and heap check the description of this video for a lesson on dynamic memory allocation when the program will start executing first the main function will be invoked
anytime a function is invoked some amount of memory from the stack is allocated for the execution of that particular function now it's called the stack frame of that function so let's say main is executing we have already inserted some nodes in the linked list we have this variable head in
the main function so all the local variables sit in the stack frame of the function so head will sit here now at this stage let's say main makes a call to print function so main was executing and now it makes a call to print function execution of main will be paused
and we will go on to execute the print function the argument passed to the print function is hundred which is stored in a local variable this argument p is a local variable in the print function now print function again makes a recursive call now a stack frame is always allocated
corresponding to each call of a function so a function calling itself is not different from a function calling another function at any time whatever function call is at the top of the stack is executing finally we will reach the exit condition of the recursion and the stack
will be something like this and then first this call where p is zero will finish we will come back to this particular call and then this will finish and we will go on like this so this is how recursion works this is how things will happen in the memory okay so now i'll clear this diagram
of stack and heap in the right and i'll make some change in my print function what i've done is i have renamed my function print as reverse print and in my function i'm first making a recursive call and after coming back from that recursive call i'm writing
a print statement and from the main function i'll make a call to reverse print let's write rp as shortcut for reverse print and initially i'll pass the address of the head node so i'll make a call like this reverse print hundred the control will come inside this function
p is hundred it is not equal to null and i've also drawn the console like before now this particular function call does not print first it first makes a recursive call so this guy will go ahead and make a recursive call to the reverse print function passing
it address 200 nothing will be written on the console and once again this particular function will make a recursive call like this and once again this particular function will go ahead and make a recursive call like this and finally we will have a recursive call where the function
is passed address null at this stage we will come to the exit condition in the recursion the recursion will not grow any further we will simply return the control will return to this particular call reverse print 250 so we will come here now to this print statement
the data field at address 250 is 5 so 5 will be printed on the console and now this particular function call will finish and we will go to reverse print 150 and now this call will print 6 and exit and we will go on like this finally we will return back to the main function with this output
on the console the elements of the list printed in reversed order so this was recursive traversal of linked list to print its elements i must point out here that for normal traversal of the linked list not for the reverse print for the normal print an iterative approach will be a lot more
efficient than the recursive approach because in an iterative approach we will just use one temporary variable while in recursion we will use space in the stack section of the memory for so many function calls so there is implicit use of memory there for reverse print operation
we will anyway have to store elements in some structure so if we use recursion it's still okay in the coming lessons we will solve more problems more interesting problems on linked list so thanks for watching in our previous lesson we saw how we can traverse a linked list using recursion
we wrote code to print the elements of linked list in forward as well as reverse order using recursion we did not actually reverse the list we just printed the elements in reverse order now in this lesson we will reverse a linked list using recursion this is yet another famous programming
interview question so if we have an input list like this we have a linked list of integers here we have four nodes in the linked list each rectangular block here with two partitions is a node the first field is to store the data and the second field stores the address of the next node
and of course we will have one variable to store the address of the first node or head node we named that variable head we may name it anything i have named it head so this is our input list and after reversal our output should be like this
this variable head should store the address of the last node in the original list the last node in the original list was at address 250 and we will go from 250 to 150, 150 to 200, 200 to 100 and 100 to null null is nothing but address 0 we have already seen how we can reverse
a linked list using iterative method in one of our previous lessons let us now see how we can solve this problem using recursion in our solution we must reverse the list by adjusting the links by reversing the links not by moving around data or something so let us first understand the logic
that we can use in our recursive approach if you remember from our previous lesson where we had used recursion to print the list backward print the elements in reverse order then recursion gives us a way to traverse the list backward in our c or c++ program programmatically
node will be a structure like this so let's first look at the function from the previous lesson the recursive function that was used to print the list backward to this function we pass the address of a node initially we pass the address of the head node and we have this exit
condition if the address passed is null then we simply return else we make a recursive call and pass the address of the next node so main method will typically call reverse print passing it the address of the head node and this guy will first make a recursive call and then
when this recursive call finishes then only it will print so i'm writing rp as shortcut for reverse print so the recursion will go on like this and when it reaches this particular call when argument is null it will return so this call will finish and again the control will come
to this call with an address 250 as argument and now we are printing the value of the node at address 250 which will be 4 and then this guy finishes and then we go ahead and print 5 and similarly we then go on to print 6 and 2 so recursion kind of gives us a way to first
traverse the list in the forward direction and then traverse the list in the backward direction so let us now see how we can implement reverse function using recursion let's say for the sake of simplicity of implementation that head is a global variable so it is accessible to all the
functions now we will implement a function named reverse that will take the address of a node as argument initially we will pass address of the head node to this function now i want to do something like this in my recursion i want to go on till the end i want to go on making a recursive
call till i reach the last node for the last node the link part will be null so this is my exit condition from recursion this exit condition is what will stop us from going on infinitely in a recursion and what i'm doing here is something very simple as soon as i'm reaching the last
node i'm modifying the head pointer to make it point to this guy so the recursion will work like this from the main method we will call the reverse function passing it the address of the head node address 100 we will come and check this condition if p.next is equal to null no it is equal to 200
for the node at address 100 so this recursion will go on till we reach this call call to reverse passing it address 250 and now we will come down and now we have come to this exit condition and now head will be set as p and the list will look like this and now reverse 250 the call to reverse 250
will finish and we will come back to reverse 150 there is no statement here after this recursive call to reverse function if there were some statements here then they would have executed for reverse 150 after we came back from reverse 250 and that's how we actually traverse
the list in reverse order if you see when reverse 250 has finished the node till 250 is already reversed because head is pointing to this node and the link part of this node is set as null so till 250 we are already reversed now when we come to 150 we can make sure the list is reversed
till 150 when we finish the execution of reverse 150 to do that we can write statements like this we will have to do two things we will have to cut this link and make this guy point to this guy so we will build this link and we would have to cut this link and make this guy point to null
and that's how node till address 150 will be reversed after we finish this call so i've written these three lines in my function that will execute after the recursive call so they will execute when the recursion is folding up and we are traversing the list in the backward
direction so when we are executing reverse 150 and we have come back to it after recursion we are at this particular line so p would be 150 and q would be p dot next so q would be 250 so this guy is p and this guy is q and we are saying that set q dot next is equal to p so we will set this
particular field as 150 so we are building this link and cutting this link and now we are saying set p dot next equal to null so we are breaking this link making p dot next null and now this call to reverse 150 finishes and when this call has finished the list till 150 is reversed as you can
see head is 250 so from 250 we will go to 150 and from 150 we are going to null so till 150 we have a reversed list so this is how things will look like when the call to reverse 200 finishes till 200 we have a reversed list and once again we come to execution of reverse 100
and this is how things will look like finally when reverse 100 will finish and we will return back to the main function we had seen in the previous lesson how things will happen in the memory when recursion executes in recursion we save the state of execution of all the function calls
in stack section of the memory in this function all we are doing is basically we are storing the addresses of nodes in a structure as we go forward in recursion and then we first work on the last node to make it part of the reversed list and then we once again come back to the previous node and
we keep doing this watch the previous lesson for a detailed explanation and simulation of how things will happen in the memory for recursion there are a couple more things here one thing is that instead of writing these two lines i could write one line in their place
i could say something like p arrow next arrow next equal p and that would have meant the same except that this statement is more obfuscated
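Putting the lines described above together, a sketch of this recursive reverse in C could look like the following; head is assumed to be a global variable, as stated in the lesson, and the NULL guard at the top is an extra safety check not mentioned in the narration.

```c
struct Node {
    int data;
    struct Node* next;
};

struct Node* head;              /* global head pointer, as assumed in this lesson */

void Reverse(struct Node* p) {
    if (p == NULL) return;      /* guard for an empty list (added for safety) */
    if (p->next == NULL) {      /* exit condition: p is the last node, it becomes the new head */
        head = p;
        return;
    }
    Reverse(p->next);           /* first travel all the way to the last node */
    struct Node* q = p->next;   /* these lines run while the recursion folds up */
    q->next = p;                /* the next node now points back to p */
    p->next = NULL;             /* p becomes the current last node */
}
```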
and there is one more thing we have assumed that head is a global variable if head is not a global variable this reverse function will have to return the address of the modified head i'll leave that as an exercise for you to do so this was reversing a linked list using recursion thanks for watching hello everyone in our lessons in this series so far we have discussed linked lists quite a bit
we have seen how we can create a linked list and how we can perform various operations with linked list linked lists as we know are collections of entities that we call nodes so far in all our implementations we have created linked lists in which each node would contain two fields one to
store data and another to store address of the next node let's say we have a linked list of integers here so i'll fill in some values in data field of each node let's assume that these nodes are at addresses 200, 250 and 350 respectively i'll also fill in the address field in each node the address
field in first node will be the address of second node which is 250 the address field in second node will be address of third node which is 350 and address part in third node will be zero or null the identity of a linked list that we always keep with us is the address of head node or
reference to head node let's say we have a variable named head only to store the address of the head node remember this variable named head is only a pointer to the head node ideally we should have named it something like head pointer it's only pointing to the head node it's not
the head node itself head node is this guy the first node in the linked list okay so right now in the linked list that we are showing here each node has only one link a link to the next node in a real program node for the linked list that i'm showing here will be defined like this
this is how we have defined nodes so far in all our lessons we have two fields here one of type integer to store data and another of type pointer to node struct node asterisk i'm calling this field next when we say linked list by default we mean such a list that we can also
call singly linked list what we have here is a singly linked list what we want to talk about in this lesson is idea of a doubly linked list the idea of a doubly linked list is really simple in a doubly linked list each node would have two links one to the next node and another to
the previous node programmatically this is how we will define node for a doubly linked list in c or c plus plus i have one more field here which once again is a pointer to node so i can store the address of a node i can point to a node using this field and this field will be used to store
the address of the previous node in a logical representation i will draw my node like this now i have one field to store data one to store address of previous node and one to store address of next node let's say i want to create a doubly linked list of integers i have created three nodes here
let's say these nodes are at addresses 400 600 and 800 respectively i'll fill in some data let's say the cell in the middle in each node is to store data the right most cell is let's say to store the address of the next node so for first node this field will be 600
which means we have a link like this for second node this field will be 800 for third node this field will be zero for first node there is no previous node so this leftmost cell which is supposed to contain the address of the previous node will be zero or null
the previous node for second node will be 400 and the previous node for the third node is the node at address 600 and of course we will have a variable to store the address of the head node okay so what we have here is a doubly linked list of integers with three nodes
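In C, the node being described here could be defined something like this (field names are illustrative):

```c
struct Node {
    int data;            /* value stored in the node */
    struct Node* prev;   /* address of the previous node (NULL for the head) */
    struct Node* next;   /* address of the next node (NULL for the last node) */
};
```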
okay so with this much you already know doubly linked list if you have ever implemented a singly linked list then it should not be very difficult implementing a doubly linked list one obvious question would be why would we ever want to create a doubly linked list what are the advantages or use cases of a
doubly linked list first advantage is that now if we have a pointer to any node then we can do a forward as well as reverse lookup with just one pointer we can look at the current node the next node as well as the previous node i am showing a pointer named temp here if temp is a pointer
pointing to a node then temp dot next is a pointer pointing to the next node it's the address of the next node and temp dot previous or rather temp arrow prev this is actually syntactic sugar for asterisk temp dot prev so this guy temp arrow prev is the previous node or in precise words a pointer to
previous node the value stored in temp for this example right now is 600 temp dot next is 800 and temp dot prev is 400 in a singly linked list there is no way you can look at the previous node with just one pointer you will have to use an extra pointer to keep track of the previous node
in a lot of scenarios the ability to look at the previous node makes our life easier even implementation of some of the operations like deletion becomes a lot easier in a singly linked list to delete a node you would need two pointers one to the node to be deleted and one to the previous
node but in our doubly linked list we can do so using only one pointer the pointer to the node to be deleted all in all this ability that we can do a reverse lookup in the linked list is really useful we can flow through the linked list in both directions disadvantage of doubly linked list
is that we are having to use extra memory for pointer to previous node for a linked list of integers let's say integer takes four bytes in a typical architecture and a pointer variable also takes four bytes then in a singly linked list each node
will be eight bytes four for data and four for a link to the next node in a doubly linked list each node will be 12 bytes we will take four bytes for data and eight bytes for links for a linked list of integers we will take twice as much memory for links as for data with a doubly linked list we also need to be more
careful while resetting links while inserting or deleting we need to reset a couple more links than in a singly linked list and so we are more prone to errors we will implement doubly linked list in a c program in next lesson we will write basic operations like traversal insertion and deletion
this is it for this lesson thanks for watching in our previous lesson we saw what doubly linked lists are now in this lesson we are going to implement doubly linked list in c we are going to write simple operations like insertion traversal and deletion in a doubly linked list as we saw in
our previous lesson each node contains three fields i have drawn logical representation of a doubly linked list here one to store data one to store address of next node and one to store address of previous node for a linked list of integers node will be defined like this in a c
or c plus plus program in the logical representation i'll fill in some data in each node let's say these nodes are at addresses 400 600 and 800 respectively i'll also fill in next and previous fields and we must also have a pointer variable pointing to the head node quite often we
name this pointer variable head in my implementation i'm going to write these functions i'm going to write a function to insert a node at beginning or head of linked list this function will take an integer as argument i'll write another function to insert a node at tail of linked list i'll write
one function to print elements in linked list while traversing it from head to tail i'll write another one to print the elements in reverse order while traversing the list from tail to head reverse print function will validate whether reverse link for each node is created properly
or not let's now write these functions in a real c program in my c program here i have defined node as a structure with three fields first field is of type integer to store data second field is of type pointer to node to store reference of next node and the third field is a pointer to
node to store the reference of previous node i have defined a variable named head which once again is a pointer to node and i have defined this variable in global scope head is a global variable when we define a variable inside a function it's called a local variable the lifetime of a
local variable is lifetime of a function call it's created during a function call execution and it's cleared from the memory when function call execution finishes but global variables live in the memory for whole lifetime of an application they live till the time program is
executing global variables can be accessed everywhere in all functions local variables are not accessible everywhere unless you access them through pointers in all our previous implementations we have mostly declared head as global variable okay so let's now write the functions the first function that i want
to write is insert at head this function will take an integer as argument the first thing that we want to do here is we want to create a node we can always declare a node like this just like declaration of any other variable we can say struct node and then we can give an identifier
or name and now in this my node that i have created i can fill in all the fields but the problem here is that when i'm creating a node like this i'm creating it as a local variable and it will be cleared from memory when function call will finish a local variable lives in what we
call stack section of applications memory and we cannot control its lifetime it's cleared from memory when function call finishes we do not want this our requirement is that a node should be in memory unless we explicitly remove it so that's why we create a node in dynamic memory or what
we call heap section of memory anything in heap is not cleared unless we explicitly free it to create a node in heap we use malloc function in c or new operator in c++ all malloc function does is it reserves some memory in heap and this memory can be used for writing anything
any variable any object access to this memory always happens through a pointer variable we have talked about this concept quite a bit in our previous lessons but i keep on repeating because this is really important concept so here with this statement i have created a node in dynamic
memory or heap that can be referenced through a variable which is pointer to node i have named this variable temp now i can use this pointer variable to fill in values in various fields of the node i'll have to dereference this pointer variable using asterisk operator and then i can
access various fields like data prev or next there is an alternate syntax for this asterisk temp dot data we can simply write temp arrow data and similarly i can access other fields also so to access prev field i can say temp arrow prev let's set this as null and let's also set the
next field as null if you want to understand or refresh the concept of stack and heap in memory then you can check the description of this video for a link to our lesson on dynamic memory allocation okay so in my function insert at head i have created a node in heap section of memory
and i'm referencing that node using this pointer variable named temp temp is not a very meaningful name let's use a name like new node or new node pointer i would like to separate out this logic of node creation these lines for node creation in a separate function i've written a function
here named get new node that will take an integer as argument create a node filling in data field as x and setting both previous and next pointers as null this function will return a pointer to node so i will return new node from here i'm writing a separate function because i can avoid duplicate
code by using a separate function for creation of node because i'm going to create a node in function insert at head as well as in function insert at tail that i'll be writing after some time now in insert at head function i can simply call this function get new node passing
it x this function is returning a pointer to newly created node that i'm going to receive in this variable which once again is a pointer to node named temp we can name this variable also as new node this new node in insert at head is different from this new node in get new node these are local
variables this new node is local to insert at head and this new node is local to get new node now there will be two cases in insertion at head list could be empty so head will be equal to null in this case we can simply set head as the address of new node and return or exit
things will be clear if i'll show everything in logical view also right now my linked list is empty here in this logical view that i'm showing let's say i have made a call to insert at head passing it number two get new node function will give me a new node let's say a new node is created
at address 400 with this statement head equal new node we are setting the address stored in new node variable in head null is nothing but address 0 as soon as this function insert at head will finish this variable new node will be cleared from memory but the node itself will not be cleared
if we would have created node like this struct node new node and in this declaration new node is not a pointer to node it's a node and we are not saying struct node asterisk so if we would have created node like this the node also would have been cleared okay coming back
to the function here let's write rest of the logic to insert a node when list is not empty this is what i'll do now i'm making a call to insert at head passing it number four once new node is created i'll first set the previous field of the existing head node
as the address of this new node so i'm building this link then i'll set the next field of new node as the address of current head and now i can break this link and build this link so i'll set head as address of new node this is how things will look like finally
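A compact sketch of the two functions described so far might look like this in C; head is assumed to be a global pointer, as in the lesson, and names are illustrative.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node* prev;
    struct Node* next;
};

struct Node* head = NULL;   /* global pointer to the head node */

struct Node* GetNewNode(int x) {
    struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));  /* node on the heap */
    newNode->data = x;
    newNode->prev = NULL;
    newNode->next = NULL;
    return newNode;
}

void InsertAtHead(int x) {
    struct Node* newNode = GetNewNode(x);
    if (head == NULL) {       /* case 1: empty list */
        head = newNode;
        return;
    }
    head->prev = newNode;     /* existing head points back to the new node */
    newNode->next = head;     /* new node points forward to the existing head */
    head = newNode;           /* the new node becomes the head */
}
```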
let's also quickly see how things will actually move in various sections of the application's memory the memory that is allocated to a program is typically divided into these four segments we have seen this diagram quite a bit in our earlier lessons code or text segment stores
all the instructions to be executed there is a segment to store global variables there is a section that we call stack that is used just like scratch pad or white board for function call execution stack is where all the local variables go and not just local variables all
the information about function call execution heap is what we also call dynamic memory i'm showing stack heap and global section separately here in our program we had declared head as a global variable initially for an empty list we'll set head as null or zero now let's say we will do
that in main function now when a call to insert at head is made at this stage let's say i'm making a call passing number two as argument let's say we are making a call to insert head from main function when program starts execution first main function is invoked whenever a function is invoked some
amount of memory from the stack is allocated for execution of that function that section is called stack frame of that function and all the local variables of that function live inside its stack frame when function call execution finishes the stack frame is reclaimed when main will make a
call to insert at head the execution of main will pause at the line where it's making a call a stack frame will be allocated for execution of insert at head i'm writing shortcut i a h for insert at head because i'm short of space here all the arguments of insert at head all the local
variables will live inside this stack frame we are creating a variable named new node which is a pointer to node as local variable and we are making a call to get new node function execution of insert at head will pause and we will go on to execute get new node we could
write get new node like this here i'm creating a node on stack x is a local variable in get new node also then i'm creating a node filling in data as the value of x which is two i'm setting previous and next fields as null or zero and then because i need to return a pointer to node i have
used ampersand operator here using ampersand operator gives us pointer to a variable let's say this new node that we have in the stack frame of get new node has address 50 with this return when get new node will finish the value in this new node of insert at head will be 50
please note that with this code this new node in get new node function is of type struct node while this new node in insert at head is of type pointer to struct node so they are different types we can return this address 50 that's fine but the stack frame for get new node will
be reclaimed once the function finishes so now even though you have the address 50 there is no node there we cannot control allocation and deallocation of memory on stack it happens automatically that's why we use the memory on heap
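To make the contrast concrete, the incorrect version being described would look roughly like this; most compilers will warn about returning the address of a local variable here.

```c
/* Incorrect version: the node is a local variable, so it lives in the
   stack frame of GetNewNode and is reclaimed when the function returns.
   The returned address then points to memory we no longer own. */
struct Node* GetNewNode(int x) {
    struct Node newNode;    /* node created on the stack */
    newNode.data = x;
    newNode.prev = NULL;
    newNode.next = NULL;
    return &newNode;        /* dangling pointer once this call finishes */
}
```

The malloc-based version sketched earlier is the one that actually keeps the node alive, because heap memory stays allocated until we free it explicitly.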
if i'm using this code for creation of new node then what i'm doing is i'm declaring this variable new node not as struct node but as struct node asterisk that is pointer to node i'm using malloc to create the actual node in heap section let's say i'm getting address 400 for this node now for a section of memory in heap
for something in heap we cannot have a direct name the only way to access something in heap is through a pointer if we will lose this pointer we will lose this node okay so now what we are doing is using this pointer new node which is local to get new node function we are accessing this node
filling in data filling in address fields and now we are returning this address 400 now when get new node is finishing i'm collecting the returned address 400 in this local variable new node we are returning back to insert at head function and at this line
head at this stage is null so now we are saying set head equal to new node head is a global variable it's not going to be cleared for the whole lifetime of the application and now we are returning the stack frame of insert at head will be cleared and this is what we finally have
when we will make another call to insert at head once again fresh stack frames will be allocated in the execution of functions appropriate links will be created so our linked list will be modified accordingly i hope all of this is making some sense with another call to insert at head when
everything will finish and control will return back to main we can have a picture like this let's say i got a node at 600 right cell is for next node right cell is storing the address of next node and left cell is storing the address of previous node so this will this is what
we will have let's now go and write rest of the functions print function will be same as print for singly linked list we will take a temporary pointer to node initially set it to head and then we will use this statement temp equal temp dot next to go to the next node and we will keep on
printing in reverse print we will first go to the end node of the list using next pointers and then we will traverse backward using this statement temp equal temp arrow prev so we will use the previous pointer and while traversing backward we will print the data
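Those two traversal functions could be sketched roughly like this; they rely on the global head and the node definition shown earlier, and the exact wording of the output is an assumption.

```c
#include <stdio.h>

void Print() {
    struct Node* temp = head;
    printf("Forward: ");
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->next;       /* move forward using next */
    }
    printf("\n");
}

void ReversePrint() {
    struct Node* temp = head;
    if (temp == NULL) return;    /* empty list, nothing to print */
    while (temp->next != NULL)   /* first go to the last node */
        temp = temp->next;
    printf("Reverse: ");
    while (temp != NULL) {
        printf("%d ", temp->data);
        temp = temp->prev;       /* move backward using prev */
    }
    printf("\n");
}
```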
okay let's now test all these functions that we have written so far in the main function i'm setting head as null to say that the list is empty initially and now i'm writing a couple of insert statements i'm making a couple of calls to insert at head function and after each call i'm printing the list both in
forward as well as reverse direction let's run this code and see the output this is what i'm getting and i think this is as expected there is one more function insert at tail that i had said i'll write if you have understood things so far it should not be very difficult for you to write this function
insert at tail i'll leave this as an exercise for you i'll stop here now if you want to get this source code check the description of this video for a link in coming lessons we are going to talk about circular linked list and we will see some more interesting problems on linked list thanks
for watching in this lesson we are going to introduce you to stack data structure data structures as we know are ways to store and organize data in computers so far in the series we have discussed some of the data structures we have talked about arrays and linked lists now in
this lesson we are going to talk about stacks and we are going to talk about stack as abstract data type or ADT when we talk about a data structure as abstract data type we talk only about the features or operations available with the data structure we do not go into implementation details
so basically we define the data structure only as a mathematical or logical model we will go into implementation of stack in later lessons in this lesson we are going to talk only about stack ADT so we are only going to have a look at the logical view of stack stack as
a data structure in computer science is not very different from stack as a way of organizing objects in real world here are some examples of stack from real world first figure is of a stack of dinner plates second figure is of a mathematical puzzle called tower of hanoi where we have three
rods or three pegs and multiple disks and the game is about moving a stack of disks from one peg to another with this constraint that a disk cannot go on top of a smaller disk third figure is of a pack of tennis balls stack basically is a collection with this property that
an item in the stack must be inserted or removed from the same end that we call the top of stack in fact this is not just a property this is a constraint or restriction only the top of a stack is accessible and any item has to be inserted or removed from the top a stack is also called last
in first out collection most recently added item in a stack has to go out first in the first example you will always pick up a dinner plate from top of the stack and if you will have to put a plate back into the stack you will always put it back on top of the stack you can argue that
I can slip out a plate from in between without actually removing the plates on the top so the constraint that I should take out a plate always from the top is not strictly enforced for the sake of argument this is fine you can say this but in the other two examples when we have disks on a peg
and tennis balls in this box that can open only from one side there is no way you can take out an item from in between any insertion or removal has to happen from top you cannot slip out an item from in between you can take out an item but for that you will have to remove all the items
on top of that item let's now formally define stack as an abstract data type a stack is a list or collection with the restriction that insertion and deletion can be performed only from one end that we call the top of stack let's now define the interface or operations available with
stack adt there are two fundamental operations available with a stack an insertion is called a push operation push operation can insert or push some item x onto the stack the second operation is called pop pop is removing the most recently added element from the stack
push and pop are the fundamental operations and there can be a few more typically there is one operation called top that simply returns the element at top of the stack and there can be an operation to check whether a stack is empty or not so this operation will
return true if the stack is empty false otherwise so push is inserting an element on top of stack and pop is removing an element from top of stack we can push or pop only one element at a time all these operations that i have written here can be performed in constant time
or in other words the time complexity is big o of one remember an element that is pushed or inserted last onto a stack is popped or removed first so stack is called last in first out structure what goes in last comes out first last in first out in short is called LIFO logically a stack is
represented something like this as a three-sided figure as a container open from one side this is representation of an empty stack let's name this stack s let's say this figure is representing a stack of integers right now the stack is empty i will perform push and pop operations
to insert and remove integers from the stack i will first write down the operation here and then show you what will happen in the logical representation let's first perform a push i want to push number two onto the stack the stack is empty right now so we cannot pop anything after the push
stack will look something like this there is only one integer in the stack so of course it's on top let's push another integer this time i want to push number 10 and now let's say we want to perform a pop the integer at top right now is 10 with a pop
it will be removed from the stack let's do a few more pushes i just pushed 7 and 5 onto the stack at this stage if i will call the top operation it will return me number five and is empty will return me false at this stage a pop will remove five from the stack
as you can see the integer which is coming in last is going out first that's why we call stack a last in first out data structure we can pop till the stack gets empty one more pop and the stack will be empty so this pretty much is stack data structure
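To summarize the interface described so far, a stack of integers exposes something like the following operations; the names below are just one possible convention, not a fixed standard.

```c
/* Stack ADT - logical interface only, no implementation yet. */
void Push(int x);   /* insert x at the top of the stack         - O(1) */
void Pop();         /* remove the most recently pushed element  - O(1) */
int  Top();         /* return the element at the top            - O(1) */
int  IsEmpty();     /* return 1 (true) if the stack is empty    - O(1) */

/* The sequence of operations from the example above:
     Push(2); Push(10); Pop(); Push(7); Push(5);
   At this point Top() returns 5 and IsEmpty() returns false;
   a further Pop() removes 5 from the stack. */
```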
now one obvious question can be what are the real scenarios where stack helps us let's list down some of the applications of stack stack data structure is used for execution of function calls in a program we have talked about this quite a bit in our lessons on dynamic memory
allocation and linked lists we can also say that stack is used for recursion because recursion is also a chain of function calls it's just that all the calls are to the same function to know more about this application you can check the description of this video for a link to our
lesson on dynamic memory allocation another application of stack is we can use it to implement undo operation in an editor and we can perform undo operation in any text editor or image editor right now i'm pressing ctrl z and as you can see some of the text that i have written
is getting cleared you can implement this using a stack stack is used in a number of important algorithms like for example a compiler verifies whether parenthesis in a source code are balanced or not using stack data structure corresponding to each opening curly brace or opening parenthesis
in a source code there must be a closing parenthesis at appropriate position and if parenthesis in a source code are not put properly if they are not balanced compiler should throw error and this check can be performed using a stack we will discuss some of these problems in detail
in coming lessons this much is good for an introduction in our next lesson we will discuss implementation of stack this is it for this lesson thanks for watching in our previous lesson we introduced you to stack data structure we talked about stack as abstract data type or ADT as we
know when we define a data structure as abstract data type we define it as a mathematical or logical model we define only the features or operations available with the data structure and do not bother about implementation now in this lesson we will see how we can implement
stack data structure we will first discuss possible implementations of stack and then we'll go ahead and write some code okay so let's get started as we had seen a stack is a list or collection with this restriction with this constraint that insertion and deletion that we call push and pop operations
in a stack must be performed one element at a time and only from one end that we call the top of stack so if you see if we can add only this one extra property only this one extra constraint to any implementation of a list that insertion and deletion must be performed only from one end
then we can get a stack there are two popular ways of creating lists we have talked about them a lot in our previous lessons we can use any of them to create a stack we can implement stacks using a) arrays and b) linked lists both these implementations are pretty intuitive let's first
discuss array based implementation let's say i want to create a stack of integers so what i can do is i can first create an array of integers i'm creating an array of 10 integers here i'm naming this array a now i'm going to use this array to store a stack what i'm going to say is that at
any point some part of this array starting index 0 till an index marked as top will be my stack we can create a variable named top to store the index of top of stack for an empty stack top is set as minus 1 right now in this figure top is pointing to an imaginary minus 1 index in
the array and insertion or push operation will be something like this i will write a function named push that will take an integer x as argument in push function we will first increment top and then we can fill in integer x at top index here we are assuming that
a and top will be accessible to push function even when they are not passed as arguments in c we can declare them as global variables or in an object oriented implementation all these entities can be members of a class i'm only writing pseudo code to explain
the implementation logic okay so for this example array that i'm showing here right now top is set as minus 1 so my stack is empty let's insert something onto the stack i will have to make call to push function let's say i want to insert number 2 onto the stack in a call to push first
top will be incremented and then the integer passed as argument will be written at top index so 2 will be written at index 0 let's push one more number let's say i want to push number 10 this time once again top will be incremented 10 will now go at index 1 with each push the stack
will expand towards higher indices in the array to pop an element from the stack i'm writing a function here for pop operation all i need to do is decrement top by one with a call to pop let's say i'm making a call to pop function here top will simply be decremented whatever cells are in yellow
in this figure are part of my stack we do not need to reset this value before popping if a cell is not part of stack anymore we do not care what garbage lies there next time when we will push we will modify it anyway so let's say after this pop operation i want to perform a push i want to
insert number 7 onto the stack so top once again will be incremented and value at index 2 will be overwritten the new value will be 7 these two functions push and pop that i have written here will take constant time we have simple operations in these two functions and execution time will
not depend upon size of stack while defining stack ADT we had said that all the operations must take constant time or in other words the time complexity should be big o of 1 in our implementation here both push and pop operations are big o of 1 one important thing here
we can push onto the stack only till array is not exhausted only till some space is left in the array we can have a situation where stack would consume the whole array so top will be equal to highest index in the array a further push will not be possible because it will result in an overflow
this is one limitation with array based implementation to avoid an overflow we can always create a large enough array for that we will have to be reasonably sure that stack will not grow beyond a certain limit in most practical cases large enough array works but irrespective of that
we must handle overflow in our implementation there are couple of things that we can do in case of an overflow push function can check whether array is exhausted or not and it can throw an error in case of an overflow so push operation will not succeed this will not be a really good behavior
we can do another thing we can use the concept of dynamic array we have talked about dynamic array in initial lessons in the series what we can do is in case of an overflow we can create a new larger array we can copy the content of stack from older filled up array into new array
if possible we can delete the smaller array the cost of copy will be big o of n or in simple words time taken to copy elements from smaller array to larger array will be proportional to number of elements in stack or the size of the smaller array because anyway stack will occupy the whole array
there must be some strategy to decide the size of larger array optimal strategy is that we should create an array twice the size of smaller array there can be two scenarios in a push operation in a normal push we will take constant time in case of an overflow we will first create
a larger array twice the size of smaller array copy all elements in time proportional to size of the smaller array and then we will take constant time to insert the new element the time complexity of push with this strategy will be big o of one in best case
and big o of n in worst case in case of an overflow time complexity will be big o of n but we will still be big o of one in average case if we will calculate the time taken for n pushes then it will be proportional to n remember n is the number of elements in stack
big o of n is basically saying that time taken will be very close to some constant times n in simple words time taken will be proportional to n if we are taking c into n time for n pushes to find out average we will divide by n average time taken for each push will be a constant
hence big o of one in average case i will not go into all the mathematics of why it's big o of n for n pushes to know about it you can check the description of this video for some resources
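A rough sketch of this doubling strategy could look like the following in C; the names and the initial capacity are illustrative and not taken from the lesson's final code.

```c
#include <stdio.h>
#include <stdlib.h>

int* A;                 /* dynamically allocated array holding the stack */
int capacity = 4;       /* current size of the array */
int top = -1;           /* index of top of stack, -1 means empty */

void Push(int x) {
    if (top == capacity - 1) {                       /* overflow: array is full */
        int* B = (int*)malloc(2 * capacity * sizeof(int));
        int i;
        for (i = 0; i < capacity; i++)               /* O(n) copy into larger array */
            B[i] = A[i];
        free(A);                                     /* release the smaller array */
        A = B;
        capacity = 2 * capacity;
    }
    A[++top] = x;                                    /* normal O(1) push */
}

int main() {
    A = (int*)malloc(capacity * sizeof(int));
    int i;
    for (i = 0; i < 20; i++)
        Push(i);
    printf("top index = %d, capacity = %d\n", top, capacity);
    return 0;
}
```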
okay so this pretty much is the core of our implementation we have talked about two more operations in the definition of stack ADT top operation simply returns the element at top of stack so top function will look something like this we will simply return the element at top index to verify whether stack is empty or not this is another operation that we had defined we can simply check the value
of top if it is equal to minus one we can say the stack is empty we can return true else we can return false sometimes pop and top operations are combined together in that case pop will not just remove an element from top of stack it will also return that element language libraries in a
lot of programming languages give us implementation of stack signature of functions in these implementations can vary slightly okay now i will quickly show you a basic implementation of stack in C in my C code here i'm going to write a simple array based implementation to create a stack of
integers the first thing that i'm going to do is i'm going to create an array of integers as global variable and the size of this array is max size where max size is defined by this macro as 101 i will declare another global variable named top and set it as minus one initially
remember top equal minus one means an empty stack when a variable is not declared inside any function it's a global variable it can be accessed anywhere so you do not have to pass it as argument to functions and now i will write all the operations this is my push function i'm first incrementing top
and then setting the value at top as x x is the integer to be inserted passed as argument instead of writing these two statements i can write one statement like this and i will be good i'm using pre increment operator so increment will happen before assignment i also want to handle
overflow we will have an overflow when top index will be equal to max size minus one highest index available in the array in case of an overflow i simply want to print an error message something like this and return so in this implementation i'm not using a dynamic array in case of overflow
push will not succeed okay now this is my pop function i'm simply decrementing top here also we must handle one error condition if stack is already empty we cannot pop so i'm writing these statements here if top is equal to minus one we cannot pop i will print this error message that there is no
element to pop and simply return now let's write top operation top operation will simply return the integer at top index so now my basic operations are all written here i have already written push pop and top in main function i will make some calls to push and pop and i want to write
one more function named print and this is something that i'm going to write only to verify that push and pop are happening properly i will simply print all the elements in the stack in my main function after each push or pop operation i will make a call to print i'm writing multiple
function calls on the same line here because i'm short of space remember print function is not a typical operation available with stack i'm writing it only to test my implementation so this pretty much is my code
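For reference, the program being described here would look roughly like this; it is a sketch reconstructed from the narration, so minor details may differ from the actual source code linked in the video description.

```c
#include <stdio.h>
#define MAX_SIZE 101

int A[MAX_SIZE];   /* global array that stores the stack */
int top = -1;      /* index of top of stack, -1 means empty */

void Push(int x) {
    if (top == MAX_SIZE - 1) {            /* overflow: array exhausted */
        printf("Error: stack overflow\n");
        return;
    }
    A[++top] = x;                         /* increment top, then store x */
}

void Pop() {
    if (top == -1) {                      /* nothing to pop */
        printf("Error: no element to pop\n");
        return;
    }
    top--;
}

int Top() {
    return A[top];
}

void Print() {                            /* test helper, not a stack operation */
    int i;
    printf("Stack: ");
    for (i = 0; i <= top; i++)
        printf("%d ", A[i]);
    printf("\n");
}

int main() {
    Push(2);  Print();
    Push(5);  Print();
    Push(10); Print();
    Pop();    Print();
    Push(12); Print();
    return 0;
}
```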
let's now run this program and see what happens this is what i'm getting as output we are pushing three integers two five and ten and then we are performing a pop so ten gets removed from the stack and then we are pushing 12 so this is a basic implementation of stack in c this is not an ideal implementation an ideal implementation
should be something like we should have a data type called stack and we should be able to create instances of it we can easily do it in an object oriented implementation we can do it in c also using structures check the description of this video for link to source code of this implementation
as well as of an object oriented implementation in our next lesson we will discuss linked list implementation of stack this is it for this lesson thanks for watching in our previous lesson we saw how we can implement stack using arrays now in this lesson we will
see how we can implement stack using linked list for this lesson i'm assuming that you already know about both stack as well as linked list stack as we know from our discussion so far is called a last in first out data structure whatever goes in last in a stack comes out first it's a list
with this restriction that insertion and deletion must be performed only from one end that we call the top of stack an insertion in a stack is called push operation and deletion is called pop to implement a stack all we need to do is enforce this behavior in any implementation of a list that
insertion and deletion must be performed only from one end and we can call that end top of stack it's really easy to enforce this behavior in a linked list i have drawn a linked list of integers here this is logical representation of a linked list a linked list is a collection of entities that
we call nodes each node contains two fields one to store data and another to store the address of the next node let's assume that these nodes are at addresses 100 200 and 400 respectively so i will fill up the address part as well the identity of a linked list is the address of the
first node that we also call the head node a variable stores the address of head node we often name this variable as head unlike arrays linked lists are not a fixed size and elements in a linked list are not stored in one contiguous block of memory we already know how to create a linked
list or insert and delete elements from a linked list from our previous lessons i'm just doing a quick recap here to insert an element in a linked list we first create a new node which is basically blocking some part of memory to store our data in this example here let's say for my new node i'm
getting address 350 we can set the data part of this node as whatever value i want to add in the list and then i need to modify the address field of some of the existing nodes to link this node in the actual list now for a stack we want that insertion and deletion must always happen from the
same end we can use a linked list as stack if we always insert and delete a node at same end we have two options we can insert or delete from end of the list what we also call tail or beginning of the list that we call head if you remember from our previous lessons inserting a
node at end of linked list is not a constant time operation the cost of both insertion and deletion at end of linked list if we have to talk about the time complexity of it is big o of n here in the definition of stack we are saying that push and pop operations should take constant
time or the time complexity should be big o of one but if we will insert and delete from end time complexity will be big o of n to insert a new node in a linked list at the end we need to go to the last node and set the address part of that node to make it point to the new node to traverse
a linked list and go to the last node we should start at the head or the first node from first node we get the address of the second node so we go to the second node and from second node we get the address of the third node it's like playing treasure hunt you go to the first guy ask the address of
the second guy and then you go to the second guy ask the address of the third guy and so on now once i've reached this last node in my example here i can set its address part to make it point to the newly created node all in all this operation will take time proportional to number of elements
in the linked list to delete a node from end once again we will have to traverse the whole list we will have to go to the second last node break this link we will set the address field as zero or null and then we can simply wipe off the last node removed from the list from computer's memory
once again the cost of traversal will be big o of n so inserting and deleting at end or tail is not an option for us because we will not be able to do push and pop in constant time if we choose to insert and delete from end the cost of inserting or deleting from beginning however
is big o of one it will take constant time to insert a node at beginning or delete a node from beginning to insert a node at beginning we must create a new node in this example here once again i have created a new node let's say the address of the new node is 350 i will insert some data in the
first field of this node okay so to insert this node at beginning we just need to build two links first we need to build this link so we will set the address here as whatever the address of the current head is and then we can break this link and make this guy the new head by setting its address
here in this variable named head to delete a node in this example here we will have to first cut this link and build this link which will mean resetting the address in this variable head and then we can free the memory allocated to this particular guy this particular node deletion from beginning
once again is a constant time operation so this is the thing if we will insert at beginning and delete from beginning then all our conditions are satisfied so linked list implementation of stack is pretty straightforward all we need to do is insert a node at the beginning and delete a node
from beginning so head of the linked list is basically the top of stack i would rather name this variable top here i'll quickly write a basic implementation in c i'm defining node as a structure in c i want to create a stack of integers so first field in the node is an integer another
field is a pointer to node that will store the address of the next node we have seen this definition of node in all our previous lessons on linked list the next thing that i'm doing is i'm declaring a variable named top which is a pointer to node and initially i'm setting the address in it as null
i'm using variable name top instead of head here when top is null our stack is empty by initializing top as null i'm saying that initially my stack is empty now let's write push and pop functions this is my push function push is taking an integer x as argument that must be inserted
onto the stack the first thing that we are doing in push function is that we are creating a node using malloc let's say in this example in this logical representation that i'm showing here i'm performing a push operation so i'm making a call to push function passing it number two as
argument so a node is created in memory in what we call the dynamic memory or heap let's say the address of this node is hundred this variable temp is basically a pointer pointing to this node in the next line we are setting the data field in this node
we are dereferencing temp to do so then we are setting the link part of this newly created node as existing top so we are building this link and then we are saying top equal temp so we are building this link this is simple insertion at beginning of a linked list we have one complete video in this
series on how to insert a node at beginning of linked list let's do one more push let's say i want to push number five onto the stack this time once again a node will be created we will set the data and then we will first point this guy to the existing top and then make this pointer variable
point to this guy the new top let's say the address of this guy is 250 so the address in this variable top will be set as 250 after this second push this is how my stack will look like top here is a global variable so we do not need to pass it as argument to functions it is accessible
to all the functions in an object oriented implementation it can be a private field and we can set it as null in the constructor okay let's now see how the pop function will look this is my pop function let's say for this example i'm making a call to pop function
if the stack is already empty we can check whether stack is empty or not by checking whether top is null or not if top is null stack is empty in this case we can throw some error and return for this example here stack is not empty we have two integers in the stack what we are first doing
is we are creating a pointer to node temp and pointing it to the top node and now we are breaking this link we are setting the address in top as address of the next node and now using this pointer variable temp we are freeing the memory allocated to the node being removed from the list
once i exit the pop function this is my stack so this pretty much is the core of our implementation i would encourage you to write rest of the stuff yourself you can write code for operations like top and is empty linked list implementation of stack has some advantages one of the advantages
is that unlike array based implementation we do not need to worry about overflow unless we exhaust the memory of the machine itself some amount of extra memory is used in each node to store reference or address but the fact that we use memory when needed
and release when not needed is something that makes push and pop operations more graceful so this is linked list based implementation of stack in our coming lessons we will solve some problems using stack this is it for this lesson thanks for watching
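For reference, here is a minimal sketch of the push and pop described in this lesson. The names Node, top, Push and Pop follow the lecture; the lecture's code is in C with malloc and free, so the use of C++ with new and delete here is my substitution, not the original code.

```cpp
#include <cstdio>

// node of the linked list: data plus the address of the next node
struct Node {
    int data;
    Node* link;
};

Node* top = nullptr;         // empty stack: top is null

// push: insert a node at the beginning of the list, O(1)
void Push(int x) {
    Node* temp = new Node;   // the lesson's C version uses malloc here
    temp->data = x;
    temp->link = top;        // new node points to the current top
    top = temp;              // new node becomes the top
}

// pop: delete the node at the beginning of the list, O(1)
void Pop() {
    if (top == nullptr) {    // stack is empty, nothing to remove
        printf("error: stack is empty\n");
        return;
    }
    Node* temp = top;
    top = top->link;         // next node, if any, becomes the new top
    delete temp;             // free the memory of the removed node
}

int Top()      { return top->data; }       // assumes the stack is not empty
bool IsEmpty() { return top == nullptr; }

int main() {
    Push(2);
    Push(5);
    printf("%d\n", Top());   // prints 5
    Pop();
    printf("%d\n", Top());   // prints 2
}
```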
in our previous lesson we saw how we can implement a stack we saw two popular implementations of stack one using arrays and another using linked list a warrior should not just possess a weapon he must also know when and how to use it as programmers we must know
in what all scenarios we can use a particular data structure in this lesson i'm going to talk about one simple use case of stack a stack can be used to reverse a list or collection or simply to traverse a list or collection in reverse order i'm going to talk about two problems
reversal of string and reversal of linked list and i'm going to solve both these problems using stack let's first discuss reversal of string i have a string in the form of a character array here i have this string hello a string is a sequence of characters
this is a c-style string in c a string must be terminated with a null character so this last character is a null character reversal means characters in the array should be rearranged like what i'm showing here in the right null character is used only to mark the end of string
it is not part of string okay there are couple of efficient ways in which we can reverse a string let's first discuss how we can solve this problem using a stack and then we will see how efficient it is what we can do is we can create a stack of characters i'm showing logical representation
of a stack here this is a stack of characters and right now it's empty and now what we can do is we can traverse the characters in the string from left to right and start pushing them onto the stack so first h goes into the stack then the next character is e then l then we have another l
and then the last character is o once all the characters in the string have gone into the stack we can once again start at the zeroth index now we need to write the topmost character in the stack at this index we can get the topmost character by calling top operation
and now we can perform a pop and now we can go to the next index fill in whatever is at top of stack and perform a pop again we can go on doing this until the stack is empty so all the positions in the character array will be overwritten so finally we have reversed our string here
in a stack whatever goes in last comes out first so if we will push a bunch of items onto a stack and once all items are pushed if we will start popping we will get the items in reverse order first item pushed onto the stack will come out last let's quickly write code for this logic
i'm going to write c++ here things will be pretty similar in other languages so it doesn't really matter what i'm going to do in my code is i'm going to create a character array to store a string and then i will ask user to input a string once i input the string i will make a call to a function
named reverse passing it the array and length of string that i will get by making a call to string length function and finally i'm printing the reversed string now i need to write the reverse function in reverse function i want to use a stack a stack of characters we have already seen
how we can implement stack in c++ we can create a class named stack that would have an array of characters and an integer variable named top to mark the top of stack in array and these variables can be private and we can work upon the stack using these public functions
in reverse function we can simply create an object of stack and use it this class can be an array based implementation of stack or a linked list based implementation of stack it doesn't really matter in c++ and many other languages language libraries also give us implementation of stack
in this program i'm not going to write my own stack i'm going to use stack from what we call standard template library in c++ i will have to use this include statement hash include stack and now i have a stack class available to me to create an object of this class i need to write
stack and within angle brackets the data type for which we want a stack then after a space a name or identifier with this one statement here i have created a stack of characters let's now write the core logic this n in the signature of reverse function is the number of characters in the string
this array as we know array in c or c++ is always passed by reference through a pointer this c followed by brackets is only an alternate syntax for asterisk c it's interpreted like this by the compiler okay so now what i'm going to do is i'm going to run a loop starting 0 till n minus 1
so i will traverse the string from left to right and as i traverse the string i will push the character onto stack by calling push function i will use a statement like this once push is done i'll do another loop for pop i will run a loop with this variable i starting
at 0 going till n minus 1 and i'll first set c i as top of stack and then i will perform a pop operation if you want to know more about functions available with stack in stl like their signatures and how to use them you can check the description of this video for some resources this is all i need
to do in my reverse function let's run this code and see what happens i need to enter a string let's enter hello this is what i get as output which seems to be correct let's run this again and this time i want to enter my code school this looks all right too so we seem to be good so this
function is solving my problem of reversal let's now see how efficient it is let's analyze its time complexity we know that all operations on stack take constant time so all these statements within loop inside loop will take constant time the first loop is running n times and then the
second loop is also running n times first loop will execute in big o of n and the second loop will also execute in big o of n the loops are not nested they are one after other so in such scenario complexity of the whole function will also be big o of n time complexity is big o of n
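Here is a minimal sketch of the Reverse function and driver described above, using std::stack from the STL. The buffer size and the fgets based input handling are my assumptions; the program in the video may read input differently.

```cpp
#include <cstdio>
#include <cstring>
#include <stack>

// reverse the string C of length n using a stack of characters
void Reverse(char C[], int n) {
    std::stack<char> S;
    for (int i = 0; i < n; i++)     // push all characters, left to right
        S.push(C[i]);
    for (int i = 0; i < n; i++) {   // overwrite each position with the top, then pop
        C[i] = S.top();
        S.pop();
    }
}

int main() {
    char str[64];
    printf("Enter a string: ");
    if (fgets(str, sizeof(str), stdin) == nullptr) return 0;
    str[strcspn(str, "\n")] = '\0';      // drop the trailing newline, if any
    Reverse(str, (int)strlen(str));
    printf("Output: %s\n", str);
}
```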
but we are using some extra memory here for the stack we are pushing all the characters in the string onto the stack the extra space taken in the stack will be proportional to the number of characters in the string that is proportional to n so we can say that space complexity of this function
is also big o of n in simple words extra space taken is directly proportional to n there are efficient ways to reverse a string without using extra space the most efficient way probably would be to use just two variables to mark the start and end index in the string
initially let's say i am using variables i and j initially i for this example is zero and j is four while i is less than j we can swap the characters at these positions and once we have swapped we can increment i and decrement j if i is less than j we can swap again
and once again increment i and decrement j now i is not less than j i is equal to j at this stage we can stop swapping and we are done this algorithm has space complexity big o of one we are using constant extra memory here time complexity of this approach once again is big o of n
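A minimal sketch of this two-pointer reversal, kept in the same C++ style; the function name ReverseInPlace and the use of std::swap are my choices, a manual three-line swap works just as well.

```cpp
#include <algorithm>  // std::swap

// reverse in place using two indices, O(1) extra space
void ReverseInPlace(char C[], int n) {
    int i = 0, j = n - 1;
    while (i < j) {
        std::swap(C[i], C[j]);  // swap the characters at the two ends
        i++;
        j--;
    }
}
```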
we will do n by two swaps so time taken will be proportional to n definitely because of space complexity this approach is better than our stack approach sometimes when we know that our input will be very small and time and space is not much of concern we use a particular algorithm
for ease of implementation or for its being intuitive it's clearly not the case when we are using stack to reverse a string but for this other problem reversal of linked list that we had said we will discuss using a stack gives us a neat and intuitive solution i have drawn a linked list of integers
here as we know linked lists are collections of entities that we call nodes each node contains two fields one to store data and other to store address of next node i have assumed that these nodes in this example here are at addresses 100 150 250 and 300 respectively identity of a
linked list is address of the head node we typically store this address in a variable named head in an array it takes constant time to access any element so whether it's the first element or last element it takes constant time to access it it is so because array is stored as one
contiguous block of memory so if we know the starting address of the array let's say the starting address of this array is 400 and the size of each element in the array let's say is one byte so for this example each element is one byte then we can calculate the address of any element
so we know that a4 is at 400 plus 4 or 404 but in a linked list nodes are stored at disjoint locations in memory to access any node we have to start at the head node so we can't do something as simple as having two pointers at start and end and accessing the
elements we have already seen in the series two possible approaches that can be used to reverse a linked list one was an iterative solution where we go on reversing links as we traverse the linked list using some temporary variables another solution was using recursion the time complexity of
iterative solution is big o of n space complexity is big o of one in recursive solution we do not create a stack explicitly but recursion uses the stack in computer's memory that is used to execute function calls in such a case we say that we are using implicit stack stack is not being created
explicitly but still we are using an implicit stack i will come back to this and explain in detail the time complexity of recursive solution once again is big o of n but the space complexity this time is also big o of n now let's see how we can use an
explicit stack to solve this problem once again i have drawn logical representation of stack here right now the stack is empty in a program this will be a stack of type pointer to node what i'm going to do now is i'm going to traverse this linked list using a temporary
pointer to node the temporary variable will initially point to head when we will go to a particular node we will push the address of that node onto the stack so first 100 will go to stack and now we will move to the next node now 150 will go in stack and now we will go to 250
and then to the last node at 300 we are showing addresses here in the stack but basically the objects that we are pushing are pointers to node or in other words references to nodes if node is defined like this in c++ we will have to use these statements to traverse the linked
list and push all the references let's say head is a pointer to node which i'm assuming is a global variable that will store the address of head node i'm using a temporary variable that is pointer to node initially i'm storing the address of head node in this temporary variable and then
i'm running a loop and i'm traversing the linked list and as i'm traversing i'm pushing the reference onto stack once all the references are pushed onto stack we can start popping them and as we will pop them we will get references to nodes in reverse order it would be like going through the list in
reverse order while traversing the list in reverse order we can build reverse links the first thing that i'll do is i'll take a temporary variable that will be a pointer to node and store the address at the top of stack which right now is 300 now i will set head
as this address so head now becomes 300 and then i will pop i'm running you through this example here as i'm writing code head and temp right now are both 300 and now i will run a loop like this like what i have written here while stack is not empty this function empty returns true if stack
is empty i'm using stack from standard template library in c++ so while stack is not empty i'm going to say that set temp dot next as address at top of stack basically i'm using this pointer to node temp to dereference and set this particular address field right now top is 250 so i'm building
this reverse link next statement is a pop and in the next statement i'm saying temp equal temp dot next which means temp will now point to this node at 250 stack is not empty so loop will execute again we are writing address here now then we should pop and then move to 150 using this
statement temp equal temp dot next now we're building this link popping and then oops this should have been 150 and with the next temp equal temp dot next we're going here even though we have built this link by setting this field here this node is still pointing to
this guy because the stack is empty now we will exit the loop after the loop after exit from the loop i have written one more line temp dot next equal null so i'm setting the last link part of last node in reversed list as null finally this is my reverse function i have assumed that head is
a global variable and it's a pointer to node if you want the complete source code you can check the description of this video for a link using a stack in this case is making our life easier reversing a linked list is still a complex problem try to just print the elements of linked list in
reverse order if you will use a stack it will be really easy i will stop here for this lesson if you want to know what i meant by implicit stack you can once again check the description of this video for some resources so this is it for this lesson thanks for watching
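For reference, a minimal C++ sketch of the stack based reversal walked through in this lesson. It assumes the node definition and the global head pointer used in the lectures, and uses arrow syntax where the narration says temp dot next.

```cpp
#include <stack>

struct Node {
    int data;
    Node* next;
};

Node* head = nullptr;   // global head pointer, as assumed in the lesson

// reverse the list: push references to all nodes onto a stack,
// then pop them to rebuild the links in reverse order
void Reverse() {
    if (head == nullptr) return;        // empty list, nothing to do
    std::stack<Node*> S;
    Node* temp = head;
    while (temp != nullptr) {           // traverse and push all node addresses
        S.push(temp);
        temp = temp->next;
    }
    temp = S.top();                     // last node becomes the new head
    head = temp;
    S.pop();
    while (!S.empty()) {                // build the reverse links
        temp->next = S.top();
        S.pop();
        temp = temp->next;
    }
    temp->next = nullptr;               // old head is now the last node
}
```

Printing a list in reverse order is even simpler with the same idea: push every node while traversing, then pop and print each node's data.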
in our previous lesson we saw one simple application of stack we saw that a stack can be used to reverse a list or collection or maybe to simply traverse a list or collection in reverse order now in this lesson we will discuss another famous problem that can be solved
using stack and this is also a popular programming interview question and the problem is given an expression in the form of a string comprising let's say constants variables operators and parenthesis and when i say parenthesis i also want to include curly braces and brackets in
my definition of parenthesis so my expression or string can contain characters that can be upper or lowercase letters symbols for operators and an opening or closing parenthesis or an opening or closing curly brace or an opening or closing square bracket let's write down some expressions
here i'm going to write a simple expression we have one simple expression here with one pair of opening and closing parenthesis here in this expression we have nested parenthesis now given such expressions we want to write a program that would tell us whether parenthesis
in the expression are balanced or not and what do we really mean by balanced parenthesis what we really mean by balanced parenthesis is that corresponding to each opening parenthesis or opening curly brace or opening bracket we should have a closing counterpart in correct order
these two expressions here are balanced however this next expression is not balanced a closing curly brace is missing here this next expression is also not balanced because we are missing an opening square bracket here this next one is also not balanced because corresponding to
this opening curly brace we do not have a closing curly brace and corresponding to this closing parenthesis we do not have an opening parenthesis if we are opening with a curly brace we should also close with a curly brace these two cannot count for each other checking for balanced parenthesis is
one of the tasks performed by a compiler when we write a program we often miss an opening or closing curly brace or an opening or closing parenthesis compiler must check for this balancing and if symbols are not balanced it should give you an error in this problem here what's inside a parenthesis
does not matter we do not want to check for correctness of anything that is inside a parenthesis so in the string any character other than opening and closing parenthesis or opening and closing curly brace or opening and closing square bracket can be ignored this problem
sometimes is better stated like this given a string comprising only of opening and closing characters of parenthesis braces or brackets we want to check for balancing so only these characters and their order is important while parsing a real expression we can simply ignore other
characters all we care about is these characters and their order okay so now how do we solve this problem one straightforward thing that comes to mind is that because we should have a closing counterpart for an opening parenthesis or opening curly brace or opening square bracket what we can
do is we can count the number of opening and closing symbols for each of these three types and they should be equal so the number of opening parenthesis should be equal to number of closing parenthesis and the number of opening curly braces should be equal to number of closing curly braces
and same should be true for square brackets as well but it will not be good enough this expression here has one opening parenthesis and one closing parenthesis but it's not balanced this next one is balanced but this one with same number of characters of each type as the second expression
is not balanced so this approach won't work apart from count being equal there are some other properties that must be conserved every opening parenthesis must find a closing counterpart to its right and every closing parenthesis must find an opening counterpart in its left which is not
true in the first expression and the other property that must be conserved is that a parenthesis can close only when all the parenthesis opened after it are closed this parenthesis has been opened after this square bracket so this square bracket cannot close unless this parenthesis has closed
anything that is opened last should be closed first well actually it should not be last opened first closed in this example here this is getting opened last but this guy that is open previous to this is closed first and it is fine the property that must be conserved
is that as we scan the expression from left to right any closer should be for the previous unclosed parenthesis any closer should be for the last unclosed let's scan some expressions from left to right and see how it's true let's scan this last one we will go from left to right
first character is an opening of square bracket second one is an opening parenthesis let's mark opening of unclosed parenthesis in red okay now we have a closer here the third character is a closer this should be the closer for the last unclosed so this
should be the closer for this one this guy this opening parenthesis last unclosed now is this guy next character once again is an opening parenthesis now we have two unclosed parenthesis at this stage and this one is the last unclosed the next one is a closer so it should be closer
for the last unclosed now the last unclosed once again is the opening of square bracket now when we have a closer it should be closer for this guy we can use this approach to solve this problem what we can do is we can scan the expression
from left to right and as we scan at any stage we can keep track of all the unclosed parenthesis basically what we can do is whenever we get an opening symbol an opening parenthesis an opening curly brace or an opening square bracket we can add it to a list if we get a closing symbol
it should be the closer for the last element in the list in case of an inconsistency like if the last opening symbol in the list is not of the same type as the closing symbol or if there is no last opening symbol at all because the list is empty we can stop this whole
process and say that parenthesis are not balanced else we can remove the last opening symbol in the list because we have got its counterpart and continue this whole process things will be further clear if i will run through an example i will run through this last example once again
we are going to scan this expression from left to right and we will maintain a list to keep track of all the open parenthesis that are not yet closed that is all the parenthesis opened but not closed initially this list is empty the first character that we have got
is an opening of square bracket this will go into the list and we will move to the next character the next character is an opening parenthesis so once again it should go to the list we should always insert at end in the list the next character is a closing of parenthesis now we must look at
the last opening symbol in the list and if it is of the same type then we have got its counterpart and we should remove this now we move on to the next character this is once again an opening parenthesis it should go in the list at the end the next character is a closing of parenthesis
so we will look at the last element in the list it's an opening parenthesis so we can remove it from the list and now we go to the last character which is a closing of square bracket once again we need to look at the last element in the list we have one element only one element in the list
at this stage it's an opening of square bracket so once again we can remove it from the list now we are done scanning the list and the list is empty once again if everything is all right if parenthesis are balanced we will always end with an empty list if in the end list is not empty
then some opening parenthesis has not found its closing counterpart and expression is not balanced one thing worth noticing here is that we are always inserting or removing one element at a time from the same end of the list in this whole process whatever is coming in last in the list is going
out first there is a special kind of list that enforces this behavior that element should be inserted and removed from the same end and we call it a stack in a stack we can insert and remove an element one at a time from the same end in constant time so what we can do is whenever we get an opening
symbol while scanning the list we can push it onto the stack and when we get a closing symbol we can check whether the opening symbol at the top of stack is of the same type as the closing symbol if it's of the same type we can pop it if it's not of the same type we can simply say that parenthesis
are not balanced i will quickly write pseudo code for this logic i'm going to write a function named check balanced parenthesis that will take an expression in the form of a string as argument first of all i will store the number of characters in the string in a variable and then i will create
a stack and i will create a stack of characters and now i will scan the expression from left to right using a loop while scanning if the character is an opening symbol if it's an opening parenthesis or opening curly brace or opening square bracket we can push that character onto the stack let's
say this function push will push a character onto s else if expression i or the character at ith position while scanning is a closing symbol of any of the three types we can have two scenarios if stack is empty or top of stack does not pair with the closing symbol if we have a closing of
parenthesis then the top of stack should be an opening of parenthesis it cannot be an opening of curly brace in such a scenario we can conclude that the parenthesis are not balanced else we can perform a pop finally once our scanning is over we can check whether stack is empty or not
if it's empty parenthesis are balanced if it's not they are not balanced so this is my pseudo code let's run through couple of examples and see whether this works for all scenarios or test cases or not let's first look at this expression the first thing that we are doing in our code is that we are
creating a stack of characters i have drawn logical representation of a stack here okay now let's scan this string let's say we have a zero based index and the string is just a character array we are starting the scan we are going inside the loop this is a closing of parenthesis so this if statement
will not hold true so we will go to the else condition and now we will go inside the else to check for this condition whether stack is empty or not or whether the top of stack pairs with this closing symbol or not the stack is empty if the stack is empty there is no opening
counterpart for this closing symbol so we will simply return false returning means exiting the function so we are simply concluding here that parenthesis are not balanced and exiting let's go through this one now first we have an opening square bracket so we will go to the first
if and push next one is an opening parenthesis once again it will be pushed next one is a closing square bracket so the condition for this else if will be true we will go inside this else if now this time the top of stack is an opening parenthesis it should have been an opening square
bracket and then only we would have a pair so this time also we will have to return false and exit okay now let's go through this one first we will have a push the next one will also be a push now next one is a closer of parenthesis which pairs with the top of stack
which is opening of parenthesis so we will have a pop we will go to the next character and this one once again is an opening parenthesis so there will be a push next one is a closing parenthesis and the top is an opening parenthesis they pair so there will be a pop last character is a closing
curly brace so once again we will see whether top of stack is an opening curly brace or not do we have a pair or not yes we have a pair so there will be a pop with this our scanning will finish and finally stack should be empty it is empty so we have balanced parenthesis here
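Here is one possible C++ implementation of this pseudo code. The helper ArePair and the use of std::string are my additions for readability; the linked implementation may look different.

```cpp
#include <iostream>
#include <stack>
#include <string>

// returns true if an opening and a closing symbol are of the same type
bool ArePair(char opening, char closing) {
    if (opening == '(' && closing == ')') return true;
    if (opening == '{' && closing == '}') return true;
    if (opening == '[' && closing == ']') return true;
    return false;
}

bool CheckBalancedParenthesis(const std::string& exp) {
    std::stack<char> S;
    for (char ch : exp) {
        if (ch == '(' || ch == '{' || ch == '[') {
            S.push(ch);                              // opening symbol: push it
        } else if (ch == ')' || ch == '}' || ch == ']') {
            if (S.empty() || !ArePair(S.top(), ch))  // no counterpart, or wrong type
                return false;
            S.pop();                                 // matched, remove the opener
        }
        // any other character is ignored
    }
    return S.empty();   // balanced only if nothing is left unclosed
}

int main() {
    std::cout << CheckBalancedParenthesis("[(a+b)*(c+d)]") << "\n";  // prints 1, balanced
    std::cout << CheckBalancedParenthesis("((a+b)")        << "\n";  // prints 0, not balanced
}
```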
try implementing this pseudo code in a language of your choice and see whether it works for all test cases or not if you want to look at my implementation you can check the description of this video for a link in the coming lessons we will see some more problems on stack this is it for
this lesson thanks for watching hello everyone in this lesson we are going to talk about one important and really interesting topic in computer science where we find application of stack data structure and this topic is evaluation of arithmetic and logical expressions
so how do we write an expression I have written some simple arithmetic expressions here an expression can have constants variables and symbols that can be operators or parenthesis and all these components must be arranged according to a set of rules according to a grammar and we should be
able to parse and evaluate the expression according to this grammar all these expressions that I have written here have a common structure we have an operator in between two operands operand by definition is an object or value on which operation is performed in this expression
two plus three two and three are operands and plus is operator in the next expression a and b are operands and minus is operator in the third expression this asterisk is for multiplication operation so this is the operator the first operand p is a variable
and the second operand 2 is a constant this is the most common way of writing an expression but this is not the only way this way of writing an expression in which we write an operator in between operands is called infix notation operand doesn't always have to be
a constant or variable operand can be an expression itself in this fourth expression that I have written here one of the operands of multiplication operator is an expression itself another operand is a constant we can have a further complex expression in this fifth expression that I have
written here both the operands of multiplication operator are expressions we have three operators in this expression here for this first plus operator these variables p and q are operands for the second plus operator we have r and s and for this multiplication operator the first
operand is this expression p plus q and the second operand is this expression r plus s while evaluating expressions with multiple operators operations will have to be performed in certain order like in this fourth example we will first have to perform the addition and then
only we can perform multiplication in this fifth expression first we will have to perform these two additions and then we can perform the multiplication we will come back to evaluation but if you can see in all these expressions operator is placed in between operands this is the syntax that we are
following one thing that I must point out here throughout this lesson we are going to talk only about binary operators an operator that requires exactly two operands is called a binary operator technically we can have an operator that may require just one operand or maybe more than two
operands but we are talking only about expressions with binary operators okay so let's now see what all rules we need to apply to evaluate such expressions written in this syntax that we are calling infix notation for an expression with just one operator there is no problem we can
simply apply that operator for an expression with multiple operators and no parenthesis like this we need to decide an order in which operators should be applied in this expression if we will perform the addition first then this expression will reduce to 10 into 2 and will finally evaluate
as 20 but if we will perform the multiplication first then this expression will reduce to 4 plus 12 and will finally evaluate to 16 so basically we can look at this expression in two ways we can say that operands for addition operator are 4 and 6 and operands for multiplication are
this expression 4 plus 6 and this constant 2 or we can say that operands for multiplication are 6 and 2 and operands for addition operation are 4 and this expression 6 into 2 there is some ambiguity here but if you remember your high school mathematics this problem is resolved by following operator
precedence rule in an algebraic expression this is the precedence that we follow first preference is given to parenthesis or brackets next preference is given to exponents i'm using this symbol for exponent operator so if i have to write 2 to the power 3 i'll be writing it something like this
in case of multiple exponentiation operator we apply the operators from right to left so if i have something like this then first this rightmost exponentiation operator will be applied so this will reduce to 512 if you will apply the left operator first then this will evaluate
to 64 after exponents next preference is given to multiplication and division and if it's between multiplication and division operators then we should go from left to right after multiplication and division we have addition and subtraction and here also we go from left to right if we have
an expression like this with just addition and subtraction operators then we will apply the leftmost operator first because the precedence of these operators is same and this will evaluate to 3 if you will apply the plus operator first this will evaluate as 1 and that will be wrong
in this second expression 4 plus 6 into 2 that i have written here if we will apply operator precedence then multiplication should be performed first if we want to perform the addition first then we need to write this 4 plus 6 within parenthesis and now addition will be performed first because
precedence of parenthesis is greater i'll take example of another complex expression and try to evaluate it just to make things further clear so i have an expression here in this expression we have four operators one multiplication one division one subtraction and one addition
multiplication and division have higher precedence between these two multiplication and division which have same precedence we will pick the left one first so we will first reduce this expression like this and now we will perform the division and now we have only subtraction and addition
so we will go from left to right and this is what we will finally get this right to left and left to right rule that i have written here for operators with equal precedence is better termed as operator associativity if in case of multiple operators with equal precedence we go from left to right
then we say that the operators are left associative and if we go from right to left we say that the operators are right associative while evaluating an expression in infix form we first need to look at precedence and then to resolve conflict among operators with equal precedence we need to
see associativity all in all we need to do so many things just to parse and evaluate an infix expression the use of parenthesis becomes really important because that's how we can control the order in which operation should be performed parenthesis add explicit intent that operation
should be performed in this order and also improve readability of expression i have modified this third expression we have some parenthesis here now and most often we write infix expressions like this only using a lot of parenthesis even though infix notation is the most
common way of writing expressions it's not very easy to parse and evaluate an infix expression without ambiguity so mathematicians and logicians studied this problem and came up with two other ways of writing expressions that are parenthesis free and can be parsed without ambiguity without
requiring to take care of any of these operator precedence or associativity rules and these two ways are postfix and prefix notations prefix notation was proposed earlier in the year 1924 by a Polish logician prefix notation is also known as Polish notation in prefix notation operator
is placed before operands this expression two plus three in infix will be written as plus two three in prefix plus operator will be placed before the two operands two and three p minus q will be written as minus pq once again just like infix notation operand in prefix notation
doesn't always have to be a constant or variable operand can be a complex prefix notation itself this expression a plus b asterisk c in infix form will be written like this in prefix form i'll come back to how we can convert an infix expression to prefix first have a look at this
third expression in prefix form for this multiplication operator the two operands are variables b and c these three elements are in prefix syntax first we have the operator and then we have the two operands the operands for addition operator are variable a and this prefix expression
asterisk b c in infix expression we need to use parenthesis because an operand can possibly be associated with two operators like in this third expression in infix form b can be associated with both plus and multiplication to resolve this conflict we need to use operator precedence
and associativity rules or use parenthesis to explicitly specify association but in prefix form and also in post-fix form that we will discuss in some time an operand can be associated with only one operator so we do not have this ambiguity while parsing and evaluating prefix
and post-fix expressions we do not need extra information we do not need all the operator precedence and associativity rules i'll come back to how we can evaluate prefix notation i'll first define post-fix notation post-fix notation is also known as reverse Polish notation
this syntax was proposed in the 1950s by some computer scientists in post-fix notation operator is placed after operands programmatically post-fix expression is easiest to parse and least costly in terms of time and memory to evaluate and that's why this was actually invented prefix expression
can also be evaluated in similar time and memory but the algorithm to parse and evaluate post-fix expression is really straightforward and intuitive and that's why it's preferred for computation using machines i'm going to write post-fix for these expressions that i had written
earlier in other forms this first expression 2 plus 3 in post-fix will be 2 3 plus to separate the operands we can use a space or some other delimiter like a comma that's how you would typically store prefix or post-fix in a string when you'll have to write a program this second expression
in post-fix will be pq minus so as you can see in post-fix form we are placing the operator after the operands this third expression in post-fix will be abc asterisk and then plus for this multiplication operator operands are variables b and c and for this addition operands are variable
a and this post-fix expression bc asterisk we will see efficient algorithms to convert in-fix to prefix or post-fix in later lessons for now let's not bother how we will do this in a program let's quickly see how we can do this manually to convert an expression from
in-fix to any of these other two forms we need to go step by step just the way we would go in evaluation i have picked this expression a plus b into c in in-fix form we should first convert the part that should be evaluated first so we should go in order of precedence we can also first
put all the implicit parenthesis so here we will first convert this b into c so first we are doing this conversion for multiplication operator and then we will do this conversion for addition operator we will bring addition to the front so this is how the expression will transform we can use
parenthesis in intermediate steps and once we are done with all the steps we can erase the parenthesis let's now do the same thing for post-fix we will first do the conversion for multiplication operator and then in next step we will do it for addition
and now we can get rid of all the parenthesis parenthesis surely adds readability to any of these expressions to any of these forms but if we are not bothered about human readability then for a machine we are actually saving some memory that would be used to store parenthesis
information infix expression definitely is most human readable but prefix and postfix are good for machines so this is infix prefix and postfix notation for you in next lesson we will discuss evaluation of prefix and postfix notations this is it for this lesson thanks for watching
in our previous lesson we saw what prefix and post fix expressions are but we did not discuss how we can evaluate these expressions in this lesson we will see how we can evaluate prefix and post fix expressions algorithms to evaluate prefix and post fix expressions are similar but i'm going
to talk about postfix evaluation first because it's easier to understand and implement and then i'll talk about evaluation of prefix okay so let's get started i have written an expression in infix form here and i first want to convert this to postfix form as we know in infix form operator
is written in between operands and we want to convert to postfix in which operator is written after operands we have already seen how we can do this in our previous lesson we need to go step by step just the way we would go in evaluation of infix we need to go in order of precedence
and in each step we need to identify operands of an operator and we need to place the operator after the operands what we can actually do is we can first resolve operator precedence and put parenthesis at appropriate places in this expression we'll first do this multiplication
this first multiplication then we'll do this second multiplication then we will perform this addition and finally the subtraction okay now we will go one operator at a time operands for this multiplication operator are a and b so this a asterisk b will become
a b asterisk now next we need to look at this multiplication this will transform to c d asterisk and now we can do the change for this addition the two operands are these two expressions in postfix so i'm placing the plus operator after these two expressions finally for this last
operator the operands are this complex expression and this variable e so this is how it will look after the transformation finally when we are done with all the operators we can get rid of all the parenthesis they are not needed in the postfix expression this is how you can do the conversion
manually we will discuss efficient ways of doing this programmatically in later lessons we will discuss algorithms to convert infix to prefix or postfix in later lessons in this lesson we are only going to look at algorithms to evaluate prefix and postfix expressions
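To summarize the conversion just described, assuming the expression on the board is a*b + c*d - e (which matches the values used below), the steps look like this:

((a*b) + (c*d)) - e    put in the implicit parenthesis
((ab*) + (c*d)) - e    convert the first multiplication
((ab*) + (cd*)) - e    convert the second multiplication
(ab* cd* +) - e        convert the addition
ab* cd* + e -          convert the subtraction and drop the parenthesis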
okay so we have this post fix expression here and we want to evaluate this expression let's say for these values of variables a b c d and e so we have this expression in terms of values to evaluate i'll first quickly tell you how you can evaluate a post fix expression manually
what you need to do is you need to scan the expression from left to right and find the first occurrence of an operator like here multiplication is the first operator in post fix expression operands of an operator will always lie to its left for the first operator the preceding two
entities will always be operands you need to look for the first occurrence of this pattern operand operand operator in the expression and now you can apply the operator on these two operands and reduce the expression so this is what i'm getting after evaluating two three asterisk
now we need to repeat this process till we are done with all the operators once again we need to scan the expression from left to right and look for the first operator if the expression is correct it will be preceded by two values so basically we need to look for first occurrence of this pattern
operand operand operator so now we can reduce this we have six and then we have five into four twenty we are using space as a delimiter here there should be some space in between two operands okay so this is what i have now once again i look for the first occurrence of operand operand
and operator we will go on like this till we are done with all the operators when i'm saying we need to look for first occurrence of this pattern operand operand and operator what i mean by operand here is a value and not a complex expression itself the first operator
will always be preceded by two values and if you will give this some thought you will be able to understand why if you can see in this expression we are applying the operators in the same order in which we have them while parsing from left to right so first we are applying this left most
multiplication on two and three then we are applying the next multiplication on five and four then we are performing the addition and then finally we are performing the subtraction and whenever we are performing an operation we are picking the last two operands preceding the operator in the
expression so if we have to do this programmatically if we have to evaluate a post fix expression given to us in a string like this and let's say operands and operators are separated by space we can have some other delimiter like comma also to separate operands and operator now what we can do is we
can parse the string from left to right in each step in this parsing in each step in this scanning process we can get a token that will either be an operator or an operand what we can do is as we parse from left to right we can keep track of all the operands seen so far and i'll come back to
how it will help us so i'm keeping all the operands seen so far in a list the first entity that we have here is two which is an operand so it will go to the list next we have three which once again is operand so it will go into the list next we have this multiplication operator
now this multiplication should be applied to last two operands preceding it last two operands to the left of it because we already have the elements stored in this list all we need to do is we need to pick the last two from this list and perform the operation it should be two into three
and with this multiplication we have reduced the expression this two three asterisk has now become six it has become an operand that can be used by an operator later we are at this stage right now that i'm showing in the right i'll continue the scanning next we have an operand we'll push this
number five onto the list next we have four which once again will come to the list and now we have the multiplication operator and it should be applied to the last two operands in the reduced expression and we should put the result back into the list this is the stage where we are right now
so this list actually is storing all the operands in the reduced expression preceding the position at which we are during parsing now for this addition we should take out the last two elements from the list and then we should put the result back next we have an operand
we are at this stage right now next we have an operator this subtraction we will perform this subtraction and put the result back finally when i'm done scanning the whole expression i'll have only one element left in the list and this will be my final answer this will be my final result
this is an efficient algorithm we are doing only one pass on the string representing the expression and we have our result the list that we are using here if you could notice is being used in a special way we are inserting operands one at a time from one side and then to perform an operation we are
taking out operand from the same side whatever is coming in last is getting out first this whole thing that we are doing here with the list can be done efficiently with a stack which is nothing but a special kind of list in which elements are inserted
and removed from the same side in which whatever gets in last comes out first it's called a last in first out structure let's do this evaluation again i have drawn logical representation of stack here and this time i'm going to use this stack i'll also write pseudo code for this algorithm i'm
going to write a function named evaluate postfix that will take a string as argument let's name this string expression exp for expression in my function here i'll first create a stack now for the sake of simplicity let's assume that each operand or operator in the expression will be of only one
character so to get a token an operand or an operator we can simply run a loop from zero till length of expression minus one so expression i will be my operand or operator if expression i is an operand we should push it onto the stack else if expression i is an operator we should do
two pop operations in the stack store the value of the operands in some variable i'm using variables named op1 and op2 let's say this pop function will remove an element from top of stack s and also return this element once we have the two operands we can perform the operation i'm using
this variable to store the output let's say this function will perform the operation now the result should be pushed back onto the stack if i have to run through this expression with whatever code i have right now then first entity is two which is operand so it should be pushed onto the stack
next we have three once again this will go to the stack next we have this multiplication operator so we will come to this else if part of the code i'll make first pop and i'll store three in this variable op1 well actually this is the second operand so i should say this one is op2
and next one will be op1 once i have popped these two elements i can perform the operation as you can see i'm doing the same stuff that i was doing with the list the only thing is that i'm showing things vertically stack is being shown as a vertical list i'm inserting or taking
out from the top now i'll push the result back onto the stack now we will move to the next entity which is operand it will go into the stack next four will also go into the stack and now we have this multiplication so we will perform two pop operations after this operation is performed
result will be pushed back next we have addition so we will go on like this we have 26 pushed onto the stack now it's nine which will go in and finally we have this subtraction 26 minus nine 17 will be pushed onto the stack at this stage we will be done with the loop
we are done with all the tokens all the operands and operators the top of stack can be returned as final result at this stage we will have only one element in the stack and this element will be my final result you will have to take care of some parsing logic in actual implementation
operand can be a number of multiple digits and then we will have a delimiter like space or comma so you'll have to take care of that parsing out an operand or operator will be some task if you want to see my implementation you can check the description of this video for a link
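For reference, a minimal C++ sketch of this evaluation, keeping the lesson's simplifying assumption of single character operands; digits are used as operands here so the parsing stays trivial, and PerformOperation is my naming for the helper the lesson describes.

```cpp
#include <iostream>
#include <stack>
#include <string>

// apply a binary operator, op1 is the first (left) operand
int PerformOperation(char op, int op1, int op2) {
    switch (op) {
        case '+': return op1 + op2;
        case '-': return op1 - op2;
        case '*': return op1 * op2;
        case '/': return op1 / op2;
    }
    return 0;  // unknown operator, not expected for valid input
}

// evaluate a postfix expression with single-digit operands and no delimiters
int EvaluatePostfix(const std::string& exp) {
    std::stack<int> S;
    for (char ch : exp) {
        if (ch >= '0' && ch <= '9') {
            S.push(ch - '0');                        // operand: push its value
        } else {                                     // operator: pop two operands
            int op2 = S.top(); S.pop();              // first pop is the second operand
            int op1 = S.top(); S.pop();
            S.push(PerformOperation(ch, op1, op2));  // push the result back
        }
    }
    return S.top();   // the only element left is the final result
}

int main() {
    // postfix for 2*3 + 5*4 - 9, which should evaluate to 17
    std::cout << EvaluatePostfix("23*54*+9-") << "\n";
}
```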
okay so this was postfix evaluation let's now quickly see how we can do prefix evaluation once again i've written this expression in infix form and i'll first convert it to prefix we will go in order of precedence i first put this parenthesis this two asterisk three will become asterisk two three this five into four will
become asterisk five four and now we will pick this plus operator whose operands are these two prefix expressions finally for the subtraction operator this is the first operand and this is the second operand in the last step we can get rid of all the parenthesis so this is what i have
finally let's now see how we can evaluate a prefix expression like this we will do it just like post-fix this time all we need to do is we need to scan from right so we will go from right to left once again we will use a stack if it's an operand we can push it on to the stack so here for this
example nine will go on to the stack and now we will go to the next entity in the left it's four once again we have an operand it will go on to the stack now we have five five will also be pushed on to the stack and now we have this multiplication operator at this stage
we need to pop two elements from the stack this time the first element popped will be the first operand in post-fix the first element popped was the second operand this time the second element popped will be the second operand for this multiplication first operand is five and second
operand is four this order is really important for multiplication the order doesn't matter but for say division or subtraction this will matter result 20 will be pushed on to the stack and we will keep moving left now we have three and two both will go on to the stack
and now we have this multiplication operation three and two will be popped and their product six will be pushed now we have this addition the two elements at top are 20 and six they will be popped and their sum 26 will be pushed finally we have this subtraction 26 and nine will be popped
out and 17 will be pushed and finally this is my answer prefix evaluation can be performed in a couple of other ways also but this is the easiest and most straightforward
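And a matching sketch for prefix evaluation, again assuming single digit operands. The only real differences from the postfix version are the right to left scan and the order in which the two popped values are used.

```cpp
#include <iostream>
#include <stack>
#include <string>

int PerformOperation(char op, int op1, int op2) {
    switch (op) {
        case '+': return op1 + op2;
        case '-': return op1 - op2;
        case '*': return op1 * op2;
        case '/': return op1 / op2;
    }
    return 0;
}

// evaluate a prefix expression by scanning it from right to left
int EvaluatePrefix(const std::string& exp) {
    std::stack<int> S;
    for (int i = (int)exp.length() - 1; i >= 0; i--) {
        char ch = exp[i];
        if (ch >= '0' && ch <= '9') {
            S.push(ch - '0');
        } else {
            int op1 = S.top(); S.pop();   // first pop is the first operand this time
            int op2 = S.top(); S.pop();
            S.push(PerformOperation(ch, op1, op2));
        }
    }
    return S.top();
}

int main() {
    // prefix for 2*3 + 5*4 - 9, which should evaluate to 17
    std::cout << EvaluatePrefix("-+*23*549") << "\n";
}
```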
okay so this was prefix and postfix evaluation using stack in coming lessons we will see efficient algorithms to convert infix to prefix or postfix this is it for this lesson thanks for watching in our previous lesson we saw how we can evaluate prefix and postfix expressions now in this lesson we will see an efficient algorithm to convert infix to postfix we already know of one way of doing
this we have seen how we can do this manually to convert an in-fix expression to post-fix we apply operator precedence and associativity rules let's do the conversion for this expression that I have written here the precedence of multiplication operator is higher so we will first convert this
part b asterisk c b asterisk c will become b c asterisk the operator will come after the operands now we can do the conversion for this addition for addition the operands are a and this post-fix expression in the final step we can get rid of all the parentheses so finally this is
my post-fix expression we can use this logic in a program also but it will not be very efficient and the implementation will also be somewhat complex i'm going to talk about one algorithm which is really simple and efficient and in this algorithm we need to parse the in-fix expression
only once from left to right and we can create the post-fix expression if you can see in in-fix to post-fix conversion the positions of operands and operators may change but the order in which operands occur from left to right will not change the order of operators may change this is an
important observation in both in-fix and post-fix forms here the order of operands as we go from left to right is first we have a then we have b and then we have c but the order of operators is different in in-fix first we have plus and then we have multiplication in post-fix first we
have multiplication and then addition in post-fix form we will always have the operators in the same order in which they should be executed i'm going to perform this conversion once again but this time i'm going to use a different logic what i'll do is i'll parse the in-fix expression from
left to right so i'll go from left to right looking at each token that will either be an operand or an operator in this expression we will start at a a is an operand if it's an operand we can simply append it in the post-fix string or expression that we are trying to create
at least for a it should be very clear that there is nothing that can come before a okay so the first rule is that if it's an operand we can simply put it in the post-fix expression moving on next we have an operator we cannot put the operator in the post-fix expression
because we have not seen its right operand yet while parsing we have seen only its left operand we can place it only after its right operand is also placed so what i'm going to do is i'm going to keep this operator in a separate list or collection and place it later in the post-fix expression
when it can be placed and the structure that i'm going to use for storage is stack a stack is only a special kind of list in which whatever comes in last goes out first insertion and deletion happen from the same end i have pushed plus operator onto the stack here moving on next we have b
which is an operand as we had said operand can simply be appended there is nothing that can come before this operand the operator in the stack is anyway waiting for the operand to come now at this stage can we place the addition operator in the post-fix string well actually what's after b
also matters. In this case we have this multiplication operator after b, which has higher precedence, and so the actual right operand for addition is this whole expression b * c. We cannot perform the addition until the multiplication is finished. So while parsing, when I'm at b and I have not seen
what's ahead of b i cannot decide the fate of the operator in the stack so let's just move on now we have this multiplication operator i want to make this expression further complex to explain things better so i'm adding something at tail here in this expression
now i want to convert this expression to post-fix form i'm not having any parenthesis here we will see how we can deal with parenthesis later let's look at an expression where parenthesis does not override operator precedence okay so right now in this expression while parsing from left to right
we are at this multiplication operator the multiplication operator itself cannot go into the post-fix expression because we have not seen its right operand yet and until its right operand is placed in the post-fix expression we cannot place it the operator that we would be
looking at while parsing that operator itself cannot be placed right away but looking at that operator we can decide whether something from the collection something from the stack can be placed into the post-fix expression that we are constructing or not
any operator in the stack having higher precedence than the operator that we are looking at can be popped and placed into the post-fix expression let's just follow this as rule for now and i'll explain it later there is only one operator in the stack and it is not having
higher precedence than multiplication so we will not pop it and place it in the post-fix expression multiplication itself will be pushed if an element in the stack has something on top of it that something will always be of higher precedence so let's move on in this expression now now we are
at c which is an operand so it can simply go next we have an operator subtraction subtraction itself cannot go but as we had said if there is anything on the stack having higher precedence than the operator that we are looking at it should be popped out and should go
and the question is why we are putting these operators in the stack we are not placing them in the post-fix expression because we are not sure whether we are done with their right operand or not but after that operator as soon as i'm getting an operator of lower precedence
that marks the boundary of the right operand for this multiplication operator c is my right operand it's this simple variable for addition b*c is my right operand because subtraction has lower precedence anything on or after that cannot be part of my right operand
Subtraction, I should say, has lower priority because of the associativity rule: if you remember the order of operations, addition and subtraction have the same precedence, but the one that occurs on the left is given preference. So the idea is: any time, for an operator, if I'm getting an operator of lower priority, we can pop it from the stack and place it in the expression. Here we will first pop multiplication and place it, and then we can pop addition, and now we will push subtraction onto the stack. Let's move on: now d is an operand,
so it will simply go. Next we have multiplication; there is nothing in the stack having higher precedence than multiplication, so we will pop nothing and multiplication will go onto the stack. Next we have an operand, it will simply go. Now there are two ways in which we can find the end of the right operand for an operator: (a) if we get an operator of lesser precedence, or (b) if we reach the end of the expression. Now that we have reached the end of the expression, we can simply pop and place these operators, so first multiplication will go and then subtraction will go.
let's quickly write pseudo code for whatever i have said so far and then you can sit with some examples and analyze the logic i'm going to write a function named in fix to post fix that will take a string exp for expression as argument for the sake of simplicity let's assume that
each operand or operator will be of one character only in an actual implementation you can assume them to be tokens of multiple characters so in my pseudo code here the first thing that i'll do is i'll create a stack of characters named s now i'll run a loop starting 0 till length of expression
minus 1 so i'm looking at each character that can either be an operand or operator if the character is an operand we can append it to the post fix string well actually i should have declared and initialized a string before this loop this is the result string in which i'll be
appending else if expression i is operator we need to look for operators in the stack having higher precedence so i'll say while stack is not empty and the top of stack has higher precedence and let's say this function has higher precedence we'll take two arguments
two operators so if the top of stack has higher precedence than the operator that we are looking at we can append the top of stack to the result which is the variable that will store the post fix string and then we can pop that operator i'm assuming that this s is some class that has these functions
stop and pop and empty to check whether it's empty or not finally once i'm done with the popping outside this while loop i need to push the current operator s is an object of some class that will have these functions stop pop and empty okay so this
is the end of my for loop at the end of it i may have some operators left in the stack i'll pop these operators and append them to the post fix string i'll use this while loop i'll say that while the stack is not empty append the operator at top and pop it and finally after this while loop
i can return the result string that will contain my post fix expression so this is my pseudo code for whatever logic i've explained so far in my logic i've not taken care of parenthesis what if my infix expression would have parenthesis like this there will be slight change from what
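As an illustration, that pseudocode might be rendered in C roughly like the sketch below. Single-character operands, only the four basic operators, an array used as the stack, and a >= precedence comparison (to honour left associativity for equal-precedence operators) are all simplifying assumptions of this sketch; helper names like precedence and is_operand are mine.

```c
#include <stdio.h>
#include <string.h>

char stack[100];
int top = -1;

int precedence(char op) {
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;
}

int is_operand(char c) {
    return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
}

/* Infix to postfix, no parentheses yet. */
void infix_to_postfix(const char *exp, char *result) {
    int k = 0;
    for (int i = 0; i < (int)strlen(exp); i++) {
        char c = exp[i];
        if (is_operand(c)) {
            result[k++] = c;                      /* operands go straight to the output */
        } else {
            /* pop operators of higher (or equal) precedence, then push this one */
            while (top != -1 && precedence(stack[top]) >= precedence(c))
                result[k++] = stack[top--];
            stack[++top] = c;
        }
    }
    while (top != -1)                             /* pop the leftovers */
        result[k++] = stack[top--];
    result[k] = '\0';
}

int main(void) {
    char out[100];
    infix_to_postfix("a+b*c-d*e", out);
    printf("%s\n", out);   /* abc*+de*- */
    return 0;
}
```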
In my logic I have not taken care of parentheses. What if my infix expression had parentheses, like this? There will be a slight change from what we were doing previously. With parentheses, any part of the expression within parentheses should be treated as an independent, complete expression in itself, and no element outside the parentheses will influence its execution. In this expression, this part, a plus b, is within
one parenthesis its execution will not be influenced by this multiplication or this subtraction which is outside it similarly this whole thing is within the outer parenthesis so this multiplication operator outside will not have any influence on execution of this part as a whole if parenthesis
are nested inner parenthesis is sorted out or resolved first and then only outer parenthesis can be resolved with parenthesis we will have some extra rules we will still go from left to right and we will still use stack and let's say i'm going to write the post fix part in right here
as i created now while parsing a token can be an operand an operator or an opening or closing of parenthesis we will have some extra rules i'll first tell them and then i'll explain if it's an opening parenthesis we can push it onto the stack the first token here in this example is an opening
parenthesis so it will be pushed onto the stack and then we will move on we have an opening parenthesis once again so once again we will push it now we have an operand there is no change in rule for operand it will simply be appended to the post fix part next we have an operator
remember what we were doing for operator earlier we were looking at top of stack and popping as long as we were getting operator of higher precedence earlier when we were not using parenthesis we could go on popping and empty the stack but now we need to look at top of stack and
pop only till we get an opening parenthesis because if we are getting an opening parenthesis then it's the boundary of the last open parenthesis and this operator does not have any influence after that outside that so this plus operator does not have any influence outside this opening parenthesis
I'll explain the scenario with some more examples later; let's first understand the rule. So the rule is: if I'm seeing an operator, I need to look at the top of the stack. If it's an operator of higher precedence, I can pop, and then I should look at the next top. If it's once again an operator of higher precedence, I should pop again, but I should stop when I see an opening parenthesis. At this stage we have an opening parenthesis at top, so we do not need to look below it; nothing will be popped anyway. Addition, however, will go onto the stack; remember, after the whole
popping game we pushed the operator itself next we have an operand it will go and we will move on next we have a closing of parenthesis when i'm getting a closing of parenthesis i'm getting a logical end of the last opened parenthesis for part of the expression within that parenthesis
it's coming to the end and remember what we were doing earlier when we were reaching the end of infix expression we were popping all the operators out and placing them so this time also we need to pop all the operators out but only those operators that are part of this parenthesis that we are
closing so we need to pop all the operators until we get an opening parenthesis i'm popping this plus and appending it next we have an opening of parenthesis so i'll stop but as last step i will pop this opening also because we are done for this parenthesis okay so the rule for closing
of parenthesis pop until you're getting an opening parenthesis and then finally pop that particular opening parenthesis also let's move on now next we have an operator we need to look at top of stack it's an opening of parenthesis this operator will simply be pushed next we have an
operand next we have an operator once again we will look at the top we have multiplication which is higher precedence so this should be popped and appended we will look at the top again it's an opening of parenthesis so we should stop looking now minus will be pushed now next we have an operand
next we have closing of parenthesis so we need to pop until we get an opening minus will be appended finally the opening will also be popped next we have an operator and this will simply go next we have an operand and now we have reached the end of expression so everything in the stack
will be popped and appended so this finally is my post fix expression i'll take one more example and convert it to make things further clear i want to convert this expression i'll start at the beginning first we have an operand then this multiplication operator which will simply go onto
the stack the stack right now is empty there is nothing on the top to compare it with next we have an opening parenthesis which will simply go next we have an operand it will be appended and now we move on to this addition operator if this opening parenthesis was not there
the top of the stack would have been the multiplication operator, which has higher precedence, so it would have been popped. But now we will look at the top and it's an opening parenthesis, so we cannot look below, and we will simply have to move on. Next we have c (I missed pushing the addition operator last time, okay). After c we have this closing parenthesis, so we need to pop until we get an opening, and then we need to pop that one opening also. Finally we have reached the end of the expression, so everything in the stack will be popped and appended, and this finally is my postfix form.
In my pseudocode that I had written earlier, only the part within this for loop will change to take care of parentheses. In case we have an operator, we need to look at the top of the stack and pop, but only till we are getting an opening parenthesis, so I have put this extra condition in the while loop; this condition will make sure that we stop once we get an opening parenthesis. Right now in the for loop we are dealing with operands and operators; we will have two more conditions. If it's an opening parenthesis, we should push; else if it's a closing parenthesis, we can go on popping and appending. Let's say this function IsOpeningParenthesis will check whether a character is an opening parenthesis or not. In fact, we should use this function here also, when I'm checking whether the current token is an opening or not, because it could be an opening curly brace or opening bracket also; this function will then take care of that. Similarly, for this last else-if we should use a function IsClosingParenthesis. Okay, things are consistent now. After this while loop, in the last else-if, we should do one extra pop, and this extra pop will pop the opening parenthesis; and now we are done with this else-if, and this is the closing of my for loop. The rest of the stuff will remain the same: after the for loop we can pop the leftover operators and append them to the string.
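Extending the earlier sketch with the parenthesis rules just described, the conversion might look roughly like this in C, under the same simplifying assumptions; the two helpers here only handle round brackets, not braces or square brackets.

```c
#include <stdio.h>
#include <string.h>

char stack[100];
int top = -1;

int precedence(char op) {
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;
}

int is_operand(char c)             { return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'); }
int is_opening_parenthesis(char c) { return c == '('; }
int is_closing_parenthesis(char c) { return c == ')'; }

void infix_to_postfix(const char *exp, char *result) {
    int k = 0;
    for (int i = 0; i < (int)strlen(exp); i++) {
        char c = exp[i];
        if (is_operand(c)) {
            result[k++] = c;
        } else if (is_opening_parenthesis(c)) {
            stack[++top] = c;                       /* rule: push '(' */
        } else if (is_closing_parenthesis(c)) {
            while (top != -1 && !is_opening_parenthesis(stack[top]))
                result[k++] = stack[top--];         /* pop until '(' */
            top--;                                  /* the extra pop: discard '(' */
        } else {                                    /* operator */
            while (top != -1 && !is_opening_parenthesis(stack[top])
                   && precedence(stack[top]) >= precedence(c))
                result[k++] = stack[top--];
            stack[++top] = c;
        }
    }
    while (top != -1)                               /* pop the leftovers */
        result[k++] = stack[top--];
    result[k] = '\0';
}

int main(void) {
    char out[100];
    infix_to_postfix("a*(b+c)", out);
    printf("%s\n", out);   /* abc+* */
    return 0;
}
```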
Finally, we can return the result string. So this is my final pseudocode; you can check the description of this video for a link to a real implementation, the actual source code. Okay, so I'll stop here now. This is it for this lesson, thanks for watching. Hello everyone, we have been talking
about data structures for some time now as we know data structures are ways to store and organize data in computers so far in this series we have discussed some of the data structures like arrays linked lists and in last couple of lessons we have talked about stack in this lesson we are
going to introduce you to queues. We are going to talk about queue ADT, just the way we did it for stacks. First we are going to talk about queue as an abstract data type, or ADT. As we know, when we talk about a data structure as an abstract data type, we define only the features or operations available with the data structure and do not go into implementation details. We will see possible implementations in later lessons; in this lesson we are only going to discuss the logical view of the queue data structure. Okay, so let's get started. Queue data structure is exactly what we mean when we say queue in the real world: a queue is a structure in which whatever goes in first comes out first; in short, we call queue a FIFO (first-in-first-out) structure. Earlier we had seen stack, which is a last-in-first-out structure, or in short LIFO. A stack is a collection in which both insertion and removal
happen from the same end, which we call the top of the stack. In a queue, however, an insertion must happen from one end, that we call the rear or tail of the queue, and any removal must happen from the other end, that we can call the front or head of the queue. If I have to define queue formally as an abstract data type, then a queue is a list or collection with the restriction or constraint that insertion can be, and must be, performed at one end, which we call the rear or tail of the queue, and deletion can be performed at the other end, which we can call the front or head of the queue. Let's now define the interface, or operations available with queue. Just like stack, we have two fundamental operations here. An insertion is called enqueue (nq) operation; some people also like to name this operation push. Enqueue should insert an element at the tail or rear end of the queue. A deletion is called dequeue (dq) operation; in some implementations people call this operation pop also. Push and pop are more famous in the context of stack; enqueue and dequeue are more famous in the context of queues. While implementing, you can choose any of these names in your interface. Dequeue should remove an element from the front or head of the queue, and dequeue typically also returns this element that it removes from the head. The signatures of enqueue and dequeue for a queue of integers can be something like this: enqueue is returning void here, while dequeue is returning an integer; this integer should be the removed element from the queue. You can design dequeue also to return void. Typically a third operation, front or peek, is kept just to look at the element at the head, just like the top operation that we had kept in stack. This operation should just return the element at front and should not delete
anything. Okay, we can have a few more operations: we can have one operation to check whether the queue is empty or not, and if the queue has a limited size, then we can have one operation to check whether the queue is full or not. Why I'm calling out these alternate names for operations is also because, most of the time, we do not write our own implementation of a data structure; we use inbuilt implementations available with language libraries, and the interface can be different in different language libraries. For example, if you use the inbuilt queue in C++, the function to insert is push, while in C# it's Enqueue, so we should not get confused; I'll just keep the more famous names here. Okay, so these are the operations that I have defined with queue ADT: enqueue, dequeue, front and IsEmpty. We can insert or remove one element at a time from the queue using enqueue and dequeue; front is only to look at the element at head, and IsEmpty is only to verify whether the queue is empty or not. All these operations that I have written here must take constant time, or in other words, their time complexity should be big O of 1.
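Just to make the interface concrete, the signatures for a queue of integers might be declared in C like this; the exact names and the use of bool here are my choice, not fixed by the lesson, and there is deliberately no implementation at this stage.

```c
#include <stdbool.h>

/* Queue ADT for integers: only the interface, no implementation details. */
void Enqueue(int x);   /* insert x at the rear/tail                 */
int  Dequeue(void);    /* remove and return the element at front    */
int  Front(void);      /* return the element at front, do not remove */
bool IsEmpty(void);    /* true if the queue has no elements         */
/* All of these must run in O(1) time. */
```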
Logically, a queue can be shown as a figure or container open from two sides, so an element can be inserted, or enqueued, from one side, and an element can be removed, or dequeued, from the other side. If you remember stack, we show a stack as a container open from one side, so an insertion, or what we call push in the context of stack, and a removal, or pop, both must happen from
the same side in q insertion and removal should happen from different sides let's say i want to create a queue of integers let's say initially we have an empty queue i will first write down one of the operations and then show you the simulation in logical view
let's say i first want to nq number two this figure that i'm showing here right now is an empty queue of integers and i'm saying that i'm performing an in queue operation here in a program i would be calling an in queue function passing it number two as argument
After this enqueue we have one element in the queue, we have one integer in the queue; because we have only one element in the queue right now, the front and rear of the queue are the same. Let's enqueue one more integer: now I want to insert number five.
Five will be inserted at the rear or tail of the queue. Let's enqueue one more, and now I want to call the dequeue operation, so we will pick two from the head of the queue and it will go out. If dequeue is supposed to return this removed integer, then we will get integer two as return. Enqueue and dequeue are the fundamental operations available with queue; in our design we can have some more for our convenience, like we have front and IsEmpty here. A call to front at this stage will get us integer five as return; no integer will be removed from the queue. Calling IsEmpty at this stage will return a boolean false, or zero for false and one for true. So this pretty much is how queue works. Now one obvious question can be: what are the real scenarios where we can use queue, what are the use cases of the queue data structure? Queue is most often used in
a scenario where there is a shared resource that's supposed to serve some requests but the resource can handle only one request at a time it can serve only one request at a time in such a scenario it makes most sense to queue up the requests the request that comes first gets served first
let's say we have a printer shared in a network any machine in the network can send a print request to this printer printer can serve only one request at a time it can print only one document at a time so if a request comes when it's busy it can't be like i'm busy request later that will be really
rude of the printer what really happens is that the program that manages the printer puts the print request in a queue as long as there is something in the queue printer keeps picking up a request from the front of the queue and serves it processor on your computer is also a shared
resource a lot of running programs or processes need time of the processor but the processor can attend to only one process at a time processor is the guy who has to execute all the instructions who has to perform all the arithmetic and logical operations so the processes are put in a queue
queues in general can be used to simulate weight in a number of scenarios we will discuss some of these applications of queue in detail while solving some problems in later lessons this is good for an introduction in next lesson we will see how we can implement queue this is it for this
lesson thanks for watching in our previous lesson we introduced you to queue data structure we talked about queue as abstract data type or ADT as we know when we talk about the data structure as abstract data type we define it as a mathematical or logical model we define only the features or operations
available with the data structure and do not go into implementation details in this lesson we are going to discuss possible implementations of queue i will do a quick recap of what we have discussed so far a queue is a list or collection with this restriction with this constraint
that insertion can be performed at one end, which we call the rear or tail of the queue, and deletion can be performed at the other end, which we call the front or head of the queue. An insertion in queue is called enqueue operation; a deletion is called dequeue operation. I have defined queue ADT with these four operations that I have written here; in an actual implementation all these operations will be functions. The front operation should simply return the element at the front of the queue, it should not remove any element from the queue; IsEmpty should simply check whether the queue is empty or not. And all these operations must take constant time: enqueue, dequeue, or looking at the element at front, the time taken for any of these operations must not depend upon a variable like the number of elements in the queue, or in other words, the time complexity of all these operations must be big O of 1.
okay so let's get started we are saying that a queue is a special kind of list in which elements can be inserted or removed one at a time and insertion and removal happen at different ends of the queue we can insert an element at one end and we can remove an element from the other end
just the way we did it for stack we can add these constraints or extra properties of queue to some implementation of a list and create a queue there are two popular implementations of queue we can have an array based implementation and we can have linked list based implementation
let's first discuss array based implementation let's say we want to create a queue of integers what we can do is we can first create an array of integers i have created an array of 10 integers here i have named this array a now what i'm going to do is i'm going to use this array
to store my queue what i'm going to say is that at any point some part of the array starting an index marked as front till an index marked as rear will be my queue in this array i'm showing front of the queue towards left and rear towards right in earlier examples i was showing front towards
right and rear towards left doesn't really matter any side can be front and any side can be rear it's just that an element must always be added from rear side and must always be removed from front so if at any stage a segment of the array from an index marked as front till an index marked as
rear is my queue and rest of the positions in the array are free space that can be used to expand the queue to insert an element to nq we can increment rear so we will add a new cell in the queue towards rear end and in this cell we can write the new value element to be inserted
can come to this position i'll fill in some values here at these positions so we have these integers in the queue and let's say we want to insert number five to insert we will increment rear of course there should be an available cell in the right an available empty cell in the right
and now we can write value five here. After insertion, the new rear index is seven and the value at index seven is five. Now dequeue means we must remove an element from the front of the queue; in this example here, a dequeue operation should remove number two from the queue. To dequeue, we can simply increment
front because at any point only the cells starting front till rear are part of my queue by incrementing front i have discarded index two from the queue and we do not care what value lies in a cell that is not part of the queue when we will include a cell in the queue we will overwrite
the value in that cell anyway, so just incrementing front is good enough for the dequeue operation. Let's quickly write pseudocode for whatever we have discussed so far. In my code I will have two variables named front and rear, and initially I'll set them both as minus one. Let's say for an empty queue both front and rear will be minus one. To check whether the queue is empty or not, we can simply check the values of front and rear, and if they're both minus one we can say that the queue is empty; I just wrote the IsEmpty function here. Minus one is not a valid index
for an empty queue there will be no front and rear in our implementation we are saying that we will represent empty state of queue by setting both front and rear as minus one now let's write the nq function nq will take an integer x as argument there will be a couple
of conditions in enqueue. If rear is already equal to the maximum index available in array A, we cannot insert or enqueue an element; in such a scenario we can return and exit. I would rather use a function named IsFull to determine whether the queue is full or not; if the queue is already full, we can't do much, we should simply exit. Else, if the queue is empty, we can add a cell to the queue: we can add a cell at index zero and then set the value at index rear as x. In all other cases we can first increment rear and then fill in value x at index rear. I can keep this statement A[rear] = x outside these two conditional statements because it's common to them. So this is my enqueue function. In the example array that I'm showing here, let's enqueue some integers; I'll make calls to the enqueue function and show you the simulation in the figure here. Let's say first I want to insert
number two in the queue i'm making a call to nq function passing number two as argument the queue is empty so we will set both front and rear as zero now we will come to this statement we will write value two at index zero so this is my queue after one nq operation front and
rear of the queue are the same. Let's make another call to enqueue; this time I want to insert number five. This time the queue is not empty, so rear will be incremented; we have added a cell to the queue by incrementing rear, and now we will write the value five at the new rear index. Let's enqueue one more number; I have enqueued seven. Let's now write the dequeue operation. There will be a couple of cases in dequeue. If the queue is already empty, we cannot remove an element; in this case we can simply print or throw an error and return or exit. There will be one more special case: if the queue has only one element, in this case front and rear will not be minus one, but they will both be equal. Because we are already checking for the minus one case in the IsEmpty function in the previous if, in this else-if we can simply check whether front is equal to rear or not. If this is the case, our dequeue will make the queue empty, and to mark the queue
as empty we need to set both front and rear as minus one this is what we had said that we will represent an empty queue by marking both front and rear as minus one in default or normal scenario we will simply increment front we should really be careful about corner cases in any implementation
that's where most of the bugs come okay so this finally is my dq function in this example here at this stage let's say we want to perform a dq q is not empty and we do not have only one element in the queue so we will simply increment front before
incrementing we could set the value in this cell at index zero as something but the value in a cell that is not part of q anymore doesn't really matter at this stage it doesn't really matter what we have at index zero or index three or any other index apart from the segment between
front and rear; when we add a cell to the queue, we will overwrite the value in that cell anyway. Let's now perform some more enqueues and dequeues. I'm enqueuing three and then I'm enqueuing one; with each enqueue we are incrementing rear. I just performed some more enqueues here. Now let's perform a dequeue. If I perform one more enqueue here, rear will be equal to the maximum index available in the array. Let's enqueue one more. Now at this stage we cannot enqueue an element anymore, because we cannot increment rear; the enqueue operation will fail. Now there are two unused cells
right now but with whatever logic we have written we cannot use these two cells that are in the left of front in fact this is a real problem as we will dequeue more and more all the cells left of front index will never be used again they will simply be wasted can we do something to use these
cells well we can use the concept of a circular array circular array is an idea that we use in a lot of scenarios the idea is very simple as we traverse an array we can imagine that there is no end in the array from zero we can go to one from one we can go to two and finally when we will reach
the last index in the array like in this example when we are at index nine the next index for me is index zero we can imagine this array something like this remember this is only a logical way of looking at the array in circular interpretation of array if i'm pointing to a position and my
current position is i then the next position or next index will not simply be i plus one it will be i plus one modulo the number of elements in array or the size of array let's say n is the number of elements in array then the next position will be i plus one modulo n the modulo operation will
get us the remainder upon dividing by n for any i other than n minus one this modulo operation will not have any effect but for i equal n minus one next position will be n modulo n which will be equal to zero when you divide the number by itself the remainder is zero previous position in circular
interpretation of array will be i plus n minus one modulo n we could simply say i minus one modulo n just to make sure this expression inside the parenthesis is always positive i'm adding n here give this some thought you should be able to get why it should be i plus n minus one modulo n
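As a tiny worked example of this index arithmetic, assuming an array of size n = 10:

```c
#include <stdio.h>

int main(void) {
    int n = 10;                     /* size of the circular array */
    int i = 9;                      /* current position: last index */
    int next = (i + 1) % n;         /* (9 + 1) % 10 = 0, wraps to the start */
    int prev = (i + n - 1) % n;     /* (9 + 10 - 1) % 10 = 8 */
    printf("next of %d is %d, previous is %d\n", i, next, prev);

    i = 0;
    printf("previous of 0 is %d\n", (i + n - 1) % n);   /* 9, not -1 */
    return 0;
}
```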
now with this interpretation of array we can increment rear in an nq operation as long as there is any unused cell in the array i'm going to modify functions in my pseudo code now is empty will remain the same we are still saying that for an empty q front and rear will be minus one
let's scroll down and come to nq now in circular interpretation i will call my q full when the position next to rear in circular interpretation that we will calculate as rear plus one modulo n will be equal to front so we will have a situation like this right now the next position to rear
in circular interpretation is front so there is no unused cell the complete array is exhausted nothing will change in this condition if q is empty we can simply set front and rear as zero in the last else condition we will increment rear like this we will say rear is equal to
rear plus one modulo n where n is number of elements in the array with this much change my nq function is good now let's make a call to nq and insert something in this array here i want to insert number 15 we will come to this last else condition rear right now is nine so this
expression will be nine plus one modulo n n is 10 here the size of this array a is 10 here this will evaluate to zero now my new rear is zero i will write number 15 here let's now see what we need to do in dq function nothing will change in the first two conditions if q is already
empty or if there is only one element in the q we will handle these cases in same manner in the final else when we are incrementing front we need to increment it in a circular manner so we will say front equal front plus one modulo n where n is number of elements in the array total number
of elements in the array or size of array now let's perform a dq we will come to this condition front right now is two so this will be two plus one modulo 10 one more cell is available to us now this much is the core of our implementation front operation will be really straightforward
we simply need to return the element at the front index. Here also we first need to check whether the queue is empty or not; we should return A[front] only when front is not equal to minus one. All these operations, all these functions that I have written here, will take constant time; their time complexity
will be big O of 1: we are performing simple arithmetic and assignments in the functions and not doing anything costly like running a loop, so the time taken will not depend upon the size of the queue or some other variable. I'll leave this here.
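One possible rendering of this pseudocode in C might look like the following sketch; the fixed capacity of 10 and the simple error prints are assumptions I am adding here.

```c
#include <stdio.h>

#define MAX_SIZE 10

/* Circular-array queue of integers, following the pseudocode above. */
int A[MAX_SIZE];
int front = -1, rear = -1;

int IsEmpty(void) { return front == -1 && rear == -1; }
int IsFull(void)  { return (rear + 1) % MAX_SIZE == front; }

void Enqueue(int x) {
    if (IsFull()) { printf("Error: queue is full\n"); return; }
    if (IsEmpty()) front = rear = 0;          /* first element */
    else rear = (rear + 1) % MAX_SIZE;        /* circular increment */
    A[rear] = x;
}

void Dequeue(void) {
    if (IsEmpty()) { printf("Error: queue is empty\n"); return; }
    if (front == rear) front = rear = -1;     /* last element removed */
    else front = (front + 1) % MAX_SIZE;      /* circular increment */
}

int Front(void) {
    if (IsEmpty()) { printf("Error: queue is empty\n"); return -1; }
    return A[front];
}

int main(void) {
    Enqueue(2); Enqueue(5); Enqueue(7);
    Dequeue();
    printf("%d\n", Front());   /* 5 */
    return 0;
}
```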
It should not be very difficult converting this pseudocode to a running program in a language of your choice; if you want to see my code, you can check the description of this video for a link. Thanks for watching. In our previous lesson we saw how we can implement queue using arrays; now in this lesson we will see how
we can implement queue using a linked list. Queue, as we know from our previous discussions, is a structure in which whatever goes in first comes out first. The queue data structure is a list or collection with this restriction: that insertion can be performed at one end and deletion can be performed at the other end. These are the typical operations that we defined with queue: an insertion is called enqueue operation and a deletion is called dequeue. The front function should simply return the element at the front of the list, and IsEmpty should check whether the queue is empty or not, and all these operations must
take constant time their time complexity should be big o of one when we were implementing q with arrays we used the idea of a circular array to implement q then in this case we have a limitation the limitation is that array will always have a fixed size and once all the positions in the
array are taken once the array is exhausted we have two options we can either deny insertion so we can say that the q is full and we cannot insert anything now or what we can do is we can create a new larger array and copy the elements from previous array to the new larger array
which will be a costly process we can avoid this problem if we will use linked list to implement q please note that this representation of circular array that i'm showing here is only a logical way of looking at an array we can show this array like this also as i was
saying in an array implementation we will have this question what if array gets filled and we need to take care of this we can either say q is full or we can create a new larger array and copy elements from previous filled array into this new larger array the time taken for this copy
operation will be proportional to number of elements in filled array or in other words we can say that the time complexity of this copy operation will be big o of n there is another problem with array implementation we can have a large enough array and q may not be using most of
it like right now in this array 90 percent of the memory is unused memory is an important resource and we should always avoid blocking memory unnecessarily it's not that some amount of unused memory will be a real problem in a modern day machine it's just that while designing solutions and
algorithms we should analyze and understand these implications let's now see how good we will be with a linked list implementation i have drawn a logical view of a linked list of integers here coming back to basic definition of q as we know a q is a list or collection with this
constraint with this property that an element must always be inserted from one side of the q that we call the rear of q and an element must always be removed from the other side that we call the front of q it's really easy to enforce this property in a linked list a linked list as we
know is a collection of entities that we call nodes and these nodes are stored at non-contiguous locations in memory each node contains two fields one to store data and another to store address of the next node or reference to the next node let's assume that nodes in this figure
are at addresses hundred two hundred and three hundred respectively i have also filled in the address fields the identity of linked list that we always keep with us is address of the head node we often name the pointer or reference variable that would store this address head okay so now
we are saying that we want to use linked list to implement q these are the typical operations that we define with a q we can use a linked list like a q we can pick one side for insertion or in q operation so a node in the linked list must always be inserted from this side the other side will
then be used for dequeue. So if we are picking the head side for enqueue operation, a dequeue must always happen from the tail; if we are picking the tail for enqueue operation, then dequeue must always happen from the head. Whatever side we are picking for whatever operation, we need to take care of one requirement, and
the requirement is that these operations must take constant time or in other words their time complexity must be big o of one as we know from our previous lessons the cost of insertion or removal from head side is big o of one but the cost of insertion or removal from tail side is big o of
n so here's the deal in a normal implementation of linked list if we will insert at one side and remove from other side then one of these operations nq or dq depending on how we are picking the sides will cost us big o of n but the requirement that we have is that both these operations must take
constant time so we definitely need to do something to make sure that both nq and dq operations take constant time let's call this side front and this side rear so i want to nq a node from this side and i want to dq from this side we are good for dq operation because removal from front will
take constant time but insertion or nq operation will be big o of n let's first see why insertion at tail will be costly and then maybe we can try to do something to insert at rear end what we will have to do is first we will have to create a node we have a new node here let's say i've got
this node at address 350 and the integer that i want to nq is 7 the address part of this node can be set as null now what we need to do is we need to build this link we need to set the address part of the last node as address of this newly created node and to do so we first need to have a pointer
pointing to this last node storing the address of this last node in a linked list the only identity that we always keep with us is address of the head node to get a pointer to any other node we need to start at head so we will first create a pointer temp and we will initially set it to
head, and now in one step we can move this pointer variable to the next of whatever node it is pointing to. We use a statement like temp = temp->next to move to the next node, so from the first node we will go to the second node, and then from the second we will go to the third node.
in this example third node is the rear node and now using this pointer temp we can write the address part of this node and build this link this whole traversal that we are having to get a pointer from head to tail is what's taking all the time what we can do is we can avoid this whole traversal
we can have a pointer variable just like head that should always store the address of rear node i can call this variable tail or rear let's call this rear and let's call this variable that is storing the address of head node front in any insertion or removal we will have to update
both front and rear now. But now, when we enqueue, let's say I have got a node at address 450 and I want to insert this node at the rear end: using the rear pointer, we can update the address field here, so we are building this link, and now we can update rear. We will only have to modify
some address fields and time taken for in queue operation will not depend upon number of nodes in the linked list so now with this design both in queue and dequeue operations will be constant time operations the time complexity for both will be big o of one let's quickly see how real code
in C will look like for this design. I have declared node as a structure with two fields, one to store data and another to store the address of the next node. And now, instead of declaring a pointer variable named head, a pointer to node named head, I'm declaring two pointers: a pointer to node named front and another pointer to node named rear, and initially I'm setting them both as null. Let's say I'm defining these two variables in global scope so they will be accessible to all functions. My enqueue function will take an integer as argument. In this function I'll first create a node; I'll use malloc in C, or the new operator in C++, to create a node in what we call dynamic memory. I'm pointing to the newly created node using this variable, which is a pointer to node named temp. Now we can have two cases in an insertion or enqueue operation: if there is no element in
the queue if the queue is empty in this case both front and rear will be null we will simply set both front and rear as address of this new node being pointed to by temp and we will return or exit else because we already have a pointer to rear node we will first set the address part of
current rear as the address of this newly created node and then we will modify the address in rear variable to make it point to this newly created node while writing all of this i'm assuming that you already know how to implement a linked list if you want to refresh your concepts you can check
earlier lessons in this series or you can check the description of this video for a link to lesson on linked list implementation in c or c plus plus this code will be further clear if i'll show things moving in a simulation let's say initially we have an empty queue so both front and
rear will be null null is only a macro for address zero at this stage let's say we are making a call to nq function passing it number two now let's go through the nq function and see what will happen first we will create a node data part of this node will be set as two and address part initially
will be set as null. Let's say we got this node at address hundred, so a variable named temp is storing this address; this variable is pointing to this node. Right now front and rear are both null, so we will go inside this if condition and simply set both front and rear
as hundred when the function will finish execution temp which is a local variable will be cleared from memory after setting both front and rear as address of this newly created node we are returning so this is how the queue will look like after first nq let's say we are making another call to
the enqueue function at this stage, passing number four as argument. Once again a new node will be created; let's say I got the new node at address 200. This time the queue is not empty, so in this function we will first go to this statement, rear->next = temp, so we will set the next part of this node at address hundred as the address of the newly created node, which is 200. So we will build this link, and now we will store the address of the new rear node in this variable named rear. So this is how my queue will look after this second enqueue. Let's do one more enqueue; let's enqueue number six. Let's say we got a new node this time at address 300, so this is how our queue will look. Okay, let's now write the dequeue function. In the dequeue function I'll first create a temporary pointer to node in which I'll store the address of the current head or current front. Let's say for this
example at this stage i'm making a call to dequeue function we will have couple of cases in dequeue also the queue could be empty so in this case we can print an error message and return in case of empty queue front and rear will both be equal to null we can check in one of these and we will be
good. In the case when front and rear are equal, we will simply set both front and rear as null. In all other cases we can simply make front point to the next node, so we will simply do front = front->next. But why have we used this temporary pointer to node, why have I declared this temporary pointer to node in this code? Well, simply moving front ahead will not be good enough. In this example, when I'm calling dequeue, I'm first creating temp. Let's walk through whatever code I've written so far: in the first line I'm creating temp, and then, because the queue is not empty
and there are more than one elements in the queue i'm setting front as address of the next node so my queue is good now all the links are appropriately modified but this node which was front previously is still in the memory anything in dynamic memory has to be explicitly freed
To free this node we will use the free function, and to this free function we should be passing the address of the node; that's why we had created temp. With this free, the node will be wiped off from memory. So these are the enqueue and dequeue operations for you.
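A minimal sketch in C of the enqueue and dequeue functions just described, with global front and rear pointers and error handling reduced to simple prints, might look like this:

```c
#include <stdio.h>
#include <stdlib.h>

/* Linked-list queue of integers. */
struct Node {
    int data;
    struct Node *next;
};

struct Node *front = NULL;   /* head of the list: dequeue from here */
struct Node *rear  = NULL;   /* tail of the list: enqueue from here */

void Enqueue(int x) {
    struct Node *temp = (struct Node *)malloc(sizeof(struct Node));
    temp->data = x;
    temp->next = NULL;
    if (front == NULL && rear == NULL) {   /* empty queue */
        front = rear = temp;
        return;
    }
    rear->next = temp;    /* link the current rear to the new node */
    rear = temp;          /* the new node becomes the rear */
}

void Dequeue(void) {
    if (front == NULL) { printf("Error: queue is empty\n"); return; }
    struct Node *temp = front;
    if (front == rear) front = rear = NULL;   /* only one element */
    else front = front->next;
    free(temp);           /* release the old front node */
}

int Front(void) {
    if (front == NULL) { printf("Error: queue is empty\n"); return -1; }
    return front->data;
}

int main(void) {
    Enqueue(2); Enqueue(4); Enqueue(6);
    Dequeue();
    printf("%d\n", Front());   /* 4 */
    return 0;
}
```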
If you can see, there are only simple statements in these functions, there are no loops, so these functions will take constant time; the time complexity will be big O of 1. In the beginning of this lesson we had also discussed some limitations of the array implementation, like what if the array gets filled, and that of unused memory. We do not have these limitations in a linked list implementation; we are using some extra memory to store the address of the next node, but apart from that there is no other major disadvantage. I'll stop here now. You can write the rest of the functions, like the front function to look at the element at front, or the IsEmpty function to check whether the queue is empty or not, yourself. If you want to get my source code, you can check the description of this video for a link. So thanks for watching. Hello everyone, in this lesson we'll introduce you to an interesting data structure
that has got its application in a wide number of scenarios in computer science and this data structure is tree so far in this series we have talked about what we can call linear data structures array linked list stack and queue all of these are linear data structures
all of these are basically collections of different kinds in which data is arranged in a sequential manner in all these structures that i'm showing here we have a logical start and a logical end and then an element in any of these collections can have a next element and
a previous element so all in all we have linear or sequential arrangement now as we understand these data structures are ways to store and organize data in computers for different kinds of data we use different kinds of data structure our choice of data structure depends upon a number
of factors. First of all, it's about what needs to be stored; a certain data structure can be the best fit for a particular kind of data. Then we care about the cost of operations: quite often we want to minimize the cost of the most frequently performed operations. For example, let's say we have a simple
list and we are searching for an element in the list most of the time then we may want to store the list or collection as an array in sorted order so we can perform something like binary search really fast another factor can be memory consumption sometimes we may want to minimize
the memory usage and finally we may also choose a data structure for ease of implementation although this may not be the best strategy tree is one data structure that's quite often used to represent hierarchical data for example let's say we want to show employees in an organization
and their positions in organizational hierarchy then we can show it something like this let's say this is organizational hierarchy of some company in this company john is CEO and john has two direct reports Steve and Rama then Steve has three direct reports Steve is
manager of Lee, Bob and Ella; they may be having some designations. Rama also has two direct reports, then Bob has two direct reports, and then Tom has one direct report. This particular logical structure that I've drawn here is a tree; well, you have to look at the structure upside down,
and then it will resemble a real tree the root here is at top and we are branching out in downward direction logical representation of tree data structure is always like this root at top and branching out in downward direction okay so tree is an efficient way of storing
and organizing data that is naturally hierarchical but this is not the only application of tree in computer science we will talk about other applications and some of the implementation details like how we can create such a logical structure in computer's memory later first I
want to define tree as a logical model tree data structure can be defined as a collection of entities called nodes linked together to simulate a hierarchy tree is a non-linear data structure it's a hierarchical structure the topmost node in the tree is called root of the tree each node will
contain some data and this can be data of any type in the tree that I'm showing in right here data is name of employee and designation so we can have an object with two string fields one to store name and another to store designation okay so each node will contain some data and may contain link
or references to some other nodes that can be called its children. Now I'm introducing you to some vocabulary that we use for the tree data structure. What I'm going to do here is number these nodes in the tree so I can refer to these nodes using these numbers; I'm numbering
these nodes only for my convenience it's not to show any order okay coming back as I had said each node will have some data we can fill in some data in these circles it can be data of any type it can be an integer or a character or a string or we can simply assume that there is some data
filled inside these nodes and we are not showing it okay as we were discussing a node may have link or reference to some other nodes that will be called its children each arrow in this structure here is a link okay now as you can see the root node which is numbered 1 by me and once again this
number is not indicative of any order; I could have called the root node node number 10 also. So the root node has links to these two nodes, number two and three, so two and three will be called children of one, and node one will be called the parent of nodes two and three. I'll write down all these terms that I am talking about; we mentioned root, children and parent. In this tree, one is the parent of two and three, two is a child of one, and four, five and six are children of two, so node two is a child of node one but the parent of nodes four, five and six. Children of the same parent
are called sibling I'm showing siblings in same color here two and three are sibling then four five and six are sibling then seven eight are sibling and finally nine and ten are sibling I hope you are clear with these terms now the topmost node in the tree is called root root would be the
only node without a parent and then if a node has a direct link to some other node then we have a parent child relationship between the nodes any node in the tree that does not have a child is called leaf node all these nodes marked in black here are leaves so leaf is one more term
All other nodes, with at least one child, can be called internal nodes. And we can have some more relationships: the parent of a parent can be called grandparent, so one is the grandparent of four and four is a grandchild of one. In general, if we can go from node A to B walking through the links (and remember these links are not bidirectional: we have a link from one to two, so we can go from one to two but we cannot go from two to one; when we are walking the tree we can walk in only one direction), okay, so if we can go from node
a to node b then a can be called ancestor of b and b can be called descendant of a let's pick up this node number 10 one two and five are all ancestors of 10 and 10 is a descendant of all of these nodes we can walk from any of these nodes to 10 okay let me now ask you some
questions to make sure you understand things what are the common ancestors of four and nine ancestors of four are one and two and ancestors of nine are one two and five so common ancestors will be one and two okay next question are six and seven sibling sibling must have same parent
six and seven do not have same parent they have same grandparent one is grandparent of both nodes not having same parent but having same grandparent can be called cousins so six and seven are cousins and these relationships are really interesting we can also say that node number three
is the uncle of node number six, because it's the sibling of two, which is the father of six, or I should say the parent of six. So we have quite some terms in the vocabulary of tree. Okay, now I'll talk about some properties of tree. Tree can be called a recursive data structure: we can define a tree recursively as a structure that consists of a distinguished node called root and some subtrees, and the arrangement is such that the root of the tree contains links to the roots of all the subtrees. T1, T2 and T3 in this figure are subtrees. In the tree that I have drawn
in left here we have two subtrees for root node i'm showing the root node in red the left subtree in brown and the right subtree in yellow we can further split the left subtree and look at it like node number two is root of this subtree and this particular tree with node number two as root has
three subtrees i'm showing the three subtrees in three different colors recursion basically is reducing something in a self-similar manner this recursive property of tree will be used everywhere in all implementation and users of tree the next property that i want to talk about
is in a tree with n nodes there will be exactly n minus one links or edges each arrow in this figure can be called a link or an edge all nodes except the root node will have exactly one incoming edge if you can see i'll pick this node number two there is only one incoming link this is incoming
link and these three are outgoing links there will be one link for each parent child relationship so in a valid tree if there are n nodes there will be exactly n minus one edges one incoming edge for each node except the root okay now i want to talk about these two properties called depth
and height depth of some node x in a tree can be defined as length of the path from root to node x each edge in the path will contribute one unit to the length so we can also say number of edges in path from root to x the depth of root node will be zero let's pick some other node
for this node number five we have two edges in the path from root so the depth of this node is two in this tree here depth of nodes two and three is one depth of nodes four five six seven and eight is two and the depth of nodes nine ten and eleven is three okay now height of
a node in a tree can be defined as the number of edges in the longest path from that node to a leaf node. So the height of some node x will be equal to the number of edges in the longest path from x to a leaf. In this figure, for node three, the longest path from this node to any leaf is two, so the height of node three is two. Node eight is also a leaf node; I'll mark all the leaf nodes here. A leaf node is a node with zero children. The longest path from node three to any of the leaf nodes is two, so the height of node three is two. The height of leaf nodes will be zero. So what will be the height of the root node in this tree?
we can reach all the leaves from root node number of edges in longest path is three so height of the root node here is three we also define height of a tree height of tree is defined as height of root node height of this tree that i'm showing here is three height and depth are different properties
and height and depth of a node may or may not be same we often confuse between the two based on properties trees are classified into various categories there are different kinds of trees that are used in different scenarios simplest and most common kind of tree is a tree
with this property that any node can have at most two children in this figure node two has three children i'm getting rid of some nodes and now this is a binary tree binary tree is most famous and throughout this series we will mostly be talking about binary trees the most common way of
implementing tree is dynamically created nodes linked using pointers or references just the way we do for linked list we can look at the tree like this in this structure that i have drawn in right here node has three fields one of the fields is to store data let's say middle cell is to store
data the left cell is to store the address of the left child and the right cell is to store address of right child because this is a binary tree we cannot have more than two children we can call one of the children left child and another right child programmatically in c or c++ we can define
node as a structure like this we have three fields here one to store data let's say data type is integer i have filled in some data in these nodes so in each node we have three fields we have an integer variable to store the data and then we have two pointers to node one to store the address of
the left child, that will be the root of the left subtree, and another to store the address of the right child. we have kept only two pointers because we can have at most two children in a binary tree. this particular definition of node can be used only for a binary tree; for generic trees that can have any number of children we use some other structure, and i'll talk about it in later lessons. in fact we will discuss implementation in detail in later lessons; this is just to give you a brief idea of how things will look in implementation, something like the sketch below.
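a minimal sketch of such a node in c++ might look like this; the exact names in the lesson's code may differ, i'm just using data, left and right here:

    // sketch of a binary tree node; data type is int here but it could be anything
    struct Node {
        int data;     // field to store the data
        Node* left;   // address of left child, null if there is no left child
        Node* right;  // address of right child, null if there is no right child
    };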
okay, so this is cool, we understand what a tree data structure is, but in the beginning we had said that storing naturally hierarchical data is not the only application of tree, so let's quickly have a look at some of the applications of tree in computer science. the first application of course is storing naturally hierarchical data
for example the file system on your disk drive the file and folder hierarchy is naturally hierarchical data it's stored in the form of tree next application is organizing data organizing collections for quick search insertion and deletion for example binary search tree that we'll be discussing a lot
in next couple of lessons, can give us order of log n time for searching an element in it. a special kind of tree called trie is used to store dictionaries; it's really fast and efficient and is used for dynamic spell checking. tree data structure is also used in network routing
algorithms and this list goes on we'll talk about different kinds of trees and their applications in later lessons i'll stop here now this is good for an introduction in next couple of lessons we'll talk about binary search tree and its implementation this is it for this lesson thanks
for watching in our previous lesson we introduced you to tree data structure we discussed tree as a logical model and talked briefly about some of the applications of tree now in this lesson we will talk a little bit more about binary trees as we had seen in our previous lesson binary tree
is a tree with this property that each node in the tree can have at most two children we will first talk about some general properties of binary tree and then we can discuss some special kind of binary trees like binary search tree which is a really efficient structure for storing ordered data
in a binary tree as we were saying each node can have at most two children in this tree that I've drawn here nodes have either zero or two children we could have a node with just one child I have added one more node here and now we have a node with just one child because each node
in a binary tree can have at most two children we call one of the children left child and another right child for the root node this particular node is left child and this one is right child a node may have both left and right child and these four nodes have both left and right child
or a node can have either of left and right child this one has got a left child but has not got right child I'll add one more node here now this node has a right child but does not have a left child in a program we would set the reference or pointer to left child as null so we can say
that for this node left child is null and similarly for this node we can say that the right child is null for all the other nodes that do not have children that are leaf nodes a node with zero child is called leaf node for all these nodes we can say that both left and right child are null
based on properties we classify binary trees into different types I'll draw some more binary trees here if a tree has just one node then also it's a binary tree this structure is also a binary tree this is also a binary tree remember the only condition is that a node cannot have more than
two children a binary tree is called strict binary tree or proper binary tree if each node can have either two or zero children this tree that I'm showing here is not a strict binary tree because we have two nodes that have one child I'll get rid of two nodes and now this is a strict binary tree
we call a binary tree complete binary tree if all levels except possibly the last level are completely filled and all nodes are as far left as possible all levels except possibly the last level will anyway be filled so the nodes at the last level if it's not filled completely must be
as far left as possible. right now this tree is not a complete binary tree. nodes at the same depth can be called nodes at the same level; the root node in a tree has depth zero, and depth of a node is defined as the length of the path from root to that node. in this figure, let's say nodes at depth zero are nodes at level
zero I can simply say L0 for level zero now these two nodes are at level one these four nodes are at level two and finally these two nodes are at level three the maximum depth of any node in the tree is three maximum depth of a tree is also equal to height of the tree if we will go numbering
all the levels in the tree like L0 L1 L2 and so on then the maximum number of nodes that we can have at some level i will be equal to two to the power i at level zero we can have one node two to the power zero is one then at level one we can have at max two nodes at level two we can
have two to the power two nodes at max which is four so in general at any level i we can have at max two to the power i nodes you should be able to see this very clearly because each node can have two children so if we have x nodes at a level then each of these x nodes can have two children
so at next level we can have at most two x children here in this binary tree we have four nodes at level two which is the maximum for level two now each of these nodes can possibly have two children i'm just drawing the arrows here so at level three we can have max two times four that is eight
nodes. now, for a complete binary tree all the levels have to be completely filled; we can give exception to the last level, or the deepest level, it doesn't have to be full but the nodes have to be as far left as possible. this particular tree that i'm showing here is not a complete binary tree
because we have two vacant node positions in left here i'll do slight change in this structure now this is a complete binary tree we can have more nodes at level three but there should not be a vacant position in left i have added one more node here and this still is a complete binary tree
if all the levels are completely filled such a binary tree can also be called perfect binary tree in a perfect binary tree all levels will be completely filled if h is the height of a perfect binary tree remember height of a binary tree is length of longest path between root to any of the
leaf nodes, or i should say the number of edges in the longest path from root to any of the leaf nodes. height of a binary tree will also be equal to the max depth; here for this binary tree the height, or max depth, is three. now, the maximum number of nodes in a tree with height h: we'll have two to the power zero nodes at level zero, two to the power one nodes at level one, and we'll go on summing; for height h we'll go till two to the power h, at the deepest level we will have two to the power h nodes. this sum, two to the power zero plus two to the power one and so on till two to the power h, will be equal to two to the power h plus one, minus one. h plus one is the number of levels here, so we can also say two to the power number of levels, minus one. in this tree the number of levels is four, we have l zero till l three, so the maximum number of nodes will be two to the power four minus one, which is 15. so a perfect binary tree will have the maximum number of nodes possible
for a height, because all levels will be completely filled; well, i should say the maximum number of nodes in a binary tree with height h. okay, i can ask you this also: what will be the height of a perfect binary tree with n nodes? let's say n is the number of nodes in a perfect binary tree. to find out the height we'll have to solve this equation, n equal to two to the power h plus one, minus one, because if the height is h the number of nodes will be two to the power h plus one, minus one. we can solve this equation and the result will be this, remember n is the number of nodes here, i leave the maths for you to understand: height will be equal to log of n plus one to the base two, minus one. in this perfect binary tree that i'm showing here the number of nodes is 15, so n is 15, n plus one will be 16, so h will be log 16 to the base two, minus one; log 16 to the base two will be four, so the final value will be four minus one, equal to three. in general, for a complete binary tree, we can also calculate height as floor of log n to the base two, so we need to take the integral part of log n to the base two. a perfect binary tree is also a complete binary tree; here n is 15, log of 15 to base two is 3.906891, and if we take the integral part then this will be three. i'll not go into the proof of how the height of a complete binary tree will be log n to the base two, we'll try to see that later.
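just as a quick sanity check on this maths, here is a tiny c++ sketch, purely illustrative and not part of the lesson's code, that evaluates these formulas for the 15-node perfect binary tree:

    #include <cmath>
    #include <iostream>

    int main() {
        int h = 3;                                     // height of the perfect binary tree
        int maxNodes = (1 << (h + 1)) - 1;             // 2^(h+1) - 1 = 15
        std::cout << "max nodes for height 3: " << maxNodes << "\n";

        int n = 15;                                    // number of nodes
        double hPerfect = std::log2(n + 1) - 1;        // log2(n+1) - 1 = 3 for a perfect tree
        int hComplete = (int)std::floor(std::log2(n)); // floor(log2 n) = 3 for a complete tree
        std::cout << "height (perfect): " << hPerfect << "\n";
        std::cout << "height (complete): " << hComplete << "\n";
        return 0;
    }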
all this maths will be really helpful when we analyze the cost of various operations on a binary tree. the cost of a lot of operations on a tree in terms of time depends upon the height of the tree; for example in binary search tree, which is a special kind of binary tree, the cost of searching, inserting or removing an element in terms of time is proportional to the height of the tree, so in such a case we would
want the height of the tree to be less height of a tree will be less if the tree will be dense if the tree will be close to a perfect binary tree or a complete binary tree minimum height of a tree with n nodes can be log n to the base two when the tree will be a complete binary tree if we will
have an arrangement like this then the tree will have maximum height with n nodes. the minimum height possible is floor of, or the integral part of, log n to the base two, and the maximum height possible with n nodes is n minus one, when we will have a sparse tree like this which is as good as a linked
list now think about this if i'm saying that time taken for an operation is proportional to height of the tree or in other words i can say that if time complexity of an operation is big o of h where h is height of the binary tree then for a complete or perfect
binary tree my time complexity will be big o of log n to the base two, and in worst case, for this sparse tree, my time complexity will be big o of n. order of log n is almost the best running time possible: for n as high as two to the power hundred, log n to the base two is just hundred. with order of
n running time if n will be two to the power hundred we won't be able to finish our computation in years even with most powerful machines ever made so here's the thing quite often we want to keep the height of a binary tree minimum possible or most commonly we say that we try to keep a
binary tree balanced we call a binary tree balanced binary tree if for each node the difference between height of left and right sub tree is not more than some number k mostly k would be one so we can say that for each node difference between height of left and right sub tree
should not be more than one there is something that i want to talk about height of a tree we had defined height earlier as number of edges in longest path from root to a leaf height of a tree with just one node where the node itself will be a leaf node will be zero we can define an empty tree
as a tree with no node and we can say that height of an empty tree is minus one so height of tree with just one node is zero and height of an empty tree is minus one quite often people calculate height as number of nodes in longest path from root to a leaf in this figure i have drawn one
of the longest paths from root to a leaf we have three edges in this path so the height is three if we will count number of nodes in the path height will be four this looks very intuitive and i have seen this definition of height at a lot of places if we will count the nodes height of tree with just
one node will be equal to one, and then we can say height of an empty tree will be zero; but this is not the correct definition and we are not going to use this assumption. we are going to say that height of an empty tree is minus one and height of a tree with one node is zero. the difference
between heights of left and right sub trees of a node can be calculated as absolute value of height of left sub tree minus height of right sub tree and in this calculation height of a sub tree can be minus one also for this leaf node here in this figure both left and right sub trees are empty
so both h left or height of left sub tree and h right or height of right sub tree will be minus one but the difference overall will be zero for all nodes in a perfect tree difference will be zero i have got rid of some nodes in this tree and now by the side of each node i have written
the value of diff this is still a balanced binary tree because the maximum diff for any node is one let's get rid of some more nodes in this tree and now this is not balanced because one of the nodes has diff two for this particular node height of left sub tree is one and height of
right sub tree is minus one because the right sub tree is empty, so the absolute value of the difference is two. we try to keep a tree balanced to make sure it's dense and its height is minimized; if height is minimized, the cost of various operations that depend upon height is minimized. a small recursive sketch for computing height and this diff is given below.
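a rough recursive sketch of how this height and diff could be computed in c++, assuming a node with data, left and right fields as described earlier; the function names height, diff and isBalanced are just my own here:

    #include <algorithm> // std::max
    #include <cstdlib>   // std::abs

    struct Node {
        int data;
        Node* left;
        Node* right;
    };

    // height of an empty tree is -1, height of a tree with just one node is 0
    int height(Node* root) {
        if (root == nullptr) return -1;
        return std::max(height(root->left), height(root->right)) + 1;
    }

    // absolute difference between heights of left and right subtrees of a node
    int diff(Node* node) {
        return std::abs(height(node->left) - height(node->right));
    }

    // balanced: for every node the diff is not more than 1
    bool isBalanced(Node* root) {
        if (root == nullptr) return true;
        return diff(root) <= 1 && isBalanced(root->left) && isBalanced(root->right);
    }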
okay, the next thing that i want to talk about very briefly is how we can store binary trees in memory. one of the ways that we had seen in our previous lesson, which is most commonly used, is dynamically created nodes linked to each other using pointers or references. for a binary tree of integers in c or c plus plus
we can define a node like this data type here is integer so we have a field to store data and we have two pointer variables one to store address of left child and another to store address of right child this of course is the most common way nodes dynamically created at random locations in
memory linked together through pointers but in some special cases we use arrays also arrays are typically used for complete binary trees i have drawn a perfect binary tree here let's say this is a tree of integers what we can do is we can number these nodes from zero starting at root and going
level by level from left to right, so we'll go like zero, one, two, three, four, five and six. now i can create an array of seven integers and these numbers can be used as indices for these nodes; so at the zeroth position i'll fill two, at the first position i'll fill four, at the second position we'll have one, and i'll go on like this. we have filled in all the data in the array, but how will we store the information about the links? how will we know that the left child of root has value four and the right child of root has value one? well, in case of a complete binary tree, if we will number the nodes
like this, then for a node at index i the index of the left child will be two i plus one and the index of the right child will be two i plus two, and remember this is true only for a complete binary tree. for node zero, two i plus one for i equals zero will be one, and two i plus two will be two. now for node one, the left child is at index three and the right child is at index four; for i equal two, two i plus one will be five and two i plus two will be six. we will discuss the implementation in detail when we talk about a special kind of binary tree called heap; arrays are used to implement heaps. this indexing is sketched below.
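a small illustrative sketch of this indexing, using the 0 to 6 level-by-level numbering from the figure:

    #include <iostream>

    int main() {
        // for a complete binary tree stored in an array, numbered level by level from 0,
        // the children of the node at index i sit at 2*i + 1 and 2*i + 2
        for (int i = 0; i <= 2; ++i) {
            std::cout << "node " << i
                      << ": left child at index " << 2 * i + 1
                      << ", right child at index " << 2 * i + 2 << "\n";
        }
        return 0;
    }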
i'll stop here now in our next lesson we will talk about binary search tree which is also a special kind of binary tree that gives us a really efficient storing structure in which we can search something quickly as well as update it quickly this is it for this lesson thanks for watching
in our previous lesson we talked about binary trees in general now in this lesson we're going to talk about binary search tree a special kind of binary tree which is an efficient structure to organize data for quick search as well as quick update but before i start talking about
binary search trees i want you to think of a problem what data structure will you use to store a modifiable collection so let's say you have a collection and it can be a collection of any data type records in the collection can be of any type now you want to store this collection
in computer's memory in some structure and then you want to be able to quickly search for a record in the collection and you also want to be able to modify the collection you want to be able to insert an element in the collection or remove an element from the collection so what data structure
will you use well you can use an array or a linked list these are two well-known data structures in which we can store a collection now what will be the running time of these operations search insertion or removal if we will use an array or a linked list let's first talk about
arrays and for sake of simplicity let's say we want to store integers to store a modifiable list or collection of integers we can create a large enough array and we can store the records in some part of the array we can keep the end of the list marked in this array that i'm showing here
we have integers from 0 till 3 we have records from 0 till 3 and rest of the array is available space now to search some x in the collection we will have to scan the array from index 0 till end and in worst case we may have to look at all the elements in the list if n is the number of
elements in the list time taken will be proportional to n or in other words we can say that time complexity of this operation will be big o of n okay now what will be the cost of insertion let's say we want to insert number five in this list so if there is some available space
all these cells in yellow are available we can add one more cell by incrementing this marker end and we can fill in the integer to be added the time taken for this operation will be constant running time will not depend upon number of elements in the collection so we can say that
time complexity will be big o of 1 okay now what about removal let's say we want to remove one from the collection what we'll have to do is we'll have to shift all records to the right of one by one position to the left and then we can decrement end the cost of removal in worst case
once again will be big o of n in worst case we will have to shift n minus one elements here the cost of insertion will be big o of 1 if the array will have some available space so the array has to be large enough if the array gets filled what we can do is we can create a new
larger array typically we create an array twice the size of the filled up array so we can create a new larger array and then we can copy the content of the filled up array into this new larger array the copy operation will cost us big o of n we have discussed this idea of dynamic array quite
a bit in our previous lessons so insertion will be big o of 1 if array is not filled up and it will be big o of n if array is filled up for now let's just assume that the array will always be large enough let's now discuss the cost of these operations if we will use a linked list
if we would use a linked list i have drawn a linked list of integers here data type can be anything the cost of search operation once again will be big o of n where n is number of records in the collection or number of nodes in the linked list to search in worst case we will
have to traverse the whole list we will have to look at all the nodes the cost of insertion in a linked list is big o of 1 at head and it's big o of n at tail we can choose to insert at head to keep the cost low so running time of insertion we can say is big o of 1 or in other words we will
take constant time removal once again will be big o of n we will first have to traverse the linked list and search the record and in worst case we may have to look at all the nodes okay so this is the cost of operations if we are going to use array or linked list
insertion definitely is fast, but how good is big o of n for an operation like search? what do you think? if we are searching for a record x then in the worst case we will have to compare this record x with all the n records in the collection. let's say our machine can perform
a million comparisons in one second so we can say that machine can perform 10 to the power 6 comparisons in one second so cost of one comparison will be 10 to the power minus 6 second machines in today's world deal with really large data it's very much possible for real world data
to have 100 million or billion records a lot of countries in this world have population more than 100 million two countries have more than a billion people living in them if we will have data about all the people living in a country then it can easily be 100 million records okay so if we are
saying that the cost of one comparison is 10 to the power minus 6 second if n will be 100 million time taken will be 100 seconds 100 seconds for a search is not reasonable and search may be a frequently performed operation can we do something better can we do better than big o of n well in
an array we can perform binary search if it's sorted and the running time of binary search is big o of log n which is the best running time to have i have drawn this array of integers here records in the array are sorted here the data type is integer for some other data type for some
complex data type we should be able to sort the collection based on some property or some key of the records we should be able to compare the keys of records and the comparison logic will be different for different data types for a collection of strings for example we may want to have the
records sorted in dictionary or lexicographical order so we will compare and see which string will come first in dictionary order now this is the requirement that we have for binary search the data structure should be an array and the records must be sorted okay so the cost of search
operation can be minimized if we will use a sorted array but in insertion or removal we will have to make sure that the array is sorted afterwards in this array if i want to insert number five at this stage i can't simply put five at index six what i'll have to do is i'll first have to find the
position at which i can insert five in the sorted list we can find the position in order of log n time using binary search we can perform a binary search to find the first integer greater than five in the list so we can find the position quickly in this case it's index two but then we will have to
shift all the records starting at this position one position to the right, and now i can insert five. so even though we can find the position at which a record should be inserted quickly, in big o of log n, this shifting in worst case will cost us big o of n, so the running
time overall for an insertion will be big o of n and similarly the cost of removal will also be big o of n we will have to shift some records okay so when we are using sorted array cost of search operation is minimized in binary search for n records we will have at max log n to the base
two comparisons so if we can compare if we can perform million comparisons in a second then for n equal two to the power 31 which is greater than two billion we are going to take only 31 microseconds log of two to the power 31 to base two will be 31 okay we are fine with
search now we will be good for any practical value of n but what about insertion and removal they are still big o of n can we do something better here well if we will use this data structure called binary search tree i'm writing it in short bst for binary search tree then the cost of all
these three operations can be big o of log n in average case the cost of all the operations will be big o of n in worst case but we can avoid the worst case by making sure that the tree is always balanced we have talked about balanced binary tree in our previous lesson binary search tree is
only a special kind of binary tree to make sure that the cost of these operations is always big o of log n we should keep the binary search tree balanced we'll talk about this in detail later let's first see what a binary search tree is and how cost of these operations is minimized when
we use a binary search tree binary search tree is a binary tree in which for each node value of all the nodes in left subtree is lesser and value of all the nodes in right subtree is greater i have drawn binary tree as a recursive structure here as we know in a binary tree each node can
have at most two children we can call one of the children left child if we will look at the tree as a recursive structure left child will be the root of left subtree and similarly right child will be the root of right subtree now for a binary tree to be called binary search tree
value of all the nodes in left subtree must be lesser or we can say lesser or equal to handle duplicates and the value of all the nodes in right subtree must be greater and this must be true for all the nodes so in this recursive structure here both left and right subtrees must also be
binary search trees i'll draw a binary search tree of integers now i have drawn a binary search tree of integers here let's see whether this property that for each node value of all the nodes in left subtree must be lesser or equal and value of all the nodes in right subtree must be
greater is true or not let's first look at the root node nodes in the left subtree have values 10 8 and 12 so they're all lesser than 15 and in right subtree we have 17 20 and 25 they're all greater than 15 so we are good for the root node now let's look at this node with value 10
in left we have 8 which is lesser, in right we have 12 which is greater, so we are good. we are good for this node too, having value 20, and we don't need to bother about leaf nodes because they do not have children. so this is a binary search tree. now what if i change this value 12
to 16 now is this still a binary search tree well for node with value 10 we are good the node with value 16 is in its right so not a problem but for the root node we have a node in left subtree with higher value now so this tree is not a binary search tree i'll revert back and
make the value 12 again. now as we were saying, we can search, insert or delete in a binary search tree in big o of log n time in average case. how is it really possible? let's first talk about search. if these integers that i have here in the tree were in a sorted array we could have performed
binary search and what do we do in binary search let's say we want to search number 10 in this array what we do in binary search is we first define the complete list as our search space the number can exist only within the search space i'll mark search space using these two pointers
start and end now we compare the number to be searched or the element to be searched with mid element of the search space or the median and if the record being searched if the element being searched is lesser we go searching in the left half else we go searching in the right half
in case of equality we have found the element in this case 10 is lesser than 15 so we will go searching towards left our search space is reduced now to half once again we will compare to the mid element and bingo this time we have got a match in binary search we start with
n elements in search space and then if mid element is not the element that we are looking for we reduce the search space to n by 2 and we go on reducing the search space to half till we either find the record that we are looking for or we get to only one element in search space
and be done. in this whole reduction, if we go from n to n by 2 to n by 4 to n by 8 and so on, we will have log n to the base 2 steps; if we are taking k steps, then n upon 2 to the power k will be equal to 1, which implies 2 to the power k will be equal to n, and k will be equal to log n to the base 2. so this is why the running time of binary search is big o of log n; a quick sketch of this binary search on a sorted array is given below.
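a quick c++ sketch of binary search on a sorted array; this is illustrative only, the function name and signature are my own:

    #include <vector>

    // returns true if x is present in the sorted vector, false otherwise
    bool binarySearch(const std::vector<int>& a, int x) {
        int start = 0, end = (int)a.size() - 1;  // search space is the whole list
        while (start <= end) {
            int mid = start + (end - start) / 2; // mid element of the search space
            if (a[mid] == x) return true;        // found
            else if (x < a[mid]) end = mid - 1;  // go searching in the left half
            else start = mid + 1;                // go searching in the right half
        }
        return false;                            // search space became empty
    }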
now if we use this binary search tree to store the integers, the search operation will be very similar. let's say we want to search for number 12. what we'll do is we'll start at root and then we will compare the value to be searched, the integer to be searched, with the value of root; if it's equal we are done with the search, if it's lesser we know that we need to go to the left subtree, because in a binary search tree all the elements in left subtree are lesser and all the elements in right subtree are greater
now we'll go and look at the left child of node with value 15 we know that number 12 that we are looking for can exist in this subtree only and anything apart from this subtree is discarded so we have reduced the search space to only these three nodes having value 10, 8 and 12 now once
again we'll compare 12 with 10 we are not equal 12 is greater so we know that we need to go looking in right subtree of this node with value 10 so now our search space is reduced to just one node once again we will compare the value here at this node and we have a match
so searching an element in a binary search tree is basically this traversal in which at each step we will go either towards left or right, and hence at each step we will discard one of the subtrees. if the tree is balanced, we call a tree balanced if for all nodes
the difference between the heights of left and right subtrees is not greater than one so if the tree is balanced we will start with a search space of n nodes and when we will discard one of the subtrees we will discard n by two nodes so our search space will be reduced to n by two
and then in next step we will reduce the search space to n by four we will go on reducing like this till we find the element or till our search space is reduced to only one node when we will be done so the search here is also a binary search and that's why the name binary search tree
this tree that i'm showing here is balanced in fact this is a perfect binary tree but with same records we can have an unbalanced tree like this this tree has got the same integer values as we had in the previous structure and this is also a binary search tree but this is unbalanced
this is as good as a linked list in this tree there is no right subtree for any of the nodes search space will be reduced by only one at each step from n nodes in search space we will go to n minus one nodes and then to n minus two nodes all the way till one will be n steps in binary search
tree in average case cost of search insertion or deletion is big o of log n and in worst case this is the worst case arrangement that i'm showing you running time will be big o of n we always try to avoid the worst case by trying to keep the binary search tree balanced
with same records in the tree there can be multiple possible arrangements for these integers in this tree another arrangement is this for all the nodes we have nothing to discard in left subtree in a search this is another arrangement this is still balanced because for all the nodes the
difference between the heights of left and right subtrees is not greater than one but this is the best arrangement when we have a perfect binary tree at each step we will have exactly n by two nodes to discard okay now to insert some record in binary search tree we will first have to find
the position at which we can insert, and we can find the position in big o of log n time. let's say we want to insert 19 in this tree. what we will do is we will start at the root; if the value to be inserted is lesser or equal, and there is no left child, insert as left child, else go left; if the value is greater and there is no right child, insert as right child, else go right. in this case 19 is greater, so we will go right. now we are at 20; 19 is lesser and the left subtree is not empty, we have a left child, so we will go left. now we are at 17; 19 is greater than 17, so it should go in the right of 17. there is no
right child of 17 so we will create a node with value 19 and link it to this node with value 17 as right child because we are using pointers or references here just like linked list no shifting is needed like an array creating a link will take constant time so overall insertion will also
cost us like search to delete also we will first have to search the node search once again will be big o of log n and deleting the node will only mean adjusting some links so removal also is going to be like search big o of log n in average case binary search tree gets unbalanced during insertion
and deletion so often during insertion and deletion we restore the balancing there are ways to do it and we will talk about all of this in detail in later lessons in next lesson we will discuss implementation of binary search tree in detail this is it for this lesson thanks for watching
in our previous lesson we saw what binary search trees are now in this lesson we are going to implement binary search tree we will be writing some code for binary search tree prerequisites for this lesson is that you must understand the concepts of pointers and dynamic memory allocation
in c and c++. if you have already followed this series and seen our lessons on linked list, then implementation of binary search tree, or binary tree in general, is not going to be very different; we will have nodes and links here as well. okay, so let's get started. binary search tree or
bst as we know is a binary tree in which for each node value of all the nodes in left subtree is lesser or equal and value of all the nodes in right subtree is greater we can draw bst as a recursive structure like this value of all the nodes in left subtree must be lesser or equal
and value of all the nodes in right subtree must be greater and this must be true for all nodes and not just a root node so in this recursive definition here both left and right subtrees must also be binary search trees i have drawn a binary search tree of integers here now the question is
how can we create this non-linear logical structure in computer's memory i had talked about this briefly when we had discussed binary trees the most popular way is dynamically created nodes linked to each other using pointers or references just the way we do it for linked lists because in
a binary search tree or in a binary tree in general each node can have at most two children we can define node as an object with three fields something like what i'm showing here we can have a field to store data another to store address or reference to left child and another to store address or
reference to right child if there is no left or right child for a node reference can be set as null in c or c++ we can define node like this there is a field to store data here the data type is integer but it can be anything there is one field which is pointer to node node asterisk means
pointer to node this one is to store the address of left child and we have another one to store the address of right child this definition of node is very similar to definition of node for doubly linked list remember in doubly linked list also each node had two links
one to previous node and another to next node but doubly linked list was a linear arrangement this definition of node is for a binary tree we could also name this something like bst node but node is also fine let's go with node now in our implementation just like linked list
all the nodes will be created in dynamic memory or heap section of applications memory using malloc function in c or new operator in c++ we can use malloc in c++ as well now as we know any object created in dynamic memory or heap section of applications memory
cannot have a name or identifier, it has to be accessed through a pointer; malloc or the new operator returns a pointer to the object created in heap. if you want to revise some of these concepts of dynamic memory allocation you can check the description of this video for
link to a lesson it's really important that you understand this concept of stack and heap in applications memory really well now for a linked list if you remember the information that we always keep with us is address of the head node if we know the head node we can access all other nodes
using links in case of trees the information that we always keep with us is address of the root node if we know the root node we can access all other nodes in the tree using links to create a tree we first need to declare a pointer to bst node i'll rather call node bst node here bst
for binary search tree. so to create a tree we first need to declare a pointer to bst node that will always store the address of the root node. i have declared a pointer to node here named root ptr, ptr for pointer. in c you can't just write bst node asterisk root ptr, you will have to write struct space bst node asterisk, you will have to write struct here as well. i'm gonna write c++ here, but anyway right now i'm trying to explain the logic, we will not bother about minute details of implementation. in this logical structure of tree that i'm showing here each node as you can see
has three fields three cells leftmost cell is to store the address of left child and rightmost cell is to store address of right child let's say root node is at address 200 in memory and i'll assume some random addresses for all other nodes as well now i can fill in these left and right
cells for each node with addresses of left and right children in our definition data is first field but in this logical structure i'm showing data in the middle okay so for each node i have filled in addresses for both left and right child address is zero or null if there is no child now as we were
saying identity of the tree is address of the root node we need to have a pointer to node in which we can store the address of the root node we must have a variable of type pointer to node to store the address of root node all these rectangles with three cells are nodes they are created using
malloc or new operator and live in heap section of applications memory we cannot have name or identifier for them they are always accessed through pointers this root PTR root pointer has to be a local or global variable we will discuss this in a little more detail in some time
quite often we like to name this root pointer just root we can do so but we must not confuse this is pointer to root and not the root itself to create a BST as i was saying we first need to declare this pointer initially we can set this pointer as null to say that the tree is empty
a tree with no node can be called empty tree and for empty tree root pointer should be set as null we can do this declaration and setting the root as null in main function in our program actually let's just write this code in a real compiler i'm writing c++ here as you can see in
the main function i have declared this pointer to node which will always store the address of root node of my tree and i'm initially setting this as null to say that the tree is empty with this much of code we have created an empty tree but what's the point of having an empty tree we should
have some data in it so what i want to do now is i want to write a function to be able to insert a node in the tree i will write a function named insert that will take address of the root node and data to be inserted as argument and this function will insert a node with this data
at appropriate position in the tree in the main function i'll make calls to this insert function passing it address of root and the data to be inserted let's say first i want to insert number 15 and then i want to insert number 10 and then number 20 we can insert more but let's first
write the logic for insert function before i write the logic for insert function i want to write a function to create a new node in dynamic memory or heap this function get new node should take an integer the data to be inserted as argument create a node in heap using
new or malloc, and return back the address of this new node. i'm creating the new node here using this new operator; the operator will return me the address of the newly created node, which i'm collecting in this variable of type pointer to bst node. in c, instead of the new operator
we will have to use malloc we can use malloc in c++ as well c++ is only a super set of c malloc will also do here now anything in dynamic memory or heap is always accessed through pointer now using this pointer new node we can access the fields of the newly created node
i'll have to dereference this pointer using asterisk operator so i'm writing asterisk new node and now i can access the fields we have three fields in node data and two pointers to node left and right i've set the data here instead of writing asterisk new node dot data we have
this alternate syntax that we could use: you could simply write new node arrow data and this will mean the same; we have used this syntax quite a lot in our lessons on linked list. now for the new node we can set the left and right child as null, and finally we can return the address of the new node. the whole function could look roughly like the sketch below.
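a sketch along the lines just described; i'm calling the struct bst node here and using nullptr where the lesson writes null:

    struct BstNode {
        int data;        // data, integer here but it can be anything
        BstNode* left;   // address of left child
        BstNode* right;  // address of right child
    };

    // create a new node in heap and return its address
    BstNode* getNewNode(int data) {
        BstNode* newNode = new BstNode(); // in c we would use malloc instead of new
        newNode->data = data;             // same as (*newNode).data = data
        newNode->left = nullptr;          // no left child yet
        newNode->right = nullptr;         // no right child yet
        return newNode;                   // return the address of the new node
    }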
okay, coming back to the insert function, we can have a couple of cases in insertion. first of all, the tree may be empty; for this first insertion, when we are inserting this number 15, the tree will be empty. if the tree is empty we can simply create a new node and set it as root
with this statement, root equal get new node, i'm setting root as the address of the new node, but there is something not all right here: this root is a local variable of the insert function and its scope is only within this function. we want this root in main to be modified
this guy is a local variable of main function there are two ways of doing this we can either return the address of the new root so return type of insert function will be pointer to bst node and not void and here in the main function we will have to write statement
like root equal insert and the arguments so we will have to collect the return and update our root in main function another way is that we can pass the address of this root of main to the insert function this root is already a pointer to node so its address can be collected in a pointer to
pointer. so in the insert function the first argument will be a pointer to pointer, and here we can pass the address, we'll say ampersand root to pass the address. we can name this argument root or we can name this argument root ptr, we can name it whatever. now what we need to do is we
need to dereference this using asterisk operator to access the value in root of main and we can also set the value in root of main so here with this statement we are setting the value and the return type now can be void this pointer to pointer thing gets a little tricky
i'll go with the former approach actually there is another way instead of declaring root as a local variable in main function we can declare root as global variable global variable as you know has to be declared outside all the functions if root would be global
variable it would be accessible to all the functions and we will not have to pass the address stored in it as argument anyway coming back to the logic for insertion as we were saying if the tree is empty we can simply create a new node and we can simply set it as root at this stage we wanted to
insert 15 if we will make call to the insert function address of root is 0 or null null is only a macro for 0 and the second argument is the number to be inserted in this call to insert function we will make call to get new node let's say we got this new node at address 200 get new node
function will return us address 200 which we can set as root here but this root is a local variable we will return this address 200 back to the main function and in the main function we are actually doing this root equal insert so in the main function we are building this link
okay our next call in the main function was to insert number 10 at this stage root is 200 the address in root is 200 and the value to be inserted is 10 now the tree is not empty so what do we do if the tree is not empty we can basically have two cases if the data to be inserted is lesser
or equal we need to insert it in the left subtree of root and if the data to be inserted is greater we need to insert it in right subtree of the root so we can reduce this problem in a self-similar manner in a recursive manner recursion is one thing that we are going to use almost all the time while
working with trees in this function i'll say that if the data to be inserted is less than or equal to the data in root then make a recursive call to insert data in left subtree the root of the left subtree will be the left child so in this recursive call we are passing address of left child and
data as argument and after the data is inserted in left subtree the root of the left subtree can change insert function will return the address of the new root of the left subtree and we need to set it as left child of the current node in this example tree here right now both left and right
subtree are empty we are trying to insert number 10 so we have made call to this function insert from main function we have called insert passing it address 200 and value or data 10 now 10 is lesser than 15 so control will come to this line and a call will be made to insert data in left
subtree now left subtree is empty so address of root for left subtree is 0 data passed data to be inserted passed as argument is 10 now this first insert call will wait for this insert below to finish and return for this last insert call root is null let's say we got this node at address 150
now this insert call will return back 150 and execution of first insert call will resume at this line and now this particular address will be set as 150 so we will build this link and now this insert call can finish it can return back the current root actually this return root
should be there for all cases so i'm taking it out and i have it after all these conditions of course we will have one more else here if the data is greater we need to go insert in right subtree the third call in insert function was to insert number 20 now this time we will go to this
else statement, this statement in else. let's say we got this new node at address 300, so this guy will return 300; for this node at 200 the right child will be set as 300, and now this call to insert can finish, the return will be 200. okay, at this stage what if a call is made to insert number
25? we are at root right now, the node with address 200; 25 is greater, so we need to go and insert in the right subtree, and the right subtree is not empty this time. so once again, for this call also, with root as 300, we will come to this last else because 25 is greater than 20, and a recursive call will be made with root as null. now in this call we will go to the first if, a node will be created; let's say we got this node in heap at address 500. this particular call, insert with root null and data 25, will return 500 and finish. now for the node at 300 the right child will be set as 500, so this link will get built. now this guy will return 300, the root for this subtree has not changed, and this
first call to insert will also wrap up, it will return 200. so we are looking good for all cases, this insert function will work for all cases. we could write this insert function without using recursion, and i encourage you to do so; you will have to use some temporary pointer to node and loops. the recursive version we just walked through is sketched below.
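for reference, the recursive insert might look roughly like this; a sketch, assuming the bst node struct and get new node function from the earlier sketch in this lesson:

    // insert data into the bst rooted at 'root' and return the root of the modified tree
    BstNode* insert(BstNode* root, int data) {
        if (root == nullptr) {                    // case 1: tree is empty
            root = getNewNode(data);
        } else if (data <= root->data) {          // case 2: lesser or equal goes to the left subtree
            root->left = insert(root->left, data);
        } else {                                  // case 3: greater goes to the right subtree
            root->right = insert(root->right, data);
        }
        return root;                              // return root for all cases
    }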
recursion is very intuitive here and recursion is intuitive in pretty much everything that we do with trees so it's really important that we understand recursion really well okay i'll write one more function now to search some data in bst in the main function here i have made some more calls to
insert now i want to write a function named search that should take as argument address of the root node and the data to be searched and this function should return me true if data is there in the tree false otherwise once again we will have couple of cases if the root is null then we can return
false if the data in root is equal to the data that we are looking for then we can return true else we can have two cases either we need to go and search in the left subtree or we need to go in the right subtree so once again i'm using recursion here i am making recursive call to search function
in these two cases if you have understood the previous recursion then this is very similar let's test this code now what i've done here is i've asked the user to enter a number to be searched and then i'm making call to the search function and if this function is returning me true
i'm printing found, else i'm printing not found. let's run this code and see what happens; i have moved multiple insert statements onto one line because i'm short of space here. let's say we want to search for number eight: eight is found. and now let's say we want to search for 22: 22 is not found
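putting the pieces together, the whole program might look roughly like this; the exact set of insert calls is my reconstruction of what the lesson's main function does, using values from the example tree:

    #include <iostream>

    struct BstNode {
        int data;
        BstNode* left;
        BstNode* right;
    };

    BstNode* getNewNode(int data) {
        BstNode* newNode = new BstNode();
        newNode->data = data;
        newNode->left = nullptr;
        newNode->right = nullptr;
        return newNode;
    }

    BstNode* insert(BstNode* root, int data) {
        if (root == nullptr) root = getNewNode(data);
        else if (data <= root->data) root->left = insert(root->left, data);
        else root->right = insert(root->right, data);
        return root;
    }

    // returns true if data is there in the tree, false otherwise
    bool search(BstNode* root, int data) {
        if (root == nullptr) return false;                        // empty tree: not found
        if (root->data == data) return true;                      // found it
        if (data <= root->data) return search(root->left, data);  // go search in left subtree
        return search(root->right, data);                         // go search in right subtree
    }

    int main() {
        BstNode* root = nullptr;                  // empty tree
        root = insert(root, 15); root = insert(root, 10); root = insert(root, 20);
        root = insert(root, 25); root = insert(root, 8);  root = insert(root, 12);
        int number;
        std::cout << "enter number to be searched\n";
        std::cin >> number;
        if (search(root, number)) std::cout << "found\n";
        else std::cout << "not found\n";
        return 0;
    }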
so we are looking good i'll stop here now you can check the description of this video for link to all the source code we will do a lot more with trees in coming lessons in our next lesson we will go a little deeper and try to see how things move in various sections of applications
memory how things move in stack and heap sections of memory when we execute these functions it will give you a lot of clarity this is it for this lesson thanks for watching in our previous lesson we wrote some code for binary search tree we wrote functions to insert and search data in bst now in this lesson
we will go a little deeper and try to understand how things move in various sections of applications memory when these functions get executed and this will give you a lot of clarity this will give you some general insight into how memory is managed for execution of a program and how recursion
which is so frequently used in case of trees works the concepts that i'm going to talk about in this lesson have been discussed earlier in some of our previous lessons but it will be good to go through these concepts again when we are implementing trees so here is the code that we have written
we have this function get new node to create a new node in dynamic memory and then we have this function insert to insert a new node in the tree and then we have this function to search some data in the tree and finally this is the main function you can check the description of this
video for link to this source code now in main function here we have this pointer to bst node named root to store the address of root node of my tree and i'm initially setting it as null to create an empty tree and then i'm making some calls to insert function to insert some data in
the tree and finally i'm asking user to input a number and i'm making call to search function to find this number in the tree if the search function is returning me true i'm printing found else i'm printing not found let's see what will happen in memory when this program will execute
the memory that is allocated to a program or application for its execution in a typical architecture can be divided into these four segments there is one segment called text segment to store all the instructions in the program the instructions would be compiled instructions
in machine language there is another segment to store all the global variables a variable that is declared outside all the functions is called global variable it is accessible to all the functions the next segment stack is basically scratch space for function call execution
all the local variables, the variables that are declared within functions, live in stack. and finally the fourth section, heap, which we also call the free store, is the dynamic memory that can grow or shrink as per our need. the size of the other segments is fixed and decided at compile time, but heap can grow during runtime; we cannot control allocation or deallocation of memory in any other segment during runtime, but we can control allocation and deallocation in heap. we have discussed all of this in detail in our lesson on dynamic memory
allocation you can check the description for a link now what i'm going to do here is i'm going to draw stack and heap sections as these two rectangular containers i'm kind of zooming into these two sections now i'll show you how things will move in these two sections of applications
memory when this program will execute when this program will start execution first the main function will be called now whenever a function is called some amount of memory from the stack is allocated for its execution the allocated memory is called stack frame of the function call all the local
variables and the state of execution of the function call would be stored in the stack frame of the function call in the main function we have this local variable root which is pointer to bst node so i'm showing root here in this stack frame we will execute the instructions sequentially
in the first line in main function we have declared root and we are initializing it and setting it as null; null is only a macro for address 0, so here in this figure i'm setting the address in root as 0. now in the next line we are making a call to insert function, so what will happen is execution
of main will pause at this stage and a new stack frame will be allocated for execution of insert main will wait for this insert above to finish and return once this insert call finishes main will resume at line 2 we have these two local variables root and data in insert function in which we are
collecting the arguments now for this call to insert function we will go inside the first if condition here because root is null at this line we will make call to get new node function so once again execution of this insert call will pause and a new stack frame will be allocated
for execution of get new node function we have two local variables in get new node data in which we are collecting the argument and this pointer to bst node named new node now in this function we are using new operator to create a bst node in heap now let's say we got a new node at address 200
new operator will return us this address 200 so this address will be set here in new node so we have this link here and now using this pointer new node we are setting value in these three fields of node let's say the first field is to store data so we are setting value 15
here and let's say this second cell is to store address of left child this is being set as null and the address of right child is also being set as null and now get new node will return the address of new node and finish its execution whenever a function call finishes the stack frame
allocated to it is reclaimed call to insert function will resume at this line and the return of get new node address 200 will be set in this root which is local variable for insert call and now insert function this particular call to insert function will return the address of root
the address stored in this variable root, which is 200 now, and finish. and now main will resume at this line and root of main will be set as 200; the return of this insert call, insert root comma 15, will be set here. now in the execution of main, control will go to the next line
and we have this call to insert function to insert number 10 once again execution of main will be paused and a stack frame will be allocated for execution of insert now this time for insert call root is not null so we will not go inside the first if we will access the data field of this node
at address 200 using this pointer named root in insert function and we will compare it with this value 10 10 is lesser than 15 so we will go to this line and now we are making a recursive call here recursion is a function calling itself and a function calling itself is not any different from
a function a calling another function b so what will happen here is that execution of this particular insert call will be paused and a new stack frame will be allocated for execution of this another insert call to which the arguments passed are address 0 in this local variable root left child
of node at address 200 is null so we are passing 0 in root and in data we are passing 10 now for this particular insert call control will go inside first if and we will make a call to get new node function at this line so execution of this insert will pause and we will go to get new node function
here we are creating a new node in heap let's say we got this new node at address 150 now get new node will return 150 and finish execution of this call to insert will resume at this line return of get new node will be set here and now this call to insert will return address 150 and finish insert
below will resume at this line and now in this insert call left child of this node at address 200 will be set as return of the previous insert call which is 150 so now these two nodes are linked and finally this insert call will finish control will return back to main at this line root will
be rewritten as 200, but earlier also it was 200, so it's not changing. Next in the main function we have a call to insert number 20; I'm not going to show the simulation for this one. Once again the allocated memory in stack will grow and shrink, and finally, when the control will return back to
main function after this insert call is over we will have a node in heap with value 20 set as right child of this node at 200 let's say we got this new node with value 20 at address 300 so as you can see the address of right child in node at address 200 is set as 300 now next one is to insert number
25. This one is interesting; let's see what will happen for this one. Main will be paused and we will go to this call to insert; in the root which is local to this call, the address passed is 200, and we have passed number 25 in data. Now here 25 is greater than the value in this node at address 200, so we will go inside this last else condition. We need to insert in the right subtree, so another call to insert will be made; we will pass address 300 as root and the data passed will be 25 only. Now for this call, once again, the value in the node at 300 (for this call root is 300) is lesser than 25; 25 is greater than 20, so once again we will come to this last else and make a recursive call to insert in the right subtree. The right subtree is empty this time, so for this insert call at the top, the address in root here will be 0. So for this call we will go to the first if and make a call to get new node. Let's say get new node returns us a node at address 100 (I'm short of space, so I'm not showing everything in get new node's stack frame here). We will return back to this insert call at the top, and now this root is set as 100, the address of the newly created node. And now this call to insert will finish; we will
come back to this insert below and this insert will resume at this line inside the last else and the right child of node at address 300 will be set as 100 and now this insert will return back address 300 whatever is set in its root and this insert below will resume at this line
inside the last else, the right child of the node at address 200 will be set as 300. It was 300 previously also, so even after overwriting, it will not change, and this insert will finish. Now finally main will resume at this line; root of main will be set as the return of this insert call, so it will only be overwritten with the same value. It's really important that this root in main and all the links in the nodes are properly updated; quite often, because of bugs in our code, we will lose some links or some unwanted links will get created. Now as you can see, we are creating all the nodes in heap here. Heap gives us the flexibility that we can decide the creation of a node during runtime, and we can control the lifetime of anything in heap. Any memory claimed in heap has to be explicitly deallocated, using free in C or the delete operator in C++, else the memory in heap remains allocated till the program is running. The memory in stack, as you can see, gets deallocated when a function call finishes. The rest of the function calls here in the main function will execute in a similar manner; I'll leave it for you to see and think about. Right now we have this tree in the heap. Logically, memory itself is a linear structure, and this is how a tree, which is logically a non-linear structure, will fit in it: the nodes sit at random locations in the heap, linked to each other, the way I'm showing here. I hope this explanation gave you some clarity.
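For reference, here is a minimal sketch in C++ of the kind of code being traced in this simulation, assuming the three-field BST node and the insert and get new node functions described in this lesson; the exact names and the main function are illustrative, not necessarily the original code.

```cpp
#include <cstddef>  // for NULL

// Node of a binary search tree: data plus links to the left and right children.
struct BstNode {
    int data;
    BstNode* left;
    BstNode* right;
};

// Creates a new node in the heap and returns its address (like 200, 150, 300 in the trace).
BstNode* GetNewNode(int data) {
    BstNode* newNode = new BstNode();  // allocated on the heap with the new operator
    newNode->data = data;
    newNode->left = NULL;
    newNode->right = NULL;
    return newNode;
}

// Inserts data into the BST rooted at 'root' and returns the (possibly new) root address.
BstNode* Insert(BstNode* root, int data) {
    if (root == NULL)                            // empty tree or subtree: the node goes here
        root = GetNewNode(data);
    else if (data <= root->data)                 // lesser or equal values go into the left subtree
        root->left = Insert(root->left, data);
    else                                         // greater values go into the right subtree
        root->right = Insert(root->right, data);
    return root;
}

int main() {
    BstNode* root = NULL;         // empty tree; NULL is only a macro for address 0
    root = Insert(root, 15);
    root = Insert(root, 10);
    root = Insert(root, 20);
    root = Insert(root, 25);      // the calls traced in this lesson
    return 0;
}
```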
In coming lessons we will solve some problems on trees. This is it for this lesson; thanks for watching. In our previous lessons we wrote some basic code for a binary search tree, but to solidify our concepts we need to write some more code, so I've picked this simple problem for you:
given a binary search tree we want to find minimum and maximum element in it let's see how we can solve this problem i have drawn logical representation of a binary search tree of integers here as we know in a binary search tree for all nodes value of nodes in left sub tree
is lesser and the value of nodes in the right subtree is greater. This is how we can define a node for a binary search tree in C/C++: we can have a structure with three fields, one to store data, another to store the address of the left child and another to store the address of the right child, as we had seen
earlier in bst implementation identity of the tree that we always keep with us that we pass to functions is address of the root node so what i want to do here is i first want to write a function named find min that should take address of the root node as argument and return me the
minimum element in the tree and just like find min we can write another function named find max that can return us the maximum element in bst let's first see how we can find the minimum element there are two possible approaches here we can write an iterative solution in which we can use
a simple loop to find the minimum element or we can use recursion let's first see the iterative solution if we have a pointer to the root node and we want to find the minimum element in bst then from root we need to go left as long as it's possible to go using the left links
because in a bst for all nodes nodes in left have lesser value and nodes in right have greater value so we need to go left as long as it's possible we can start with a temporary pointer to root node we can name this pointer temp or we can name this pointer current to say that we are currently
pointing to this node in my function here i have declared this pointer to bst node named current and initially i'm setting the address of root in it and with this pointer we can go to the left child with a statement like current equal current arrow left we first need to check if there is a left child
and then we need to move the pointer we can use a while loop like this if the left child of current node is not null we can move this pointer current to the left child with this statement current equal current arrow left here in this example currently we are pointing to this node with value 15
it has a left child, so we can move to this node with value 10. Once again, this node too has a left child, so we can go left again. Now this node with value 8 does not have a left child, so we cannot go towards the left any further; we will come out of the while loop, and at this point the node that we are
pointing to has the minimum value, so we can return the data in that node. There is one case that we are missing in this function: if the tree is empty, we can throw some error, or we can return some value indicative of an empty tree. If I know that the tree would have only positive values, I can return something like minus one. So here in my function I have added this condition: if root is equal to null, that is, if the tree is empty, print this error and return minus one. One more thing: we do not need to use this extra pointer to BST node named current. Root here is a local variable
and we can use this root itself. So we can write our code like this: while root->left is not null, we can go left with the statement root = root->left, and finally we can return root->data, which is only alternate syntax for (*root).data. Modifying this local root is not going to modify my root in the main function, or whatever function I'm calling this find min function from. So this is our iterative solution to find the minimum element in a BST. The logic for finding the maximum is similar; the only difference is that instead of going left, we will have to go right all the time. I leave it for you to implement.
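Here is a minimal sketch of this iterative solution in C++, assuming the same three-field BstNode structure; the minus-one return for an empty tree is just the convention chosen in this lesson, and the find max variant that was left as an exercise is included only for comparison.

```cpp
#include <cstdio>
#include <cstddef>

struct BstNode {        // repeated here so the sketch is self-contained
    int data;
    BstNode* left;
    BstNode* right;
};

// Iterative FindMin: keep following left links from the root as long as possible.
int FindMin(BstNode* root) {
    if (root == NULL) {
        printf("Error: tree is empty\n");
        return -1;                   // value indicative of an empty tree (assumes positive data)
    }
    while (root->left != NULL)       // modifying this local root does not affect the caller's root
        root = root->left;
    return root->data;
}

// FindMax is identical except that it follows right links instead.
int FindMax(BstNode* root) {
    if (root == NULL) {
        printf("Error: tree is empty\n");
        return -1;
    }
    while (root->right != NULL)
        root = root->right;
    return root->data;
}
```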
Let's now see how we can find the minimum element using recursion. If we want to reduce this problem in a recursive manner, in a self-similar manner, then what we can say is: if the left subtree is not empty, then we can reduce the
problem to finding minimum in left subtree if left subtree is empty we already know the minimum because we cannot have a minimum in right subtree here is the recursion that we can write root being null is a corner case if root is null that is if the tree is empty we can throw error else if left
child of root is null we can return the data in root else if left child is not null or in other words if the left subtree is not empty we can reduce the problem to searching minimum in the left subtree so we are making this recursive call to find min passing it address of the left child
passing it address of the root of left subtree left child would be the root of left subtree this second else if is our base condition to exit from recursion if you had understood the recursion that we had written earlier to insert a node in bst then this recursion should not
be very difficult for you to understand. So here is our recursive solution to find the minimum in a BST; to find the maximum element, all we need to do is go searching in the right subtree instead.
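And a sketch of the recursive version, again assuming the BstNode structure and the includes from the previous sketch; the find max variant simply mirrors it towards the right subtree.

```cpp
// Recursive FindMin: the minimum is in the left subtree, unless there is no left subtree.
int FindMinRecursive(BstNode* root) {
    if (root == NULL) {                       // corner case: empty tree
        printf("Error: tree is empty\n");
        return -1;
    }
    if (root->left == NULL)                   // base case: no left child, so this node is the minimum
        return root->data;
    return FindMinRecursive(root->left);      // reduce to finding the minimum in the left subtree
}

// For the maximum we go searching in the right subtree instead.
int FindMaxRecursive(BstNode* root) {
    if (root == NULL) {
        printf("Error: tree is empty\n");
        return -1;
    }
    if (root->right == NULL)
        return root->data;
    return FindMaxRecursive(root->right);
}
```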
Okay, I'll stop here now; in coming lessons we will solve some more interesting problems on BST. Thanks for watching. In this lesson we're going to write code to find the height, or what we can also call the maximum depth, of a binary tree. We have already discussed depth and height in our first introductory lesson on trees, but I'll do a quick recap here. First of all, I've drawn a binary tree here;
i've not filled in any data in the nodes data can be anything binary tree as we know is a tree in which each node can have at most two children so a node can have zero one or two children i'll just number these nodes so i can refer to them i'll say this root node is number one and i'll go
level by level from left to right counting two three four and so on now height of a tree is defined as number of edges in longest path from root to a leaf node in this example three four six seven eight and nine are leaf nodes a leaf node is a node with
zero children number of edges in longest path from root to a leaf node is three for both eight and nine number of edges in path from root is three so height of the tree is three actually we can define height of a node in the tree as number of edges in longest path from that node to a leaf
node so height of a tree basically is height of the root node in this example tree height of node three is one height of node two is two and height of node one is three and because this is the root node this is also the height of the tree height of a leaf node would be zero so if a tree has only
one node then the root node itself would be a leaf node and so height of the tree would be zero so this is definition of height of a tree we often also talk about depth and we often confuse between depth and height but these two are different properties
depth of a node is defined as the number of edges in the path from the root to that node. Basically, depth is distance from the root, and height is distance to the farthest reachable leaf node. For node two in this example tree, depth is one and height is two; for node number nine, which is a leaf node, depth is three and height is zero; for the root node, depth is zero and height is three. The height of a tree would be equal to the maximum depth of any node in the tree, so height and max depth, these two terms are used interchangeably. Okay, let's now see how we can calculate the height
or max depth of a binary tree i'm going to write a function named find height that will take reference or address of the root node as argument and return me the height of the binary tree now the logic to calculate height can be something like this for any node if we can somehow calculate
the height of its left subtree and also the height of its right subtree then the height of that node would be greater of the heights of left and right subtrees plus one for the root node in this tree height of the left subtree is two and height of the right subtree is one so height of the root
node would be greater of these two values plus one plus one for the edge connecting the root node to the subtree so height of the root node which would also be the height of the tree is three here in our code we can calculate height of left and right subtrees using recursion what i'll do here
in the find height function is: I'll first make a recursive call to find the height of the left subtree (we can say "find the height of the left subtree" or "find the height of the left child"; both mean the same), and I'm collecting the return of this recursive call in a variable named left height
and now i'll make another recursive call to calculate height of right subtree or right child now height of the tree or height of whatever node for which we have made this function call would be greater of these two values left height and right height plus one now there is only one
more thing missing in this recursion we need to write the base or exit condition we cannot go into recursion infinitely what we can do is we can go on till we make a recursive call with root equal null and if root is null that is if the tree or subtree is empty we can return something what
should we return here give this some thought if i have made a call to find height of let's say this leaf node this node with number seven then for this guy both left and right children are null in call for this node number seven we will make two recursive calls passing null in both the calls
so what should we return should we return zero if these two calls will return zero then height of seven will be one because in the return statement here we're saying max of left and right height plus one but as we had discussed earlier height of a leaf node should be zero so if we are returning
zero for root equal to null, it's not right. What we can do is return minus one; when we return minus one, this edge to null, which does not exist but was still getting counted, will be balanced out by the minus one. I hope this is making sense. And going by convention
also, the height of an empty tree is taken to be minus one. So this is pseudocode for my function to find the height of a binary tree. Some people define height as the number of nodes in the longest path from the root to a leaf node; we are counting edges here, and this is the more standard definition. If you want to count the number of nodes, then for a leaf node the height would be one and for an empty tree the height would be zero, so all you need to do is return zero here, and that is the code if you want to count the number of nodes. But I think the right definition is the number of edges, so I'll return minus one here.
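A minimal C++ sketch of the find height function just described, assuming a generic three-field node (the data type does not matter for computing height) and a small max helper that returns the greater of two values.

```cpp
#include <cstddef>

// Node of a binary tree; the data type does not matter for computing height.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Returns the greater of two integers.
int Max(int a, int b) {
    return (a > b) ? a : b;
}

// Height = number of edges on the longest path from this node down to a leaf.
// By convention the height of an empty tree is -1, so a leaf node gets height 0.
int FindHeight(Node* root) {
    if (root == NULL)
        return -1;                              // return 0 here instead if you count nodes, not edges
    int leftHeight = FindHeight(root->left);
    int rightHeight = FindHeight(root->right);
    return Max(leftHeight, rightHeight) + 1;    // greater of the two subtree heights, plus one edge
}
```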
The time complexity of this function is big O(n), where n is the number of nodes in the tree. We will make one recursive call corresponding to each node in the tree, so we are kind of visiting each node in the tree once, and so the running time will be proportional to the number of nodes. I'll skip a detailed analysis of the running time in this lesson. This is what my find height function will look like in C or C++; max here is a function that will return the greater of the two values passed to it as arguments. So this is it for this lesson, thanks for watching. In this lesson we are going to talk
about binary tree traversal when we are working with trees we may often want to visit all the nodes in the tree now tree is not a linear data structure like array or linked list in a linear data structure there would be a logical start and a logical end so we can start
with a pointer at one of the ends and keep moving it towards the other end for a linear data structure like linked list for each node or element we would have only one next element but tree is not a linear data structure i have drawn a binary tree here data type is
character this time i fill in these characters in the nodes now for a tree at any time if we are pointing to a particular node then we can have more than one possible directions we can have more than one possible next nodes in this binary tree for example if we will start with a pointer
at root node then we have two possible directions from f we can either go left to d or we can go right to j and of course if we will go in one direction then we will somehow have to come back and go into the other direction later so tree traversal is not so straightforward
and what we are going to discuss in this lesson is algorithms for tree traversal tree traversal can formally be defined as the process of visiting each node in the tree exactly once in some order and by visiting a node we mean reading or processing data in the node for us in this lesson
visit will mean printing the data in the node based on the order in which nodes are visited tree traversal algorithms can broadly be classified into two categories we can either go breadth first or we can go depth first breadth first traversal and depth first traversal are general techniques
to traverse or search a graph. A graph is a data structure, and we have not talked about graphs so far in this series; we will discuss graphs in later lessons. For now, just know that a tree is only a special kind of graph, and in this lesson we are going to discuss breadth first and depth first
traversal in context of trees in a tree in breadth first approach we would visit all the nodes at same depth or level before visiting the nodes at next level in this binary tree that i'm showing here this node with value f which is the root node is at level 0 i'm writing l0 here for level 0
depth of a node is defined as number of edges in path from root to that node root node would have depth 0 these two nodes d and j are at depth 1 so we can say that these nodes are at level 1 now these four nodes are at level 2 these three nodes are at level 3 and finally this node with
value h is at level 4 so what we can do in breadth first approach is that we can start at level 0 we would have only one node at level 0 the root node so we can visit the root node i'll write the value in the node as i'm visiting it now level 0 is done now i can go to level 1 and visit the
nodes from left to right. So after f we would visit d and then we would visit j, and now we are done with level 1. So we can go to level 2; now we will go like b, then e, then g and then k. And now we can go to level 3: a, c and i. And finally I can go to level 4. This kind of breadth first traversal in case
of trees is called level order traversal and we will discuss how we can do this programmatically in some time but this is the order in which we would visit the nodes we would go level by level from left to right in breadth first approach for any node we visit all its children before visiting
any of its grandchildren in this tree first we are visiting f and then we are visiting d and then we are not going to any child of d like b or e along the depth next we are going to j but in depth first approach if we would go to a child we would complete the whole subtree of the
child before going to the next child in this example tree here from f the root node if we are going left to d then we should visit all the nodes in this left subtree that is we should finish this left subtree in its complete depth or in other words we should finish all the grandchildren of f
along this path before going to right child of f j and once again when we will go to j we will visit all the grandchildren along this path so basically we will visit the complete right subtree in depth first approach the relative order of visiting the left subtree the right subtree
and the root node can be different for example we can first visit the right subtree and then the root and then the left subtree or we can do something like we can first visit the root and then the left subtree and then the right subtree so the relative order can be different but the core idea
in depth first strategy is that visiting a child is visiting the complete subtree in that path and remember visiting a node is reading processing or printing the data in that node based on the relative order of left subtree right subtree and the root there are three popular
depth first strategies. One way is that we can first visit the root node, then the left subtree and then the right subtree; left and right subtrees will be visited recursively in the same manner. Such a traversal is called pre-order traversal. Another way is that we can first
visit the left subtree then the root and then the right subtree such a traversal is called in order traversal and if root is visited after left and right subtrees then such a traversal is called post order traversal in total there are six possible permutations for left right and root
but conventionally a left subtree is always visited before the right subtree so these are the three strategies that we use only the position of root is changing here if it's before left and right then it's pre-order if it's in between it's in order and if it's after left and right subtrees
then it's post order there is an easy way to remember these three depth first algorithms if we can denote visiting a node or reading the data in that node with letter d going to the left subtree as l and going to the right subtree as r so if we can say d for data l for left and r for
right then in pre-order for each node we will go d l r first we will read the data in that node then we will go left and once the left subtree is done we will go right in in order traversal first we will finish the left subtree then we will read the data in current node
and then we will go right in post order for each node first we will go left once left subtree is done we will go right and then we will read the data in current node so pre-order is data left right in order is left data right and post-order is left right and then data
pre-order in order and post-order are really easy and intuitive to implement using recursion but we will discuss implementation later let's now see what will be the pre-order in order and post-order traversal for this tree that i've drawn here let's first see what will be
the pre-order traversal for this binary tree. We need to start at the root node, and for each node we first need to read the data, or in other words visit that node. In fact, instead of D L R we could have said V L R here, V for visit; we can use either notation, V for visit or D for data.
i will go with d l r here so let's start at the root for each node we first need to read the data i'm writing f here the data that i just read and now i need to go left and finish the complete left subtree and once all the nodes in the left subtree are visited then only i can go to the right
subtree the problem here is actually getting reduced in a self-similar or recursive manner now we need to focus on this left subtree now we are at d root of this left subtree of f once again for this node we will first read the data and now we can go left we will go towards e
only when these three nodes a, b and c are done. Now we are focusing on this subtree comprising these three nodes. Now we are at b; we can read the data, and now we can go left to a. There is nothing in the left of a, so we can say that for a the left subtree is done, and there is nothing in the right as well, so we can say the right is also done. Now for b the left subtree is done, so we can go right to c; the left and right of c are null. And now for d the left subtree is done, so we can go right; once again for e the left and right are null. And now at this stage, for f, the complete left subtree is visited, so we can go right. Now we need to go to the left of j, and there is nothing in the left of g, so we can go right, and now we can go to the left of i; for h there is nothing in left and right. Now at this stage the left subtree of i is done and the right subtree is null, and now we can go back to j
the left subtree for j is done so we can go to its right subtree finally we have k here and we are done with all the nodes this is how we can perform a preorder traversal manually actual implementation would be a simple recursion and we will discuss it later let's now see what will be the in order
traversal for this binary tree in in order traversal we will first finish visiting the left subtree then visit the current node and then go right once again we will start at the root and we will first go left now we will first finish this subtree once again for d we will first go left to b
and from b we will go to a now for a there is nothing in left so we can say that for this guy left subtree is done so we can read the data and now we can go to its right but there is nothing in right as well so this guy is done now for b left subtree is done so we can read the data
and now for b we can go right for c once again there is nothing in left so we can read the data and there is nothing in right as well now left of d is completely done so we can visit it read the data here now we can go to its right to e for e once again left and right and null at this
stage, the left subtree of f is done, so we can read the data, and now we can go to the right of f. If we go on like this, this finally will be my in-order traversal. This tree that I'm showing here is actually a binary search tree; for each node, the value of nodes in the left is lesser
and the value of nodes in right is greater so if we are printing in this order left subtree and then the current node and then the right subtree then we would get a sorted list in order traversal of a binary search tree would give you a sorted list okay now you should be able to figure
out the post order traversal this is what we will get for post order traversal i leave it for you to see whether this is correct or not i'll stop here now in next lesson we will discuss implementation of these tree traversal algorithms thanks for watching in this lesson we are going to
write code for level order traversal of a binary tree as we have discussed in our previous lesson in level order traversal we visit all nodes at a particular depth or level in the tree before visiting the nodes at next deeper level for this binary tree that i'm showing here if i have to
traverse the tree and print the data in nodes in level order then this is how we will go we will start at level 0 and print f and now we are done with level 0 so we can go to level 1 and we can visit the nodes at level 1 from left to right from f we will go to d and from d we
will go to j now level 1 is done so we can go to level 2 so we will go like b e g and then k and now we can go to next level aci and finally we will be done at h this is the order in which we should visit the nodes but the question is how can we visit the nodes in this order in a program
Unlike with a linked list, we can't just have one pointer and keep moving it. If I start with a pointer at the root, let's say I have a pointer named current to point to the current node that I'm visiting, then it's possible for me to go from f to d using this pointer, because there is a link, so I can go
left to d but from d i cannot go to j because there is no link from d to j the only way we can go to j is from f and once we have moved the pointer to d we can't even go back to f because there is no backward link from d to f so what can we do to traverse the nodes in level order clearly
we can't go with just one pointer. What we can do is, as we visit a node, keep the reference or address of all its children in a queue, so we can visit them later. A node in the queue can be called a discovered node: a node whose address is known to us but which we have not visited yet. Initially we can start with the address of the root node in the queue, to mean that initially this is the only discovered node. Let's say for this example tree the address of the root node is 400, and I'll assume some random addresses for the other nodes as well. I will mark a discovered node in yellow. Okay, now initially I'm enqueuing the root node, and by storing a node in the queue I'll mean storing the address of the node in the queue. So initially we are starting with one discovered node. Now, as long as the queue has at least one discovered node, that is, as long as the queue is not empty,
we can take out a node from the front visit it and then enqueue its children visiting a node for us is printing the value in that node so i'll write f here and now i'll enqueue the children of this root node first i'll enqueue the left child and then the right child i'll mark visited
node in another color okay now we have one visited node and two discovered node and now once again we can take out the node at front of the queue visit it and enqueue its children by using a queue we are doing two things here first of all as we are moving from a node
we are not losing the reference to its children, because we are storing the references; and then, because a queue is a first-in-first-out structure, a node that is discovered first, that is, inserted first, will be visited first, so we will get the order that we are desiring. Give this some thought,
and it's not very difficult to see okay so now we can dequeue and visit this node at address 200 and once again before i move on from this node i need to enqueue its children so now at this stage we have two visited nodes three discovered nodes
and six undiscovered nodes and now we can take out the next node from front of queue we'll visit it and enqueue its children if we will go on like this all the nodes will be visited in the order that we are desiring at this stage we can dequeue node at 120 visit it
and then enqueue its children. We will go on like this until all the nodes are visited and the queue is empty. After b we will have e here; nothing will go into the queue this time. Next we will have g here, and the address of i will go into the queue. Now k will be visited. Now at this stage we have
reference to three nodes in the queue now we will visit this node at 320 with value a then we have c and now we will print i and the node with value h the node with data h will go into the queue finally we will visit this node and now we are done with all the nodes the queue is empty
once the queue is empty we are done with our traversal so this is the algorithm for level order traversal of a binary tree as you saw in this approach at any time we are keeping a bunch of addresses in the memory in the queue instead of using just one pointer to move around
so of course we are using a lot of extra memory and i'll talk about the space complexity of this algorithm and sometime i hope you got the core logic right let's now write code for this algorithm i'm going to write c plus plus here this is how i'm defining node for my binary tree
i have a structure here with three fields one to store data and the data type is character this time because in the example tree that we were showing earlier data type was character and we have two more fields that are pointers to node one to store the address of left child and another
to store the address of right child now what i want to do here is i want to write a function named level order that should take address of the root node as argument and print the data in the nodes in level order now to test this function i'll have to write a lot of code to create and insert
nodes in a binary tree i'll have to write some more functions i'll skip writing all that code you can pick the code for creation and insertion from previous lessons all i'll write is this function level order now in this function here i'll first take care of one corner case if the tree is empty
that is, if root is null, we can simply return. Else, if the tree is not empty, we need to create a queue. I'm not going to write my own implementation of a queue here; in C++ we can use the queue in the standard template library, and to use it we'll first have to write a statement like #include <queue>. Now I can create a queue of any type; in this function I'll create a queue of pointer to node with a statement like this. Now, as we had discussed earlier, initially we start with one discovered node in the queue; the only node known to us initially is the root node. With this statement, queue.push(root), I have inserted the address of the root node into the queue, and now I'll run a while loop for which the condition is that the queue should not be empty, and what I really mean here is that while there is at least one discovered node we should go inside the loop. Inside the
loop we should take out a node from the front. The function front returns the element at the front of the queue, and because the data type is pointer to node, I'm collecting the return of this function in a pointer to node named current. Now I can visit the node being pointed to by current, and by
visiting if we mean reading the data in that node i'll simply print the data and now we want to push the addresses of children of this node into the queue so i'll say that if the left child is not null insert it into the queue and similarly if right child is not null push it into the queue or
rather, push its address into the queue. And I'll write one more statement to remove the element from the front of the queue: with the call to front, the element is not removed from the queue; with this call to pop, we are removing the element. Okay, so this is the implementation of level order traversal in C++. You can check the description of this video for a link to the source code, and there you can also find all the extra code to test this function.
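For reference, a minimal sketch of the level order function just described, assuming the character node structure defined in this lesson and the STL queue; the variable names are illustrative.

```cpp
#include <cstdio>
#include <cstddef>
#include <queue>

struct Node {
    char data;
    Node* left;
    Node* right;
};

// Level order (breadth-first) traversal using a queue of discovered nodes.
void LevelOrder(Node* root) {
    if (root == NULL) return;             // corner case: empty tree
    std::queue<Node*> q;
    q.push(root);                         // initially the root is the only discovered node
    while (!q.empty()) {                  // while there is at least one discovered node
        Node* current = q.front();        // front returns the element, it does not remove it
        printf("%c ", current->data);     // visit the node
        if (current->left != NULL)  q.push(current->left);
        if (current->right != NULL) q.push(current->right);
        q.pop();                          // now remove the visited node from the front of the queue
    }
}
```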
Let's now talk about the time and space complexity of level order traversal. If there are n nodes in the tree, and in this algorithm a visit to a node is reading the data in that node and inserting its children into the queue, then a visit to a node will take constant time and each node will be visited exactly once, so the time taken will be proportional to the number of nodes. In other words, we can say that the time complexity is big
o of n for all cases irrespective of the shape of the tree time complexity of level order traversal will be big o of n now let's talk about space complexity space complexity as we know is the measure of rate of growth of extra memory used with input size we are not using constant amount
of extra memory in this algorithm we have this queue that will grow and shrink while executing this algorithm assuming that the queue is dynamic maximum amount of extra memory used will depend upon maximum number of elements in the queue at any time we can have couple of cases in some cases
extra memory used will be lesser and in some cases extra memory used will be greater for a tree like this where each node has only one child we will have maximum one element in the queue at any time during each visit one node will be taken out from the queue and one node
will be inserted so the amount of extra memory taken will not depend upon the number of nodes space complexity will be big o of one but for a tree like this the amount of extra memory used will depend upon the number of nodes in the tree this is a perfect binary tree all the levels
are full if you can see as the algorithm will execute at some point for each level all the nodes in that level will be in the queue in a perfect binary tree we will have n by 2 nodes at the deepest level so maximum number of nodes in the queue is going to be at least n by 2
so basically the extra memory used is proportional to n, the number of nodes, and the space complexity will be big O(n) for this case. I'm not going to prove it, but for the average case the space complexity will also be big O(n); so for both worst and average cases it will be big O(n) in terms of space complexity. And when we are saying best, average and worst cases here, it's only going by space complexity; time complexity will be big O(n) for all cases. So this is the time and space complexity analysis of level order traversal. I'll stop here now; in the next lesson we will discuss depth first
traversal algorithms pre-order in order and post-order this is it for this lesson thanks for watching in our previous lesson we talked about level order traversal of binary tree which is basically breadth first traversal now in this lesson we are going to discuss these three depth first
algorithms pre-order in order and post-order i have drawn a binary tree here data type filled in the nodes is character now as we had discussed in earlier lessons in depth first traversal of binary tree if we go in one direction then we visit all the nodes in that direction or in other
words we visit the complete subtree in that direction and then only we go in other direction in this example tree that i've drawn here if i'm at root and i'm going left then i'll visit all the nodes in this left subtree and then only i can go right and once again when i'll go right
i'll visit all the nodes in this right subtree if you can see in this approach we are reducing the problem in a self-similar or recursive manner we can say that in total visiting all the nodes in the tree is visiting the root node visiting the left subtree and visiting the right
subtree remember by visiting a node we mean reading or processing the data in that node and by visiting a subtree we mean visiting all the nodes in the subtree in depth first strategy relative order of visiting the left subtree right subtree and the root can be different
for example we can first visit the right subtree then the root and then the left subtree or we can first visit the root and then the left subtree and then the right subtree conventionally left subtree is always visited before right subtree with this constraint we will
have three permutations we can first visit the root and then the left subtree and then the right subtree and such a traversal will be called pre-order traversal or we can first visit the left subtree then the root and then the right subtree and such a traversal will be called
in order traversal and we can also go left right and then root and such a traversal will be called post-order traversal left and right subtrees will be visited recursively in same manner as the original tree so in pre-order once again for the subtrees we will go root left and then right
in in-order we'll keep going left, root and then right. The actual implementation of these algorithms is really easy and intuitive. Let's first see code for pre-order traversal; I have first written the algorithm in words here. In pre-order traversal we first need to visit
the root then the left subtree and then the right subtree now I want to write a function that should take pointer or reference to root node as argument and print data in all the nodes in pre-order let's say visiting a node for us is printing the data in that node in c or c++ my method signature
will look something like this this function will take address of the root node as argument argument type is pointer to node I'll define node as a structure with three fields like this data type in this definition is character and there are two fields to store the addresses of
left and right children now in pre-order function I'll first visit or print the data in root node and now I'll make a recursive call to visit the left subtree I have made a recursive call here and to this call I'm passing address of the left child of my current root because left child will be
the root of left subtree and I'll have another call like this to visit the right subtree there is one more thing that we need to add in this function and we will be done we cannot go into recursion infinitely we need to have a base condition where we should exit if a tree or subtree is empty or
in other words for any call if root is null we can return or exit now with this much of code I'm done with my pre-order function this will work fine in c or c++ actually in c make sure you write struct space node instead of writing just node rest of the things are fine it will be good to
It will be good to visualize this recursion, so let's now quickly see how this pre-order function will work if the example tree that I'm showing on the right here is passed to it. I'll redraw this tree and show it like this; here I'm depicting a node as a structure with three fields. Let's say the leftmost cell here
is to store the address of left child the cell in middle is to store the data and the rightmost cell is to store the address of right child now let's assume some addresses for these nodes let's say the root node is at address 200 and I'll assume some random addresses for other nodes as
well and now I can fill in left and right fields for each node and as we know the identity of tree that we always keep with us is reference or address of the root node this is what we pass to all the functions in our implementation we often use a variable of type pointer to node named root to
store the address of root node we can name this variable anything we can name this variable root or we can name this variable root ptr but this is just a pointer this particular block that I'm showing here is for pointer to node and all these rectangles with three cells are nodes
this is how things are organized in memory now for this tree let's say we are making a call to this pre-order function I'll make a call to pre-order passing it address 200 for this call root is not null so we will not return at first line in this function we will go ahead and print
the data in this node at address 200 I'll write output for all print statements here and now this function will make a recursive call execution of this particular function call will pause it will resume only after this recursive call pre-order 150 finishes this second call is to visit this
left subtree. This call pre-order 150 is to visit this left subtree; the address of the left child of the node at 200 is 150. Once again for this call root is not null, so we will go ahead and print the data in the node at 150, which is d. And now once again there will be a recursive call; with this call pre-order 400
we are saying that we're going to visit this subtree once again we will print the data and make another recursive call now we have made a call to visit this particular subtree with just one node for this call we will print the data and now for node at 250 address of left child is zero
or null. We will make a call pre-order 0, but for this call we will simply return, because the address in this variable root will be null; we have hit the base condition for our recursion. The call to pre-order 0 will finish and pre-order 250 will resume. Now in this particular
function call we'll make another call for right subtree for node at 250 even the right child is null we will have another recursive call passing address zero but this once again will simply return and now call to pre-order 250 will finish and call to pre-order 400 will resume
now in call to pre-order 400 we will make another recursive call to pre-order 180 with this call pre-order 180 we are visiting this particular subtree with just one node for this call first we will print the data and then we will make a recursive call to pre-order zero
now pre-order zero will simply return and then we will have another call to pre-order zero for right child of 180 the recursion will go on like this there is one thing that i want to talk about here that's happening in this whole process even though we are not using any extra memory
explicitly in our function because of the recursion we are growing the function call stack we have discussed memory management a number of times in our earlier lessons you can check description of this video for a link to one of those lessons as we know for
each function call we allocate some amount of memory in what we call stack section of applications memory and this allocated memory is reclaimed when the function call finishes at this stage of execution of my recursion for this example my call stack will look something like this i'm writing
p as shortcut for pre-order because i'm short of space here let's say we made a call to pre-order passing it address 200 from main function main function will be at bottom of stack at any time only the call at top of stack will be executing and all other calls will be paused call stack keeps
growing and shrinking during execution of a program because memory is allocated for a new function call and it's reclaimed when a function call finishes so even though we are not using any extra memory explicitly here we are using memory implicitly in the call stack so space complexity
which is measure of rate of growth of extra memory used with input will depend upon the maximum amount of extra memory used in the call stack i'll talk about space complexity once more later for now let's come back to this recursion that i was executing call to this pre-order 0 will
finish and pre-order 180 will resume memory allocated for execution of pre-order 0 will be reclaimed now for pre-order 180 both recursive calls have finished so this guy will also finish even for pre-order 400 both calls have finished so pre-order 150 will resume now this guy will
make a recursive call to pre-order function passing it address 450 address of its right child memory in the stack will be allocated for execution of pre-order 450 now in this call we will first print the data and then we will make two recursive calls to pre-order passing
address 0 each time because for this node at 450 both children are null both calls will simply return and then pre-order 450 will finish and now pre-order 150 will also be done if you can see the call stack will grow only till we reach a leaf node a node with no children and then it will
start shrinking again maximum growth of call stack due to this recursion will depend upon maximum depth or height of the tree we can say that extra space used will be proportional to height of the tree or in other words space complexity of this algorithm is big o of h
where h is height of the tree okay coming back to the recursion we are done with pre-order 150 so pre-order 200 will resume and now we will make a call to visit this particular sub tree in this call we will print j and then we will make a call passing address 60 so now we are
visiting this particular sub tree here we will first print g and then this guy will make a call to pre-order 0 which will simply return and then there will be another call to pre-order 500 here we will print i and then we will make two recursive calls passing address 0 every time because
node at 500 is a leaf node with no children after this guy finishes pre-order 60 will resume now this guy will also finish and pre-order 350 will resume and now we will have a call to pre-order 700 which once again is a leaf node so k which is data in this node will be printed
and then we will make two calls passing address 0 which will simply return now at this stage all these calls can finish we are done visiting all the nodes finally we will return back to the caller of pre-order 200 which probably would be the main function so this is pre-order traversal
for you. I hope you got how this recursion works. Code for in-order and post-order will be very similar. In in-order traversal my base case will be the same, so I'll say if root is null then return or exit. If root is not null, I first need to visit the left subtree; I am visiting the left
subtree with this recursive call. Then I need to visit the root, so now I'm writing the printf statement to print the data, and now I can visit the right subtree with this second recursive call. And this is my in-order function. The in-order traversal of this example tree that I have drawn here
will be this. This particular binary tree is actually also a binary search tree, and the in-order traversal of a binary search tree would give us the elements in the tree in sorted order. Okay, let's now write code for post-order. For this function, once again the base case will be the same, so I'll say if root is null, return or exit. If root is not null, I first need to visit the left subtree, so I have made this recursive call; then the right subtree, so I'll have another recursive call; and now I can visit the root node. The post-order traversal for this example tree will be this. So this is pre-order, in-order and post-order for you. You can check the description of this video for a link to all the source code.
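And these are sketches of the in-order and post-order functions described above, assuming the same node structure and printf visit as in the pre-order sketch.

```cpp
// In-order traversal: Left, Data, Right.
// For a binary search tree this prints the elements in sorted order.
void Inorder(Node* root) {
    if (root == NULL) return;
    Inorder(root->left);
    printf("%c ", root->data);
    Inorder(root->right);
}

// Post-order traversal: Left, Right, Data.
void Postorder(Node* root) {
    if (root == NULL) return;
    Postorder(root->left);
    Postorder(root->right);
    printf("%c ", root->data);
}
```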
Let's now quickly talk about the time and space complexity of these algorithms. The time complexity of all three algorithms is big O(n): if you noticed, there was one function call corresponding to each node where we were actually visiting that node, where we were actually printing its data, so the running time should be proportional to the number of nodes. There is a better, formal and mathematical
way of proving that time complexity of these algorithms is big o of n you can check the description of this video for link to that space complexity as we had discussed earlier will be big o of h where h is height of the tree height of a tree in worst case will be n minus 1 so in worst case
space complexity of these algorithms can be big o of n in best or average case height of a tree will be big o of log n to the base two so we can say that in best or average case space complexity will be big o of log n i'll stop here now in coming lessons we will solve
some problems on binary tree thanks for watching in this lesson we are going to solve a simple problem on binary tree which is also a famous programming interview question and the problem is given a binary tree we need to check if the binary tree is a binary search tree or not
As we know, a binary tree is a tree in which each node can have at most two children. All these trees that I have drawn here are binary trees, but not all of them are binary search trees. A binary search tree, as we know, is a binary tree in which, for each node, the value of all the nodes in
left subtree is lesser and if we want to allow duplicates we can say lesser or equal and value of all the nodes in right subtree is greater we can define binary search tree as a recursive structure like this elements in left subtree must be lesser or equal and elements in right
subtree must be greater and this should be true for all nodes and not just the root node so left and right subtrees should themselves also be binary search trees of these binary trees that i'm showing here a and c are binary search trees but b and d
are not. In b, for the root node with value 10, we have 11 in its left subtree, which is greater than 10, and in a binary search tree, for any node, all values in its left subtree must be lesser. In d we are good for the root node: the value in the root node is 5, and we have 1 in the left subtree
which is lesser and we have 8 9 and 12 in right subtree which are greater so we are good for the root node but for this node with value 8 we have 9 in its left so this tree is not a binary search tree so how should we go about solving this problem basically i want to write a function
that should take pointer or reference to root node of a binary tree as argument and the function should return true if the binary tree is bst false otherwise this is how my method signature will look like in c plus plus in c we do not have boolean type so return type here can be int
we can return 1 for true and 0 for false i'll also write the definition of node here for a binary tree node would be a structure with 3 fields one to store data and two to store addresses of left and right children in my definition of node here data type is integer
and we have two pointers to node to store addresses of left and right children okay coming back to the problem there are multiple approaches and we are going to talk about all of them the first approach that i'm going to talk about is easy to think of but it's not so efficient
but let's discuss it anyway we are saying that for a binary tree to be called binary search tree it should have a recursive structure like this for the root node all the elements in left sub tree must be lesser or equal and all the elements in right sub tree must be greater
and left and right subtrees should themselves also be binary search trees. So let's just check for all of this. I'm going to write a function named is subtree lesser that will take the address of the root node of a binary tree or subtree and an integer value as arguments, and this function will return true if all the elements in the subtree are lesser than this value. And similarly I'll write another function named is subtree greater that will return true if all the elements in a subtree are greater than a given value. I have just declared these functions;
I'll write the body of these functions later. Let's come back to this function is binary search tree. In this function I'm going to say that if all elements in the left subtree are lesser, and I'll verify this by making a call to the is subtree lesser function, passing it the address of the left child of my current root (the left child would be the root of the left subtree) and the data in root; this function call will return true if all the elements in the left subtree are lesser than the data in root. Now the next thing that I want to check is whether the elements in the right subtree are greater than the data in root or not. These two conditions are not sufficient; we also need to check if the left and right subtrees are binary search trees or not, so I'll add two more conditions here. I have made a recursive call to the is binary search tree function, passing it the address of the left child, and I have made another call passing the address of the right child. If all these four function calls, is subtree lesser, is subtree greater, and is binary search tree for left and right subtrees, return true, if all these four checks pass, then our tree is
a binary search tree we can return true else we need to return false there is only one thing that we are missing in this function now we are missing the base case if root is null that is if the tree or sub tree is empty we can return true this is the base case for our recursion
where we should stop. With this much code, the is binary search tree function is complete, but let's also write the is subtree lesser and is subtree greater functions, because they're also part of our logic. This function has to be a generic function that should check
if all the elements in a given tree are lesser than a given value or not we will have to traverse the complete tree or sub tree and see value in all the nodes and compare these values against this given integer i'll first handle the base case in this function if the tree is empty we can
return true else we need to check if the data in root is less than or equal to the given value and we also need to recursively check if left and right sub trees of the current root have lesser value or not so i'm adding two more conditions here i'm making two recursive calls one for the
left subtree and another for the right subtree. If all these three conditions are true, then we are good; else we can return false. The is subtree greater function will be very similar. Instead of writing these two functions, is subtree lesser and is subtree greater, we could also do something like this: we could find the maximum in the left subtree and compare it with the data in root. If the maximum of a subtree is lesser, then all the elements are lesser, and similarly, if the minimum of a subtree is greater, all the elements are greater; for the right subtree we could find the minimum. So instead of writing these two functions, is subtree lesser and is subtree greater, we could write something like find max and find min, and this would also fit. So this is our solution using one of the approaches.
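Putting these pieces together, this is roughly what the first approach could look like in C++, assuming the integer node structure from this lesson; it is the straightforward but expensive version whose cost is analyzed next, not the optimized range-based one.

```cpp
#include <cstddef>

struct Node {
    int data;
    Node* left;
    Node* right;
};

// True if every node in the (sub)tree has data less than or equal to value.
bool IsSubtreeLesser(Node* root, int value) {
    if (root == NULL) return true;                // an empty subtree passes the check
    return root->data <= value
        && IsSubtreeLesser(root->left, value)
        && IsSubtreeLesser(root->right, value);
}

// True if every node in the (sub)tree has data strictly greater than value.
bool IsSubtreeGreater(Node* root, int value) {
    if (root == NULL) return true;
    return root->data > value
        && IsSubtreeGreater(root->left, value)
        && IsSubtreeGreater(root->right, value);
}

// True if the binary tree rooted at 'root' is a binary search tree.
bool IsBinarySearchTree(Node* root) {
    if (root == NULL) return true;                // base case: an empty tree is a BST
    return IsSubtreeLesser(root->left, root->data)
        && IsSubtreeGreater(root->right, root->data)
        && IsBinarySearchTree(root->left)
        && IsBinarySearchTree(root->right);
}
```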
Let's quickly run this code on an example binary tree and see how it will execute. I have drawn a very simple binary tree here, which actually is a binary search tree. Let's assume some addresses for these nodes in the tree; let's say the root node is at address 200, and I'll assume some random addresses for the other nodes as well.
To check if this binary tree is a binary search tree or not, we will make a call to the is binary search tree function; I'm writing IBST here as a shortcut for is binary search tree because I'm short of space. So I'll make a call to this function, maybe from the main function, passing address 200, the address of the root node. For this function call, the address collected in the local variable root will be 200. Root is not null; null is only a macro for address 0. For this call root is not null, so we will not return true at this line; we will go to the next if. Now here we will make a call to the is subtree lesser function; the arguments passed will be the address of the left child, which is 150, and seven, the data in the node at 200. Execution of the calling function will pause and will resume only after the called function returns. Now in this call to is subtree lesser, root is not null, so we will not return true at the first line; we will go to the next if. Now here the first condition is on the data in root, and the root this time is 150, because this call is for this left subtree and for this left subtree the address of root is 150. The
data in root is 4 which is lesser than 7 so the first condition is true and we can go to the second condition which is a recursive call this call will pause and we will go to the next call here once again the data in node at 180 is lesser than 7 so first condition is true and we will make a recursive call left subtree for node at 180 is null there is no left child so we will return at first line root is null this time this particular call will simply return true now in this previous call when root is 180 second condition for if is also true so we will make another call for right subtree once again address passed will be 0 and we will simply return true and now for this call is subtree lesser of 180 and 7 all three conditions are true so this guy can also return true and now this call is subtree lesser of 150 and 7 will resume now this guy will make a recursive call for the right subtree and this guy
after everything will also return true now for this call because all three conditions in the if statement are true this guy will also return true and now is binary search tree function will resume for this call we have evaluated the first condition we have got true now this guy will make another
call to is subtree greater passing address of right child and value 7 this guy after everything will return true and now we will have two recursive calls to check if left and right subtrees are binary search trees or not we will first have a call for the left subtree the execution will go
on like this but i want you to see something in each call to binary search tree function we are comparing the data in root with all the elements in left subtree and then all the elements in right subtree this example tree could be really large then in that case in the first call to is binary
search tree for this complete tree we would recursively traverse this whole left subtree to see whether all the values in this subtree are less than seven or not and then we will traverse all nodes in this right subtree to see if values are greater than seven or not and then in next
call to is binary search tree when we would be validating whether this particular subtree is BST or not we would recursively traverse this subtree if values are lesser than four or not and this subtree to see if values are greater than four or not so all in all during this whole
process there will be a lot of traversal data in nodes will be read and compared multiple times if you can see all nodes in this particular subtree will be traversed once in call to is binary search tree for 200 when we will compare value in these nodes with seven and then these nodes will once again be traversed in call to is binary search tree for 150 when they will be compared with four they will be traversed in call to is subtree lesser all in all these two functions is subtree lesser and is subtree greater are very expensive for each node we are looking at
all nodes in its subtrees there is an efficient solution in which we do not need to compare data in a node with data in all nodes in its subtrees and let's see what the solution is what we can do is we can define a permissible range for each node and data in that node must
be in that range we can start at the root node with range minus infinity to infinity because for the root node there is no upper and lower limit and now as we are traversing we can set a range for other nodes when we are going left we need to reset the upper bound so for this node at
150 data has to be between minus infinity and seven data in left child cannot be greater than data in root if we are going right we need to set the lower bound for this node at 300 range would be seven to infinity seven is not included in the range data has to be strictly greater than
seven for this node at 180 the range will be minus infinity to four for this node with value six lower bound will be four and upper bound would be seven now my code will go like this my function is binary search tree will take two more arguments an integer to mark the lower bound or min value
and another integer to mark the upper bound or max value and now instead of checking whether all the elements in left subtree are lesser than the data in root and all the elements in right subtree are greater than the data in root or not we will simply check whether data in root
is in this range or not so i'll get rid of these two function calls is subtree lesser and is subtree greater which are really expensive and i'll add these two conditions data in root must be greater than min value and data in root must be less than max value these two checks will take constant time is subtree lesser and is subtree greater functions were not taking constant time running time for them was proportional to number of nodes in the subtree okay now these two recursive calls should also have two more arguments for the left child lower bound will not change
upper bound will be the data in current node and for the right child upper bound will not change and lower bound will be the data in current node this recursion looks good to me we already have the base case written the only thing is that the caller of is binary search tree function may only want to pass the address of root node so what we can do is instead of naming this function is binary search tree we can name this function as a utility function like isBSTUtil and we can have another function named is binary search tree in which we can take only the address of root
node and this function can call isBSTUtil function passing address of root minimum possible value of an integer for minus infinity and maximum possible value of an integer for plus infinity INT_MIN and INT_MAX here are macros for minimum and maximum possible values in int so this is our solution using second approach which is quite efficient in this recursion we will go to each node once and at each node we will take constant time to see whether data in that node is in a defined range or not time complexity would be big o of n where n is number of nodes in the binary tree for the previous algorithm time complexity was big o of n square a consolidated sketch of this second approach is given below
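Here is a rough sketch of this second approach, assuming the same Node structure as in the earlier sketch; the names isBSTUtil and isBinarySearchTree follow the description above.

```cpp
#include <climits>   // for the INT_MIN and INT_MAX macros

// true if data in every node of the subtree lies strictly between minValue and maxValue
bool isBSTUtil(Node* root, int minValue, int maxValue) {
    if (root == NULL) return true;                        // empty subtree is fine
    return root->data > minValue
        && root->data < maxValue
        && isBSTUtil(root->left, minValue, root->data)    // going left: upper bound tightens
        && isBSTUtil(root->right, root->data, maxValue);  // going right: lower bound tightens
}

// wrapper so the caller only has to pass the address of the root
bool isBinarySearchTree(Node* root) {
    return isBSTUtil(root, INT_MIN, INT_MAX);
}
```

One side effect of using INT_MIN and INT_MAX as sentinels with strict comparisons is that a node actually storing INT_MIN or INT_MAX would be rejected; widening the range type or passing node pointers as bounds are possible ways around that.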
one more thing in this code i have not handled the case that binary search tree can have duplicates i'm saying that elements in left subtree must be strictly lesser and elements in right subtree must be strictly greater i'll leave it for you to see how you will allow duplicates there is another solution to this problem you can perform in order traversal of binary tree and if the tree is a binary search tree you would read the data in sorted order
in order traversal of a binary search tree gives a sorted list you can do some hack while performing in order traversal and check if you are getting the elements in sorted order or not during the whole traversal you only need to keep track of previously read node and at any time data in a node that you are reading must be greater than data in previously read node one possible way of doing this is sketched below
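One possible way of coding this idea is the sketch below; passing the previously visited node by reference is only one of several ways to keep track of it, and the function names are illustrative, not from the lesson itself.

```cpp
// in-order walk that remembers the previously visited node in 'prev'
bool checkInorder(Node* root, Node*& prev) {
    if (root == NULL) return true;
    if (!checkInorder(root->left, prev)) return false;   // left subtree first
    if (prev != NULL && root->data <= prev->data)         // must come out strictly increasing
        return false;
    prev = root;                                           // this node is now the previous one
    return checkInorder(root->right, prev);               // then the right subtree
}

bool isBSTByInorder(Node* root) {
    Node* prev = NULL;
    return checkInorder(root, prev);
}
```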
try implementing this solution yourself as well it will be interesting okay i'll stop here now in coming lessons we will discuss some more problems on binary tree thanks for watching
in this lesson we are going to write code to delete a node from binary search tree in most data structures deletion is tricky in case of binary search trees too it's not so straightforward so let's first see what all complications we may have while trying to delete
a node from binary search tree i have drawn a binary search tree of integers here as we know in a binary search tree for each node value of all nodes in its left subtree is lesser and value of all nodes in its right subtree is greater for example in this tree if i'll pick
this node with value five then we have three and one in its left subtree which are lesser and we have seven and nine in its right subtree which are greater and you can pick any other node in the tree and this property will be true else the tree is not a bst now when we need to
delete a node this property must be conserved let's try to delete some nodes from this example tree and see if we can rearrange things and conserve this property of binary search tree or not what if i want to delete this node with value 19 to delete a node from tree we need to do two
things we need to remove the reference of the node from its parent so the node is detached from the tree here we will cut this link we will set right child of this node with value 17 as null and the second thing that we need to do is reclaim the memory allocated to the node being
deleted that is wipe off the node object from memory this particular node with value 19 that we are trying to delete here is a leaf node it has no children and even if we take this guy out by simply cutting this link that is removing its reference from its parent and then wiping it off
from memory there is no problem property of binary search tree that for each node value of nodes and left should be lesser and value of nodes in right should be greater is conserved so deleting a leaf node a node with no children is really easy in this tree these four nodes with values
one nine thirteen and nineteen are leaf nodes to delete any of these we just need to cut the link and wipe off the node that is clear it from memory but what if we want to delete a non-leaf node what if in this example we want to delete this node with value 15 i can't just cut this link
because if i'll cut this link we will detach not just the node with value 15 but this complete sub tree we have two more nodes in this sub tree we could have had a lot more we need to make sure that all other nodes except the node with value 15 that's being deleted remain in the tree so what
do we do now this particular node that we are trying to delete here has two children or two subtrees i'll come back to case of node with two children later because this is not so easy to crack what i want to discuss first is the case when the node being deleted would have only one child
if the node being deleted would have only one child like in this example this node with value seven this guy has only one child this guy has a right child but does not have a left child for such a node what we can do is we can link its parent to this only child so the child and
everything below the child we could have some more nodes below nine as well will remain attached to the tree and only the node being deleted will be detached now we are not losing any other node than the node with value seven this is my tree after the deletion is this still a binary search
tree yes it is only the right subtree of node with value five has changed earlier we had seven and nine in right subtree of five and now we have nine which is fine what if we were having some more nodes below nine here in this tree i can have a node in the left of nine and the value in this
node has to be lesser than twelve greater than five greater than seven and lesser than nine we are left with only one choice we can only have eight here in right we can have something lesser than twelve and greater than five seven and nine all in all between nine and twelve
okay so if the original tree was this much after deletion this is how my tree will look like okay so are we good now is the tree still a bst well yes it is when we are setting this node with value nine as right child of the node with value five we are basically setting this particular
subtree as right subtree of the node with value five now this subtree is already in right of five so value of all nodes in this subtree is already greater than five and the subtree itself of course is a binary search tree any subtree in a binary search tree will also be a binary search tree
so even after deletion even after the rearrangement property of the tree that for each node nodes in left should be lesser and nodes in right should be greater in value is conserved so this is what we need to do to delete a node with just one child or a node with just one subtree connect
its parent to its only child and then wipe it off from memory there are only two nodes in this tree that have only one child let's try to delete this other one with value three all we need to do here is set 1 as left child of 5 once again if there were some more nodes below one then also
there was no issue okay so now we are good for two cases we're good for leaf nodes and we are good for nodes with just one child and now we should think about the third case what if a node has two children what should we do in this case let's come back to this node with value 15 that we were
trying to delete earlier with two children we can't do something like connect parent to one of the children while trying to delete 15 if we will connect 12 to 13 if we will make 13 the right child of 12 then we will include 13 and anything below 13 that is we will include the left subtree
of 15 but we will lose the right subtree of 15 that is 17 and anything below 17 similarly if we will make 17 the right child then we will lose the left subtree of 15 that is 13 and anything below 13 actually this case is tricky and before I talk about a possible solution I want to insert some
more nodes here I want to have some more nodes in subtrees of 13 and 17 the reason I'm inserting some more nodes here is because I want to discuss a generic case and that's why I want these two subtrees to have more than one node okay coming back when I'm trying to delete this node my intent
basically is to remove this value 15 from the tree my delete function will have signature something like this it will take pointer or reference to the root node and value to be deleted as argument so here I'm deleting this particular node because I want to remove 15 from the tree
what I'm going to do now is something with which I can reduce case three to either case one or case two I'll wipe off 15 from this node and I'll fill in some other value in this node of course I can't fill in any random value what I'll do is I'll look for the minimum in right subtree of this node
and I'll fill in that value here minimum in right subtree of this node is 17 so I have filled 17 here we now have two nodes with value 17 but notice that this node has only one child we can delete this node because we know how to delete a node with one child and once this node is deleted my
tree will be good the final arrangement will be a valid arrangement for my BST but why minimum in right subtree why not value in any other leaf node or any other node with one child well we also need to conserve this property that for each node nodes in left should have lesser value
and nodes in right should have greater value for this node if I'm bringing in the minimum from its right subtree then because I'm bringing in something from its right subtree it will be greater than the previous value 17 is greater than 15 so all the elements in left of course will be
lesser and because it's the minimum in right subtree all the elements in right of this guy would either be greater or equal we'll have a duplicate that will be equal once the duplicate is removed everything else will be fine in a tree or sub tree if a node has minimum value it won't
have a left child because if there is a left child there is something lesser and this is another property that we are exploiting give this some thought in a tree or sub tree node with minimum value will not have a left child there may or may not be a right child if we would have a right
child like here we have a right child so here we are reducing case three to case two if there was no child we would have reduced case three to case one okay so let's get rid of the duplicate I'll build a link like this and after deletion this is what my tree will look like so this is
what we need to do in case three we need to find the minimum in right subtree of the targeted node then copy or fill in this value and finally we need to delete the duplicate or the node with minimum value from right subtree there was another possible approach here and I must talk about it
instead of going for minimum in right we could also go for maximum in left subtree maximum in left subtree would of course be greater than or equal to all the values in left maximum in left subtree of node with value 15 is 14 I'm copying 14 here now all the nodes in left are lesser than
or equal to 14 and because we are picking something from left subtree it will still be lesser than the value being deleted 14 is less than 15 so all the nodes in this right subtree will still be greater and if we are picking maximum in a tree or subtree then that node will not have a right
child because if we have something in right we have something greater so the value can't be maximum the node may have a left child in this case node with value 14 doesn't have a left child so we are basically reducing case three to case one i'll simply get rid of this node
so we are looking good even after deletion in case three we can apply any of these methods and this is all in logic part let's now write code for this logic I'll write c++ and we will use recursion if you're not very comfortable applying recursion on trees then make sure you watch earlier
lessons in this series you can find link to them in description of this video in my code here I have defined node as a structure with three fields we have one field to store data and we have two fields that are pointers to node to store addresses of left and right children
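In C++ the node described here would look roughly like the following sketch (the field names data, left and right are assumptions based on the description):

```cpp
// node of a binary search tree
struct Node {
    int data;        // value stored in the node
    Node* left;      // address of left child, NULL if there is none
    Node* right;     // address of right child, NULL if there is none
};
```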
and I want to write a function named delete that should take pointer to root node and the data to be deleted as argument and this function should return pointer to root node because the root may change after deletion what we're passing to delete function is only a local copy of a root's address
if the address is changing we need to return it back to delete a given value or data we first need to find it in the tree and once we find the node containing that data we can try to delete it remember the only identity of tree that we pass to functions is the address of the root node and to
perform any action on the tree we need to start at root so let's first search for the node with this data first I'll cover a corner case if root is null that is if the tree is empty we can simply return I can say return root or return null here they will mean the same
because root is null else if the data that we are looking for is less than the data in root then it's in the left subtree the problem can be reduced to deleting the data from left subtree we need to go and find the data in left subtree so we can make a recursive call to delete function
passing address of the left child and the data to be deleted now the root of the left subtree that is the left child of this current node may change after deletion but the good thing is delete function will return address of the modified root of the left subtree so we can set the return
as left child of the current node now if data that we are trying to delete is greater than the data in root we need to go and delete the data from right subtree and if the data is needed greater nor lesser that is if it's equal then we can try deleting the node containing that data
now let's handle the three cases one by one if there is no child we can simply delete that node what I'll do here is I'll first wipe off the node from memory and this is how I'll do it what we have in root right now is address of the node to be deleted I'm using delete operator here
and that's used to deallocate memory of an object in heap in c you would use free function now root is a dangling pointer because the object in heap is deleted but root still has its address so we can set root as null and now we can return root reference of this node in its parent will
not be fixed here once this recursive call finishes then somewhere in these two statements in any of these two else ifs the link will be corrected i hope this is making sense okay now let's handle other cases if only the left child is null then what i want to do is
I first want to store the address of current node that I'm trying to delete in a temporary pointer to node and now I want to move the root this pointer named root to the right child so the right child becomes the root of this sub tree and now we can delete the node
that is being pointed to by temp we will use delete operator in c we would be using free function and now we can return root similarly if the right child is null I'll first store the address of current root in a temporary pointer to node then I'll make the left child new root of
the sub tree so we'll move to the left child and then I'll delete the previous root whose address I have in temp and finally I'll return root actually we need to return root in all cases so I'll remove this return root statement from all this if and else if and write one return root after everything
let's talk about the third case now in case of two children what we need to do is we need to search for minimum element in right sub tree of the node that we are trying to delete let's say this function find min will give me address of the node with minimum value in our tree
or sub tree so I'm calling this function find min and I'm collecting the return in a pointer to node named temp now I should set the data in current node that I'm trying to delete as this minimum value and now the problem is getting reduced to deleting this minimum value from the
right sub tree of current node with this much code i think i'm done with delete function this looks good to me a consolidated sketch of the whole function is given below
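A consolidated sketch of the delete logic just described could look like the following; it assumes the Node structure sketched earlier, the function is named Delete here only because delete is a reserved word in C++, and FindMin is the helper that returns the node with minimum value in a subtree.

```cpp
// node with minimum value in a tree or subtree: keep going left
Node* FindMin(Node* root) {
    while (root != NULL && root->left != NULL) root = root->left;
    return root;
}

// deletes 'data' from the BST and returns the (possibly new) root of the subtree
Node* Delete(Node* root, int data) {
    if (root == NULL) return root;                         // empty tree or subtree
    else if (data < root->data) root->left = Delete(root->left, data);
    else if (data > root->data) root->right = Delete(root->right, data);
    else {                                                 // found the node to delete
        if (root->left == NULL && root->right == NULL) {   // case 1: no child
            delete root;                                   // in C, free(root)
            root = NULL;
        } else if (root->left == NULL) {                   // case 2: only right child
            Node* temp = root;
            root = root->right;
            delete temp;
        } else if (root->right == NULL) {                  // case 2: only left child
            Node* temp = root;
            root = root->left;
            delete temp;
        } else {                                           // case 3: two children
            Node* temp = FindMin(root->right);             // minimum in right subtree
            root->data = temp->data;                       // copy it into the current node
            root->right = Delete(root->right, temp->data); // delete the duplicate
        }
    }
    return root;                                           // parent fixes its link with this
}
```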
the nodes now I want to delete number 15 from this tree so I'll make a call to delete function passing address of the root which is 200 and 15 the value to be deleted in delete function for this particular call control will come to this line a recursive call will be made execution of
this call delete 200 comma 15 will pause and it will resume only after this function below delete 350 comma 15 returns now for this call below we will go inside the third else in case 3 here we will find the node with minimum value in right which is 17 which is 400 the value is 17
address is 400 first we will set the data in node at 350 as 17 and now we are making a recursive call to delete 17 from right sub tree of 350 we have only one node in right sub tree of 350 here we have case 1 in this call we will simply delete the node at 400 and return null remember
root will be returned in all calls in the end now delete 350 comma 15 will resume and in this resumed call we will set the address of right child of node at 350 as null as you can see the link in parent is being corrected when the recursion is unfolding and the function call corresponding to
the parent is resuming and now this guy can return and now in this call we will resume at this line so right child of node at 200 will be set as 350 it's already 350 but it will be written again and now this call can also finish so I hope you got some sense of how this recursion is working
you can find link to all the source code and code to test the delete function in description of this video this is it for this lesson thanks for watching in this lesson we are going to solve one other interesting problem on binary search tree
and the problem is given a node in a binary search tree we need to find its in-order successor that is the node that would come immediately after the given node in in-order traversal of the binary search tree as we know in in-order traversal of a binary tree we first
visit the left subtree then the root and then the right subtree left and right subtrees are visited recursively in same manner so for each node we first visit its left subtree then the node itself and then its right subtree we have already discussed in-order traversal
in detail in a previous lesson in the series you can check the description of this video for a link to it in-order implementation will basically be a recursive function something like what i'm showing in right here there are two recursive calls in this function one to visit the left subtree
and another to visit the right subtree a rough version of that function is sketched below time complexity of in-order traversal is big o of n where n is number of nodes in the tree we visit each node exactly once so time taken is proportional to number of nodes in the tree
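The recursive in-order function being referred to would look roughly like this sketch, where visiting a node just means printing its data and Node is the usual structure with data, left and right fields.

```cpp
#include <cstdio>

// in-order traversal: left subtree, then the node itself, then the right subtree
void Inorder(Node* root) {
    if (root == NULL) return;        // base case: empty subtree
    Inorder(root->left);             // first recursive call, for the left subtree
    printf("%d ", root->data);       // visit the node
    Inorder(root->right);            // second recursive call, for the right subtree
}
```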
i have drawn a binary search tree of integers here a binary search tree as we know is a binary tree in which for each node value of nodes in left is lesser and value of nodes in right is greater let's quickly see what will be the in-order traversal for this binary search tree we'll start at root of the tree now for any node we first need to visit all
nodes in its left and then only we can visit that node so we will have to go left basically we will make a recursive call to go to left child of this node for this guy once again we have something in left so we will make another recursive call and go to its left child now we are at this node
with value eight and we will have to go left one more time and now for this node with value six which is a leaf node we have nothing in left so we can simply say that its left subtree is done and hence we can visit this guy visiting for me is reading the data in that node
i'll write the data here and now for this node there's nothing in right as well so we can simply say that its right is also done and now we're completely done for this guy so recursive call corresponding to this node will finish and we will go back to call corresponding
to its parent if we will come back to a node from its left child then it will be unvisited because we can't visit a node until its left is done so when we are coming back to eight eight is unvisited so we can simply visit this node that is read the data in this node
when i'll visit a node i'll paint it in yellow and now there's nothing in right of this node so we can simply say that right is done now we are done with this node so call corresponding to this node will finish and we will go back to its parent once again we're coming back to the parent from
left so the parent that is this node with value 10 is unvisited if we would come back to a node from right then it would already be visited so i'm visiting 10 and now we can go to right of 10 so far we have visited three nodes we first visited node with value six and then we visited
node with value eight so eight is successor of six and then 10 is successor of eight now let's see what will be the successor of 10 for nodes with values six and eight there was nothing in right so we were unwinding and going to the parent but for a node if there would be
something in right that is if there would be a right sub tree then its successor would definitely be in its right sub tree because after visiting that node we will go right now at this stage we are at this node with value 12 for this guy we will first go left and now we are at node with value 11
which is a leaf node there's nothing in left so we can simply say that left is done and we can print the data that is visit this node so in order successor of 10 is 11 now for node with value 11 there's nothing in right so we will go back to its parent and now we can visit this guy so after 11
we have 12 there's nothing in right of 12 so call for this guy will finish and we will go to its parent now we're coming back to 10 again but this time from right so this guy is already visited so we need not do anything we can simply go to its parent and now we are at this node with value 15
we are coming from left this guy is unvisited so we can visit it and now we can go to its right we will go on like this successor of 15 would be 16 and after 16 we will print 17 then after 17 we will print 20 then 25 and the last element would be 27
so this is in order traversal of this binary search tree notice that we have printed the integers in sorted order when we perform in order traversal on a binary search tree then elements are visited in sorted order now the problem that we want to solve is given a value in the tree
we want to find its in order successor in a binary search tree it would be the next higher value in the tree but what's the big deal here can't we just perform in order traversal and while performing the traversal figure out the in order successor well we can do so but it will be expensive running time of
in order is big o of n and we may want to do better finding next and previous element in some data could be a frequently performed operation and good thing about binary search tree is that frequently performed operations like insertion, deletion and search happen in big o of h where
h is height of the tree so it would be good if we are able to find successor and predecessor in big o of h we always try to keep a tree balanced to minimize its height height of a balanced binary tree is log n to the base 2 and big o of log n running time for any operation is almost
the best running time that we can have so can we find in order successor in big o of h i have redrawn the example tree here let's see what we can do in various cases what node would we visit after this node with value 10 can we deduce this logically well if you remember the simulation
of in order traversal that we had done earlier then if we have already visited this node then we are done with its left subtree and we have read the data in this node and we need to go right now in the right subtree we will have to go left as long as it's possible to go
and if we can't go left anymore like here there is nothing in left of this node with value 11 then this is the node that i'm visiting next so for a node if there is a right subtree then in order successor would be the leftmost node in its right subtree in a bst it would be
the node with minimum value in its right subtree i would say this is case one in this case all we need to do is we need to go as left as possible in right subtree in a bst it will also mean finding the minimum in right subtree leftmost node will also be the minimum in the subtree
now this is one case our node here had a right subtree what would be the in order successor if there were no right subtree what node would we visit after this node with value 8 this guy does not have a right subtree if we have already visited this guy then we have visited its left and this
node itself and there is nothing in right so we can say that right is also visited but we have not found the successor yet now where do we go from here well if you remember the simulation that we had done earlier we need to go to the parent of this node and if we are going to the parent from
left which is the case here then the parent would be unvisited for this node with value 10 we just finished its left subtree and we are coming back so now we can visit this node so this is my successor let's now pick another node with no right subtree what would be in order successor of this node
with value 12 what node would we visit next now here once again we do not have a right subtree for this node so we must go back to its parent and see if it's unvisited but if we are going to the parent from right if the node that we just visited is a right child which is the case here
then the parent would already be visited because we are coming back after visiting its right subtree this node must have been visited before going right so what should we do now the recursion will roll back further and we need to go to parent of 10 and now we are going to 15 from left so this
guy is unvisited so we can visit this node and this is my successor if the node does not have a right subtree we need to go to the nearest ancestor for which given node would be in its left subtree here for 12 we first went to 10 but 12 is in right subtree of 10 so we went to the next ancestor
15 and 12 is in left of 15 so this is the nearest ancestor for which 12 is in left and hence this is my in order successor this algorithm works fine but there is an issue how do we go from a node to its parent well we can design our tree such that node can have
reference to its parent so far in most lessons we have defined node as a structure with three fields something like this this is how we would define node in c or c plus plus we have one field to store data and we have two pointers to node to store reference or addresses of left and right
children often it makes a lot of sense to have one more field to store the address of parent we can design a tree like this and then we will not have problem walking the tree up using parent link we can easily go to the ancestors but what if there is no link to parent in this case what we
can do is we can start at root and walk the tree from root to the given node in a bst this is really easy for 12 we will start at root 12 is lesser than value in root so we need to go left and now we are at 10 and 12 is greater than 10 so we need to go right and now we are at 12 if we will walk
the tree from root to the given node we will go through all the ancestors of the given node in order successor would be the deepest node or the best ancestor in this path for which given node would be in left subtree 12 has only two ancestors we have 10 but 12 is in
right of 10 and then we have 15 and 12 is in left of 15 so 15 is my successor now let's use this technique to find successor of 6 we will first walk down from root to this node 6 is in left for all the ancestors but the best ancestor for which 6 is in left is this node with value 8 so this is
my successor remember we need to look at ancestors only if there is no right subtree for 6 there is no right subtree okay so the algorithm looks good let's now write code for this in my c++ code here i'm going to write a function named get successor that will take address of
root node and address of another node for which we need to find the successor and this function will return address of the successor node we could design this function differently instead of taking pointer to the node for which we want to find the successor as argument we could just take
the data as argument and for this data for this element we can find the successor node and return its address and that's why the return type here is struct node asterisk because we will be returning address in a pointer or what we can also do is we can return the element itself the
successor element itself we can implement with any of these signatures let's implement this one we will pass the data in current node and we will return back the address of the successor now the first thing that we need to do is we need to search the node with this data
i'm going to make call to a function named find that will take address of the root node and the data and will return me pointer to the node with this data if this function returns me null that is if the data is not found in the tree we can simply return null else we have the address of the current
node in this pointer to node that we have named current now in a bst this search operation will cost us big o of h where h is height of the tree so search in a bst is not very expensive we could have avoided this search if we would have passed address of the current node instead of
passing the data as this second argument but let's go with this let's now find the successor of this node if this node has a right subtree that is if the right subtree is not null we need to go to the leftmost node in the right subtree i have declared a temporary pointer to node here and initially
i've set it to current dot right and with this while loop i'll go to the leftmost node while there is something in the left keep going and finally when i'll come out of this loop i'll have address of leftmost node in the right sub tree and i can return this address
this particular node will also be the node with minimum value in right sub tree i'll move this code in another function i have written this function named find min that will return node with minimum value in a tree or sub tree in get successor function i'll simply say return find min and i'll
pass the address of right child of current node so basically i'm passing the right sub tree here okay now let's talk about case two if there is no right sub tree what we need to do is we need to walk the tree from root till current node and we need to find the deepest ancestor for which
current node will be in its left subtree what i'm going to do here is i'm going to declare a pointer to node named successor and initially i'll set it as null and i'll have another pointer to node named ancestor and initially i'll set this as root and with this while loop we will
walk the tree till we have not reached the current node to walk the tree we will use the property of binary search tree that for each node value of nodes in left is lesser and value of nodes in right is greater if data in current node is less than the data in ancestor then first of all this ancestor
may be my in order successor because the current node is in its left so what we can do is we can set this guy as successor and we can go left while traversing if we will find a deeper node with this property that current node will be in its left sub tree then successor will be updated
else if the current node lies in right we simply need to move right when we'll come out of this while loop successor will either be null or it will be the address of some node not all nodes in the tree will have a successor node with maximum value will not have a successor after coming out of this
while loop we can return the successor so this is my get successor function and i think this should work a consolidated sketch of this function is given below you can find link to complete source code in description of this video
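Here is one possible sketch of the whole function as described; Find and FindMin are the helpers mentioned above, Node is the usual structure with data, left and right fields, and the exact names are assumptions rather than the lesson's exact code.

```cpp
// standard BST search: returns address of the node containing 'data', or NULL
Node* Find(Node* root, int data) {
    while (root != NULL && root->data != data) {
        if (data < root->data) root = root->left;
        else root = root->right;
    }
    return root;
}

// leftmost node, i.e. the node with minimum value in a tree or subtree
Node* FindMin(Node* root) {
    while (root != NULL && root->left != NULL) root = root->left;
    return root;
}

// in-order successor of the node containing 'data'; NULL if data is absent
// or if the node holds the maximum value in the tree
Node* GetSuccessor(Node* root, int data) {
    Node* current = Find(root, data);
    if (current == NULL) return NULL;
    if (current->right != NULL) {
        return FindMin(current->right);       // case 1: leftmost node in right subtree
    }
    // case 2: walk from root, remembering the deepest ancestor
    // for which 'current' lies in its left subtree
    Node* successor = NULL;
    Node* ancestor = root;
    while (ancestor != current) {
        if (current->data < ancestor->data) {
            successor = ancestor;             // possible successor, keep going left
            ancestor = ancestor->left;
        } else {
            ancestor = ancestor->right;       // current is in the right, go right
        }
    }
    return successor;
}
```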
overall time complexity of this function will be big o of h and this is what we wanted we wanted to find successor in big o of h here we are already performing the search in big o of h finding minimum will also take big o of h and walking the tree from root to a node in bst will also take big o of h so overall this is big o of h if you have understood this code and this logic then it should
be very easy for you writing function to find in order predecessor i encourage you to write it i'll stop here now in coming lessons we will solve some more interesting problems on binary trees and binary search trees thanks for watching hello everyone so far in this series on data
structures we have talked about some of the linear data structures like array linked list stack and queue in all these structures data is arranged in a linear or sequential manner so we can call them linear data structures and we have also talked about tree which is a non-linear data structure
tree is a hierarchical structure now as we understand data structures are ways to store and organize data and for different kinds of data we use different kinds of data structures in this lesson we are going to introduce you to another non-linear data structure
that has got its application in a wide number of scenarios in computer science it is used to model and represent a variety of systems and this data structure is graph when we study data structures we often first study them as mathematical or logical models here also we will first study graph
as a mathematical or logical model and we will go into implementation details later okay so let's get started a graph just like a tree is a collection of objects or entities that we call nodes or vertices connected to each other through a set of edges but in a tree connections are bound
to be in a certain way in a tree there are rules dictating the connection among the nodes in a tree with n nodes we must have exactly n minus one edges one edge for each parent child relationship as we know an edge in a tree is for a parent child relationship and all nodes in a tree
except the root node would have a parent would have exactly one parent and that's why if there are n nodes there must be exactly n minus one edges in a tree all nodes must be reachable from the root and there must be exactly one possible path from root to a node now in a graph there are
no rules dictating the connection among the nodes a graph contains a set of nodes and a set of edges and edges can be connecting nodes in any possible way tree is only a special kind of graph now graph as a concept has been studied extensively in mathematics if you have taken a course on discrete
mathematics then you must be knowing about graphs already in computer science we basically study and implement the same concept of graph from mathematics the study of graphs is often referred to as graph theory in pure mathematical terms we can define graph something like this a graph g
is an ordered pair of a set v of vertices and a set e of edges now i'm using some mathematical jargon here an ordered pair is just a pair of mathematical objects in which the order of objects in the pair matters this is how we write and represent an ordered pair objects separated by
comma put within parenthesis now because the order here matters we can say that v is the first object in the pair and e is the second object an ordered pair ab is not equal to ba unless a and b are equal in our definition of graph here first object in the pair must always be a set of vertices
and the second object must be a set of edges that's why we are calling the pair an ordered pair we also have concept of unordered pair an unordered pair is simply a set of two elements order is not important here we write an unordered pair using curly brackets or braces because the order
is not important here unordered pair ab is equal to ba it doesn't matter which object is first and which object is second okay coming back so a graph is an ordered pair of a set of vertices and a set of edges and g equal ve is a formal mathematical notation that we use to define a
graph now i have a graph drawn here in the right this graph has eight vertices and ten edges what i want to do is i want to give some names to these vertices because each node in a graph must have some identification it can be a name or it can be an index i'm naming these vertices as v1 v2 v3 v4
v5 and so on and this naming is not indicative of any order there is no first second and third node here i could give any name to any node so my set of vertices here is this we have eight elements in the set v1 v2 v3 v4 v5 v6 v7 and v8 so this is my set of vertices for this
graph now what's my set of edges to answer this we first need to know how to represent an edge an edge is uniquely identified by its two end points so we can just write the names of the two end points of an edge as a pair and it can be a representation for the edge but edges can
be of two types we can have a directed edge in which connection is one way or we can have an undirected edge in which connection is two way in this example graph that i'm showing here edges are undirected but if you remember the tree that i had shown earlier then we had directed edges in
that tree with this directed edge that i'm showing you here we are saying that there is a link or path from vertex u to v but we cannot assume a path from v to u this connection is one way for a directed edge one of the end points would be the origin and the other end point would be the
destination and we draw the edge with an arrowhead pointing towards the destination for our edge here origin is u and destination is v a directed edge can be represented as an ordered pair first element in the pair can be the origin and second element can be the destination
so with this directed edge represented as ordered pair uv we have a path from u to v if we want a path from v to u we need to draw another directed edge here with v as origin and u as destination and this edge can be represented as ordered pair vu the upper one here is uv and
the below one is vu and they are not same now if the edge is undirected the connection is two way an undirected edge can be represented as an unordered pair here because the edge is bidirectional origin and destination are not fixed we only need to know what two end points are being connected
by the edge so now that we know how to represent edges we can write the set of edges for this example graph here we have an undirected edge between v1 and v2 then we have one between v1 and v3 then we have v1 v4 this is really simple i'll just go ahead and write all of them
so this is my set of edges typically in a graph all edges would either be directed or undirected it's possible for a graph to have both directed and undirected edges but we are not going to study such graphs we are only going to study graphs in which
all edges would either be directed or undirected a graph with all directed edges is called a directed graph or digraph and a graph with all undirected edges is called an undirected graph there is no special name for an undirected graph usually if the graph is directed
we explicitly say that it's a directed graph or digraph so these are two types of graph directed graph or digraph in which edges are unidirectional or ordered pairs and undirected graph in which edges are bi-directional or unordered pairs now many real world systems and problems
can be modeled using a graph graphs can be used to represent any collection of objects having some kind of pairwise relationship let's have a look at some of the interesting examples a social network like facebook can be represented as an undirected graph
a user would be a node in the graph and if two users are friends there would be an edge connecting them a real social network would have millions and billions of nodes i can show only few in my diagram here because i'm short of space now social network is an undirected graph because
friendship is a mutual relationship if i'm your friend you are my friend too so connections have to be two-way now once a system is modeled as a graph a lot of problems can easily be solved by applying standard algorithms in graph theory like here in this social network let's say we want
to do something like suggest friends to a user let's say we want to suggest some connections to rama one possible approach to do so can be suggesting friends of friends who are not connected already rama has three friends ela bob and kt and friends of these three that are not connected to
rama already can be suggested there is no friend of ela which is not connected to rama already bob however has three friends storm sam and lee that are not friends with rama so they can be suggested and kt has two friends lee and swati that are not connected to rama we have
counted lee already so in all we can suggest these four users to rama now even though we described this problem in context of a social network this is a standard graph problem the problem here in pure graph terms is finding all nodes having length of shortest path from a given node
equal to two standard algorithms can be applied to solve this problem we'll talk about concepts like path in a graph in some time for now just know that the problem that we just described in context of a social network is a standard graph problem a rough sketch of the friends of friends idea is given below just to make it concrete
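Purely as a flavour of how such a problem could be attacked in code, here is a small sketch of the friends-of-friends idea; the map-of-sets representation is an informal stand-in for an adjacency list, which is only defined properly in a later lesson, and all names here are illustrative.

```cpp
#include <map>
#include <set>
#include <string>

// suggest friends of friends who are not already connected to 'user'
std::set<std::string> suggestFriends(
        const std::map<std::string, std::set<std::string>>& friends,
        const std::string& user) {
    std::set<std::string> suggestions;
    const std::set<std::string>& direct = friends.at(user);
    for (const std::string& f : direct) {               // each direct friend
        for (const std::string& ff : friends.at(f)) {   // each friend of that friend
            if (ff != user && direct.count(ff) == 0)    // skip the user and existing friends
                suggestions.insert(ff);
        }
    }
    return suggestions;
}
```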
okay so a social network like facebook is an undirected graph now let's have a look at another example interlinked web pages on the internet or the world wide web can be represented as a directed graph a web page that would have a unique address or url would be a node in the graph and we can
have a directed edge if a page contains link to another page now once again there are billions of pages on the web but i can show only few here the edges in this graph are directed because the relationship is not mutual this time if page a has a link to page b then it's not necessary that
page b will also have a link to page a let's say one of the pages on mycodeschool.com has a tutorial on graph and on this page i have put a link to wikipedia article on graph let's assume that in this example graph that i'm showing you here page p is my mycode school tutorial on graph
with this address or url mycodeschool.com/videos/graph and let's say page q is the wikipedia article on graph with this url wikipedia.org/wiki/graph now on my page that is page p i have put a link to the wikipedia page on graph if you are on page p you can click on this link and go to page q
but wikipedia has not reciprocated to my favor by putting a link back to my page so if you are on page q you cannot click on a link and come to page p connection here is one way and that's why we have drawn a directed edge here okay now once again if we are able to represent
web as a directed graph we can apply standard graph theory algorithms to solve problems and perform tasks one of the tasks that search engines like google perform very regularly is web crawling search engines use a program called web crawler that systematically
browses the world wide web to collect and store data about web pages search engines can then use this data to provide quick and accurate results against search queries now even though in this context we are using a nice and heavy term like web crawling web crawling is basically
graph traversal or in simpler words act of visiting all nodes in a graph and no prizes for guessing that there are standard algorithms for graph traversal we'll be studying graph traversal algorithms in later lessons okay now the next thing that i want to talk about is concept of a
weighted graph sometimes in a graph all connections cannot be treated as equal some connections can be preferable to others like for example we can represent intercity road network that is the network of highways and freeways between cities as an undirected graph i'm assuming that all highways
would be bidirectional intra city road network that is road network within a city would definitely have one way roads and so intra city road network must be represented as a directed graph but intercity road network in my opinion can be represented as an undirected graph now clearly
we cannot treat all connections as equal here roads would be of different lengths and to perform a lot of tasks to solve a lot of problems we need to take length of roads into account in such cases we associate some weight or cost with every edge we label the edges with their weights in this case
weight can be length of the roads so what i'll do here is i'll just label these edges with some values for their lengths and let's say these values are in kilometers and now edges in this graph are weighted and this graph can be called a weighted graph let's say in this graph we want
to pick the best route from city A to city D have a look at these four possible routes i'm showing them in different colors now if i would treat all edges as equal then i would say that the green route through B and C and the red route through E and F are equally good both these paths have
three edges and this yellow route through E is the best because we have only two edges in this path but with different weights assigned to the connections i need to add up weights of edges in a path to calculate total cost when i'm taking weight into account shortest route is through B and C
connections have different weights and this is really important here in this graph actually we can look at all the graphs as weighted graphs an unweighted graph can basically be seen as a weighted graph in which weight of all the edges is same and typically we assume the weight
as one okay so we have represented inter-city road network as a weighted undirected graph social network was an unweighted undirected graph and worldwide web was an unweighted directed graph and this one is a weighted undirected graph now this was inter-city road network i think intra-city road network that is road network within a city can be modeled as a weighted directed graph because in a city there would be some one ways intersections in intra-city road network would be nodes and road segments would be our edges and by the way we can also draw an undirected
graph as directed it's just that for each undirected edge we'll have two directed edges we may not be able to redraw a directed graph as undirected but we can always redraw an undirected graph as directed okay i'll stop here now this much is good for an introductory lesson
in next lesson we'll talk about some more properties of graph this is it for this lesson thanks for watching in our previous lesson we introduced you to graphs we defined graph as a mathematical or logical model and talked about some of the properties and applications of graph
now in this lesson we will discuss some more properties of graph but first i want to do a quick recap of what we have discussed in our previous lesson a graph can be defined as an ordered pair of a set of vertices and a set of edges we use this formal mathematical notation
g equal ve to define a graph here v is set of vertices and e is set of edges ordered pair is just a pair of mathematical objects in which order of objects in the pair matters it matters which element is first and which element is second in the pair now as we know to denote number of
elements in a set that we also call cardinality of a set we use the same notation that we use for modulus or absolute value so this is how we can denote number of vertices and number of edges in a graph number of vertices would be number of elements in set v and number of edges would be
number of elements in set e moving forward this is how i'm going to denote number of vertices and number of edges in all my explanations now as we had discussed earlier edges in a graph can either be directed that is one way connections or undirected that is two way connections
a graph with only directed edges is called a directed graph or die graph and a graph with only undirected edges is called an undirected graph now sometimes all connections in a graph cannot be treated as equal so we label edges with some weight or cost like what i'm showing here
and a graph in which some value is associated to connections as cost or weight is called a weighted graph a graph is unweighted if there is no cost distinction among edges okay now we can also have some special kind of edges in a graph these edges complicate algorithms and make working
with graphs difficult but i'm going to talk about them anyway an edge is called a self loop or self edge if it involves only one vertex if both end points of an edge are same then it's called a self loop we can have a self loop in both directed and undirected graphs but the question is
why would we ever have a self loop in a graph well sometimes if edges are depicting some relationship or connection that's possible with the same node as origin as well as destination then we can have a self loop for example as we had discussed in our previous lesson
interlinked web pages on the internet or the worldwide web can be represented as a directed graph a page with a unique URL can be a node in the graph and we can have a directed edge if a page contains link to another page now we can have a self loop in this graph because it's very much
possible for a web page to have a link to itself have a look at this web page my code school dot com slash videos in the header we have links for workouts page problems page and videos page right now i'm already on videos page but i can still click on videos link and all that will
happen with the click is a refresh because i'm already on videos page my origin and destination are same here so if i'm representing worldwide web as a directed graph the way we just discussed then we have a self loop here now the next special type of edge that i want to talk about
is multi edge an edge is called a multi edge if it occurs more than once in a graph once again we can have a multi edge in both directed and undirected graphs first multi edge that i'm showing you here is undirected and the second one is directed now once again the question why should we ever
have a multi edge well let's say we are representing flight network between cities as a graph a city would be a node and we can have an edge if there is a direct flight connection between any two cities but then there can be multiple flights between a pair of cities these flights would have different
names and may have different costs if i want to keep the information about all the flights in my graph i can draw multi edges i can draw one directed edge for each flight and then i can label an edge with its cost or any other property i just labeled edges here with some random flight
numbers now as we were saying earlier self loops and multi edges often complicate working with graphs their presence means we need to take extra care while solving problems if a graph contains no self loop or multi edge it's called a simple graph in our lessons we will mostly be dealing
with simple graphs now i want you to answer a very simple question given number of vertices in a simple graph that is a graph with no self loop or multi edge what would be maximum possible number of edges well let's see let's say we want to draw a directed graph with four vertices
i have drawn four vertices here i'll name these vertices v1 v2 v3 and v4 so this is my set of vertices number of elements in set v is four now it's perfectly fine if i choose not to draw any edge here this will still be a graph set of edges can be empty nodes can be totally disconnected
so minimum possible number of edges in a graph is zero now if this is a directed graph what do you think can be maximum number of edges here well each node can have directed edges to all other nodes in this figure here each node can have directed edges to three other nodes
we have four nodes in total so maximum possible number of edges here is four into three that is twelve i have shown edges originating from our vertex in same color here this is the maximum that we can draw if there is no self loop or multi edge in general if there are n vertices
then maximum number of edges in a directed graph would be n into n minus one so in a simple directed graph number of edges would be in this range zero to n into n minus one now what do you think would be the maximum for an undirected graph in an undirected graph we can have only one
bi-directional edge between a pair of nodes we can't have two edges in different directions so here the maximum would be half of the maximum for directed so if the graph is simple and undirected number of edges would be in the range zero to n into n minus one by two remember this is true only
if there is no self loop or multi edge now if you can see number of edges in a graph can be really really large compared to number of vertices for example if number of vertices in a directed graph is equal to ten maximum number of edges would be ninety if number of vertices is hundred
maximum number of edges would be ninety nine hundred maximum number of edges would be close to square of number of vertices a graph is called dense if number of edges in the graph is close to maximum possible number of edges that is if the number of edges is of the order of square of
number of vertices and a graph is called sparse if the number of edges is really less typically close to number of vertices and not more than that there is no defined boundary for what can be called dense and what can be called sparse it all depends on context but this is an important
classification while working with graphs a lot of decisions are made based on whether the graph is dense or sparse for example we typically choose a different kind of storage structure in computer's memory for a dense graph we typically store a dense graph in something called adjacency matrix
and for a sparse graph we typically use something called adjacency list i'll be talking about adjacency matrix and adjacency list in next lesson okay now the next concept that i want to talk about is concept of path in a graph a path in a graph is a sequence of vertices
where each adjacent pair in the sequence is connected by an edge i'm highlighting a path here in this example graph the sequence of vertices a b f h is a path in this graph now we have an undirected graph here edges are bidirectional in a directed graph all edges must also be aligned in one direction
the direction of the path a path is called simple path if no vertices are repeated and if vertices are not repeated then edges will also not be repeated so in a simple path both vertices and edges are not repeated this path a b f h that i have highlighted here is a simple path but we could also have a
path like this here start vertex is a and end vertex is d in this path one edge and two vertices are repeated in graph theory there is some inconsistency in use of this term path most of the time when we say path we mean a simple path and if repetition is possible we use this term
walk so a path is basically a walk in which no vertices or edges are repeated a walk is called a trail if vertices can be repeated but edges cannot be repeated i'm highlighting a trail here in this example graph okay now i want to say this once again walk and path are often used as synonyms
but most often when we say path we mean simple path a path in which vertices and edges are not repeated between two different vertices if there is a walk in which vertices or edges are repeated like this walk that i'm showing you here in this example graph then there must also be a path
or simple path that is a walk in which vertices or edges would not be repeated in this walk that i'm showing you here we are starting at a and we are ending our walk at c there is a simple path from a to c with just one edge all we need to do is we need to avoid going to b, e, h, d and then
coming back again to a so this is why we mostly talk about simple path between two vertices because if any other walk is possible simple path is also possible and it makes most sense to look for a simple path so this is what i'm going to do throughout our lessons i'm going to say path
and by path i'll mean simple path and if it's not a simple path i'll say it explicitly a graph is called strongly connected if in the graph there is a path from any vertex to any other vertex if it's an undirected graph we simply call it connected and if it's a directed graph
we call it strongly connected in leftmost and rightmost graphs that i'm showing you here we have a path from any vertex to any other vertex but in this graph in the middle we do not have a path from any vertex to any other vertex we cannot go from vertex c to a we can go from a to c but
we cannot go from c to a so this is not a strongly connected graph remember if it's an undirected graph we simply say connected and if it's a directed graph we say strongly connected if a directed graph is not strongly connected but can be turned into connected graph
by treating all edges as undirected then such a directed graph is called weakly connected if we just ignore the directions of the edges here this is connected but i would recommend that you just remember connected and strongly connected this leftmost undirected graph is connected
i removed one of the edges and now this is not connected now we have two disjoint connected components here but the graph overall is not connected connectedness of a graph is a really important property if you remember an intra-city road network, a road network within a city that would
have a lot of one-ways, can be represented as a directed graph now an intra-city road network should always be strongly connected we should be able to reach any street from any street any intersection to any intersection okay now that we understand the concept of a path
next i want to talk about cycle in a graph a walk is called a closed walk if it starts and ends at same vertex like what i'm showing here and there is one more condition the length of the walk must be greater than zero length of a walk or path is number of edges in the path
like for this closed walk that i'm showing you here length is five because we have five edges in this walk so a closed walk is a walk that starts and ends at the same vertex and the length of which is greater than zero now some may call a closed walk a cycle but generally we use the term cycle for
a simple cycle a simple cycle is a closed walk in which other than start and end vertices no other vertex or edge is repeated right now what i'm showing you here in this example graph is a simple cycle or we can just say cycle a graph with no cycle is called an acyclic graph
a tree if drawn with undirected edges would be an example of an undirected acyclic graph here in this tree we can have a closed walk but we cannot have a simple cycle in this closed walk that i'm showing you here an edge is repeated there would be no simple cycle in a tree
and apart from trees we can have other kinds of undirected acyclic graphs also a tree also has to be connected now we can also have a directed acyclic graph as you can see here also we do not have any cycle you cannot have a path of length
greater than zero starting and ending at the same vertex a directed acyclic graph is often called a DAG cycles in a graph cause a lot of issues in designing algorithms for problems like finding shortest route from one vertex to another and we will talk about cycles a lot when we
study some of these advanced algorithms in coming lessons for this lesson i'll stop here now in our next lesson we will discuss ways of creating and storing graph in computer's memory this is it for this lesson thanks for watching hello everyone in our previous lessons we introduced
you to graphs and we also looked at and talked about some of the properties of graph but so far we have not discussed how we can implement graph how we can create a logical structure like graph in computer's memory so let us try to discuss this a graph as we know contains a set of vertices
and a set of edges and this is how we define graph in pure mathematical terms a graph g is defined as an ordered pair of a set v of vertices and a set e of edges now to create and store a graph in computer's memory the simplest thing that we probably can do is that
we can create two lists one to store all the vertices and another to store all the edges for a list we can use an array of appropriate size or we can use an implementation of a dynamic list in fact we can use a dynamic list available to us in language libraries something like vector
in c++ or array list in Java now a vertex is identified by its name so the first list the list of vertices would simply be a list of names or strings i just filled in names of all the vertices for this example graph here now what should we fill in this edge list here an edge is identified
by its two endpoints so what we can do is we can create an edge as an object with two fields we can define edge as a structure or class with two fields one to store the start vertex and another to store the end vertex edge list would basically be an array or list of this type
struct edge in these two definitions of edge that i have written here in the first one i have used character pointers because in c we typically use character pointers to store or refer to strings we could use character array also in c++ or Java where we can create classes we have string
available to us as a data type so we can use that also so we can use any of these for the fields we can use character pointer or character array or string data type if it's available depends on how you want to design your implementation now let's fill this edge list here for this example graph
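As a rough sketch of the edge record just described, one possible C++ version could look like this; the type and field names are illustrative and not taken from the video:

    #include <string>

    // one edge stored as an object with two fields for its endpoints;
    // in plain C these could be char pointers or character arrays instead
    struct Edge {
        std::string startVertex;
        std::string endVertex;
    };

An edge list is then simply an array or vector of such records.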
each row now here has two boxes let's say the first one is to store the start vertex and the second one is to store the end vertex the graph that we have here is an undirected graph so any vertex can be called start vertex and any vertex can be called end vertex
order of the vertices is not important here we have nine edges here one between a and b another between a and c another between a and d and then we have b-e and b-f instead of having b-f as an entry we could also have f-b but we just need one of them and then we have c-g d-h
e-h and f-h actually there's one more we also have g-h we have 10 edges in total here and not nine now once again because this is an undirected graph if we are saying that there is an edge from f to h we are also saying that there is an edge from h to f there is no need to have another
entry as hf we will unnecessarily be using extra memory if this was a directed graph fh and hf would have meant two different connections which is the start vertex and which is the end vertex would have mattered maybe in case of undirected graphs we should name the fields as first vertex
and second vertex and in case of directed graphs we should name the fields as start vertex and end vertex now our graph here could also be a weighted graph we could have some cost or weight associated with the edges as we know in an unweighted graph cost of all the connections is equal
but in a weighted graph different connections would have different weight or different cost now in this example graph here i have associated some weights to these edges now how do you think we should store this data the weight of edges well if the graph is weighted we can have one more field
in the edge object to store the weight now an entry in my edge list has three fields one to store the start vertex one to store the end vertex and one more to store the weight so this is one possible way of storing a graph we can simply create two lists one to store the
vertices and another to store the edges but this is not very efficient for any possible way of storing and organizing data we must also see its cost and when we say cost we mean two things time cost of various operations and the memory usage typically we measure the rate of growth of
time taken with size of input or data what we also call time complexity and we measure the rate of growth of memory consumed with size of input or data what we also call space complexity time and space complexities are most commonly expressed in terms of what we
call big o notation for this lesson i'm assuming that you already know about time and space complexity analysis and big o notation if you want to revise some of these concepts then you can check the description of this video for link to some lessons we always want to minimize the time cost of most
frequently performed operations and we always want to make sure that we do not consume unreasonably high memory okay so let's now analyze this particular structure that we are trying to use to store our graph let's first discuss the memory usage for the first list
the vertex list least number of rows needed or consumed would be equal to number of vertices now each row here in this vertex list is a name or string and a string can be of any length right now all strings have just one character because i simply named the nodes a b c and so on
but we could have names with multiple characters and because strings can be of different lengths all rows may not be consuming the same amount of memory like here i'm showing an intra-city road network as a weighted graph cities are my nodes and road distances are my weights
now for this graph as you can see names are of different lengths so all rows in the vertex list or all rows in the edge list would not cost us the same more characters will cost us more bytes but we can safely assume that the names will not be too long we can safely assume that in almost
all practical scenarios average length of strings will be a really small value if we assume it to be always lesser than some constant then the total space consumed in this vertex list will be proportional to the number of rows consumed that is the number of vertices or in other words
we can say that space complexity here is big o of number of vertices this is how we write number of vertices with two vertical bars what we basically mean here is number of elements in set V now for the edge list once again we are storing strings in first two fields of the edge object
so once again each row here will not consume same amount of memory but if we are just storing the reference or pointer to a string like here in the first row instead of having values filled in these two fields we could have references or pointers to the names in the vertex list
if we will design things like this each row will consume same memory this in fact is better because references in most cases would cost us a lot lesser than a copy of the name and as reference we can have the actual address of the string and that's what we are doing when we are
saying that start vertex and end vertex can be character pointers or maybe a better design would be simply having the index of the name or string in vertex list let's say a is at index 0 in the vertex list and b is at index 1 and c is at index 2 and i'll go on like this
now for start vertex and end vertex we can have two integer fields as you can see in both my definitions of edge start vertex and end vertex are of type int now and in each row of edge list first and second field are filled with integer values
i have filled in appropriate values of indices this definitely is a better design and if you can see now each row in edge list would cost us the same amount of memory so overall space consumed in edge list would be proportional to number of edges or in other words space complexity here is
big o of number of edges okay so this is the analysis of our memory usage overall space complexity of this design would be big o of number of vertices plus number of edges is this memory usage unreasonably high well we cannot do a lot better than this if we want to store a graph
in computer's memory so we are all right in terms of memory usage now let's discuss time cost of operations what do you think can be the most frequently performed operations while working with a graph one of the most frequently performed operations while working with a graph would be finding all
nodes adjacent to a given node that is finding all nodes directly connected to a given node what do you think would be time cost of finding all nodes directly connected to a given node well we will have to scan the whole edge list we will have to perform a linear search we will
have to go through all the entries in the list and see if the start or end node in the entry is our given node for a directed graph we would see if the start node in the entry is our given node or not and for an undirected graph we would see both the start as well as the end node running time
would be proportional to number of edges or in other words time complexity of this operation would be big o of number of edges okay now another frequently performed operation can be finding if two given nodes are connected or not in this case also we will have to perform
a linear search on the edge list in worst case we will have to look at all the entries in the edge list so worst case running time would be proportional to number of edges so for this operation too time complexity is big o of number of edges now let's try to see how good or bad
this running time big o of number of edges is if you remember this discussion from our previous lesson in a simple graph in a graph with no self loop or multi-edge if number of vertices that is the number of elements in set v is equal to n then maximum number of edges
would be n into n minus one if the graph is directed each node will be connected to every other node and of course minimum number of edges can be zero we can have a graph with no edge maximum number of edges would be n into n minus one by two if the graph is undirected
but all in all if you can see number of edges can go almost up to square of number of vertices number of edges can be of the order of square of number of vertices let's denote number of vertices here as small v so number of edges can be of the order of v square in a graph typically any
operation running in order of number of edges would be considered very costly we try to keep things in order of number of vertices when we are comparing the two running times this is very obvious big o of v is a lot better than big o of v square all in all this vertex list and edge
list kind of representation is not very efficient in terms of time cost of operations we should think of some other efficient design we should think of something better we will talk about another possible way of storing and representing graph in next lesson this is it for this lesson
thanks for watching so in our previous lesson we discussed one possible way of storing and representing a graph in which we used two lists one to store the vertices and another to store the edges a record in vertex list here is name of a node and a
record in edge list is an object containing references to the two end points of an edge and also the weight of that edge because this example graph that i'm showing you here is a weighted graph we called this kind of representation edge list representation
but we realized that this kind of storage is not very efficient in terms of time cost of most frequently performed operations like finding nodes adjacent to a given node or finding if two nodes are connected or not to perform any of these operations we need to scan
the whole edge list we need to perform a linear search on the edge list so the time complexity is big o of number of edges and we know that number of edges in a graph can be really really large in worst case it can be close to square of number of vertices in a graph anything running in order
of number of edges is considered very costly we often want to keep the cost in order of number of vertices so we should think of some other efficient design we should think of something better than this one more possible design is that we can store the edges in a two-dimensional
array or matrix we can have a two-dimensional matrix or array of size v cross v where v is number of vertices as you can see i have drawn an 8 cross 8 array here because number of vertices in my example graph here is 8 let's name this array a now if we want to store a graph that is
unweighted let's just remove the weights from this example graph here and now our graph is unweighted and if we have a value or index between zero and v minus one for each vertex which we have here if we are storing the vertices in a vertex list then we have an index between
zero and v minus one for each vertex we can say that a is the 0th node b is the 1st node c is the 2nd node and so on we are picking up indices from the vertex list okay so if the graph is unweighted and each vertex has an index between zero and v minus one then in this matrix or two-dimensional array we can set
ith row and jth column that is aij as one or boolean value true if there is an edge from i to j and zero or false otherwise if I have to fill this matrix for this example graph here then I'll go vertex by vertex vertex zero is connected to vertices one two and three vertex one is connected
to zero four and five this is an undirected graph so if we have an edge from zero to one we also have an edge from one to zero so row one column zero should also be set as one now let's go to node two it's connected to zero and six three is connected to zero and seven
four is connected to one and seven five once again is connected to one and seven six is connected to two and seven and seven is connected to three four five and six all the remaining positions in this array should be set as zero
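As a rough illustration of the filling process just described, here is a minimal C++ sketch that builds this 8 by 8 adjacency matrix from the example's edges; variable names are illustrative:

    #include <vector>
    #include <utility>
    using namespace std;

    int main() {
        int V = 8;                              // vertices a..h mapped to indices 0..7
        vector<pair<int,int>> edges = {         // undirected edges read off the example
            {0,1},{0,2},{0,3},{1,4},{1,5},{2,6},{3,7},{4,7},{5,7},{6,7}
        };
        vector<vector<int>> A(V, vector<int>(V, 0));   // all cells start at 0 (no edge)
        for (auto [i, j] : edges) {
            A[i][j] = 1;                        // edge i -> j
            A[j][i] = 1;                        // undirected, so set the symmetric cell too
        }
    }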
notice that this matrix is symmetric for an undirected graph this matrix would be symmetric because aij would be equal to aji we would have two positions filled for each edge in fact to see all the edges in the graph we need to go through only one of these two halves
now this would not be true for a directed graph only one position will be filled for each edge and we will have to go through the entire matrix to see all the edges okay now this kind of representation of a graph in which edges or connections are stored in a matrix or two-dimensional array is called
an adjacency matrix representation this particular matrix that I have drawn here is an adjacency matrix now with this kind of storage or representation what do you think would be the time cost of finding all nodes adjacent to a given node let's say given this vertex list and adjacency matrix
we want to find all nodes adjacent to node named f if we are given name of a node then we first need to know its index and to know the index we will have to scan the vertex list there is no other way once we figure out the index like for f index is 5 then we can go to the row with that index in
the adjacency matrix and we can scan this complete row to find all the adjacent nodes scanning the vertex list to figure out the index in worst case will cost us time proportional to the number of vertices because in worst case we may have to scan the whole list and scanning a row in the
adjacency matrix would once again cost us time proportional to number of vertices because in a row we would have exactly v columns where v is number of vertices so overall time cost of this operation is big o of v now most of the time while performing operations we must pass indices to avoid scanning
the vertex list all the time if we know an index we can figure out the name in constant time because in an array we can access element at any index in constant time but if we know a name and want to figure out the index then it will cost us big o of v we will have to scan the vertex list
we will have to perform a linear search on it okay moving on now what would be the time cost of finding if two nodes are connected or not now once again the two nodes can be given to us as indices or names if the nodes would be passed as indices then we simply need to look at value in a particular
row and particular column we simply need to look at aij for some values of i and j and this will cost us constant time you can look at value in any cell in a two-dimensional array in constant time so if indices are given time complexity of this operation would be big o of 1 which simply
means that we will take constant time but if names are given then we also need to do the scanning to figure out the indices which will cost us big o of v overall time complexity would be big o of v the constant time access would not mean anything the scanning of vertex list all the time to figure
out the indices can be avoided we can use some extra memory to create a hash table with names and indices as key value pairs and then the time cost of finding index from name would also be big o of 1 that is constant hash table is a data structure and i have not talked about it in
any of my lessons so far if you do not know about hash table just search online for a basic idea of it okay so as you can see with an adjacency matrix representation our time cost of some of the most frequently performed operations is in order of number of vertices and not in order of number of
edges which can be as high as square of number of vertices okay now if we want to store a weighted graph in an adjacency matrix representation then aij in the matrix can be set as weight of an edge for non-existent edges we can have a default value like a really large or maximum possible
integer value that is never expected to be an edge weight i have just filled in infinity here to mean that we can choose the default as infinity minus infinity or any other value that would never ever be a valid edge weight okay now for further discussion i'll come back to an unweighted graph
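Before moving back to the unweighted case, here is a small sketch of the weighted variant just described; the weight value and variable names are illustrative assumptions, not taken from the lecture:

    #include <vector>
    #include <limits>
    using namespace std;

    int main() {
        int V = 8;
        const int INF = numeric_limits<int>::max();     // stands in for "infinity",
                                                        // never a valid edge weight
        vector<vector<int>> W(V, vector<int>(V, INF));  // default: no edge
        W[0][1] = 12;                                   // hypothetical weight for edge a-b
        W[1][0] = 12;                                   // symmetric cell, undirected graph
    }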
an adjacency matrix looks really good so should we not use it always well with this design we have improved on time but we have gone really high on memory usage instead of using memory units exactly equal to number of edges what we were doing with an edge list kind of storage
here we are using exactly v square units of memory we are using big o of v square space we are not just storing the information that these two nodes are connected we are also storing not of it that is these two nodes are not connected which probably is redundant information
if a graph is dense if the number of edges is really close to v square then this is good but if the graph is sparse that is if number of edges is a lot lesser than v square then we are wasting a lot of memory in storing these zeros like for this example graph that i have drawn here in the edge
list we were consuming 10 units of memory we had 10 rows consumed in the edge list but here we are consuming 64 units most graphs with a really large number of vertices would not be very dense would not have number of edges anywhere close to v square like for example let's say we are
modeling a social network like facebook as a graph such that a user in the network is a node and there is an undirected edge if two users are friends facebook has a billion users but i'm showing only a few in my example graph here because i'm short of space
let's just assume that we have a billion users in our network so number of vertices in our graph is 10 to the power 9 which is a billion now do you think number of connections in our social network can ever be close to square of number of users that will mean everyone in the network is a friend
of everyone else a user of our social network will not be friend to all other billion users we can safely assume that a user on an average would not have more than a thousand friends with this assumption we would have 10 to the power 12 edges in our graph
actually this is an undirected graph so we should do a divide by two here so that we do not count an edge twice so if average number of friends is thousand then total number of connections in my graph is 5 into 10 to the power 11 now this is a lot lesser than square of number of vertices
so basically if we would use an adjacency matrix for this kind of a graph we would waste a hell lot of space and moreover even if we are not looking in relative terms 10 to the power 18 units of memory even in absolute sense is a lot 10 to the power 18 bytes would be about a thousand petabytes
now this really is a lot of space this much data would never ever fit on one physical disc 5 into 10 to the power 11 bytes on the other hand is just 0.5 terabytes a typical personal computer these days would have this much of storage so as you can see for
something like a large social graph adjacency matrix representation is not very efficient adjacency matrix is good when a graph is dense that is when the number of edges is close to square of number of vertices or sometimes when total number of possible connections that is v
square is so less that wasted space would not even matter but most real world graphs would be sparse and adjacency matrix would not be a good fit let's think about another example let's think about world wide web as a directed graph if you can think of web pages as nodes in a graph and hyperlinks
as directed edges then a web page would not have links to all other web pages and once again number of web pages would be in the order of millions a web page would have links to only a few other web pages so the graph would be sparse most real world graphs would be sparse
and adjacency matrix even though it's giving us good running time for most frequently performed operations would not be a good fit because it's not very efficient in terms of space so what should we do well there's another representation that gives us similar or maybe
even better running time than adjacency matrix and does not consume so much space it's called adjacency list representation and we will talk about it in our next lesson this is it for this lesson thanks for watching so in our previous lesson we talked about adjacency
matrix as a way to store and represent a graph and as we discussed and analyzed this data structure we saw that it's very efficient in terms of time cost of operations with this data structure it costs big o of 1 that is constant time to find if two nodes are connected or not and it costs big o of
v where v is number of vertices to find all nodes adjacent to a given node but we also saw that adjacency matrix is not very efficient when it comes to space consumption we consume space in order of square of number of vertices in adjacency matrix representation as you know
we store edges in a two-dimensional array or matrix of size v cross v where v is number of vertices in my example graph here we have eight vertices that's why i have an eight cross eight matrix here we are consuming eight square that is 64 units of space here now what's basically
happening is that for each vertex for each node we have a row in this matrix where we are storing information about all its connections this is the row for the zeroth node that is a this is the row for the 1st node that is b this is for c and we can go on like this so each node has got a row
and a row is basically a one-dimensional array of size equal to number of vertices that is v and what exactly are we storing in a row let's just look at this first row in which we are storing connections of node a this two-dimensional matrix or array that we have here is basically an array
of one-dimensional arrays so each row has to be a one-dimensional array so how are we storing the connections of node a in these eight cells in this one-dimensional array of size eight a zero in the zeroth position means that there is no edge starting at a and ending at the
zeroth node which again is a an edge starting and ending at the same vertex is called a self loop and there is no self loop on a a one in the 1st position here means that there is an edge from a to the 1st node that is b the way we are storing information here is that
index or position in this one-dimensional array is being used to represent end point of an edge for this complete row for this complete one-dimensional array start is always the same it's always the zeroth node that is a in general in the
adjacency matrix row index represents the start point and column index represents the end point now here when we are looking only at the first row start is always a and the indices 0 1 2 and so on are representing the end points and the value at a particular index or position tells us whether
we have an edge ending at that node or not one here means that the edge exists 0 would have meant that the edge does not exist now when we are storing information like this if you can see we are not just storing that b c and d are connected to a we are also storing the not of it
we are also storing the information that a e f g and h are not connected to a if we are storing what all nodes are connected through that we can also deduce what all nodes are not connected these zeros in my opinion are redundant information causing extra consumption
of memory most real-world graphs are sparse that is number of connections is really small compared to total number of possible connections so most often there would be too many zeros and very few ones think about it let's say we are trying to store connections in a social network like Facebook
in an adjacency matrix which would be the most impractical thing to do in my opinion but anyway for the sake of argument let's say we are trying to do it just to store connections of one user i would have a row or one-dimensional array of size 10 to the power nine on an average
in a social network you would not have more than thousand friends if i have thousand friends then in the row used to store my connections i would only have thousand ones and rest that is 10 to the power nine minus thousand would be zeros and i'm not trying to force you to agree but just like me
if you also think that these zeros are storing redundant information and are extra consumption of memory then even if we are storing these ones and zeros in just one byte as boolean values these many zeros here are almost one gigabyte of memory the ones are just one kilobyte so given
this problem let's try to do something different here let's just try to keep the information that these nodes are connected and get rid of the information that these nodes are not connected because it can be inferred it can be deduced and there are a couple of ways in which we can do this
here to store connections of a instead of using an array such that index represents end point of an edge and value at that particular index represents whether we have an edge ending there or not we can simply keep a list of all the nodes to which we are connected this is the list or set of
nodes to which a is connected we can represent this list either using the indices or using the actual names for the nodes let's just use indices because names can be long and may consume more memory you can always look at the vertex list and find out the name in constant time now in our
machine we can store this set of nodes which basically is a set of integers in something as simple as an array and this array as you can see is a different arrangement from our previous array in our earlier arrangement index was representing index of a node in the graph and value was
representing whether there was a connection to that node or not here index does not represent anything and the values are the actual indices of the nodes to which we are connected now instead of using an array here to store this set of integers we can also use a linked list
and why just array or linked list i would argue that we can also use a tree here in fact a binary search tree is a good way to store a set of values there are ways to keep a binary search tree balanced and if you always keep a binary search tree balanced you can perform search insertion
and deletion all three operations in order of log of number of nodes we will discuss cost of operations for any of these possible ways in some time right now all i want to say is that there are a bunch of ways in which we can store connections of a node for our example graph that we started with
instead of an adjacency matrix we can try to do something like this we are still storing the same information we are still saying that node 0 is connected to nodes 1 2 and 3 node 1 is connected to nodes 0 4 and 5 node 2 is connected to nodes 0 and 6 and so on
but we are consuming a lot less memory here programmatically this adjacency matrix here is just a two-dimensional array of size 8 cross 8 so we are consuming 64 units of space in total but this structure in right does not have all the rows of same size how do you think we can create
such a structure programmatically well it depends in c or c plus plus if you understand pointers then we can create an array of pointers of size 8 and each pointer can point to a one-dimensional array of a different size the 0th pointer can point to an array of size 3 because the 0th node has
three connections and we need an array of size 3 the 1st pointer can point to an array of size 3 because the 1st node also has three connections the 2nd node however has only two connections so the 2nd pointer should point to an array of size 2 and we can go on like this the 7th node has
four connections so the 7th pointer should point to an array of size 4 if you do not understand any of this pointer thing that i'm doing right now you can refer to mycodeschool's lesson titled pointers and arrays the link to which you can find
in the description of this video but think about it the basic idea is that each row can be a one-dimensional array of different size and you can implement this with whatever tools you have in your favorite programming language now let's quickly see what are the pros and cons of this structure in the right
in comparison to the matrix on the left we are definitely consuming less memory with the structure on the right with an adjacency matrix our space consumption is proportional to square of number of vertices while with the second structure space consumption is proportional to number of edges
and we know that most real-world graphs are sparse that is the number of edges is really small in comparison to square of number of vertices square of number of vertices is basically total number of possible edges and for us to reach this number every node should be connected to
every other node in most graphs a node is connected to few other nodes and not all other nodes in this second structure we are avoiding this typical problem of too much space consumption in an a jcnc matrix by only keeping the ones and getting rid of the redundant zeros
here for an undirected graph like this one we would consume exactly two into number of edges units of memory and for a directed graph we would consume exactly e that is number of edges units of memory but all in all space consumption will be proportional to number of edges
or in other words space complexity would be big o of e so the second structure is definitely better in terms of space consumption but let's now also try to compare these two structures for time cost of operations what do you think would be the time cost of finding if two nodes are
connected or not we know that it's constant time or big o of 1 for an adjacency matrix because if we know the start and end point we know the cell in which to look for 0 or 1 but in the second structure we cannot do this we will have to scan through a row so if I ask you something like
can you tell me if there is a connection from node 0 to 7 then you will have to scan this zeroth row you will have to perform a linear search on this zeroth row to find 7 right now all the rows in this structure are sorted you can argue that I can keep all the rows sorted and then
I can perform a binary search which would be a lot less costly that's fine but if you just perform a linear search then in worst case we can have exactly v that is number of vertices cells in a row so if we perform a linear search in worst case we will take
time proportional to number of vertices and of course the time cost would be big o of log v if we would perform a binary search logarithmic run times are really good but to get this here we always need to keep our rows sorted keeping an array always sorted
is costly in other ways and I'll come back to it later for now let's just say that this would cost us big o of v now what do you think would be the time cost of finding all nodes adjacent to a given node that is finding all neighbors of a node well even in case of an adjacency matrix
we now have to scan a complete row so it would be big o of v for the matrix as well as this second structure here because here also in worst case we can have v cells in a row equivalent to having all ones in a row in an adjacency matrix when we try to see the time cost of an operation
we mostly analyze the worst case so for this operation we are big o of v for both so this is the picture that we are getting looks like we are saving some space with this second structure but we are not saving much on time well I would still argue that it's not true when we analyze time
complexity we mostly analyze it for the worst case but what if we already know that we are not going to hit the worst case if we can go back to our previous assumption that we are dealing with a sparse graph that we are dealing with a graph in which a node would be connected to
few other nodes and not all other nodes then the second structure will definitely save us time things would look better once again if we would analyze them in context of a social network I'll set some assumptions let's say we have a billion users in our social network
and the maximum number of friends that anybody has is 10,000 and let's also assume the computational power of our machine let's say our machine or system can scan or read 10 to the power 6 cells in a second this is a reasonable assumption because machines often execute a couple of million
instructions per second now what would be the actual cost of finding all nodes adjacent to a given node in an adjacency matrix well we will have to scan a complete row in the matrix that would be 10 to the power 9 cells because in a matrix we would always have cells equal to number of vertices
and if we would divide this by a million we would get the time in seconds to scan a row of 10 to the power 9 cells we would take 1000 seconds which is also 16.66 minutes this is unreasonably high but with the second structure maximum number of cells in a row
would be 10,000 because the number of cells would exactly be equal to number of connections and this is the maximum number of friends or connections a person in the network has so here we would take 10 to the power 4 upon 10 to the power 6 that is 10 to the power minus
two seconds which is equal to 10 milliseconds 10 milliseconds is not unreasonable now let's try to deduce the cost for the second operation finding if two nodes are connected or not in case of an adjacency matrix we would know exactly what cell to read we would know the memory location of that
specific cell and reading that one cell would cost us one upon 10 to the power 6 seconds which is one microsecond in the second structure we would not know the exact cell we will have to scan a row so once again maximum time taken would be 10 milliseconds
just like finding adjacent nodes so now given this analysis if you would have to design a social network what structure would you choose no brainer isn't it a machine cannot make a user wait for 16 minutes would you ever use such a system milliseconds is fine but minutes it's just
too much so now we know that for most real world graphs this second structure is better because it saves us space as well as time remember i'm saying most and not all because for this logic to be true for my reasoning to be valid graph has to be sparse number of edges has to be significantly
lesser than square of number of vertices so now having analyzed space consumption and time cost of at least two most frequently performed operations looks like this second structure would be better for most graphs well there can be a bunch of operations in a graph and we should account for
all kinds of operations so before making up my mind i would analyze the cost of a few more operations what if after storing this example graph in computer's memory in any of these structures we decide to add a new edge let's say we got a new connection in the graph from a to g
then how do you think we can store this new information this new edge in both these structures the idea here is to assess that once the structures are created in computer's memory how would we do if the graph changes how would we do if a node or edge is inserted or deleted
if a new edge is inserted in case of an adjacency matrix we just need to go to a specific cell and flip the zero at that cell to one in this case we would go to the zeroth row and sixth column and overwrite it with value one and if it was a deletion then we would go to a
specific cell and make the one zero now how about this second structure how would you do it here we need to add a six in the first row and if you have followed this series on data structures then you know that it's not possible
to dynamically increase the size of an existing array this would not be so straightforward we will have to create a new array of size four for the zeroth row then we will have to copy content of the old array write the new value and then wipe off the old one from the memory
it's tricky implementing a dynamic or changing list using arrays this creation of new array and copying of old data is costly and this is the precise reason why we often use another data structure to store dynamic or changing lists and this another data structure is linked list
so why not use a linked list why can't each row be a linked list something like this logically we still have a list here but concrete implementation wise we are no more using an array that we need to change dynamically we are using a linked list it's a lot easier to do
insertions and deletions in a linked list now programmatically to create this kind of structure in computers memory we need to create a linked list for each node to store its neighbors so what we can do is we can create an array of pointers just like what we had done when we were
using arrays the only difference would be that this time each of these pointers would point to head of a linked list that would be a node i have defined node of a linked list here node of a linked list would have two fields one to store data and another to store address
of the next node a0 would be a pointer to head or first node of linked list for a a1 would be a pointer to head of linked list for b and we will go on like a2 for c a3 for d and so on actually i have drawn the linked lists here in the left but i have not drawn the array of pointers
let's say this is my array of pointers now a0 here this one is a pointer to node and it points to the head of linked list containing the neighbors of a let's assume that the head of linked list for a has address 400 so in a0 we would have 400 it's really important to understand what is what
here in this structure this one a0 is a pointer to node and all a pointer does is store an address or reference this one is a node and it has two fields one to store data and another a pointer to node to store the address of next node let's assume that the address of next node in this first
linked list is 450 then we should have 450 here and if the next one is at let's say 500 then we should have 500 in address part of the second node the address in last one would be 0 or null now this kind of structure in which we store information about neighbors of a node
in a linked list is what we typically call an adjacency list what i have here is an adjacency list for an undirected unweighted graph to store a weighted graph in an adjacency list i would have one more field in node to store weight i have written some random weights next to the edges
in this graph and to store this extra information i have added one extra field in node both in logical structure and the code all right now finally with this particular structure that we are calling an adjacency list we should be fine with space consumption
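Putting the pieces above together, a minimal C++ sketch of this linked-list-based adjacency list might look as follows; the struct name, field names and the weight value are illustrative assumptions:

    // node of a neighbour list: which vertex, edge weight, and address of the next node
    struct Node {
        int vertexIndex;
        int weight;
        Node* next;
    };

    int main() {
        const int V = 8;
        Node* adjList[V] = {};                   // array of head pointers, all initially null
        adjList[0] = new Node{1, 12, nullptr};   // hypothetical entry: edge a-b with weight 12
    }                                            // (nodes leaked here for brevity)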
space consumed will be proportional to number of edges and not to square of number of vertices most graphs are sparse and number of edges in most cases is significantly lesser than square of number of vertices ideally for space complexity i should say big o of number of edges
plus number of vertices because storing vertices will also consume some memory but if we can assume that number of vertices will be significantly lesser in comparison to number of edges then we can simply say big o of number of edges but it's always good if we do
the counting right now for time cost of operations the argument that we were earlier making using a sparse graph like social network is still true adjacency list would overall be better than adjacency matrix finally let's come back to the question how flexible are we with this structure
if we need to add a new connection or delete an existing connection and is there any way we can improve upon it well i'll leave this for you to think but i'll give you a hint what if instead of using a linked list to store information about all the neighbors we use a binary search tree
do you think we would do better for some of these operations i think we would do better because the time cost for searching inserting and deleting a neighbor would reduce with this thought i'll sign off, this is it for this lesson, thanks for watching.
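Following that closing hint, here is a small sketch of the binary-search-tree idea using std::set, which is typically implemented as a balanced BST; names and values are illustrative:

    #include <set>
    #include <vector>
    using namespace std;

    int main() {
        int V = 8;
        vector<set<int>> adj(V);               // one balanced BST of neighbour indices per vertex
        adj[0].insert(1);                      // add edge a-b
        adj[1].insert(0);
        bool linked = adj[0].count(1) > 0;     // adjacency check in O(log k), k = degree
        adj[0].erase(1);                       // delete the edge again in O(log k)
        adj[1].erase(0);
        (void)linked;
    }

The trade-off is that iterating over all neighbours is slightly slower in practice than walking an array or linked list, but searching, inserting and deleting a single neighbour all become logarithmic in the degree.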
Arrays store elements in contiguous memory locations allowing fast indexed access (O(1)), but have fixed size which makes insertions and deletions costly. Linked lists consist of nodes linked by pointers, allowing dynamic sizing and efficient insertions/removals especially at the head, but have slower access times (O(n)). Use arrays when you need fast random access and have a known size; choose linked lists when frequent insertions/deletions or dynamic sizing is required.
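A short C++ sketch of this trade-off (container choices are illustrative):

    #include <list>
    #include <vector>
    using namespace std;

    int main() {
        vector<int> arr = {10, 20, 30};   // contiguous storage: arr[1] is O(1) random access
        int x = arr[1];
        list<int> lst = {10, 20, 30};     // doubly linked list: O(1) insertion at the front,
        lst.push_front(5);                // but no O(1) indexed access
        (void)x;
    }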
A stack follows Last-In-First-Out (LIFO) order where elements are added (push) and removed (pop) only from the top, with all main operations running in O(1) time. Common uses include managing function calls and recursion, implementing undo functionality, and evaluating expressions. Stacks can be implemented using arrays or linked lists depending on requirements.
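For example, using the standard stack container in C++ (a minimal sketch):

    #include <stack>
    using namespace std;

    int main() {
        stack<int> s;
        s.push(1);            // push: O(1)
        s.push(2);
        int top = s.top();    // 2, the last element pushed (LIFO)
        s.pop();              // pop: O(1), removes 2
        (void)top;
    }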
Key BST operations include search, insertion, and deletion. On average, these operations run in O(log n) time due to the tree's ordered structure, but in the worst case (e.g., unbalanced tree), the time can degrade to O(n). BST maintains the property that left children are less or equal, and right children are greater than the node, enabling efficient sorted data management.
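A minimal BST search sketch in C++ reflecting this property; the type and function names are illustrative:

    struct TreeNode {
        int key;
        TreeNode* left = nullptr;
        TreeNode* right = nullptr;
    };

    // average O(log n); worst case O(n) on a degenerate, unbalanced tree
    bool bstSearch(const TreeNode* root, int key) {
        while (root != nullptr) {
            if (key == root->key) return true;
            root = (key <= root->key) ? root->left : root->right;   // <= keys go left
        }
        return false;
    }

    int main() {
        TreeNode l{5}, r{20}, root{10, &l, &r};
        bool found = bstSearch(&root, 20);   // true
        (void)found;
    }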
Graphs can be represented by edge lists, adjacency matrices, or adjacency lists. Edge lists are simple but inefficient for neighbor queries. Adjacency matrices allow O(1) adjacency checks but consume O(V²) memory, making them suitable for dense graphs. Adjacency lists store neighbors per vertex, use memory proportional to the number of edges, and are ideal for sparse graphs since adjacency queries take O(k), where k is the node's degree.
Tree traversal methods include pre-order (Root, Left, Right), in-order (Left, Root, Right), post-order (Left, Right, Root), and level-order (breadth-first). In-order traversal is commonly used to retrieve sorted data from binary search trees. Pre-order is useful for copying trees or expression evaluation. Post-order is applied in deleting trees or postfix expression evaluations. Level-order traversal is ideal for shortest path-like operations or breadth-first processing.
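For instance, a recursive in-order traversal in C++ (a minimal sketch):

    #include <iostream>

    struct TreeNode {
        int key;
        TreeNode* left = nullptr;
        TreeNode* right = nullptr;
    };

    // in-order (Left, Root, Right): prints a BST's keys in sorted order
    void inorder(const TreeNode* node) {
        if (node == nullptr) return;
        inorder(node->left);
        std::cout << node->key << ' ';
        inorder(node->right);
    }

    int main() {
        TreeNode l{1}, r{3}, root{2, &l, &r};
        inorder(&root);              // prints: 1 2 3
        std::cout << '\n';
    }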
A queue operates on First-In-First-Out (FIFO) principle, where elements are enqueued at the rear and dequeued from the front, both in O(1) time. Queues are ideal for scenarios requiring ordered processing such as resource scheduling, breadth-first search in graphs, and task buffering. Implementations include arrays (circular buffers) and linked lists depending on use-case and memory considerations.
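For example, with the standard queue container in C++ (a minimal sketch):

    #include <queue>
    using namespace std;

    int main() {
        queue<int> q;
        q.push(1);               // enqueue at the rear: O(1)
        q.push(2);
        int front = q.front();   // 1, the first element enqueued (FIFO)
        q.pop();                 // dequeue from the front: O(1)
        (void)front;
    }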