1 00:00:00,650 --> 00:00:02,283 - Introduction to graphs. 2 00:00:02,283 --> 00:00:03,757 Introduction to graphs. 3 00:00:03,757 --> 00:00:05,240 I've been so excited. 4 00:00:05,240 --> 00:00:06,800 I'm waiting for this part. 5 00:00:06,800 --> 00:00:08,130 And now it's here. 6 00:00:08,130 --> 00:00:09,110 Introduction to graphs. 7 00:00:09,110 --> 00:00:10,820 I love graphs. Okay? 8 00:00:10,820 --> 00:00:11,653 I really do. 9 00:00:13,520 --> 00:00:15,440 So let's get started. 10 00:00:15,440 --> 00:00:18,390 And to be honest, we've already talked about graphs. 11 00:00:18,390 --> 00:00:19,223 Do you remember? 12 00:00:20,330 --> 00:00:22,780 Trees, trees are graphs. 13 00:00:22,780 --> 00:00:25,680 Special kind of graph, but they're graphs. 14 00:00:25,680 --> 00:00:27,593 And when we were talking about trees, 15 00:00:28,530 --> 00:00:30,530 we introduced the terms node, 16 00:00:30,530 --> 00:00:31,950 or sometimes vertex. 17 00:00:31,950 --> 00:00:33,550 I'll use them interchangeably. 18 00:00:33,550 --> 00:00:36,120 The book talks about nodes. 19 00:00:36,120 --> 00:00:37,990 I'm sort of conditioned to say vertex. 20 00:00:37,990 --> 00:00:40,130 So if I mix them up, please forgive me. 21 00:00:40,130 --> 00:00:41,180 I'll try to say node. 22 00:00:42,590 --> 00:00:44,980 We have edges, edges connect nodes. 23 00:00:44,980 --> 00:00:49,013 Each edge has exactly two end points that are nodes. 24 00:00:50,410 --> 00:00:54,270 We can count the number of edges that a node has. 25 00:00:54,270 --> 00:00:56,480 The number of edges that are incident to the node. 26 00:00:56,480 --> 00:00:58,523 That's called the degree of the node. 27 00:00:59,780 --> 00:01:02,780 There are paths in a graph, just like we have paths 28 00:01:02,780 --> 00:01:07,560 in a tree that we can traverse through a number of vertices, 29 00:01:07,560 --> 00:01:12,560 one or more to get from one point to another. 30 00:01:13,130 --> 00:01:15,970 Paths exist in graphs, just the same. 31 00:01:15,970 --> 00:01:18,610 And we also talked about cycles. 32 00:01:18,610 --> 00:01:22,270 Particularly in the context of the trees, 33 00:01:22,270 --> 00:01:24,360 trees don't have cycles, and that's why we brought them up. 34 00:01:24,360 --> 00:01:26,700 That's one of the conditions of being a trees 35 00:01:26,700 --> 00:01:28,303 is you do not have cycles. 36 00:01:29,490 --> 00:01:34,010 So let's go back to some of those slides about trees. 37 00:01:34,010 --> 00:01:36,203 So here we have a lovely tree. 38 00:01:37,040 --> 00:01:40,550 It's a structure consisting of nodes and edges. 39 00:01:40,550 --> 00:01:44,630 And these are the nodes and these are the edges. 40 00:01:44,630 --> 00:01:45,463 Okay? 41 00:01:46,600 --> 00:01:50,660 And so if we take the set of all possible graphs, 42 00:01:50,660 --> 00:01:53,703 trees are a subset of that set. 43 00:01:54,850 --> 00:01:56,600 What's special about trees? 44 00:01:56,600 --> 00:01:59,060 Well, again, they're, acyclic. 45 00:01:59,060 --> 00:02:00,263 What do I mean by that? 46 00:02:01,250 --> 00:02:03,950 Here is a very bad tree. 47 00:02:03,950 --> 00:02:06,050 It has a cycle. 48 00:02:06,050 --> 00:02:10,730 It goes round in a loop like this here. 49 00:02:10,730 --> 00:02:11,730 And you can pick any pair 50 00:02:11,730 --> 00:02:16,730 of nodes on that yellow loop, say A and M. 51 00:02:16,890 --> 00:02:21,760 And we have more than one path that connects A and M. 52 00:02:21,760 --> 00:02:24,060 So we have a path that goes through B and G. 53 00:02:24,060 --> 00:02:26,670 We have a path that goes through D and H. 54 00:02:26,670 --> 00:02:27,880 That's two paths. 55 00:02:27,880 --> 00:02:29,280 Two is not one. 56 00:02:29,280 --> 00:02:30,660 They're supposed to be exactly one. 57 00:02:30,660 --> 00:02:31,893 So that's a bad tree. 58 00:02:32,860 --> 00:02:36,370 But it is perfectly valid graph. 59 00:02:36,370 --> 00:02:38,190 So if we relax the condition 60 00:02:38,190 --> 00:02:40,360 that there must be exactly one path 61 00:02:40,360 --> 00:02:43,463 between any pair of nodes, then we have a graph. 62 00:02:44,850 --> 00:02:49,850 So all trees are graphs, but not all graphs are trees. 63 00:02:50,310 --> 00:02:54,493 Trees are that special case of acyclic connected graphs. 64 00:02:56,600 --> 00:02:59,020 So graphs may or may not contain cycles. 65 00:02:59,020 --> 00:03:02,063 They don't have to, they may or may not. 66 00:03:02,950 --> 00:03:05,010 And graphs may or may not be connected. 67 00:03:05,010 --> 00:03:08,490 So there may be more than one component in a graph. 68 00:03:08,490 --> 00:03:11,510 So it's perfectly fine to have an isolated vertex 69 00:03:11,510 --> 00:03:16,510 or two little trees that have no edges between them. 70 00:03:16,960 --> 00:03:18,150 That's still a graph. 71 00:03:18,150 --> 00:03:18,983 That's fine. 72 00:03:21,570 --> 00:03:24,560 So we need a little notation here, 73 00:03:24,560 --> 00:03:26,060 and I'll keep it to a minimum. 74 00:03:27,240 --> 00:03:31,560 We denote the set of all nodes by the big letter V. 75 00:03:31,560 --> 00:03:33,660 That stands for vertex. 76 00:03:33,660 --> 00:03:38,150 And we denote the set of all edges with a big letter E. 77 00:03:38,150 --> 00:03:39,440 Okay? 78 00:03:39,440 --> 00:03:42,340 And you'll remember when we talked about sets, 79 00:03:42,340 --> 00:03:46,720 we talked about this notation that looks like absolute value 80 00:03:46,720 --> 00:03:48,390 but it means cardinality of a set 81 00:03:48,390 --> 00:03:50,000 when we're talking about sets. 82 00:03:50,000 --> 00:03:54,030 And so if we take the cardinality of the set V, 83 00:03:54,030 --> 00:03:57,240 that gives us the number of nodes in a graph. 84 00:03:57,240 --> 00:03:59,960 And if we take the cardinality of the set E, 85 00:03:59,960 --> 00:04:02,263 that gives us the number of edges in a graph. 86 00:04:03,387 --> 00:04:06,920 And that's I think all the notation to throw at you today. 87 00:04:06,920 --> 00:04:09,773 So we've got this graph here. 88 00:04:11,290 --> 00:04:14,190 Let's figure out what the vertex set is. 89 00:04:14,190 --> 00:04:16,773 Well, it's A, B, C, D, E, F. 90 00:04:17,800 --> 00:04:18,633 So there we go. 91 00:04:18,633 --> 00:04:21,500 A, B, C, D, E, F, that's the vertex set. 92 00:04:21,500 --> 00:04:22,950 And what's the set of all edges? 93 00:04:22,950 --> 00:04:25,560 Well, we need some way to refer to the edges, 94 00:04:25,560 --> 00:04:27,960 and it's perfectly fine if we just refer to them 95 00:04:27,960 --> 00:04:30,110 by their two end points. 96 00:04:30,110 --> 00:04:32,050 Now this is an undirected graph. 97 00:04:32,050 --> 00:04:34,840 We'll get to directed graphs in a minute, 98 00:04:34,840 --> 00:04:36,600 but it doesn't matter. 99 00:04:36,600 --> 00:04:39,580 We can call this edge here BA or AB. 100 00:04:39,580 --> 00:04:41,840 Same edge, okay? 101 00:04:41,840 --> 00:04:43,700 So we're gonna have a set of edges. 102 00:04:43,700 --> 00:04:48,700 That's AB, BC, CD, CF, BF, FA, and so on. 103 00:04:51,170 --> 00:04:53,840 So there's our edge step. 104 00:04:53,840 --> 00:04:56,480 So what's the cardinality of these two sets? 105 00:04:56,480 --> 00:04:59,630 Well, we got one, two, three, four, five, six elements 106 00:04:59,630 --> 00:05:02,783 in the vertex set, or the edge or the node set. 107 00:05:04,590 --> 00:05:07,523 And we have one, two, three, four, five, six, seven, eight, 108 00:05:07,523 --> 00:05:09,253 nine in the edge set. 109 00:05:10,280 --> 00:05:14,223 So we've got six nodes and nine edges. 110 00:05:15,700 --> 00:05:17,453 Now all of these are graphs. 111 00:05:18,760 --> 00:05:19,740 Two of them are trees. 112 00:05:19,740 --> 00:05:20,983 Which two are trees? 113 00:05:22,380 --> 00:05:24,400 This one's a perfectly good tree. 114 00:05:24,400 --> 00:05:27,000 It's got one root, one leaf. 115 00:05:27,000 --> 00:05:28,170 That's fine. 116 00:05:28,170 --> 00:05:30,743 This is not a tree because it's not connected. 117 00:05:31,960 --> 00:05:36,193 This is not a tree because it has a cycle. 118 00:05:37,170 --> 00:05:40,050 This is a perfectly good tree right here. 119 00:05:40,050 --> 00:05:43,680 And we could take all of these together, all nine 120 00:05:43,680 --> 00:05:46,150 of these nodes, 121 00:05:46,150 --> 00:05:48,983 and that constitutes a perfectly fine graph too. 122 00:05:50,350 --> 00:05:52,653 So again, graphs needn't connected. 123 00:05:54,760 --> 00:05:57,710 Graphs can include self loops. 124 00:05:57,710 --> 00:06:01,390 So it's possible in a graph to have an edge 125 00:06:01,390 --> 00:06:04,420 that connects a node to itself. 126 00:06:04,420 --> 00:06:05,800 We're not gonna worry about those, 127 00:06:05,800 --> 00:06:09,040 many problems ignore or prohibits such loops anyways. 128 00:06:09,040 --> 00:06:11,953 But I just thought I'd mentioned that that's a possibility. 129 00:06:13,500 --> 00:06:15,970 So let's talk about adjacency. 130 00:06:15,970 --> 00:06:17,960 What does it mean to be adjacent? 131 00:06:17,960 --> 00:06:20,800 Well, here, A and B are adjacent. 132 00:06:20,800 --> 00:06:24,263 Why? Because there is an edge that connects them. 133 00:06:25,800 --> 00:06:30,800 A and B share a common edge, that makes them adjacent. 134 00:06:30,890 --> 00:06:31,723 Okay? 135 00:06:33,170 --> 00:06:34,730 There is a direct path 136 00:06:34,730 --> 00:06:36,943 that exists from one node to the other. 137 00:06:40,450 --> 00:06:43,740 Now we can also add direction to our edges. 138 00:06:43,740 --> 00:06:46,400 That complicates things a little bit. 139 00:06:46,400 --> 00:06:50,180 So in the case on the top, we have an undirected edge. 140 00:06:50,180 --> 00:06:53,583 A is adjacent to B, B is adjacent to A. 141 00:06:54,640 --> 00:06:58,970 But in the lower figure, C is adjacent to D 142 00:06:58,970 --> 00:07:02,510 because I can get to D from C. 143 00:07:02,510 --> 00:07:04,370 But D is not adjacent to C. 144 00:07:04,370 --> 00:07:05,203 There's no way back. 145 00:07:05,203 --> 00:07:06,800 It's a one-way street. 146 00:07:06,800 --> 00:07:08,430 That's a directed edge. 147 00:07:08,430 --> 00:07:11,040 We're gonna look at directed graphs in a minute. 148 00:07:11,040 --> 00:07:12,940 Well, this is a directed graph here. 149 00:07:12,940 --> 00:07:14,283 We're gonna look a bigger ones. 150 00:07:15,210 --> 00:07:17,040 So this is an undirected edge. 151 00:07:17,040 --> 00:07:19,120 This is a directed edge. 152 00:07:19,120 --> 00:07:20,740 Again, you can think of the directed edges 153 00:07:20,740 --> 00:07:22,230 as one-way streets. 154 00:07:22,230 --> 00:07:24,790 And we don't mix directed and undirected edges 155 00:07:24,790 --> 00:07:25,623 in the same graph. 156 00:07:25,623 --> 00:07:26,560 It's either one or the other. 157 00:07:26,560 --> 00:07:28,950 So think of these as two separate graphs, 158 00:07:28,950 --> 00:07:31,193 one that's undirected, one that's directed. 159 00:07:32,460 --> 00:07:36,250 And we can always add to a directed graph 160 00:07:36,250 --> 00:07:39,540 so that there are two ways to go. 161 00:07:39,540 --> 00:07:41,150 So this is perfectly fine. 162 00:07:41,150 --> 00:07:42,290 We have two directed edges, 163 00:07:42,290 --> 00:07:45,270 one from C to D and one from D to C. 164 00:07:45,270 --> 00:07:49,530 So in the case on top, C is adjacent to D, 165 00:07:49,530 --> 00:07:51,690 but D is not adjacent to C. 166 00:07:51,690 --> 00:07:53,600 But here, C is adjacent to D 167 00:07:53,600 --> 00:07:55,013 and D is adjacent to C. 168 00:07:56,800 --> 00:08:01,800 Now we talked about the degree of a node, 169 00:08:03,420 --> 00:08:06,253 meaning the number of edges that are incident to it. 170 00:08:07,200 --> 00:08:09,270 But when we have directed graphs, 171 00:08:09,270 --> 00:08:11,210 we can also talk about the in degree 172 00:08:11,210 --> 00:08:12,120 and the out degree. 173 00:08:12,120 --> 00:08:14,520 The in degree, as you might expect, 174 00:08:14,520 --> 00:08:19,520 is the number of edges that flow into that node. 175 00:08:19,640 --> 00:08:23,120 And the out degree is the number of edges 176 00:08:23,120 --> 00:08:25,700 that flow out from that node. 177 00:08:25,700 --> 00:08:26,798 Okay. 178 00:08:26,798 --> 00:08:28,190 So when we talk about in degree or out degree, 179 00:08:28,190 --> 00:08:30,100 which we will later, not in this lecture, 180 00:08:30,100 --> 00:08:32,773 but in a later lecture, that's what that means. 181 00:08:35,250 --> 00:08:39,660 So here, we have a connected cyclic graph, right? 182 00:08:39,660 --> 00:08:40,880 There's a cycle here. 183 00:08:40,880 --> 00:08:42,110 There's a cycle here. 184 00:08:42,110 --> 00:08:43,240 There's a cycle here. 185 00:08:43,240 --> 00:08:45,810 This whole big thing is a cycle. 186 00:08:45,810 --> 00:08:46,643 Okay? 187 00:08:46,643 --> 00:08:47,740 And it's connected. 188 00:08:47,740 --> 00:08:51,550 For any pair of nodes, there's some path that connects them, 189 00:08:51,550 --> 00:08:53,600 or more than one path as the case may be. 190 00:08:56,200 --> 00:09:00,600 Now, this is a directed cyclic graph. 191 00:09:00,600 --> 00:09:02,970 It contains a cycle, but it's not this. 192 00:09:02,970 --> 00:09:05,630 This is not a cycle here because I can go 193 00:09:05,630 --> 00:09:10,080 from E to D, to D to A, to A to B, 194 00:09:10,080 --> 00:09:13,840 but I can't get back to B, at least not directly. 195 00:09:13,840 --> 00:09:16,740 This is a bigger cycle here, all the way around 196 00:09:18,230 --> 00:09:21,520 because I can travel in this cycle. 197 00:09:21,520 --> 00:09:23,743 And then there's also a cycle here. 198 00:09:25,140 --> 00:09:28,303 Okay. So this is a directed cyclic graph. 199 00:09:31,930 --> 00:09:36,233 Well, we can also add weights to our edges. 200 00:09:37,800 --> 00:09:41,280 Weights could represent many different things. 201 00:09:41,280 --> 00:09:42,990 They could be distances. 202 00:09:42,990 --> 00:09:45,383 So if this were like a map, 203 00:09:46,510 --> 00:09:49,450 and we'd abstracted some features from a map, 204 00:09:49,450 --> 00:09:53,000 these might be cities and miles between the cities. 205 00:09:53,000 --> 00:09:55,050 They could be capacities. 206 00:09:55,050 --> 00:09:57,830 Let's say these were again, cities, 207 00:09:57,830 --> 00:10:01,400 and this was the flow of electricity or water 208 00:10:01,400 --> 00:10:03,710 to and from different places 209 00:10:03,710 --> 00:10:06,690 in a network of pipes or a network of wires. 210 00:10:06,690 --> 00:10:08,630 They could be great many other things as well, 211 00:10:08,630 --> 00:10:11,493 costs or transition probabilities and so on. 212 00:10:12,360 --> 00:10:14,210 It all depends on the particular problem 213 00:10:14,210 --> 00:10:15,150 that's being modeled. 214 00:10:15,150 --> 00:10:18,030 But the important thing is that we can associate 215 00:10:18,030 --> 00:10:19,510 some scalar value, 216 00:10:19,510 --> 00:10:24,510 some integer or real number, typically with each edge. 217 00:10:25,350 --> 00:10:28,603 Okay. And that's called a weighted directed graph now. 218 00:10:30,490 --> 00:10:31,510 Here's an example. 219 00:10:31,510 --> 00:10:34,393 This is an undirected graph that has weights. 220 00:10:35,410 --> 00:10:36,750 Showing a concrete example 221 00:10:36,750 --> 00:10:39,560 of towns in Vermont and the distances between them. 222 00:10:39,560 --> 00:10:42,150 So here's Burlington, it's three miles 223 00:10:42,150 --> 00:10:43,750 to Winooski and three miles back 224 00:10:43,750 --> 00:10:46,330 from Winooski to Burlington. 225 00:10:46,330 --> 00:10:48,710 It's 67 miles down to Rutland, 226 00:10:48,710 --> 00:10:52,410 and 67 miles back up from Rutland to Burlington. 227 00:10:52,410 --> 00:10:54,960 39 miles to Montpelier and so on. 228 00:10:54,960 --> 00:10:58,290 So this is a case where the weights 229 00:10:58,290 --> 00:11:00,480 of the edges represent distances. 230 00:11:00,480 --> 00:11:02,513 They don't have to, but this one does. 231 00:11:04,300 --> 00:11:06,090 Now, this is an important point, 232 00:11:06,090 --> 00:11:09,430 and it can confuse people sometimes. 233 00:11:09,430 --> 00:11:11,920 And that is it does not matter. 234 00:11:11,920 --> 00:11:15,000 It does not matter how we draw our graph 235 00:11:15,000 --> 00:11:17,300 on a piece of paper or in a PowerPoint 236 00:11:17,300 --> 00:11:19,623 or keynote presentation. 237 00:11:21,550 --> 00:11:25,683 The orientation of the vertices is immaterial. 238 00:11:26,840 --> 00:11:27,673 Okay. 239 00:11:27,673 --> 00:11:31,720 So what we're looking at here are identical graphs. 240 00:11:31,720 --> 00:11:34,630 The graph on the left is identical 241 00:11:34,630 --> 00:11:36,460 with the graph on the right 242 00:11:36,460 --> 00:11:40,670 because we don't care where the vertices are located 243 00:11:40,670 --> 00:11:41,970 on that 2D plane. 244 00:11:41,970 --> 00:11:43,400 It doesn't matter. 245 00:11:43,400 --> 00:11:48,400 Okay. So what's really important here is these two graphs. 246 00:11:48,430 --> 00:11:50,370 Do they have the same vertex set? 247 00:11:50,370 --> 00:11:52,110 A, B, C, D E F. 248 00:11:52,110 --> 00:11:53,460 Yes, they do. 249 00:11:53,460 --> 00:11:55,290 Do they have the same edges? 250 00:11:55,290 --> 00:11:56,123 Go and check. 251 00:11:56,123 --> 00:11:57,960 Yes, they do. 252 00:11:57,960 --> 00:12:01,000 Are are the edges oriented the same way? 253 00:12:01,000 --> 00:12:06,000 For example, does the edge between A and B go from A to B? 254 00:12:06,290 --> 00:12:07,860 Yeah, sure. It does. 255 00:12:07,860 --> 00:12:09,870 Do they all have the same weights? 256 00:12:09,870 --> 00:12:12,560 Yup. Six, six, nine, nine. 257 00:12:12,560 --> 00:12:13,393 You can check. 258 00:12:14,730 --> 00:12:17,070 So it doesn't matter how we draw them. 259 00:12:17,070 --> 00:12:19,100 These are identical graphs. 260 00:12:19,100 --> 00:12:19,933 Okay. 261 00:12:19,933 --> 00:12:22,163 For all of our purposes, these are identical. 262 00:12:25,430 --> 00:12:30,430 So we've got this graph, and this is a course on algorithms. 263 00:12:30,520 --> 00:12:35,160 So we want to be able to do algorithms with graphs. 264 00:12:35,160 --> 00:12:38,273 So we're gonna need some representation for the graph. 265 00:12:39,210 --> 00:12:41,690 So what's a good data structure to use for a graph? 266 00:12:41,690 --> 00:12:44,220 Well, let's think about what properties a data structure 267 00:12:44,220 --> 00:12:45,150 would have to have. 268 00:12:45,150 --> 00:12:47,980 Want to capture the edge set. 269 00:12:47,980 --> 00:12:51,070 It would want to capture the node set. 270 00:12:51,070 --> 00:12:52,258 You'd want to capture 271 00:12:52,258 --> 00:12:55,690 all the adjacency relationships, right? 272 00:12:55,690 --> 00:12:56,980 If it were a directed graph, 273 00:12:56,980 --> 00:13:00,140 you'd want to capture the orientation of all of the edges. 274 00:13:00,140 --> 00:13:01,420 And if there were a weighted graph, 275 00:13:01,420 --> 00:13:02,840 you want to capture all the weights. 276 00:13:02,840 --> 00:13:06,740 So we have to have a way of doing all those things. 277 00:13:06,740 --> 00:13:07,710 Okay? 278 00:13:07,710 --> 00:13:10,560 And it turns out that there are a lot of ways to do this. 279 00:13:11,530 --> 00:13:15,010 And some approaches are suitable in some use cases, 280 00:13:15,010 --> 00:13:16,673 and some suitable in others. 281 00:13:17,910 --> 00:13:20,280 Three common ones are the adjacency list, 282 00:13:20,280 --> 00:13:23,443 the adjacency matrix and the incidents matrix. 283 00:13:24,450 --> 00:13:26,790 But we're only gonna talk about the first two. 284 00:13:26,790 --> 00:13:29,583 I'm gonna save incidence matrix for another lecture. 285 00:13:31,490 --> 00:13:35,310 So let's take a look and see what is an adjacency list. 286 00:13:35,310 --> 00:13:39,010 So we've got this graph over here on the left, 287 00:13:39,010 --> 00:13:40,000 and we're gonna build 288 00:13:40,000 --> 00:13:43,190 our adjacency list over here on the right. 289 00:13:43,190 --> 00:13:46,560 And notice that we have a place, an entry 290 00:13:46,560 --> 00:13:51,163 for each vertex, each node in our graph, A through F. 291 00:13:53,010 --> 00:13:55,140 And so it's pretty simple. 292 00:13:55,140 --> 00:13:56,420 We go take a look at A. 293 00:13:56,420 --> 00:13:58,050 We say what's adjacent to A? 294 00:13:58,050 --> 00:13:59,060 B and D. 295 00:13:59,060 --> 00:14:00,760 So guess what goes there? B and D. 296 00:14:01,710 --> 00:14:04,420 All right, what's adjacent to B? 297 00:14:04,420 --> 00:14:06,703 A, E, and C. 298 00:14:08,720 --> 00:14:10,560 AC, and you go there. 299 00:14:10,560 --> 00:14:12,313 What's adjacent to C? 300 00:14:13,240 --> 00:14:14,570 B, E, and F. 301 00:14:14,570 --> 00:14:16,230 B, E, and F. 302 00:14:16,230 --> 00:14:17,063 And so on. 303 00:14:20,260 --> 00:14:22,500 And so that's our adjacency list. 304 00:14:22,500 --> 00:14:24,050 And from this adjacency list, 305 00:14:24,050 --> 00:14:25,653 we could reconstruct our graph. 306 00:14:26,990 --> 00:14:28,040 How would we do that? 307 00:14:29,350 --> 00:14:31,563 Well, let's erase our graph and do it. 308 00:14:32,480 --> 00:14:34,760 So if we just look down this column, 309 00:14:34,760 --> 00:14:37,940 we see what our node set is. 310 00:14:37,940 --> 00:14:40,240 We know we've got these six nodes, 311 00:14:40,240 --> 00:14:42,620 so let's go ahead and draw those. 312 00:14:42,620 --> 00:14:44,210 And I'm, yes, I'm drawing them 313 00:14:44,210 --> 00:14:46,760 in exactly the same place as they were in before. 314 00:14:46,760 --> 00:14:50,840 But again, it doesn't really matter how you draw the graph. 315 00:14:50,840 --> 00:14:52,390 If I position these differently, 316 00:14:52,390 --> 00:14:53,940 we'd still get the same result. 317 00:14:55,560 --> 00:14:57,290 So now what do we do? 318 00:14:57,290 --> 00:14:59,270 Now we take a look at the entry for A. 319 00:14:59,270 --> 00:15:01,450 We see that A is adjacent to B and D. 320 00:15:01,450 --> 00:15:02,460 So what do we do? 321 00:15:02,460 --> 00:15:06,150 We draw an edge between A and B, 322 00:15:06,150 --> 00:15:07,370 in between A and D. 323 00:15:07,370 --> 00:15:10,260 And now we check A off the list. 324 00:15:10,260 --> 00:15:12,010 Now we move on to B. 325 00:15:12,010 --> 00:15:13,820 What's on the adjacency list for B? 326 00:15:13,820 --> 00:15:14,703 A, C, and E. 327 00:15:15,882 --> 00:15:18,890 So we've already got this edge drawn here, 328 00:15:18,890 --> 00:15:20,290 so we don't need to draw it again, 329 00:15:20,290 --> 00:15:22,450 but we need do need to draw the other two edges. 330 00:15:22,450 --> 00:15:25,163 So let's put them in and check B off the list. 331 00:15:26,280 --> 00:15:28,190 What's adjacent to C? 332 00:15:28,190 --> 00:15:29,173 B, E, and F. 333 00:15:30,350 --> 00:15:33,520 See, I got, I got this edge drawn already, 334 00:15:33,520 --> 00:15:35,420 but I need to draw these two edges. 335 00:15:35,420 --> 00:15:40,420 So let's do that and check that off the list. 336 00:15:40,430 --> 00:15:42,760 And you just proceed down the list. 337 00:15:42,760 --> 00:15:45,340 And by the time you get the poor, old F here 338 00:15:45,340 --> 00:15:48,030 at the end of the list, you've already drawn all the edges. 339 00:15:48,030 --> 00:15:53,030 So you can see there's some redundancy in an adjacency list. 340 00:15:53,600 --> 00:15:55,190 And at this point, there's nothing else to do. 341 00:15:55,190 --> 00:15:56,023 So we're done. 342 00:15:58,360 --> 00:16:01,593 So we've recovered our graph from the adjacency list. 343 00:16:02,970 --> 00:16:04,893 How much space does this take? 344 00:16:06,800 --> 00:16:10,910 If we were to analyze this in terms of the space complexity, 345 00:16:10,910 --> 00:16:15,910 the worst case would be if we had a number of vertices 346 00:16:17,770 --> 00:16:21,040 and every vertex was connected to every other vertex, 347 00:16:21,040 --> 00:16:23,800 or every node was connected to every other node. 348 00:16:23,800 --> 00:16:27,830 In which case, we'd have a space requirement of big O 349 00:16:27,830 --> 00:16:31,560 of V plus E, or I should say the cardinality of V 350 00:16:31,560 --> 00:16:33,363 and the cardinality of E, okay? 351 00:16:34,420 --> 00:16:37,260 What's the complexity of finding 352 00:16:37,260 --> 00:16:39,140 if two vertices are adjacent? 353 00:16:39,140 --> 00:16:40,000 Well, how would I do that? 354 00:16:40,000 --> 00:16:44,600 Let's say, I want to know, is E adjacent to D? 355 00:16:44,600 --> 00:16:47,940 Well I'd go look up the entry for E, 356 00:16:47,940 --> 00:16:49,890 and then I'd run across here, 357 00:16:49,890 --> 00:16:53,270 and I'd run through this list, which might be a linked list. 358 00:16:53,270 --> 00:16:56,070 And if I encountered D in that list, 359 00:16:56,070 --> 00:16:58,893 I would know that E is adjacent to D. 360 00:17:01,400 --> 00:17:03,960 I wouldn't need to go check the entry for D 361 00:17:03,960 --> 00:17:07,000 and make sure it's there also 362 00:17:07,000 --> 00:17:10,630 because if the adjacency lists is properly constructed, 363 00:17:10,630 --> 00:17:15,370 then the list really represents the edge twice. 364 00:17:15,370 --> 00:17:16,547 DE and ED. 365 00:17:17,930 --> 00:17:20,580 So either one we check, it gives us the right answer. 366 00:17:22,520 --> 00:17:26,340 So the worst case there is we have to run through 367 00:17:26,340 --> 00:17:30,960 an entire list here that could be all the vertices. 368 00:17:30,960 --> 00:17:34,640 So the worst case for a query like that is big O 369 00:17:34,640 --> 00:17:36,260 of the cardinality of V, 370 00:17:36,260 --> 00:17:38,060 the number of vertices that we have. 371 00:17:39,470 --> 00:17:41,290 Now, let's look at another data structure 372 00:17:41,290 --> 00:17:42,470 for representing a graph. 373 00:17:42,470 --> 00:17:45,270 This is called the adjacency matrix. 374 00:17:45,270 --> 00:17:48,200 And here we have a matrix with the rows and columns labeled 375 00:17:48,200 --> 00:17:49,140 with the node labels. 376 00:17:49,140 --> 00:17:52,520 So we have all these columns, A through F, 377 00:17:52,520 --> 00:17:54,283 and all these rows, A through F. 378 00:17:55,340 --> 00:17:57,480 And so with six nodes in our graph, 379 00:17:57,480 --> 00:18:00,040 we'd have a six by six matrix. 380 00:18:00,040 --> 00:18:04,590 If we have cardinality of V nodes, 381 00:18:04,590 --> 00:18:06,680 then we're going to have cardinality 382 00:18:06,680 --> 00:18:11,223 of V squared cells in our matrix. 383 00:18:12,090 --> 00:18:13,340 So what do we do? 384 00:18:13,340 --> 00:18:18,340 We put a zero in each entry of our matrix, 385 00:18:19,740 --> 00:18:24,290 where the two nodes are not adjacent. 386 00:18:24,290 --> 00:18:26,640 And if they are adjacent, then we put a one. 387 00:18:26,640 --> 00:18:29,640 So M here represents the matrix. 388 00:18:29,640 --> 00:18:34,030 I and J subscripts indicate the row and the column. 389 00:18:34,030 --> 00:18:39,030 And so to take an example, if we have A is adjacent to B, 390 00:18:42,310 --> 00:18:45,700 then we'd put a one here. 391 00:18:45,700 --> 00:18:47,430 Okay. So let's do that. 392 00:18:47,430 --> 00:18:48,480 Let's build this out. 393 00:18:49,660 --> 00:18:54,210 So this first row, A is not adjacent to A 394 00:18:54,210 --> 00:18:55,460 because there's no self-loop. 395 00:18:55,460 --> 00:18:57,770 We're going to ignore those for now. 396 00:18:57,770 --> 00:18:59,400 A is adjacent to B. 397 00:18:59,400 --> 00:19:00,920 It's not adjacent to C, 398 00:19:00,920 --> 00:19:02,590 but it is adjacent to D, 399 00:19:02,590 --> 00:19:05,703 it's not adjacent to E and it is not adjacent to F. 400 00:19:07,890 --> 00:19:09,360 Next row. 401 00:19:09,360 --> 00:19:14,360 This is what's adjacent to B, A, and C and E. 402 00:19:15,310 --> 00:19:16,763 A, C, and E, okay? 403 00:19:18,890 --> 00:19:22,033 And similarly for the other rows in our adjacency matrix. 404 00:19:27,880 --> 00:19:29,280 Now, there's some interesting properties 405 00:19:29,280 --> 00:19:30,883 of the adjacency matrix. 406 00:19:32,000 --> 00:19:35,350 The sum of any row or column gives the degree 407 00:19:35,350 --> 00:19:36,720 of the corresponding nodes. 408 00:19:36,720 --> 00:19:40,840 So if I take the sum of this column here for B, 409 00:19:40,840 --> 00:19:45,020 I get three, and sure enough, the degree of B is three. 410 00:19:45,020 --> 00:19:47,690 There are three edges incident to be. 411 00:19:47,690 --> 00:19:49,960 If I do the same for D, 412 00:19:49,960 --> 00:19:51,710 I sum that column, I get two. 413 00:19:51,710 --> 00:19:54,720 And sure enough, the degree of D is two. 414 00:19:54,720 --> 00:19:58,290 So that's a nice property for our adjacency matrix. 415 00:19:58,290 --> 00:20:00,720 And if we sum over all of the elements in the matrix 416 00:20:00,720 --> 00:20:01,553 and divide by two, 417 00:20:01,553 --> 00:20:03,770 we get the total number of edges in our graph. 418 00:20:08,230 --> 00:20:11,640 Now you may have noticed that this table is symmetric 419 00:20:11,640 --> 00:20:12,700 about the diagonal. 420 00:20:12,700 --> 00:20:16,040 So this upper triangle, this is called the upper triangle 421 00:20:16,040 --> 00:20:18,700 of a matrix, and the lower triangle, 422 00:20:18,700 --> 00:20:20,310 are mirror images of one another. 423 00:20:20,310 --> 00:20:22,273 This graph is symmetric. 424 00:20:23,610 --> 00:20:25,880 Because in an undirected graph, 425 00:20:25,880 --> 00:20:27,180 if A is adjacent to B, 426 00:20:27,180 --> 00:20:28,260 B as adjacent to A. 427 00:20:28,260 --> 00:20:31,830 So here's a one here, and there's gotta be a one here 428 00:20:32,670 --> 00:20:34,790 because it's an undirected graph. 429 00:20:34,790 --> 00:20:36,350 So in the case of undirected graphs, 430 00:20:36,350 --> 00:20:39,873 we would expect our adjacency matrix to be symmetric. 431 00:20:41,370 --> 00:20:45,370 So we could really ignore the lower triangle 432 00:20:45,370 --> 00:20:46,850 of the adjacency matrix. 433 00:20:46,850 --> 00:20:49,290 All of the information about the graph 434 00:20:49,290 --> 00:20:51,420 is encoded in the upper triangle here. 435 00:20:51,420 --> 00:20:54,070 So if we wanted to, we could ignore the other values. 436 00:20:55,110 --> 00:20:58,100 And you'll also notice that since this graph 437 00:20:58,100 --> 00:20:59,910 has no self-loops 438 00:20:59,910 --> 00:21:01,510 and then the ones that we're gonna look at, 439 00:21:01,510 --> 00:21:03,310 they're not going to have self loops, 440 00:21:03,310 --> 00:21:06,010 we expect to see zeros along the diagonal. 441 00:21:06,010 --> 00:21:09,043 So no node is connected to itself. 442 00:21:12,420 --> 00:21:17,420 Now, we see that this has a space complexity of big O 443 00:21:17,810 --> 00:21:19,760 of the cardinality of V squared 444 00:21:19,760 --> 00:21:22,660 because this is V. 445 00:21:22,660 --> 00:21:24,770 This is V, the cardinality of V. 446 00:21:24,770 --> 00:21:27,170 And so we take the product and we get V squared. 447 00:21:28,150 --> 00:21:33,150 But what time does it take to perform a query to see 448 00:21:33,420 --> 00:21:37,020 if E and B are adjacent? 449 00:21:37,020 --> 00:21:40,310 We just look up EB, and we get a one. 450 00:21:40,310 --> 00:21:41,980 So we know they're adjacent. 451 00:21:41,980 --> 00:21:43,310 And that's fast. 452 00:21:43,310 --> 00:21:46,230 Just like looking up a value in a vector 453 00:21:46,230 --> 00:21:48,670 takes place in constant time, 454 00:21:48,670 --> 00:21:53,440 looking up in a matrix takes place in constant time. 455 00:21:53,440 --> 00:21:55,120 So this is a constant time operation. 456 00:21:55,120 --> 00:21:59,120 So that's a lot faster than the adjacency list. 457 00:21:59,120 --> 00:22:01,160 It takes more space, 458 00:22:01,160 --> 00:22:03,300 and it may not be as efficient in terms of space, 459 00:22:03,300 --> 00:22:05,990 especially if you have a lot of vertices 460 00:22:05,990 --> 00:22:07,640 and not a lot of edges, 461 00:22:07,640 --> 00:22:09,733 you get a what's called a sparse matrix, 462 00:22:10,570 --> 00:22:12,220 but the lookup is very fast. 463 00:22:12,220 --> 00:22:13,993 So that's nice. 464 00:22:15,640 --> 00:22:19,150 Let's add the complication of directed graphs. 465 00:22:19,150 --> 00:22:21,770 What do we do with directed graphs? 466 00:22:21,770 --> 00:22:24,390 Well, you'll recall that in this graph, 467 00:22:24,390 --> 00:22:25,570 A is adjacent to B, 468 00:22:25,570 --> 00:22:27,360 but B is not adjacent to A. 469 00:22:27,360 --> 00:22:29,410 So in the case of the adjacency list, 470 00:22:29,410 --> 00:22:31,319 it's just gonna be an abridged version 471 00:22:31,319 --> 00:22:34,650 of the adjacency list that we had before. 472 00:22:34,650 --> 00:22:38,920 Before we would have an adjacency from A to B 473 00:22:38,920 --> 00:22:39,830 and from B to A. 474 00:22:39,830 --> 00:22:42,220 Here, we're just gonna have the one. 475 00:22:42,220 --> 00:22:45,530 So A is only adjacent to B. 476 00:22:45,530 --> 00:22:49,973 It's not adjacent to D because I can't get from A to D. 477 00:22:51,830 --> 00:22:54,030 B is only adjacent to C 478 00:22:54,030 --> 00:22:57,210 because this directed edge only goes to C. 479 00:22:57,210 --> 00:23:00,550 There's no way I can get from B directly from B 480 00:23:00,550 --> 00:23:03,223 to any other node in the graph. 481 00:23:04,950 --> 00:23:08,743 C is only adjacent to E and F, and so on. 482 00:23:10,263 --> 00:23:12,130 D is adjacent to A. 483 00:23:12,130 --> 00:23:13,843 E is adjacent to B and D. 484 00:23:14,796 --> 00:23:16,470 F is adjacent to E. 485 00:23:16,470 --> 00:23:20,020 And so that's our adjacency list for directed graphs. 486 00:23:20,020 --> 00:23:21,950 It's very similar, almost identical. 487 00:23:21,950 --> 00:23:22,783 Okay. 488 00:23:23,990 --> 00:23:27,490 Now what do we do in the case of the adjacency matrix? 489 00:23:27,490 --> 00:23:32,350 Well, we put a zero if they're not adjacent, 490 00:23:32,350 --> 00:23:34,260 and a one if they are adjacent. 491 00:23:34,260 --> 00:23:35,880 Same thing as before except now, 492 00:23:35,880 --> 00:23:38,540 we're just gonna have fewer one entries 493 00:23:38,540 --> 00:23:42,800 because we have fewer adjacencies in this graph 494 00:23:42,800 --> 00:23:45,300 because of its directed nature. 495 00:23:45,300 --> 00:23:47,973 So in the case of A, 496 00:23:47,973 --> 00:23:51,853 A is adjacent only to B and zeros for all the other entries. 497 00:23:52,700 --> 00:23:57,060 B is adjacent only to C and zero for all the other entries. 498 00:23:57,060 --> 00:23:57,893 And so on. 499 00:24:03,170 --> 00:24:07,650 Now, unlike the case of the undirected graph, 500 00:24:07,650 --> 00:24:10,650 we wouldn't typically expect the upper lower triangles 501 00:24:10,650 --> 00:24:12,780 of the graph to be mirror images of one another. 502 00:24:12,780 --> 00:24:14,780 In fact, they usually aren't. 503 00:24:14,780 --> 00:24:17,140 The matrix is not symmetrical 504 00:24:17,140 --> 00:24:22,140 because these directed edges are not symmetrical. 505 00:24:22,300 --> 00:24:25,290 I can only go one way along a directed edge. 506 00:24:25,290 --> 00:24:26,670 I can't go back. 507 00:24:26,670 --> 00:24:28,220 So they're not symmetrical. 508 00:24:28,220 --> 00:24:31,083 And so that's reflected in the adjacency matrix. 509 00:24:32,380 --> 00:24:36,030 Now let's look at the case of weighted directed graphs. 510 00:24:36,030 --> 00:24:38,560 And now here are the adjacency list data structure 511 00:24:38,560 --> 00:24:39,680 gets a little cumbersome 512 00:24:39,680 --> 00:24:41,850 because for each entry in the list, 513 00:24:41,850 --> 00:24:43,913 we also have to include the weight. 514 00:24:45,110 --> 00:24:50,110 So our first entry is adjacent to B with a weight six. 515 00:24:50,760 --> 00:24:52,900 Now we have to put B and six 516 00:24:52,900 --> 00:24:54,723 in its own little capsule there. 517 00:24:56,040 --> 00:25:01,040 And C, B is adjacent to C with a weight of nine. 518 00:25:01,750 --> 00:25:03,323 So now we have C, 9. 519 00:25:05,610 --> 00:25:09,130 C is adjacent to two other nodes. 520 00:25:09,130 --> 00:25:12,260 E and F, E with a weight of two 521 00:25:12,260 --> 00:25:14,023 and F with a weight of eight. 522 00:25:15,260 --> 00:25:17,910 Okay. And so on. 523 00:25:17,910 --> 00:25:20,723 So that's kind of inelegant, I think, 524 00:25:21,700 --> 00:25:24,750 but let's take a look and see how we would do the same thing 525 00:25:24,750 --> 00:25:26,780 with an adjacency matrix. 526 00:25:26,780 --> 00:25:30,580 With an adjacency matrix, we still use a zero 527 00:25:30,580 --> 00:25:34,150 if the nodes are not adjacent. 528 00:25:34,150 --> 00:25:36,360 But if they are adjacent, instead of putting a one, 529 00:25:36,360 --> 00:25:37,710 we put the weight. 530 00:25:37,710 --> 00:25:39,110 We put the weight right in the matrix. 531 00:25:39,110 --> 00:25:44,110 So W of I and J represents the weight of the edge 532 00:25:45,520 --> 00:25:47,360 connecting I and J. 533 00:25:47,360 --> 00:25:49,440 The matrix notation is the same here. 534 00:25:49,440 --> 00:25:51,950 So let's fill in some of these. 535 00:25:51,950 --> 00:25:55,120 We've got A is adjacent to B with weight six. 536 00:25:55,120 --> 00:25:58,950 So we're gonna put a six here and zeros elsewhere. 537 00:25:58,950 --> 00:25:59,783 Let's do that. 538 00:26:00,850 --> 00:26:04,520 B, again, has a weight of nine going to C. 539 00:26:04,520 --> 00:26:06,690 So we're gonna put a nine here. 540 00:26:06,690 --> 00:26:08,023 The rest will be zeros. 541 00:26:09,020 --> 00:26:09,853 And so on. 542 00:26:12,970 --> 00:26:15,050 Again, this is a directed graph. 543 00:26:15,050 --> 00:26:18,123 So you would not expect any symmetry here. 544 00:26:20,090 --> 00:26:22,710 And you can see that the upper and lower triangles 545 00:26:22,710 --> 00:26:24,873 are not mirror images of one another. 546 00:26:27,200 --> 00:26:30,010 So which data structure is best? 547 00:26:30,010 --> 00:26:34,630 Well, it depends, depends on a lot of things. 548 00:26:34,630 --> 00:26:36,250 And sometimes adjacency list 549 00:26:36,250 --> 00:26:38,250 is the right data structure to choose. 550 00:26:38,250 --> 00:26:41,134 And sometimes, the adjacency matrix 551 00:26:41,134 --> 00:26:43,623 is the right data structure to choose. 552 00:26:44,620 --> 00:26:48,550 If you take a look at their relative complexities, 553 00:26:48,550 --> 00:26:50,480 we've already said that the space complexity 554 00:26:50,480 --> 00:26:54,230 of the adjacency list is big O of V plus E, 555 00:26:54,230 --> 00:26:57,250 cardinality of V plus cardinality of E. 556 00:26:57,250 --> 00:26:58,960 And for the adjacency matrix, 557 00:26:58,960 --> 00:27:00,660 it's the cardinality of V squared. 558 00:27:01,735 --> 00:27:06,735 We've seen that the query takes in worst case, 559 00:27:07,620 --> 00:27:11,170 big O of the cardinality of V times in the adjacency list. 560 00:27:11,170 --> 00:27:14,963 And it's a constant time operation for the adjacency matrix. 561 00:27:16,500 --> 00:27:20,120 Adding a vertex is super easy for an adjacency list. 562 00:27:20,120 --> 00:27:21,960 We just pop another entry into the list. 563 00:27:21,960 --> 00:27:26,960 But for an adjacency matrix, if we want to add a vertex, 564 00:27:27,480 --> 00:27:32,480 then we have to rearrange our matrix and add row and column. 565 00:27:32,720 --> 00:27:36,000 So that becomes a big O of cardinality of V squared 566 00:27:36,000 --> 00:27:38,040 kind of an operation. 567 00:27:38,040 --> 00:27:41,260 Adding an edge though, is a constant time operation 568 00:27:41,260 --> 00:27:44,523 for both adjacency lists and adjacency matrix. 569 00:27:45,576 --> 00:27:49,823 So if you know the structure ahead of time, 570 00:27:53,000 --> 00:27:54,860 oftentimes, adjacency matrix 571 00:27:54,860 --> 00:27:57,500 is going to be the better choice. 572 00:27:57,500 --> 00:28:00,720 If you don't know the structure ahead of time, 573 00:28:00,720 --> 00:28:02,100 or you're gonna be making changes, 574 00:28:02,100 --> 00:28:05,730 sometimes adjacency list is the better choice. 575 00:28:05,730 --> 00:28:08,560 But again there's no hard and fast rule of thumb. 576 00:28:08,560 --> 00:28:10,210 So that's all for now. 577 00:28:10,210 --> 00:28:14,030 We're going to cover some supplementary topics 578 00:28:14,030 --> 00:28:15,930 in another lecture, 579 00:28:15,930 --> 00:28:18,960 and then we're gonna move on to some graph algorithms. 580 00:28:18,960 --> 00:28:20,363 That's all for now. Thanks.