1
00:00:01,260 --> 00:00:02,093
- [Instructor] Hi.
2
00:00:02,093 --> 00:00:05,660
This is a continuation of
our introduction to graphs.
3
00:00:05,660 --> 00:00:08,270
This is a sneak preview
into some graph theory
4
00:00:08,270 --> 00:00:10,960
that you'll see in CS 224
5
00:00:10,960 --> 00:00:14,330
and you should treat
this as supplementary.
6
00:00:14,330 --> 00:00:18,530
I'm not going to assess you
on any material in this video,
7
00:00:18,530 --> 00:00:21,330
so treat this as purely optional,
8
00:00:21,330 --> 00:00:24,390
but I think it's interesting material.
9
00:00:24,390 --> 00:00:28,370
And if you do plan on taking
CS 224, which I recommend.
10
00:00:28,370 --> 00:00:30,960
It's the advanced algorithms class,
11
00:00:30,960 --> 00:00:34,800
we call it Algorithm Design and Analysis,
12
00:00:34,800 --> 00:00:36,100
then you'll see some of the concepts
13
00:00:36,100 --> 00:00:38,180
that I'm gonna present here.
14
00:00:38,180 --> 00:00:40,380
And this should help you get up to speed.
15
00:00:40,380 --> 00:00:42,860
But in any event, I hope
you find it interesting,
16
00:00:42,860 --> 00:00:46,990
but you won't be examined
or quizzed or anything else
17
00:00:46,990 --> 00:00:49,380
with regard to this material.
18
00:00:49,380 --> 00:00:54,380
So let's start with these graphs here.
19
00:00:56,050 --> 00:00:57,973
What do these graphs have in common?
20
00:01:01,060 --> 00:01:03,450
Well, these graphs are fully connected.
21
00:01:03,450 --> 00:01:06,830
Each node is connected to every other node
22
00:01:06,830 --> 00:01:08,700
and we have a special term for this.
23
00:01:08,700 --> 00:01:11,110
We call these complete graphs.
24
00:01:11,110 --> 00:01:14,010
And these are so special that
we even have names for them.
25
00:01:15,450 --> 00:01:18,920
We call the complete
graph on two vertices K2,
26
00:01:18,920 --> 00:01:22,703
the complete graph on three
vertices, K3, and so on.
27
00:01:26,200 --> 00:01:30,920
Now, if we can draw a
graph on a 2D surface
28
00:01:30,920 --> 00:01:34,360
without having any of its edges cross,
29
00:01:34,360 --> 00:01:36,603
we call such a graph planar.
30
00:01:37,520 --> 00:01:41,900
So, here in the left, we
see K3 is clearly planar,
31
00:01:41,900 --> 00:01:45,220
because we can draw the three vertices,
32
00:01:45,220 --> 00:01:49,040
we can orient them in
our drawing in such a way
33
00:01:49,040 --> 00:01:53,030
that the three edges
here, none of the cross.
34
00:01:53,030 --> 00:01:55,257
Actually, it's impossible
(speaking faintly).
35
00:01:57,423 --> 00:01:59,010
What about K4?
36
00:01:59,010 --> 00:02:01,960
Here we see edges crossing,
37
00:02:01,960 --> 00:02:05,520
but is this the only way to draw K4?
38
00:02:05,520 --> 00:02:10,320
Is it possible to draw
K4 on a plane surface
39
00:02:10,320 --> 00:02:13,793
in such a way that none
of the edges cross?
40
00:02:14,780 --> 00:02:17,690
And, you know, we could
take this one edge here
41
00:02:17,690 --> 00:02:20,050
and stretch it to the outside
42
00:02:20,050 --> 00:02:23,783
and make a drawing that
looks kinda like this.
43
00:02:24,760 --> 00:02:26,560
And that's okay, that works,
44
00:02:26,560 --> 00:02:30,403
so now we know that K4
is definitely planar.
45
00:02:31,580 --> 00:02:33,930
It's a little more elegant
to draw it this way.
46
00:02:35,230 --> 00:02:37,730
It's sort of a tetrahedral
kind of a relationship.
47
00:02:38,950 --> 00:02:43,510
So this is what we mean when
we say a graph is planar,
48
00:02:43,510 --> 00:02:46,970
is that we can draw it in 2D
49
00:02:46,970 --> 00:02:49,023
without having any edges cross.
50
00:02:51,150 --> 00:02:54,900
K5, on the other hand,
is definitely not planar
51
00:02:54,900 --> 00:02:59,773
and K5 turns up in a lot
of proofs in graph theory.
52
00:03:01,858 --> 00:03:04,730
If K5 is found anywhere on a graph,
53
00:03:04,730 --> 00:03:06,560
then the graph is not planar.
54
00:03:06,560 --> 00:03:11,050
So we could hang all kinds
of other nodes off of this.
55
00:03:11,050 --> 00:03:14,680
If anywhere on the graph there appears K5,
56
00:03:14,680 --> 00:03:16,877
the graph is not planar.
57
00:03:20,318 --> 00:03:23,680
Now, here's something different
that you'll see in CS 224.
58
00:03:23,680 --> 00:03:26,430
Take a look at these two
graphs and what do you notice?
59
00:03:31,860 --> 00:03:36,540
Well, A, B, and C form what's
called an independent set,
60
00:03:36,540 --> 00:03:38,870
that is none of those vertices,
61
00:03:38,870 --> 00:03:41,000
or none of those nodes, I should say,
62
00:03:41,000 --> 00:03:42,950
shares an edge with any of the others.
63
00:03:42,950 --> 00:03:46,400
So we here we see A, B, C.
64
00:03:46,400 --> 00:03:50,000
A is connected to D, F, and E,
65
00:03:50,000 --> 00:03:52,300
but there's no edge that
goes between A and C.
66
00:03:52,300 --> 00:03:55,100
There's no edge that goes between A and B,
67
00:03:55,100 --> 00:03:57,940
and there's no edge that
goes between B and C.
68
00:03:57,940 --> 00:04:01,340
So we call that an independent set.
69
00:04:01,340 --> 00:04:03,620
And the same applies to D, E, and F.
70
00:04:03,620 --> 00:04:07,230
You'll see here D, E, and F
are connected to A, B, and C,
71
00:04:07,230 --> 00:04:09,480
but there's no edge between F and E,
72
00:04:09,480 --> 00:04:10,940
no edge between F and D,
73
00:04:10,940 --> 00:04:13,433
and no edge between D and E.
74
00:04:14,320 --> 00:04:16,333
And to make this more clear,
75
00:04:18,000 --> 00:04:19,600
we get the picture on the right,
76
00:04:20,450 --> 00:04:23,830
where we've segregated the vertices here,
77
00:04:23,830 --> 00:04:25,220
A, B, and C on one side,
78
00:04:25,220 --> 00:04:28,040
and D, E, and F on the other side.
79
00:04:28,040 --> 00:04:30,500
Remember that these are
exactly the same graph,
80
00:04:30,500 --> 00:04:32,120
they're just drawn differently.
81
00:04:32,120 --> 00:04:35,570
So the edge relationships
are exactly the same
82
00:04:35,570 --> 00:04:36,490
in both cases.
83
00:04:36,490 --> 00:04:37,910
We're just drawing them differently
84
00:04:37,910 --> 00:04:42,070
so that we can visualize a
particular characteristic
85
00:04:42,070 --> 00:04:42,903
of this graph.
86
00:04:43,930 --> 00:04:48,400
And so we have this set
containing A, B, and C,
87
00:04:48,400 --> 00:04:50,240
this independent set,
88
00:04:50,240 --> 00:04:54,440
and we have this independent
set containing D, E, and F.
89
00:04:54,440 --> 00:04:58,010
Members of these sets
share no common edges.
90
00:04:58,010 --> 00:05:02,170
And graphs like this we
call bipartite graphs
91
00:05:02,170 --> 00:05:05,320
and we've recently learned
about equivalence relations
92
00:05:05,320 --> 00:05:08,150
and it turns out that
belonging to an independent set
93
00:05:08,150 --> 00:05:09,573
is an equivalence relation.
94
00:05:10,950 --> 00:05:12,650
And where do these turn up?
95
00:05:12,650 --> 00:05:15,890
Well, bipartite graphs
have broad application
96
00:05:15,890 --> 00:05:17,530
and there's all kinds of algorithms
97
00:05:17,530 --> 00:05:18,863
that operate on them.
98
00:05:19,710 --> 00:05:22,210
And one common application
is called matching,
99
00:05:22,210 --> 00:05:24,800
and I won't go into too much detail here,
100
00:05:24,800 --> 00:05:28,410
but here's one example of a matching.
101
00:05:28,410 --> 00:05:30,810
On one side, we have high school students,
102
00:05:30,810 --> 00:05:33,080
and on the other side we have the colleges
103
00:05:33,080 --> 00:05:34,630
to which they've been admitted.
104
00:05:37,010 --> 00:05:39,650
By the same token, we could
have the same students
105
00:05:39,650 --> 00:05:41,490
and the things to which they're allergic.
106
00:05:41,490 --> 00:05:44,290
So one class of algorithms is designed
107
00:05:44,290 --> 00:05:47,600
to find matchings in a graph,
108
00:05:47,600 --> 00:05:49,800
and another class of
algorithms is intended
109
00:05:49,800 --> 00:05:53,450
to produce matchings with
certain desirable properties.
110
00:05:53,450 --> 00:05:56,580
For example, the famous
Gale-Shapley algorithm
111
00:05:58,090 --> 00:05:59,850
produces stable matchings
112
00:05:59,850 --> 00:06:03,970
and it's used to assign medical students
113
00:06:05,239 --> 00:06:06,843
to medical colleges.
114
00:06:08,230 --> 00:06:12,080
But let's conclude this
little sneak preview
115
00:06:12,080 --> 00:06:13,440
and move on to the problem
116
00:06:13,440 --> 00:06:16,853
of how we represent a graph
with some data structure.
117
00:06:17,710 --> 00:06:20,400
Now, in the other lecture,
118
00:06:20,400 --> 00:06:25,400
I introduced the adjacency list
119
00:06:25,830 --> 00:06:27,620
and the adjacency matrix
120
00:06:27,620 --> 00:06:29,750
and I mentioned that there
was such a thing called
121
00:06:29,750 --> 00:06:31,340
an incidence matrix,
122
00:06:31,340 --> 00:06:33,400
but we didn't go into it there.
123
00:06:33,400 --> 00:06:35,150
We're going to go into it here.
124
00:06:35,150 --> 00:06:37,530
So we're gonna look at this
one more data structure,
125
00:06:37,530 --> 00:06:39,330
the incidence matrix.
126
00:06:39,330 --> 00:06:41,700
Now, remember the meaning of incidence.
127
00:06:41,700 --> 00:06:43,630
An edge is incident to a vertex
128
00:06:43,630 --> 00:06:47,010
if the vertex is one of its
endpoints, I should say node.
129
00:06:47,010 --> 00:06:47,900
Let me say that again.
130
00:06:47,900 --> 00:06:50,590
An edge is incident to a node
131
00:06:50,590 --> 00:06:53,430
if the node is one of its endpoints.
132
00:06:53,430 --> 00:06:56,360
And the incidence matrix requires
133
00:06:56,360 --> 00:06:59,650
that we label edges and nodes.
134
00:06:59,650 --> 00:07:02,830
And so here you'll see, in
the diagram on the left,
135
00:07:02,830 --> 00:07:07,830
we have edge labels e1, e2,
e3, and so on through e8.
136
00:07:09,410 --> 00:07:10,960
And don't confuse these with weights.
137
00:07:10,960 --> 00:07:11,940
There are just labels.
138
00:07:11,940 --> 00:07:15,020
We could call them AB, BC, and so on,
139
00:07:15,020 --> 00:07:18,320
but I've adopted this notation
140
00:07:18,320 --> 00:07:19,640
for this particular incidence,
141
00:07:19,640 --> 00:07:23,013
because I think it's
more concise and elegant.
142
00:07:23,980 --> 00:07:25,140
So these are just labels.
143
00:07:25,140 --> 00:07:27,980
These are labeling this edge
and this edge and this edge,
144
00:07:27,980 --> 00:07:30,753
just like we have labels on our nodes.
145
00:07:31,890 --> 00:07:34,100
So what we do in an incidence matrix
146
00:07:34,100 --> 00:07:38,720
is that we have the edges on one axis,
147
00:07:38,720 --> 00:07:42,150
and these numbers correspond
to the subscripts here
148
00:07:42,150 --> 00:07:43,520
of each of these edges.
149
00:07:43,520 --> 00:07:48,520
So this column is going
to tell us things about e1
150
00:07:48,530 --> 00:07:51,723
and this column is going
to tell us things about e2.
151
00:07:53,080 --> 00:07:58,080
And so we have nodes on the other axis,
152
00:07:58,270 --> 00:08:00,360
these are the node labels,
153
00:08:00,360 --> 00:08:04,430
and what we do is we
put a one in the matrix
154
00:08:04,430 --> 00:08:08,670
if a node is an endpoint of
an edge, and zero otherwise.
155
00:08:08,670 --> 00:08:12,500
So again, this is our matrix notation
156
00:08:12,500 --> 00:08:14,380
with v and e as subscripts.
157
00:08:14,380 --> 00:08:17,780
V is the vertices, e is the edges.
158
00:08:17,780 --> 00:08:19,473
So let's see how that works.
159
00:08:21,490 --> 00:08:23,670
So here we'll proceed by columns,
160
00:08:23,670 --> 00:08:25,610
each column representing an edge.
161
00:08:25,610 --> 00:08:29,300
So let's start with edge one.
162
00:08:29,300 --> 00:08:30,980
So here we have edge one
163
00:08:30,980 --> 00:08:35,980
and we see that it's incident to A and B,
164
00:08:36,040 --> 00:08:39,530
and so for A, we put a
one, for B, we put a one,
165
00:08:39,530 --> 00:08:42,863
and we put zeros on all
the other positions.
166
00:08:43,740 --> 00:08:47,620
For edge two, edge two is incident to A
167
00:08:47,620 --> 00:08:49,200
and incident to D,
168
00:08:49,200 --> 00:08:51,870
so we put a one next to A here
169
00:08:51,870 --> 00:08:54,430
and a one here for D
170
00:08:54,430 --> 00:08:56,730
and all the others are zeros.
171
00:08:56,730 --> 00:09:01,143
And similarly for the rest
of the incidence matrix.
172
00:09:04,350 --> 00:09:08,170
And so finally, we end up at edge eight,
173
00:09:08,170 --> 00:09:10,230
which is incident to E and F.
174
00:09:10,230 --> 00:09:12,730
You see those down in the
lower right-hand corner.
175
00:09:15,450 --> 00:09:17,010
Now, you may wonder what we do
176
00:09:17,010 --> 00:09:18,800
in the case of a directed graph.
177
00:09:18,800 --> 00:09:23,433
Well, in a directed graph,
we use negative one.
178
00:09:24,400 --> 00:09:28,540
We use negative one if
edge e leaves node v,
179
00:09:28,540 --> 00:09:32,490
and we use positive one
if edge e enters node v,
180
00:09:32,490 --> 00:09:33,810
or zero otherwise.
181
00:09:33,810 --> 00:09:37,490
So here we have edge e1
182
00:09:37,490 --> 00:09:41,740
and we would put a negative one
183
00:09:41,740 --> 00:09:45,000
as the entry corresponding to e1 and A,
184
00:09:45,000 --> 00:09:48,220
and a positive one for
the entry corresponding
185
00:09:48,220 --> 00:09:49,910
to e1 and B,
186
00:09:49,910 --> 00:09:54,860
because e1 leaves A and enters B.
187
00:09:55,740 --> 00:09:57,660
Okay, so let's see what
188
00:09:57,660 --> 00:10:00,360
the incidence matrix looks like for those.
189
00:10:00,360 --> 00:10:01,900
Again, we'll proceed by columns,
190
00:10:01,900 --> 00:10:04,530
with each column representing an edge.
191
00:10:04,530 --> 00:10:07,200
So you see here e1, it leaves A,
192
00:10:07,200 --> 00:10:09,320
so we've got a negative one here.
193
00:10:09,320 --> 00:10:11,760
It enters B, so we have
a positive one here,
194
00:10:11,760 --> 00:10:13,660
and all the rest are zeros.
195
00:10:13,660 --> 00:10:17,413
Have to be because each edge
has exactly two components.
196
00:10:19,830 --> 00:10:21,290
And then the same thing.
197
00:10:21,290 --> 00:10:25,220
Here's e2, it leaves D and enters A,
198
00:10:25,220 --> 00:10:27,080
so there's a minus one here
199
00:10:27,080 --> 00:10:29,500
and a positive one here.
200
00:10:29,500 --> 00:10:33,083
And similarly for the rest
of the incidence matrix.
201
00:10:34,990 --> 00:10:37,480
Now, notice here that the columns
202
00:10:37,480 --> 00:10:39,580
in the case of a directed graph,
203
00:10:39,580 --> 00:10:43,210
the columns of an incidence
matrix will sum to zero
204
00:10:44,476 --> 00:10:47,190
and each row gives the
net input to a node.
205
00:10:47,190 --> 00:10:48,640
So if we take a look
206
00:10:48,640 --> 00:10:52,310
and we take the sum here for node A,
207
00:10:52,310 --> 00:10:55,320
we see we have minus one, one, zero,
208
00:10:55,320 --> 00:11:00,013
so there's an equivalent number
of in edges and out edges.
209
00:11:01,160 --> 00:11:03,800
This is not true for B, for example,
210
00:11:03,800 --> 00:11:05,760
we see that we have a net of one.
211
00:11:05,760 --> 00:11:09,007
We have one, negative one, and one.
212
00:11:09,007 --> 00:11:14,007
And so that tells us that
B has one more in edge
213
00:11:14,810 --> 00:11:16,040
than it has out edges.
214
00:11:16,040 --> 00:11:18,610
And if we check, yeah,
that's exactly the case.
215
00:11:18,610 --> 00:11:22,110
We see we have two in
edges and one out edge.
216
00:11:22,110 --> 00:11:24,270
So those are some interesting properties
217
00:11:24,270 --> 00:11:26,220
of the incidence matrix.
218
00:11:26,220 --> 00:11:28,180
And I won't go into the applications here,
219
00:11:28,180 --> 00:11:30,870
I just wanted to show
you the incidence matrix,
220
00:11:30,870 --> 00:11:33,540
that there are other ways
of representing a graph
221
00:11:34,800 --> 00:11:37,680
than just the two that we
saw in the previous lecture,
222
00:11:37,680 --> 00:11:40,770
but there are a number of applications
223
00:11:40,770 --> 00:11:43,920
for which incidence matrix provides
224
00:11:43,920 --> 00:11:47,290
the more efficient data structure.
225
00:11:47,290 --> 00:11:52,290
In general, the complexity
of the incidence matrix
226
00:11:54,150 --> 00:11:58,571
is big O of V times the cardinality of V
227
00:11:58,571 --> 00:12:00,193
times the cardinality of E.
228
00:12:01,770 --> 00:12:05,070
And again, it has some interesting
mathematical properties
229
00:12:05,070 --> 00:12:07,530
and it's used in more
advanced applications,
230
00:12:07,530 --> 00:12:11,150
such as polytopes and
problems in combinatorics.
231
00:12:11,150 --> 00:12:14,470
And if you don't know what
polytopes are and combinatorics,
232
00:12:14,470 --> 00:12:16,460
don't worry now,
233
00:12:16,460 --> 00:12:21,460
but I hope that you see these
things in later classes.
234
00:12:21,990 --> 00:12:24,450
And that concludes this
little sneak preview
235
00:12:24,450 --> 00:12:28,143
for the graph theory that
you might see in CS 224.
236
00:12:29,440 --> 00:12:30,273
Thanks.