1
00:00:01,010 --> 00:00:02,510
- [Instructor] Hey there.

2
00:00:02,510 --> 00:00:04,380
I'm assuming that you've already seen

3
00:00:04,380 --> 00:00:06,840
the instructional video

4
00:00:06,840 --> 00:00:08,650
on separate chaining

5
00:00:08,650 --> 00:00:09,960
and so what we're gonna do here

6
00:00:09,960 --> 00:00:12,880
is we're going to
implement separate chaining

7
00:00:12,880 --> 00:00:14,420
in a hash table

8
00:00:14,420 --> 00:00:16,570
and rather than start from scratch,

9
00:00:16,570 --> 00:00:20,060
I'm just gonna modify
the existing hash table

10
00:00:20,060 --> 00:00:21,150
that we've been working with.

11
00:00:21,150 --> 00:00:25,830
If you'll recall, we first
built a very naive version

12
00:00:25,830 --> 00:00:28,630
of hash table that just
worked with strings

13
00:00:28,630 --> 00:00:32,070
and then we added to that
HashTable class the ability

14
00:00:32,070 --> 00:00:33,900
to handle objects

15
00:00:33,900 --> 00:00:37,170
by including an additional
getKey function.

16
00:00:37,170 --> 00:00:38,270
And what we're gonna do

17
00:00:38,270 --> 00:00:40,400
is we're just gonna
modify that hash table,

18
00:00:40,400 --> 00:00:42,900
again to support separate chaining.

19
00:00:42,900 --> 00:00:43,733
And I think you'll find

20
00:00:43,733 --> 00:00:46,500
that these are pretty
straightforward changes.

21
00:00:46,500 --> 00:00:50,710
Our HashTable class isn't
going to change too much,

22
00:00:50,710 --> 00:00:55,710
except now, instead of
having just a vector

23
00:00:57,780 --> 00:00:59,810
of optional Hashables,

24
00:00:59,810 --> 00:01:04,810
we're going to have a vector
of vectors of Hashables, okay?

25
00:01:05,850 --> 00:01:07,850
So let's make the first change.

26
00:01:07,850 --> 00:01:10,080
We're gonna change this standard vector

27
00:01:10,080 --> 00:01:13,950
of optional Hashables to a vector

28
00:01:18,250 --> 00:01:22,200
of vector of Hashables, okay?

29
00:01:22,200 --> 00:01:25,050
And that's the fundamental change

30
00:01:25,050 --> 00:01:26,950
in the underlying data structure.

31
00:01:26,950 --> 00:01:29,873
Now we have a vector of buckets basically.

32
00:01:30,800 --> 00:01:32,880
And as you might expect,

33
00:01:32,880 --> 00:01:35,700
we're going to have to change our insert,

34
00:01:35,700 --> 00:01:39,543
find and remove functions, okay?

35
00:01:41,150 --> 00:01:43,510
Right now, as you'll recall,

36
00:01:43,510 --> 00:01:47,050
we're just overwriting the
value in the insert function.

37
00:01:47,050 --> 00:01:49,560
So if we have a hash collision,

38
00:01:49,560 --> 00:01:51,300
we just overwriting the value.

39
00:01:51,300 --> 00:01:53,320
That we don't want to happen.

40
00:01:53,320 --> 00:01:56,920
So what we're gonna wanna
do is we're going to want

41
00:01:56,920 --> 00:02:00,440
to search through the bucket

42
00:02:00,440 --> 00:02:03,330
to see if that value already exists

43
00:02:03,330 --> 00:02:04,600
and if it does not,

44
00:02:04,600 --> 00:02:06,600
then we're going to insert a new object.

45
00:02:08,550 --> 00:02:10,840
So we're gonna leverage,

46
00:02:10,840 --> 00:02:13,700
notice that we have
this find function here.

47
00:02:13,700 --> 00:02:15,130
Now, we're gonna modify that next

48
00:02:15,130 --> 00:02:19,430
but we're going to behave as
if it were already changed.

49
00:02:19,430 --> 00:02:22,990
So what we're gonna do is
we're gonna put in a test.

50
00:02:22,990 --> 00:02:27,713
And that is if we don't find that key,

51
00:02:33,450 --> 00:02:34,283
then

52
00:02:35,960 --> 00:02:37,620
we're going to insert our object

53
00:02:37,620 --> 00:02:39,883
by pushing it onto the vector.

54
00:02:42,050 --> 00:02:44,633
So table index.

55
00:02:45,900 --> 00:02:46,733
Which vector?

56
00:02:46,733 --> 00:02:48,093
The one at that index.

57
00:02:49,170 --> 00:02:52,413
Push_back item.

58
00:02:53,750 --> 00:02:54,968
Okay?

59
00:02:54,968 --> 00:02:55,893
So see what we've done here?

60
00:02:56,920 --> 00:02:59,610
We're still getting the key,

61
00:02:59,610 --> 00:03:03,000
we're still calculating the
index where it should go

62
00:03:03,000 --> 00:03:04,253
using our hash function.

63
00:03:05,730 --> 00:03:07,940
But now before we insert,

64
00:03:07,940 --> 00:03:09,750
we're going to look to see

65
00:03:09,750 --> 00:03:11,940
if that key already exists

66
00:03:12,910 --> 00:03:16,850
and if it does, then we're
not gonna do anything

67
00:03:16,850 --> 00:03:18,720
and if it doesn't already exists,

68
00:03:18,720 --> 00:03:21,500
we're going to add it to the bucket

69
00:03:21,500 --> 00:03:23,783
and of course, this line goes, let's go.

70
00:03:26,490 --> 00:03:28,700
So that's our insert function.

71
00:03:28,700 --> 00:03:31,373
That's the only change we needed to make.

72
00:03:32,340 --> 00:03:33,993
Now let's do our find function.

73
00:03:35,060 --> 00:03:36,973
So what's going on here?

74
00:03:37,890 --> 00:03:38,723
Let's see here.

75
00:03:38,723 --> 00:03:42,050
We're getting a string that's a key.

76
00:03:42,050 --> 00:03:45,000
We're calculating the index from the key

77
00:03:45,000 --> 00:03:46,500
in the tableSize.

78
00:03:46,500 --> 00:03:48,090
That looks good.

79
00:03:48,090 --> 00:03:51,960
But then we're gonna want
to loop through the items

80
00:03:51,960 --> 00:03:53,540
at that index

81
00:03:53,540 --> 00:03:56,470
and check to see if that key exists.

82
00:03:56,470 --> 00:03:59,710
So rather than just what we're doing here

83
00:03:59,710 --> 00:04:02,240
is getting the key of whatever's sitting

84
00:04:02,240 --> 00:04:05,130
at that index, assuming
that it's just one element

85
00:04:05,130 --> 00:04:07,700
and not a vector, we're
performing this comparison

86
00:04:07,700 --> 00:04:10,060
but here we need to work with the contents

87
00:04:10,060 --> 00:04:12,920
of a bucket, which may
be more than one element.

88
00:04:12,920 --> 00:04:17,573
So let's create a for loop.

89
00:04:19,810 --> 00:04:22,353
Int i equals zero.

90
00:04:23,480 --> 00:04:28,480
I, oops, i less than
table at this index size.

91
00:04:34,780 --> 00:04:37,640
Now, remember, we're
getting the size of a bucket

92
00:04:37,640 --> 00:04:39,913
at this index, okay?

93
00:04:40,862 --> 00:04:42,029
++i.

94
00:04:44,660 --> 00:04:46,590
And then what we're gonna wanna do

95
00:04:46,590 --> 00:04:49,730
is we're going to want to
check to see if it's the item

96
00:04:49,730 --> 00:04:50,640
that we're looking for.

97
00:04:50,640 --> 00:04:52,993
So if key,

98
00:04:53,890 --> 00:04:58,310
and this is gonna replace this
code that I saved here below

99
00:04:58,310 --> 00:05:00,500
just for comparison purposes.

100
00:05:00,500 --> 00:05:04,890
If key equals, I'm sorry, if key

101
00:05:06,030 --> 00:05:07,447
equals getKey,

102
00:05:09,890 --> 00:05:13,340
now before, we were using optionals here

103
00:05:15,380 --> 00:05:16,900
and de-referencing a pointer

104
00:05:16,900 --> 00:05:20,733
to this value here that's
stored at table index,

105
00:05:21,590 --> 00:05:23,560
we're not gonna do that anymore.

106
00:05:23,560 --> 00:05:26,710
We're going to use getKey

107
00:05:26,710 --> 00:05:31,710
but we're gonna use table at this index

108
00:05:32,000 --> 00:05:35,560
and then since we have a
loop through the bucket,

109
00:05:35,560 --> 00:05:37,113
which is indexed by i,

110
00:05:38,060 --> 00:05:40,320
this is the ith item in that bucket.

111
00:05:40,320 --> 00:05:44,070
So if this key equals the key

112
00:05:44,070 --> 00:05:46,793
for the ith item in this bucket,

113
00:05:49,470 --> 00:05:52,690
then return

114
00:05:53,900 --> 00:05:57,170
table index

115
00:05:59,620 --> 00:06:00,580
i.

116
00:06:00,580 --> 00:06:04,550
So we're gonna return
that item if we find it

117
00:06:04,550 --> 00:06:07,360
and I have a problem here.

118
00:06:07,360 --> 00:06:08,590
Why is that red?

119
00:06:08,590 --> 00:06:11,840
Oh, I see, I'm missing a closing paren.

120
00:06:11,840 --> 00:06:12,820
Okay.

121
00:06:12,820 --> 00:06:15,423
So that replaces this code.

122
00:06:17,190 --> 00:06:19,420
But we're still gonna return a nullopt

123
00:06:20,550 --> 00:06:21,840
if we don't find it

124
00:06:21,840 --> 00:06:24,270
because this is an optional Hashable.

125
00:06:24,270 --> 00:06:26,610
This is the return type, okay?

126
00:06:26,610 --> 00:06:27,860
So we're gonna leave that

127
00:06:28,930 --> 00:06:30,660
and that's our new find function

128
00:06:30,660 --> 00:06:33,330
and that's what being called by insert,

129
00:06:33,330 --> 00:06:34,440
so when we insert,

130
00:06:34,440 --> 00:06:36,283
we're gonna call find with the key.

131
00:06:37,380 --> 00:06:38,690
If it finds the object,

132
00:06:38,690 --> 00:06:40,970
it's gonna return the object

133
00:06:40,970 --> 00:06:44,860
and then this will evaluate to false

134
00:06:44,860 --> 00:06:47,300
and we won't insert the item.

135
00:06:47,300 --> 00:06:49,470
Otherwise it'll return nullopt

136
00:06:49,470 --> 00:06:52,320
and this will evaluate to true

137
00:06:52,320 --> 00:06:54,873
and then will add the item, okay?

138
00:06:57,480 --> 00:06:59,800
Now we have to modify or remove

139
00:06:59,800 --> 00:07:02,760
and the modifications for remove

140
00:07:02,760 --> 00:07:05,030
are going to be very similar

141
00:07:05,030 --> 00:07:08,590
to what we have for find.

142
00:07:08,590 --> 00:07:10,230
So I'm gonna do what I did before.

143
00:07:10,230 --> 00:07:12,280
I'm just gonna leave the old code there,

144
00:07:12,280 --> 00:07:15,460
so you can see what I'm changing and why

145
00:07:15,460 --> 00:07:17,210
and write some new code.

146
00:07:17,210 --> 00:07:21,700
So for int i, this is
gonna be exactly the same,

147
00:07:21,700 --> 00:07:26,670
equals zero, i less than table

148
00:07:29,160 --> 00:07:32,583
index size, the size of this bucket.

149
00:07:34,290 --> 00:07:35,710
Size.

150
00:07:35,710 --> 00:07:36,543
There we go.

151
00:07:37,825 --> 00:07:38,992
++i.

152
00:07:40,410 --> 00:07:41,243
Okay.

153
00:07:42,940 --> 00:07:47,000
And now we're gonna do
this same thing here.

154
00:07:47,000 --> 00:07:50,513
If getKey, so I'm just
gonna copy and paste.

155
00:07:52,170 --> 00:07:54,153
If that key exists,

156
00:07:55,010 --> 00:07:57,890
then we're gonna do
this old swaparoo here.

157
00:07:57,890 --> 00:08:02,234
So before we're
de-referencing this pointer

158
00:08:02,234 --> 00:08:04,150
that was stored here

159
00:08:04,150 --> 00:08:05,710
to get the return item,

160
00:08:05,710 --> 00:08:08,090
now we're going to just return the value

161
00:08:08,090 --> 00:08:11,077
that we find at the appropriate index

162
00:08:11,077 --> 00:08:13,430
within that bucket, okay?

163
00:08:13,430 --> 00:08:14,440
So Hashable

164
00:08:17,185 --> 00:08:18,018
returnItem

165
00:08:20,050 --> 00:08:21,980
equal table index i.

166
00:08:21,980 --> 00:08:23,763
Just copy and paste this.

167
00:08:25,520 --> 00:08:26,353
Okay?

168
00:08:28,190 --> 00:08:30,560
And then we need to remove the item.

169
00:08:30,560 --> 00:08:34,560
Now, I don't know that
we've seen this before.

170
00:08:34,560 --> 00:08:35,630
If we have, forgive me

171
00:08:35,630 --> 00:08:38,300
but if we haven't, then it's all good

172
00:08:38,300 --> 00:08:40,790
and maybe a reminder's okay.

173
00:08:40,790 --> 00:08:43,960
Vectors have an erase function

174
00:08:43,960 --> 00:08:48,960
and what you do is you specify
what you wanna erase, okay?

175
00:08:50,610 --> 00:08:55,090
So you have table index

176
00:08:56,300 --> 00:08:59,400
and that is our bucket

177
00:08:59,400 --> 00:09:02,000
and then we're going to
erase within that bucket

178
00:09:04,180 --> 00:09:07,423
table index begin,

179
00:09:11,000 --> 00:09:12,810
it's the beginning index,

180
00:09:12,810 --> 00:09:16,073
which is gonna be zero plus i.

181
00:09:18,430 --> 00:09:22,010
And this is how we erase an element

182
00:09:22,010 --> 00:09:23,470
within a vector.

183
00:09:23,470 --> 00:09:28,063
We're just calling that
vector's erase method, okay?

184
00:09:29,160 --> 00:09:32,370
And then we return returnItem.

185
00:09:36,440 --> 00:09:39,213
And that replaces this code.

186
00:09:40,240 --> 00:09:43,550
But if we did not find the item,

187
00:09:43,550 --> 00:09:46,683
we're still gonna return nullopt, okay?

188
00:09:47,920 --> 00:09:49,900
Now, we have more thing to change.

189
00:09:49,900 --> 00:09:50,733
What is that?

190
00:09:50,733 --> 00:09:52,670
It's the printTable function

191
00:09:52,670 --> 00:09:54,910
because before, we were assuming

192
00:09:54,910 --> 00:09:58,680
that there was a single value or a nullopt

193
00:09:58,680 --> 00:10:02,450
at each position in our hash table

194
00:10:02,450 --> 00:10:05,950
but now we may have multiple
values within a bucket

195
00:10:05,950 --> 00:10:07,390
at a particular index.

196
00:10:07,390 --> 00:10:10,370
So we're gonna have to make
a similar kind of a change.

197
00:10:10,370 --> 00:10:13,063
And so let's, I'm just gonna copy this,

198
00:10:14,150 --> 00:10:15,800
'cause this is gonna be the same.

199
00:10:17,050 --> 00:10:19,751
Scroll up so you can see
what I'm doing better.

200
00:10:19,751 --> 00:10:20,584
There we go.

201
00:10:20,584 --> 00:10:22,530
Okay, we're still gonna iterate

202
00:10:22,530 --> 00:10:26,710
through our hash table finding
the appropriate buckets.

203
00:10:26,710 --> 00:10:30,163
And then we're going to standard,

204
00:10:31,630 --> 00:10:33,110
oh, actually, I'm just
gonna borrow this code.

205
00:10:33,110 --> 00:10:34,143
This is the same.

206
00:10:35,860 --> 00:10:38,570
But now we have to look
at all the elements

207
00:10:38,570 --> 00:10:39,450
within the bucket.

208
00:10:39,450 --> 00:10:42,110
So we're gonna have an inner loop here.

209
00:10:42,110 --> 00:10:47,110
So for int j equal zero,

210
00:10:49,470 --> 00:10:54,170
j less than table i,

211
00:10:54,170 --> 00:10:56,763
less than the size of that bucket.

212
00:10:59,673 --> 00:11:00,506
++j.

213
00:11:02,020 --> 00:11:02,960
That's our inner loop.

214
00:11:02,960 --> 00:11:06,220
This is iterating through
the items in the bucket.

215
00:11:06,220 --> 00:11:08,710
Now we send a standard out.

216
00:11:08,710 --> 00:11:10,840
Std::cout

217
00:11:13,610 --> 00:11:17,820
table i, that's specifying the bucket

218
00:11:17,820 --> 00:11:22,240
and then j specifies the
item within the bucket.

219
00:11:22,240 --> 00:11:25,250
And then let's put a separator in here.

220
00:11:25,250 --> 00:11:29,730
Just a comma, that should
suffice between records.

221
00:11:29,730 --> 00:11:32,340
And then we need, we'll just borrow this.

222
00:11:32,340 --> 00:11:33,303
Std::cout.

223
00:11:35,930 --> 00:11:37,590
And that replaces this.

224
00:11:37,590 --> 00:11:39,333
So that all goes.

225
00:11:40,740 --> 00:11:45,070
And now we've modified our hash table

226
00:11:45,070 --> 00:11:47,453
so that it supports separate chaining.

227
00:11:48,430 --> 00:11:51,143
Now, let's go back to our demo.

228
00:11:53,910 --> 00:11:55,590
And I don't think we need to change this.

229
00:11:55,590 --> 00:11:57,920
I think we exercise our code the same way

230
00:11:59,150 --> 00:12:04,150
and we'll see that instead of
overwriting dentist with fish

231
00:12:06,860 --> 00:12:08,063
when we run printTable,

232
00:12:08,950 --> 00:12:11,470
we will see that dentist and fish

233
00:12:11,470 --> 00:12:15,760
are both in the same bucket, okay?

234
00:12:15,760 --> 00:12:17,400
'Cause if you recall,

235
00:12:17,400 --> 00:12:21,260
dentist and fish both had the same index

236
00:12:21,260 --> 00:12:24,960
when we calculated the
hornerHash, that was zero.

237
00:12:24,960 --> 00:12:27,740
So everything else should function fine.

238
00:12:27,740 --> 00:12:29,870
So we don't need to change this code

239
00:12:29,870 --> 00:12:32,950
in order to test our modified class.

240
00:12:32,950 --> 00:12:33,873
So let's run that.

241
00:12:38,850 --> 00:12:40,463
Okay, let's take a look.

242
00:12:42,640 --> 00:12:47,640
So now you'll see we
have this hash collision

243
00:12:51,790 --> 00:12:56,650
and now we have two items stored

244
00:12:56,650 --> 00:13:00,360
at index zero where we
had overwritten before.

245
00:13:00,360 --> 00:13:02,290
These others don't have hash collisions

246
00:13:02,290 --> 00:13:04,380
and again, recall that both of these fish

247
00:13:04,380 --> 00:13:07,910
and dentist gave us index zero

248
00:13:07,910 --> 00:13:12,650
and then what we did was we deleted giggle

249
00:13:12,650 --> 00:13:14,600
and so we're verifying that this works.

250
00:13:14,600 --> 00:13:16,840
Giggle was at index 12.

251
00:13:16,840 --> 00:13:19,280
Now at index 12, there's nothing.

252
00:13:19,280 --> 00:13:21,920
And so we know that our code works.

253
00:13:21,920 --> 00:13:26,720
And that is a complete
implementation now of a hash table

254
00:13:27,560 --> 00:13:29,550
with separate chaining.

255
00:13:29,550 --> 00:13:31,680
Code's on Blackboard, have a look.

256
00:13:31,680 --> 00:13:32,923
Thanks. Bye.