1 00:00:01,010 --> 00:00:02,510 - [Instructor] Hey there. 2 00:00:02,510 --> 00:00:04,380 I'm assuming that you've already seen 3 00:00:04,380 --> 00:00:06,840 the instructional video 4 00:00:06,840 --> 00:00:08,650 on separate chaining 5 00:00:08,650 --> 00:00:09,960 and so what we're gonna do here 6 00:00:09,960 --> 00:00:12,880 is we're going to implement separate chaining 7 00:00:12,880 --> 00:00:14,420 in a hash table 8 00:00:14,420 --> 00:00:16,570 and rather than start from scratch, 9 00:00:16,570 --> 00:00:20,060 I'm just gonna modify the existing hash table 10 00:00:20,060 --> 00:00:21,150 that we've been working with. 11 00:00:21,150 --> 00:00:25,830 If you'll recall, we first built a very naive version 12 00:00:25,830 --> 00:00:28,630 of hash table that just worked with strings 13 00:00:28,630 --> 00:00:32,070 and then we added to that HashTable class the ability 14 00:00:32,070 --> 00:00:33,900 to handle objects 15 00:00:33,900 --> 00:00:37,170 by including an additional getKey function. 16 00:00:37,170 --> 00:00:38,270 And what we're gonna do 17 00:00:38,270 --> 00:00:40,400 is we're just gonna modify that hash table, 18 00:00:40,400 --> 00:00:42,900 again to support separate chaining. 19 00:00:42,900 --> 00:00:43,733 And I think you'll find 20 00:00:43,733 --> 00:00:46,500 that these are pretty straightforward changes. 21 00:00:46,500 --> 00:00:50,710 Our HashTable class isn't going to change too much, 22 00:00:50,710 --> 00:00:55,710 except now, instead of having just a vector 23 00:00:57,780 --> 00:00:59,810 of optional Hashables, 24 00:00:59,810 --> 00:01:04,810 we're going to have a vector of vectors of Hashables, okay? 25 00:01:05,850 --> 00:01:07,850 So let's make the first change. 26 00:01:07,850 --> 00:01:10,080 We're gonna change this standard vector 27 00:01:10,080 --> 00:01:13,950 of optional Hashables to a vector 28 00:01:18,250 --> 00:01:22,200 of vector of Hashables, okay? 29 00:01:22,200 --> 00:01:25,050 And that's the fundamental change 30 00:01:25,050 --> 00:01:26,950 in the underlying data structure. 31 00:01:26,950 --> 00:01:29,873 Now we have a vector of buckets basically. 32 00:01:30,800 --> 00:01:32,880 And as you might expect, 33 00:01:32,880 --> 00:01:35,700 we're going to have to change our insert, 34 00:01:35,700 --> 00:01:39,543 find and remove functions, okay? 35 00:01:41,150 --> 00:01:43,510 Right now, as you'll recall, 36 00:01:43,510 --> 00:01:47,050 we're just overwriting the value in the insert function. 37 00:01:47,050 --> 00:01:49,560 So if we have a hash collision, 38 00:01:49,560 --> 00:01:51,300 we just overwriting the value. 39 00:01:51,300 --> 00:01:53,320 That we don't want to happen. 40 00:01:53,320 --> 00:01:56,920 So what we're gonna wanna do is we're going to want 41 00:01:56,920 --> 00:02:00,440 to search through the bucket 42 00:02:00,440 --> 00:02:03,330 to see if that value already exists 43 00:02:03,330 --> 00:02:04,600 and if it does not, 44 00:02:04,600 --> 00:02:06,600 then we're going to insert a new object. 45 00:02:08,550 --> 00:02:10,840 So we're gonna leverage, 46 00:02:10,840 --> 00:02:13,700 notice that we have this find function here. 47 00:02:13,700 --> 00:02:15,130 Now, we're gonna modify that next 48 00:02:15,130 --> 00:02:19,430 but we're going to behave as if it were already changed. 49 00:02:19,430 --> 00:02:22,990 So what we're gonna do is we're gonna put in a test. 50 00:02:22,990 --> 00:02:27,713 And that is if we don't find that key, 51 00:02:33,450 --> 00:02:34,283 then 52 00:02:35,960 --> 00:02:37,620 we're going to insert our object 53 00:02:37,620 --> 00:02:39,883 by pushing it onto the vector. 54 00:02:42,050 --> 00:02:44,633 So table index. 55 00:02:45,900 --> 00:02:46,733 Which vector? 56 00:02:46,733 --> 00:02:48,093 The one at that index. 57 00:02:49,170 --> 00:02:52,413 Push_back item. 58 00:02:53,750 --> 00:02:54,968 Okay? 59 00:02:54,968 --> 00:02:55,893 So see what we've done here? 60 00:02:56,920 --> 00:02:59,610 We're still getting the key, 61 00:02:59,610 --> 00:03:03,000 we're still calculating the index where it should go 62 00:03:03,000 --> 00:03:04,253 using our hash function. 63 00:03:05,730 --> 00:03:07,940 But now before we insert, 64 00:03:07,940 --> 00:03:09,750 we're going to look to see 65 00:03:09,750 --> 00:03:11,940 if that key already exists 66 00:03:12,910 --> 00:03:16,850 and if it does, then we're not gonna do anything 67 00:03:16,850 --> 00:03:18,720 and if it doesn't already exists, 68 00:03:18,720 --> 00:03:21,500 we're going to add it to the bucket 69 00:03:21,500 --> 00:03:23,783 and of course, this line goes, let's go. 70 00:03:26,490 --> 00:03:28,700 So that's our insert function. 71 00:03:28,700 --> 00:03:31,373 That's the only change we needed to make. 72 00:03:32,340 --> 00:03:33,993 Now let's do our find function. 73 00:03:35,060 --> 00:03:36,973 So what's going on here? 74 00:03:37,890 --> 00:03:38,723 Let's see here. 75 00:03:38,723 --> 00:03:42,050 We're getting a string that's a key. 76 00:03:42,050 --> 00:03:45,000 We're calculating the index from the key 77 00:03:45,000 --> 00:03:46,500 in the tableSize. 78 00:03:46,500 --> 00:03:48,090 That looks good. 79 00:03:48,090 --> 00:03:51,960 But then we're gonna want to loop through the items 80 00:03:51,960 --> 00:03:53,540 at that index 81 00:03:53,540 --> 00:03:56,470 and check to see if that key exists. 82 00:03:56,470 --> 00:03:59,710 So rather than just what we're doing here 83 00:03:59,710 --> 00:04:02,240 is getting the key of whatever's sitting 84 00:04:02,240 --> 00:04:05,130 at that index, assuming that it's just one element 85 00:04:05,130 --> 00:04:07,700 and not a vector, we're performing this comparison 86 00:04:07,700 --> 00:04:10,060 but here we need to work with the contents 87 00:04:10,060 --> 00:04:12,920 of a bucket, which may be more than one element. 88 00:04:12,920 --> 00:04:17,573 So let's create a for loop. 89 00:04:19,810 --> 00:04:22,353 Int i equals zero. 90 00:04:23,480 --> 00:04:28,480 I, oops, i less than table at this index size. 91 00:04:34,780 --> 00:04:37,640 Now, remember, we're getting the size of a bucket 92 00:04:37,640 --> 00:04:39,913 at this index, okay? 93 00:04:40,862 --> 00:04:42,029 ++i. 94 00:04:44,660 --> 00:04:46,590 And then what we're gonna wanna do 95 00:04:46,590 --> 00:04:49,730 is we're going to want to check to see if it's the item 96 00:04:49,730 --> 00:04:50,640 that we're looking for. 97 00:04:50,640 --> 00:04:52,993 So if key, 98 00:04:53,890 --> 00:04:58,310 and this is gonna replace this code that I saved here below 99 00:04:58,310 --> 00:05:00,500 just for comparison purposes. 100 00:05:00,500 --> 00:05:04,890 If key equals, I'm sorry, if key 101 00:05:06,030 --> 00:05:07,447 equals getKey, 102 00:05:09,890 --> 00:05:13,340 now before, we were using optionals here 103 00:05:15,380 --> 00:05:16,900 and de-referencing a pointer 104 00:05:16,900 --> 00:05:20,733 to this value here that's stored at table index, 105 00:05:21,590 --> 00:05:23,560 we're not gonna do that anymore. 106 00:05:23,560 --> 00:05:26,710 We're going to use getKey 107 00:05:26,710 --> 00:05:31,710 but we're gonna use table at this index 108 00:05:32,000 --> 00:05:35,560 and then since we have a loop through the bucket, 109 00:05:35,560 --> 00:05:37,113 which is indexed by i, 110 00:05:38,060 --> 00:05:40,320 this is the ith item in that bucket. 111 00:05:40,320 --> 00:05:44,070 So if this key equals the key 112 00:05:44,070 --> 00:05:46,793 for the ith item in this bucket, 113 00:05:49,470 --> 00:05:52,690 then return 114 00:05:53,900 --> 00:05:57,170 table index 115 00:05:59,620 --> 00:06:00,580 i. 116 00:06:00,580 --> 00:06:04,550 So we're gonna return that item if we find it 117 00:06:04,550 --> 00:06:07,360 and I have a problem here. 118 00:06:07,360 --> 00:06:08,590 Why is that red? 119 00:06:08,590 --> 00:06:11,840 Oh, I see, I'm missing a closing paren. 120 00:06:11,840 --> 00:06:12,820 Okay. 121 00:06:12,820 --> 00:06:15,423 So that replaces this code. 122 00:06:17,190 --> 00:06:19,420 But we're still gonna return a nullopt 123 00:06:20,550 --> 00:06:21,840 if we don't find it 124 00:06:21,840 --> 00:06:24,270 because this is an optional Hashable. 125 00:06:24,270 --> 00:06:26,610 This is the return type, okay? 126 00:06:26,610 --> 00:06:27,860 So we're gonna leave that 127 00:06:28,930 --> 00:06:30,660 and that's our new find function 128 00:06:30,660 --> 00:06:33,330 and that's what being called by insert, 129 00:06:33,330 --> 00:06:34,440 so when we insert, 130 00:06:34,440 --> 00:06:36,283 we're gonna call find with the key. 131 00:06:37,380 --> 00:06:38,690 If it finds the object, 132 00:06:38,690 --> 00:06:40,970 it's gonna return the object 133 00:06:40,970 --> 00:06:44,860 and then this will evaluate to false 134 00:06:44,860 --> 00:06:47,300 and we won't insert the item. 135 00:06:47,300 --> 00:06:49,470 Otherwise it'll return nullopt 136 00:06:49,470 --> 00:06:52,320 and this will evaluate to true 137 00:06:52,320 --> 00:06:54,873 and then will add the item, okay? 138 00:06:57,480 --> 00:06:59,800 Now we have to modify or remove 139 00:06:59,800 --> 00:07:02,760 and the modifications for remove 140 00:07:02,760 --> 00:07:05,030 are going to be very similar 141 00:07:05,030 --> 00:07:08,590 to what we have for find. 142 00:07:08,590 --> 00:07:10,230 So I'm gonna do what I did before. 143 00:07:10,230 --> 00:07:12,280 I'm just gonna leave the old code there, 144 00:07:12,280 --> 00:07:15,460 so you can see what I'm changing and why 145 00:07:15,460 --> 00:07:17,210 and write some new code. 146 00:07:17,210 --> 00:07:21,700 So for int i, this is gonna be exactly the same, 147 00:07:21,700 --> 00:07:26,670 equals zero, i less than table 148 00:07:29,160 --> 00:07:32,583 index size, the size of this bucket. 149 00:07:34,290 --> 00:07:35,710 Size. 150 00:07:35,710 --> 00:07:36,543 There we go. 151 00:07:37,825 --> 00:07:38,992 ++i. 152 00:07:40,410 --> 00:07:41,243 Okay. 153 00:07:42,940 --> 00:07:47,000 And now we're gonna do this same thing here. 154 00:07:47,000 --> 00:07:50,513 If getKey, so I'm just gonna copy and paste. 155 00:07:52,170 --> 00:07:54,153 If that key exists, 156 00:07:55,010 --> 00:07:57,890 then we're gonna do this old swaparoo here. 157 00:07:57,890 --> 00:08:02,234 So before we're de-referencing this pointer 158 00:08:02,234 --> 00:08:04,150 that was stored here 159 00:08:04,150 --> 00:08:05,710 to get the return item, 160 00:08:05,710 --> 00:08:08,090 now we're going to just return the value 161 00:08:08,090 --> 00:08:11,077 that we find at the appropriate index 162 00:08:11,077 --> 00:08:13,430 within that bucket, okay? 163 00:08:13,430 --> 00:08:14,440 So Hashable 164 00:08:17,185 --> 00:08:18,018 returnItem 165 00:08:20,050 --> 00:08:21,980 equal table index i. 166 00:08:21,980 --> 00:08:23,763 Just copy and paste this. 167 00:08:25,520 --> 00:08:26,353 Okay? 168 00:08:28,190 --> 00:08:30,560 And then we need to remove the item. 169 00:08:30,560 --> 00:08:34,560 Now, I don't know that we've seen this before. 170 00:08:34,560 --> 00:08:35,630 If we have, forgive me 171 00:08:35,630 --> 00:08:38,300 but if we haven't, then it's all good 172 00:08:38,300 --> 00:08:40,790 and maybe a reminder's okay. 173 00:08:40,790 --> 00:08:43,960 Vectors have an erase function 174 00:08:43,960 --> 00:08:48,960 and what you do is you specify what you wanna erase, okay? 175 00:08:50,610 --> 00:08:55,090 So you have table index 176 00:08:56,300 --> 00:08:59,400 and that is our bucket 177 00:08:59,400 --> 00:09:02,000 and then we're going to erase within that bucket 178 00:09:04,180 --> 00:09:07,423 table index begin, 179 00:09:11,000 --> 00:09:12,810 it's the beginning index, 180 00:09:12,810 --> 00:09:16,073 which is gonna be zero plus i. 181 00:09:18,430 --> 00:09:22,010 And this is how we erase an element 182 00:09:22,010 --> 00:09:23,470 within a vector. 183 00:09:23,470 --> 00:09:28,063 We're just calling that vector's erase method, okay? 184 00:09:29,160 --> 00:09:32,370 And then we return returnItem. 185 00:09:36,440 --> 00:09:39,213 And that replaces this code. 186 00:09:40,240 --> 00:09:43,550 But if we did not find the item, 187 00:09:43,550 --> 00:09:46,683 we're still gonna return nullopt, okay? 188 00:09:47,920 --> 00:09:49,900 Now, we have more thing to change. 189 00:09:49,900 --> 00:09:50,733 What is that? 190 00:09:50,733 --> 00:09:52,670 It's the printTable function 191 00:09:52,670 --> 00:09:54,910 because before, we were assuming 192 00:09:54,910 --> 00:09:58,680 that there was a single value or a nullopt 193 00:09:58,680 --> 00:10:02,450 at each position in our hash table 194 00:10:02,450 --> 00:10:05,950 but now we may have multiple values within a bucket 195 00:10:05,950 --> 00:10:07,390 at a particular index. 196 00:10:07,390 --> 00:10:10,370 So we're gonna have to make a similar kind of a change. 197 00:10:10,370 --> 00:10:13,063 And so let's, I'm just gonna copy this, 198 00:10:14,150 --> 00:10:15,800 'cause this is gonna be the same. 199 00:10:17,050 --> 00:10:19,751 Scroll up so you can see what I'm doing better. 200 00:10:19,751 --> 00:10:20,584 There we go. 201 00:10:20,584 --> 00:10:22,530 Okay, we're still gonna iterate 202 00:10:22,530 --> 00:10:26,710 through our hash table finding the appropriate buckets. 203 00:10:26,710 --> 00:10:30,163 And then we're going to standard, 204 00:10:31,630 --> 00:10:33,110 oh, actually, I'm just gonna borrow this code. 205 00:10:33,110 --> 00:10:34,143 This is the same. 206 00:10:35,860 --> 00:10:38,570 But now we have to look at all the elements 207 00:10:38,570 --> 00:10:39,450 within the bucket. 208 00:10:39,450 --> 00:10:42,110 So we're gonna have an inner loop here. 209 00:10:42,110 --> 00:10:47,110 So for int j equal zero, 210 00:10:49,470 --> 00:10:54,170 j less than table i, 211 00:10:54,170 --> 00:10:56,763 less than the size of that bucket. 212 00:10:59,673 --> 00:11:00,506 ++j. 213 00:11:02,020 --> 00:11:02,960 That's our inner loop. 214 00:11:02,960 --> 00:11:06,220 This is iterating through the items in the bucket. 215 00:11:06,220 --> 00:11:08,710 Now we send a standard out. 216 00:11:08,710 --> 00:11:10,840 Std::cout 217 00:11:13,610 --> 00:11:17,820 table i, that's specifying the bucket 218 00:11:17,820 --> 00:11:22,240 and then j specifies the item within the bucket. 219 00:11:22,240 --> 00:11:25,250 And then let's put a separator in here. 220 00:11:25,250 --> 00:11:29,730 Just a comma, that should suffice between records. 221 00:11:29,730 --> 00:11:32,340 And then we need, we'll just borrow this. 222 00:11:32,340 --> 00:11:33,303 Std::cout. 223 00:11:35,930 --> 00:11:37,590 And that replaces this. 224 00:11:37,590 --> 00:11:39,333 So that all goes. 225 00:11:40,740 --> 00:11:45,070 And now we've modified our hash table 226 00:11:45,070 --> 00:11:47,453 so that it supports separate chaining. 227 00:11:48,430 --> 00:11:51,143 Now, let's go back to our demo. 228 00:11:53,910 --> 00:11:55,590 And I don't think we need to change this. 229 00:11:55,590 --> 00:11:57,920 I think we exercise our code the same way 230 00:11:59,150 --> 00:12:04,150 and we'll see that instead of overwriting dentist with fish 231 00:12:06,860 --> 00:12:08,063 when we run printTable, 232 00:12:08,950 --> 00:12:11,470 we will see that dentist and fish 233 00:12:11,470 --> 00:12:15,760 are both in the same bucket, okay? 234 00:12:15,760 --> 00:12:17,400 'Cause if you recall, 235 00:12:17,400 --> 00:12:21,260 dentist and fish both had the same index 236 00:12:21,260 --> 00:12:24,960 when we calculated the hornerHash, that was zero. 237 00:12:24,960 --> 00:12:27,740 So everything else should function fine. 238 00:12:27,740 --> 00:12:29,870 So we don't need to change this code 239 00:12:29,870 --> 00:12:32,950 in order to test our modified class. 240 00:12:32,950 --> 00:12:33,873 So let's run that. 241 00:12:38,850 --> 00:12:40,463 Okay, let's take a look. 242 00:12:42,640 --> 00:12:47,640 So now you'll see we have this hash collision 243 00:12:51,790 --> 00:12:56,650 and now we have two items stored 244 00:12:56,650 --> 00:13:00,360 at index zero where we had overwritten before. 245 00:13:00,360 --> 00:13:02,290 These others don't have hash collisions 246 00:13:02,290 --> 00:13:04,380 and again, recall that both of these fish 247 00:13:04,380 --> 00:13:07,910 and dentist gave us index zero 248 00:13:07,910 --> 00:13:12,650 and then what we did was we deleted giggle 249 00:13:12,650 --> 00:13:14,600 and so we're verifying that this works. 250 00:13:14,600 --> 00:13:16,840 Giggle was at index 12. 251 00:13:16,840 --> 00:13:19,280 Now at index 12, there's nothing. 252 00:13:19,280 --> 00:13:21,920 And so we know that our code works. 253 00:13:21,920 --> 00:13:26,720 And that is a complete implementation now of a hash table 254 00:13:27,560 --> 00:13:29,550 with separate chaining. 255 00:13:29,550 --> 00:13:31,680 Code's on Blackboard, have a look. 256 00:13:31,680 --> 00:13:32,923 Thanks. Bye.