1 00:00:01,490 --> 00:00:03,920 - [Instructor] Oh my gosh, it's week nine. 2 00:00:03,920 --> 00:00:06,396 Wow, what do we have for week nine? 3 00:00:06,396 --> 00:00:07,363 Good stuff. 4 00:00:08,730 --> 00:00:11,493 This week, we're gonna learn about searching. 5 00:00:12,450 --> 00:00:14,380 You may think we've already learned about searching 6 00:00:14,380 --> 00:00:15,301 with trees, 7 00:00:15,301 --> 00:00:17,463 well, we've got a little more to go yet. 8 00:00:18,640 --> 00:00:20,990 And we're gonna begin our study of hash tables. 9 00:00:22,240 --> 00:00:24,050 Hash tables are widely used 10 00:00:24,050 --> 00:00:26,500 and they provide us with constant time access 11 00:00:26,500 --> 00:00:28,810 to objects stored in the table 12 00:00:28,810 --> 00:00:30,053 which is pretty cool. 13 00:00:31,470 --> 00:00:33,200 Taking a step back, 14 00:00:33,200 --> 00:00:36,370 you'll recall that we learned how to search for objects 15 00:00:36,370 --> 00:00:40,160 in a binary search tree or an AVL tree or a splay tree 16 00:00:40,160 --> 00:00:42,860 and that in general, these have a complexity 17 00:00:42,860 --> 00:00:44,423 of Big O of Log N. 18 00:00:45,650 --> 00:00:48,900 True, we have the worst cases for binary search trees 19 00:00:48,900 --> 00:00:50,150 of O of N 20 00:00:51,510 --> 00:00:54,350 but that's mitigated by the AVL and splay trees 21 00:00:54,350 --> 00:00:55,913 and true if we have, 22 00:00:56,820 --> 00:00:59,770 if we access the same element repeatedly in a splay tree, 23 00:00:59,770 --> 00:01:01,250 we get constant time access. 24 00:01:01,250 --> 00:01:03,223 But in general, we have Big O of Log N. 25 00:01:04,720 --> 00:01:06,870 Now this week, we'll look at how to search 26 00:01:06,870 --> 00:01:08,630 a linear structure like a vector 27 00:01:09,720 --> 00:01:12,250 and we'll see that naive linear search 28 00:01:12,250 --> 00:01:14,330 has linear complexity. 29 00:01:14,330 --> 00:01:17,760 We'll also see that if we sort our vector first, 30 00:01:17,760 --> 00:01:20,840 we can achieve Big O of Log N 31 00:01:20,840 --> 00:01:22,940 using what's called binary search 32 00:01:22,940 --> 00:01:25,713 which is a big improvement over linear time. 33 00:01:26,860 --> 00:01:28,910 And if our data have certain properties, 34 00:01:28,910 --> 00:01:30,500 we can do even better than this 35 00:01:30,500 --> 00:01:32,820 with what's called interpolation search 36 00:01:32,820 --> 00:01:35,580 which can achieve under good conditions 37 00:01:35,580 --> 00:01:38,430 Big O of Log of Log N, 38 00:01:38,430 --> 00:01:39,630 which is almost magical. 39 00:01:42,090 --> 00:01:44,800 And binary search probes a vector 40 00:01:44,800 --> 00:01:46,510 in search of the target 41 00:01:46,510 --> 00:01:50,350 and divides the search space in half at each iteration 42 00:01:50,350 --> 00:01:51,380 until it finds the target. 43 00:01:51,380 --> 00:01:54,640 That's how it achieve Big O of Log N complexity. 44 00:01:54,640 --> 00:01:58,060 Interpolation search takes advantage of certain regularities 45 00:01:58,060 --> 00:01:59,200 in the data 46 00:01:59,200 --> 00:02:01,800 that allows it to make very, very good guesses 47 00:02:01,800 --> 00:02:05,250 as to where to find a particular element 48 00:02:05,250 --> 00:02:07,170 in a linear list. 49 00:02:07,170 --> 00:02:09,660 And if that works, it works great. 50 00:02:09,660 --> 00:02:11,970 But it does require that your data have certain, 51 00:02:11,970 --> 00:02:14,120 certain properties in order for it to work. 52 00:02:15,760 --> 00:02:18,840 And after we complete our investigation and searching, 53 00:02:18,840 --> 00:02:20,363 we'll learn about hash tables. 54 00:02:21,610 --> 00:02:23,490 Now the idea behind hash tables 55 00:02:23,490 --> 00:02:25,710 is that we don't search to find the index 56 00:02:25,710 --> 00:02:27,140 of an item we're looking for. 57 00:02:27,140 --> 00:02:29,573 We calculate its position directly. 58 00:02:30,520 --> 00:02:31,900 And if we can do that, 59 00:02:31,900 --> 00:02:34,450 that means we can access elements in hash table 60 00:02:34,450 --> 00:02:35,743 in constant time. 61 00:02:37,370 --> 00:02:40,173 Now it turns out that its not quite that simple. 62 00:02:41,030 --> 00:02:43,360 It may be the case that when we try calculating 63 00:02:43,360 --> 00:02:45,960 the index of two different objects, 64 00:02:45,960 --> 00:02:47,390 we get the same answer 65 00:02:47,390 --> 00:02:49,410 and that's called the hash collision. 66 00:02:49,410 --> 00:02:51,210 And that's a problem for hash tables 67 00:02:51,210 --> 00:02:52,610 and there are a number of so-called 68 00:02:52,610 --> 00:02:54,880 collision resolution policies 69 00:02:54,880 --> 00:02:57,990 that are used to handle cases of hash collisions. 70 00:02:57,990 --> 00:03:00,240 And this week, we'll see one such policy 71 00:03:00,240 --> 00:03:02,740 which is called separate chaining. 72 00:03:02,740 --> 00:03:06,270 Essentially we're gonna work with a vector of vectors. 73 00:03:06,270 --> 00:03:09,030 And next week, we'll learn about other hash collision 74 00:03:09,030 --> 00:03:10,463 resolution policies. 75 00:03:11,480 --> 00:03:13,330 But we've got plenty to cover this week: 76 00:03:13,330 --> 00:03:17,230 binary search, interpolation search, hash tables, 77 00:03:17,230 --> 00:03:21,053 hashing algorithms, hash collisions and separate chaining. 78 00:03:22,400 --> 00:03:24,480 So get started with this week's video lectures 79 00:03:24,480 --> 00:03:25,590 and demonstrations 80 00:03:25,590 --> 00:03:28,220 and supplementary materials posted on Blackboard 81 00:03:28,220 --> 00:03:31,120 and don't forget the assigned reading in the textbook. 82 00:03:31,120 --> 00:03:32,340 At the end of the week, 83 00:03:32,340 --> 00:03:33,870 we'll have another quiz which will cover 84 00:03:33,870 --> 00:03:34,830 this week's material 85 00:03:34,830 --> 00:03:37,423 and a little bit of material from last week. 86 00:03:38,450 --> 00:03:40,000 And so that's all for now. 87 00:03:40,000 --> 00:03:42,420 I'll see you on our Yellowdig discussion board. 88 00:03:42,420 --> 00:03:43,570 Good luck and have fun.