1 00:00:01,160 --> 00:00:02,560 - [Narrator] Selection sort. 2 00:00:03,860 --> 00:00:07,610 In the previous two videos, we saw bubble sort. 3 00:00:07,610 --> 00:00:09,830 Selection sort is another algorithm 4 00:00:09,830 --> 00:00:12,870 that works by swapping elements in a vector. 5 00:00:12,870 --> 00:00:14,420 Selection sort is designed 6 00:00:14,420 --> 00:00:17,180 to make the fewest swaps possible. 7 00:00:17,180 --> 00:00:18,930 So for a vector of n elements, 8 00:00:18,930 --> 00:00:21,433 selection sort will make n swaps. 9 00:00:23,910 --> 00:00:26,700 Taking a few liberties, we can write pseudo code 10 00:00:26,700 --> 00:00:29,380 for this algorithm in two lines. 11 00:00:29,380 --> 00:00:31,600 While this is a little too concise, 12 00:00:31,600 --> 00:00:34,660 it shows how simple the idea is. 13 00:00:34,660 --> 00:00:37,490 We iterate through a vector n times, 14 00:00:37,490 --> 00:00:40,900 and at each iteration we swap the element at index i 15 00:00:40,900 --> 00:00:42,770 with the minimum element found 16 00:00:42,770 --> 00:00:46,083 between indices i and n minus one. 17 00:00:47,280 --> 00:00:48,590 Notice that this implies 18 00:00:48,590 --> 00:00:51,620 a division of the vector into two parts. 19 00:00:51,620 --> 00:00:54,450 The part with indices less than i, 20 00:00:54,450 --> 00:00:56,960 that's the part that's already been sorted, 21 00:00:56,960 --> 00:01:00,600 and the indices greater than or equal to i, 22 00:01:00,600 --> 00:01:02,563 that's the part we have yet to sort. 23 00:01:03,560 --> 00:01:06,603 So let's expand on this pseudo code just a bit. 24 00:01:08,740 --> 00:01:10,460 Here's the expanded pseudo code 25 00:01:10,460 --> 00:01:12,620 to show that there is an inner loop, 26 00:01:12,620 --> 00:01:15,080 one that scans the unsorted values 27 00:01:15,080 --> 00:01:17,493 to find the minimum unsorted value. 28 00:01:18,420 --> 00:01:21,170 Again, the vector is divided into two parts, 29 00:01:21,170 --> 00:01:23,000 the part that's already sorted, 30 00:01:23,000 --> 00:01:25,560 and the part we have yet to sort. 31 00:01:25,560 --> 00:01:27,900 It is this latter part that gets processed 32 00:01:27,900 --> 00:01:28,800 in the inner loop. 33 00:01:30,020 --> 00:01:33,280 So in that inner loop, we just update the index 34 00:01:33,280 --> 00:01:36,543 of the minimum value when we find a smaller value. 35 00:01:38,810 --> 00:01:40,090 And when that's done, 36 00:01:40,090 --> 00:01:42,793 and we've found a minimum value, we perform the swap. 37 00:01:44,000 --> 00:01:45,700 Let's look at this with a diagram. 38 00:01:47,880 --> 00:01:49,220 Here we have a vector, 39 00:01:49,220 --> 00:01:51,920 and everything to the left of the arrow is sorted. 40 00:01:51,920 --> 00:01:54,380 You'll notice there's nothing to the left of the arrow, 41 00:01:54,380 --> 00:01:56,233 so we start with nothing sorted. 42 00:01:57,630 --> 00:02:00,180 Then we scan through the list, 43 00:02:00,180 --> 00:02:01,760 and we find the minimum element 44 00:02:01,760 --> 00:02:04,640 in the portion of the vector to the right of the arrow. 45 00:02:04,640 --> 00:02:08,050 And then we swap it with the leftmost unsorted element. 46 00:02:08,050 --> 00:02:10,233 Here we swap zero with seven. 47 00:02:12,930 --> 00:02:15,543 Now the first element in the list is sorted. 48 00:02:17,590 --> 00:02:19,300 And then we perform the same steps 49 00:02:19,300 --> 00:02:20,883 on the balance of the list. 50 00:02:22,470 --> 00:02:25,570 We find the minimum value in the unsorted portion, 51 00:02:25,570 --> 00:02:28,760 here it's one, and we swap it 52 00:02:28,760 --> 00:02:31,310 with the left most unsorted element. 53 00:02:31,310 --> 00:02:34,283 We just repeat this process until the list is sorted. 54 00:02:35,380 --> 00:02:38,793 Let's watch a demonstration on the Visualgo website. 55 00:02:46,450 --> 00:02:48,363 Here we have a little larger vector, 56 00:02:52,420 --> 00:02:55,073 but I've set the speed to move pretty quickly. 57 00:02:56,860 --> 00:03:00,500 You'll see that sorted elements, 58 00:03:00,500 --> 00:03:03,513 elements that are already sorted are colored orange. 59 00:03:06,720 --> 00:03:10,300 The minimum that's been found, 60 00:03:10,300 --> 00:03:12,850 and the element it's swapped with or marked in red, 61 00:03:18,530 --> 00:03:20,190 and the element that's being scanned 62 00:03:20,190 --> 00:03:21,863 at the moment is in green. 63 00:03:34,400 --> 00:03:37,350 Now, you'll see 36 doesn't change position there 64 00:03:37,350 --> 00:03:39,783 because it was already in the correct position. 65 00:03:44,000 --> 00:03:46,873 Again, 46 here won't change position. 66 00:03:48,900 --> 00:03:50,623 And now we have a sorted list. 67 00:03:51,990 --> 00:03:54,360 Is selection sort stable? 68 00:03:54,360 --> 00:03:58,020 No, selection sort does not guarantee 69 00:03:58,020 --> 00:04:00,330 that if duplicate values appear on the vector, 70 00:04:00,330 --> 00:04:01,960 that their relative positions 71 00:04:01,960 --> 00:04:03,833 will be preserved after sorting. 72 00:04:04,840 --> 00:04:06,100 Why is this? 73 00:04:06,100 --> 00:04:09,780 Well, let's consider the vector three, five, six, 74 00:04:09,780 --> 00:04:12,050 seven, three, one. 75 00:04:12,050 --> 00:04:14,980 We have two threes, so we have duplicate values. 76 00:04:14,980 --> 00:04:16,490 But after the first pass, 77 00:04:16,490 --> 00:04:18,900 the three that was in the left-most position 78 00:04:18,900 --> 00:04:21,230 will be swapped with the value one, 79 00:04:21,230 --> 00:04:23,120 the far right-hand side of the vector, 80 00:04:23,120 --> 00:04:27,310 and this changes the relative position of the two threes. 81 00:04:27,310 --> 00:04:29,710 Yes, it's true that the next two passes 82 00:04:29,710 --> 00:04:30,910 will move the two threes 83 00:04:30,910 --> 00:04:33,540 to the second and third positions in the vector, 84 00:04:33,540 --> 00:04:34,760 and that that operation 85 00:04:34,760 --> 00:04:37,110 will preserve their relative positions, 86 00:04:37,110 --> 00:04:38,660 but their relative positions 87 00:04:38,660 --> 00:04:41,330 were already changed in the first step, 88 00:04:41,330 --> 00:04:43,823 so selection sort is not stable. 89 00:04:45,710 --> 00:04:48,120 What about its time complexity? 90 00:04:48,120 --> 00:04:51,860 Well, we have an inner loop and an outer loop, 91 00:04:51,860 --> 00:04:55,400 and selection sort makes n passes through the vector, 92 00:04:55,400 --> 00:04:59,500 and in the ith pass, it performs n minus i comparisons, 93 00:04:59,500 --> 00:05:01,790 so the complexity analysis is similar 94 00:05:01,790 --> 00:05:03,740 to that of bubble sort. 95 00:05:03,740 --> 00:05:08,740 We have a sequence of terms that goes from n down to one. 96 00:05:09,210 --> 00:05:12,830 And if we simplify that we get n squared minus n over two, 97 00:05:12,830 --> 00:05:16,420 so the complexity is O of n squared. 98 00:05:16,420 --> 00:05:17,963 Let's do a quick comparison. 99 00:05:18,840 --> 00:05:22,050 Here's a comparison of selection sort and bubble sort. 100 00:05:22,050 --> 00:05:25,490 We see they both have time complexity events squared, 101 00:05:25,490 --> 00:05:29,070 they both have constant space complexity, 102 00:05:29,070 --> 00:05:32,263 bubble sort is stable while selection sort is not, 103 00:05:33,840 --> 00:05:36,920 bubble sort can tell if the list is already sorted. 104 00:05:36,920 --> 00:05:40,020 So, the one thing bubble sort has going for it 105 00:05:40,020 --> 00:05:43,020 is that if you give it an already sorted list, 106 00:05:43,020 --> 00:05:45,003 it will terminate almost immediately, 107 00:05:45,930 --> 00:05:48,843 but selection sort performs the fewest swaps. 108 00:05:49,770 --> 00:05:51,350 Now, how did we get that value 109 00:05:51,350 --> 00:05:53,670 for constant space complexity? 110 00:05:53,670 --> 00:05:55,290 We considered the extra memory 111 00:05:55,290 --> 00:05:57,123 needed to implement the algorithm. 112 00:05:58,280 --> 00:05:59,900 Bubble sort and selection sort 113 00:05:59,900 --> 00:06:03,080 both require a temporary variable for swapping, 114 00:06:03,080 --> 00:06:05,800 and selection sort also requires a variable 115 00:06:05,800 --> 00:06:09,610 representing the current minimum found so far. 116 00:06:09,610 --> 00:06:11,780 But in both cases, this is constant 117 00:06:11,780 --> 00:06:14,840 and does not vary with the size of the input. 118 00:06:14,840 --> 00:06:18,410 Hence their space complexities are big O of one. 119 00:06:18,410 --> 00:06:21,080 Now, a little comprehension check. 120 00:06:21,080 --> 00:06:22,890 Given this vector of integers, 121 00:06:22,890 --> 00:06:25,210 zero, one, six, five, three, nine, 122 00:06:25,210 --> 00:06:27,250 two, four, seven, eight, 123 00:06:27,250 --> 00:06:28,830 what will the vector look like 124 00:06:28,830 --> 00:06:31,720 after one pass of bubble sort? 125 00:06:31,720 --> 00:06:33,823 How many swaps will take place? 126 00:06:35,810 --> 00:06:38,850 Well, we're gonna swap six with five 127 00:06:38,850 --> 00:06:41,190 and then six is gonna be next to three 128 00:06:41,190 --> 00:06:43,690 and we'll swap six with three, 129 00:06:43,690 --> 00:06:46,740 six and nine are gonna be in the correct order. 130 00:06:46,740 --> 00:06:49,240 So now we're gonna swap nine with two, 131 00:06:49,240 --> 00:06:51,880 nine with four, nine with seven, 132 00:06:51,880 --> 00:06:54,450 nine with eight, that's six swaps. 133 00:06:54,450 --> 00:06:56,300 And so our vector will look like 134 00:06:56,300 --> 00:06:58,730 zero, one, five, three, six, two, 135 00:06:58,730 --> 00:07:00,860 four, seven, eight, nine. 136 00:07:00,860 --> 00:07:03,600 What will it look like with selection sort, 137 00:07:03,600 --> 00:07:05,900 given that zero and one are already sorted 138 00:07:05,900 --> 00:07:07,530 and in the correct positions. 139 00:07:07,530 --> 00:07:10,320 What will the vector look like after one pass? 140 00:07:10,320 --> 00:07:12,760 And how many swaps will take place? 141 00:07:12,760 --> 00:07:14,590 Well, it's selection sort, 142 00:07:14,590 --> 00:07:17,310 so we know we're only going to swap once per pass, 143 00:07:17,310 --> 00:07:20,220 so there's only going to be one swap. 144 00:07:20,220 --> 00:07:22,940 And we're gonna swap the minimum value 145 00:07:22,940 --> 00:07:26,530 that we find in the unsorted portion of the vector 146 00:07:26,530 --> 00:07:29,750 with the left most unsorted value. 147 00:07:29,750 --> 00:07:31,690 So we're gonna swap two and six. 148 00:07:31,690 --> 00:07:33,100 And so we'll get the vector 149 00:07:33,100 --> 00:07:35,970 zero, one, two, five, three, nine, 150 00:07:35,970 --> 00:07:38,763 six, four, seven and eight with one swap. 151 00:07:41,150 --> 00:07:43,500 Now let's return to our comparison with bubble sort 152 00:07:43,500 --> 00:07:44,483 for another moment. 153 00:07:45,550 --> 00:07:49,550 Despite both algorithms having O of n squared complexity, 154 00:07:49,550 --> 00:07:53,500 selection sort typically outperforms bubble sort. 155 00:07:53,500 --> 00:07:54,540 Why is that? 156 00:07:54,540 --> 00:07:57,550 Well, changing the value of a variable and memory, 157 00:07:57,550 --> 00:08:01,210 that is writing to the variable is more expensive, 158 00:08:01,210 --> 00:08:05,220 it takes a longer time than reading the value. 159 00:08:05,220 --> 00:08:07,410 And selection sort is designed 160 00:08:07,410 --> 00:08:10,030 to make the fewest swaps possible. 161 00:08:10,030 --> 00:08:15,030 Therefore it performs fewer rights than bubble sort does, 162 00:08:15,030 --> 00:08:17,230 hence it tends to be faster. 163 00:08:17,230 --> 00:08:19,660 We're going to investigate this in greater depth 164 00:08:19,660 --> 00:08:21,373 when we get to project four. 165 00:08:22,370 --> 00:08:27,240 So in summary, selection sort has O of n squared complexity, 166 00:08:27,240 --> 00:08:29,110 it's not stable, 167 00:08:29,110 --> 00:08:33,250 it's designed to perform the fewest swaps possible. 168 00:08:33,250 --> 00:08:36,830 And again, I encourage you to visit visualgo.net 169 00:08:36,830 --> 00:08:38,233 to try it out on your own. 170 00:08:39,550 --> 00:08:42,330 In the next video, we'll code up a version 171 00:08:42,330 --> 00:08:46,310 of selection sort in C++, but that's all for now. 172 00:08:46,310 --> 00:08:47,143 Thanks.