WEBVTT 1 00:00:03.480 --> 00:00:05.280 Hello, students. 2 00:00:05.280 --> 00:00:08.643 We are now going to get to problem 20. 3 00:00:11.100 --> 00:00:12.630 This is, again, a new problem 4 00:00:12.630 --> 00:00:15.150 and here we have the raw dataset. 5 00:00:15.150 --> 00:00:18.510 The following data are a sample of white blood cell counts 6 00:00:18.510 --> 00:00:22.470 in thousands of cells per cubic millimeter for nine patients 7 00:00:22.470 --> 00:00:26.883 entering a hospital in Boston, Massachusetts on a given day. 8 00:00:28.110 --> 00:00:29.730 So, the question is asking us 9 00:00:29.730 --> 00:00:32.430 are there any outliers in this dataset? 10 00:00:32.430 --> 00:00:35.010 And of course, we have to justify our answer. 11 00:00:35.010 --> 00:00:36.600 So, what do we have to do here? 12 00:00:36.600 --> 00:00:38.430 As you may have understood 13 00:00:38.430 --> 00:00:41.010 from our previous problems that we have solved, 14 00:00:41.010 --> 00:00:43.653 we need to compute the quartiles. 15 00:00:44.790 --> 00:00:47.370 So, we have to first compute the quartiles 16 00:00:47.370 --> 00:00:49.800 and kind of make a plan for this. 17 00:00:49.800 --> 00:00:51.690 We have to compute the quartiles 18 00:00:51.690 --> 00:00:56.050 and then we have to calculate the lower 19 00:00:57.150 --> 00:01:01.180 and upper limits, right, for the two (indistinct). 20 00:01:01.180 --> 00:01:02.523 Okay, now, 21 00:01:04.230 --> 00:01:06.540 before we can calculate the quartiles 22 00:01:06.540 --> 00:01:07.950 what's the first thing we have to do? 23 00:01:07.950 --> 00:01:09.900 We need the ordered set. 24 00:01:09.900 --> 00:01:14.250 So, again here I'm going to give you the ordered set. 25 00:01:14.250 --> 00:01:18.900 Again, very easily we can use Excel and calculate this. 26 00:01:18.900 --> 00:01:20.130 So, let's see here 27 00:01:20.130 --> 00:01:25.130 is 3, 5, 7, 28 00:01:26.940 --> 00:01:31.383 8, 8, 9, 29 00:01:32.730 --> 00:01:36.930 10, 12, and 35. 
30 00:01:36.930 --> 00:01:39.540 Okay, so again, this is our ordered set. 31 00:01:39.540 --> 00:01:42.210 So, what I want to do is right now number them 32 00:01:42.210 --> 00:01:46.800 using another color so it is easier for us to see it. 33 00:01:46.800 --> 00:01:49.860 So, this is 1, this is 2, this is 3, 34 00:01:49.860 --> 00:01:52.260 this is 4, this is 5, 35 00:01:52.260 --> 00:01:56.010 6, 7, 8, and 9. 36 00:01:56.010 --> 00:01:58.500 And because, again, we have 9, 37 00:01:58.500 --> 00:02:00.300 which is an odd number, 38 00:02:00.300 --> 00:02:04.170 it's very clear that this is going to be our median. 39 00:02:04.170 --> 00:02:07.710 Okay, so the fifth place, which is number 8, 40 00:02:07.710 --> 00:02:09.523 is our median, and this is Q2. 41 00:02:10.470 --> 00:02:15.133 So, now the question is, what is Q1 and what is Q3? 42 00:02:16.020 --> 00:02:18.540 So, the Q1 will be, again, 43 00:02:18.540 --> 00:02:23.070 the median of the lower values, 44 00:02:23.070 --> 00:02:26.250 basically the values that are below our median 45 00:02:26.250 --> 00:02:29.823 and that will be, again, I'm going to use another color. 46 00:02:30.930 --> 00:02:33.900 So, that's basically this data set. 47 00:02:33.900 --> 00:02:35.910 And of this, what's in the middle? 48 00:02:35.910 --> 00:02:36.743 These two. 49 00:02:38.220 --> 00:02:39.570 And because it's an even number, 50 00:02:39.570 --> 00:02:41.430 we need to make an average of these two. 51 00:02:41.430 --> 00:02:45.870 So, this is Q1, so it is five plus seven. 52 00:02:45.870 --> 00:02:49.200 And again, I always like to put a parenthesis around 53 00:02:49.200 --> 00:02:51.240 because that makes me realize 54 00:02:51.240 --> 00:02:54.210 I gotta do the addition before doing the division. 55 00:02:54.210 --> 00:02:56.760 So, this will be 12 divided by two. 56 00:02:56.760 --> 00:02:58.170 That is what, six? 57 00:02:58.170 --> 00:03:00.150 Correct, okay. 
58 00:03:00.150 --> 00:03:01.920 Now we have to go to the upper part, 59 00:03:01.920 --> 00:03:04.200 and again I'm going to use another color. 60 00:03:04.200 --> 00:03:06.100 Let's see, I'm going to use 61 00:03:07.770 --> 00:03:09.698 maybe blue this time. 62 00:03:09.698 --> 00:03:13.860 Okay, so here this is basically the four values 63 00:03:13.860 --> 00:03:18.480 that are above 64 00:03:18.480 --> 00:03:20.700 our median, which is Q2. 65 00:03:20.700 --> 00:03:23.220 So, here again, what is going to be Q3? 66 00:03:23.220 --> 00:03:27.210 That's going to be the average of these two numbers, 67 00:03:27.210 --> 00:03:31.870 so Q3 is 10 plus 12 68 00:03:33.330 --> 00:03:35.010 divided by two. 69 00:03:35.010 --> 00:03:37.260 So, that will be what? 70 00:03:37.260 --> 00:03:38.490 That will be 11. 71 00:03:38.490 --> 00:03:41.160 Okay, so now what is our next step? 72 00:03:41.160 --> 00:03:43.980 Our next step is to calculate the IQR. 73 00:03:43.980 --> 00:03:45.580 So, what is going to be our IQR? 74 00:03:47.640 --> 00:03:51.040 Our IQR is going to be the difference between 75 00:03:55.470 --> 00:03:56.708 Q3 and Q1. 76 00:03:56.708 --> 00:03:59.353 So, IQR will be Q3 minus Q1, 77 00:04:02.820 --> 00:04:05.340 and that is going to be what? 78 00:04:05.340 --> 00:04:08.400 11 minus six, that is five. 79 00:04:08.400 --> 00:04:10.770 Correct. Okay, great. 80 00:04:10.770 --> 00:04:12.840 So, now what we need to do is calculate 81 00:04:12.840 --> 00:04:14.250 the lower limit and the upper limit. 82 00:04:14.250 --> 00:04:17.310 So, again, the lower limit, 83 00:04:17.310 --> 00:04:20.343 I'm going to again write the formulas. 84 00:04:21.960 --> 00:04:25.240 So, that will be Q1 85 00:04:26.940 --> 00:04:31.173 minus 1.5 multiplied by IQR. 86 00:04:32.340 --> 00:04:33.990 So, what is our Q1? 87 00:04:33.990 --> 00:04:35.520 We just calculated it. 
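The quartile steps worked through by hand so far can be checked with a short Python sketch. It follows the same median-of-halves approach as the lecture: sort the values, take the median as Q2, then take the medians of the values on either side of it. The function name `quartiles` and the variable names are illustrative assumptions, not something from the course. Spreadsheet and library quartile functions often use interpolation rules that can give slightly different Q1 and Q3 for the same data.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    For an odd number of values, the middle value is left out of
    both halves, as in the hand calculation.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                   # values before the median
    upper = ordered[half + 1:] if n % 2 else ordered[half:]  # values after the median
    return median(lower), median(ordered), median(upper)

# White blood cell counts (thousands per cubic millimeter) from the problem.
counts = [3, 5, 7, 8, 8, 9, 10, 12, 35]

q1, q2, q3 = quartiles(counts)
iqr = q3 - q1
print("Q1:", q1, "Q2:", q2, "Q3:", q3, "IQR:", iqr)
```

Running this should reproduce the values found by hand: Q1 of 6, Q2 of 8, Q3 of 11, and an IQR of 5.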
88 00:04:35.520 --> 00:04:37.170 Q1 is 6, 89 00:04:37.170 --> 00:04:42.100 so it's going to be 6 minus 1.5 90 00:04:43.200 --> 00:04:44.850 multiplied by 5. 91 00:04:44.850 --> 00:04:46.260 So, it's going to be what? 92 00:04:46.260 --> 00:04:50.340 6 minus 7.5. 93 00:04:50.340 --> 00:04:51.690 So, what's gonna happen here? 94 00:04:51.690 --> 00:04:53.714 We're gonna have a negative number. 95 00:04:53.714 --> 00:04:56.550 Negative 1.5, correct? 96 00:04:56.550 --> 00:05:01.470 Yes, so now we are going to go to the upper limit. 97 00:05:01.470 --> 00:05:03.960 So, the upper limit, again, it seems repetitive, 98 00:05:03.960 --> 00:05:06.960 but it's always a good idea to write the formula again. 99 00:05:06.960 --> 00:05:09.510 It always is. I love doing it, okay? 100 00:05:09.510 --> 00:05:14.510 So, Q3 plus 1.5 101 00:05:14.640 --> 00:05:16.207 multiplied by IQR. 102 00:05:18.000 --> 00:05:20.820 Okay, oops, that got a little, little. 103 00:05:20.820 --> 00:05:24.810 Okay, let me make sure it fits here. 104 00:05:24.810 --> 00:05:28.410 Okay, so I'm going to just erase this 105 00:05:28.410 --> 00:05:31.890 and write this whole thing so everything fits nicely. 106 00:05:31.890 --> 00:05:33.190 Okay, upper limit 107 00:05:35.700 --> 00:05:39.860 is, the formula is Q3 108 00:05:41.426 --> 00:05:46.080 plus 1.5 multiplied by IQR. 109 00:05:46.080 --> 00:05:49.113 Okay, so, again, remember from before. 110 00:05:52.050 --> 00:05:54.090 I'm going to again use a different color, 111 00:05:54.090 --> 00:05:55.920 maybe purple this time. 112 00:05:55.920 --> 00:05:58.320 This is going to be the same, right? 113 00:05:58.320 --> 00:06:01.710 So, this is going to be still 7.5 114 00:06:01.710 --> 00:06:04.170 so we don't need to do that calculation again, 115 00:06:04.170 --> 00:06:07.743 all we need to do is take that and add it to Q3. 
116 00:06:10.170 --> 00:06:12.180 And this is how you can save time 117 00:06:12.180 --> 00:06:13.710 when you are taking the test 118 00:06:13.710 --> 00:06:16.800 by not repeating the calculations. 119 00:06:16.800 --> 00:06:19.020 Okay, that really saves time. 120 00:06:19.020 --> 00:06:20.023 So, what is our Q3? 121 00:06:21.880 --> 00:06:23.073 Q3 is 11. 122 00:06:24.390 --> 00:06:28.530 So, 11 plus now 7.5. 123 00:06:28.530 --> 00:06:32.250 Okay, so 11 plus 7.5 is going to be what? 124 00:06:32.250 --> 00:06:33.500 18.5. 125 00:06:35.339 --> 00:06:39.150 So, are there outliers? 126 00:06:39.150 --> 00:06:40.680 That's the question. 127 00:06:40.680 --> 00:06:42.300 Yes, there are, right? 128 00:06:42.300 --> 00:06:43.620 We do have outliers. 129 00:06:43.620 --> 00:06:44.880 How do we know that? 130 00:06:44.880 --> 00:06:49.833 Because 35 is bigger than 18.5. 131 00:06:51.120 --> 00:06:54.900 We don't have any outliers in terms of the lower limit 132 00:06:54.900 --> 00:06:58.170 because we don't have any value less than -1.5, 133 00:06:58.170 --> 00:07:01.950 but we do have a value that is greater than our upper limit, 134 00:07:01.950 --> 00:07:03.390 which is 18.5. 135 00:07:03.390 --> 00:07:05.613 So, again, we have an outlier. 136 00:07:07.080 --> 00:07:08.640 Let me use another nice color, 137 00:07:08.640 --> 00:07:10.350 let's use green this time. 138 00:07:10.350 --> 00:07:14.280 Okay, so 35 is actually, 139 00:07:14.280 --> 00:07:16.830 as we know, bigger than 18.5 140 00:07:16.830 --> 00:07:20.013 so yes, we have an outlier. 141 00:07:24.240 --> 00:07:28.823 Yes, outlier exists, okay? 142 00:07:33.090 --> 00:07:35.700 Because 35 is greater than 18.5, 143 00:07:35.700 --> 00:07:37.623 so that is our outlier. 144 00:07:39.390 --> 00:07:43.463 Okay, let me write it clearly. 145 00:07:43.463 --> 00:07:44.630 Okay, so 18.5. 
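The limit calculation and the outlier check can also be sketched in Python as a way to verify the arithmetic. The sketch takes Q1 = 6 and Q3 = 11 from the hand calculation and applies the 1.5 times IQR rule stated in the lecture. The helper names `iqr_fences` and `find_outliers` are hypothetical, chosen only for this illustration.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower limit, upper limit) for the k * IQR outlier rule."""
    step = k * (q3 - q1)  # computed once and reused, as in the lecture (7.5 here)
    return q1 - step, q3 + step

def find_outliers(values, lower, upper):
    """Return the values that fall outside the two limits."""
    return [v for v in values if v < lower or v > upper]

# White blood cell counts (thousands per cubic millimeter) from the problem.
counts = [3, 5, 7, 8, 8, 9, 10, 12, 35]

lower, upper = iqr_fences(q1=6, q3=11)
outliers = find_outliers(counts, lower, upper)
print("lower limit:", lower, "upper limit:", upper, "outliers:", outliers)
```

With these inputs the limits come out to -1.5 and 18.5, and the only value outside them is 35, which matches the answer on the board.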
146 00:07:48.240 --> 00:07:52.620 Okay, so we have done 147 00:07:52.620 --> 00:07:54.720 this calculation now a few times 148 00:07:54.720 --> 00:07:57.960 and I really, really encourage students 149 00:07:57.960 --> 00:08:02.190 to go through the calculations multiple times. 150 00:08:02.190 --> 00:08:04.050 The magic number for me has been three. 151 00:08:04.050 --> 00:08:05.790 If I go through a calculation three times, 152 00:08:05.790 --> 00:08:07.170 go through a problem three times 153 00:08:07.170 --> 00:08:10.710 even when I know the answer, I know the process, 154 00:08:10.710 --> 00:08:12.360 even then when I repeat that 155 00:08:12.360 --> 00:08:14.730 it really helps me because you know why? 156 00:08:14.730 --> 00:08:17.580 When you take a test, you are what? 157 00:08:17.580 --> 00:08:19.020 You have a time constraint 158 00:08:19.020 --> 00:08:21.630 so you have to go a little faster 159 00:08:21.630 --> 00:08:23.730 and that really is very helpful 160 00:08:23.730 --> 00:08:27.060 when you have solved a problem a few times 161 00:08:27.060 --> 00:08:31.170 then that process naturally gets 162 00:08:31.170 --> 00:08:33.000 sped up in your mind, 163 00:08:33.000 --> 00:08:34.800 you can do it much faster. 164 00:08:34.800 --> 00:08:35.640 You can think faster 165 00:08:35.640 --> 00:08:37.860 because you have gone through the process a few times. 166 00:08:37.860 --> 00:08:39.780 So, again, I very much encourage you 167 00:08:39.780 --> 00:08:41.897 to go through these problems a couple of times. 
168 00:08:41.897 --> 00:08:44.370 Again, I'm showing you a number of problems, 169 00:08:44.370 --> 00:08:46.620 I'm putting a lot more examples this time 170 00:08:46.620 --> 00:08:49.290 because I think it's important to go through these examples 171 00:08:49.290 --> 00:08:52.230 so that will clear the understanding in your mind 172 00:08:52.230 --> 00:08:55.050 and you will also be able to solve the problems 173 00:08:55.050 --> 00:08:57.603 much faster when you take the test.