WEBVTT 1 00:00:00.930 --> 00:00:03.830 Hello, students, and welcome to Problem 20. 2 00:00:05.370 --> 00:00:08.400 So this problem, I'm going to read the problem first. 3 00:00:08.400 --> 00:00:12.330 The following data are a sample of white blood counts 4 00:00:12.330 --> 00:00:14.790 in thousands of cells per cubic millimeter 5 00:00:14.790 --> 00:00:18.060 for nine participants entering a hospital 6 00:00:18.060 --> 00:00:20.673 in Boston, Massachusetts on a given day. 7 00:00:21.990 --> 00:00:26.250 So the values here, I have already created the ordered set, 8 00:00:26.250 --> 00:00:28.020 as you can see, 9 00:00:28.020 --> 00:00:30.900 which are three, five, seven, eight, eight, 10 00:00:30.900 --> 00:00:33.360 nine, 10, 12, 35. 11 00:00:33.360 --> 00:00:36.360 Now the question here is asking us are there any outliers 12 00:00:36.360 --> 00:00:39.600 in this data set, and if so, justify. 13 00:00:39.600 --> 00:00:41.340 So as you know by now, 14 00:00:41.340 --> 00:00:46.340 when we start the process of identifying an outlier, 15 00:00:47.490 --> 00:00:51.690 we need to go back to first calculating Q1, Q3, 16 00:00:51.690 --> 00:00:53.160 and then the Tukey fences, 17 00:00:53.160 --> 00:00:54.990 basically the upper and lower limits. 18 00:00:54.990 --> 00:00:57.870 And then we need to see if we have any values 19 00:00:57.870 --> 00:01:02.870 above the upper limit or below the lower limit. 20 00:01:03.120 --> 00:01:04.700 So let's start here. 21 00:01:04.700 --> 00:01:07.650 So Q1 is the first thing we need to calculate. 22 00:01:07.650 --> 00:01:10.260 So as you already know by now, 23 00:01:10.260 --> 00:01:12.660 the formula to calculate Q1 here 24 00:01:12.660 --> 00:01:17.660 is L equal to K divided by 100. 25 00:01:20.040 --> 00:01:25.040 Oops, sorry, should be this here, multiplied by n, 26 00:01:25.230 --> 00:01:26.820 and that is our sample size.
27 00:01:26.820 --> 00:01:31.650 So K here is going to be 25, divided by 100 28 00:01:31.650 --> 00:01:34.770 and multiplied by nine, 29 00:01:34.770 --> 00:01:39.390 and that is going to give us 2.25. 30 00:01:39.390 --> 00:01:42.870 So as you know, 2.25 is not a whole number. 31 00:01:42.870 --> 00:01:45.570 So what we have to do is proceed 32 00:01:45.570 --> 00:01:48.120 and identify the next whole number. 33 00:01:48.120 --> 00:01:50.043 So that will be three. 34 00:01:50.970 --> 00:01:55.650 So now we have to go back to our ordered data set 35 00:01:55.650 --> 00:01:57.843 and identify the third value, 36 00:01:58.740 --> 00:02:00.180 and that is seven. 37 00:02:00.180 --> 00:02:02.313 So that is our Q1. 38 00:02:03.720 --> 00:02:05.923 Now we need to calculate Q3. 39 00:02:08.700 --> 00:02:11.910 So for Q3, again, we are going to use the same formula. 40 00:02:11.910 --> 00:02:14.760 The only difference now would be instead of 25, 41 00:02:14.760 --> 00:02:16.953 we are going to use 75. 42 00:02:24.360 --> 00:02:25.860 And when we do that, 43 00:02:25.860 --> 00:02:28.590 what we get here is 6.75. 44 00:02:33.450 --> 00:02:36.510 Again, 6.75 is not a whole number, 45 00:02:36.510 --> 00:02:39.210 so we need to proceed to the next whole number, 46 00:02:39.210 --> 00:02:41.400 and that will be seven. 47 00:02:41.400 --> 00:02:45.120 So what is the seventh value of our ordered set? 48 00:02:45.120 --> 00:02:48.513 When we go down, we can see that is going to be 10. 49 00:02:49.620 --> 00:02:51.513 So our Q3 is 10. 50 00:02:52.470 --> 00:02:54.930 So now we need to calculate the Tukey fences, 51 00:02:54.930 --> 00:02:57.303 basically the upper and lower limits. 52 00:03:10.290 --> 00:03:12.813 So the formula here, as you have seen before, 53 00:03:16.740 --> 00:03:21.740 is going to be Q1 for the lower limits, or lower limit, 54 00:03:21.900 --> 00:03:26.743 is going to be Q1 minus 1.5 55 00:03:33.960 --> 00:03:36.360 multiplied by Q3 minus Q1.
56 00:03:36.360 --> 00:03:40.203 And as I've said before, Q3 minus Q1 is our IQR. 57 00:03:41.580 --> 00:03:46.580 So here, Q1 is going to be seven minus 1.5, 58 00:03:49.403 --> 00:03:54.363 and the IQR is going to be 10 minus seven. 59 00:03:55.530 --> 00:03:57.660 So when we do the calculation, 60 00:03:57.660 --> 00:04:00.033 so let's do it and see what we get. 61 00:04:03.330 --> 00:04:05.190 10 minus seven is three. 62 00:04:05.190 --> 00:04:09.063 So three times 1.5 is 4.5. 63 00:04:10.290 --> 00:04:15.290 So seven minus 4.5 is 2.5. 64 00:04:16.350 --> 00:04:19.530 So for our Tukey fences, 65 00:04:19.530 --> 00:04:21.047 the lower limit is 2.5. 66 00:04:22.050 --> 00:04:24.350 Now we are going to calculate the upper limit. 67 00:04:30.480 --> 00:04:35.000 So here the formula is Q3 plus 1.5, 68 00:04:38.910 --> 00:04:41.645 Q3 minus Q1. 69 00:04:41.645 --> 00:04:46.645 So Q3 is going to be 10, 1.5. 70 00:04:46.980 --> 00:04:50.970 And we already know Q3 minus Q1 is three. 71 00:04:50.970 --> 00:04:53.250 So when we do all this calculation, 72 00:04:53.250 --> 00:04:54.250 what we get is 14.5. 73 00:05:09.840 --> 00:05:13.230 So now the question is do we have any value 74 00:05:13.230 --> 00:05:17.790 [edited] below 2.5? 75 00:05:17.790 --> 00:05:21.040 So the Tukey fences here are 2.5 to 14.5. 76 00:05:29.100 --> 00:05:32.340 Now, do we have any value that is below 2.5? 77 00:05:32.340 --> 00:05:35.460 The answer is no, because our lowest value is three. 78 00:05:35.460 --> 00:05:38.820 Do we have any value that is above 14.5? 79 00:05:38.820 --> 00:05:40.980 Now the answer is yes, we do, 80 00:05:40.980 --> 00:05:42.720 and that is 35. 81 00:05:42.720 --> 00:05:47.340 So because 35 is greater than 14.5, 82 00:05:47.340 --> 00:05:51.690 we have identified that, yes, we do have an outlier here, 83 00:05:51.690 --> 00:05:53.593 and that outlier is 35.
84 00:05:56.250 --> 00:05:57.930 So I hope this was helpful, 85 00:05:57.930 --> 00:06:00.510 because I mentioned previously that we are going to go 86 00:06:00.510 --> 00:06:03.960 through a problem where we will have an outlier, 87 00:06:03.960 --> 00:06:07.593 and that's exactly what we had for this problem. 88 00:06:07.593 --> 00:06:08.970 Here we have an outlier, 89 00:06:08.970 --> 00:06:13.970 and we were able to identify it by using the Tukey fences.
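
The steps walked through in this problem can be checked with a short Python sketch. It follows the method described in the video: the locator L = (K / 100) * n, rounding up to the next whole number when L is not whole, and fences at Q1 - 1.5 * IQR and Q3 + 1.5 * IQR. The names `percentile_by_locator` and `counts` are hypothetical, chosen only for illustration. The branch for a whole-number L (averaging the L-th and next values) is an assumption based on a common textbook rule; the video does not cover that case.

```python
def percentile_by_locator(ordered, k):
    """Return the k-th percentile of an already ordered list.

    Uses the locator L = (k / 100) * n from the video, computed with
    integers so that the whole-number check is exact.
    """
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    n = len(ordered)
    whole, remainder = divmod(k * n, 100)
    if remainder != 0:
        # L is not a whole number: go up to the next whole position.
        # Position whole + 1 (1-based) is index whole (0-based).
        return ordered[whole]
    # Assumption (not covered in the video): when L is a whole number,
    # average the L-th value and the next one.
    return (ordered[whole - 1] + ordered[whole]) / 2


# Ordered white blood counts from the problem (thousands of cells).
counts = [3, 5, 7, 8, 8, 9, 10, 12, 35]

q1 = percentile_by_locator(counts, 25)  # L = 2.25, so use the 3rd value
q3 = percentile_by_locator(counts, 75)  # L = 6.75, so use the 7th value
iqr = q3 - q1

# Tukey fences: the lower and upper limits.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Any value outside the fences is flagged as an outlier.
outliers = [x for x in counts if x < lower or x > upper]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"Tukey fences: {lower} to {upper}")
print(f"Outliers: {outliers}")
```

Note that library routines for quartiles often use different interpolation rules by default, so their Q1 and Q3 may not match the values obtained with this locator method.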