WEBVTT 1 00:00:02.400 --> 00:00:04.850 Dear students, welcome to Biostat 2 00:00:06.480 --> 00:00:08.670 ER example eight, chapter seven. 3 00:00:08.670 --> 00:00:10.440 In this example, we will learn 4 00:00:10.440 --> 00:00:13.620 how to perform hypothesis testing 5 00:00:13.620 --> 00:00:17.823 by using multi-sample independent continuous outcome data. 6 00:00:20.040 --> 00:00:22.923 And for this purpose, we will be using ANOVA, 7 00:00:25.140 --> 00:00:27.333 or analysis of variance. 8 00:00:28.770 --> 00:00:32.250 For this example, we will use problem 28 from our textbook, 9 00:00:32.250 --> 00:00:34.300 and first, I'm going to read the problem. 10 00:00:35.280 --> 00:00:37.800 Use the data shown in problem 25 11 00:00:37.800 --> 00:00:40.380 and test if there is a significant difference 12 00:00:40.380 --> 00:00:42.570 in mean age among the three groups. 13 00:00:42.570 --> 00:00:47.570 Hint, SStotal is equal to 2,893. 14 00:00:48.090 --> 00:00:51.090 Use a 5% level of significance. 15 00:00:51.090 --> 00:00:52.290 So for your convenience, 16 00:00:52.290 --> 00:00:54.603 I have copied and pasted the table here. 17 00:00:55.860 --> 00:00:57.240 And before I get started, 18 00:00:57.240 --> 00:01:00.600 I want to draw your attention to a couple of items. 19 00:01:00.600 --> 00:01:03.810 So, why are we using ANOVA? 20 00:01:03.810 --> 00:01:07.350 Well, we are using ANOVA because here, we have three groups, 21 00:01:07.350 --> 00:01:10.383 therefore we cannot use a t-test or a z-test. 22 00:01:14.130 --> 00:01:17.280 Also here, our outcome, age, is measured 23 00:01:17.280 --> 00:01:21.813 on a continuous scale, hence we cannot use a chi-squared test. 24 00:01:23.340 --> 00:01:28.340 Therefore, we will be using ANOVA for this example. 25 00:01:29.910 --> 00:01:32.730 So for step one, we are going to set up the hypotheses 26 00:01:32.730 --> 00:01:36.060 and we are going to determine the level of significance.
27 00:01:36.060 --> 00:01:37.950 The null hypothesis here states 28 00:01:37.950 --> 00:01:39.780 that all the means are equal, 29 00:01:39.780 --> 00:01:43.500 so mu one is equal to mu two is equal to mu three, 30 00:01:43.500 --> 00:01:47.070 and the alternative states that the means are not all equal. 31 00:01:47.070 --> 00:01:50.223 The alpha here is provided to us, which is 0.05. 32 00:01:52.110 --> 00:01:53.940 In step two, we are going to select 33 00:01:53.940 --> 00:01:57.360 the appropriate test statistic, and that is the F statistic. 34 00:01:57.360 --> 00:02:01.050 And to determine the F statistic, we need to divide 35 00:02:01.050 --> 00:02:06.050 MSB by MSW, and MSB stands for mean square between, 36 00:02:06.210 --> 00:02:09.213 and MSW stands for mean square within. 37 00:02:10.920 --> 00:02:14.760 Now, in step three, we are going to set up the decision rule. 38 00:02:14.760 --> 00:02:18.330 We have three groups, so k is equal to three. 39 00:02:18.330 --> 00:02:21.060 Now, our big N is 120 40 00:02:21.060 --> 00:02:23.850 because each group has 40 participants, 41 00:02:23.850 --> 00:02:27.270 so when we add them up, we get 120. 42 00:02:27.270 --> 00:02:30.270 Degrees of freedom one is k minus 1, 43 00:02:30.270 --> 00:02:32.610 and k here is again three, 44 00:02:32.610 --> 00:02:35.640 so 3 minus 1 gives us two. 45 00:02:35.640 --> 00:02:37.563 And for degrees of freedom two, 46 00:02:39.270 --> 00:02:42.270 we have to subtract k from big N, 47 00:02:42.270 --> 00:02:45.930 and big N here is 120, and k is three, 48 00:02:45.930 --> 00:02:50.930 so when we subtract three from 120, we get 117.
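The degrees-of-freedom arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `anova_degrees_of_freedom` is hypothetical and does not come from the lecture or the textbook. The group sizes are the three groups of 40 participants mentioned in the problem.

```python
# Hypothetical helper (not from the lecture): degrees of freedom for one-way ANOVA.
def anova_degrees_of_freedom(group_sizes):
    k = len(group_sizes)        # number of groups
    big_n = sum(group_sizes)    # total sample size, "big N"
    df1 = k - 1                 # between-groups (numerator) degrees of freedom
    df2 = big_n - k             # within-groups (denominator) degrees of freedom
    return df1, df2


df1, df2 = anova_degrees_of_freedom([40, 40, 40])
print(df1, df2)  # 2 117
```

Passing the sizes as a list means the same sketch would also work for unequal group sizes.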
49 00:02:51.300 --> 00:02:54.960 Now, the F-table on page 351, 50 00:02:54.960 --> 00:02:59.640 which is available at the back of our book, 51 00:02:59.640 --> 00:03:02.370 does not include 117, 52 00:03:02.370 --> 00:03:07.370 hence, we will use 100 as the DF2 value, 53 00:03:07.650 --> 00:03:12.650 and based on that, our F critical value will be 3.09. 54 00:03:13.260 --> 00:03:14.940 Hence, we will reject the null 55 00:03:14.940 --> 00:03:19.593 if our calculated F is greater than or equal to 3.09. 56 00:03:21.570 --> 00:03:24.873 Now in step four, we have to compute the test statistic. 57 00:03:26.010 --> 00:03:27.780 To organize our computations, 58 00:03:27.780 --> 00:03:29.610 we will complete the ANOVA table, 59 00:03:29.610 --> 00:03:31.770 and for that purpose, first we need 60 00:03:31.770 --> 00:03:34.890 to calculate the overall, or grand, mean, 61 00:03:34.890 --> 00:03:38.700 which is X bar, and for that purpose we are going to add 62 00:03:38.700 --> 00:03:41.373 the three group means and divide by three. 63 00:03:43.710 --> 00:03:47.040 Now for this problem, we can calculate the grand mean, 64 00:03:47.040 --> 00:03:49.530 or overall mean, by adding the three means 65 00:03:49.530 --> 00:03:52.890 and dividing by three because we have an equal number 66 00:03:52.890 --> 00:03:56.023 of participants in each group, and that is 40. 67 00:03:58.980 --> 00:04:02.010 So now we can start to calculate the SSB, 68 00:04:02.010 --> 00:04:04.350 which is the sum of squares between, 69 00:04:04.350 --> 00:04:07.320 and for that, the formula is given here, 70 00:04:07.320 --> 00:04:10.443 which is the summation sign, n sub j, 71 00:04:11.610 --> 00:04:14.080 open parentheses X bar sub j 72 00:04:15.450 --> 00:04:19.890 minus X bar, closed parentheses, all squared. 73 00:04:19.890 --> 00:04:21.878 So the X bar sub j here 74 00:04:21.878 --> 00:04:25.590 is the group mean for each group.
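For readers following along without the slide, the two formulas just read aloud can be written out as below. The summation limits (j running from 1 to k) are an assumption based on the three-group setup. The symbols are the ones used in the lecture: n sub j is the size of group j, X bar sub j is its mean, and X bar is the grand mean.

```latex
% Grand mean as a simple average of group means (valid here because every n_j = 40)
\bar{X} = \frac{1}{k} \sum_{j=1}^{k} \bar{X}_j

% Sum of squares between groups
SSB = \sum_{j=1}^{k} n_j \left( \bar{X}_j - \bar{X} \right)^2
```

When the group sizes differ, the grand mean is instead the size-weighted average, the sum of n sub j times X bar sub j, divided by big N.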
75 00:04:25.590 --> 00:04:30.590 The group means are 75.2 for group one, 75.6 for group two, 76 00:04:31.020 --> 00:04:33.570 and 74.7 for group three. 77 00:04:33.570 --> 00:04:38.570 And the X bar is 75.2, which is our grand mean. 78 00:04:38.580 --> 00:04:41.110 So once we insert all the values 79 00:04:42.270 --> 00:04:44.760 and we perform the calculation, 80 00:04:44.760 --> 00:04:47.540 our SSB is equal to 16.4. 81 00:04:50.738 --> 00:04:53.428 So the total, or SStotal, 82 00:04:53.428 --> 00:04:57.011 is equal to 2,893, which is provided to us. 83 00:04:58.080 --> 00:05:01.950 So by subtracting SSB from SStotal, 84 00:05:01.950 --> 00:05:04.950 we can ascertain that SSW, 85 00:05:04.950 --> 00:05:08.087 or SSE, is going to be 2,876.6. 86 00:05:15.540 --> 00:05:18.570 Now, we need to insert these values in our ANOVA table 87 00:05:18.570 --> 00:05:21.510 and start calculating the mean squares 88 00:05:21.510 --> 00:05:26.510 by dividing the sums of squares by their degrees of freedom. 89 00:05:26.970 --> 00:05:30.390 So first, we are going to take 16.4 90 00:05:30.390 --> 00:05:34.050 and divide that by two and we will get 8.2, 91 00:05:34.050 --> 00:05:39.050 then we are going to divide 2,876.6 by 117, 92 00:05:40.530 --> 00:05:43.620 and we will get 24.6. 93 00:05:43.620 --> 00:05:46.020 Now, to calculate the final F value, 94 00:05:46.020 --> 00:05:50.670 we are going to divide 8.2 by 24.6, 95 00:05:50.670 --> 00:05:54.300 and that will give us 0.33. 96 00:05:54.300 --> 00:05:58.620 So now in step five, we have to draw our conclusion. 97 00:05:58.620 --> 00:06:01.260 So here, we will fail to reject the null 98 00:06:01.260 --> 00:06:05.430 because 0.33 is less than 3.09. 99 00:06:05.430 --> 00:06:08.790 So we do not have statistically significant evidence 100 00:06:08.790 --> 00:06:12.450 at an alpha of 0.05 to show that 101 00:06:12.450 --> 00:06:16.323 there is a difference in mean age among the three groups.
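The whole ANOVA table from steps four and five can be reproduced with a short Python sketch. It uses only the numbers stated in the lecture: the three group means, 40 participants per group, SStotal of 2,893, and the critical value 3.09. The function name and dictionary keys are hypothetical. One assumption to note: the grand mean is rounded to one decimal (75.2) to mirror the lecture's arithmetic, and keeping the unrounded value would give a slightly smaller SSB.

```python
# Hypothetical helper (not from the lecture): one-way ANOVA from summary data.
def anova_from_summary(means, sizes, ss_total, f_critical):
    k = len(means)
    big_n = sum(sizes)
    # Size-weighted grand mean, rounded to one decimal as in the lecture (75.2).
    grand_mean = round(sum(n * m for n, m in zip(sizes, means)) / big_n, 1)
    # SSB: sum over groups of n_j * (group mean - grand mean)^2
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ssw = ss_total - ssb            # SSW (also called SSE)
    msb = ssb / (k - 1)             # mean square between
    msw = ssw / (big_n - k)         # mean square within
    f_stat = msb / msw
    return {
        "grand_mean": grand_mean,
        "ssb": ssb,
        "ssw": ssw,
        "msb": msb,
        "msw": msw,
        "f": f_stat,
        "reject_null": f_stat >= f_critical,
    }


table = anova_from_summary([75.2, 75.6, 74.7], [40, 40, 40], 2893.0, 3.09)
print(f"SSB = {table['ssb']:.1f}, SSW = {table['ssw']:.1f}")
print(f"MSB = {table['msb']:.1f}, MSW = {table['msw']:.1f}")
print(f"F = {table['f']:.2f}, reject H0: {table['reject_null']}")
```

The critical value is passed in rather than computed, since the lecture reads it from the textbook's F-table with DF2 taken as 100.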
102 00:06:19.590 --> 00:06:20.940 I hope this was helpful, 103 00:06:20.940 --> 00:06:24.030 and I would like to thank you for your time and attention, 104 00:06:24.030 --> 00:06:26.220 and please feel free to contact me 105 00:06:26.220 --> 00:06:27.573 if you have any questions.