1 00:00:01,210 --> 00:00:02,840 - [Professor] Hello everyone, 2 00:00:02,840 --> 00:00:07,727 and welcome to the final exam review for CDAE 359. 3 00:00:12,580 --> 00:00:16,130 So our agenda here will just be to go over 4 00:00:18,070 --> 00:00:19,770 what's gonna be on the exam, 5 00:00:19,770 --> 00:00:22,080 starting with a few announcements 6 00:00:22,080 --> 00:00:22,963 and then, 7 00:00:25,970 --> 00:00:28,273 go over the actual topics. 8 00:00:38,390 --> 00:00:41,543 Here's what we'll be running down today. 9 00:00:42,520 --> 00:00:46,100 So starting with two 10 00:00:46,100 --> 00:00:50,660 big related concepts, 11 00:00:50,660 --> 00:00:55,660 and then going through the basic topics that we've done, 12 00:00:56,070 --> 00:01:01,070 limited dependent variables and sampling corrections, 13 00:01:01,460 --> 00:01:03,320 panel data, 14 00:01:03,320 --> 00:01:05,890 instrumental variables, two-stage least squares 15 00:01:05,890 --> 00:01:07,780 and simultaneous equations. 16 00:01:07,780 --> 00:01:12,570 And this is the material that will be emphasized 17 00:01:12,570 --> 00:01:14,090 on the exam. 18 00:01:14,090 --> 00:01:16,240 The exam is cumulative, 19 00:01:16,240 --> 00:01:19,360 but only to the extent that concepts 20 00:01:21,270 --> 00:01:24,570 and principles we've learned 21 00:01:24,570 --> 00:01:28,870 early on carry through, 22 00:01:28,870 --> 00:01:33,150 but the questions will very much focus 23 00:01:33,150 --> 00:01:36,873 on these specific topics. 24 00:01:41,730 --> 00:01:46,730 The two biggest relationships that I wanted to talk about, 25 00:01:47,530 --> 00:01:52,530 the two big concepts are, first, 26 00:01:53,150 --> 00:01:54,890 a lot of what we've done 27 00:01:54,890 --> 00:01:59,890 in the second half of this class has been looking at 28 00:02:00,320 --> 00:02:05,320 the trade-off of consistency, or bias, and efficiency, 29 00:02:05,740 --> 00:02:09,140 that in many cases, 30 00:02:09,140 --> 00:02:10,700 the techniques that we use 31 00:02:10,700 --> 00:02:14,480 to eliminate bias come at the cost of efficiency, 32 00:02:14,480 --> 00:02:16,970 that they make the variance 33 00:02:16,970 --> 00:02:21,970 and standard error of our betas increase greatly. 34 00:02:22,100 --> 00:02:25,480 And we found this with random effects 35 00:02:25,480 --> 00:02:29,570 and instrumental variables, two-stage least squares 36 00:02:29,570 --> 00:02:32,130 and have learned two tests. 37 00:02:32,130 --> 00:02:34,720 One that I'm gonna go over 38 00:02:34,720 --> 00:02:35,880 in a little bit more depth, 39 00:02:35,880 --> 00:02:37,170 mean square error, 40 00:02:37,170 --> 00:02:39,430 as well as the Wu Hausman test 41 00:02:39,430 --> 00:02:41,340 or the Hausman test that we learned 42 00:02:41,340 --> 00:02:44,653 when we were working with panel data. 43 00:02:46,680 --> 00:02:47,513 The other one, 44 00:02:47,513 --> 00:02:52,390 and there was a lot of focus on this concept, is endogeneity, 45 00:02:54,200 --> 00:02:58,350 and we recall that we assume and hope, 46 00:02:58,350 --> 00:03:01,570 if we have well-behaved data, 47 00:03:01,570 --> 00:03:06,570 that the expected value of the error term 48 00:03:06,810 --> 00:03:11,140 for any value of X equals zero, 49 00:03:11,140 --> 00:03:14,320 that the covariance of the error term 50 00:03:14,320 --> 00:03:17,350 and any regressor equals zero. 51 00:03:17,350 --> 00:03:20,940 And therefore we can say that X is exogenous, 52 00:03:20,940 --> 00:03:22,100 and this is what we want.
53 00:03:22,100 --> 00:03:23,720 This is well-behaved data. 54 00:03:23,720 --> 00:03:28,720 And this is one of the absolute foundational requirements 55 00:03:28,720 --> 00:03:30,793 to have an unbiased estimator. 56 00:03:32,130 --> 00:03:35,100 And we've learned that if this is not true, 57 00:03:35,100 --> 00:03:40,100 if the covariance does not equal zero, then X is endogenous. 58 00:03:40,240 --> 00:03:42,100 This causes bias. 59 00:03:42,100 --> 00:03:46,730 And just as a review, we wanna know the value of beta, 60 00:03:46,730 --> 00:03:51,653 how does Y change as X changes, all else equal. 61 00:03:52,610 --> 00:03:56,380 Really what we wanna know is how does y hat change 62 00:03:57,642 --> 00:03:58,725 as X changes, 63 00:04:00,456 --> 00:04:05,206 that is, how does the expected value of Y change as X changes, 64 00:04:07,330 --> 00:04:11,000 and because we cannot observe this, 65 00:04:11,000 --> 00:04:13,110 if the error term 66 00:04:14,340 --> 00:04:19,340 and any regressor are correlated, 67 00:04:19,920 --> 00:04:23,620 if their covariance does not equal zero, 68 00:04:23,620 --> 00:04:28,050 if du/dx does not equal zero, 69 00:04:28,050 --> 00:04:32,403 we cannot tell if a change in X results 70 00:04:33,300 --> 00:04:36,730 in a change in y hat or in u. 71 00:04:36,730 --> 00:04:40,883 And therefore this results in a biased estimate. 72 00:04:44,870 --> 00:04:48,610 Two tests that we learned are mean square error 73 00:04:48,610 --> 00:04:51,133 and the Wu Hausman test. 74 00:04:52,440 --> 00:04:56,763 And we'll talk a bit more about this. 75 00:04:58,670 --> 00:05:01,960 The Hausman test we learned, 76 00:05:01,960 --> 00:05:06,560 and you covered it pretty extensively on your homework. 77 00:05:06,560 --> 00:05:11,560 Mean squared error is the square 78 00:05:12,190 --> 00:05:15,980 of the bias plus the variance. 79 00:05:15,980 --> 00:05:20,380 So I'm gonna show you a bit more about what this means. 80 00:05:20,380 --> 00:05:24,683 So let's say that we have an endogenous regressor 81 00:05:26,130 --> 00:05:30,100 and we wanna know its beta, and which one should we use. 82 00:05:30,100 --> 00:05:32,910 So we can run both OLS 83 00:05:32,910 --> 00:05:37,910 and use this IV, two-stage least squares method. 84 00:05:38,260 --> 00:05:41,820 And just to keep things clear, 85 00:05:41,820 --> 00:05:45,123 let's call, when we run it by OLS, 86 00:05:45,123 --> 00:05:47,730 that this B is beta o. 87 00:05:47,730 --> 00:05:51,900 And when we run it by instrumental variable, 88 00:05:51,900 --> 00:05:54,290 we call this beta i. 89 00:05:54,290 --> 00:05:57,280 So the mean square error is the bias squared 90 00:05:57,280 --> 00:05:59,560 plus the variance. 91 00:05:59,560 --> 00:06:03,310 So for the instrumental variable, 92 00:06:03,310 --> 00:06:06,310 since there is no bias, 93 00:06:06,310 --> 00:06:10,450 it's just zero squared 94 00:06:10,450 --> 00:06:14,253 plus the variance of beta i. 95 00:06:16,650 --> 00:06:20,850 To get the mean square error of beta o, 96 00:06:20,850 --> 00:06:24,380 we subtract beta i minus beta o. 97 00:06:24,380 --> 00:06:29,380 So we subtract the biased from the unbiased value 98 00:06:29,930 --> 00:06:31,400 and square that, 99 00:06:31,400 --> 00:06:35,280 and then add the variance of beta o.
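To make the comparison concrete, here is a minimal sketch in Python of the mean squared error calculation just described; the numbers are hypothetical stand-ins for the beta o and beta i estimates and standard errors you would read off your own output, not values from our class data.

```python
# Minimal sketch of the MSE comparison (hypothetical numbers, not course data).
# MSE = bias^2 + variance. beta_i (the IV / two-stage least squares estimate)
# is treated as unbiased, so the bias of beta_o (OLS) is approximated by
# the gap between the two estimates, beta_i - beta_o.

def mse_ols(beta_o, beta_i, se_o):
    """MSE of the OLS estimate: (beta_i - beta_o)^2 + Var(beta_o)."""
    return (beta_i - beta_o) ** 2 + se_o ** 2

def mse_iv(se_i):
    """MSE of the IV estimate: zero squared plus Var(beta_i)."""
    return se_i ** 2

# Hypothetical estimates and standard errors from running the model both ways
beta_o, se_o = 0.42, 0.05   # OLS: more precise, but possibly biased
beta_i, se_i = 0.55, 0.15   # IV/2SLS: unbiased, but much noisier

print("MSE(beta o) =", round(mse_ols(beta_o, beta_i, se_o), 4))
print("MSE(beta i) =", round(mse_iv(se_i), 4))
```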
100 00:06:35,280 --> 00:06:39,700 And whichever of these has the smaller value, 101 00:06:39,700 --> 00:06:42,200 the mean square error of beta i, 102 00:06:42,200 --> 00:06:44,780 or the mean squared error of beta o, 103 00:06:44,780 --> 00:06:48,120 then this test would suggest that that is the one 104 00:06:48,120 --> 00:06:49,373 that we use. 105 00:06:53,400 --> 00:06:58,310 Now to get into the topics specifically 106 00:06:58,310 --> 00:07:01,020 that will be on the exam. 107 00:07:01,020 --> 00:07:02,410 So the first thing we did 108 00:07:02,410 --> 00:07:06,613 after the midterm was to look at LimDepVar, 109 00:07:07,670 --> 00:07:10,970 the limited dependent variables. 110 00:07:10,970 --> 00:07:12,430 So we learned 111 00:07:12,430 --> 00:07:17,430 that not all dependent variables meet OLS requirements, 112 00:07:17,820 --> 00:07:19,820 that in theory, 113 00:07:19,820 --> 00:07:24,070 the Y in OLS should be able to take any value 114 00:07:24,070 --> 00:07:28,710 from negative infinity to positive infinity. 115 00:07:28,710 --> 00:07:31,110 And while we certainly don't, you know, 116 00:07:31,110 --> 00:07:33,420 see all of those values used, 117 00:07:33,420 --> 00:07:35,600 ones that deviate far away 118 00:07:35,600 --> 00:07:39,090 from that will be more problematic. 119 00:07:39,090 --> 00:07:40,960 And we see this a lot, 120 00:07:40,960 --> 00:07:43,110 some are nominal or ordinal. 121 00:07:43,110 --> 00:07:47,407 Some naturally take on small integer values. 122 00:07:49,900 --> 00:07:52,060 Like how many cars you own, 123 00:07:52,060 --> 00:07:54,920 or how many times you've been arrested. 124 00:07:54,920 --> 00:07:58,667 Some sort of naturally pile up at zero 125 00:08:00,150 --> 00:08:02,940 due to behavior, 126 00:08:02,940 --> 00:08:06,207 such as how many cigarettes 127 00:08:08,120 --> 00:08:13,070 that you smoke or how much that you spend on alcohol. 128 00:08:13,070 --> 00:08:16,770 Many people would say zero for this. 129 00:08:16,770 --> 00:08:20,080 And then, 130 00:08:20,080 --> 00:08:25,060 sometimes our dependent variable cannot be measured 131 00:08:25,060 --> 00:08:28,010 outside of a certain range. 132 00:08:28,010 --> 00:08:29,690 So in these cases, 133 00:08:29,690 --> 00:08:34,290 we need to use methods other than ordinary least squares 134 00:08:34,290 --> 00:08:36,053 to get a good estimate of them. 135 00:08:38,950 --> 00:08:40,560 And then next, 136 00:08:40,560 --> 00:08:45,560 there are times when we don't have the perfect random sample 137 00:08:46,730 --> 00:08:50,120 where we can observe everybody, as we hope. 138 00:08:50,120 --> 00:08:55,077 Sometimes we choose to only focus on a sub sample, 139 00:08:55,950 --> 00:08:58,330 because that's what we are interested in, 140 00:08:58,330 --> 00:09:00,330 or that's all that we can reach. 141 00:09:00,330 --> 00:09:02,410 And then there are also times 142 00:09:02,410 --> 00:09:06,160 where only the participants can be observed. 143 00:09:06,160 --> 00:09:09,917 So we talked a lot about the case of women's wages 144 00:09:12,670 --> 00:09:14,020 in the workforce. 145 00:09:14,020 --> 00:09:15,570 And this was at a time 146 00:09:15,570 --> 00:09:20,570 when many women did not work outside the home, 147 00:09:21,100 --> 00:09:23,290 they were not in the workforce. 148 00:09:23,290 --> 00:09:26,680 So only the wages 149 00:09:26,680 --> 00:09:30,323 of those who were in the workforce could be observed. 150 00:09:31,605 --> 00:09:33,755 And we'll talk more about that in a minute. 
151 00:09:36,410 --> 00:09:40,690 So here are the categories of LimDepVar 152 00:09:40,690 --> 00:09:42,840 that we talked about, binary, ordinal, 153 00:09:42,840 --> 00:09:45,727 tobit, hurdle, Poisson and censored. 154 00:18:01,900 --> 00:18:06,900 The last type of limited dependent variable is censored. 155 00:18:07,210 --> 00:18:09,840 And in this case, 156 00:18:09,840 --> 00:18:11,930 we either choose 157 00:18:11,930 --> 00:18:16,270 or are only able to measure our Y's 158 00:18:16,270 --> 00:18:17,840 within a certain range. 159 00:18:17,840 --> 00:18:22,500 And you can think of it as a scale 160 00:18:22,500 --> 00:18:27,150 that maybe only goes up to 300 pounds 161 00:18:27,150 --> 00:18:29,350 and anybody who weighs more than that, 162 00:18:29,350 --> 00:18:31,930 we can't observe how much they weigh. 163 00:18:31,930 --> 00:18:33,260 And we talked about 164 00:18:35,620 --> 00:18:40,060 very often, we put income in categories, 165 00:18:40,060 --> 00:18:41,840 and that would be another example 166 00:18:41,840 --> 00:18:46,290 where if it's above this amount, 167 00:18:46,290 --> 00:18:51,290 that all we can observe is that it's up to that point. 168 00:18:51,470 --> 00:18:53,423 So in this case, 169 00:18:55,350 --> 00:18:56,870 we can't use OLS. 170 00:18:56,870 --> 00:18:59,893 We have to, again, use maximum likelihood, 171 00:19:01,700 --> 00:19:04,970 but the betas can be interpreted directly. 172 00:19:04,970 --> 00:19:08,710 So it is a constant, it is dy/dx, 173 00:19:08,710 --> 00:19:11,190 unlike the probit, 174 00:19:11,190 --> 00:19:13,420 logit, tobit and ordinal, 175 00:19:13,420 --> 00:19:15,393 and all of those that we learned about. 176 00:19:23,190 --> 00:19:28,190 Another topic that we covered this week is that so far, 177 00:19:28,600 --> 00:19:30,890 when we're talking about LimDepVar, 178 00:19:30,890 --> 00:19:34,640 these are properties of our dependent variable, 179 00:19:34,640 --> 00:19:37,810 and now we're gonna talk about some properties 180 00:19:37,810 --> 00:19:39,283 of our sample. 181 00:19:44,090 --> 00:19:47,480 And this is called truncation where, 182 00:19:47,480 --> 00:19:52,480 because of our researcher choice, 183 00:19:52,860 --> 00:19:54,310 we can only observe, 184 00:19:54,310 --> 00:19:58,910 or we choose to only observe, part of the population. 185 00:19:58,910 --> 00:20:03,270 So only folks 186 00:20:03,270 --> 00:20:04,530 of a certain income, 187 00:20:04,530 --> 00:20:07,860 higher income or lower income, 188 00:20:07,860 --> 00:20:10,990 or houses worth a certain amount, 189 00:20:10,990 --> 00:20:14,520 greater or less than that. 190 00:20:14,520 --> 00:20:17,750 Here we use T-tests 191 00:20:17,750 --> 00:20:20,890 and log likelihood tests 192 00:20:20,890 --> 00:20:23,703 to measure this. 193 00:20:27,750 --> 00:20:29,900 One of the more interesting parts 194 00:20:29,900 --> 00:20:34,900 of truncation is so-called incidental truncation, 195 00:20:35,380 --> 00:20:39,620 where we can only observe participants. 196 00:20:39,620 --> 00:20:44,400 In sort of more broad truncation, 197 00:20:44,400 --> 00:20:49,400 we could observe higher income folks' behavior 198 00:20:49,730 --> 00:20:51,880 and get data from them, 199 00:20:51,880 --> 00:20:55,750 even though we're only interested in lower income, 200 00:20:55,750 --> 00:20:59,550 or we choose to put our attention on lower income. 201 00:20:59,550 --> 00:21:03,020 Incidental truncation, though, is a case 202 00:21:03,020 --> 00:21:06,490 where we can only observe participants. 
203 00:21:06,490 --> 00:21:10,347 So if we wanna measure the returns 204 00:21:11,640 --> 00:21:16,640 of a college education on income, 205 00:21:16,850 --> 00:21:19,750 we can only observe those who went 206 00:21:19,750 --> 00:21:23,953 to college to estimate that effect, 207 00:21:25,500 --> 00:21:30,500 or the classic case of women in the workforce, 208 00:21:32,730 --> 00:21:35,920 where we can only observe the wage 209 00:21:35,920 --> 00:21:39,050 of those who are in the workforce. 210 00:21:39,050 --> 00:21:44,050 And so in this case, 211 00:21:44,900 --> 00:21:49,900 the participation is endogenous. 212 00:21:51,040 --> 00:21:55,973 And we have to estimate this with two steps. 213 00:21:57,360 --> 00:22:02,360 And this is a fairly 214 00:22:02,900 --> 00:22:07,033 commonly used model in econometrics. 215 00:22:11,180 --> 00:22:14,873 Now we're gonna jump to panel data. 216 00:22:17,230 --> 00:22:20,420 Before this, we've worked exclusively 217 00:22:20,420 --> 00:22:22,580 with cross sectional data. 218 00:22:22,580 --> 00:22:26,370 So only one time period 219 00:22:27,320 --> 00:22:30,000 and with many observations. 220 00:22:30,000 --> 00:22:34,560 But now we're going to work with many observations 221 00:22:34,560 --> 00:22:36,420 over many years. 222 00:22:36,420 --> 00:22:40,703 So now T is greater than one and N is greater than one. 223 00:22:42,050 --> 00:22:46,670 As an aside, up till this point we had only worked 224 00:22:46,670 --> 00:22:47,960 with cross section, 225 00:22:47,960 --> 00:22:48,930 T equals one. 226 00:22:48,930 --> 00:22:53,700 So cross section is data from N participants 227 00:22:53,700 --> 00:22:58,267 in a single time period, and time series is data 228 00:22:59,260 --> 00:23:01,900 from a single participant, 229 00:23:01,900 --> 00:23:06,900 or looking at a single variable, 230 00:23:07,110 --> 00:23:10,193 over many years, like income or GDP 231 00:23:12,160 --> 00:23:15,510 or prices or anything like that. 232 00:23:15,510 --> 00:23:19,130 So panel data is when we have both T 233 00:23:19,130 --> 00:23:20,923 and N greater than one. 234 00:23:26,256 --> 00:23:30,630 In a sense, panel data combines both cross section 235 00:23:30,630 --> 00:23:33,330 and time series. 236 00:23:33,330 --> 00:23:35,590 And there's two main types. 237 00:23:35,590 --> 00:23:39,720 First is independently pooled cross section, 238 00:23:39,720 --> 00:23:44,720 where you are asking the same questions, 239 00:23:44,990 --> 00:23:48,890 measuring the same variables over time, 240 00:23:48,890 --> 00:23:53,780 but each time you're drawing a random sample. 241 00:23:53,780 --> 00:23:58,630 So an example of this is asking the same question 242 00:23:58,630 --> 00:24:02,663 on the Vermonter Poll over multiple years. 243 00:24:07,750 --> 00:24:11,700 Whereas true panel data, 244 00:24:11,700 --> 00:24:14,860 you're following the same subjects 245 00:24:14,860 --> 00:24:19,080 and keeping track of what each individual said 246 00:24:20,050 --> 00:24:24,883 for each variable over each time period. 247 00:24:29,120 --> 00:24:32,230 One application of this independently pooled 248 00:24:32,230 --> 00:24:36,690 cross section is the difference in differences model, 249 00:24:36,690 --> 00:24:41,380 which is sort of a natural or quasi-experiment 250 00:24:41,380 --> 00:24:46,380 where we have a treatment that takes place in one, 251 00:24:46,790 --> 00:24:48,080 say one area, 252 00:24:48,080 --> 00:24:49,850 but not in another. 
253 00:24:49,850 --> 00:24:52,170 And then we measure the before 254 00:24:52,170 --> 00:24:55,763 and the after effects of this. 255 00:24:56,770 --> 00:24:59,860 So we choose two areas. 256 00:24:59,860 --> 00:25:01,840 One gets a treatment, 257 00:25:01,840 --> 00:25:04,380 one gets the control. 258 00:25:04,380 --> 00:25:09,380 We take a baseline reading of a random sample of those areas 259 00:25:11,110 --> 00:25:12,570 in the first time period, 260 00:25:12,570 --> 00:25:17,570 then the treatment happens. 261 00:25:17,870 --> 00:25:21,540 And then we take another reading in time two. 262 00:25:21,540 --> 00:25:23,030 And you can see here that 263 00:25:26,736 --> 00:25:28,193 the coefficient of interest is delta one here. 264 00:25:32,530 --> 00:25:34,127 That is the effect 265 00:25:39,790 --> 00:25:43,330 of the treatment in time two, 266 00:25:43,330 --> 00:25:47,870 controlling for the treatment area 267 00:25:47,870 --> 00:25:52,230 and the passage of time. 268 00:25:52,230 --> 00:25:57,230 So, this delta one isolates the effect of the treatment, 269 00:25:58,940 --> 00:26:03,450 the result of actually being treated in time two, 270 00:26:03,450 --> 00:26:07,290 so what changed from time one to time two 271 00:26:07,290 --> 00:26:09,130 in the treatment area, 272 00:26:09,130 --> 00:26:12,983 controlling for both treatment and time. 273 00:26:17,770 --> 00:26:18,802 On the other hand, 274 00:26:18,802 --> 00:26:19,680 (mic screeching) 275 00:26:19,680 --> 00:26:21,650 true panel data, 276 00:26:21,650 --> 00:26:25,830 you're not taking a random sample each time. 277 00:26:25,830 --> 00:26:28,660 It's not a new sample each time, 278 00:26:28,660 --> 00:26:31,950 you're actually following the same individuals 279 00:26:31,950 --> 00:26:36,580 or the same units of analysis 280 00:26:36,580 --> 00:26:38,460 over multiple years 281 00:26:38,460 --> 00:26:41,610 and keeping track of who says what. 282 00:26:41,610 --> 00:26:46,320 So when we talked about this City Market measurement 283 00:26:47,930 --> 00:26:49,340 of their members, 284 00:26:49,340 --> 00:26:53,430 where they follow, say, me over many years 285 00:26:53,430 --> 00:26:58,143 and keep track of what I said to each question in each year. 286 00:26:58,980 --> 00:27:03,863 This tends to have usually a fairly big N and a small T. 287 00:27:04,950 --> 00:27:09,950 The problem is there is unobserved individual heterogeneity, 288 00:27:12,257 --> 00:27:15,510 that there are things that just make me, me 289 00:27:15,510 --> 00:27:17,110 and make you, you 290 00:27:17,110 --> 00:27:19,510 that are very hard to measure 291 00:27:19,510 --> 00:27:23,980 and that we can think of as almost certainly correlated 292 00:27:23,980 --> 00:27:25,660 with our regressors, 293 00:27:25,660 --> 00:27:28,023 so we have to account for those. 294 00:27:32,210 --> 00:27:33,230 How do we account 295 00:27:33,230 --> 00:27:38,110 for this unobserved individual heterogeneity? 296 00:27:39,560 --> 00:27:44,560 Well, we could have a dummy variable for each individual. 297 00:27:44,630 --> 00:27:48,160 The problem with that is, 298 00:27:48,160 --> 00:27:51,610 especially in a single time period, 299 00:27:51,610 --> 00:27:55,870 you're gonna end up with negative degrees of freedom. 300 00:27:55,870 --> 00:27:59,050 And in any case, 301 00:27:59,050 --> 00:28:01,707 you're gonna have a whole lot of regressors. 
302 00:28:01,707 --> 00:28:06,000 So you're gonna have N dummy variables 303 00:28:06,000 --> 00:28:09,560 for N individuals for whom you have data, 304 00:28:09,560 --> 00:28:12,853 and that really eats into your degrees of freedom. 305 00:28:13,720 --> 00:28:16,780 But we know that if we ignore it, 306 00:28:16,780 --> 00:28:20,020 that this will go into the error term 307 00:28:20,020 --> 00:28:22,730 and it will cause biased estimates 308 00:28:22,730 --> 00:28:25,330 of all of our betas of interest. 309 00:28:25,330 --> 00:28:29,573 So we have to deal with it in some way. 310 00:28:36,070 --> 00:28:39,690 We learned about two ways and I'll go through each one: 311 00:28:39,690 --> 00:28:43,930 first differencing, and 312 00:28:43,930 --> 00:28:47,160 the fixed effects, time demeaning. 313 00:28:47,160 --> 00:28:52,160 And note that if we were to run OLS on true panel data, 314 00:28:54,680 --> 00:28:57,620 that this results in a biased estimate, 315 00:28:57,620 --> 00:29:01,600 that it will probably be more efficient, 316 00:29:01,600 --> 00:29:04,530 but it will almost certainly be biased. 317 00:29:04,530 --> 00:29:08,650 And it is unbiased only 318 00:29:08,650 --> 00:29:10,993 when we draw an independent, 319 00:29:14,010 --> 00:29:15,700 random sample, 320 00:29:15,700 --> 00:29:20,700 a new sample every time; only that gets rid of the bias. 321 00:29:21,280 --> 00:29:26,250 And if we keep sampling the same people over time, 322 00:29:26,250 --> 00:29:30,593 running it as pooled OLS will result in bias. 323 00:29:35,550 --> 00:29:40,550 The first method that we learned is first differencing. 324 00:29:40,670 --> 00:29:41,740 So here, 325 00:29:41,740 --> 00:29:46,740 we keep each individual person's observations together, 326 00:29:49,070 --> 00:29:54,070 and we subtract time t minus one from time t. 327 00:29:54,920 --> 00:29:59,450 So we take the difference for, say, me 328 00:29:59,450 --> 00:30:02,290 in this City Market example, 329 00:30:02,290 --> 00:30:06,650 of how much I spent in 2017 330 00:30:06,650 --> 00:30:10,670 and subtract how much I spent in 2012. 331 00:30:10,670 --> 00:30:13,910 And so we get the delta y 332 00:30:13,910 --> 00:30:16,893 and delta x here. 333 00:30:18,350 --> 00:30:21,990 I think you can see that the ai drops out by subtraction 334 00:30:21,990 --> 00:30:24,690 because it's the same thing every time. 335 00:30:24,690 --> 00:30:28,700 One thing that we need is for the delta x's and delta u's 336 00:30:28,700 --> 00:30:33,120 to be uncorrelated. 337 00:30:33,120 --> 00:30:36,300 This is called the strict form, 338 00:30:36,300 --> 00:30:38,900 a form of strict exogeneity, 339 00:30:38,900 --> 00:30:43,790 that the covariance of all of my X's 340 00:30:43,790 --> 00:30:45,020 over all times 341 00:30:45,020 --> 00:30:49,400 and all of my error terms equals zero. 342 00:30:49,400 --> 00:30:54,400 By running the regression on this bottom equation here, 343 00:30:58,485 --> 00:31:03,485 the change in my y as a function of the change in my X, 344 00:31:04,550 --> 00:31:08,160 the beta that you get there is called 345 00:31:08,160 --> 00:31:11,883 the first difference estimator. 346 00:31:16,890 --> 00:31:20,177 This technique, first differencing, 347 00:31:23,700 --> 00:31:27,980 the two big pros of it are that it does eliminate bias, 348 00:31:27,980 --> 00:31:31,360 and, if you have serial correlation, 349 00:31:31,360 --> 00:31:36,360 it helps to get rid of the serial correlation of the errors. 
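To see the mechanics, here is a minimal sketch in Python of the first-difference transformation and the first difference estimator; it assumes a hypothetical long-format pandas DataFrame with columns person, year, y, and x, which are illustrative stand-ins rather than the actual City Market variables.

```python
# Minimal sketch of first differencing, assuming hypothetical columns
# person, year, y, x in a long-format DataFrame (not the actual course data).
import pandas as pd
import statsmodels.api as sm

def first_difference_ols(df):
    df = df.sort_values(["person", "year"])
    # Within each person, subtract time t-1 from time t; a_i drops out
    # because it is the same in both periods.
    d = df.groupby("person")[["y", "x"]].diff().dropna()
    # The slope on delta x from this regression is the first difference estimator.
    return sm.OLS(d["y"], sm.add_constant(d["x"])).fit()

# Example usage with made-up numbers (note each person loses one observation):
df = pd.DataFrame({"person": [1, 1, 1, 2, 2, 2],
                   "year":   [2012, 2015, 2017] * 2,
                   "y":      [10.0, 11.5, 12.0, 8.0, 9.0, 11.0],
                   "x":      [1.0, 1.8, 2.0, 1.5, 2.2, 3.0]})
print(first_difference_ols(df).params)
```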
350 00:31:37,810 --> 00:31:42,490 The drawbacks are you lose a whole observation. 351 00:31:42,490 --> 00:31:47,490 So if you have data over two years 352 00:31:47,810 --> 00:31:49,410 for each individual, 353 00:31:49,410 --> 00:31:51,960 now you only have one observation, 354 00:31:51,960 --> 00:31:56,330 or if you have three years, it's down to two, et cetera. 355 00:31:56,330 --> 00:32:01,330 So you only get t minus one observations for each person. 356 00:32:03,490 --> 00:32:06,190 Since we're subtracting out, 357 00:32:06,190 --> 00:32:09,190 we're only measuring the change in X, 358 00:32:09,190 --> 00:32:13,920 so you're probably going to have less variation in X. 359 00:32:13,920 --> 00:32:18,410 We lose any information that might be contained in ai 360 00:32:18,410 --> 00:32:20,500 because it's subtracted out. 361 00:32:20,500 --> 00:32:22,680 And as a result, 362 00:32:22,680 --> 00:32:26,180 because we lose degrees of freedom, 363 00:32:26,180 --> 00:32:28,590 we lose information each time. 364 00:32:28,590 --> 00:32:31,803 This tends to be a less efficient estimator. 365 00:32:35,140 --> 00:32:37,760 The other technique that we learned to deal 366 00:32:37,760 --> 00:32:42,760 with panel data is fixed effects, 367 00:32:45,590 --> 00:32:48,280 where we do the time demeaning. 368 00:32:48,280 --> 00:32:50,530 So if we have one regressor, 369 00:32:50,530 --> 00:32:53,530 we still have this model 370 00:32:53,530 --> 00:32:58,530 where we have data from each individual 371 00:32:59,490 --> 00:33:00,880 over t times. 372 00:33:00,880 --> 00:33:03,120 And we assume that there's this ai there 373 00:33:03,120 --> 00:33:04,930 for every individual. 374 00:33:04,930 --> 00:33:09,910 Now we group each individual's observations. 375 00:33:09,910 --> 00:33:13,190 So, say, take all of my observations 376 00:33:13,190 --> 00:33:14,870 and take the mean of them. 377 00:33:14,870 --> 00:33:19,770 So if I answered the survey three times, 378 00:33:19,770 --> 00:33:21,460 take all three of my Y's, 379 00:33:21,460 --> 00:33:24,280 and take the mean of them. 380 00:33:24,280 --> 00:33:29,280 So that's yi bar, same with xi bar and ui bar. 381 00:33:29,420 --> 00:33:33,630 And then we subtract them from each time observation. 382 00:33:33,630 --> 00:33:37,700 So say in year three, 383 00:33:37,700 --> 00:33:40,880 you take my Y and subtract 384 00:33:42,100 --> 00:33:46,500 the mean of my Y, and the same for each X, 385 00:33:46,500 --> 00:33:49,660 and in this way, 386 00:33:49,660 --> 00:33:53,510 since ai doesn't change 387 00:33:53,510 --> 00:33:57,170 over time, the mean of ai is always ai, 388 00:33:57,170 --> 00:33:58,700 so it gets subtracted out. 389 00:33:58,700 --> 00:34:00,700 And in the same way, 390 00:34:00,700 --> 00:34:05,340 it gets rid of this endogeneity 391 00:34:05,340 --> 00:34:07,203 and therefore the bias. 392 00:34:08,510 --> 00:34:13,510 And this could be done for any number of K's 393 00:34:13,950 --> 00:34:17,510 and for any number of times. 394 00:34:17,510 --> 00:34:22,510 So if this were pooled data, 395 00:34:22,650 --> 00:34:26,550 you would have NT observations, 396 00:34:26,550 --> 00:34:27,633 but here, 397 00:34:30,159 --> 00:34:32,350 the degrees of freedom 398 00:34:34,400 --> 00:34:37,107 are NT minus N minus k, 399 00:34:41,270 --> 00:34:43,630 because we lose a degree of freedom 400 00:34:43,630 --> 00:34:45,670 for each person by subtracting out the mean, 401 00:34:45,670 --> 00:34:49,253 and then k for the k regressors. 
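Here is a minimal Python sketch of the within, time demeaning transformation just described, again assuming a hypothetical long-format DataFrame with columns person, y, and x; a packaged fixed effects routine would additionally use the NT minus N minus k degrees of freedom when computing standard errors, which this bare OLS call does not.

```python
# Minimal sketch of fixed effects via time demeaning, assuming hypothetical
# columns person, y, x in a long-format pandas DataFrame.
import pandas as pd
import statsmodels.api as sm

def fixed_effects_ols(df):
    # Subtract each person's mean from each of that person's observations.
    # Since a_i is constant over time, its mean is a_i, so it is removed.
    means = df.groupby("person")[["y", "x"]].transform("mean")
    demeaned = df[["y", "x"]] - means
    # No constant needed: the demeaned variables have mean zero by construction.
    # Caution: these OLS standard errors do not use the NT - N - k degrees of
    # freedom, so a dedicated fixed effects routine is preferred in practice.
    return sm.OLS(demeaned["y"], demeaned[["x"]]).fit()
```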
402 00:34:52,530 --> 00:34:56,750 So the pros and cons of fixed effects: 403 00:34:56,750 --> 00:34:59,203 it does eliminate bias as well. 404 00:35:00,410 --> 00:35:04,690 And compared to first differencing, 405 00:35:04,690 --> 00:35:08,420 for any number of times greater than two, 406 00:35:08,420 --> 00:35:10,893 it retains more degrees of freedom. 407 00:35:15,998 --> 00:35:20,610 It keeps one more observation 408 00:35:20,610 --> 00:35:22,320 per person 409 00:35:22,320 --> 00:35:24,747 than first differencing does. 410 00:35:26,230 --> 00:35:29,677 The cons are the same as first differencing. 411 00:35:31,680 --> 00:35:36,680 Any predictor that is constant over time will be lost. 412 00:35:37,470 --> 00:35:39,650 It gets subtracted out, 413 00:35:39,650 --> 00:35:44,650 and it does not deal with serial correlation 414 00:35:46,840 --> 00:35:49,840 in the way that first differencing did. 415 00:35:49,840 --> 00:35:52,900 And you need homoskedasticity 416 00:35:52,900 --> 00:35:56,500 and a lack of serial correlation 417 00:35:56,500 --> 00:35:59,163 for this to be an efficient estimator. 418 00:36:02,640 --> 00:36:05,430 This slide discusses the difference 419 00:36:05,430 --> 00:36:08,100 between our two techniques, 420 00:36:08,100 --> 00:36:09,677 the first differencing 421 00:36:09,677 --> 00:36:14,677 and time demeaning fixed effects data transformations. 422 00:36:15,320 --> 00:36:20,320 So, as we saw in the SPSS exercise, 423 00:36:21,510 --> 00:36:25,060 when you had two time periods, 424 00:36:25,060 --> 00:36:27,093 you get identical betas. 425 00:36:28,340 --> 00:36:32,020 If T is greater than or equal to three, 426 00:36:32,020 --> 00:36:35,663 then you have some decisions to make. 427 00:36:37,732 --> 00:36:39,990 So they're both unbiased, 428 00:36:39,990 --> 00:36:42,080 as long as the other 429 00:36:43,910 --> 00:36:47,963 classic linear regression assumptions hold. 430 00:36:48,880 --> 00:36:49,980 And in many ways, 431 00:36:49,980 --> 00:36:53,950 it comes down to whether there is serial correlation. 432 00:36:53,950 --> 00:36:55,850 So when we do time series, 433 00:36:55,850 --> 00:36:58,240 we'll learn how to test for that. 434 00:36:58,240 --> 00:37:02,940 So if there is no serial correlation, 435 00:37:02,940 --> 00:37:07,940 then time demeaning is more efficient, 436 00:37:08,810 --> 00:37:09,660 in many ways, 437 00:37:09,660 --> 00:37:13,630 because you have one more observation to work with. 438 00:37:13,630 --> 00:37:18,123 If there is significant serial correlation, 439 00:37:19,590 --> 00:37:22,364 then it's better to use first differencing 440 00:37:22,364 --> 00:37:24,833 to eliminate this. 441 00:37:26,170 --> 00:37:28,890 And also when you have a large T, 442 00:37:28,890 --> 00:37:32,120 so when T is large and N is small, 443 00:37:32,120 --> 00:37:34,770 it becomes more of a time series model 444 00:37:34,770 --> 00:37:38,610 and first differencing is better. 445 00:37:38,610 --> 00:37:42,030 And as is so often the case, 446 00:37:42,030 --> 00:37:44,230 it's good to do things both ways 447 00:37:44,230 --> 00:37:49,090 and to discuss why and how they differ, 448 00:37:49,090 --> 00:37:52,743 and which is the final model that you choose. 449 00:38:01,360 --> 00:38:04,420 The last data transformation that we can do 450 00:38:05,610 --> 00:38:08,750 in panel data is random effects. 
451 00:38:08,750 --> 00:38:12,470 And we have our same model where we have 452 00:38:14,531 --> 00:38:19,531 N research subjects, data collected over t times, 453 00:38:19,840 --> 00:38:22,640 we have k regressors, 454 00:38:22,640 --> 00:38:27,640 but now we can assume that ai is uncorrelated 455 00:38:29,300 --> 00:38:32,683 with any of the Xs at any time, 456 00:38:33,940 --> 00:38:37,360 which basically means that we've controlled 457 00:38:37,360 --> 00:38:39,560 for everything through our Xs, 458 00:38:39,560 --> 00:38:43,593 and therefore ai is very small and insignificant. 459 00:38:44,470 --> 00:38:49,470 And by doing the time demeaning or first differencing, 460 00:38:53,550 --> 00:38:55,230 we lose information 461 00:38:55,230 --> 00:39:00,230 and therefore those betas are less efficient. 462 00:39:00,880 --> 00:39:02,773 They have a higher variance. 463 00:39:04,390 --> 00:39:07,350 So this is a form of FGLS. 464 00:39:07,350 --> 00:39:09,930 And here you can see Jeff Feagles 465 00:39:09,930 --> 00:39:14,293 when he was a punter for the New England Patriots. 466 00:39:15,210 --> 00:39:19,700 And anyway, 467 00:39:19,700 --> 00:39:24,700 we weigh every observation by a factor of one minus lambda, 468 00:39:25,560 --> 00:39:29,113 and we use our data to come up with the lambda, 469 00:39:30,050 --> 00:39:35,050 and you can see that the lambda is a function of 470 00:39:37,860 --> 00:39:39,930 the variance of our error term, 471 00:39:39,930 --> 00:39:43,500 which we've worked with before, 472 00:39:43,500 --> 00:39:47,410 the variance of our a term, and t, 473 00:39:47,410 --> 00:39:50,440 how many time observations there are. 474 00:39:50,440 --> 00:39:55,370 And then we use this lambda to weigh every observation. 475 00:39:55,370 --> 00:39:56,233 So, 476 00:40:01,000 --> 00:40:06,000 we weigh the mean 477 00:40:06,090 --> 00:40:10,790 by this factor. 478 00:40:10,790 --> 00:40:15,790 And in the next one, I will show you what this all means. 479 00:40:18,520 --> 00:40:22,050 So again, this is our lambda. 480 00:40:22,050 --> 00:40:24,330 So with lambda equals zero, 481 00:40:24,330 --> 00:40:28,130 then we don't subtract any of the mean out, 482 00:40:28,130 --> 00:40:29,750 no, wait, that's wrong. 483 00:40:29,750 --> 00:40:34,750 If lambda equals zero, then it's the same as fixed effects. 484 00:40:35,260 --> 00:40:38,570 We subtract all of our mean out. 485 00:40:38,570 --> 00:40:42,870 If lambda equals one, it's the same as pooled OLS. 486 00:40:42,870 --> 00:40:45,423 We do not subtract any of it out. 487 00:40:46,380 --> 00:40:51,380 And in most cases, it'll be a number between zero and one. 488 00:40:53,230 --> 00:40:58,230 Note that if a varies a lot, a is important, 489 00:41:00,198 --> 00:41:03,150 and the sigma squared a is large, 490 00:41:03,150 --> 00:41:06,930 then one minus lambda approaches one, 491 00:41:06,930 --> 00:41:09,330 and it becomes more like the fixed effects, 492 00:41:09,330 --> 00:41:12,430 which we would assume, if a is important, 493 00:41:12,430 --> 00:41:15,480 we need to subtract it out. 494 00:41:15,480 --> 00:41:17,083 As T gets large, 495 00:41:18,660 --> 00:41:20,260 one minus lambda, 496 00:41:20,260 --> 00:41:22,000 again, approaches one, 497 00:41:22,000 --> 00:41:24,210 and it becomes like fixed effects, 498 00:41:24,210 --> 00:41:27,760 and as a gets smaller and smaller, 499 00:41:27,760 --> 00:41:31,500 one minus lambda approaches zero 500 00:41:31,500 --> 00:41:32,983 and it's more like pooled. 
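To tie these pieces together, here is a minimal Python sketch of the random effects quasi-demeaning factor. The slide's exact formula is not reproduced in the transcript, so the lambda below is an assumption: lambda equal to the square root of sigma squared u over sigma squared u plus T times sigma squared a, which is the standard FGLS form and matches every limiting case described above (lambda of zero behaving like fixed effects, lambda of one like pooled OLS, and one minus lambda growing as sigma squared a or T grows).

```python
# Sketch of the random effects quasi-demeaning factor. The formula for lambda
# is an assumption (standard FGLS form), consistent with the limits in the lecture:
# lambda = sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2))
import math

def re_lambda(sigma_u2, sigma_a2, T):
    return math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

def quasi_demean(y_it, y_bar_i, lam):
    # Each observation has (1 - lambda) of its individual mean subtracted.
    return y_it - (1 - lam) * y_bar_i

# Limiting behavior, matching the lecture:
print(re_lambda(1.0, 100.0, 5))    # a matters a lot -> ~0.04, close to fixed effects
print(re_lambda(1.0, 0.001, 5))    # a barely matters -> ~1.00, close to pooled OLS
print(re_lambda(1.0, 1.0, 1000))   # T very large     -> ~0.03, close to fixed effects
```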
501 00:41:36,470 --> 00:41:38,770 We learned about the Wu Hausman test 502 00:41:38,770 --> 00:41:43,770 or Durbin Wu Hausman test, it has a number of names, 503 00:41:43,980 --> 00:41:47,203 but this test stat, W: 504 00:41:48,370 --> 00:41:51,370 in the numerator is the difference 505 00:41:51,370 --> 00:41:53,420 between the two betas, 506 00:41:53,420 --> 00:41:55,823 the random effects and fixed effects, squared, 507 00:41:56,830 --> 00:42:01,240 and in the denominator is just the difference 508 00:42:01,240 --> 00:42:03,440 of the variances. 509 00:42:03,440 --> 00:42:06,060 So you would run it both ways. 510 00:42:06,060 --> 00:42:08,623 And for a given beta, 511 00:42:09,540 --> 00:42:13,460 note, what is the estimate of beta? 512 00:42:13,460 --> 00:42:15,800 Those two things go in the numerator, 513 00:42:15,800 --> 00:42:18,050 and what are the variances of these? 514 00:42:18,050 --> 00:42:22,390 So you would take the standard error from the SPSS 515 00:42:22,390 --> 00:42:23,740 and square it, 516 00:42:23,740 --> 00:42:26,183 and those are used in the denominator. 517 00:42:27,100 --> 00:42:31,570 Our null hypothesis is that this X and ai, 518 00:42:31,570 --> 00:42:35,320 that their covariance equals zero. 519 00:42:35,320 --> 00:42:38,280 So if the null is true, 520 00:42:38,280 --> 00:42:43,280 if the covariance of those two things is zero, 521 00:42:43,960 --> 00:42:46,950 then the numerator is small. 522 00:42:46,950 --> 00:42:50,290 There's very small or no bias. 523 00:42:50,290 --> 00:42:52,760 The denominator is large 524 00:42:52,760 --> 00:42:56,260 because 525 00:42:56,260 --> 00:43:00,430 there is a big difference in the efficiencies, 526 00:43:00,430 --> 00:43:01,710 W is small, 527 00:43:01,710 --> 00:43:06,710 and we use random effects. 528 00:43:07,890 --> 00:43:11,450 If the test stat is large, 529 00:43:11,450 --> 00:43:14,160 which means a lot of bias 530 00:43:14,160 --> 00:43:17,990 and not very much change in variance, 531 00:43:17,990 --> 00:43:20,670 so a big number in the numerator, 532 00:43:20,670 --> 00:43:24,640 a small number in the denominator means a large test stat, 533 00:43:24,640 --> 00:43:27,160 which means that we reject the null 534 00:43:27,160 --> 00:43:29,283 and we use fixed effects. 535 00:43:33,760 --> 00:43:36,570 Now we jump to the next topic, (mic screeching) 536 00:43:36,570 --> 00:43:40,230 instrumental variables and two-stage least squares. 537 00:43:40,230 --> 00:43:43,560 So this is another data transformation 538 00:43:43,560 --> 00:43:45,690 to deal with endogeneity. 539 00:43:45,690 --> 00:43:49,280 You may notice (paper screeching) 540 00:43:49,280 --> 00:43:50,790 a trend, that a lot 541 00:43:50,790 --> 00:43:53,680 of what we were doing in the second half 542 00:43:53,680 --> 00:43:56,970 of this class is transforming data 543 00:43:56,970 --> 00:44:00,040 to deal with endogeneity. 544 00:44:00,040 --> 00:44:04,500 And again, finding that in almost every case, 545 00:44:04,500 --> 00:44:09,253 eliminating bias comes at a cost of efficiency. 546 00:44:10,930 --> 00:44:15,373 So this is a method that we use to address endogeneity. 547 00:44:19,420 --> 00:44:22,420 Its applications are in omitted 548 00:44:22,420 --> 00:44:26,470 variables and measurement errors, 549 00:44:26,470 --> 00:44:31,470 and most commonly it is used in the next topic 550 00:44:31,760 --> 00:44:35,580 that we covered, simultaneous equations. 
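Looking back at that panel data slide for a moment, here is a minimal sketch of the one-coefficient Hausman comparison just described, with hypothetical estimates and squared standard errors standing in for the numbers you would pull from SPSS output; for a single beta the statistic is compared against a chi-squared distribution with one degree of freedom.

```python
# Sketch of the one-coefficient Wu-Hausman comparison, using hypothetical
# numbers in place of real SPSS output.
from scipy import stats

def hausman_w(beta_fe, beta_re, se_fe, se_re):
    """W = (beta_FE - beta_RE)^2 / (Var(beta_FE) - Var(beta_RE))."""
    return (beta_fe - beta_re) ** 2 / (se_fe ** 2 - se_re ** 2)

# Hypothetical estimates: fixed effects is less efficient (bigger variance),
# random effects is more efficient but possibly biased.
W = hausman_w(beta_fe=0.80, beta_re=0.55, se_fe=0.20, se_re=0.10)
p_value = 1 - stats.chi2.cdf(W, df=1)   # one coefficient -> one degree of freedom

# Large W (small p): lots of bias, reject the null, use fixed effects.
# Small W: little bias relative to the efficiency loss, use random effects.
print(round(W, 2), round(p_value, 3))
```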
551 00:44:35,580 --> 00:44:38,460 So we've talked a lot about this, 552 00:44:38,460 --> 00:44:40,210 but just as a review, 553 00:44:40,210 --> 00:44:42,630 that endogeneity causes bias, 554 00:44:42,630 --> 00:44:47,630 that if the regressor and the error term are correlated, 555 00:44:48,120 --> 00:44:50,680 that we can't know (paper screeching) 556 00:44:50,680 --> 00:44:55,680 what is the direct effect of x on the expected value, 557 00:44:56,340 --> 00:45:00,820 because we can't partial out 558 00:45:02,310 --> 00:45:04,660 the way that x affects u, 559 00:45:04,660 --> 00:45:06,380 and then u affects y. 560 00:45:06,380 --> 00:45:10,670 So this is why failing to deal 561 00:45:10,670 --> 00:45:13,963 with endogeneity gives us biased estimates. 562 00:45:17,400 --> 00:45:21,103 So the three main ways that it shows up, 563 00:45:22,420 --> 00:45:25,930 or the three main sort of types of data that we have, 564 00:45:25,930 --> 00:45:30,360 or three main issues, are omitted variables, 565 00:45:30,360 --> 00:45:34,003 where we are missing a variable; 566 00:45:35,110 --> 00:45:38,050 measurement error, where our regressor has an error (paper screeching) 567 00:45:38,050 --> 00:45:40,713 in how it is measured; or simultaneity. 568 00:45:41,750 --> 00:45:43,040 So far, 569 00:45:43,040 --> 00:45:47,310 what we've learned is we could ignore it, 570 00:45:47,310 --> 00:45:49,010 but we know that causes bias. (mic screeching) 571 00:45:49,010 --> 00:45:52,690 We could find a proxy that is exogenous, 572 00:45:52,690 --> 00:45:55,623 which may or may not be possible. 573 00:45:56,830 --> 00:46:00,330 We can assume it is time constant, 574 00:46:00,330 --> 00:46:05,330 and in panel data subtract it out through first differencing 575 00:46:06,560 --> 00:46:11,370 or subtract out its mean through fixed effects, 576 00:46:11,370 --> 00:46:16,370 or we can use the instrumental variable method. 577 00:46:20,410 --> 00:46:25,170 So this method requires us 578 00:46:25,170 --> 00:46:30,170 to have an additional regressor, 579 00:46:30,684 --> 00:46:34,520 or an additional variable 580 00:46:34,520 --> 00:46:37,410 that's not in the original equation, 581 00:46:37,410 --> 00:46:41,540 that we sort of saved in our back pockets. 582 00:46:41,540 --> 00:46:45,060 And it has to have two attributes. 583 00:46:45,060 --> 00:46:46,040 First, (mic screeching) 584 00:46:46,040 --> 00:46:49,000 it must be in itself exogenous. 585 00:46:49,000 --> 00:46:53,840 So the z must have zero covariance with u, 586 00:46:53,840 --> 00:46:58,840 but it has to have some sort of explanatory power for our x, 587 00:46:59,550 --> 00:47:02,974 so the z and x cannot have zero covariance. 588 00:47:02,974 --> 00:47:05,724 (mic screeching) 589 00:47:07,843 --> 00:47:10,540 So in two-stage least squares, 590 00:47:10,540 --> 00:47:13,240 we start with our structural model. 591 00:47:13,240 --> 00:47:16,880 So we have Y on the left side, Y1, 592 00:47:16,880 --> 00:47:20,813 and we have this endogenous regressor, Y2, 593 00:47:22,200 --> 00:47:27,157 and an exogenous one in our structural equation, Z1. 594 00:47:29,030 --> 00:47:34,030 We've also saved two exogenous variables, Z2 and Z3, 595 00:47:35,480 --> 00:47:40,480 which have these two attributes that they are exogenous. 596 00:47:42,230 --> 00:47:45,350 So they're both uncorrelated with u1, 597 00:47:45,350 --> 00:47:49,910 but they do have some overlap with Y2. 
598 00:47:49,910 --> 00:47:51,640 So we want to come up 599 00:47:52,730 --> 00:47:55,573 with an estimate of Y2, 600 00:47:56,440 --> 00:48:00,550 which is a function of Z1, Z2, Z3, 601 00:48:00,550 --> 00:48:05,493 that has the most explanatory power that we can. 602 00:48:08,700 --> 00:48:11,040 We do this by doing the first stage 603 00:48:11,040 --> 00:48:12,590 of two-stage least squares, 604 00:48:12,590 --> 00:48:16,050 in which we put Y2 on the left side, 605 00:48:16,050 --> 00:48:21,050 and we regress it on all three of our exogenous regressors, 606 00:48:21,461 --> 00:48:24,470 Z1, Z2, Z3. 607 00:48:24,470 --> 00:48:25,760 We do an F-test, 608 00:48:25,760 --> 00:48:29,700 and we hope that we can reject our null, 609 00:48:29,700 --> 00:48:34,700 that we hope that Pi2 and Pi3 are not jointly equal to zero, 610 00:48:35,410 --> 00:48:36,270 that these two, 611 00:48:36,270 --> 00:48:41,270 Z2 and Z3, have significant explanatory power for Y2. 612 00:48:42,730 --> 00:48:47,450 Then we save the predicted values, the y2 hats. 613 00:48:47,450 --> 00:48:52,450 So everybody's y2 hat gets saved. 614 00:48:53,630 --> 00:48:58,170 And then in the second stage of least squares, 615 00:48:58,170 --> 00:49:02,120 we put this Y2 hat in place of the Y2. 616 00:49:02,120 --> 00:49:03,280 And in this way, 617 00:49:03,280 --> 00:49:07,150 since Y2 hat is a linear combination 618 00:49:07,150 --> 00:49:09,500 of all exogenous regressors, 619 00:49:09,500 --> 00:49:12,497 it is in and of itself exogenous. 620 00:49:12,497 --> 00:49:16,913 And this purges the Y2 of its endogeneity. 621 00:49:19,250 --> 00:49:22,360 Note though, that it comes at a cost 622 00:49:22,360 --> 00:49:24,850 of much higher variance. 623 00:49:24,850 --> 00:49:26,730 And we saw that 624 00:49:26,730 --> 00:49:29,970 when we did the simultaneous equation exercises, 625 00:49:29,970 --> 00:49:34,970 the standard error increased almost tenfold, 626 00:49:35,990 --> 00:49:38,830 and two very related things: 627 00:49:38,830 --> 00:49:43,830 that y2 hat has a lot less variability 628 00:49:44,380 --> 00:49:46,990 than Y2 does, 629 00:49:46,990 --> 00:49:48,150 and more importantly, 630 00:49:48,150 --> 00:49:52,773 maybe y2 hat is very collinear with our Zs. 631 00:49:58,900 --> 00:50:03,900 This method can also help address measurement errors. 632 00:50:03,930 --> 00:50:05,850 So if one of your regressors, 633 00:50:05,850 --> 00:50:09,313 if there is an error in the way it was measured, 634 00:50:11,320 --> 00:50:15,510 that will in and of itself cause endogeneity. 635 00:50:15,510 --> 00:50:20,510 So if you have an instrument for the variable 636 00:50:21,570 --> 00:50:23,500 that was measured in error, 637 00:50:23,500 --> 00:50:27,890 you can do two-stage least squares just as before, 638 00:50:27,890 --> 00:50:32,480 and so in this case use Z1 639 00:50:34,800 --> 00:50:36,513 as an instrument for X1, 640 00:50:37,470 --> 00:50:39,220 and save X1 hat, 641 00:50:39,220 --> 00:50:43,660 and then put it into the structural equation. 642 00:50:43,660 --> 00:50:46,810 And again, this will eliminate the bias, 643 00:50:46,810 --> 00:50:51,423 but at a cost of less efficiency. 644 00:50:55,880 --> 00:50:59,930 The most common application is simultaneous equations. 
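Here is a minimal sketch of two-stage least squares done by hand in Python, assuming hypothetical numpy arrays y1, y2, z1, z2, z3 for the variables on the slide; in practice you would use a dedicated IV/2SLS routine, because the manual second stage below does not correct its standard errors.

```python
# Minimal "by hand" sketch of two-stage least squares with hypothetical arrays
# y1, y2, z1, z2, z3 (one-dimensional numpy vectors of equal length).
import numpy as np
import statsmodels.api as sm

def two_stage_least_squares(y1, y2, z1, z2, z3):
    # First stage: regress the endogenous Y2 on all exogenous Zs.
    Z = sm.add_constant(np.column_stack([z1, z2, z3]))
    first = sm.OLS(y2, Z).fit()

    # Joint F-test that the coefficients on Z2 and Z3 (the columns after the
    # constant and Z1) are zero; we hope to reject this null.
    R = np.array([[0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
    print(first.f_test(R))

    # Save y2 hat and use it in place of Y2 in the structural equation.
    y2_hat = first.fittedvalues
    second = sm.OLS(y1, sm.add_constant(np.column_stack([y2_hat, z1]))).fit()
    # Caution: these second-stage standard errors are not the corrected
    # 2SLS standard errors; a packaged IV/2SLS routine handles that.
    return second
```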
645 00:50:59,930 --> 00:51:01,970 So we showed how 646 00:51:01,970 --> 00:51:06,970 simultaneity is a major cause 647 00:51:07,480 --> 00:51:09,990 of endogeneity, that when you have this 648 00:51:09,990 --> 00:51:13,000 two-directional causality, 649 00:51:13,000 --> 00:51:17,183 modeling it as one way can cause bias. 650 00:51:18,290 --> 00:51:22,610 And it tends to be jointly determined variables, 651 00:51:22,610 --> 00:51:25,963 and often in equilibrium. 652 00:51:30,100 --> 00:51:35,050 Very often we have a two equation system, supply and demand, 653 00:51:35,050 --> 00:51:40,050 with Y1 on the left side of the first equation 654 00:51:40,180 --> 00:51:45,180 and appearing as a regressor in the second, and Y2 on the left side 655 00:51:45,480 --> 00:51:48,430 of the second equation 656 00:51:48,430 --> 00:51:51,470 and appearing as a regressor in the first, 657 00:51:51,470 --> 00:51:54,673 and then each one also has a set of Zs, 658 00:51:56,810 --> 00:51:57,760 Z1. 659 00:51:57,760 --> 00:51:58,593 So, you know, 660 00:51:58,593 --> 00:52:03,050 that could be a number of things that affect supply. 661 00:52:03,050 --> 00:52:06,733 Z2 would be a number of things that affect demand. 662 00:52:09,180 --> 00:52:11,730 Here it is, spelled out a bit more. 663 00:52:11,730 --> 00:52:13,240 Note that we would expect 664 00:52:13,240 --> 00:52:15,590 that there would probably be some overlap, 665 00:52:15,590 --> 00:52:18,550 that some of the regressors 666 00:52:18,550 --> 00:52:23,210 in the first equation would also appear in the second 667 00:52:23,210 --> 00:52:25,420 and vice versa, 668 00:52:25,420 --> 00:52:28,660 but they cannot be exactly the same, 669 00:52:28,660 --> 00:52:33,210 that for each equation to be identified, 670 00:52:33,210 --> 00:52:38,210 there must be a Z in the other equation, 671 00:52:38,960 --> 00:52:43,650 because if Z1 and Z2 are all exactly the same, 672 00:52:43,650 --> 00:52:46,590 then when we do 673 00:52:46,590 --> 00:52:49,393 the first stage of least squares, 674 00:52:51,580 --> 00:52:56,580 our, say, Y2 hat will be a linear combination 675 00:52:59,750 --> 00:53:01,330 of Z1. 676 00:53:01,330 --> 00:53:03,790 And then when we put that back in, 677 00:53:03,790 --> 00:53:08,370 it will be perfectly collinear with our Z1, 678 00:53:08,370 --> 00:53:10,163 and it will not work. 679 00:53:12,750 --> 00:53:15,603 Again, 680 00:53:21,630 --> 00:53:24,800 we can use two-stage least squares 681 00:53:24,800 --> 00:53:27,683 to estimate our simultaneous equations. 682 00:53:28,660 --> 00:53:31,743 The way that we do this is, 683 00:53:33,260 --> 00:53:38,060 if we're mainly interested in the first equation, 684 00:53:38,060 --> 00:53:41,870 so that's really the structural equation 685 00:53:41,870 --> 00:53:44,070 that we're most interested in, 686 00:53:44,070 --> 00:53:48,850 then we do a reduced form of the second equation for Y2. 687 00:53:48,850 --> 00:53:51,770 So we put Y2 on the left side 688 00:53:51,770 --> 00:53:55,440 and we regress it on all of our Zs, 689 00:53:55,440 --> 00:54:00,440 all of our exogenous Zs, and save the Y hats, 690 00:54:00,590 --> 00:54:05,590 and then put this Y2 hat into the structural equation 691 00:54:05,620 --> 00:54:10,393 and use that to regress for the structural parameters. 692 00:54:16,080 --> 00:54:20,930 Here is a note on the directions for the exam. 693 00:54:20,930 --> 00:54:24,530 So this is the first thing you'll see in the exam. 
694 00:54:24,530 --> 00:54:28,580 And mainly, that this is take home and open book, 695 00:54:28,580 --> 00:54:32,250 that you can and should use class notes, 696 00:54:32,250 --> 00:54:36,580 slides, both textbooks, 697 00:54:36,580 --> 00:54:38,650 the textbook slides, 698 00:54:38,650 --> 00:54:40,490 and also that Dartmouth guide. 699 00:54:40,490 --> 00:54:45,490 So all of those are absolutely fair game for you to use 700 00:54:45,711 --> 00:54:48,540 to answer these. 701 00:54:48,540 --> 00:54:52,950 It will have four 25-point questions 702 00:54:52,950 --> 00:54:55,433 for a total of 100 points. 703 00:54:56,380 --> 00:54:58,170 The biggest thing is that, 704 00:55:00,420 --> 00:55:03,040 unlike the problem sets, 705 00:55:03,040 --> 00:55:07,103 I want you to work on this only as individuals, 706 00:55:08,250 --> 00:55:11,130 please do not collaborate on it. 707 00:55:11,130 --> 00:55:12,490 Once you open it, 708 00:55:12,490 --> 00:55:16,360 don't discuss any of the questions or the answers 709 00:55:16,360 --> 00:55:17,800 with anybody else. 710 00:55:17,800 --> 00:55:20,080 So do your own work. 711 00:55:20,080 --> 00:55:24,720 You can use any resource except for another person. 712 00:55:24,720 --> 00:55:25,740 So here's a place 713 00:55:25,740 --> 00:55:30,630 where I really wanna test your individual knowledge. 714 00:55:30,630 --> 00:55:35,630 So we'll go over these slides on Monday in class, 715 00:55:35,640 --> 00:55:39,720 and I hope you are having a good weekend. 716 00:55:39,720 --> 00:55:43,570 It's not the very best of weather to be outside, 717 00:55:43,570 --> 00:55:44,403 but, 718 00:55:47,700 --> 00:55:49,730 I hope that you are all well. 719 00:55:49,730 --> 00:55:52,857 And we'll check in on Monday, thank you.