1 00:00:01,210 --> 00:00:02,560 - [Instructor] Hi everyone. 2 00:00:04,700 --> 00:00:08,800 This week we're going to do the final material 3 00:00:08,800 --> 00:00:13,160 that you are responsible for the final exam, 4 00:00:13,160 --> 00:00:15,350 which is Simultaneous Equations. 5 00:00:15,350 --> 00:00:17,930 And you'll see how it really builds on 6 00:00:17,930 --> 00:00:19,680 what we did last week, 7 00:00:19,680 --> 00:00:22,890 that in many ways, last week, 8 00:00:22,890 --> 00:00:26,770 the instrumental variable two-stage least squares 9 00:00:26,770 --> 00:00:28,223 is in many ways, 10 00:00:29,440 --> 00:00:34,110 just a setup for dealing with simultaneous equations, 11 00:00:34,110 --> 00:00:36,870 which especially in economics 12 00:00:36,870 --> 00:00:41,870 tends to be a very common phenomenon and dataset 13 00:00:43,090 --> 00:00:44,423 that we have to deal with. 14 00:00:47,680 --> 00:00:49,083 By way of overview, 15 00:00:49,980 --> 00:00:53,740 this is how we deal with variables 16 00:00:53,740 --> 00:00:58,740 that can be jointly determined by different actors. 17 00:00:59,890 --> 00:01:01,720 And in many cases, 18 00:01:01,720 --> 00:01:04,180 it's some kind of equilibrium 19 00:01:04,180 --> 00:01:08,900 like the supply and demand of a good or service. 20 00:01:08,900 --> 00:01:11,473 That it's jointly determined 21 00:01:11,473 --> 00:01:15,850 both by the supplier and the demander. 22 00:01:15,850 --> 00:01:20,430 And so a simple, single equation model 23 00:01:20,430 --> 00:01:23,010 really doesn't address it, 24 00:01:23,010 --> 00:01:26,603 that it has this two-way causal arrow, 25 00:01:27,880 --> 00:01:29,593 and we need to deal with that. 26 00:01:32,700 --> 00:01:37,113 Last time we looked at how instrumental variables 27 00:01:37,960 --> 00:01:40,080 and two-stage least squares 28 00:01:40,080 --> 00:01:43,810 can deal with endogenous regressors. 
29 00:01:43,810 --> 00:01:46,975 Usually in the case of we forgot 30 00:01:46,975 --> 00:01:51,975 to measure one or we measured it wrongly. 31 00:01:53,130 --> 00:01:57,263 And in each case we'll get a biased estimate of our betas. 32 00:01:58,900 --> 00:02:01,270 This time we're going to be looking 33 00:02:01,270 --> 00:02:06,270 at simultaneity as a form of endogeneity. 34 00:02:06,880 --> 00:02:08,610 And there's two main, 35 00:02:08,610 --> 00:02:11,660 and I guess, related reasons 36 00:02:11,660 --> 00:02:16,660 why simultaneity causes endogeneity. 37 00:02:16,780 --> 00:02:21,780 The first is that the same weird random stochastic effects 38 00:02:22,340 --> 00:02:27,030 that affect one variable 39 00:02:29,150 --> 00:02:31,170 will probably affect both, 40 00:02:31,170 --> 00:02:35,530 that Y one and Y two in the example 41 00:02:35,530 --> 00:02:39,580 are probably going to be impacted 42 00:02:39,580 --> 00:02:44,340 by the same sort of random non measured effects 43 00:02:44,340 --> 00:02:45,923 that we encounter. 44 00:02:47,050 --> 00:02:50,240 And as you'll see mathematically 45 00:02:50,240 --> 00:02:54,460 that the error term actually appears in the regressors. 46 00:02:54,460 --> 00:02:56,210 And I'm gonna show you 47 00:02:56,210 --> 00:02:58,940 the mathematical derivation of that, 48 00:02:58,940 --> 00:03:03,740 which shows that in a true simultaneous equation model, 49 00:03:03,740 --> 00:03:06,253 that right there you'll see it, 50 00:03:06,253 --> 00:03:07,853 the endogeneity. 51 00:03:11,580 --> 00:03:14,590 Again, simultaneous equation models 52 00:03:14,590 --> 00:03:19,590 are used when the variables are jointly determined. 53 00:03:20,870 --> 00:03:25,397 So again, most commonly supply and demand 54 00:03:25,397 --> 00:03:30,397 usually linked by some kind of a market equilibrium 55 00:03:30,700 --> 00:03:34,400 or other kinds of equilibrium. 
56 00:03:34,400 --> 00:03:38,250 And it's important that each structural equation 57 00:03:38,250 --> 00:03:42,720 have a ceteris paribus interpretation, 58 00:03:42,720 --> 00:03:47,720 that the betas sort of make sense in and of themselves, 59 00:03:48,112 --> 00:03:52,220 that you can look at each of the structural equations, 60 00:03:52,220 --> 00:03:53,413 and it makes sense. 61 00:03:57,980 --> 00:03:59,142 Here's an example. 62 00:03:59,142 --> 00:04:04,142 So the Wooldridge textbook gives an example 63 00:04:05,280 --> 00:04:10,250 of equilibrium in the labor market. 64 00:04:10,250 --> 00:04:15,250 So first we look at the employee's equation, 65 00:04:15,490 --> 00:04:18,260 and this employee is thinking 66 00:04:18,260 --> 00:04:21,600 about how many hours should I work? 67 00:04:21,600 --> 00:04:25,620 So the hours, the supply of labor, 68 00:04:25,620 --> 00:04:29,840 is a function of the wage offered 69 00:04:29,840 --> 00:04:34,840 and then a set of other exogenous economic variables, 70 00:04:37,050 --> 00:04:41,350 things that might drive how many hours they will work. 71 00:04:41,350 --> 00:04:43,480 And I'm gonna put a little bit more meat 72 00:04:43,480 --> 00:04:44,843 on these bones later. 73 00:04:47,281 --> 00:04:50,070 And then we have the U one error term, 74 00:04:50,070 --> 00:04:54,193 which has all the properties that we've come to know. 75 00:04:55,352 --> 00:04:58,440 And again, this is a structural equation, 76 00:04:58,440 --> 00:05:01,740 so we want to know what is A one, 77 00:05:01,740 --> 00:05:05,170 how does the number of hours worked change 78 00:05:05,170 --> 00:05:09,363 as the wage changes?
79 00:05:13,530 --> 00:05:15,910 The other equation in this model 80 00:05:15,910 --> 00:05:19,050 now has wage on the left side 81 00:05:19,050 --> 00:05:22,080 and the employer is thinking 82 00:05:22,080 --> 00:05:25,670 about what wage should I offer 83 00:05:25,670 --> 00:05:29,530 to draw the number of hours I need? 84 00:05:29,530 --> 00:05:34,530 So this is a function of the labor needed. 85 00:05:34,530 --> 00:05:39,530 And so that's the Hd, the demanded labor, 86 00:05:40,070 --> 00:05:43,000 and there's also these Zs. 87 00:05:43,000 --> 00:05:48,000 So a number of factors that drive demand for labor. 88 00:05:54,690 --> 00:05:58,300 And again, we'll see more examples soon. 89 00:05:58,300 --> 00:06:01,320 And again, this is a structural equation. 90 00:06:01,320 --> 00:06:04,550 So we want to know A two, 91 00:06:04,550 --> 00:06:07,331 what wage must I offer 92 00:06:07,331 --> 00:06:11,063 based on the number of hours that I want? 93 00:06:12,410 --> 00:06:16,360 And these are linked by an equilibrium 94 00:06:16,360 --> 00:06:21,360 that the hours supplied have to equal the hours demanded. 95 00:06:21,570 --> 00:06:25,220 So in a market economy, these two things, 96 00:06:25,220 --> 00:06:29,717 the wage and the hours, reach an equilibrium point, 97 00:06:34,330 --> 00:06:38,164 the supply and the demand curves meet 98 00:06:38,164 --> 00:06:42,520 at a point where the hours are equal, 99 00:06:42,520 --> 00:06:47,520 and that is the wage that will prevail 100 00:06:47,550 --> 00:06:49,793 to clear the labor market. 101 00:06:53,000 --> 00:06:55,410 Put these two together, 102 00:06:55,410 --> 00:07:00,410 and we have a fairly simple simultaneous equations model. 103 00:07:02,200 --> 00:07:06,217 And in this case, Z one, Z two, U one, and U two, 104 00:07:06,217 --> 00:07:11,217 jointly determine H and W.
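Written out, the labor-market model described so far might look like the following. This is only a sketch of what is presumably on the slide: the spoken "A one" and "A two" are written as a_1 and a_2, and the coefficient names b_1 and b_2 on the exogenous Zs are an assumption for illustration, since the lecture has not named them at this point.

```latex
\begin{align*}
h_s &= a_1 w + b_1 z_1 + u_1 && \text{(labor supply: hours the employee chooses)} \\
w   &= a_2 h_d + b_2 z_2 + u_2 && \text{(wage offer: wage the employer sets)} \\
h_s &= h_d = h && \text{(equilibrium: hours supplied equal hours demanded)}
\end{align*}
```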
105 00:07:11,820 --> 00:07:16,003 The Zs as before are exogenous, 106 00:07:18,230 --> 00:07:20,880 H and W, the hours worked 107 00:07:20,880 --> 00:07:24,180 and the wage offered, are both endogenous, 108 00:07:24,180 --> 00:07:28,290 and U two and U one are structural errors. 109 00:07:28,290 --> 00:07:29,684 So again, 110 00:07:29,684 --> 00:07:34,684 we assume that they have all the well-behaved properties 111 00:07:35,320 --> 00:07:36,779 that we want, 112 00:07:36,779 --> 00:07:40,980 mean equal to zero and all kinds of things like that 113 00:07:43,070 --> 00:07:46,073 where not otherwise noted. 114 00:07:50,020 --> 00:07:53,440 A very important feature 115 00:07:53,440 --> 00:07:58,440 again, is that Z one and Z two cannot be the same variable, 116 00:08:00,990 --> 00:08:05,700 that there has to be different information in each of these, 117 00:08:05,700 --> 00:08:07,300 that they can't be the same, 118 00:08:07,300 --> 00:08:11,290 they can't be linear combinations of each other, 119 00:08:11,290 --> 00:08:15,940 that they need to each add some unique information. 120 00:08:15,940 --> 00:08:20,873 And if they do not, then we cannot identify this equation. 121 00:08:22,860 --> 00:08:27,230 Note too that the endogenous variables 122 00:08:27,230 --> 00:08:31,393 are not chosen by the same economic agent. 123 00:08:33,380 --> 00:08:35,593 And for this case, 124 00:08:35,593 --> 00:08:39,147 we're assuming that the employer sets the wage 125 00:08:40,200 --> 00:08:43,480 and that the worker will choose how many hours 126 00:08:43,480 --> 00:08:46,740 to work. The employers 127 00:08:46,740 --> 00:08:51,740 and the workers are sort of independently 128 00:08:52,990 --> 00:08:57,990 determining the values of our two endogenous variables. 129 00:08:59,563 --> 00:09:01,980 (indistinct) 130 00:09:13,940 --> 00:09:18,880 So again, now we're gonna walk through the math of this.
131 00:09:18,880 --> 00:09:23,880 So if you take the first equation, 132 00:09:25,210 --> 00:09:30,210 the second bullet point that has Y one on the left side, 133 00:09:32,300 --> 00:09:36,193 and plug it into the second equation, 134 00:09:39,300 --> 00:09:42,263 you get the bullet point 135 00:09:43,328 --> 00:09:48,328 where Y two equals A two times the quantity A one Y two, 136 00:09:48,350 --> 00:09:49,363 and so on. 137 00:09:50,320 --> 00:09:54,010 So here, and then doing some rearranging, 138 00:09:54,010 --> 00:09:58,140 you can solve for Y two by itself, 139 00:09:58,140 --> 00:10:03,140 as a function of Z one and Z two and the errors. 140 00:10:05,090 --> 00:10:09,000 Note that A one and A two 141 00:10:09,000 --> 00:10:12,780 cannot be inverses of each other. 142 00:10:12,780 --> 00:10:17,780 That A one times A two cannot equal one, 143 00:10:20,530 --> 00:10:24,750 which implies that A one cannot equal 144 00:10:26,420 --> 00:10:29,943 one divided by A two. 145 00:10:32,750 --> 00:10:36,020 You can see why 146 00:10:36,020 --> 00:10:40,940 by looking up at the equation 147 00:10:40,940 --> 00:10:44,820 where we sub the first into the second, 148 00:10:44,820 --> 00:10:48,580 and going down two bullet points, 149 00:10:48,580 --> 00:10:51,680 that the first factor there 150 00:10:51,680 --> 00:10:54,920 is one minus A two times A one. 151 00:10:54,920 --> 00:10:57,860 And if A two times A one is equal to one, 152 00:10:57,860 --> 00:11:02,240 then that makes that term zero. 153 00:11:02,240 --> 00:11:06,427 And to shift it around, you'd be dividing by zero 154 00:11:09,030 --> 00:11:11,950 which as you well know, mathematically, 155 00:11:11,950 --> 00:11:13,523 you're not allowed to do. 156 00:11:15,026 --> 00:11:19,513 (indistinct) model fall apart and not work.
157 00:11:28,920 --> 00:11:31,140 So, note that here, 158 00:11:31,140 --> 00:11:36,113 where I have now named them equation one and equation two, 159 00:11:37,430 --> 00:11:41,903 equation one has Y one on the left side, 160 00:11:42,959 --> 00:11:46,040 and equation two has Y two, 161 00:11:46,040 --> 00:11:51,040 that when you plug in this form of Y two, 162 00:11:52,160 --> 00:11:55,863 you can see that it has U one in it. 163 00:12:00,390 --> 00:12:04,719 So if you would take this version of Y two, 164 00:12:04,719 --> 00:12:09,719 which is a function of Z one, Z two, U one, and U two, 165 00:12:11,490 --> 00:12:16,050 and plug it back up into the equation, 166 00:12:16,050 --> 00:12:20,827 it is clear that Y two 167 00:12:22,560 --> 00:12:27,560 and U one do not have zero covariance, 168 00:12:27,940 --> 00:12:32,940 that U one appears in the expression for Y two. 169 00:12:36,380 --> 00:12:41,380 So it's clear that our assumption of exogeneity 170 00:12:42,900 --> 00:12:47,420 is violated there, and you can see it mathematically. 171 00:12:47,420 --> 00:12:51,650 So we cannot make a claim that in equation one 172 00:12:51,650 --> 00:12:55,950 Y two and U one are unrelated, 173 00:12:55,950 --> 00:12:58,810 because clearly they are related. 174 00:12:58,810 --> 00:13:02,510 And this is sort of the mathematical basis 175 00:13:02,510 --> 00:13:06,513 of where the endogeneity comes from. 176 00:13:08,530 --> 00:13:12,060 And again, as I said a few slides back, 177 00:13:12,060 --> 00:13:13,520 U one and U two 178 00:13:13,520 --> 00:13:16,780 are also almost certainly correlated, 179 00:13:16,780 --> 00:13:21,202 that the same random effects in the economy 180 00:13:21,202 --> 00:13:26,202 will apply to both and affect both the supply and the demand.
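The substitution walked through above can be written out step by step. This is a sketch rather than a copy of the slide: a_1 and a_2 stand for the spoken "A one" and "A two", and the coefficient names b_1 and b_2 on Z one and Z two are assumed for illustration.

```latex
\begin{align*}
y_1 &= a_1 y_2 + b_1 z_1 + u_1 && \text{(equation one)} \\
y_2 &= a_2 y_1 + b_2 z_2 + u_2 && \text{(equation two)} \\[4pt]
y_2 &= a_2\,(a_1 y_2 + b_1 z_1 + u_1) + b_2 z_2 + u_2 && \text{(sub equation one into equation two)} \\
(1 - a_2 a_1)\, y_2 &= a_2 b_1 z_1 + b_2 z_2 + a_2 u_1 + u_2 \\
y_2 &= \frac{a_2 b_1}{1 - a_2 a_1}\, z_1 + \frac{b_2}{1 - a_2 a_1}\, z_2
       + \frac{a_2 u_1 + u_2}{1 - a_2 a_1} && \text{(requires } a_2 a_1 \neq 1\text{)}
\end{align*}
```

Because u_1 sits inside the solved expression for y_2, and assuming z_1 and z_2 are uncorrelated with u_1, the covariance between the regressor y_2 and the error u_1 in equation one is

```latex
\[
\operatorname{Cov}(y_2, u_1)
  = \frac{a_2 \operatorname{Var}(u_1) + \operatorname{Cov}(u_1, u_2)}{1 - a_2 a_1},
\]
```

which is zero only in special cases, for instance when a_2 = 0 and the two errors are uncorrelated.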
181 00:13:34,940 --> 00:13:39,120 (indistinct) with that structural equation, 182 00:13:39,120 --> 00:13:44,120 but now, we're going to do the reduced form equation 183 00:13:45,010 --> 00:13:49,690 where we're going to rearrange things 184 00:13:49,690 --> 00:13:52,340 that that equation that we saw earlier, 185 00:13:52,340 --> 00:13:57,340 so that Y two is now a function of Z one and Z two. 186 00:13:59,600 --> 00:14:02,740 And note that we put these PI's in front of them 187 00:14:02,740 --> 00:14:05,640 instead of betas or A's, 188 00:14:05,640 --> 00:14:09,830 which tells us that this is a reduced form equation. 189 00:14:09,830 --> 00:14:13,880 And that PI two one, and PI two two, 190 00:14:13,880 --> 00:14:18,590 are reduced form versions of structural parameters. 191 00:14:22,820 --> 00:14:27,083 And the V two is a reduced form error. 192 00:14:32,680 --> 00:14:37,223 This is a more general case of our two equation model. 193 00:14:39,340 --> 00:14:42,501 And we can write the Zs 194 00:14:42,501 --> 00:14:46,840 in shorthand form 195 00:14:46,840 --> 00:14:50,560 where it's written in bold Z one, B one, 196 00:14:50,560 --> 00:14:55,560 is a number of Zs of exogenous regressors 197 00:14:56,730 --> 00:15:00,630 that may appear in the first equation. 198 00:15:00,630 --> 00:15:05,630 And the bold Z two, B two is a number of regressors 199 00:15:06,030 --> 00:15:09,453 that might appear in the second equation. 200 00:15:10,760 --> 00:15:15,760 Usually we will find that Z one and Z two 201 00:15:17,820 --> 00:15:19,790 may have some overlap, 202 00:15:19,790 --> 00:15:24,460 that some of the regressors may appear in both, 203 00:15:24,460 --> 00:15:28,200 but it is absolutely essential 204 00:15:28,200 --> 00:15:30,980 that they are not exactly the same 205 00:15:30,980 --> 00:15:35,980 that ideally each of the set of Zs 206 00:15:37,360 --> 00:15:41,070 will have regressors in one 207 00:15:41,070 --> 00:15:44,610 that do not appear in the other. 
208 00:15:44,610 --> 00:15:46,120 And as we'll see, 209 00:15:46,120 --> 00:15:51,120 we will need that for both equations to be identified. 210 00:15:52,530 --> 00:15:57,440 And thinking about two-stage least squares, 211 00:15:57,440 --> 00:16:00,582 that if Z one and Z two 212 00:16:00,582 --> 00:16:05,580 contain exactly the same regressors 213 00:16:05,580 --> 00:16:10,020 then if we try to do two-stage least squares, 214 00:16:10,020 --> 00:16:14,017 then we don't have one in our back pocket 215 00:16:14,017 --> 00:16:18,970 that we can put in and add new information for. 216 00:16:18,970 --> 00:16:21,410 And all of our regressors 217 00:16:21,410 --> 00:16:25,020 will be linear combinations of each other, 218 00:16:25,020 --> 00:16:29,517 and we won't be able to identify the model at all. 219 00:16:37,180 --> 00:16:41,070 Looking again at this two equation model, 220 00:16:41,070 --> 00:16:45,910 where we have Y one and Y two, 221 00:16:45,910 --> 00:16:50,910 two models where Y two appears 222 00:16:52,030 --> 00:16:54,990 in the structural equation for Y one, 223 00:16:54,990 --> 00:16:59,940 and Y one appears in the structural equation for Y two, 224 00:16:59,940 --> 00:17:04,940 and we could write it out in reduced form 225 00:17:09,140 --> 00:17:12,420 to do two-stage least squares. 226 00:17:12,420 --> 00:17:14,640 And note that we still need 227 00:17:14,640 --> 00:17:17,200 that A one and A two, 228 00:17:17,200 --> 00:17:22,200 that A one times A two cannot equal one. 229 00:17:22,390 --> 00:17:25,090 Or as we've seen before, 230 00:17:25,090 --> 00:17:26,713 this whole thing falls apart, 231 00:17:27,800 --> 00:17:31,363 and we're dividing by zero, which we can't do. 232 00:17:38,090 --> 00:17:41,063 So this is kind of a tricky part. 
233 00:17:46,270 --> 00:17:51,270 And I really want you to spend some time 234 00:17:51,270 --> 00:17:56,270 and think about this, because it's sort of counter-intuitive, 235 00:17:56,270 --> 00:18:01,270 and it's called the rank condition for identification. 236 00:18:01,770 --> 00:18:06,770 So note that our first equation can only be identified 237 00:18:09,147 --> 00:18:14,147 if there is a regressor 238 00:18:14,700 --> 00:18:17,990 that appears in the second equation 239 00:18:17,990 --> 00:18:21,363 that doesn't appear in the first equation. 240 00:18:23,110 --> 00:18:28,110 So for the equation that starts with Y one 241 00:18:28,280 --> 00:18:30,880 on the left hand side, 242 00:18:30,880 --> 00:18:34,890 you need an instrument in the Y two equation. 243 00:18:34,890 --> 00:18:39,890 So you need a regressor that appears in the Y two equation 244 00:18:41,440 --> 00:18:46,060 that does not appear in the Y one equation. 245 00:18:46,060 --> 00:18:49,010 And it would work the other way around. 246 00:18:49,010 --> 00:18:54,010 That to identify the equation with Y two on the left-hand side, 247 00:18:56,410 --> 00:19:01,410 there has to be a regressor in the first equation 248 00:19:01,970 --> 00:19:05,815 that does not appear in the second equation. 249 00:19:05,815 --> 00:19:10,815 And so there has to be at least one unique regressor 250 00:19:12,390 --> 00:19:17,390 in each equation to be able to identify both equations. 251 00:19:21,190 --> 00:19:26,190 And this is a case where we can use two-stage least squares, 252 00:19:27,610 --> 00:19:32,610 and we have to do the test in the reduced form equation 253 00:19:34,290 --> 00:19:39,290 that for this unique instrumental regressor, 254 00:19:41,000 --> 00:19:44,610 its pi is not equal to zero. 255 00:19:44,610 --> 00:19:47,750 And we can do T or F tests. 256 00:19:47,750 --> 00:19:50,630 A T test if there's only one of them, 257 00:19:50,630 --> 00:19:53,120 and an F test if there is more than one.
258 00:19:53,120 --> 00:19:55,890 And hopefully reject the null 259 00:19:55,890 --> 00:19:57,583 that it equals zero. 260 00:20:00,760 --> 00:20:04,543 Once we know which equations are identified, 261 00:20:05,630 --> 00:20:08,670 we can use the instrumental variable 262 00:20:08,670 --> 00:20:13,670 two-stage least squares technique that we learned last time. 263 00:20:14,750 --> 00:20:18,450 And recall that there's a bit 264 00:20:18,450 --> 00:20:21,150 of a counter-intuitive thing here, 265 00:20:21,150 --> 00:20:25,890 that for the first equation to be identified, 266 00:20:25,890 --> 00:20:27,330 there must be an instrument 267 00:20:27,330 --> 00:20:31,950 in the second equation and vice versa. 268 00:20:31,950 --> 00:20:36,760 And now I'm going to walk you through 269 00:20:36,760 --> 00:20:41,760 a more detailed example of this workforce supply and demand. 270 00:20:45,480 --> 00:20:49,580 So as before, there's these supply and demand equations, 271 00:20:49,580 --> 00:20:54,580 the supply one is how many hours should the employee work? 272 00:20:54,860 --> 00:20:59,170 And the demand is what's the wage 273 00:20:59,170 --> 00:21:01,643 that the employer should offer? 274 00:21:03,140 --> 00:21:06,860 The supply equation will be things like 275 00:21:06,860 --> 00:21:10,897 the employees education, age, 276 00:21:10,897 --> 00:21:15,360 how many kids, what's their non wage income, et cetera? 277 00:21:15,360 --> 00:21:20,080 And the demand will be, how many hours that they need, 278 00:21:20,080 --> 00:21:25,080 as well as the education and experience and so forth 279 00:21:26,480 --> 00:21:28,690 needed for this work? 280 00:21:28,690 --> 00:21:33,690 And here, again, all of the variables 281 00:21:34,750 --> 00:21:38,010 except for hours and wage 282 00:21:38,010 --> 00:21:41,773 are assumed to be exogenous. 283 00:21:47,230 --> 00:21:49,270 Here it is spelled out again. 
284 00:21:49,270 --> 00:21:51,043 Now note that 285 00:21:53,377 --> 00:21:56,580 it might be that we're primarily interested 286 00:21:56,580 --> 00:21:59,330 in the first one, the supply equation. 287 00:21:59,330 --> 00:22:01,950 We really want to know A one, 288 00:22:01,950 --> 00:22:04,700 that's the most important thing here. 289 00:22:04,700 --> 00:22:09,053 And for this equation to be identified, 290 00:22:11,690 --> 00:22:15,080 the experience and experience squared 291 00:22:16,490 --> 00:22:20,040 must be present in the demand equation, 292 00:22:20,040 --> 00:22:25,010 and we need them to actually be relevant, 293 00:22:25,010 --> 00:22:28,700 that they do have an impact on wage. 294 00:22:28,700 --> 00:22:33,700 So we need B two, two, and B two, three 295 00:22:34,250 --> 00:22:37,380 to not both be zero. 296 00:22:37,380 --> 00:22:41,850 So we would do this as a reduced form equation. 297 00:22:41,850 --> 00:22:45,170 And we add in all of our regressors, 298 00:22:45,170 --> 00:22:47,420 all the ones from the first equation, 299 00:22:47,420 --> 00:22:50,341 as well as all those from the second equation. 300 00:22:50,341 --> 00:22:53,610 And we do an F test 301 00:22:53,610 --> 00:22:57,130 where we take out 302 00:22:57,130 --> 00:22:59,720 experience and experience squared. 303 00:22:59,720 --> 00:23:02,080 So we'll restrict those. 304 00:23:02,080 --> 00:23:06,370 The restricted model of our F test has those out. 305 00:23:06,370 --> 00:23:11,370 The full model is this wage equation that you see here. 306 00:23:12,840 --> 00:23:17,840 Wage equals PI two, zero plus PI two, one, educ, et cetera. 307 00:23:18,990 --> 00:23:22,630 And we do an F test. 308 00:23:22,630 --> 00:23:27,100 And in this case, we hope to reject the null. 309 00:23:27,100 --> 00:23:32,100 We hope that experience and experience squared 310 00:23:32,280 --> 00:23:35,763 do have a significant effect here.
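The restricted-versus-full F test described above can be sketched in a few lines. This is a minimal illustration on simulated data, not the course's dataset: the variable names `educ`, `exper`, `expersq`, the coefficient values, and the helper `ssr` are all assumptions made up for the example, and the reduced form is cut down to one included regressor plus the two candidate instruments.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared residuals from an OLS fit of y on X (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Simulated (hypothetical) data: wage depends on educ and on the two
# candidate instruments, exper and exper squared.
rng = np.random.default_rng(0)
n = 2000
educ = rng.normal(12.0, 2.0, n)
exper = rng.uniform(0.0, 30.0, n)
expersq = exper ** 2
wage = 2.0 + 0.5 * educ + 0.2 * exper - 0.004 * expersq + rng.normal(0.0, 1.0, n)

const = np.ones(n)
X_full = np.column_stack([const, educ, exper, expersq])  # full reduced form
X_restricted = np.column_stack([const, educ])            # exper terms taken out

q = 2                            # number of restrictions being tested
df_resid = n - X_full.shape[1]   # residual degrees of freedom in the full model
ssr_ur = ssr(wage, X_full)
ssr_r = ssr(wage, X_restricted)

# F = [(SSR_r - SSR_ur) / q] / [SSR_ur / (n - k - 1)]
f_stat = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_resid)
print(f"F statistic for exper and expersq: {f_stat:.1f}")
```

A large F statistic here is the outcome the lecture hopes for: it means the excluded variables carry real information about wage and can serve as instruments.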
311 00:23:36,790 --> 00:23:41,490 We hope that we can reject the null, 312 00:23:41,490 --> 00:23:43,407 that we get a big F stat 313 00:23:43,407 --> 00:23:48,407 and that these do have value as instruments. 314 00:23:54,650 --> 00:23:58,800 So here in the first stage of two-stage least squares, 315 00:23:58,800 --> 00:24:03,250 we put all of our exogenous regressors 316 00:24:03,250 --> 00:24:04,980 on the right-hand side 317 00:24:04,980 --> 00:24:09,980 and wage on the left, and we save the wage-hat. 318 00:24:11,140 --> 00:24:15,920 And then in the second stage of two-stage least squares, 319 00:24:15,920 --> 00:24:20,900 we plug that into our structural equation. 320 00:24:20,900 --> 00:24:25,900 And that is how we get the estimated coefficient 321 00:24:26,020 --> 00:24:26,893 that we want. 322 00:24:29,180 --> 00:24:30,670 In many cases, 323 00:24:30,670 --> 00:24:33,970 we will find in a sort of true simultaneous equation model 324 00:24:35,462 --> 00:24:39,083 that the two coefficients have opposite signs, 325 00:24:40,294 --> 00:24:45,294 that in this case, A one and A two 326 00:24:50,170 --> 00:24:52,130 have different signs. 327 00:24:52,130 --> 00:24:55,890 That A one we would expect to be greater than zero, 328 00:24:55,890 --> 00:25:00,890 that a higher wage draws more people, more employees, 329 00:25:01,140 --> 00:25:02,683 and they want to work more. 330 00:25:03,520 --> 00:25:06,600 But the A two is less than zero. 331 00:25:06,600 --> 00:25:11,600 That the more hours of labor a firm might need, 332 00:25:12,050 --> 00:25:16,180 the less per hour that they would want to pay, 333 00:25:16,180 --> 00:25:19,880 or care to pay, or feel like they are able to pay. 334 00:25:19,880 --> 00:25:22,920 And this is a feature that you will see a lot 335 00:25:22,920 --> 00:25:27,920 where the coefficients of our endogenous regressors 336 00:25:29,806 --> 00:25:33,313 in the two equations have opposite signs.
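The two stages just described can be sketched on simulated data where the true A one is known, so the naive OLS estimate and the two-stage estimate can be compared. Everything specific here is an assumption for illustration: the coefficient values, the single exogenous variable in each equation (`z1`, `z2`), independent errors, and the helper `ols`. The two-stage procedure is done by hand with NumPy rather than with a dedicated IV routine.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on X (X already includes a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 20000
a1, a2 = 0.5, -0.5   # assumed true values: supply slope > 0, demand slope < 0
b1, b2 = 1.0, 1.0    # assumed coefficients on the exogenous z1 and z2

z1 = rng.normal(size=n)   # exogenous, appears only in the hours equation
z2 = rng.normal(size=n)   # exogenous, appears only in the wage equation
u1 = rng.normal(size=n)   # structural error, hours equation
u2 = rng.normal(size=n)   # structural error, wage equation

# Generate the equilibrium from the solved (reduced) form; needs a2 * a1 != 1.
denom = 1.0 - a2 * a1
wage = (a2 * b1 * z1 + b2 * z2 + a2 * u1 + u2) / denom
hours = a1 * wage + b1 * z1 + u1

const = np.ones(n)

# Naive OLS on the structural hours equation: wage is correlated with u1.
a1_ols = ols(hours, np.column_stack([const, wage, z1]))[1]

# Stage 1: regress wage on ALL exogenous variables and save wage-hat.
Z = np.column_stack([const, z1, z2])
wage_hat = Z @ ols(wage, Z)

# Stage 2: plug wage-hat into the structural equation in place of wage.
a1_2sls = ols(hours, np.column_stack([const, wage_hat, z1]))[1]

print(f"true a1 = {a1}, OLS = {a1_ols:.3f}, 2SLS = {a1_2sls:.3f}")
```

One caveat on the design: running the second stage by hand gives the right coefficient but not the right standard errors, so for inference a dedicated two-stage least squares routine is the safer choice.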
337 00:25:35,840 --> 00:25:40,840 Here is a very simple stylized example of another model 338 00:25:44,640 --> 00:25:48,430 and it has to do with wolves and deer. 339 00:25:48,430 --> 00:25:50,409 And we want to model 340 00:25:50,409 --> 00:25:54,980 what is the population of wolves 341 00:25:54,980 --> 00:25:58,040 as the number of deer changes, 342 00:25:58,040 --> 00:26:03,040 as well as the number of deer as wolves change? 343 00:26:04,510 --> 00:26:09,200 So note that we would theorize that A one 344 00:26:09,200 --> 00:26:12,430 is greater than zero. 345 00:26:12,430 --> 00:26:14,470 The more deer there are to eat, 346 00:26:14,470 --> 00:26:17,050 the more wolves there will be. 347 00:26:17,050 --> 00:26:21,060 But that B one is less than zero. 348 00:26:21,060 --> 00:26:24,570 The more wolves are out there preying on deer, 349 00:26:24,570 --> 00:26:27,560 the fewer deer there are. 350 00:26:27,560 --> 00:26:29,999 And then these would have to be equal 351 00:26:29,999 --> 00:26:34,263 in the equilibrium. 352 00:26:35,460 --> 00:26:39,300 It may be that we are most interested in equation one, 353 00:26:39,300 --> 00:26:42,160 that we are wildlife biologists 354 00:26:42,160 --> 00:26:44,200 and we're studying wolves. 355 00:26:44,200 --> 00:26:46,573 So we really wanna know A one. 356 00:26:48,040 --> 00:26:53,040 What is the variable that we need to have to identify that? 357 00:26:56,710 --> 00:26:59,280 So it's corn. 358 00:26:59,280 --> 00:27:03,399 So we need to have this corn 359 00:27:03,399 --> 00:27:08,399 to be able to identify equation one, 360 00:27:08,490 --> 00:27:13,490 and we need to have this forest variable to identify equation two. 361 00:27:18,840 --> 00:27:22,810 We can do this with more than two equations. 362 00:27:22,810 --> 00:27:27,810 And basically we need an order condition here too, 363 00:27:28,620 --> 00:27:33,620 that there has to be an excluded variable in the equations. 364 00:27:39,540 --> 00:27:42,480 Sorry, let me restate that.
365 00:27:42,480 --> 00:27:46,780 For every endogenous regressor that we have, 366 00:27:46,780 --> 00:27:50,330 there must be a corresponding instrument 367 00:27:57,670 --> 00:28:01,090 that only appears in the equation 368 00:28:01,090 --> 00:28:05,700 where that endogenous regressor is on the left side. 369 00:28:05,700 --> 00:28:08,790 So if we have Y one, Y two, Y three, 370 00:28:08,790 --> 00:28:12,733 a three equation model, 371 00:28:13,866 --> 00:28:18,866 the equation for each of them 372 00:28:19,910 --> 00:28:23,550 must have an instrumental regressor 373 00:28:23,550 --> 00:28:28,520 that does not appear in any of the other equations, 374 00:28:28,520 --> 00:28:33,520 that there must be some unique variable in each one 375 00:28:33,830 --> 00:28:36,928 that provides new information 376 00:28:36,928 --> 00:28:40,443 in order for this to be identified. 377 00:28:40,443 --> 00:28:45,097 So we need an instrument for every endogenous regressor. 378 00:28:48,510 --> 00:28:51,220 So here's a bunch more information, 379 00:28:51,220 --> 00:28:55,117 including this identification matter. 380 00:28:57,240 --> 00:29:02,240 And two YouTube videos by Ben Lambert. 381 00:29:02,440 --> 00:29:06,470 So I encourage you to use those. 382 00:29:06,470 --> 00:29:11,470 And I look forward to seeing you in class on Wednesday. 383 00:29:11,970 --> 00:29:12,803 Thank you.