1 00:00:01,210 --> 00:00:02,840 - [Professor] Hello everyone, 2 00:00:02,840 --> 00:00:07,727 and welcome to the final exam review for CDAE 359. 3 00:00:12,580 --> 00:00:16,130 So our agenda here will just be to go over 4 00:00:18,070 --> 00:00:19,770 what's gonna be on the exam, 5 00:00:19,770 --> 00:00:22,080 starting with a few announcements 6 00:00:22,080 --> 00:00:22,963 and then, 7 00:00:25,970 --> 00:00:28,273 go over the actual topics. 8 00:00:38,390 --> 00:00:41,543 Here's what we'll be running down today. 9 00:00:42,520 --> 00:00:46,100 So starting with two 10 00:00:46,100 --> 00:00:50,660 big related concepts, 11 00:00:50,660 --> 00:00:55,660 and then going through the basic topics that we've done, 12 00:00:56,070 --> 00:01:01,070 limited dependent variables and sampling corrections, 13 00:01:01,460 --> 00:01:03,320 panel data, 14 00:01:03,320 --> 00:01:05,890 instrumental variables, two-stage least squares 15 00:01:05,890 --> 00:01:07,780 and simultaneous equations. 16 00:01:07,780 --> 00:01:12,570 And this is the material that will be emphasized 17 00:01:12,570 --> 00:01:14,090 on the exam. 18 00:01:14,090 --> 00:01:16,240 The exam is cumulative, 19 00:01:16,240 --> 00:01:19,360 but only to the extent that concepts 20 00:01:21,270 --> 00:01:24,570 and principles we've learned 21 00:01:24,570 --> 00:01:28,870 early on carry through, 22 00:01:28,870 --> 00:01:33,150 but the questions will very much focus 23 00:01:33,150 --> 00:01:36,873 on these specific topics. 24 00:01:41,730 --> 00:01:46,730 The two biggest relationships that I wanted to talk about, 25 00:01:47,530 --> 00:01:52,530 the two big concepts are, first, 26 00:01:53,150 --> 00:01:54,890 a lot of what we've done 27 00:01:54,890 --> 00:01:59,890 in the second half of this class has been looking at 28 00:02:00,320 --> 00:02:05,320 the trade-off of consistency, or bias, and efficiency, 29 00:02:05,740 --> 00:02:09,140 that in many cases, 30 00:02:09,140 --> 00:02:10,700 the techniques that we use 31 00:02:10,700 --> 00:02:14,480 to eliminate bias come at the cost of efficiency, 32 00:02:14,480 --> 00:02:16,970 that they make the variance 33 00:02:16,970 --> 00:02:21,970 and standard error of our betas increase greatly. 34 00:02:22,100 --> 00:02:25,480 And we found this with random effects 35 00:02:25,480 --> 00:02:29,570 and instrumental variables, two-stage least squares 36 00:02:29,570 --> 00:02:32,130 and have learned two tests. 37 00:02:32,130 --> 00:02:34,720 One that I'm gonna go over 38 00:02:34,720 --> 00:02:35,880 in a little bit more depth, 39 00:02:35,880 --> 00:02:37,170 mean square error, 40 00:02:37,170 --> 00:02:39,430 as well as the Wu Hausman test 41 00:02:39,430 --> 00:02:41,340 or the Hausman test that we learned 42 00:02:41,340 --> 00:02:44,653 when we were working with panel data. 43 00:02:46,680 --> 00:02:47,513 The other one, 44 00:02:47,513 --> 00:02:52,390 and there was a lot of focus on this concept, is endogeneity, 45 00:02:54,200 --> 00:02:58,350 and we recall that we assume and hope, 46 00:02:58,350 --> 00:03:01,570 if we have well-behaved data, 47 00:03:01,570 --> 00:03:06,570 that the expected value of the error term 48 00:03:06,810 --> 00:03:11,140 for any value of X equals zero, 49 00:03:11,140 --> 00:03:14,320 that the covariance of the error term 50 00:03:14,320 --> 00:03:17,350 and any regressor equals zero. 51 00:03:17,350 --> 00:03:20,940 And therefore we can say that X is exogenous, 52 00:03:20,940 --> 00:03:22,100 and this is what we want.
53 00:03:22,100 --> 00:03:23,720 This is well-behaved data. 54 00:03:23,720 --> 00:03:28,720 And this is one of the absolute foundational requirements 55 00:03:28,720 --> 00:03:30,793 to have an unbiased estimator. 56 00:03:32,130 --> 00:03:35,100 And we've learned that if this is not true, 57 00:03:35,100 --> 00:03:40,100 if the covariance does not equal zero, then X is endogenous. 58 00:03:40,240 --> 00:03:42,100 This causes bias. 59 00:03:42,100 --> 00:03:46,730 And just as a review, we wanna know the value of beta, 60 00:03:46,730 --> 00:03:51,653 how does Y change as X changes, all else equal. 61 00:03:52,610 --> 00:03:56,380 Really what we wanna know is how does y hat change 62 00:03:57,642 --> 00:03:58,725 as X changes, 63 00:04:00,456 --> 00:04:05,206 that is, how does the expected value of Y change as X changes, 64 00:04:07,330 --> 00:04:11,000 and because we cannot observe this, 65 00:04:11,000 --> 00:04:13,110 if the error term 66 00:04:14,340 --> 00:04:19,340 and any regressor are correlated, 67 00:04:19,920 --> 00:04:23,620 if their covariance does not equal zero, 68 00:04:23,620 --> 00:04:28,050 if du/dx does not equal zero, 69 00:04:28,050 --> 00:04:32,403 we cannot tell if a change in X results 70 00:04:33,300 --> 00:04:36,730 in a change in y hat or in u. 71 00:04:36,730 --> 00:04:40,883 And therefore this results in a biased estimate. 72 00:04:44,870 --> 00:04:48,610 Two tests that we learned are mean square error 73 00:04:48,610 --> 00:04:51,133 and the Wu Hausman test. 74 00:04:52,440 --> 00:04:56,763 And we'll talk a bit more about this. 75 00:04:58,670 --> 00:05:01,960 The Hausman test we learned, 76 00:05:01,960 --> 00:05:06,560 and you covered it pretty extensively on your homework. 77 00:05:06,560 --> 00:05:11,560 Mean squared error is the square 78 00:05:12,190 --> 00:05:15,980 of the bias plus the variance. 79 00:05:15,980 --> 00:05:20,380 So I'm gonna show you a bit more about what this means. 80 00:05:20,380 --> 00:05:24,683 So let's say that we have an endogenous regressor 81 00:05:26,130 --> 00:05:30,100 and we wanna know its beta, and which one should we use. 82 00:05:30,100 --> 00:05:32,910 So we can run both OLS 83 00:05:32,910 --> 00:05:37,910 and use this IV, two-stage least squares method. 84 00:05:38,260 --> 00:05:41,820 And just to keep things clear, 85 00:05:41,820 --> 00:05:45,123 let's call, when we run it by OLS, 86 00:05:45,123 --> 00:05:47,730 that this B is beta o. 87 00:05:47,730 --> 00:05:51,900 And when we run it by instrumental variable, 88 00:05:51,900 --> 00:05:54,290 we call this beta i. 89 00:05:54,290 --> 00:05:57,280 So the mean square error is the bias squared 90 00:05:57,280 --> 00:05:59,560 plus the variance. 91 00:05:59,560 --> 00:06:03,310 So for the instrumental variable, 92 00:06:03,310 --> 00:06:06,310 since there is no bias, 93 00:06:06,310 --> 00:06:10,450 it's just zero squared 94 00:06:10,450 --> 00:06:14,253 plus the variance of beta i. 95 00:06:16,650 --> 00:06:20,850 To get the mean square error of beta o, 96 00:06:20,850 --> 00:06:24,380 we subtract beta i minus beta o. 97 00:06:24,380 --> 00:06:29,380 So we subtract the biased from the unbiased value 98 00:06:29,930 --> 00:06:31,400 and square that, 99 00:06:31,400 --> 00:06:35,280 and then add the variance of beta o.
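To make the comparison concrete, here is a minimal sketch in Python of the mean squared error calculation just described; the numbers are hypothetical stand-ins for the beta o and beta i estimates and standard errors you would read off your own output, not values from our class data.

```python
# Minimal sketch of the MSE comparison (hypothetical numbers, not course data).
# MSE = bias^2 + variance. beta_i (the IV / two-stage least squares estimate)
# is treated as unbiased, so the bias of beta_o (OLS) is approximated by
# the gap between the two estimates, beta_i - beta_o.

def mse_ols(beta_o, beta_i, se_o):
    """MSE of the OLS estimate: (beta_i - beta_o)^2 + Var(beta_o)."""
    return (beta_i - beta_o) ** 2 + se_o ** 2

def mse_iv(se_i):
    """MSE of the IV estimate: zero squared plus Var(beta_i)."""
    return se_i ** 2

# Hypothetical estimates and standard errors from running the model both ways
beta_o, se_o = 0.42, 0.05   # OLS: more precise, but possibly biased
beta_i, se_i = 0.55, 0.15   # IV/2SLS: unbiased, but much noisier

print("MSE(beta o) =", round(mse_ols(beta_o, beta_i, se_o), 4))
print("MSE(beta i) =", round(mse_iv(se_i), 4))
```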
100 00:06:35,280 --> 00:06:39,700 And whichever of these has the smaller value, 101 00:06:39,700 --> 00:06:42,200 the mean square error of beta i, 102 00:06:42,200 --> 00:06:44,780 or the mean squared error of beta o, 103 00:06:44,780 --> 00:06:48,120 then this test would suggest that that is the one 104 00:06:48,120 --> 00:06:49,373 that we use. 105 00:06:53,400 --> 00:06:58,310 Now to get into the topics specifically 106 00:06:58,310 --> 00:07:01,020 that will be on the exam. 107 00:07:01,020 --> 00:07:02,410 So the first thing we did 108 00:07:02,410 --> 00:07:06,613 after the midterm was to look at LimDepVar, 109 00:07:07,670 --> 00:07:10,970 the limited dependent variables. 110 00:07:10,970 --> 00:07:12,430 So we learned 111 00:07:12,430 --> 00:07:17,430 that not all dependent variables meet OLS requirements, 112 00:07:17,820 --> 00:07:19,820 that in theory, 113 00:07:19,820 --> 00:07:24,070 the Y in OLS should be able to take any value 114 00:07:24,070 --> 00:07:28,710 from negative infinity to positive infinity. 115 00:07:28,710 --> 00:07:31,110 And while we certainly don't, you know, 116 00:07:31,110 --> 00:07:33,420 see all of those values used, 117 00:07:33,420 --> 00:07:35,600 ones that deviate far away 118 00:07:35,600 --> 00:07:39,090 from that will be more problematic. 119 00:07:39,090 --> 00:07:40,960 And we see this a lot, 120 00:07:40,960 --> 00:07:43,110 some are nominal or ordinal. 121 00:07:43,110 --> 00:07:47,407 Some naturally take on small integer values. 122 00:07:49,900 --> 00:07:52,060 Like how many cars you own, 123 00:07:52,060 --> 00:07:54,920 or how many times you've been arrested. 124 00:07:54,920 --> 00:07:58,667 Some sort of naturally pile up at zero 125 00:08:00,150 --> 00:08:02,940 due to behavior, 126 00:08:02,940 --> 00:08:06,207 such as how many cigarettes 127 00:08:08,120 --> 00:08:13,070 that you smoke or how much that you spend on alcohol. 128 00:08:13,070 --> 00:08:16,770 Many people would say zero for this. 129 00:08:16,770 --> 00:08:20,080 And then, 130 00:08:20,080 --> 00:08:25,060 sometimes our dependent variable cannot be measured 131 00:08:25,060 --> 00:08:28,010 outside of a certain range. 132 00:08:28,010 --> 00:08:29,690 So in these cases, 133 00:08:29,690 --> 00:08:34,290 we need to use methods other than ordinary least squares 134 00:08:34,290 --> 00:08:36,053 to get a good estimate of them. 135 00:08:38,950 --> 00:08:40,560 And then next, 136 00:08:40,560 --> 00:08:45,560 there are times when we don't have the perfect random sample 137 00:08:46,730 --> 00:08:50,120 where we can observe everybody, as we hope. 138 00:08:50,120 --> 00:08:55,077 Sometimes we choose to only focus on a sub sample, 139 00:08:55,950 --> 00:08:58,330 because that's what we are interested in, 140 00:08:58,330 --> 00:09:00,330 or that's all that we can reach. 141 00:09:00,330 --> 00:09:02,410 And then there are also times 142 00:09:02,410 --> 00:09:06,160 where only the participants can be observed. 143 00:09:06,160 --> 00:09:09,917 So we talked a lot about the case of women's wages 144 00:09:12,670 --> 00:09:14,020 in the workforce. 145 00:09:14,020 --> 00:09:15,570 And this was at a time 146 00:09:15,570 --> 00:09:20,570 when many women did not work outside the home, 147 00:09:21,100 --> 00:09:23,290 they were not in the workforce. 148 00:09:23,290 --> 00:09:26,680 So only the wages 149 00:09:26,680 --> 00:09:30,323 of those who were in the workforce could be observed. 150 00:09:31,605 --> 00:09:33,755 And we'll talk more about that in a minute. 
151 00:09:36,410 --> 00:09:40,690 So here are the categories of LimDepVar 152 00:09:40,690 --> 00:09:42,840 that we talked about, binary, ordinal, 153 00:09:42,840 --> 00:09:45,727 tobit, hurdle, Poisson and censored. 154 00:18:01,900 --> 00:18:06,900 The last type of limited dependent variable is censored. 155 00:18:07,210 --> 00:18:09,840 And in this case, 156 00:18:09,840 --> 00:18:11,930 we either choose 157 00:18:11,930 --> 00:18:16,270 or are only able to measure our Y's 158 00:18:16,270 --> 00:18:17,840 within a certain range. 159 00:18:17,840 --> 00:18:22,500 And you can think of it as a scale 160 00:18:22,500 --> 00:18:27,150 that maybe only goes up to 300 pounds 161 00:18:27,150 --> 00:18:29,350 and anybody who weighs more than that, 162 00:18:29,350 --> 00:18:31,930 we can't observe how much they weigh. 163 00:18:31,930 --> 00:18:33,260 And we talked about 164 00:18:35,620 --> 00:18:40,060 very often, we put income in categories, 165 00:18:40,060 --> 00:18:41,840 and that would be another example 166 00:18:41,840 --> 00:18:46,290 where if it's above this amount, 167 00:18:46,290 --> 00:18:51,290 that all we can observe is that it's up to that point. 168 00:18:51,470 --> 00:18:53,423 So in this case, 169 00:18:55,350 --> 00:18:56,870 we can't use OLS. 170 00:18:56,870 --> 00:18:59,893 We have to, again, use maximum likelihood, 171 00:19:01,700 --> 00:19:04,970 but the betas can be interpreted directly. 172 00:19:04,970 --> 00:19:08,710 So it is a constant, it is dy/dx, 173 00:19:08,710 --> 00:19:11,190 unlike the probit, 174 00:19:11,190 --> 00:19:13,420 logit, tobit and ordinal, 175 00:19:13,420 --> 00:19:15,393 and all of those that we learned about. 176 00:19:23,190 --> 00:19:28,190 Another topic that we covered this week is that so far, 177 00:19:28,600 --> 00:19:30,890 when we're talking about LimDepVar, 178 00:19:30,890 --> 00:19:34,640 these are properties of our dependent variable, 179 00:19:34,640 --> 00:19:37,810 and now we're gonna talk about some properties 180 00:19:37,810 --> 00:19:39,283 of our sample. 181 00:19:44,090 --> 00:19:47,480 And this is called truncation where, 182 00:19:47,480 --> 00:19:52,480 because of our researcher choice, 183 00:19:52,860 --> 00:19:54,310 we can only observe, 184 00:19:54,310 --> 00:19:58,910 or we choose to only observe, part of the population. 185 00:19:58,910 --> 00:20:03,270 So only folks 186 00:20:03,270 --> 00:20:04,530 of a certain income, 187 00:20:04,530 --> 00:20:07,860 higher income or lower income, 188 00:20:07,860 --> 00:20:10,990 or houses worth a certain amount, 189 00:20:10,990 --> 00:20:14,520 greater or less than that. 190 00:20:14,520 --> 00:20:17,750 Here we use T-tests 191 00:20:17,750 --> 00:20:20,890 and log likelihood tests 192 00:20:20,890 --> 00:20:23,703 to measure this. 193 00:20:27,750 --> 00:20:29,900 One of the more interesting parts 194 00:20:29,900 --> 00:20:34,900 of truncation is so-called incidental truncation, 195 00:20:35,380 --> 00:20:39,620 where we can only observe participants. 196 00:20:39,620 --> 00:20:44,400 In sort of more broad truncation, 197 00:20:44,400 --> 00:20:49,400 we could observe higher income folks' behavior 198 00:20:49,730 --> 00:20:51,880 and get data from them, 199 00:20:51,880 --> 00:20:55,750 even though we're only interested in lower income, 200 00:20:55,750 --> 00:20:59,550 or we choose to put our attention on lower income. 201 00:20:59,550 --> 00:21:03,020 Incidental truncation, though, is a case 202 00:21:03,020 --> 00:21:06,490 where we can only observe participants. 
203 00:21:06,490 --> 00:21:10,347 So if we wanna measure the returns 204 00:21:11,640 --> 00:21:16,640 of a college education on income, 205 00:21:16,850 --> 00:21:19,750 we can only observe those who went 206 00:21:19,750 --> 00:21:23,953 to college to estimate that effect, 207 00:21:25,500 --> 00:21:30,500 or the classic case of women in the workforce, 208 00:21:32,730 --> 00:21:35,920 where we can only observe the wage 209 00:21:35,920 --> 00:21:39,050 of those who are in the workforce. 210 00:21:39,050 --> 00:21:44,050 And so in this case, 211 00:21:44,900 --> 00:21:49,900 the participation is endogenous. 212 00:21:51,040 --> 00:21:55,973 And we have to estimate this with two steps. 213 00:21:57,360 --> 00:22:02,360 And this is a fairly 214 00:22:02,900 --> 00:22:07,033 commonly used model in econometrics. 215 00:22:11,180 --> 00:22:14,873 Now we're gonna jump to panel data. 216 00:22:17,230 --> 00:22:20,420 Before this, we've worked exclusively 217 00:22:20,420 --> 00:22:22,580 with cross sectional data. 218 00:22:22,580 --> 00:22:26,370 So only one time period 219 00:22:27,320 --> 00:22:30,000 and with many observations. 220 00:22:30,000 --> 00:22:34,560 But now we're going to work with many observations 221 00:22:34,560 --> 00:22:36,420 over many years. 222 00:22:36,420 --> 00:22:40,703 So now T is greater than one and N is greater than one. 223 00:22:42,050 --> 00:22:46,670 As an aside, up till this point we had only worked 224 00:22:46,670 --> 00:22:47,960 with cross section, 225 00:22:47,960 --> 00:22:48,930 T equals one. 226 00:22:48,930 --> 00:22:53,700 So cross section is data from N participants 227 00:22:53,700 --> 00:22:58,267 in a single time period, and time series is data 228 00:22:59,260 --> 00:23:01,900 from a single participant, 229 00:23:01,900 --> 00:23:06,900 or looking at a single variable, 230 00:23:07,110 --> 00:23:10,193 over many years, like income or GDP 231 00:23:12,160 --> 00:23:15,510 or prices or anything like that. 232 00:23:15,510 --> 00:23:19,130 So panel data is when we have both T 233 00:23:19,130 --> 00:23:20,923 and N greater than one. 234 00:23:26,256 --> 00:23:30,630 In a sense, panel data combines both cross section 235 00:23:30,630 --> 00:23:33,330 and time series. 236 00:23:33,330 --> 00:23:35,590 And there's two main types. 237 00:23:35,590 --> 00:23:39,720 First is independently pooled cross section, 238 00:23:39,720 --> 00:23:44,720 where you are asking the same questions, 239 00:23:44,990 --> 00:23:48,890 measuring the same variables over time, 240 00:23:48,890 --> 00:23:53,780 but each time you're drawing a random sample. 241 00:23:53,780 --> 00:23:58,630 So an example of this is asking the same question 242 00:23:58,630 --> 00:24:02,663 on the Vermonter Poll over multiple years. 243 00:24:07,750 --> 00:24:11,700 Whereas true panel data, 244 00:24:11,700 --> 00:24:14,860 you're following the same subjects 245 00:24:14,860 --> 00:24:19,080 and keeping track of what each individual said 246 00:24:20,050 --> 00:24:24,883 for each variable over each time period. 247 00:24:29,120 --> 00:24:32,230 One application of this independently pooled 248 00:24:32,230 --> 00:24:36,690 cross section is the difference in differences model, 249 00:24:36,690 --> 00:24:41,380 which is sort of a natural or quasi-experiment 250 00:24:41,380 --> 00:24:46,380 where we have a treatment that takes place in one, 251 00:24:46,790 --> 00:24:48,080 say one area, 252 00:24:48,080 --> 00:24:49,850 but not in another. 
253 00:24:49,850 --> 00:24:52,170 And then we measure the before 254 00:24:52,170 --> 00:24:55,763 and the after effects of this. 255 00:24:56,770 --> 00:24:59,860 So we choose two areas. 256 00:24:59,860 --> 00:25:01,840 One gets a treatment, 257 00:25:01,840 --> 00:25:04,380 one gets the control. 258 00:25:04,380 --> 00:25:09,380 We take a baseline reading of a random sample of those areas 259 00:25:11,110 --> 00:25:12,570 in the first time period, 260 00:25:12,570 --> 00:25:17,570 then the treatment happens. 261 00:25:17,870 --> 00:25:21,540 And then we take another reading in time two. 262 00:25:21,540 --> 00:25:23,030 And you can see here that 263 00:25:26,736 --> 00:25:28,193 the coefficient of interest is delta one here. 264 00:25:32,530 --> 00:25:34,127 That is the effect 265 00:25:39,790 --> 00:25:43,330 of the treatment in time two, 266 00:25:43,330 --> 00:25:47,870 controlling for the treatment area 267 00:25:47,870 --> 00:25:52,230 and the passage of time. 268 00:25:52,230 --> 00:25:57,230 So, this delta one isolates the effect of the treatment, 269 00:25:58,940 --> 00:26:03,450 the result of actually being treated in time two, 270 00:26:03,450 --> 00:26:07,290 so what changed from time one to time two 271 00:26:07,290 --> 00:26:09,130 in the treatment area, 272 00:26:09,130 --> 00:26:12,983 controlling for both treatment and time. 273 00:26:17,770 --> 00:26:18,802 On the other hand, 274 00:26:18,802 --> 00:26:19,680 (mic screeching) 275 00:26:19,680 --> 00:26:21,650 true panel data, 276 00:26:21,650 --> 00:26:25,830 you're not taking a random sample each time. 277 00:26:25,830 --> 00:26:28,660 It's not a new sample each time, 278 00:26:28,660 --> 00:26:31,950 you're actually following the same individuals 279 00:26:31,950 --> 00:26:36,580 or the same units of analysis 280 00:26:36,580 --> 00:26:38,460 over multiple years 281 00:26:38,460 --> 00:26:41,610 and keeping track of who says what. 282 00:26:41,610 --> 00:26:46,320 So when we talked about this City Market measurement 283 00:26:47,930 --> 00:26:49,340 of their members, 284 00:26:49,340 --> 00:26:53,430 where they follow, say, me over many years 285 00:26:53,430 --> 00:26:58,143 and keep track of what I said to each question in each year. 286 00:26:58,980 --> 00:27:03,863 This tends to have usually a fairly big N and a small T. 287 00:27:04,950 --> 00:27:09,950 The problem is there is unobserved individual heterogeneity, 288 00:27:12,257 --> 00:27:15,510 that there are things that just make me, me 289 00:27:15,510 --> 00:27:17,110 and make you, you 290 00:27:17,110 --> 00:27:19,510 that are very hard to measure 291 00:27:19,510 --> 00:27:23,980 and that we can think of as almost certainly correlated 292 00:27:23,980 --> 00:27:25,660 with our regressors, 293 00:27:25,660 --> 00:27:28,023 so we have to account for those. 294 00:27:32,210 --> 00:27:33,230 How do we account 295 00:27:33,230 --> 00:27:38,110 for this unobserved individual heterogeneity? 296 00:27:39,560 --> 00:27:44,560 Well, we could have a dummy variable for each individual. 297 00:27:44,630 --> 00:27:48,160 The problem with that is, 298 00:27:48,160 --> 00:27:51,610 especially in a single time period, 299 00:27:51,610 --> 00:27:55,870 you're gonna end up with negative degrees of freedom. 300 00:27:55,870 --> 00:27:59,050 And in any case, 301 00:27:59,050 --> 00:28:01,707 you're gonna have a whole lot of regressors. 
302 00:28:01,707 --> 00:28:06,000 So you're gonna have N dummy variables 303 00:28:06,000 --> 00:28:09,560 for N individuals for whom you have data, 304 00:28:09,560 --> 00:28:12,853 and that really eats into your degrees of freedom. 305 00:28:13,720 --> 00:28:16,780 But we know that if we ignore it, 306 00:28:16,780 --> 00:28:20,020 that this will go into the error term 307 00:28:20,020 --> 00:28:22,730 and it will cause biased estimates 308 00:28:22,730 --> 00:28:25,330 of all of our betas of interest. 309 00:28:25,330 --> 00:28:29,573 So we have to deal with it in some way. 310 00:28:36,070 --> 00:28:39,690 We learned about two ways and I'll go through each one: 311 00:28:39,690 --> 00:28:43,930 first differencing, and 312 00:28:43,930 --> 00:28:47,160 the fixed effects, time demeaning. 313 00:28:47,160 --> 00:28:52,160 And note that if we were to run OLS on true panel data, 314 00:28:54,680 --> 00:28:57,620 that this results in a biased estimate, 315 00:28:57,620 --> 00:29:01,600 that it will probably be more efficient, 316 00:29:01,600 --> 00:29:04,530 but it will almost certainly be biased. 317 00:29:04,530 --> 00:29:08,650 And it is unbiased only 318 00:29:08,650 --> 00:29:10,993 when we draw an independent, 319 00:29:14,010 --> 00:29:15,700 random sample, 320 00:29:15,700 --> 00:29:20,700 a new sample every time; only that gets rid of the bias. 321 00:29:21,280 --> 00:29:26,250 And if we keep sampling the same people over time, 322 00:29:26,250 --> 00:29:30,593 running it as pooled OLS will result in bias. 323 00:29:35,550 --> 00:29:40,550 The first method that we learned is first differencing. 324 00:29:40,670 --> 00:29:41,740 So here, 325 00:29:41,740 --> 00:29:46,740 we keep each individual person's observations together, 326 00:29:49,070 --> 00:29:54,070 and we subtract time t minus one from time t. 327 00:29:54,920 --> 00:29:59,450 So we take the difference for, say, me 328 00:29:59,450 --> 00:30:02,290 in this City Market example, 329 00:30:02,290 --> 00:30:06,650 of how much I spent in 2017 330 00:30:06,650 --> 00:30:10,670 and subtract how much I spent in 2012. 331 00:30:10,670 --> 00:30:13,910 And so we get the delta y 332 00:30:13,910 --> 00:30:16,893 and delta x here. 333 00:30:18,350 --> 00:30:21,990 I think you can see that the ai drops out by subtraction 334 00:30:21,990 --> 00:30:24,690 because it's the same thing every time. 335 00:30:24,690 --> 00:30:28,700 One thing that we need is for the delta x's and delta u's 336 00:30:28,700 --> 00:30:33,120 to be uncorrelated. 337 00:30:33,120 --> 00:30:36,300 This is called the strict form, 338 00:30:36,300 --> 00:30:38,900 a form of strict exogeneity, 339 00:30:38,900 --> 00:30:43,790 that the covariance of all of my X's 340 00:30:43,790 --> 00:30:45,020 over all times 341 00:30:45,020 --> 00:30:49,400 and all of my error terms equals zero. 342 00:30:49,400 --> 00:30:54,400 By running the regression on this bottom equation here, 343 00:30:58,485 --> 00:31:03,485 the change in my y as a function of the change in my X, 344 00:31:04,550 --> 00:31:08,160 the beta that you get there is called 345 00:31:08,160 --> 00:31:11,883 the first difference estimator. 346 00:31:16,890 --> 00:31:20,177 This technique, first differencing, 347 00:31:23,700 --> 00:31:27,980 the two big pros of it are that it does eliminate bias, 348 00:31:27,980 --> 00:31:31,360 and, if you have serial correlation, 349 00:31:31,360 --> 00:31:36,360 it helps to get rid of the serial correlation of the errors. 
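To see the mechanics, here is a minimal sketch in Python of the first-difference transformation and the first difference estimator; it assumes a hypothetical long-format pandas DataFrame with columns person, year, y, and x, which are illustrative stand-ins rather than the actual City Market variables.

```python
# Minimal sketch of first differencing, assuming hypothetical columns
# person, year, y, x in a long-format DataFrame (not the actual course data).
import pandas as pd
import statsmodels.api as sm

def first_difference_ols(df):
    df = df.sort_values(["person", "year"])
    # Within each person, subtract time t-1 from time t; a_i drops out
    # because it is the same in both periods.
    d = df.groupby("person")[["y", "x"]].diff().dropna()
    # The slope on delta x from this regression is the first difference estimator.
    return sm.OLS(d["y"], sm.add_constant(d["x"])).fit()

# Example usage with made-up numbers (note each person loses one observation):
df = pd.DataFrame({"person": [1, 1, 1, 2, 2, 2],
                   "year":   [2012, 2015, 2017] * 2,
                   "y":      [10.0, 11.5, 12.0, 8.0, 9.0, 11.0],
                   "x":      [1.0, 1.8, 2.0, 1.5, 2.2, 3.0]})
print(first_difference_ols(df).params)
```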
350 00:31:37,810 --> 00:31:42,490 The drawbacks are you lose a whole observation. 351 00:31:42,490 --> 00:31:47,490 So if you have data over two years 352 00:31:47,810 --> 00:31:49,410 for each individual, 353 00:31:49,410 --> 00:31:51,960 now you only have one observation, 354 00:31:51,960 --> 00:31:56,330 or if you have three years, it's down to two, et cetera. 355 00:31:56,330 --> 00:32:01,330 So you only get t minus one observations for each person. 356 00:32:03,490 --> 00:32:06,190 Since we're subtracting out, 357 00:32:06,190 --> 00:32:09,190 we're only measuring the change in X, 358 00:32:09,190 --> 00:32:13,920 so you're probably going to have less variation in X. 359 00:32:13,920 --> 00:32:18,410 We lose any information that might be contained in ai 360 00:32:18,410 --> 00:32:20,500 because it's subtracted out. 361 00:32:20,500 --> 00:32:22,680 And as a result, 362 00:32:22,680 --> 00:32:26,180 because we lose degrees of freedom, 363 00:32:26,180 --> 00:32:28,590 we lose information each time. 364 00:32:28,590 --> 00:32:31,803 This tends to be a less efficient estimator. 365 00:32:35,140 --> 00:32:37,760 The other technique that we learned to deal 366 00:32:37,760 --> 00:32:42,760 with panel data is fixed effects, 367 00:32:45,590 --> 00:32:48,280 where we do the time demeaning. 368 00:32:48,280 --> 00:32:50,530 So if we have one regressor, 369 00:32:50,530 --> 00:32:53,530 we still have this model 370 00:32:53,530 --> 00:32:58,530 where we have data from each individual 371 00:32:59,490 --> 00:33:00,880 over t times. 372 00:33:00,880 --> 00:33:03,120 And we assume that there's this ai there 373 00:33:03,120 --> 00:33:04,930 for every individual. 374 00:33:04,930 --> 00:33:09,910 Now we group each individual's observations. 375 00:33:09,910 --> 00:33:13,190 So, say, take all of my observations 376 00:33:13,190 --> 00:33:14,870 and take the mean of them. 377 00:33:14,870 --> 00:33:19,770 So if I answered the survey three times, 378 00:33:19,770 --> 00:33:21,460 take all three of my Y's, 379 00:33:21,460 --> 00:33:24,280 and take the mean of them. 380 00:33:24,280 --> 00:33:29,280 So that's yi bar, same with xi bar and ui bar. 381 00:33:29,420 --> 00:33:33,630 And then we subtract them from each time observation. 382 00:33:33,630 --> 00:33:37,700 So say in year three, 383 00:33:37,700 --> 00:33:40,880 you take my Y and subtract 384 00:33:42,100 --> 00:33:46,500 the mean of my Y, and the same for each X, 385 00:33:46,500 --> 00:33:49,660 and in this way, 386 00:33:49,660 --> 00:33:53,510 since ai doesn't change 387 00:33:53,510 --> 00:33:57,170 over time, the mean of ai is always ai, 388 00:33:57,170 --> 00:33:58,700 so it gets subtracted out. 389 00:33:58,700 --> 00:34:00,700 And in the same way, 390 00:34:00,700 --> 00:34:05,340 it gets rid of this endogeneity 391 00:34:05,340 --> 00:34:07,203 and therefore the bias. 392 00:34:08,510 --> 00:34:13,510 And this could be done for any number of K's 393 00:34:13,950 --> 00:34:17,510 and for any number of times. 394 00:34:17,510 --> 00:34:22,510 So if this were pooled data, 395 00:34:22,650 --> 00:34:26,550 you would have NT observations, 396 00:34:26,550 --> 00:34:27,633 but here, 397 00:34:30,159 --> 00:34:32,350 the degrees of freedom 398 00:34:34,400 --> 00:34:37,107 are NT minus N minus k, 399 00:34:41,270 --> 00:34:43,630 because we lose a degree of freedom 400 00:34:43,630 --> 00:34:45,670 for each person by subtracting out the mean, 401 00:34:45,670 --> 00:34:49,253 and then k for the k regressors. 
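Here is a minimal Python sketch of the within, time demeaning transformation just described, again assuming a hypothetical long-format DataFrame with columns person, y, and x; a packaged fixed effects routine would additionally use the NT minus N minus k degrees of freedom when computing standard errors, which this bare OLS call does not.

```python
# Minimal sketch of fixed effects via time demeaning, assuming hypothetical
# columns person, y, x in a long-format pandas DataFrame.
import pandas as pd
import statsmodels.api as sm

def fixed_effects_ols(df):
    # Subtract each person's mean from each of that person's observations.
    # Since a_i is constant over time, its mean is a_i, so it is removed.
    means = df.groupby("person")[["y", "x"]].transform("mean")
    demeaned = df[["y", "x"]] - means
    # No constant needed: the demeaned variables have mean zero by construction.
    # Caution: these OLS standard errors do not use the NT - N - k degrees of
    # freedom, so a dedicated fixed effects routine is preferred in practice.
    return sm.OLS(demeaned["y"], demeaned[["x"]]).fit()
```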
402 00:34:52,530 --> 00:34:56,750 So the pros and cons of fixed effects: 403 00:34:56,750 --> 00:34:59,203 it does eliminate bias as well. 404 00:35:00,410 --> 00:35:04,690 And compared to first differencing, 405 00:35:04,690 --> 00:35:08,420 for any number of times greater than two, 406 00:35:08,420 --> 00:35:10,893 it retains more degrees of freedom. 407 00:35:15,998 --> 00:35:20,610 It keeps one more observation 408 00:35:20,610 --> 00:35:22,320 per person 409 00:35:22,320 --> 00:35:24,747 than first differencing does. 410 00:35:26,230 --> 00:35:29,677 The cons are the same as first differencing. 411 00:35:31,680 --> 00:35:36,680 Any predictor that is constant over time will be lost. 412 00:35:37,470 --> 00:35:39,650 It gets subtracted out, 413 00:35:39,650 --> 00:35:44,650 and it does not deal with serial correlation 414 00:35:46,840 --> 00:35:49,840 in the way that first differencing did. 415 00:35:49,840 --> 00:35:52,900 And you need homoskedasticity 416 00:35:52,900 --> 00:35:56,500 and a lack of serial correlation 417 00:35:56,500 --> 00:35:59,163 for this to be an efficient estimator. 418 00:36:02,640 --> 00:36:05,430 This slide discusses the difference 419 00:36:05,430 --> 00:36:08,100 between our two techniques, 420 00:36:08,100 --> 00:36:09,677 the first differencing 421 00:36:09,677 --> 00:36:14,677 and time demeaning fixed effects data transformations. 422 00:36:15,320 --> 00:36:20,320 So, as we saw in the SPSS exercise, 423 00:36:21,510 --> 00:36:25,060 when you had two time periods, 424 00:36:25,060 --> 00:36:27,093 you get identical betas. 425 00:36:28,340 --> 00:36:32,020 If T is greater than or equal to three, 426 00:36:32,020 --> 00:36:35,663 then you have some decisions to make. 427 00:36:37,732 --> 00:36:39,990 So they're both unbiased, 428 00:36:39,990 --> 00:36:42,080 as long as the other 429 00:36:43,910 --> 00:36:47,963 classic linear regression assumptions hold. 430 00:36:48,880 --> 00:36:49,980 And in many ways, 431 00:36:49,980 --> 00:36:53,950 it comes down to whether there is serial correlation. 432 00:36:53,950 --> 00:36:55,850 So when we do time series, 433 00:36:55,850 --> 00:36:58,240 we'll learn how to test for that. 434 00:36:58,240 --> 00:37:02,940 So if there is no serial correlation, 435 00:37:02,940 --> 00:37:07,940 then time demeaning is more efficient, 436 00:37:08,810 --> 00:37:09,660 in many ways, 437 00:37:09,660 --> 00:37:13,630 because you have one more observation to work with. 438 00:37:13,630 --> 00:37:18,123 If there is significant serial correlation, 439 00:37:19,590 --> 00:37:22,364 then it's better to use first differencing 440 00:37:22,364 --> 00:37:24,833 to eliminate this. 441 00:37:26,170 --> 00:37:28,890 And also when you have a large T, 442 00:37:28,890 --> 00:37:32,120 so when T is large and N is small, 443 00:37:32,120 --> 00:37:34,770 it becomes more of a time series model 444 00:37:34,770 --> 00:37:38,610 and first differencing is better. 445 00:37:38,610 --> 00:37:42,030 And as is so often the case, 446 00:37:42,030 --> 00:37:44,230 it's good to do things both ways 447 00:37:44,230 --> 00:37:49,090 and to discuss why and how they differ, 448 00:37:49,090 --> 00:37:52,743 and which is the final model that you choose. 449 00:38:01,360 --> 00:38:04,420 The last data transformation that we can do 450 00:38:05,610 --> 00:38:08,750 in panel data is random effects. 
451 00:38:08,750 --> 00:38:12,470 And we have our same model where we have 452 00:38:14,531 --> 00:38:19,531 N research subjects, data collected over t times, 453 00:38:19,840 --> 00:38:22,640 we have k regressors, 454 00:38:22,640 --> 00:38:27,640 but now we can assume that ai is uncorrelated 455 00:38:29,300 --> 00:38:32,683 with any of the Xs at any time, 456 00:38:33,940 --> 00:38:37,360 which basically means that we've controlled 457 00:38:37,360 --> 00:38:39,560 for everything through our Xs, 458 00:38:39,560 --> 00:38:43,593 and therefore ai is very small and insignificant. 459 00:38:44,470 --> 00:38:49,470 And by doing the time demeaning or first differencing, 460 00:38:53,550 --> 00:38:55,230 we lose information 461 00:38:55,230 --> 00:39:00,230 and therefore those betas are less efficient. 462 00:39:00,880 --> 00:39:02,773 They have a higher variance. 463 00:39:04,390 --> 00:39:07,350 So this is a form of FGLS. 464 00:39:07,350 --> 00:39:09,930 And here you can see Jeff Feagles 465 00:39:09,930 --> 00:39:14,293 when he was a punter for the New England Patriots. 466 00:39:15,210 --> 00:39:19,700 And anyway, 467 00:39:19,700 --> 00:39:24,700 we weigh every observation by a factor of one minus lambda, 468 00:39:25,560 --> 00:39:29,113 and we use our data to come up with the lambda, 469 00:39:30,050 --> 00:39:35,050 and you can see that the lambda is a function of 470 00:39:37,860 --> 00:39:39,930 the variance of our error term, 471 00:39:39,930 --> 00:39:43,500 which we've worked with before, 472 00:39:43,500 --> 00:39:47,410 the variance of our a term, and t, 473 00:39:47,410 --> 00:39:50,440 how many time observations there are. 474 00:39:50,440 --> 00:39:55,370 And then we use this lambda to weigh every observation. 475 00:39:55,370 --> 00:39:56,233 So, 476 00:40:01,000 --> 00:40:06,000 we weigh the mean 477 00:40:06,090 --> 00:40:10,790 by this factor. 478 00:40:10,790 --> 00:40:15,790 And in the next one, I will show you what this all means. 479 00:40:18,520 --> 00:40:22,050 So again, this is our lambda. 480 00:40:22,050 --> 00:40:24,330 So with lambda equals zero, 481 00:40:24,330 --> 00:40:28,130 then we don't subtract any of the mean out, 482 00:40:28,130 --> 00:40:29,750 no, wait, that's wrong. 483 00:40:29,750 --> 00:40:34,750 If lambda equals zero, then it's the same as fixed effects. 484 00:40:35,260 --> 00:40:38,570 We subtract all of our mean out. 485 00:40:38,570 --> 00:40:42,870 If lambda equals one, it's the same as pooled OLS. 486 00:40:42,870 --> 00:40:45,423 We do not subtract any of it out. 487 00:40:46,380 --> 00:40:51,380 And in most cases, it'll be a number between zero and one. 488 00:40:53,230 --> 00:40:58,230 Note that if a varies a lot, a is important, 489 00:41:00,198 --> 00:41:03,150 and the sigma squared a is large, 490 00:41:03,150 --> 00:41:06,930 then one minus lambda approaches one, 491 00:41:06,930 --> 00:41:09,330 and it becomes more like the fixed effects, 492 00:41:09,330 --> 00:41:12,430 which we would assume, if a is important, 493 00:41:12,430 --> 00:41:15,480 we need to subtract it out. 494 00:41:15,480 --> 00:41:17,083 As T gets large, 495 00:41:18,660 --> 00:41:20,260 one minus lambda, 496 00:41:20,260 --> 00:41:22,000 again, approaches one, 497 00:41:22,000 --> 00:41:24,210 and it becomes like fixed effects, 498 00:41:24,210 --> 00:41:27,760 and as a gets smaller and smaller, 499 00:41:27,760 --> 00:41:31,500 one minus lambda approaches zero 500 00:41:31,500 --> 00:41:32,983 and it's more like pooled. 
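To tie these pieces together, here is a minimal Python sketch of the random effects quasi-demeaning factor. The slide's exact formula is not reproduced in the transcript, so the lambda below is an assumption: lambda equal to the square root of sigma squared u over sigma squared u plus T times sigma squared a, which is the standard FGLS form and matches every limiting case described above (lambda of zero behaving like fixed effects, lambda of one like pooled OLS, and one minus lambda growing as sigma squared a or T grows).

```python
# Sketch of the random effects quasi-demeaning factor. The formula for lambda
# is an assumption (standard FGLS form), consistent with the limits in the lecture:
# lambda = sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2))
import math

def re_lambda(sigma_u2, sigma_a2, T):
    return math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))

def quasi_demean(y_it, y_bar_i, lam):
    # Each observation has (1 - lambda) of its individual mean subtracted.
    return y_it - (1 - lam) * y_bar_i

# Limiting behavior, matching the lecture:
print(re_lambda(1.0, 100.0, 5))    # a matters a lot -> ~0.04, close to fixed effects
print(re_lambda(1.0, 0.001, 5))    # a barely matters -> ~1.00, close to pooled OLS
print(re_lambda(1.0, 1.0, 1000))   # T very large     -> ~0.03, close to fixed effects
```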
501 00:41:36,470 --> 00:41:38,770 We learned about the Wu Hausman test 502 00:41:38,770 --> 00:41:43,770 or Durbin Wu Hausman test, it has a number of names, 503 00:41:43,980 --> 00:41:47,203 but this test stat, W: 504 00:41:48,370 --> 00:41:51,370 in the numerator is the difference 505 00:41:51,370 --> 00:41:53,420 between the two betas, 506 00:41:53,420 --> 00:41:55,823 the random effects and fixed effects, squared, 507 00:41:56,830 --> 00:42:01,240 and in the denominator is just the difference 508 00:42:01,240 --> 00:42:03,440 of the variances. 509 00:42:03,440 --> 00:42:06,060 So you would run it both ways. 510 00:42:06,060 --> 00:42:08,623 And for a given beta, 511 00:42:09,540 --> 00:42:13,460 note, what is the estimate of beta? 512 00:42:13,460 --> 00:42:15,800 Those two things go in the numerator, 513 00:42:15,800 --> 00:42:18,050 and what are the variances of these? 514 00:42:18,050 --> 00:42:22,390 So you would take the standard error from the SPSS 515 00:42:22,390 --> 00:42:23,740 and square it, 516 00:42:23,740 --> 00:42:26,183 and those are used in the denominator. 517 00:42:27,100 --> 00:42:31,570 Our null hypothesis is that this X and ai, 518 00:42:31,570 --> 00:42:35,320 that their covariance equals zero. 519 00:42:35,320 --> 00:42:38,280 So if the null is true, 520 00:42:38,280 --> 00:42:43,280 if the covariance of those two things is zero, 521 00:42:43,960 --> 00:42:46,950 then the numerator is small. 522 00:42:46,950 --> 00:42:50,290 There's very small or no bias. 523 00:42:50,290 --> 00:42:52,760 The denominator is large 524 00:42:52,760 --> 00:42:56,260 because 525 00:42:56,260 --> 00:43:00,430 there is a big difference in the efficiencies, 526 00:43:00,430 --> 00:43:01,710 W is small, 527 00:43:01,710 --> 00:43:06,710 and we use random effects. 528 00:43:07,890 --> 00:43:11,450 If the test stat is large, 529 00:43:11,450 --> 00:43:14,160 which means a lot of bias 530 00:43:14,160 --> 00:43:17,990 and not very much change in variance, 531 00:43:17,990 --> 00:43:20,670 so a big number in the numerator, 532 00:43:20,670 --> 00:43:24,640 a small number in the denominator means a large test stat, 533 00:43:24,640 --> 00:43:27,160 which means that we reject the null 534 00:43:27,160 --> 00:43:29,283 and we use fixed effects. 535 00:43:33,760 --> 00:43:36,570 Now we jump to the next topic, (mic screeching) 536 00:43:36,570 --> 00:43:40,230 instrumental variables and two-stage least squares. 537 00:43:40,230 --> 00:43:43,560 So this is another data transformation 538 00:43:43,560 --> 00:43:45,690 to deal with endogeneity. 539 00:43:45,690 --> 00:43:49,280 You may notice (paper screeching) 540 00:43:49,280 --> 00:43:50,790 a trend, that a lot 541 00:43:50,790 --> 00:43:53,680 of what we were doing in the second half 542 00:43:53,680 --> 00:43:56,970 of this class is transforming data 543 00:43:56,970 --> 00:44:00,040 to deal with endogeneity. 544 00:44:00,040 --> 00:44:04,500 And again, finding that in almost every case, 545 00:44:04,500 --> 00:44:09,253 eliminating bias comes at a cost of efficiency. 546 00:44:10,930 --> 00:44:15,373 So this is a method that we use to address endogeneity. 547 00:44:19,420 --> 00:44:22,420 Its applications are in omitted 548 00:44:22,420 --> 00:44:26,470 variables and measurement errors, 549 00:44:26,470 --> 00:44:31,470 and most commonly it is used in the next topic 550 00:44:31,760 --> 00:44:35,580 that we covered, simultaneous equations. 
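Looking back at that panel data slide for a moment, here is a minimal sketch of the one-coefficient Hausman comparison just described, with hypothetical estimates and squared standard errors standing in for the numbers you would pull from SPSS output; for a single beta the statistic is compared against a chi-squared distribution with one degree of freedom.

```python
# Sketch of the one-coefficient Wu-Hausman comparison, using hypothetical
# numbers in place of real SPSS output.
from scipy import stats

def hausman_w(beta_fe, beta_re, se_fe, se_re):
    """W = (beta_FE - beta_RE)^2 / (Var(beta_FE) - Var(beta_RE))."""
    return (beta_fe - beta_re) ** 2 / (se_fe ** 2 - se_re ** 2)

# Hypothetical estimates: fixed effects is less efficient (bigger variance),
# random effects is more efficient but possibly biased.
W = hausman_w(beta_fe=0.80, beta_re=0.55, se_fe=0.20, se_re=0.10)
p_value = 1 - stats.chi2.cdf(W, df=1)   # one coefficient -> one degree of freedom

# Large W (small p): lots of bias, reject the null, use fixed effects.
# Small W: little bias relative to the efficiency loss, use random effects.
print(round(W, 2), round(p_value, 3))
```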
551 00:44:35,580 --> 00:44:38,460 So we've talked a lot about this, 552 00:44:38,460 --> 00:44:40,210 but just as a review, 553 00:44:40,210 --> 00:44:42,630 that endogeneity causes bias, 554 00:44:42,630 --> 00:44:47,630 that if the regressor and the error term are correlated, 555 00:44:48,120 --> 00:44:50,680 that we can't know (paper screeching) 556 00:44:50,680 --> 00:44:55,680 what is the direct effect of x on the expected value, 557 00:44:56,340 --> 00:45:00,820 because we can't partial out 558 00:45:02,310 --> 00:45:04,660 the way that x affects u, 559 00:45:04,660 --> 00:45:06,380 and then u affects y. 560 00:45:06,380 --> 00:45:10,670 So this is why failing to deal 561 00:45:10,670 --> 00:45:13,963 with endogeneity gives us biased estimates. 562 00:45:17,400 --> 00:45:21,103 So the three main ways that it shows up, 563 00:45:22,420 --> 00:45:25,930 or the three main sort of types of data that we have, 564 00:45:25,930 --> 00:45:30,360 or three main issues, are omitted variables, 565 00:45:30,360 --> 00:45:34,003 where we are missing a variable; 566 00:45:35,110 --> 00:45:38,050 measurement error, where our regressor has an error (paper screeching) 567 00:45:38,050 --> 00:45:40,713 in how it is measured; or simultaneity. 568 00:45:41,750 --> 00:45:43,040 So far, 569 00:45:43,040 --> 00:45:47,310 what we've learned is we could ignore it, 570 00:45:47,310 --> 00:45:49,010 but we know that causes bias. (mic screeching) 571 00:45:49,010 --> 00:45:52,690 We could find a proxy that is exogenous, 572 00:45:52,690 --> 00:45:55,623 which may or may not be possible. 573 00:45:56,830 --> 00:46:00,330 We can assume it is time constant, 574 00:46:00,330 --> 00:46:05,330 and in panel data subtract it out through first differencing 575 00:46:06,560 --> 00:46:11,370 or subtract out its mean through fixed effects, 576 00:46:11,370 --> 00:46:16,370 or we can use the instrumental variable method. 577 00:46:20,410 --> 00:46:25,170 So this method requires us 578 00:46:25,170 --> 00:46:30,170 to have an additional regressor, 579 00:46:30,684 --> 00:46:34,520 or an additional variable 580 00:46:34,520 --> 00:46:37,410 that's not in the original equation, 581 00:46:37,410 --> 00:46:41,540 that we sort of saved in our back pockets. 582 00:46:41,540 --> 00:46:45,060 And it has to have two attributes. 583 00:46:45,060 --> 00:46:46,040 First, (mic screeching) 584 00:46:46,040 --> 00:46:49,000 it must be in itself exogenous. 585 00:46:49,000 --> 00:46:53,840 So the z must have zero covariance with u, 586 00:46:53,840 --> 00:46:58,840 but it has to have some sort of explanatory power for our x, 587 00:46:59,550 --> 00:47:02,974 so the z and x cannot have zero covariance. 588 00:47:02,974 --> 00:47:05,724 (mic screeching) 589 00:47:07,843 --> 00:47:10,540 So in two-stage least squares, 590 00:47:10,540 --> 00:47:13,240 we start with our structural model. 591 00:47:13,240 --> 00:47:16,880 So we have Y on the left side, Y1, 592 00:47:16,880 --> 00:47:20,813 and we have this endogenous regressor, Y2, 593 00:47:22,200 --> 00:47:27,157 and an exogenous one in our structural equation, Z1. 594 00:47:29,030 --> 00:47:34,030 We've also saved two exogenous variables, Z2 and Z3, 595 00:47:35,480 --> 00:47:40,480 which have these two attributes that they are exogenous. 596 00:47:42,230 --> 00:47:45,350 So they're both uncorrelated with u1, 597 00:47:45,350 --> 00:47:49,910 but they do have some overlap with Y2. 
598 00:47:49,910 --> 00:47:51,640 So we want to come up 599 00:47:52,730 --> 00:47:55,573 with an estimate of Y2, 600 00:47:56,440 --> 00:48:00,550 which is a function of Z1, Z2, Z3, 601 00:48:00,550 --> 00:48:05,493 that has the most explanatory power that we can. 602 00:48:08,700 --> 00:48:11,040 We do this by doing the first stage 603 00:48:11,040 --> 00:48:12,590 of two-stage least squares, 604 00:48:12,590 --> 00:48:16,050 in which we put Y2 on the left side, 605 00:48:16,050 --> 00:48:21,050 and we regress it on all three of our exogenous regressors, 606 00:48:21,461 --> 00:48:24,470 Z1, Z2, Z3. 607 00:48:24,470 --> 00:48:25,760 We do an F-test, 608 00:48:25,760 --> 00:48:29,700 and we hope that we can reject our null, 609 00:48:29,700 --> 00:48:34,700 that we hope that Pi2 and Pi3 are not jointly equal to zero, 610 00:48:35,410 --> 00:48:36,270 that these two, 611 00:48:36,270 --> 00:48:41,270 Z2 and Z3, have significant explanatory power for Y2. 612 00:48:42,730 --> 00:48:47,450 Then we save the predicted values, the y2 hats. 613 00:48:47,450 --> 00:48:52,450 So everybody's y2 hat gets saved. 614 00:48:53,630 --> 00:48:58,170 And then in the second stage of least squares, 615 00:48:58,170 --> 00:49:02,120 we put this Y2 hat in place of the Y2. 616 00:49:02,120 --> 00:49:03,280 And in this way, 617 00:49:03,280 --> 00:49:07,150 since Y2 hat is a linear combination 618 00:49:07,150 --> 00:49:09,500 of all exogenous regressors, 619 00:49:09,500 --> 00:49:12,497 it is in and of itself exogenous. 620 00:49:12,497 --> 00:49:16,913 And this purges the Y2 of its endogeneity. 621 00:49:19,250 --> 00:49:22,360 Note though, that it comes at a cost 622 00:49:22,360 --> 00:49:24,850 of much higher variance. 623 00:49:24,850 --> 00:49:26,730 And we saw that 624 00:49:26,730 --> 00:49:29,970 when we did the simultaneous equation exercises, 625 00:49:29,970 --> 00:49:34,970 the standard error increased almost tenfold, 626 00:49:35,990 --> 00:49:38,830 and two very related things: 627 00:49:38,830 --> 00:49:43,830 that y2 hat has a lot less variability 628 00:49:44,380 --> 00:49:46,990 than Y2 does, 629 00:49:46,990 --> 00:49:48,150 and more importantly, 630 00:49:48,150 --> 00:49:52,773 maybe y2 hat is very collinear with our Zs. 631 00:49:58,900 --> 00:50:03,900 This method can also help address measurement errors. 632 00:50:03,930 --> 00:50:05,850 So if one of your regressors, 633 00:50:05,850 --> 00:50:09,313 if there is an error in the way it was measured, 634 00:50:11,320 --> 00:50:15,510 that will in and of itself cause endogeneity. 635 00:50:15,510 --> 00:50:20,510 So if you have an instrument for the variable 636 00:50:21,570 --> 00:50:23,500 that was measured in error, 637 00:50:23,500 --> 00:50:27,890 you can do two-stage least squares just as before, 638 00:50:27,890 --> 00:50:32,480 and so in this case use Z1 639 00:50:34,800 --> 00:50:36,513 as an instrument for X1, 640 00:50:37,470 --> 00:50:39,220 and save X1 hat, 641 00:50:39,220 --> 00:50:43,660 and then put it into the structural equation. 642 00:50:43,660 --> 00:50:46,810 And again, this will eliminate the bias, 643 00:50:46,810 --> 00:50:51,423 but at a cost of less efficiency. 644 00:50:55,880 --> 00:50:59,930 The most common application is simultaneous equations. 
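Here is a minimal sketch of two-stage least squares done by hand in Python, assuming hypothetical numpy arrays y1, y2, z1, z2, z3 for the variables on the slide; in practice you would use a dedicated IV/2SLS routine, because the manual second stage below does not correct its standard errors.

```python
# Minimal "by hand" sketch of two-stage least squares with hypothetical arrays
# y1, y2, z1, z2, z3 (one-dimensional numpy vectors of equal length).
import numpy as np
import statsmodels.api as sm

def two_stage_least_squares(y1, y2, z1, z2, z3):
    # First stage: regress the endogenous Y2 on all exogenous Zs.
    Z = sm.add_constant(np.column_stack([z1, z2, z3]))
    first = sm.OLS(y2, Z).fit()

    # Joint F-test that the coefficients on Z2 and Z3 (the columns after the
    # constant and Z1) are zero; we hope to reject this null.
    R = np.array([[0.0, 0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
    print(first.f_test(R))

    # Save y2 hat and use it in place of Y2 in the structural equation.
    y2_hat = first.fittedvalues
    second = sm.OLS(y1, sm.add_constant(np.column_stack([y2_hat, z1]))).fit()
    # Caution: these second-stage standard errors are not the corrected
    # 2SLS standard errors; a packaged IV/2SLS routine handles that.
    return second
```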
645 00:50:59,930 --> 00:51:01,970 So we showed how 646 00:51:01,970 --> 00:51:06,970 simultaneity is a major cause 647 00:51:07,480 --> 00:51:09,990 of endogeneity, that when you have this 648 00:51:09,990 --> 00:51:13,000 two-directional causality, 649 00:51:13,000 --> 00:51:17,183 modeling it as one way can cause bias. 650 00:51:18,290 --> 00:51:22,610 And it tends to be jointly determined variables, 651 00:51:22,610 --> 00:51:25,963 and often in equilibrium. 652 00:51:30,100 --> 00:51:35,050 Very often we have a two equation system, supply and demand, 653 00:51:35,050 --> 00:51:40,050 with Y1 on the left side of the first equation 654 00:51:40,180 --> 00:51:45,180 and appearing as a regressor in the second, and Y2 on the left side 655 00:51:45,480 --> 00:51:48,430 of the second equation 656 00:51:48,430 --> 00:51:51,470 and appearing as a regressor in the first, 657 00:51:51,470 --> 00:51:54,673 and then each one also has a set of Zs, 658 00:51:56,810 --> 00:51:57,760 Z1. 659 00:51:57,760 --> 00:51:58,593 So, you know, 660 00:51:58,593 --> 00:52:03,050 that could be a number of things that affect supply. 661 00:52:03,050 --> 00:52:06,733 Z2 would be a number of things that affect demand. 662 00:52:09,180 --> 00:52:11,730 Here it is, spelled out a bit more. 663 00:52:11,730 --> 00:52:13,240 Note that we would expect 664 00:52:13,240 --> 00:52:15,590 that there would probably be some overlap, 665 00:52:15,590 --> 00:52:18,550 that some of the regressors 666 00:52:18,550 --> 00:52:23,210 in the first equation would also appear in the second 667 00:52:23,210 --> 00:52:25,420 and vice versa, 668 00:52:25,420 --> 00:52:28,660 but they cannot be exactly the same, 669 00:52:28,660 --> 00:52:33,210 that for each equation to be identified, 670 00:52:33,210 --> 00:52:38,210 there must be a Z in the other equation, 671 00:52:38,960 --> 00:52:43,650 because if Z1 and Z2 are all exactly the same, 672 00:52:43,650 --> 00:52:46,590 then when we do 673 00:52:46,590 --> 00:52:49,393 the first stage of least squares, 674 00:52:51,580 --> 00:52:56,580 our, say, Y2 hat will be a linear combination 675 00:52:59,750 --> 00:53:01,330 of Z1. 676 00:53:01,330 --> 00:53:03,790 And then when we put that back in, 677 00:53:03,790 --> 00:53:08,370 it will be perfectly collinear with our Z1, 678 00:53:08,370 --> 00:53:10,163 and it will not work. 679 00:53:12,750 --> 00:53:15,603 Again, 680 00:53:21,630 --> 00:53:24,800 we can use two-stage least squares 681 00:53:24,800 --> 00:53:27,683 to estimate our simultaneous equations. 682 00:53:28,660 --> 00:53:31,743 The way that we do this is, 683 00:53:33,260 --> 00:53:38,060 if we're mainly interested in the first equation, 684 00:53:38,060 --> 00:53:41,870 so that's really the structural equation 685 00:53:41,870 --> 00:53:44,070 that we're most interested in, 686 00:53:44,070 --> 00:53:48,850 then we do a reduced form of the second equation for Y2. 687 00:53:48,850 --> 00:53:51,770 So we put Y2 on the left side 688 00:53:51,770 --> 00:53:55,440 and we regress it on all of our Zs, 689 00:53:55,440 --> 00:54:00,440 all of our exogenous Zs, and save the Y hats, 690 00:54:00,590 --> 00:54:05,590 and then put this Y2 hat into the structural equation 691 00:54:05,620 --> 00:54:10,393 and use that to regress for the structural parameters. 692 00:54:16,080 --> 00:54:20,930 Here is a note on the directions for the exam. 693 00:54:20,930 --> 00:54:24,530 So this is the first thing you'll see in the exam. 
694 00:54:24,530 --> 00:54:28,580 And mainly, that this is take home and open book, 695 00:54:28,580 --> 00:54:32,250 that you can and should use class notes, 696 00:54:32,250 --> 00:54:36,580 slides, both textbooks, 697 00:54:36,580 --> 00:54:38,650 the textbook slides, 698 00:54:38,650 --> 00:54:40,490 and also that Dartmouth guide. 699 00:54:40,490 --> 00:54:45,490 So all of those are absolutely fair game for you to use 700 00:54:45,711 --> 00:54:48,540 to answer these. 701 00:54:48,540 --> 00:54:52,950 It will have four 25-point questions 702 00:54:52,950 --> 00:54:55,433 for a total of 100 points. 703 00:54:56,380 --> 00:54:58,170 The biggest thing is that, 704 00:55:00,420 --> 00:55:03,040 unlike the problem sets, 705 00:55:03,040 --> 00:55:07,103 I want you to work on this only as individuals, 706 00:55:08,250 --> 00:55:11,130 please do not collaborate on it. 707 00:55:11,130 --> 00:55:12,490 Once you open it, 708 00:55:12,490 --> 00:55:16,360 don't discuss any of the questions or the answers 709 00:55:16,360 --> 00:55:17,800 with anybody else. 710 00:55:17,800 --> 00:55:20,080 So do your own work. 711 00:55:20,080 --> 00:55:24,720 You can use any resource except for another person. 712 00:55:24,720 --> 00:55:25,740 So here's a place 713 00:55:25,740 --> 00:55:30,630 where I really wanna test your individual knowledge. 714 00:55:30,630 --> 00:55:35,630 So we'll go over these slides on Monday in class, 715 00:55:35,640 --> 00:55:39,720 and I hope you are having a good weekend. 716 00:55:39,720 --> 00:55:43,570 It's not the very best of weather to be outside, 717 00:55:43,570 --> 00:55:44,403 but, 718 00:55:47,700 --> 00:55:49,730 I hope that you are all well. 719 00:55:49,730 --> 00:55:52,857 And we'll check in on Monday, thank you.