1 00:00:02,830 --> 00:00:05,240 - [Instructor] Hello, and welcome to the video lecture 2 00:00:05,240 --> 00:00:09,650 on multiple regression analysis and estimation. 3 00:00:09,650 --> 00:00:11,400 So we're gonna be looking at 4 00:00:12,310 --> 00:00:15,430 how to do a linear regression 5 00:00:15,430 --> 00:00:19,230 with more than one regressor 6 00:00:19,230 --> 00:00:24,230 and look at some of the properties of the model, 7 00:00:25,610 --> 00:00:29,180 what makes a good model, some of the assumptions 8 00:00:29,180 --> 00:00:33,070 that make for an unbiased estimator, 9 00:00:33,070 --> 00:00:37,310 much like we did last time, but now adding more regressors. 10 00:00:37,310 --> 00:00:39,830 And we're also gonna think about 11 00:00:41,970 --> 00:00:43,360 how to understand better 12 00:00:43,360 --> 00:00:46,143 what makes for a more efficient estimator. 13 00:00:49,750 --> 00:00:53,290 We'll start with the K=2, the 2 regressor, 14 00:00:53,290 --> 00:00:56,290 and then look at the general form 15 00:00:56,290 --> 00:01:00,590 where there's K regressors, look at the residuals, 16 00:01:00,590 --> 00:01:03,423 how we derive the OLS estimator, 17 00:01:05,260 --> 00:01:08,930 look at the ceteris paribus assumption, 18 00:01:08,930 --> 00:01:11,750 goodness of fit, the degrees of freedoms, 19 00:01:11,750 --> 00:01:14,763 and then again, looking at those assumptions, 20 00:01:15,860 --> 00:01:18,080 which, if they all hold, 21 00:01:18,080 --> 00:01:23,080 that we can say that OLS is the best unbiased estimator, 22 00:01:24,070 --> 00:01:28,610 and then last, looking at 23 00:01:28,610 --> 00:01:31,550 how do we know what the right model is? 24 00:01:31,550 --> 00:01:36,550 What happens if we add a irrelevant regressor, 25 00:01:37,510 --> 00:01:41,207 and what happens if we miss a relevant regressor? 26 00:01:43,790 --> 00:01:46,750 So here is the homework. 27 00:01:46,750 --> 00:01:50,363 So I want you to think about what is multicollinearity, 28 00:01:51,250 --> 00:01:55,513 what's the consequence and what's perfect collinearity, 29 00:01:56,410 --> 00:01:59,860 what happens if you omit a variable, 30 00:01:59,860 --> 00:02:02,370 and when doesn't it matter? 31 00:02:02,370 --> 00:02:07,030 Three, I want you to unpack the three factors 32 00:02:07,030 --> 00:02:11,580 of the OLS estimator's variance. 33 00:02:11,580 --> 00:02:14,420 What drives the variance in this estimator? 34 00:02:14,420 --> 00:02:16,559 And then there are also a couple 35 00:02:16,559 --> 00:02:19,893 of computer homework problems. 36 00:02:21,680 --> 00:02:25,670 Last time, we looked at the one regressor model. 37 00:02:25,670 --> 00:02:27,190 Now we're gonna look at the two. 38 00:02:27,190 --> 00:02:29,160 So now we're looking at a model 39 00:02:29,160 --> 00:02:32,080 where on the left-hand side is wage, 40 00:02:32,080 --> 00:02:34,090 so what drives whether or not somebody 41 00:02:34,090 --> 00:02:36,480 has a low or a high wage. 42 00:02:36,480 --> 00:02:39,280 And we're looking at two regressors, 43 00:02:39,280 --> 00:02:42,500 education and experience. 44 00:02:42,500 --> 00:02:45,210 So we're really interested mostly in education. 45 00:02:45,210 --> 00:02:47,660 So basically does education pay for itself? 46 00:02:47,660 --> 00:02:49,813 What's the returns to education? 47 00:02:50,935 --> 00:02:54,593 And may also be interested in experience, 48 00:02:56,050 --> 00:03:01,050 but we know that if we omit experience 49 00:03:01,080 --> 00:03:04,780 that it goes into the error term. 
50 00:03:04,780 --> 00:03:09,400 And it is very likely that if we asked a survey 51 00:03:09,400 --> 00:03:13,080 that education and experience 52 00:03:13,080 --> 00:03:15,413 may be related, 53 00:03:17,580 --> 00:03:20,980 that they are correlated. 54 00:03:20,980 --> 00:03:24,730 For example, the more education that you have, 55 00:03:24,730 --> 00:03:26,440 maybe you're spending more, 56 00:03:26,440 --> 00:03:28,260 spent more of your life on education, 57 00:03:28,260 --> 00:03:31,610 so you have less experience. 58 00:03:31,610 --> 00:03:33,560 That may be one way of thinking about it, 59 00:03:33,560 --> 00:03:36,500 but you can think they're almost certainly correlated. 60 00:03:36,500 --> 00:03:41,500 So if they are, and we forget to put in experience, 61 00:03:42,050 --> 00:03:46,023 then experience, we know, is in the error term, 62 00:03:47,772 --> 00:03:49,623 and then, almost certainly, 63 00:03:51,870 --> 00:03:54,310 our assumption of the error term 64 00:03:54,310 --> 00:03:57,800 and the regressors not being correlated is violated. 65 00:03:57,800 --> 00:04:01,373 And thus, we will almost certainly have a biased estimator. 66 00:04:05,350 --> 00:04:06,733 As another example, 67 00:04:07,570 --> 00:04:11,290 we might wanna look at a model where the test score, 68 00:04:11,290 --> 00:04:15,400 so what is the test score of some school 69 00:04:15,400 --> 00:04:19,480 based on how much that school district 70 00:04:21,710 --> 00:04:23,670 spends on schools, their expenditure, 71 00:04:23,670 --> 00:04:26,090 and the average household income. 72 00:04:26,090 --> 00:04:28,260 So if we only, so we're really interested, 73 00:04:28,260 --> 00:04:30,060 again, in expenditure. 74 00:04:30,060 --> 00:04:33,870 Does higher expenditure lead to higher test scores? 75 00:04:33,870 --> 00:04:35,990 That was, I would say, what we would think, 76 00:04:35,990 --> 00:04:37,690 and maybe even what we would hope, 77 00:04:39,510 --> 00:04:43,210 but if we forget to put in average income, 78 00:04:43,210 --> 00:04:45,023 almost certainly, 79 00:04:48,120 --> 00:04:51,610 expenditure and income are correlated. 80 00:04:51,610 --> 00:04:55,320 So again, if we forget to put it into the model, 81 00:04:55,320 --> 00:04:59,240 then we're almost certainly going to get a biased estimate 82 00:04:59,240 --> 00:05:03,373 of the effect on expenditure. 83 00:05:09,900 --> 00:05:13,150 So in each case, the beta-1, 84 00:05:13,150 --> 00:05:15,710 the one that we're most interested in, 85 00:05:15,710 --> 00:05:18,440 is a measure of 86 00:05:18,440 --> 00:05:22,560 if we increase X1 by one unit, 87 00:05:22,560 --> 00:05:26,493 how much does the Y increase? 88 00:05:27,410 --> 00:05:32,260 So basically we want to include all the relevant regressors, 89 00:05:32,260 --> 00:05:33,280 so we can account for them, 90 00:05:33,280 --> 00:05:35,580 so they don't end up in the error term, 91 00:05:35,580 --> 00:05:39,550 so we don't have a biased estimator. 92 00:05:39,550 --> 00:05:41,490 We want to account for everything, 93 00:05:41,490 --> 00:05:43,190 so we can make a good case 94 00:05:43,190 --> 00:05:48,143 that the estimator for our beta-1 is unbiased. 95 00:05:52,450 --> 00:05:56,450 Remember back that an absolutely essential assumption 96 00:05:56,450 --> 00:05:59,040 is that our error term is uncorrelated 97 00:05:59,040 --> 00:06:00,780 with any of our regressors. 98 00:06:00,780 --> 00:06:02,620 So now that we have two regressors, 99 00:06:02,620 --> 00:06:05,853 it must be uncorrelated with both. 
100 00:06:08,980 --> 00:06:12,220 Thinking ahead, to have an unbiased estimator, 00:06:12,220 --> 00:06:15,520 the error term must be uncorrelated 00:06:15,520 --> 00:06:18,710 with both regressors. 00:06:18,710 --> 00:06:21,500 So that would be 00:06:21,500 --> 00:06:24,630 that in the population 00:06:24,630 --> 00:06:28,940 the average residual 00:06:28,940 --> 00:06:31,800 for any individual, so the expected value, 00:06:31,800 --> 00:06:33,110 would be zero. 00:06:33,110 --> 00:06:36,620 And that would hold true for any value of X1 or X2. 00:06:36,620 --> 00:06:40,970 So no matter what the respondents say on the survey, 00:06:40,970 --> 00:06:43,840 no matter what their X1 and X2 is, 00:06:43,840 --> 00:06:48,193 the expected value of the residual is zero. 00:06:49,820 --> 00:06:54,820 And remember that that's important 00:06:54,830 --> 00:06:58,303 because if that is not true, 00:07:00,398 --> 00:07:03,370 if dU/dX1 00:07:03,370 --> 00:07:05,380 does not equal zero, 00:07:05,380 --> 00:07:09,070 then we can't tell as we change X1 00:07:09,070 --> 00:07:12,350 whether the change in the observed Y 00:07:12,350 --> 00:07:17,263 is due to a change in Y-hat or a change in the error term. 00:07:23,130 --> 00:07:26,440 Much the same holds when we look at the general case, 00:07:26,440 --> 00:07:28,030 so K regressors. 00:07:28,030 --> 00:07:31,270 So in many cases, we're going to have more than one, 00:07:31,270 --> 00:07:35,920 more than two, but a good number of regressors. 00:07:35,920 --> 00:07:39,590 So to think about the original example, 00:07:39,590 --> 00:07:42,420 wages are probably affected 00:07:42,420 --> 00:07:45,810 not just by education and experience, 00:07:45,810 --> 00:07:50,810 but by training, by ability, by all kinds of other things. 00:07:51,050 --> 00:07:52,760 Test scores, in the same way, 00:07:52,760 --> 00:07:56,903 have many other factors that would affect them. 00:07:58,260 --> 00:08:02,700 When we think of the example of local food expenditures, 00:08:02,700 --> 00:08:06,930 it wouldn't just be income, but many other factors, 00:08:06,930 --> 00:08:11,710 preferences in where you live and household size 00:08:11,710 --> 00:08:13,460 and all kinds of things like that. 00:08:13,460 --> 00:08:15,380 So you can probably think of other examples 00:08:15,380 --> 00:08:18,913 of where many Xs might affect our Y. 00:08:22,890 --> 00:08:25,360 This equation at the top is the general form, 00:08:25,360 --> 00:08:27,323 where we have K regressors. 00:08:28,420 --> 00:08:31,320 And so when we run it through our software, 00:08:31,320 --> 00:08:34,350 there are K plus one parameters, 00:08:34,350 --> 00:08:37,910 beta-0, beta-1, through beta-k. 00:08:37,910 --> 00:08:40,550 Again, beta-0 is the intercept. 00:08:40,550 --> 00:08:44,520 And remember that we sort of slide it up and down 00:08:44,520 --> 00:08:48,950 so that the expected value of u equals zero. 00:08:48,950 --> 00:08:52,150 And in some cases, it could be interpreted 00:08:52,150 --> 00:08:56,800 as the expected value of Y 00:08:56,800 --> 00:09:01,470 if all the Xs, 00:09:01,470 --> 00:09:03,030 all the regressors, 00:09:03,030 --> 00:09:05,460 equal zero.
148 00:09:05,460 --> 00:09:08,870 And in many cases, we think of beta-1 to beta-k 149 00:09:08,870 --> 00:09:11,330 are called the slope parameters 150 00:09:11,330 --> 00:09:13,933 'cause they measure the sort of change, 151 00:09:15,190 --> 00:09:18,080 the slope of the line for that regressor, 152 00:09:18,080 --> 00:09:22,203 and u is the disturbance term, as always. 153 00:09:28,120 --> 00:09:30,550 So once we get some data, 154 00:09:30,550 --> 00:09:35,550 and we run it through a software package, 155 00:09:35,650 --> 00:09:40,450 like SPSS, we get a number of things back. 156 00:09:40,450 --> 00:09:43,090 One is we get estimates, these beta-hats. 157 00:09:43,090 --> 00:09:45,550 So we get K plus 1 beta-hats. 158 00:09:45,550 --> 00:09:49,870 And then if we take everybody's X and plug it in 159 00:09:49,870 --> 00:09:54,373 and multiply each of these, 160 00:09:55,540 --> 00:09:58,920 each of their Xs times the beta-hat 161 00:09:58,920 --> 00:10:01,490 and get them all, add them all up, 162 00:10:01,490 --> 00:10:04,440 you'll get the Y-hat for that individual. 163 00:10:04,440 --> 00:10:09,020 So the forecast of what we would say 164 00:10:09,020 --> 00:10:11,123 or what would be our best guess, 165 00:10:12,002 --> 00:10:15,930 that that individual who answered X in that way, 166 00:10:15,930 --> 00:10:19,103 this is what the predicted value of their Y is. 167 00:10:20,840 --> 00:10:24,480 And note that when we put hats on things, 168 00:10:24,480 --> 00:10:26,867 that that's always the estimate. 169 00:10:26,867 --> 00:10:29,400 That's just how we sort of denote it. 170 00:10:29,400 --> 00:10:34,310 And if there are N observations, 171 00:10:34,310 --> 00:10:35,650 so we do a survey, 172 00:10:35,650 --> 00:10:40,410 and we get N observations, 173 00:10:40,410 --> 00:10:43,060 again, what OLS does 174 00:10:43,060 --> 00:10:47,160 is it minimizes the sum of squared residuals. 175 00:10:47,160 --> 00:10:51,020 So everybody has a ui-hat, 176 00:10:51,020 --> 00:10:54,170 and that's the difference between their Y-hat 177 00:10:54,170 --> 00:10:58,700 and their, what they actually said Y on the survey. 178 00:10:58,700 --> 00:11:00,900 So we square those and add them up. 179 00:11:00,900 --> 00:11:04,610 And that is 180 00:11:04,610 --> 00:11:07,363 how we get these estimates. 181 00:11:13,100 --> 00:11:16,010 We can have a sample regression line. 182 00:11:16,010 --> 00:11:19,423 So that is everybody's Y-hat, 183 00:11:20,600 --> 00:11:24,580 which we get by taking everybody's beta-hat, 184 00:11:24,580 --> 00:11:28,670 or everybody's X, and multiplying it by the beta-hat, 185 00:11:28,670 --> 00:11:29,800 as we see. 186 00:11:29,800 --> 00:11:31,860 Note that everybody's Y-hat, 187 00:11:31,860 --> 00:11:35,720 every Yi-hat is on the regression line. 188 00:11:35,720 --> 00:11:36,740 Why is that? 189 00:11:36,740 --> 00:11:39,913 So think about that, and we'll discuss it in class. 190 00:11:46,610 --> 00:11:51,550 So we interpret these beta-hats as the partial effect 191 00:11:51,550 --> 00:11:56,550 of a one-unit change of that Xi on Y. 192 00:11:56,720 --> 00:12:01,160 So as we change that X by a unit, 193 00:12:01,160 --> 00:12:06,160 that beta-hat denotes the expected change in Y. 194 00:12:06,480 --> 00:12:10,140 So, if we go and change every X 195 00:12:10,140 --> 00:12:11,290 by some amount of units 196 00:12:14,250 --> 00:12:16,530 and multiply them by the beta-hat, 197 00:12:16,530 --> 00:12:20,653 it gets the expected change in Y-hat. 
198 00:12:21,610 --> 00:12:24,300 Or you can have, 199 00:12:24,300 --> 00:12:27,760 hold all Xs constant except one 200 00:12:27,760 --> 00:12:32,330 and only change one X, and then beta-hat for that X 201 00:12:32,330 --> 00:12:36,443 would be the change in Y just for changing that one. 202 00:12:41,800 --> 00:12:45,870 And this is sort of the magic of regression, 203 00:12:45,870 --> 00:12:50,620 that it allows us to really isolate the effect 204 00:12:50,620 --> 00:12:54,193 of changing one X while holding all else the same. 205 00:12:55,770 --> 00:12:59,860 So we don't have to go 206 00:12:59,860 --> 00:13:02,723 and collect data where, 207 00:13:04,520 --> 00:13:07,660 for the first many Xs, 208 00:13:07,660 --> 00:13:10,770 everybody answers it the same way. 209 00:13:10,770 --> 00:13:13,800 And then only on, say, the fourth or fifth question, 210 00:13:13,800 --> 00:13:18,730 do they change it, that we can collect 211 00:13:18,730 --> 00:13:23,250 where lots and lots of, there are a lot of answers. 212 00:13:23,250 --> 00:13:26,333 Basically, everybody answers it slightly differently, 213 00:13:28,680 --> 00:13:31,640 but the magic of a regression 214 00:13:31,640 --> 00:13:35,040 is we can still isolate, holding all else equal, 215 00:13:35,040 --> 00:13:38,280 what is the change by just changing one X 216 00:13:38,280 --> 00:13:40,033 and everything else stays the same? 217 00:13:42,520 --> 00:13:47,100 We can also measure the effect of changing a lot of Xs 218 00:13:47,100 --> 00:13:49,290 or even all of them. 219 00:13:49,290 --> 00:13:54,083 So all we have to do is sort of plug it, 220 00:13:55,230 --> 00:14:00,203 plug these change in X, into all of these equations, 221 00:14:01,450 --> 00:14:06,270 into the equation, multiply by the various hats, 222 00:14:07,270 --> 00:14:10,063 add them up, and there you go. 223 00:14:11,510 --> 00:14:14,850 Or, you could change by one unit, 224 00:14:14,850 --> 00:14:19,770 or you can change by basically any unit. 225 00:14:19,770 --> 00:14:23,670 If you do change every X by one unit, 226 00:14:23,670 --> 00:14:26,970 the change in Y-hat will just be the sum 227 00:14:26,970 --> 00:14:29,463 of the various beta-hats. 228 00:14:30,880 --> 00:14:34,103 So hopefully, mathematically, all of that makes sense. 229 00:14:38,930 --> 00:14:41,240 Remember that when we run a regression 230 00:14:42,350 --> 00:14:46,460 that every individual 231 00:14:46,460 --> 00:14:50,330 has a Y-hat, the predicted value, 232 00:14:50,330 --> 00:14:55,170 and a u-hat, the value of their residual. 233 00:14:55,170 --> 00:14:59,640 So where do they fall on the regression line? 234 00:14:59,640 --> 00:15:03,220 And then that u-hat measures the difference 235 00:15:03,220 --> 00:15:05,960 between what they actually said 236 00:15:05,960 --> 00:15:08,790 and what we would've predicted they said. 237 00:15:08,790 --> 00:15:13,303 And again, the i is for each individual in the sample. 238 00:15:18,120 --> 00:15:21,280 So you get the Y-hat for each individual 239 00:15:21,280 --> 00:15:26,190 by plugging their Xs in to the model, again, 240 00:15:26,190 --> 00:15:30,720 multiplying them by the beta-hats 241 00:15:30,720 --> 00:15:34,330 and coming up with the Y-hat. 242 00:15:34,330 --> 00:15:38,120 And then we also, everybody has a u-hat. 243 00:15:38,120 --> 00:15:40,700 And we will learn down the road 244 00:15:40,700 --> 00:15:43,540 that we can save them both in SPSS. 245 00:15:43,540 --> 00:15:47,423 So when we go into SPSS, I will show you how to do that. 
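Before moving on to the properties of the residuals, here is a minimal sketch of what the software is doing behind the scenes. It is not SPSS output; it is a Python/NumPy illustration with made-up, simulated data (the variable names, coefficient values, and sample size are all invented for illustration), fitting the two-regressor wage-style model by least squares and then saving the Y-hats and u-hats for every observation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500                                     # hypothetical sample size
educ = rng.uniform(8, 20, size=n)           # years of education (made up)
exper = rng.uniform(0, 40, size=n)          # years of experience (made up)
u = rng.normal(0, 2, size=n)                # error term
wage = 1.0 + 0.8 * educ + 0.2 * exper + u   # an assumed "true" population model

# X matrix with a column of ones for the intercept (beta-0).
X = np.column_stack([np.ones(n), educ, exper])
y = wage

# OLS picks the beta-hats that minimize the sum of squared residuals;
# solving the least-squares problem below does exactly that.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta_hat         # predicted value for each individual
u_hat = y - y_hat            # residual for each individual

print("beta-hats:", beta_hat)                   # close to (1.0, 0.8, 0.2)
print("mean of the residuals:", u_hat.mean())   # essentially zero
```

Note that beta_hat[1] is exactly how much y_hat moves when educ goes up by one unit with exper held fixed, which is the partial-effect reading above, and the mean-zero residual previews the properties listed next.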
246 00:15:51,410 --> 00:15:54,070 Here are a number of mathematical 247 00:15:54,070 --> 00:15:57,670 and statistical properties of the residual. 248 00:15:57,670 --> 00:16:02,417 The sample average of ui-hat equals zero 249 00:16:04,370 --> 00:16:06,310 because the mean is zero. 250 00:16:06,310 --> 00:16:10,650 And so the mean of Y, the observed values, 251 00:16:10,650 --> 00:16:12,860 equals the mean of Y-hat. 252 00:16:12,860 --> 00:16:16,543 That Y-bar-hat equals Y-bar. 253 00:16:19,010 --> 00:16:23,870 the sample covariance between each Xk and u 254 00:16:23,870 --> 00:16:25,143 is zero. 255 00:16:26,870 --> 00:16:31,870 And the point of the mean observation, 256 00:16:32,230 --> 00:16:37,100 so the mean of Y, the mean of X1, 257 00:16:37,100 --> 00:16:42,100 all the way through Xk, always lies on the regression line. 258 00:16:50,320 --> 00:16:54,320 An important thing to note is that adding regressors 259 00:16:54,320 --> 00:16:59,180 almost always changes the value of your beta-hats. 260 00:16:59,180 --> 00:17:01,400 So when you go from beta, 261 00:17:04,820 --> 00:17:09,820 from one regressor to two, so from k=1 to k=2, 262 00:17:11,280 --> 00:17:15,350 the value of your beta-1-hat will change. 263 00:17:15,350 --> 00:17:17,790 In the first model, you only have X1, 264 00:17:17,790 --> 00:17:20,873 and then in another model, you add another X, X2, 265 00:17:21,860 --> 00:17:25,280 there will only be two cases 266 00:17:25,280 --> 00:17:29,190 where the value of beta-1-hat does not change. 267 00:17:29,190 --> 00:17:34,190 First is that if the beta-2-hat equals zero, 268 00:17:34,730 --> 00:17:38,103 so X2 has no effect on Y. 269 00:17:38,980 --> 00:17:43,980 And the other is if X1 and X2 are uncorrelated. 270 00:17:44,350 --> 00:17:47,670 So if X2 is uncorrelated 271 00:17:47,670 --> 00:17:50,440 with Y or with X1, 272 00:17:50,440 --> 00:17:53,460 those are the only two cases where adding X2 273 00:17:53,460 --> 00:17:57,933 will not change the value of our beta-1-hat. 274 00:18:02,480 --> 00:18:04,430 So think about it in this way. 275 00:18:04,430 --> 00:18:05,887 So we run this Y 276 00:18:09,540 --> 00:18:13,040 with two Xs, and then we run it again 277 00:18:13,040 --> 00:18:14,923 with only one X. 278 00:18:17,010 --> 00:18:20,343 Now, suppose that this A1, 279 00:18:22,020 --> 00:18:26,740 which is the coefficient 280 00:18:26,740 --> 00:18:29,780 in our second model, when we didn't include X2, 281 00:18:29,780 --> 00:18:34,550 we could write it as beta-1 plus beta-2 times D, 282 00:18:34,550 --> 00:18:37,890 where D is the slope coefficient 283 00:18:37,890 --> 00:18:39,280 of if you had 284 00:18:41,760 --> 00:18:45,070 regressed X2 on X1. 285 00:18:45,070 --> 00:18:46,210 They will be the same. 286 00:18:46,210 --> 00:18:50,180 So this A1 or A2 will be the same, 287 00:18:50,180 --> 00:18:54,373 only if one of these two things is true, 288 00:18:55,370 --> 00:18:58,510 either B2 here is zero, 289 00:18:58,510 --> 00:19:02,060 so X2 has no effect on Y, 290 00:19:02,060 --> 00:19:05,880 or if X1 and X2 are uncorrelated, 291 00:19:05,880 --> 00:19:08,610 so that if this D is zero. 292 00:19:08,610 --> 00:19:13,610 And I am going to show you in class what this looks like, 293 00:19:14,060 --> 00:19:16,423 kind of drawing a Venn diagram. 294 00:19:22,500 --> 00:19:25,600 And in the general case of K regressors, 295 00:19:32,020 --> 00:19:34,793 that when you add regressors, 296 00:19:35,640 --> 00:19:40,640 usually it's going to change the value of your beta-1. 
297 00:19:40,750 --> 00:19:44,390 So the only time that this would not be true 00:19:44,390 --> 00:19:47,800 is if all the other betas equaled zero, 00:19:47,800 --> 00:19:52,250 so none of the other regressors have any effect on Y, 00:19:52,250 --> 00:19:56,640 or if X1 is uncorrelated with every other X, 00:19:56,640 --> 00:19:57,727 with X2,...,Xk. 00:19:59,610 --> 00:20:04,610 Both of these would be very rare instances. 00:20:04,650 --> 00:20:09,650 Almost always, Y and X2 are gonna have at least some correlation, 00:20:09,680 --> 00:20:14,680 or X1 will have some correlation 00:20:15,510 --> 00:20:20,450 with one or more, probably all, of our other Xs, 00:20:20,450 --> 00:20:21,810 X2 through Xk. 00:20:21,810 --> 00:20:26,160 So the bottom line here is adding or subtracting regressors 00:20:26,160 --> 00:20:29,880 almost always changes the value of every beta. 00:20:29,880 --> 00:20:33,933 And that's why it's so important to include the right ones. 00:20:42,000 --> 00:20:46,340 Thinking again about the concept of R squared, 00:20:46,340 --> 00:20:49,480 how well does our model fit the data? 00:20:49,480 --> 00:20:51,220 How much of the variation in Y 00:20:52,110 --> 00:20:55,040 is explained by the variation in the Xs? 00:20:55,040 --> 00:20:58,140 We can, again, decompose it as SST, 00:20:58,140 --> 00:21:01,670 the total sum of squares, the variation in Y, 00:21:01,670 --> 00:21:06,670 the explained sum of squares, SSE, the variation in Y-hat, 00:21:07,200 --> 00:21:11,040 and the sum of squared residuals, SSR. 00:21:11,040 --> 00:21:14,250 So note that, again, OLS 00:21:14,250 --> 00:21:16,993 makes SSR as small as possible, 00:21:19,400 --> 00:21:20,253 as before. 00:21:25,210 --> 00:21:27,560 So in this drawing here, 00:21:27,560 --> 00:21:29,243 they're writing SST as TSS, 00:21:31,600 --> 00:21:32,433 but it's the same thing. 00:21:32,433 --> 00:21:34,700 It's the total sum of squares. 00:21:34,700 --> 00:21:37,100 So SST 00:21:37,100 --> 00:21:41,190 is the sum of each Y 00:21:41,190 --> 00:21:44,770 minus the mean of Y, squared. 00:21:44,770 --> 00:21:49,373 SSE is the sum of each Y-hat minus Y-bar, squared. 00:21:50,601 --> 00:21:54,300 And SSR is the sum of squared residuals. 00:21:54,300 --> 00:21:58,250 So here's the formula. 00:21:58,250 --> 00:22:00,563 It's one that you've seen before. 00:22:01,740 --> 00:22:04,413 So again, SST equals 00:22:04,413 --> 00:22:09,120 SSE plus SSR, and we do a bit of math, 00:22:09,120 --> 00:22:10,760 and we get R squared, 00:22:10,760 --> 00:22:14,710 which is defined as one minus SSR 00:22:14,710 --> 00:22:16,550 divided by SST. 00:22:16,550 --> 00:22:17,643 So it is that 00:22:20,940 --> 00:22:25,430 part of the variation in Y 00:22:25,430 --> 00:22:28,290 which is explained by the Xs. 00:22:28,290 --> 00:22:32,323 R squared is always a number between zero and one. 00:22:33,210 --> 00:22:34,670 Hardly ever is it zero. 00:22:34,670 --> 00:22:35,750 Hardly ever is it one. 00:22:35,750 --> 00:22:37,650 In fact, in any regression, 00:22:37,650 --> 00:22:40,790 you'll basically never see either one. 00:22:40,790 --> 00:22:44,400 What does it mean if SSR equals zero? 00:22:44,400 --> 00:22:48,323 That's something that you could ponder and think about.
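To see that decomposition in action, here is a minimal sketch, assuming made-up, simulated data rather than a real survey (the variable names and coefficient values are invented for illustration). It fits an OLS model and then computes SST, SSE, and SSR directly, confirming that SST = SSE + SSR and that R squared = 1 - SSR/SST.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                    # two made-up regressors
y = 2.0 + 1.5 * x1 - 0.5 * x2 + rng.normal(0, 1, size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat

SST = np.sum((y - y.mean()) ** 2)          # total variation in Y
SSE = np.sum((y_hat - y.mean()) ** 2)      # explained variation, in Y-hat
SSR = np.sum(u_hat ** 2)                   # sum of squared residuals

print(SST, SSE + SSR)                      # equal up to rounding: SST = SSE + SSR
print("R squared:", 1 - SSR / SST)
```

The degrees-of-freedom-adjusted version discussed next simply replaces SSR and SST with SSR/(n-k-1) and SST/(n-1) before taking the ratio.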
00:22:54,850 --> 00:22:57,450 Here are some properties of R squared. 348 00:22:57,450 --> 00:23:01,390 So it never decreases, and it almost always increases 00:23:01,390 --> 00:23:02,920 when you add a regressor. 00:23:02,920 --> 00:23:06,290 So even if you add a total nonsense regressor, 00:23:06,290 --> 00:23:07,640 like your shoe size 00:23:07,640 --> 00:23:12,640 or how many letters are in your dog's name 00:23:12,940 --> 00:23:14,730 or anything like that 00:23:14,730 --> 00:23:17,800 that has nothing to do with your model, 00:23:17,800 --> 00:23:20,500 it's still going to increase R squared. 00:23:20,500 --> 00:23:23,390 And therefore, it's a poor criterion 00:23:23,390 --> 00:23:25,290 of whether to add a regressor. 00:23:25,290 --> 00:23:27,950 Since it always goes up, 00:23:27,950 --> 00:23:31,720 it's really not going to tell you anything. 00:23:31,720 --> 00:23:35,280 There is a way that you can calculate an adjusted R squared, 00:23:35,280 --> 00:23:37,300 which sort of compensates 00:23:37,300 --> 00:23:40,440 for the loss of degrees of freedom. 00:23:40,440 --> 00:23:45,440 So it sort of looks at whether 00:23:45,680 --> 00:23:49,600 the R squared is better, given that we know 00:23:49,600 --> 00:23:52,740 that we lost some degrees of freedom, 00:23:52,740 --> 00:23:55,190 and compensates for that. 00:23:55,190 --> 00:23:56,833 And that's a better criterion. 00:24:01,110 --> 00:24:03,260 Now we're gonna look at the same kind of assumptions 00:24:03,260 --> 00:24:05,220 that we looked at last time. 00:24:05,220 --> 00:24:08,800 So these are the things that we assume are true 00:24:08,800 --> 00:24:13,670 or that must be true for an OLS model 00:24:13,670 --> 00:24:17,773 in order for it to be the best unbiased estimator. 00:24:18,908 --> 00:24:21,270 And these are the same as you've seen before, 00:24:21,270 --> 00:24:25,330 that it has to be linear in parameters, random sampling, 00:24:25,330 --> 00:24:28,180 non-stochastic Xs 00:24:28,180 --> 00:24:31,100 that are not perfectly collinear, 00:24:31,100 --> 00:24:35,170 and that the residual has to have zero conditional mean. 00:24:35,170 --> 00:24:36,763 So I'm gonna walk through each one. 00:24:39,410 --> 00:24:41,510 So it has to be a linear model, 00:24:41,510 --> 00:24:44,700 meaning that you can actually write the population model 00:24:44,700 --> 00:24:49,700 in these terms, as a function of Y and Xs, as you see here. 00:24:50,940 --> 00:24:54,730 Again, the betas are the unknown parameters, 00:24:54,730 --> 00:24:57,433 and the u is the disturbance term. 00:24:58,540 --> 00:25:00,300 And what this means is that the betas 00:25:00,300 --> 00:25:03,020 cannot have any exponent other than one 00:25:03,020 --> 00:25:05,140 for it to be a linear function. 00:25:05,140 --> 00:25:08,150 Note that the Xs can have exponents, 00:25:08,150 --> 00:25:10,010 so it could be squares or logs 00:25:10,010 --> 00:25:11,890 or square roots or all kinds of other things. 00:25:11,890 --> 00:25:15,930 And I think squared is probably the most common one 00:25:15,930 --> 00:25:17,373 that you're gonna encounter. 00:25:19,180 --> 00:25:22,330 If you think that the relationship has a curve in it, 00:25:22,330 --> 00:25:24,830 that it's not a line, you can often add a square.
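Here is a quick, hedged sketch of that "add a square" idea, again on made-up simulated data (the coefficient values and the use of income are just for illustration). The model stays linear in the parameters even though income squared appears as a regressor, and income and income squared are correlated but not perfectly collinear, so it is perfectly legal.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
income = rng.uniform(20, 120, size=n)      # hypothetical income, in thousands

# A relationship that increases at a decreasing rate, plus noise.
y = 5 + 0.9 * income - 0.004 * income ** 2 + rng.normal(0, 3, size=n)

# Still "linear in parameters": every beta has exponent one,
# even though one regressor is the square of another.
X = np.column_stack([np.ones(n), income, income ** 2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                            # roughly (5, 0.9, -0.004)
```

Income and income squared are highly correlated here, but not perfectly collinear, so the betas are still defined; that is exactly the distinction drawn in the no-perfect-collinearity assumption that comes next.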
00:25:27,550 --> 00:25:29,370 Next is random sampling. 395 00:25:29,370 --> 00:25:34,370 So again, we draw a sample from a population, 00:25:34,850 --> 00:25:36,523 and it's a random sample. 00:25:37,570 --> 00:25:40,290 So there's no clear selection bias. 00:25:40,290 --> 00:25:44,200 So we don't only choose old folks or men 00:25:44,200 --> 00:25:46,790 or high income or larger households 00:25:46,790 --> 00:25:48,210 or married people or anything like that. 00:25:48,210 --> 00:25:51,723 It's representative of the population. 00:25:53,790 --> 00:25:57,070 Third is no perfect collinearity, 00:25:57,070 --> 00:26:01,740 that each regressor, 00:26:01,740 --> 00:26:03,010 well, one way to think of it, 00:26:03,010 --> 00:26:05,400 must add some new information, 00:26:05,400 --> 00:26:09,110 that there's no perfect linear relationship 00:26:09,110 --> 00:26:11,240 among any of the regressors. 00:26:11,240 --> 00:26:14,630 And we need this for it to work mathematically; 00:26:14,630 --> 00:26:18,800 the betas will not be defined 00:26:18,800 --> 00:26:23,233 if there is perfect collinearity. 00:26:25,470 --> 00:26:27,490 Note that they can be correlated, 00:26:27,490 --> 00:26:31,140 that they almost always are and will be, 00:26:31,140 --> 00:26:34,170 just not perfectly so. 00:26:34,170 --> 00:26:36,860 So if you are a matrix algebra nerd, 00:26:36,860 --> 00:26:39,533 know that our X-transpose-X matrix 00:26:42,070 --> 00:26:47,070 will be a singular matrix. 00:26:47,080 --> 00:26:51,210 It'll have a zero determinant, and it cannot be inverted. 00:26:51,210 --> 00:26:53,460 So much like we saw before, 00:26:53,460 --> 00:26:55,930 it's sorta like dividing by zero. 00:26:55,930 --> 00:26:57,513 It's just undefined. 00:26:58,390 --> 00:27:00,180 It's certainly okay 00:27:00,180 --> 00:27:03,710 for there to be a non-linear relationship among them. 00:27:03,710 --> 00:27:07,810 So you could include both income and income squared, 00:27:07,810 --> 00:27:12,810 again, if you think the relationship has a curve in it, 00:27:13,360 --> 00:27:16,980 like increasing, but at a decreasing rate. 00:27:16,980 --> 00:27:17,953 And that's fine. 00:27:20,730 --> 00:27:23,180 Here are some examples. 00:27:23,180 --> 00:27:25,650 So one might be 00:27:27,670 --> 00:27:32,593 the expenditures in Canadian and US dollars, 00:27:33,630 --> 00:27:38,350 where US dollars equals a times Canadian dollars, 00:27:38,350 --> 00:27:39,563 and a is the exchange rate. 00:27:41,080 --> 00:27:43,950 We often find it, too, with dummy variables. 00:27:43,950 --> 00:27:46,280 So if you code 00:27:46,280 --> 00:27:51,250 whether you live in Vermont as a yes, 00:27:51,250 --> 00:27:53,250 one equals yes, zero equals no, 00:27:53,250 --> 00:27:57,217 and have another variable, non-Vermont, coded the other way, 00:27:58,080 --> 00:27:59,980 the sum of these two is always one. 00:27:59,980 --> 00:28:01,360 So you can't include both. 00:28:01,360 --> 00:28:04,723 You only include one of these two in your model. 00:28:08,000 --> 00:28:13,000 The intuition is, how can we measure the effect of US dollars 00:28:13,810 --> 00:28:18,140 while holding Canadian dollars constant?
442 00:28:18,140 --> 00:28:21,750 Or how can you, if you're thinking about plant growth, 443 00:28:21,750 --> 00:28:26,260 how do you account for temperature 444 00:28:26,260 --> 00:28:28,480 in degrees Fahrenheit 445 00:28:28,480 --> 00:28:32,120 holding degrees Celsius constant? 446 00:28:32,120 --> 00:28:34,190 So that's the intuition. 447 00:28:34,190 --> 00:28:38,330 So, also, it adds no new information, 448 00:28:38,330 --> 00:28:40,300 that if you know degrees Fahrenheit, 449 00:28:40,300 --> 00:28:45,220 then you automatically know degrees Celsius. 450 00:28:45,220 --> 00:28:48,630 And the degrees Celsius, one would say, 451 00:28:48,630 --> 00:28:50,853 adds no new information at all. 452 00:28:54,270 --> 00:28:57,380 We must also have a positive degree of freedom, 453 00:28:57,380 --> 00:29:01,430 so N must be strictly greater than K+1. 454 00:29:01,430 --> 00:29:05,490 So the number of observations must be strictly greater 455 00:29:05,490 --> 00:29:09,223 than the number of regressors plus one. 456 00:29:10,300 --> 00:29:14,320 Otherwise, you have more unknowns than equations, 457 00:29:14,320 --> 00:29:18,370 and you have either no or infinite solution. 458 00:29:18,370 --> 00:29:21,670 And the more degrees of freedom that you have, 459 00:29:21,670 --> 00:29:24,290 the lower the variance of beta. 460 00:29:24,290 --> 00:29:28,743 And here's a YouTube that explains that. 461 00:29:34,180 --> 00:29:37,020 There are many benefits to having a big N, 462 00:29:37,020 --> 00:29:38,920 to having a large sample size, 463 00:29:38,920 --> 00:29:43,240 and one of those is it increases your degrees of freedom. 464 00:29:43,240 --> 00:29:46,590 And as you see here in this table, 465 00:29:46,590 --> 00:29:51,590 the more degrees of freedom that you have, 466 00:29:51,860 --> 00:29:56,140 the lower the test stat has to be 467 00:29:56,140 --> 00:29:57,963 to be significant. 468 00:29:58,880 --> 00:30:02,250 And next week when we talk about hypothesis tests, 469 00:30:02,250 --> 00:30:04,633 I think this is gonna make even more sense. 470 00:30:06,010 --> 00:30:11,010 But basically, the higher degree of freedom, 471 00:30:11,180 --> 00:30:14,760 the bigger N leads to a more efficient estimator. 472 00:30:14,760 --> 00:30:16,390 And it's a theme that we're gonna revisit 473 00:30:16,390 --> 00:30:19,870 over and over again, more information 474 00:30:19,870 --> 00:30:21,900 leads to lower variance, 475 00:30:21,900 --> 00:30:24,940 or more information leads to 476 00:30:29,430 --> 00:30:33,870 a more efficient estimator, or a lower variance estimator. 477 00:30:33,870 --> 00:30:37,360 And having bigger N, getting information from more people, 478 00:30:37,360 --> 00:30:39,623 is one way that you can get more information. 479 00:30:44,630 --> 00:30:47,460 The next assumption, again, 480 00:30:47,460 --> 00:30:50,290 this is one we should be familiar with by now, 481 00:30:50,290 --> 00:30:53,070 that the expected value of the error term, 482 00:30:53,070 --> 00:30:55,980 no matter the value of X, is zero. 483 00:30:55,980 --> 00:31:00,610 And that no matter the value of X, 484 00:31:00,610 --> 00:31:03,020 the expected value is the same. 485 00:31:03,020 --> 00:31:06,150 It's this idea that the error term 486 00:31:06,150 --> 00:31:10,010 is uncorrelated with the regressors, 487 00:31:10,010 --> 00:31:13,973 and this is needed to have an unbiased estimator. 
488 00:31:14,920 --> 00:31:17,300 And that is why, as we'll see, 489 00:31:17,300 --> 00:31:19,630 that omitting an important variable 490 00:31:22,540 --> 00:31:25,423 will result in bias. 491 00:31:30,700 --> 00:31:33,740 When this zero conditional mean holds, 492 00:31:33,740 --> 00:31:37,110 we say that our regressors are explanatory. 493 00:31:37,110 --> 00:31:39,100 The variables are exogenous. 494 00:31:39,100 --> 00:31:41,430 That's what we want, exogenous is good. 495 00:31:41,430 --> 00:31:43,280 When they are correlated with the error term, 496 00:31:43,280 --> 00:31:44,750 they are said to be endogenous. 497 00:31:44,750 --> 00:31:48,020 And most, a lot of the topics 498 00:31:48,020 --> 00:31:50,270 that we'll be covering toward the end of class 499 00:31:50,270 --> 00:31:54,650 will deal with how to detect 500 00:31:54,650 --> 00:31:57,393 if they're endogenous and what to do about it. 501 00:32:02,190 --> 00:32:06,440 So, if we have these four assumptions holding, 502 00:32:06,440 --> 00:32:08,430 if all four are true, 503 00:32:08,430 --> 00:32:12,340 then every beta-hat is unbiased. 504 00:32:12,340 --> 00:32:17,000 And know that what this means is the procedure is unbiased. 505 00:32:17,000 --> 00:32:18,803 The model is unbiased. 506 00:32:20,628 --> 00:32:23,600 It doesn't mean that every single beta-hat 507 00:32:23,600 --> 00:32:26,810 will fall exactly on the true value. 508 00:32:26,810 --> 00:32:29,410 It means that there's no systematic reason 509 00:32:29,410 --> 00:32:32,780 why we should think it's too big or too small, 510 00:32:32,780 --> 00:32:35,000 and if we did this over and over again 511 00:32:35,000 --> 00:32:38,810 that the value would converge to its true value. 512 00:32:38,810 --> 00:32:41,123 And that's what unbiased means. 513 00:32:47,080 --> 00:32:52,080 We're gonna talk about two cases now under specification. 514 00:32:52,890 --> 00:32:56,697 What regressors should you include in your model? 515 00:32:56,697 --> 00:32:59,990 And we're gonna talk first about overspecifying, 516 00:32:59,990 --> 00:33:02,973 which is including irrelevant ones, 517 00:33:04,590 --> 00:33:06,090 like your shoe size 518 00:33:06,090 --> 00:33:09,070 or the number of letters in your dog's name 519 00:33:09,070 --> 00:33:10,230 or something like that. 520 00:33:10,230 --> 00:33:13,210 And sort of probably more seriously, 521 00:33:13,210 --> 00:33:18,120 what happens when you omit ones that you should include? 522 00:33:18,120 --> 00:33:21,113 And we'll see those omitted variable bias. 523 00:33:24,470 --> 00:33:27,430 First, we'll deal with the issue of overspecifying, 524 00:33:27,430 --> 00:33:32,100 which is including a variable that is irrelevant. 525 00:33:32,100 --> 00:33:33,320 It could be nonsense 526 00:33:33,320 --> 00:33:35,370 or just has nothing to do with the model. 527 00:33:40,180 --> 00:33:44,940 So suppose we specify this model with three regressors, 528 00:33:44,940 --> 00:33:48,500 and assumptions 1 through 4 are met, 529 00:33:48,500 --> 00:33:52,720 and everything is cool, but X3 has no effect. 530 00:33:52,720 --> 00:33:56,610 So X3 has no effect on Y, 531 00:33:56,610 --> 00:34:01,610 that in the true parameter in the population is zero. 532 00:34:02,100 --> 00:34:04,210 The slope is zero. 533 00:34:04,210 --> 00:34:08,563 Changing X3 has absolutely no effect on Y. 534 00:34:12,250 --> 00:34:13,193 What happens? 535 00:34:15,530 --> 00:34:17,520 Well, there's good news and bad news. 
536 00:34:17,520 --> 00:34:22,520 The good news is since, recall a few slides ago, 537 00:34:22,980 --> 00:34:27,120 that because B3 or beta-3 equals zero, 538 00:34:27,120 --> 00:34:28,790 it won't create bias. 539 00:34:28,790 --> 00:34:33,520 It won't effect any bias of beta-1 or beta-2. 540 00:34:33,520 --> 00:34:38,180 However, it will inflate the variance of the other betas. 541 00:34:38,180 --> 00:34:42,360 So it will increase the variance of beta-1 or beta-2. 542 00:34:42,360 --> 00:34:45,520 So, and there's a few ways that you can think about this. 543 00:34:45,520 --> 00:34:49,120 One, it takes away a degree of freedom for no reason. 544 00:34:49,120 --> 00:34:54,040 And two, it takes away some of the explanatory power 545 00:34:54,040 --> 00:34:55,810 of the other Xs. 546 00:34:55,810 --> 00:34:59,220 So especially if X3 547 00:34:59,220 --> 00:35:04,220 has any overlap with X2 and X1, 548 00:35:04,310 --> 00:35:07,300 it will take away some of the information 549 00:35:07,300 --> 00:35:11,960 that is in those variables 550 00:35:11,960 --> 00:35:16,150 and therefore take away their explanatory power. 551 00:35:16,150 --> 00:35:18,200 And we're gonna talk about this a bit more 552 00:35:18,200 --> 00:35:22,010 when we think about the variance of our beta-hats, 553 00:35:22,010 --> 00:35:25,833 which is sort of the end of this topic. 554 00:35:28,230 --> 00:35:32,910 So underspecifying is, in a sense, more serious 555 00:35:32,910 --> 00:35:36,140 because it creates bias. 556 00:35:36,140 --> 00:35:39,320 But sometimes we can know what the direction 557 00:35:39,320 --> 00:35:42,760 and maybe even size of the bias is. 558 00:35:42,760 --> 00:35:46,900 So assume that the true model is this, 559 00:35:46,900 --> 00:35:49,690 that only X2 and X1 560 00:35:49,690 --> 00:35:53,030 are the relevant regressors, 561 00:35:53,030 --> 00:35:55,680 and it's all well-behaved, 562 00:35:55,680 --> 00:35:57,663 assumptions 1 through 4 hold. 563 00:36:01,750 --> 00:36:04,160 We wanna know, what is beta-1? 564 00:36:04,160 --> 00:36:06,190 What is the effect of X1 on Y? 565 00:36:06,190 --> 00:36:10,970 However, we forget, for some reason, when we exclude X2, 566 00:36:10,970 --> 00:36:13,910 we don't know enough to include it, 567 00:36:13,910 --> 00:36:16,410 there is no data available, something like that, 568 00:36:16,410 --> 00:36:19,830 and so we run, instead, this regression 569 00:36:19,830 --> 00:36:21,870 with just a single regressor, X1. 570 00:36:21,870 --> 00:36:23,860 And I'm putting the A instead of B 571 00:36:23,860 --> 00:36:28,563 to sort of set this apart, so that it's clear, hopefully. 572 00:36:34,780 --> 00:36:37,950 So here's an example from the Wolters book. 573 00:36:37,950 --> 00:36:40,360 Again, we're looking at wages, 574 00:36:40,360 --> 00:36:43,743 and we're really interested in the returns to education. 575 00:36:45,050 --> 00:36:49,740 And we have these two regressors in the true model, 576 00:36:49,740 --> 00:36:51,410 education and ability. 577 00:36:51,410 --> 00:36:55,470 So you come with some innate ability, 578 00:36:55,470 --> 00:36:57,870 and you get education, 579 00:36:57,870 --> 00:37:00,350 and that's what drives your wage. 580 00:37:00,350 --> 00:37:02,020 Again, there's probably more, 581 00:37:02,020 --> 00:37:06,200 but just to make a simpler model. 582 00:37:06,200 --> 00:37:10,480 But we don't have a variable measuring ability. 
583 00:37:10,480 --> 00:37:14,737 So we run just a single regressor model, 00:37:16,130 --> 00:37:18,893 education, and we get A1. 00:37:20,590 --> 00:37:22,130 And that's what we see. 00:37:22,130 --> 00:37:26,600 And here, the error term, 00:37:26,600 --> 00:37:28,450 which we're calling v here, 00:37:28,450 --> 00:37:32,570 is beta-2 times ability plus the error, 00:37:32,570 --> 00:37:36,240 since we forgot about adding ability. 00:37:36,240 --> 00:37:41,240 And almost certainly, ability and education are correlated, 00:37:41,610 --> 00:37:46,550 that I would assume that those with more ability 00:37:46,550 --> 00:37:49,060 probably seek more education. 00:37:49,060 --> 00:37:51,020 That might be one hypothesis. 00:37:51,020 --> 00:37:55,083 But regardless of the direction, 00:37:56,840 --> 00:38:01,270 I think common sense says that your innate ability 00:38:01,270 --> 00:38:03,680 and how much education that you get 00:38:03,680 --> 00:38:06,013 would be somewhat correlated. 00:38:09,190 --> 00:38:11,950 So we can think about what's the magnitude 00:38:11,950 --> 00:38:14,393 and the direction of the bias in A1. 00:38:15,410 --> 00:38:19,070 So A1 is then beta-1 00:38:19,070 --> 00:38:21,970 plus beta-2 times d, 00:38:21,970 --> 00:38:25,990 where d is the slope of regressing X2 on X1. 00:38:25,990 --> 00:38:29,980 So it's how correlated are X1 and X2? 00:38:29,980 --> 00:38:31,230 What is the effect of it? 00:38:34,840 --> 00:38:39,780 And beta-hat-2 is the slope from the real model, 00:38:39,780 --> 00:38:41,953 had we been able to run this. 00:38:47,130 --> 00:38:51,660 So here, the expected value of A1 00:38:51,660 --> 00:38:55,350 is the expected value of beta-1-hat 00:38:55,350 --> 00:38:58,400 plus beta-2-hat times d. 00:38:58,400 --> 00:39:01,890 So the bias is this second term, 00:39:01,890 --> 00:39:04,370 beta-2-hat times d. 00:39:04,370 --> 00:39:06,200 Now, if 00:39:07,650 --> 00:39:10,140 beta-2-hat equals zero, 00:39:10,140 --> 00:39:13,860 so if ability has no effect on wages, 00:39:13,860 --> 00:39:15,960 or if d equals zero, 00:39:15,960 --> 00:39:20,500 that is, ability and education are uncorrelated, 00:39:20,500 --> 00:39:22,560 then A1 is unbiased. 00:39:22,560 --> 00:39:23,800 Then we're fine. 00:39:23,800 --> 00:39:27,393 And we sorta talked about that a few slides ago. 00:39:35,340 --> 00:39:37,350 So again, if d equals zero, 00:39:37,350 --> 00:39:38,993 then X1 and X2 are uncorrelated. 00:39:41,508 --> 00:39:43,091 And that would mean 00:39:45,430 --> 00:39:48,220 that the expected value of X2 given X1 00:39:48,220 --> 00:39:49,570 is just the expected value of X2. 00:39:49,570 --> 00:39:51,283 And then X2, 00:39:52,910 --> 00:39:56,170 even though it sits 00:39:56,170 --> 00:40:00,047 in the error term, 00:40:00,047 --> 00:40:02,540 is not correlated with X1, so it 00:40:02,540 --> 00:40:07,540 does not violate our assumption 4, and everything is cool. 00:40:07,540 --> 00:40:11,390 But again, that's going to be rather rare. 00:40:11,390 --> 00:40:15,440 And in our example of omitting ability, 00:40:15,440 --> 00:40:17,900 I think, and hope, you can see 00:40:17,900 --> 00:40:20,503 that that would not be very good reasoning.
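To make that bias formula concrete, here is a hedged simulation sketch (all names, coefficient values, and sample sizes are made up for illustration; this is not data from the textbook). It draws many samples in which ability truly matters and is positively correlated with education, runs the short regression that omits ability each time, and shows that the A1 estimates center on beta-1 plus beta-2 times d rather than on beta-1.

```python
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1, beta2 = 1.0, 0.5, 0.8      # "true" values, chosen for illustration
a1_estimates = []

for _ in range(2000):                    # repeated sampling, as in the definition of bias
    n = 400
    ability = rng.normal(size=n)
    # d > 0 by construction: more ability goes with more education.
    educ = 12 + 2 * ability + rng.normal(size=n)
    wage = beta0 + beta1 * educ + beta2 * ability + rng.normal(size=n)

    # Short regression: ability is omitted, so it sits in the error term.
    X_short = np.column_stack([np.ones(n), educ])
    a_hat, *_ = np.linalg.lstsq(X_short, wage, rcond=None)
    a1_estimates.append(a_hat[1])

d = 2 / 5    # slope from regressing ability (X2) on educ (X1), implied by the setup above
print("average A1 over samples:", np.mean(a1_estimates))
print("beta-1 plus beta-2 times d:", beta1 + beta2 * d)   # these two match, not beta-1
```

Setting beta2 to zero, or dropping the 2 * ability term from educ so that d is zero, makes the bias disappear, which matches the two exceptions just described.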
634 00:40:23,410 --> 00:40:27,740 So we can also think about what is the direction of it 00:40:28,900 --> 00:40:31,940 and maybe even the magnitude. 00:40:31,940 --> 00:40:33,680 We can use our intuition. 00:40:33,680 --> 00:40:36,870 So if d is greater than zero, 00:40:36,870 --> 00:40:40,940 X2 and X1 are positively correlated, 00:40:40,940 --> 00:40:43,670 folks that have more ability seek education, 00:40:43,670 --> 00:40:48,670 that would be my guess, but maybe that isn't true. 00:40:48,880 --> 00:40:51,250 If it's less than zero, 00:40:51,250 --> 00:40:54,163 people with more ability get less education. 00:40:55,490 --> 00:40:58,670 So you need to look at the effect of X2 on Y 00:40:58,670 --> 00:41:02,460 as well as the effect of X2 on X1, 00:41:02,460 --> 00:41:04,293 and we can sort of intuit this. 00:41:08,120 --> 00:41:11,080 So again, this is our real model, 00:41:11,080 --> 00:41:15,540 where education and ability are both included. 00:41:15,540 --> 00:41:18,820 So if beta-2 is greater than zero, 00:41:18,820 --> 00:41:22,660 which means more ability leads to higher wage, 00:41:22,660 --> 00:41:23,980 that would be my guess, 00:41:23,980 --> 00:41:26,940 and d is also greater than zero, 00:41:26,940 --> 00:41:31,940 more ability leads to more education, 00:41:31,990 --> 00:41:35,380 and that means that the bias is positive, 00:41:35,380 --> 00:41:40,380 that the estimated effect of education would be too large. 00:41:44,727 --> 00:41:47,230 A1 is greater than beta-1, 00:41:47,230 --> 00:41:51,193 so we overstate the effect of education. 00:41:56,620 --> 00:42:00,770 When we have a k variable equation, 00:42:00,770 --> 00:42:05,320 it's gonna depend again on how correlated 00:42:05,320 --> 00:42:10,220 the omitted variable is with the ones that we include, 00:42:10,220 --> 00:42:13,070 but know that every beta may be biased, 00:42:13,070 --> 00:42:17,057 not just those correlated with the omitted regressor. 00:42:20,410 --> 00:42:23,420 Here is how I approach it in general. 00:42:23,420 --> 00:42:26,710 I tend to use a big tent approach 00:42:26,710 --> 00:42:30,420 and include a lot of Xs, at least in the early model. 00:42:30,420 --> 00:42:35,420 So you always wanna base it on these three things, 00:42:35,610 --> 00:42:38,260 previous research, theory, 00:42:38,260 --> 00:42:40,870 and just sort of common sense and introspection. 00:42:40,870 --> 00:42:41,770 So you think about 00:42:43,418 --> 00:42:46,940 what does previous research suggest we include? 00:42:46,940 --> 00:42:48,940 What does theory suggest we include? 00:42:48,940 --> 00:42:52,823 And what does common sense suggest that we include? 00:42:54,000 --> 00:42:58,240 And we will learn down the road with hypothesis tests 00:42:58,240 --> 00:43:02,480 how to pare down and how to sort of come to the right model, 00:43:02,480 --> 00:43:05,900 to start with a lot of Xs, 00:43:05,900 --> 00:43:08,350 and there are tests to see, 00:43:08,350 --> 00:43:11,923 well, if we take these out, which is the best model? 00:43:15,900 --> 00:43:18,810 We're going to talk about the variance 00:43:18,810 --> 00:43:20,260 of the OLS estimators. 00:43:20,260 --> 00:43:23,013 What is the variance of beta-hat? 00:43:26,070 --> 00:43:28,580 We've spent a bunch of time 00:43:28,580 --> 00:43:32,290 looking at the assumptions under which they are unbiased.
682 00:43:32,290 --> 00:43:35,430 And now we wanna think about how efficient are they 683 00:43:35,430 --> 00:43:38,980 'cause we want the best unbiased estimator. 684 00:43:38,980 --> 00:43:43,100 So beta-hat 685 00:43:43,940 --> 00:43:48,270 has variance because it has the error term in them 686 00:43:48,270 --> 00:43:50,523 because it has the observed Y in them. 687 00:43:51,670 --> 00:43:54,340 So each time you, 688 00:43:54,340 --> 00:43:57,560 so even if you've specified the model right, 689 00:43:57,560 --> 00:44:01,450 each time you draw a sample and run it, 690 00:44:01,450 --> 00:44:05,810 you're going to get a slightly different beta-hat 691 00:44:07,525 --> 00:44:11,083 because there's a new set of error terms that are drawn. 692 00:44:13,510 --> 00:44:18,510 We're going to assume now that there is homoscedasticity, 693 00:44:19,100 --> 00:44:24,100 which has that the variance of the error term is a constant. 694 00:44:24,770 --> 00:44:26,680 We're gonna deal with that later 695 00:44:26,680 --> 00:44:28,620 of what do we do when that's not true, 696 00:44:28,620 --> 00:44:31,870 how to test for it and how to account for it. 697 00:44:31,870 --> 00:44:34,100 But now we're going to assume 698 00:44:34,100 --> 00:44:37,820 that the variance of u, given any value of X, 699 00:44:37,820 --> 00:44:41,053 equals a constant, which we call sigma squared. 700 00:44:42,150 --> 00:44:46,030 That is, it's the same, regardless of any value of X, 701 00:44:46,030 --> 00:44:50,860 so no value of X will change the variance 702 00:44:52,510 --> 00:44:57,510 of u, so sort of the shape of the bell curve. 703 00:44:57,940 --> 00:45:00,640 We know it's centered over zero, 704 00:45:00,640 --> 00:45:03,640 but it could be a very tall, skinny bell curve. 705 00:45:03,640 --> 00:45:06,250 It could be a very short, fat bell curve. 706 00:45:06,250 --> 00:45:08,350 The key here, and that's basically 707 00:45:08,350 --> 00:45:11,560 what we're trying to measure, but the assumption here 708 00:45:11,560 --> 00:45:13,970 is that no matter what the value of X is, 709 00:45:13,970 --> 00:45:16,890 that the shape of that bell curve is the same. 710 00:45:21,220 --> 00:45:24,410 Here's the formula for the variance of beta-hat-j. 711 00:45:24,410 --> 00:45:29,410 So you take one particular regressor, Xj. 712 00:45:29,430 --> 00:45:32,200 What is the variance of its beta-hat? 713 00:45:32,200 --> 00:45:33,810 And here it is. 714 00:45:33,810 --> 00:45:37,483 So it has three parts, sigma squared, 715 00:45:38,320 --> 00:45:41,823 SSTj, and one minus R squared j. 716 00:45:42,770 --> 00:45:46,000 So we're gonna think about sigma squared, 717 00:45:46,000 --> 00:45:49,380 that that is, that's the variance of u, 718 00:45:49,380 --> 00:45:51,500 which we already saw, and we're gonna see in a bit 719 00:45:51,500 --> 00:45:52,683 how to measure that. 720 00:45:55,150 --> 00:45:56,980 There's also SSTj, 721 00:45:56,980 --> 00:46:01,910 which is the total sample variation of X. 722 00:46:01,910 --> 00:46:06,870 So it is the sum of every X, 723 00:46:06,870 --> 00:46:10,910 so what everybody said on the survey 724 00:46:10,910 --> 00:46:13,660 times the mean of that X, 725 00:46:13,660 --> 00:46:16,653 so the mean value from our sample, 726 00:46:17,730 --> 00:46:22,730 subtract every individual's Xj from the mean of Xj, 727 00:46:23,350 --> 00:46:26,340 square it, and sum it up from one to N. 
728 00:46:26,340 --> 00:46:29,390 So for each of our N respondents, 729 00:46:29,390 --> 00:46:34,390 it's the total sum of squared, so squared X. 730 00:46:35,010 --> 00:46:38,620 And R squared j is the R squared 731 00:46:38,620 --> 00:46:41,010 of if you were to take Xj 732 00:46:41,010 --> 00:46:44,210 and regress it on all the other Xs. 733 00:46:44,210 --> 00:46:45,803 So if we're looking at X1, 734 00:46:48,510 --> 00:46:49,800 it would be the R squared. 735 00:46:49,800 --> 00:46:53,370 If we took X1 and put it on the left-hand side 736 00:46:53,370 --> 00:46:56,603 and regressed X2, X3, X4,...,Xk, 737 00:46:58,750 --> 00:47:02,590 and look at that R squared, that is what this is. 738 00:47:02,590 --> 00:47:06,160 So it's basically a measure of how correlated 739 00:47:06,160 --> 00:47:09,360 is this Xj with all of the others? 740 00:47:09,360 --> 00:47:12,990 A high R squared implies that they're highly correlated. 741 00:47:12,990 --> 00:47:15,883 A low R squared j means that they are not. 742 00:47:19,890 --> 00:47:21,070 Why does this matter? 743 00:47:21,070 --> 00:47:23,870 Why do we care about variance? 744 00:47:23,870 --> 00:47:26,600 Well, if we have a high variance, 745 00:47:26,600 --> 00:47:27,630 so if you can think about it 746 00:47:27,630 --> 00:47:32,630 as kind of a short, fat bell curve, 747 00:47:34,080 --> 00:47:37,030 we have a less precise estimator. 748 00:47:37,030 --> 00:47:41,080 There's a need for larger confidence intervals. 749 00:47:41,080 --> 00:47:44,440 And thus, we're less likely to find significance. 750 00:47:44,440 --> 00:47:47,146 So if you run a regression, 751 00:47:47,146 --> 00:47:49,770 and you don't find anything significant, 752 00:47:49,770 --> 00:47:51,620 it's kind of deflating. 753 00:47:51,620 --> 00:47:53,300 It's like, er, this is, 754 00:47:53,300 --> 00:47:57,580 it's not very interesting. 755 00:47:57,580 --> 00:47:59,953 So even from a really practical matter, 756 00:48:03,010 --> 00:48:06,630 you wanna find things, if they are significant, 757 00:48:06,630 --> 00:48:11,030 you want to find that 'cause that's sort of 758 00:48:12,160 --> 00:48:14,813 what's interesting to talk about. 759 00:48:18,250 --> 00:48:22,563 So I will go over each of the three components of variance. 760 00:48:27,330 --> 00:48:29,853 First is sigma squared. 761 00:48:30,830 --> 00:48:33,880 So a higher, note that in the formula, 762 00:48:33,880 --> 00:48:36,710 higher sigma squared means higher variance. 763 00:48:36,710 --> 00:48:39,820 It means that the error terms are all over the place. 764 00:48:39,820 --> 00:48:43,693 It means that we have a short, fat, 765 00:48:45,710 --> 00:48:47,893 very spread out bell curve. 766 00:48:48,830 --> 00:48:50,040 Another way of thinking about it 767 00:48:50,040 --> 00:48:52,230 is more noise in the equation 768 00:48:52,230 --> 00:48:55,690 makes it harder to predict partial effects. 769 00:48:55,690 --> 00:49:00,240 And know that it is a population measure. 770 00:49:00,240 --> 00:49:03,090 It's independent of the sample size. 771 00:49:03,090 --> 00:49:07,810 And it is unknown, but there is a way to estimate it, 772 00:49:07,810 --> 00:49:09,210 which we'll see in a minute. 773 00:49:15,415 --> 00:49:16,960 SSTj, again, 774 00:49:16,960 --> 00:49:20,773 is the total variation in X. 775 00:49:22,160 --> 00:49:26,410 And the more variation in X, 776 00:49:26,410 --> 00:49:28,260 the smaller the variance. 777 00:49:28,260 --> 00:49:31,300 It's in the denominator, so you want a big SSTj. 
778 00:49:31,300 --> 00:49:35,280 So that means that you want Xs 779 00:49:35,280 --> 00:49:37,533 to have some variation. 780 00:49:38,410 --> 00:49:42,903 So if Xj here is age, 781 00:49:44,420 --> 00:49:46,163 you want your sample, 782 00:49:47,259 --> 00:49:49,680 you always want your sample to be random, 783 00:49:49,680 --> 00:49:52,623 but if you're drawing from, 784 00:49:54,660 --> 00:49:59,220 folks from across the age spectrum 785 00:49:59,220 --> 00:50:02,340 from, say, 18 to 99, 786 00:50:02,340 --> 00:50:05,840 the Xs are going to be more spread out. 787 00:50:05,840 --> 00:50:08,440 And the partial effect of age 788 00:50:08,440 --> 00:50:11,930 is going to be a lot easier to predict. 789 00:50:11,930 --> 00:50:14,030 So you can think about it that 790 00:50:14,030 --> 00:50:18,340 if you're trying to eyeball the slope of a line, 791 00:50:18,340 --> 00:50:22,430 and all of the dots are all sort of converged 792 00:50:22,430 --> 00:50:26,070 around a single X, so you only have folks 793 00:50:26,070 --> 00:50:30,230 that are 31, 30, 31, 32, 31, 30, 794 00:50:30,230 --> 00:50:31,670 it's gonna be hard to sort of, 795 00:50:31,670 --> 00:50:33,583 what is the slope of this line? 796 00:50:36,220 --> 00:50:37,800 This is another example 797 00:50:37,800 --> 00:50:41,250 where more info leads to lower variance. 798 00:50:41,250 --> 00:50:45,060 And increasing sample size unambiguously 799 00:50:46,620 --> 00:50:51,280 increases the variation in X 800 00:50:51,280 --> 00:50:55,850 because we're not dividing by N here. 801 00:50:55,850 --> 00:50:57,660 It's just summing them up. 802 00:50:57,660 --> 00:50:59,573 So the more Xs that you have, 803 00:51:00,900 --> 00:51:03,453 unless every single one is on the mean, 804 00:51:04,800 --> 00:51:08,750 adding N will make SST get bigger 805 00:51:08,750 --> 00:51:10,823 and will decrease your variance. 806 00:51:16,400 --> 00:51:21,400 R squared j is the R squared that if you would take Xj 807 00:51:21,780 --> 00:51:24,610 and put it on the left side and run a regression 808 00:51:24,610 --> 00:51:26,250 with all the other Xs on the right side, 809 00:51:26,250 --> 00:51:27,740 what's the R squared? 810 00:51:27,740 --> 00:51:31,430 So if R squared j is one, 811 00:51:31,430 --> 00:51:34,343 we have a perfect linear combination. 812 00:51:36,844 --> 00:51:40,060 And you'll see that we're dividing by zero. 813 00:51:40,060 --> 00:51:42,193 And again, it makes this blow up too. 814 00:51:43,260 --> 00:51:47,420 And in this case, Xj adds no new information. 815 00:51:47,420 --> 00:51:51,040 So you want Xj 816 00:51:51,040 --> 00:51:54,850 to say something that the other Xs don't say. 817 00:51:54,850 --> 00:51:59,290 And the more it says something that the other Xs don't say, 818 00:51:59,290 --> 00:52:02,130 the more new information that you're getting here, 819 00:52:02,130 --> 00:52:04,823 and the lower the variance. 820 00:52:05,670 --> 00:52:08,010 There's a way, and I will show you, 821 00:52:08,010 --> 00:52:13,010 to calculate the variance inflation factor, 822 00:52:13,023 --> 00:52:14,203 or the VIF. 823 00:52:14,203 --> 00:52:18,363 And in SPSS, you'll find it under collinearity diagnostics. 824 00:52:19,630 --> 00:52:24,233 And the VIF is one divided by one minus R squared j. 825 00:52:25,430 --> 00:52:27,060 If it's less than four, 826 00:52:27,060 --> 00:52:30,700 this is just sort of a rule of thumb I've learned, 827 00:52:30,700 --> 00:52:34,200 if your VIF is less than four, it really isn't a problem. 
828 00:52:34,200 --> 00:52:36,700 If it's less than 10, it isn't a major problem.
829 00:52:36,700 --> 00:52:40,223 If it's more than 10, you may have a problem.
830 00:52:46,130 --> 00:52:47,710 Here, I will show you
831 00:52:47,710 --> 00:52:51,980 that if you exclude a relevant variable,
832 00:52:51,980 --> 00:52:56,630 it introduces bias, but it also decreases the variance.
833 00:52:56,630 --> 00:52:59,190 So there's kind of a trade-off here.
834 00:52:59,190 --> 00:53:02,983 So think back to our k=2 example,
835 00:53:04,140 --> 00:53:06,853 where the real model, say, includes both X1 and X2,
836 00:53:08,010 --> 00:53:10,293 but we run another model
837 00:53:11,140 --> 00:53:14,890 where we exclude X2.
838 00:53:14,890 --> 00:53:17,920 So beta-1-hat is where we include X2,
839 00:53:17,920 --> 00:53:22,533 and its variance is the formula that we know well.
840 00:53:25,039 --> 00:53:26,580 In the bottom box here,
841 00:53:26,580 --> 00:53:31,580 the variance of A1, where we exclude X2, no longer has the one minus R squared 1 term.
842 00:53:35,848 --> 00:53:37,233 Since R squared 1
843 00:53:38,130 --> 00:53:41,680 is always a number between zero and one,
844 00:53:41,680 --> 00:53:46,460 one minus that number is also a number between zero and one.
845 00:53:46,460 --> 00:53:49,310 And by including that term
846 00:53:49,310 --> 00:53:52,133 in the third box down,
847 00:53:53,860 --> 00:53:55,210 you're dividing
848 00:53:55,210 --> 00:53:58,840 by a number less than one,
849 00:53:58,840 --> 00:54:02,350 which is like multiplying by a number greater than one.
850 00:54:02,350 --> 00:54:06,380 So the variance of A1 is going to be less
851 00:54:06,380 --> 00:54:08,810 than the variance of beta-1-hat.
852 00:54:08,810 --> 00:54:11,853 So that's the trade-off, and that's why it happens.
853 00:54:17,080 --> 00:54:20,033 So here's more about that.
854 00:54:21,520 --> 00:54:26,117 And it depends on how much new information X2 adds.
855 00:54:27,420 --> 00:54:30,290 The more new information,
856 00:54:30,290 --> 00:54:34,380 the smaller this R squared is,
857 00:54:34,380 --> 00:54:38,300 and the less effect it has on the variance.
858 00:54:38,300 --> 00:54:40,540 So you can play with the math
859 00:54:40,540 --> 00:54:44,423 and see if you can see what is going on here.
860 00:54:48,760 --> 00:54:52,430 So the variance of A1 is always smaller
861 00:54:52,430 --> 00:54:55,150 than the variance of beta-1-hat,
862 00:54:55,150 --> 00:54:58,280 unless X2 is uncorrelated with X1,
863 00:54:58,280 --> 00:55:00,123 and then it would be the same.
864 00:55:01,200 --> 00:55:05,780 And if X1 and X2
865 00:55:05,780 --> 00:55:08,513 are correlated,
866 00:55:13,070 --> 00:55:16,740 the bias trade-off depends on whether beta-2
867 00:55:18,270 --> 00:55:20,373 is zero or not.
868 00:55:21,220 --> 00:55:25,510 But the variance of beta-1-hat
869 00:55:25,510 --> 00:55:27,263 is always going to be greater.
870 00:55:33,500 --> 00:55:38,090 The bottom line is that adding an irrelevant variable
871 00:55:38,090 --> 00:55:43,003 exacerbates multicollinearity and increases variance.
872 00:55:44,520 --> 00:55:47,840 And adding observations can decrease variance,
873 00:55:47,840 --> 00:55:49,903 but it doesn't address bias.
874 00:55:52,800 --> 00:55:55,120 To calculate the variance,
875 00:55:55,120 --> 00:55:58,150 we need to estimate sigma squared.
876 00:55:58,150 --> 00:56:02,140 So we want an unbiased estimator of that,
877 00:56:02,140 --> 00:56:04,140 which is sigma squared hat.
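As a sanity check on that bias-versus-variance trade-off, here is a small simulation sketch in Python; the coefficient values, sample size, and variable names are all assumptions made up for illustration. It holds the regressors fixed, redraws the error many times, and compares the estimator that includes X2 (beta-1-hat) with the one that omits it (A1).

import numpy as np

rng = np.random.default_rng(2)
n, reps = 200, 5000
b0, b1, b2 = 1.0, 0.5, 0.8            # "true" coefficients, assumed for the simulation

# fix the regressors across replications (the variance comparison above is
# conditional on the Xs), with x1 and x2 correlated by construction
x1 = rng.normal(0, 1, n)
x2 = 0.6 * x1 + rng.normal(0, 1, n)
X_long = np.column_stack([np.ones(n), x1, x2])   # include x2 -> beta_1_hat
X_short = np.column_stack([np.ones(n), x1])      # omit x2    -> A1

long_est, short_est = [], []
for _ in range(reps):
    y = b0 + b1 * x1 + b2 * x2 + rng.normal(0, 1, n)   # redraw only the error term
    long_est.append(np.linalg.lstsq(X_long, y, rcond=None)[0][1])
    short_est.append(np.linalg.lstsq(X_short, y, rcond=None)[0][1])

print("include x2: mean %.3f  var %.5f" % (np.mean(long_est), np.var(long_est)))
print("omit x2:    mean %.3f  var %.5f" % (np.mean(short_est), np.var(short_est)))
# expected pattern: omitting x2 gives the smaller variance but a biased mean
# (well away from b1 = 0.5), while including x2 is unbiased with larger variance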
878 00:56:04,140 --> 00:56:08,417 And we do this by using the residuals,
879 00:56:09,400 --> 00:56:12,840 which are, for every person,
880 00:56:12,840 --> 00:56:15,300 ui-hat equals Yi minus Y-i-hat,
881 00:56:15,300 --> 00:56:19,810 so what they actually said versus the predicted value
882 00:56:19,810 --> 00:56:22,613 based on what their Xs are.
883 00:56:31,807 --> 00:56:34,170 Sigma squared hat is the sum
884 00:56:34,170 --> 00:56:38,630 of the ui-hat squared
885 00:56:38,630 --> 00:56:41,630 normalized by the degrees of freedom.
886 00:56:41,630 --> 00:56:46,520 So it's the sum of squared residuals
887 00:56:46,520 --> 00:56:49,600 divided by n minus k minus one.
888 00:56:49,600 --> 00:56:53,180 And note that as N goes to infinity,
889 00:56:53,180 --> 00:56:55,470 sigma squared hat settles in around the true sigma squared,
890 00:56:55,470 --> 00:56:57,280 while SSTj keeps growing.
891 00:56:57,280 --> 00:57:00,760 So again, a bigger N
892 00:57:00,760 --> 00:57:04,270 makes the overall variance
893 00:57:04,270 --> 00:57:08,163 of beta-1-hat smaller and smaller.
894 00:57:15,040 --> 00:57:17,880 The standard error of the regression
895 00:57:17,880 --> 00:57:20,840 is the positive square root of this estimate,
896 00:57:20,840 --> 00:57:25,840 the square root of sigma-hat squared.
897 00:57:25,890 --> 00:57:29,510 And this estimate is unbiased
898 00:57:29,510 --> 00:57:33,970 only if assumption 5,
899 00:57:33,970 --> 00:57:35,943 homoscedasticity, holds.
900 00:57:37,220 --> 00:57:40,600 So now you could, in theory,
901 00:57:40,600 --> 00:57:45,090 calculate each of the components
902 00:57:45,090 --> 00:57:50,090 and calculate the variance of beta-1-hat.
903 00:57:52,970 --> 00:57:57,940 So here is our final problem set for this,
904 00:57:57,940 --> 00:58:01,410 and I want you to think about collinearity,
905 00:58:01,410 --> 00:58:04,210 about omitted variable bias,
906 00:58:04,210 --> 00:58:07,990 and the variance of beta-hat,
907 00:58:07,990 --> 00:58:12,070 and, as each one increases or decreases,
908 00:58:12,070 --> 00:58:14,583 what happens to the variance and why?
909 00:58:17,530 --> 00:58:18,900 This is what we did.
910 00:58:18,900 --> 00:58:23,630 Thanks for watching this,
911 00:58:23,630 --> 00:58:25,533 and have a good day.
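As a short follow-up to the sigma squared hat discussion above, here is a minimal sketch in Python with statsmodels of the estimator SSR / (n - k - 1) and the standard error of the regression; the simulated data and coefficient values are assumptions for illustration, not the course's SPSS output.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, k = 500, 2
x1 = rng.normal(12, 2, n)
x2 = 40 - 0.8 * x1 + rng.normal(0, 3, n)
y = 1.0 + 0.5 * x1 + 0.2 * x2 + rng.normal(0, 2, n)   # true sigma = 2 (assumed)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

ssr = np.sum(fit.resid ** 2)          # sum of squared residuals
sigma2_hat = ssr / (n - k - 1)        # unbiased under assumptions 1 through 5
ser = np.sqrt(sigma2_hat)             # standard error of the regression

print(sigma2_hat, ser)
print(fit.mse_resid)                  # statsmodels' own SSR / df_resid should match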