1
00:00:02,830 --> 00:00:05,240
- [Instructor] Hello, and
welcome to the video lecture
2
00:00:05,240 --> 00:00:09,650
on multiple regression
analysis and estimation.
3
00:00:09,650 --> 00:00:11,400
So we're gonna be looking at
4
00:00:12,310 --> 00:00:15,430
how to do a linear regression
5
00:00:15,430 --> 00:00:19,230
with more than one regressor
6
00:00:19,230 --> 00:00:24,230
and look at some of the
properties of the model,
7
00:00:25,610 --> 00:00:29,180
what makes a good model,
some of the assumptions
8
00:00:29,180 --> 00:00:33,070
that make for an unbiased estimator,
9
00:00:33,070 --> 00:00:37,310
much like we did last time,
but now adding more regressors.
10
00:00:37,310 --> 00:00:39,830
And we're also gonna think about
11
00:00:41,970 --> 00:00:43,360
how to understand better
12
00:00:43,360 --> 00:00:46,143
what makes for a more efficient estimator.
13
00:00:49,750 --> 00:00:53,290
We'll start with K=2, the two-regressor case,
14
00:00:53,290 --> 00:00:56,290
and then look at the general form
15
00:00:56,290 --> 00:01:00,590
where there's K regressors,
look at the residuals,
16
00:01:00,590 --> 00:01:03,423
how we derive the OLS estimator,
17
00:01:05,260 --> 00:01:08,930
look at the ceteris paribus assumption,
18
00:01:08,930 --> 00:01:11,750
goodness of fit, the degrees of freedom,
19
00:01:11,750 --> 00:01:14,763
and then again, looking
at those assumptions,
20
00:01:15,860 --> 00:01:18,080
which, if they all hold,
21
00:01:18,080 --> 00:01:23,080
we can say that OLS is
the best unbiased estimator,
22
00:01:24,070 --> 00:01:28,610
and then last, looking at
23
00:01:28,610 --> 00:01:31,550
how do we know what the right model is?
24
00:01:31,550 --> 00:01:36,550
What happens if we add
an irrelevant regressor,
25
00:01:37,510 --> 00:01:41,207
and what happens if we
miss a relevant regressor?
26
00:01:43,790 --> 00:01:46,750
So here is the homework.
27
00:01:46,750 --> 00:01:50,363
So I want you to think about
what is multicollinearity,
28
00:01:51,250 --> 00:01:55,513
what's the consequence and
what's perfect collinearity,
29
00:01:56,410 --> 00:01:59,860
what happens if you omit a variable,
30
00:01:59,860 --> 00:02:02,370
and when doesn't it matter?
31
00:02:02,370 --> 00:02:07,030
Three, I want you to
unpack the three factors
32
00:02:07,030 --> 00:02:11,580
of the OLS estimator's variance.
33
00:02:11,580 --> 00:02:14,420
What drives the variance
in this estimator?
34
00:02:14,420 --> 00:02:16,559
And then there are also a couple
35
00:02:16,559 --> 00:02:19,893
of computer homework problems.
36
00:02:21,680 --> 00:02:25,670
Last time, we looked at
the one regressor model.
37
00:02:25,670 --> 00:02:27,190
Now we're gonna look at the two.
38
00:02:27,190 --> 00:02:29,160
So now we're looking at a model
39
00:02:29,160 --> 00:02:32,080
where on the left-hand side is wage,
40
00:02:32,080 --> 00:02:34,090
so what drives whether or not somebody
41
00:02:34,090 --> 00:02:36,480
has a low or a high wage.
42
00:02:36,480 --> 00:02:39,280
And we're looking at two regressors,
43
00:02:39,280 --> 00:02:42,500
education and experience.
44
00:02:42,500 --> 00:02:45,210
So we're really interested
mostly in education.
45
00:02:45,210 --> 00:02:47,660
So basically does
education pay for itself?
46
00:02:47,660 --> 00:02:49,813
What's the returns to education?
47
00:02:50,935 --> 00:02:54,593
And we may also be interested in experience,
48
00:02:56,050 --> 00:03:01,050
but we know that if we omit experience
49
00:03:01,080 --> 00:03:04,780
that it goes into the error term.
50
00:03:04,780 --> 00:03:09,400
And it is very likely
that if we ran a survey
51
00:03:09,400 --> 00:03:13,080
that education and experience
52
00:03:13,080 --> 00:03:15,413
may be related,
53
00:03:17,580 --> 00:03:20,980
that they are correlated.
54
00:03:20,980 --> 00:03:24,730
For example, the more
education that you have,
55
00:03:24,730 --> 00:03:26,440
maybe you've
56
00:03:26,440 --> 00:03:28,260
spent more of your life on education,
57
00:03:28,260 --> 00:03:31,610
so you have less experience.
58
00:03:31,610 --> 00:03:33,560
That may be one way of thinking about it,
59
00:03:33,560 --> 00:03:36,500
but you can think they're
almost certainly correlated.
60
00:03:36,500 --> 00:03:41,500
So if they are, and we
forget to put in experience,
61
00:03:42,050 --> 00:03:46,023
then experience, we know,
is in the error term,
62
00:03:47,772 --> 00:03:49,623
and then, almost certainly,
63
00:03:51,870 --> 00:03:54,310
our assumption of the error term
64
00:03:54,310 --> 00:03:57,800
and the regressors not being
correlated is violated.
65
00:03:57,800 --> 00:04:01,373
And thus, we will almost
certainly have a biased estimator.
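To make this concrete, here is a small numeric sketch in Python (the data are made up for illustration, not from the lecture): the outcome is built exactly from two correlated regressors, and omitting the second one biases the slope on the first.

```python
# Hypothetical wage-style data: y is built exactly from
# y = 1 + 2*x1 + 3*x2, where x1 (say, education) and x2 (say,
# experience) are correlated. Omitting x2 biases the slope on x1.

def ols_slope(x, y):
    # Simple-regression OLS slope: cov(x, y) / var(x).
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

x1 = [0, 1, 2, 3]   # toy values for the regressor of interest
x2 = [1, 1, 3, 3]   # positively correlated with x1
y = [1 + 2*a + 3*b for a, b in zip(x1, x2)]  # true beta1 = 2

short_slope = ols_slope(x1, y)  # regression that omits x2
print(short_slope)              # 4.4, far above the true beta1 of 2
```

The omitted regressor's effect gets loaded onto x1 because the two are correlated, which is exactly the bias described above.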
66
00:04:05,350 --> 00:04:06,733
As another example,
67
00:04:07,570 --> 00:04:11,290
we might wanna look at a
model where the test score,
68
00:04:11,290 --> 00:04:15,400
so what is the test score of some school
69
00:04:15,400 --> 00:04:19,480
based on how much that school district
70
00:04:21,710 --> 00:04:23,670
spends on schools, their expenditure,
71
00:04:23,670 --> 00:04:26,090
and the average household income.
72
00:04:26,090 --> 00:04:28,260
So we're really interested,
73
00:04:28,260 --> 00:04:30,060
again, in expenditure.
74
00:04:30,060 --> 00:04:33,870
Does higher expenditure
lead to higher test scores?
75
00:04:33,870 --> 00:04:35,990
That was, I would say,
what we would think,
76
00:04:35,990 --> 00:04:37,690
and maybe even what we would hope,
77
00:04:39,510 --> 00:04:43,210
but if we forget to put in average income,
78
00:04:43,210 --> 00:04:45,023
almost certainly,
79
00:04:48,120 --> 00:04:51,610
expenditure and income are correlated.
80
00:04:51,610 --> 00:04:55,320
So again, if we forget
to put it into the model,
81
00:04:55,320 --> 00:04:59,240
then we're almost certainly
going to get a biased estimate
82
00:04:59,240 --> 00:05:03,373
of the effect of expenditure.
83
00:05:09,900 --> 00:05:13,150
So in each case, the beta-1,
84
00:05:13,150 --> 00:05:15,710
the one that we're most interested in,
85
00:05:15,710 --> 00:05:18,440
is a measure of
86
00:05:18,440 --> 00:05:22,560
if we increase X1 by one unit,
87
00:05:22,560 --> 00:05:26,493
how much does Y increase?
88
00:05:27,410 --> 00:05:32,260
So basically we want to include
all the relevant regressors,
89
00:05:32,260 --> 00:05:33,280
so we can account for them,
90
00:05:33,280 --> 00:05:35,580
so they don't end up in the error term,
91
00:05:35,580 --> 00:05:39,550
so we don't have a biased estimator.
92
00:05:39,550 --> 00:05:41,490
We want to account for everything,
93
00:05:41,490 --> 00:05:43,190
so we can make a good case
94
00:05:43,190 --> 00:05:48,143
that the estimator for
our beta-1 is unbiased.
95
00:05:52,450 --> 00:05:56,450
Remember back that an
absolutely essential assumption
96
00:05:56,450 --> 00:05:59,040
is that our error term is uncorrelated
97
00:05:59,040 --> 00:06:00,780
with any of our regressors.
98
00:06:00,780 --> 00:06:02,620
So now that we have two regressors,
99
00:06:02,620 --> 00:06:05,853
it must be uncorrelated with both.
100
00:06:08,980 --> 00:06:12,220
Thinking ahead to have
an unbiased estimator,
101
00:06:12,220 --> 00:06:15,520
the error term must be uncorrelated
102
00:06:15,520 --> 00:06:18,710
with both regressors.
103
00:06:18,710 --> 00:06:21,500
So that would be
104
00:06:21,500 --> 00:06:24,630
that in the population
105
00:06:24,630 --> 00:06:28,940
that the average residual
106
00:06:28,940 --> 00:06:31,800
for any individual, so the expected value,
107
00:06:31,800 --> 00:06:33,110
would be zero.
108
00:06:33,110 --> 00:06:36,620
And that would hold true
for any value of X1 or X2.
109
00:06:36,620 --> 00:06:40,970
So no matter what the
respondents say on the survey,
110
00:06:40,970 --> 00:06:43,840
no matter what their X1 and X2 is,
111
00:06:43,840 --> 00:06:48,193
the expected value of
the residual is zero.
112
00:06:49,820 --> 00:06:54,820
So remember
that that's important
113
00:06:54,830 --> 00:06:58,303
because if that is not true,
114
00:07:00,398 --> 00:07:03,370
if dU/dX1
115
00:07:03,370 --> 00:07:05,380
does not equal zero,
116
00:07:05,380 --> 00:07:09,070
then we can't tell as we change X1
117
00:07:09,070 --> 00:07:12,350
whether the change in the observed Y
118
00:07:12,350 --> 00:07:17,263
is due to a change in Y-hat
or a change in the error term.
119
00:07:23,130 --> 00:07:26,440
Much the same holds when we
look at the general case,
120
00:07:26,440 --> 00:07:28,030
so K regressors.
121
00:07:28,030 --> 00:07:31,270
So in many cases, we're
going to have more than one,
122
00:07:31,270 --> 00:07:35,920
more than two, but a good
number of regressors.
123
00:07:35,920 --> 00:07:39,590
So to think about the original example,
124
00:07:39,590 --> 00:07:42,420
that wages are probably affected,
125
00:07:42,420 --> 00:07:45,810
not just by education and experience,
126
00:07:45,810 --> 00:07:50,810
by training, by ability, by
all kinds of other things.
127
00:07:51,050 --> 00:07:52,760
Test scores, in the same way,
128
00:07:52,760 --> 00:07:56,903
have many other factors
that would affect them.
129
00:07:58,260 --> 00:08:02,700
When we think of the example
of local food expenditures,
130
00:08:02,700 --> 00:08:06,930
it wouldn't just be income,
but many other factors,
131
00:08:06,930 --> 00:08:11,710
preferences, where you
live, and household size
132
00:08:11,710 --> 00:08:13,460
and all kinds of things like that.
133
00:08:13,460 --> 00:08:15,380
So you can probably
think of other examples
134
00:08:15,380 --> 00:08:18,913
of where many Xs might affect our Y.
135
00:08:22,890 --> 00:08:25,360
This equation at the
top is the general form,
136
00:08:25,360 --> 00:08:27,323
where we have K regressors.
137
00:08:28,420 --> 00:08:31,320
And so when we run it
through our software,
138
00:08:31,320 --> 00:08:34,350
there are K plus one parameters,
139
00:08:34,350 --> 00:08:37,910
beta-0, beta-1, through beta-k.
140
00:08:37,910 --> 00:08:40,550
Again, beta-0 is the intercept.
141
00:08:40,550 --> 00:08:44,520
And remember that we sort
of slide it up and down
142
00:08:44,520 --> 00:08:48,950
so that the expected
value of u equals zero.
143
00:08:48,950 --> 00:08:52,150
And in some cases, it could be interpreted
144
00:08:52,150 --> 00:08:56,800
as the value of Y, or
the expected value of Y,
145
00:08:56,800 --> 00:09:01,470
when all the Xs,
146
00:09:01,470 --> 00:09:03,030
all the regressors,
147
00:09:03,030 --> 00:09:05,460
equal zero.
148
00:09:05,460 --> 00:09:08,870
And in many cases, we
think of beta-1 to beta-k
149
00:09:08,870 --> 00:09:11,330
as the slope parameters
150
00:09:11,330 --> 00:09:13,933
'cause they measure the sort of change,
151
00:09:15,190 --> 00:09:18,080
the slope of the line for that regressor,
152
00:09:18,080 --> 00:09:22,203
and u is the disturbance term, as always.
153
00:09:28,120 --> 00:09:30,550
So once we get some data,
154
00:09:30,550 --> 00:09:35,550
and we run it through a software package,
155
00:09:35,650 --> 00:09:40,450
like SPSS, we get a number of things back.
156
00:09:40,450 --> 00:09:43,090
One is we get estimates, these beta-hats.
157
00:09:43,090 --> 00:09:45,550
So we get K plus 1 beta-hats.
158
00:09:45,550 --> 00:09:49,870
And then if we take
everybody's X and plug it in
159
00:09:49,870 --> 00:09:54,373
and multiply each of these,
160
00:09:55,540 --> 00:09:58,920
each of their Xs times the beta-hat
161
00:09:58,920 --> 00:10:01,490
and add them all up,
162
00:10:01,490 --> 00:10:04,440
you'll get the Y-hat for that individual.
163
00:10:04,440 --> 00:10:09,020
So the forecast of what we would say
164
00:10:09,020 --> 00:10:11,123
or what would be our best guess,
165
00:10:12,002 --> 00:10:15,930
for that individual who
answered X in that way,
166
00:10:15,930 --> 00:10:19,103
this is what the predicted
value of their Y is.
167
00:10:20,840 --> 00:10:24,480
And note that when we put hats on things,
168
00:10:24,480 --> 00:10:26,867
that that's always the estimate.
169
00:10:26,867 --> 00:10:29,400
That's just how we sort of denote it.
170
00:10:29,400 --> 00:10:34,310
And if there are N observations,
171
00:10:34,310 --> 00:10:35,650
so we do a survey,
172
00:10:35,650 --> 00:10:40,410
and we get N observations,
173
00:10:40,410 --> 00:10:43,060
again, what OLS does
174
00:10:43,060 --> 00:10:47,160
is it minimizes the sum
of squared residuals.
175
00:10:47,160 --> 00:10:51,020
So everybody has a ui-hat,
176
00:10:51,020 --> 00:10:54,170
and that's the difference
between their Y-hat
177
00:10:54,170 --> 00:10:58,700
and their, what they actually
said Y on the survey.
178
00:10:58,700 --> 00:11:00,900
So we square those and add them up.
179
00:11:00,900 --> 00:11:04,610
And that is
180
00:11:04,610 --> 00:11:07,363
how we get these estimates.
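Minimizing the sum of squared residuals leads to the normal equations (X'X) b = X'y. Here is a minimal pure-Python sketch of solving them for the K=2 case, using made-up data (y is built with no error term, so OLS recovers the betas exactly):

```python
# A sketch of OLS with K = 2 regressors: build the normal equations
# (X'X) b = X'y and solve them by Gaussian elimination. Toy data.

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

x1 = [0, 1, 2, 3]
x2 = [1, 1, 3, 3]
y = [1 + 2*a + 3*b for a, b in zip(x1, x2)]   # true betas: 1, 2, 3
X = [[1.0, a, b] for a, b in zip(x1, x2)]     # design matrix with intercept column

XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
beta_hat = solve(XtX, Xty)
print([round(b, 6) for b in beta_hat])        # [1.0, 2.0, 3.0]
```

In practice a package like SPSS does this for you; the sketch just shows what "minimizing the sum of squared residuals" computes.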
181
00:11:13,100 --> 00:11:16,010
We can have a sample regression line.
182
00:11:16,010 --> 00:11:19,423
So that is everybody's Y-hat,
183
00:11:20,600 --> 00:11:24,580
which we get by taking
everybody's X
184
00:11:24,580 --> 00:11:28,670
and multiplying it
by the beta-hats,
185
00:11:28,670 --> 00:11:29,800
as we see.
186
00:11:29,800 --> 00:11:31,860
Note that everybody's Y-hat,
187
00:11:31,860 --> 00:11:35,720
every Yi-hat is on the regression line.
188
00:11:35,720 --> 00:11:36,740
Why is that?
189
00:11:36,740 --> 00:11:39,913
So think about that, and
we'll discuss it in class.
190
00:11:46,610 --> 00:11:51,550
So we interpret these
beta-hats as the partial effect
191
00:11:51,550 --> 00:11:56,550
of a one-unit change of that Xi on Y.
192
00:11:56,720 --> 00:12:01,160
So as we change that X by a unit,
193
00:12:01,160 --> 00:12:06,160
that beta-hat denotes
the expected change in Y.
194
00:12:06,480 --> 00:12:10,140
So, if we go and change every X
195
00:12:10,140 --> 00:12:11,290
by some number of units
196
00:12:14,250 --> 00:12:16,530
and multiply them by the beta-hats,
197
00:12:16,530 --> 00:12:20,653
we get the expected change in Y-hat.
198
00:12:21,610 --> 00:12:24,300
Or you can
199
00:12:24,300 --> 00:12:27,760
hold all Xs constant except one
200
00:12:27,760 --> 00:12:32,330
and only change one X, and
then beta-hat for that X
201
00:12:32,330 --> 00:12:36,443
would be the change in Y
just for changing that one.
202
00:12:41,800 --> 00:12:45,870
And this is sort of the
magic of regression,
203
00:12:45,870 --> 00:12:50,620
that it allows us to
really isolate the effect
204
00:12:50,620 --> 00:12:54,193
of changing one X while
holding all else the same.
205
00:12:55,770 --> 00:12:59,860
So we don't have to go
206
00:12:59,860 --> 00:13:02,723
and collect data where,
207
00:13:04,520 --> 00:13:07,660
for the first many Xs,
208
00:13:07,660 --> 00:13:10,770
everybody answers it the same way.
209
00:13:10,770 --> 00:13:13,800
And then only on, say, the
fourth or fifth question,
210
00:13:13,800 --> 00:13:18,730
do they change their answer.
211
00:13:18,730 --> 00:13:23,250
Instead, we can collect data
where there are a lot of different answers.
212
00:13:23,250 --> 00:13:26,333
Basically, everybody answers
it slightly differently,
213
00:13:28,680 --> 00:13:31,640
but the magic of a regression
214
00:13:31,640 --> 00:13:35,040
is we can still isolate,
holding all else equal,
215
00:13:35,040 --> 00:13:38,280
what is the change by just changing one X
216
00:13:38,280 --> 00:13:40,033
and everything else stays the same?
217
00:13:42,520 --> 00:13:47,100
We can also measure the
effect of changing a lot of Xs
218
00:13:47,100 --> 00:13:49,290
or even all of them.
219
00:13:49,290 --> 00:13:54,083
So all we have to do is plug
220
00:13:55,230 --> 00:14:00,203
these changes in X
into the equation,
221
00:14:01,450 --> 00:14:06,270
and multiply
by the various beta-hats,
222
00:14:07,270 --> 00:14:10,063
add them up, and there you go.
223
00:14:11,510 --> 00:14:14,850
Or, you could change by one unit,
224
00:14:14,850 --> 00:14:19,770
or by basically any number of units.
225
00:14:19,770 --> 00:14:23,670
If you do change every X by one unit,
226
00:14:23,670 --> 00:14:26,970
the change in Y-hat will just be the sum
227
00:14:26,970 --> 00:14:29,463
of the various beta-hats.
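A tiny sketch of that last claim, with made-up beta-hats: increasing every X by one unit changes Y-hat by the sum of the slope coefficients.

```python
# Made-up fitted coefficients: intercept 0.5, slopes 1.2 and -0.4.
beta_hat = [0.5, 1.2, -0.4]

def y_hat(xs):
    # Predicted value: intercept plus each X times its slope beta-hat.
    return beta_hat[0] + sum(b * x for b, x in zip(beta_hat[1:], xs))

before = y_hat([3, 7])
after = y_hat([4, 8])    # every X increased by one unit
change = after - before
print(change)            # ~0.8, the sum of the slopes 1.2 + (-0.4)
```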
228
00:14:30,880 --> 00:14:34,103
So hopefully, mathematically,
all of that makes sense.
229
00:14:38,930 --> 00:14:41,240
Remember that when we run a regression
230
00:14:42,350 --> 00:14:46,460
that every individual
231
00:14:46,460 --> 00:14:50,330
has a Y-hat, the predicted value,
232
00:14:50,330 --> 00:14:55,170
and a u-hat, the value of their residual.
233
00:14:55,170 --> 00:14:59,640
So where do they fall
on the regression line?
234
00:14:59,640 --> 00:15:03,220
And then that u-hat
measures the difference
235
00:15:03,220 --> 00:15:05,960
between what they actually said
236
00:15:05,960 --> 00:15:08,790
and what we would've predicted they said.
237
00:15:08,790 --> 00:15:13,303
And again, the i is for each
individual in the sample.
238
00:15:18,120 --> 00:15:21,280
So you get the Y-hat for each individual
239
00:15:21,280 --> 00:15:26,190
by plugging their Xs
into the model, again,
240
00:15:26,190 --> 00:15:30,720
multiplying them by the beta-hats
241
00:15:30,720 --> 00:15:34,330
and coming up with the Y-hat.
242
00:15:34,330 --> 00:15:38,120
And then we also, everybody has a u-hat.
243
00:15:38,120 --> 00:15:40,700
And we will learn down the road
244
00:15:40,700 --> 00:15:43,540
that we can save them both in SPSS.
245
00:15:43,540 --> 00:15:47,423
So when we go into SPSS, I
will show you how to do that.
246
00:15:51,410 --> 00:15:54,070
Here are a number of mathematical
247
00:15:54,070 --> 00:15:57,670
and statistical properties
of the residual.
248
00:15:57,670 --> 00:16:02,417
The sample average of ui-hat equals zero
249
00:16:04,370 --> 00:16:06,310
by construction of the OLS fit.
250
00:16:06,310 --> 00:16:10,650
And so the mean of Y, the observed values,
251
00:16:10,650 --> 00:16:12,860
equals the mean of Y-hat.
252
00:16:12,860 --> 00:16:16,543
That Y-bar-hat equals Y-bar.
253
00:16:19,010 --> 00:16:23,870
The sample covariance
between each Xk and u-hat
254
00:16:23,870 --> 00:16:25,143
is zero.
255
00:16:26,870 --> 00:16:31,870
And the point of the mean observation,
256
00:16:32,230 --> 00:16:37,100
so the mean of Y, the mean of X1,
257
00:16:37,100 --> 00:16:42,100
all the way through Xk, always
lies on the regression line.
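These properties can be checked numerically. Here is a small sketch with made-up data and a one-regressor fit (the same properties hold with K regressors):

```python
# Checking the residual properties on a tiny one-regressor OLS fit.
x = [1, 2, 3, 4]
y = [2, 3, 5, 6]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
         / sum((a - xbar) ** 2 for a in x))
intercept = ybar - slope * xbar

u_hat = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

mean_resid = sum(u_hat) / n                                       # property 1: zero mean
cov_xu = sum((xi - xbar) * ui for xi, ui in zip(x, u_hat)) / n    # property 2: zero covariance
on_line = intercept + slope * xbar                                # property 3: point of means
print(abs(mean_resid) < 1e-9, abs(cov_xu) < 1e-9, abs(on_line - ybar) < 1e-9)
# True True True
```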
258
00:16:50,320 --> 00:16:54,320
An important thing to note
is that adding regressors
259
00:16:54,320 --> 00:16:59,180
almost always changes the
value of your beta-hats.
260
00:16:59,180 --> 00:17:01,400
So when you go
261
00:17:04,820 --> 00:17:09,820
from one regressor to
two, so from K=1 to K=2,
262
00:17:11,280 --> 00:17:15,350
the value of your beta-1-hat will change.
263
00:17:15,350 --> 00:17:17,790
In the first model, you only have X1,
264
00:17:17,790 --> 00:17:20,873
and then in another model,
you add another X, X2.
265
00:17:21,860 --> 00:17:25,280
There are only two cases
266
00:17:25,280 --> 00:17:29,190
where the value of
beta-1-hat does not change.
267
00:17:29,190 --> 00:17:34,190
First is that if the
beta-2-hat equals zero,
268
00:17:34,730 --> 00:17:38,103
so X2 has no effect on Y.
269
00:17:38,980 --> 00:17:43,980
And the other is if X1
and X2 are uncorrelated.
270
00:17:44,350 --> 00:17:47,670
So if X2 has no effect on Y,
271
00:17:47,670 --> 00:17:50,440
or is uncorrelated with X1,
272
00:17:50,440 --> 00:17:53,460
those are the only two
cases where adding X2
273
00:17:53,460 --> 00:17:57,933
will not change the
value of our beta-1-hat.
274
00:18:02,480 --> 00:18:04,430
So think about it in this way.
275
00:18:04,430 --> 00:18:05,887
So we run this Y
276
00:18:09,540 --> 00:18:13,040
with two Xs, and then we run it again
277
00:18:13,040 --> 00:18:14,923
with only one X.
278
00:18:17,010 --> 00:18:20,343
Now, this A1,
279
00:18:22,020 --> 00:18:26,740
which is the coefficient
280
00:18:26,740 --> 00:18:29,780
in our second model, when
we didn't include X2,
281
00:18:29,780 --> 00:18:34,550
we could write it as
beta-1 plus beta-2 times D,
282
00:18:34,550 --> 00:18:37,890
where D is the slope coefficient
283
00:18:37,890 --> 00:18:39,280
from having
284
00:18:41,760 --> 00:18:45,070
regressed X2 on X1.
285
00:18:45,070 --> 00:18:46,210
They will be the same,
286
00:18:46,210 --> 00:18:50,180
this A1 and our beta-1-hat,
287
00:18:50,180 --> 00:18:54,373
only if one of these two things is true:
288
00:18:55,370 --> 00:18:58,510
either beta-2-hat here is zero,
289
00:18:58,510 --> 00:19:02,060
so X2 has no effect on Y,
290
00:19:02,060 --> 00:19:05,880
or if X1 and X2 are uncorrelated,
291
00:19:05,880 --> 00:19:08,610
so that this D is zero.
292
00:19:08,610 --> 00:19:13,610
And I am going to show you in
class what this looks like,
293
00:19:14,060 --> 00:19:16,423
kind of drawing a Venn diagram.
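That decomposition can be verified on a toy dataset (made-up numbers; the two-regressor coefficients are exact here because y is constructed with no error term, so beta-1-hat is 2 and beta-2-hat is 3):

```python
# Verifying a1 = beta1_hat + beta2_hat * d in-sample, where d is the
# slope from regressing X2 on X1. Toy data; y = 1 + 2*x1 + 3*x2 exactly.

def ols_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
            / sum((a - xbar) ** 2 for a in x))

x1 = [0, 1, 2, 3]
x2 = [1, 1, 3, 3]
y = [1 + 2*a + 3*b for a, b in zip(x1, x2)]

beta1_hat, beta2_hat = 2.0, 3.0      # two-regressor fit (exact by construction)
d = ols_slope(x1, x2)                # slope of X2 regressed on X1: 0.8
a1 = ols_slope(x1, y)                # short-regression slope on x1: 4.4
print(a1, beta1_hat + beta2_hat * d) # both ~4.4
```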
294
00:19:22,500 --> 00:19:25,600
And in the general case of K regressors,
295
00:19:32,020 --> 00:19:34,793
when you add regressors,
296
00:19:35,640 --> 00:19:40,640
usually it's going to change
the value of your beta-1.
297
00:19:40,750 --> 00:19:44,390
So the only time that
this would not be true
298
00:19:44,390 --> 00:19:47,800
is if all the betas equaled zero,
299
00:19:47,800 --> 00:19:52,250
so none of our regressors
have any effect on Y,
300
00:19:52,250 --> 00:19:56,640
or if X1 is uncorrelated
with every other X,
301
00:19:56,640 --> 00:19:57,727
with X2,...,Xk.
302
00:19:59,610 --> 00:20:04,610
Both of these would be
very rare instances.
303
00:20:04,650 --> 00:20:09,650
Almost always, Y and X2 are gonna
have at least some correlation,
304
00:20:09,680 --> 00:20:14,680
or X1 will have some correlation
305
00:20:15,510 --> 00:20:20,450
with one or more, probably
all of our other Xs,
306
00:20:20,450 --> 00:20:21,810
X2 through Xk.
307
00:20:21,810 --> 00:20:26,160
So the bottom line here is
adding or subtracting regressors
308
00:20:26,160 --> 00:20:29,880
almost always changes
the value of every beta.
309
00:20:29,880 --> 00:20:33,933
And that's why it's so important
to include the right ones.
310
00:20:42,000 --> 00:20:46,340
Thinking again about the
concept of R squared,
311
00:20:46,340 --> 00:20:49,480
how well does our model fit the data?
312
00:20:49,480 --> 00:20:51,220
How much of the variation in Y
313
00:20:52,110 --> 00:20:55,040
is explained by the variation in the Xs?
314
00:20:55,040 --> 00:20:58,140
We can, again, decompose it as SST,
315
00:20:58,140 --> 00:21:01,670
the total sum of squares,
the variation in Y,
316
00:21:01,670 --> 00:21:06,670
the explained sum of squares, SSE,
the variation in Y-hat,
317
00:21:07,200 --> 00:21:11,040
and the sum of squared residuals, SSR.
318
00:21:11,040 --> 00:21:14,250
So note that, again, OLS
319
00:21:14,250 --> 00:21:16,993
makes SSR as small as possible,
320
00:21:19,400 --> 00:21:20,253
as before.
321
00:21:25,210 --> 00:21:27,560
So in this drawing here,
322
00:21:27,560 --> 00:21:29,243
they're writing SST as TSS,
323
00:21:31,600 --> 00:21:32,433
but it's the same thing.
324
00:21:32,433 --> 00:21:34,700
It's the total sum of squares.
325
00:21:34,700 --> 00:21:37,100
So SST
326
00:21:37,100 --> 00:21:41,190
is the sum of each Y
327
00:21:41,190 --> 00:21:44,770
minus the mean of Y squared.
328
00:21:44,770 --> 00:21:49,373
SSE is the sum of each Y-hat minus Y-bar, squared.
329
00:21:50,601 --> 00:21:54,300
And SSR is the sum of squared residuals.
330
00:21:54,300 --> 00:21:58,250
So here's the formula.
331
00:21:58,250 --> 00:22:00,563
It's one that you've seen before.
332
00:22:01,740 --> 00:22:04,413
So again, SST equals
333
00:22:04,413 --> 00:22:09,120
SSE plus SSR, and we do a bit of math,
334
00:22:09,120 --> 00:22:10,760
and we get R squared,
335
00:22:10,760 --> 00:22:14,710
which is defined as one minus SSR
336
00:22:14,710 --> 00:22:16,550
divided by SST.
337
00:22:16,550 --> 00:22:17,643
So it is
338
00:22:20,940 --> 00:22:25,430
the part of the variation in Y
339
00:22:25,430 --> 00:22:28,290
that is explained by the Xs.
340
00:22:28,290 --> 00:22:32,323
R squared is always a
number between zero and one.
341
00:22:33,210 --> 00:22:34,670
Hardly ever is it zero.
342
00:22:34,670 --> 00:22:35,750
Hardly ever is it one.
343
00:22:35,750 --> 00:22:37,650
In fact, in any regression,
344
00:22:37,650 --> 00:22:40,790
you'll basically never see either.
345
00:22:40,790 --> 00:22:44,400
What does it mean if SSR equals zero?
346
00:22:44,400 --> 00:22:48,323
That's something that you
could ponder and think about.
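A quick numeric sketch of the decomposition, with made-up observed and fitted values:

```python
# SST = SSE + SSR and R^2 = 1 - SSR/SST, on a small made-up fit.
y     = [2, 3, 5, 6]             # observed values
y_hat = [1.9, 3.3, 4.7, 6.1]     # fitted values from an OLS line on these data

n = len(y)
ybar = sum(y) / n
SST = sum((yi - ybar) ** 2 for yi in y)                  # total variation in Y
SSE = sum((fi - ybar) ** 2 for fi in y_hat)              # explained variation
SSR = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # residual variation

r_squared = 1 - SSR / SST
print(SST, SSE + SSR, r_squared)   # SST matches SSE + SSR; R^2 ~0.98
```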
347
00:22:54,850 --> 00:22:57,450
Here are some properties of R squared.
348
00:22:57,450 --> 00:23:01,390
So it never decreases, and
it almost always increases
349
00:23:01,390 --> 00:23:02,920
when you add a regressor.
350
00:23:02,920 --> 00:23:06,290
So even if you add a
total nonsense regressor,
351
00:23:06,290 --> 00:23:07,640
like your shoe size
352
00:23:07,640 --> 00:23:12,640
or how many letters are in your dog's name
353
00:23:12,940 --> 00:23:14,730
or anything like that
354
00:23:14,730 --> 00:23:17,800
that has nothing to do with your model,
355
00:23:17,800 --> 00:23:20,500
it's still going to increase R squared.
356
00:23:20,500 --> 00:23:23,390
And therefore, it's a poor criterion
357
00:23:23,390 --> 00:23:25,290
of whether to add a regressor.
358
00:23:25,290 --> 00:23:27,950
Since it almost
always goes up,
359
00:23:27,950 --> 00:23:31,720
it's really not going
to tell you anything.
360
00:23:31,720 --> 00:23:35,280
There is a way that you
can calculate R squared,
361
00:23:35,280 --> 00:23:37,300
an adjusted R squared, which compensates
362
00:23:37,300 --> 00:23:40,440
for the loss of degrees of freedom.
363
00:23:40,440 --> 00:23:45,440
So it sort of looks at, is the model,
364
00:23:45,680 --> 00:23:49,600
is the R squared better,
given that we know
365
00:23:49,600 --> 00:23:52,740
that we lost some degrees of freedom
366
00:23:52,740 --> 00:23:55,190
and sort of compensates it for that.
367
00:23:55,190 --> 00:23:56,833
And that's a better way.
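A sketch of that adjustment, using the standard adjusted R-squared formula with made-up numbers:

```python
# Adjusted R^2 compensates for degrees of freedom:
# adj R^2 = 1 - (SSR / (n - k - 1)) / (SST / (n - 1)).
# Made-up numbers: n = 4 observations, k = 1 regressor.
n, k = 4, 1
SST, SSR = 10.0, 0.2

r2 = 1 - SSR / SST
adj_r2 = 1 - (SSR / (n - k - 1)) / (SST / (n - 1))
print(r2, adj_r2)   # ~0.98 and ~0.97: the adjusted value is lower
```

Unlike plain R squared, this adjusted version can fall when a regressor adds too little explanatory power for the degree of freedom it costs.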
368
00:24:01,110 --> 00:24:03,260
Now we're gonna look at the
same kind of assumptions
369
00:24:03,260 --> 00:24:05,220
that we looked at last time.
370
00:24:05,220 --> 00:24:08,800
So these are the things
that we assume are true
371
00:24:08,800 --> 00:24:13,670
or that must be true for an OLS model
372
00:24:13,670 --> 00:24:17,773
in order for it to be the
best unbiased estimator.
373
00:24:18,908 --> 00:24:21,270
And these are the same
as you've seen before,
374
00:24:21,270 --> 00:24:25,330
that it has to be linear in
parameters, random sampling,
375
00:24:25,330 --> 00:24:28,180
non-stochastic Xs
376
00:24:28,180 --> 00:24:31,100
that are not perfectly collinear,
377
00:24:31,100 --> 00:24:35,170
and that the residual has to
have zero conditional mean.
378
00:24:35,170 --> 00:24:36,763
So I'm gonna walk through each one.
379
00:24:39,410 --> 00:24:41,510
So it has to be a linear model
380
00:24:41,510 --> 00:24:44,700
that you can actually
write the population model
381
00:24:44,700 --> 00:24:49,700
in these terms, with Y as a
function of the Xs, as you see here.
382
00:24:50,940 --> 00:24:54,730
Again, the betas are
the unknown parameters,
383
00:24:54,730 --> 00:24:57,433
and the u is the disturbance term.
384
00:24:58,540 --> 00:25:00,300
And what this means is that the betas
385
00:25:00,300 --> 00:25:03,020
cannot have any exponent other than one
386
00:25:03,020 --> 00:25:05,140
for it to be a linear function.
387
00:25:05,140 --> 00:25:08,150
Note that the Xs can have exponents,
388
00:25:08,150 --> 00:25:10,010
so it could be squares or logs
389
00:25:10,010 --> 00:25:11,890
or square roots or all
kinds of other things.
390
00:25:11,890 --> 00:25:15,930
And I think squared is
probably the most common one
391
00:25:15,930 --> 00:25:17,373
that you're gonna encounter.
392
00:25:19,180 --> 00:25:22,330
If you think that the
relationship has a curve in it,
393
00:25:22,330 --> 00:25:24,830
that it's not a line, you
can often add a squared term.
394
00:25:27,550 --> 00:25:29,370
Next is random sampling.
395
00:25:29,370 --> 00:25:34,370
So again, we draw a
sample from a population,
396
00:25:34,850 --> 00:25:36,523
and it's a random sample.
397
00:25:37,570 --> 00:25:40,290
So there's no clear selection bias.
398
00:25:40,290 --> 00:25:44,200
So we don't only choose old folks or men
399
00:25:44,200 --> 00:25:46,790
or high-income or larger households
400
00:25:46,790 --> 00:25:48,210
or married people or anything like that.
401
00:25:48,210 --> 00:25:51,723
It's representative of the population.
402
00:25:53,790 --> 00:25:57,070
Third is no perfect collinearity,
403
00:25:57,070 --> 00:26:01,740
that each regressor,
404
00:26:01,740 --> 00:26:03,010
well, one way to think of it,
405
00:26:03,010 --> 00:26:05,400
is that it must add some new information,
406
00:26:05,400 --> 00:26:09,110
that there's no perfect
linear relationship
407
00:26:09,110 --> 00:26:11,240
among any of the regressors.
408
00:26:11,240 --> 00:26:14,630
And we need this for it
to work mathematically,
409
00:26:14,630 --> 00:26:18,800
that the betas will not
be defined
410
00:26:18,800 --> 00:26:23,233
if there is perfect collinearity.
411
00:26:25,470 --> 00:26:27,490
Note that they can be correlated,
412
00:26:27,490 --> 00:26:31,140
that they almost always are and will be,
413
00:26:31,140 --> 00:26:34,170
just not perfectly so.
414
00:26:34,170 --> 00:26:36,860
So if you are a matrix algebra nerd,
415
00:26:36,860 --> 00:26:39,533
know that our X'X matrix
416
00:26:42,070 --> 00:26:47,070
would be a singular matrix.
417
00:26:47,080 --> 00:26:51,210
It'll have a zero determinant,
and it cannot be inverted.
418
00:26:51,210 --> 00:26:53,460
So much like we saw before,
419
00:26:53,460 --> 00:26:55,930
it's sorta like dividing by zero.
420
00:26:55,930 --> 00:26:57,513
It's just undefined.
421
00:26:58,390 --> 00:27:00,180
It's certainly okay
422
00:27:00,180 --> 00:27:03,710
for there to be a non-linear relationship.
423
00:27:03,710 --> 00:27:07,810
So you could include both
income and income squared,
424
00:27:07,810 --> 00:27:12,810
again, if you think the
relationship has a curve in it,
425
00:27:13,360 --> 00:27:16,980
like increasing, but at a decreasing rate.
426
00:27:16,980 --> 00:27:17,953
And that's fine.
427
00:27:20,730 --> 00:27:23,180
Here are some examples.
428
00:27:23,180 --> 00:27:25,650
So one might be
429
00:27:27,670 --> 00:27:32,593
the expenditures in
Canadian and US dollars,
430
00:27:33,630 --> 00:27:38,350
where US dollars equals
a times Canadian dollars,
431
00:27:38,350 --> 00:27:39,563
where a is the exchange rate.
432
00:27:41,080 --> 00:27:43,950
We often see it, too, with dummy variables.
433
00:27:43,950 --> 00:27:46,280
So if you code
434
00:27:46,280 --> 00:27:51,250
whether you live in Vermont as a yes,
435
00:27:51,250 --> 00:27:53,250
one equals yes, zero equals no,
436
00:27:53,250 --> 00:27:57,217
and have another variable,
non-Vermont, coded the other way,
437
00:27:58,080 --> 00:27:59,980
the sum of these two is always one.
438
00:27:59,980 --> 00:28:01,360
So you can't include both.
439
00:28:01,360 --> 00:28:04,723
You only include one of
these two in your model.
440
00:28:08,000 --> 00:28:13,000
The intuition is how can we
measure the effect of US dollars
441
00:28:13,810 --> 00:28:18,140
while holding Canadian dollars constant?
442
00:28:18,140 --> 00:28:21,750
Or how can you, if you're
thinking about plant growth,
443
00:28:21,750 --> 00:28:26,260
how do you account for temperature
444
00:28:26,260 --> 00:28:28,480
in degrees Fahrenheit
445
00:28:28,480 --> 00:28:32,120
holding degrees Celsius constant?
446
00:28:32,120 --> 00:28:34,190
So that's the intuition.
447
00:28:34,190 --> 00:28:38,330
So, also, it adds no new information,
448
00:28:38,330 --> 00:28:40,300
that if you know degrees Fahrenheit,
449
00:28:40,300 --> 00:28:45,220
then you automatically
know degrees Celsius.
450
00:28:45,220 --> 00:28:48,630
And the degrees Celsius, one would say,
451
00:28:48,630 --> 00:28:50,853
adds no new information at all.
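To make the dummy-trap example concrete, here is a small sketch (my own toy data, not from the lecture) showing why perfect collinearity breaks OLS: a Vermont dummy plus a non-Vermont dummy always sums to the intercept column, so the design matrix is rank deficient and X'X is singular.

```python
# Hypothetical sketch: perfect collinearity makes the design matrix
# rank deficient, so X'X is singular and OLS has no unique solution.
import numpy as np

vermont = np.array([1, 0, 1, 0, 0])   # 1 = yes, lives in Vermont
non_vermont = 1 - vermont             # coded the other way
intercept = np.ones(5)

# Including both dummies duplicates the intercept: vermont + non_vermont = 1
X_bad = np.column_stack([intercept, vermont, non_vermont])
# Dropping one of the two restores full column rank
X_ok = np.column_stack([intercept, vermont])

print(np.linalg.matrix_rank(X_bad))  # 2, not 3: rank deficient
print(np.linalg.matrix_rank(X_ok))   # 2: full column rank
```

The same check would flag expenditures in both US and Canadian dollars, or temperature in both Fahrenheit and Celsius.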
452
00:28:54,270 --> 00:28:57,380
We must also have a
positive degree of freedom,
453
00:28:57,380 --> 00:29:01,430
so N must be strictly greater than K+1.
454
00:29:01,430 --> 00:29:05,490
So the number of observations
must be strictly greater
455
00:29:05,490 --> 00:29:09,223
than the number of regressors plus one.
456
00:29:10,300 --> 00:29:14,320
Otherwise, you have more
unknowns than equations,
457
00:29:14,320 --> 00:29:18,370
and you have either no solution
or infinitely many solutions.
458
00:29:18,370 --> 00:29:21,670
And the more degrees of
freedom that you have,
459
00:29:21,670 --> 00:29:24,290
the lower the variance of beta.
460
00:29:24,290 --> 00:29:28,743
And here's a YouTube video that explains that.
461
00:29:34,180 --> 00:29:37,020
There are many benefits to having a big N,
462
00:29:37,020 --> 00:29:38,920
to having a large sample size,
463
00:29:38,920 --> 00:29:43,240
and one of those is it increases
your degrees of freedom.
464
00:29:43,240 --> 00:29:46,590
And as you see here in this table,
465
00:29:46,590 --> 00:29:51,590
the more degrees of freedom that you have,
466
00:29:51,860 --> 00:29:56,140
the lower the test stat has to be
467
00:29:56,140 --> 00:29:57,963
to be significant.
468
00:29:58,880 --> 00:30:02,250
And next week when we talk
about hypothesis tests,
469
00:30:02,250 --> 00:30:04,633
I think this is gonna
make even more sense.
470
00:30:06,010 --> 00:30:11,010
But basically, a higher
degree of freedom,
471
00:30:11,180 --> 00:30:14,760
a bigger N, leads to a
more efficient estimator.
472
00:30:14,760 --> 00:30:16,390
And it's a theme that we're gonna revisit
473
00:30:16,390 --> 00:30:19,870
over and over again, more information
474
00:30:19,870 --> 00:30:21,900
leads to lower variance,
475
00:30:21,900 --> 00:30:24,940
or more information leads to
476
00:30:29,430 --> 00:30:33,870
a more efficient estimator,
or a lower variance estimator.
477
00:30:33,870 --> 00:30:37,360
And having bigger N, getting
information from more people,
478
00:30:37,360 --> 00:30:39,623
is one way that you can
get more information.
479
00:30:44,630 --> 00:30:47,460
The next assumption, again,
480
00:30:47,460 --> 00:30:50,290
this is one we should
be familiar with by now,
481
00:30:50,290 --> 00:30:53,070
that the expected value of the error term,
482
00:30:53,070 --> 00:30:55,980
no matter the value of X, is zero.
483
00:30:55,980 --> 00:31:00,610
And that no matter the value of X,
484
00:31:00,610 --> 00:31:03,020
the expected value is the same.
485
00:31:03,020 --> 00:31:06,150
It's this idea that the error term
486
00:31:06,150 --> 00:31:10,010
is uncorrelated with the regressors,
487
00:31:10,010 --> 00:31:13,973
and this is needed to have
an unbiased estimator.
488
00:31:14,920 --> 00:31:17,300
And that is why, as we'll see,
489
00:31:17,300 --> 00:31:19,630
that omitting an important variable
490
00:31:22,540 --> 00:31:25,423
will result in bias.
491
00:31:30,700 --> 00:31:33,740
When this zero conditional mean holds,
492
00:31:33,740 --> 00:31:37,110
we say that our regressors
are explanatory.
493
00:31:37,110 --> 00:31:39,100
The variables are exogenous.
494
00:31:39,100 --> 00:31:41,430
That's what we want, exogenous is good.
495
00:31:41,430 --> 00:31:43,280
When they are correlated
with the error term,
496
00:31:43,280 --> 00:31:44,750
they are said to be endogenous.
497
00:31:44,750 --> 00:31:48,020
And a lot of the topics
498
00:31:48,020 --> 00:31:50,270
that we'll be covering
toward the end of class
499
00:31:50,270 --> 00:31:54,650
will deal with how to detect
500
00:31:54,650 --> 00:31:57,393
if they're endogenous
and what to do about it.
501
00:32:02,190 --> 00:32:06,440
So, if we have these
four assumptions holding,
502
00:32:06,440 --> 00:32:08,430
if all four are true,
503
00:32:08,430 --> 00:32:12,340
then every beta-hat is unbiased.
504
00:32:12,340 --> 00:32:17,000
And note that what this means
is the procedure is unbiased.
505
00:32:17,000 --> 00:32:18,803
The model is unbiased.
506
00:32:20,628 --> 00:32:23,600
It doesn't mean that every single beta-hat
507
00:32:23,600 --> 00:32:26,810
will fall exactly on the true value.
508
00:32:26,810 --> 00:32:29,410
It means that there's no systematic reason
509
00:32:29,410 --> 00:32:32,780
why we should think it's
too big or too small,
510
00:32:32,780 --> 00:32:35,000
and if we did this over and over again
511
00:32:35,000 --> 00:32:38,810
the average estimate would
converge to the true value.
512
00:32:38,810 --> 00:32:41,123
And that's what unbiased means.
513
00:32:47,080 --> 00:32:52,080
We're gonna talk about two
cases now under specification.
514
00:32:52,890 --> 00:32:56,697
What regressors should
you include in your model?
515
00:32:56,697 --> 00:32:59,990
And we're gonna talk first
about overspecifying,
516
00:32:59,990 --> 00:33:02,973
which is including irrelevant ones,
517
00:33:04,590 --> 00:33:06,090
like your shoe size
518
00:33:06,090 --> 00:33:09,070
or the number of letters
in your dog's name
519
00:33:09,070 --> 00:33:10,230
or something like that.
520
00:33:10,230 --> 00:33:13,210
And sort of probably more seriously,
521
00:33:13,210 --> 00:33:18,120
what happens when you omit
ones that you should include?
522
00:33:18,120 --> 00:33:21,113
And that produces omitted variable bias.
523
00:33:24,470 --> 00:33:27,430
First, we'll deal with the
issue of overspecifying,
524
00:33:27,430 --> 00:33:32,100
which is including a
variable that is irrelevant.
525
00:33:32,100 --> 00:33:33,320
It could be nonsense
526
00:33:33,320 --> 00:33:35,370
or just has nothing to do with the model.
527
00:33:40,180 --> 00:33:44,940
So suppose we specify this
model with three regressors,
528
00:33:44,940 --> 00:33:48,500
and assumptions 1 through 4 are met,
529
00:33:48,500 --> 00:33:52,720
and everything is cool,
but X3 has no effect.
530
00:33:52,720 --> 00:33:56,610
So X3 has no effect on Y,
531
00:33:56,610 --> 00:34:01,610
that is, the true parameter
in the population is zero.
532
00:34:02,100 --> 00:34:04,210
The slope is zero.
533
00:34:04,210 --> 00:34:08,563
Changing X3 has absolutely no effect on Y.
534
00:34:12,250 --> 00:34:13,193
What happens?
535
00:34:15,530 --> 00:34:17,520
Well, there's good news and bad news.
536
00:34:17,520 --> 00:34:22,520
The good news is since,
recall a few slides ago,
537
00:34:22,980 --> 00:34:27,120
that because B3 or beta-3 equals zero,
538
00:34:27,120 --> 00:34:28,790
it won't create bias.
539
00:34:28,790 --> 00:34:33,520
It won't introduce any bias
into beta-1 or beta-2.
540
00:34:33,520 --> 00:34:38,180
However, it will inflate the
variance of the other betas.
541
00:34:38,180 --> 00:34:42,360
So it will increase the
variance of beta-1 or beta-2.
542
00:34:42,360 --> 00:34:45,520
And there are a few ways
that you can think about this.
543
00:34:45,520 --> 00:34:49,120
One, it takes away a degree
of freedom for no reason.
544
00:34:49,120 --> 00:34:54,040
And two, it takes away some
of the explanatory power
545
00:34:54,040 --> 00:34:55,810
of the other Xs.
546
00:34:55,810 --> 00:34:59,220
So especially if X3
547
00:34:59,220 --> 00:35:04,220
has any overlap with X2 and X1,
548
00:35:04,310 --> 00:35:07,300
it will take away some of the information
549
00:35:07,300 --> 00:35:11,960
that is in those variables
550
00:35:11,960 --> 00:35:16,150
and therefore take away
their explanatory power.
551
00:35:16,150 --> 00:35:18,200
And we're gonna talk about this a bit more
552
00:35:18,200 --> 00:35:22,010
when we think about the
variance of our beta-hats,
553
00:35:22,010 --> 00:35:25,833
which is sort of the end of this topic.
554
00:35:28,230 --> 00:35:32,910
So underspecifying is,
in a sense, more serious
555
00:35:32,910 --> 00:35:36,140
because it creates bias.
556
00:35:36,140 --> 00:35:39,320
But sometimes we can
know what the direction
557
00:35:39,320 --> 00:35:42,760
and maybe even size of the bias is.
558
00:35:42,760 --> 00:35:46,900
So assume that the true model is this,
559
00:35:46,900 --> 00:35:49,690
that only X2 and X1
560
00:35:49,690 --> 00:35:53,030
are the relevant regressors,
561
00:35:53,030 --> 00:35:55,680
and it's all well-behaved,
562
00:35:55,680 --> 00:35:57,663
assumptions 1 through 4 hold.
563
00:36:01,750 --> 00:36:04,160
We wanna know, what is beta-1?
564
00:36:04,160 --> 00:36:06,190
What is the effect of X1 on Y?
565
00:36:06,190 --> 00:36:10,970
However, for some reason,
we exclude X2: maybe we forgot,
566
00:36:10,970 --> 00:36:13,910
we don't know enough to include it,
567
00:36:13,910 --> 00:36:16,410
there is no data available,
something like that,
568
00:36:16,410 --> 00:36:19,830
and so we run, instead, this regression
569
00:36:19,830 --> 00:36:21,870
with just a single regressor, X1.
570
00:36:21,870 --> 00:36:23,860
And I'm putting the A instead of B
571
00:36:23,860 --> 00:36:28,563
to sort of set this apart, so
that it's clear, hopefully.
572
00:36:34,780 --> 00:36:37,950
So here's an example
from the Wolters book.
573
00:36:37,950 --> 00:36:40,360
Again, we're looking at wages,
574
00:36:40,360 --> 00:36:43,743
and we're really interested
in the returns to education.
575
00:36:45,050 --> 00:36:49,740
And we have these two
regressors in the true model,
576
00:36:49,740 --> 00:36:51,410
education and ability.
577
00:36:51,410 --> 00:36:55,470
So you come with some innate ability,
578
00:36:55,470 --> 00:36:57,870
and you get education,
579
00:36:57,870 --> 00:37:00,350
and that's what drives your wage.
580
00:37:00,350 --> 00:37:02,020
Again, there's probably more,
581
00:37:02,020 --> 00:37:06,200
but just to make a simpler model.
582
00:37:06,200 --> 00:37:10,480
But we don't have a
variable measuring ability.
583
00:37:10,480 --> 00:37:14,737
So we run just a single regressor model,
584
00:37:16,130 --> 00:37:18,893
education, and we get A1.
585
00:37:20,590 --> 00:37:22,130
And that's what we see.
586
00:37:22,130 --> 00:37:26,600
And here, the error term, which is v,
587
00:37:26,600 --> 00:37:28,450
which we're calling v here,
588
00:37:28,450 --> 00:37:32,570
is beta-2 times ability plus the error,
589
00:37:32,570 --> 00:37:36,240
since we forgot about adding ability.
590
00:37:36,240 --> 00:37:41,240
And almost certainly, ability
and education are correlated,
591
00:37:41,610 --> 00:37:46,550
that I would assume that
those with more ability
592
00:37:46,550 --> 00:37:49,060
probably seek more education.
593
00:37:49,060 --> 00:37:51,020
That might be one hypothesis.
594
00:37:51,020 --> 00:37:55,083
But regardless of the direction,
595
00:37:56,840 --> 00:38:01,270
I think common sense says
that your innate ability
596
00:38:01,270 --> 00:38:03,680
and how much education that you get
597
00:38:03,680 --> 00:38:06,013
would be somewhat correlated.
598
00:38:09,190 --> 00:38:11,950
So we can think about what's the magnitude
599
00:38:11,950 --> 00:38:14,393
and the direction of A1?
600
00:38:15,410 --> 00:38:19,070
So A1 is then beta-1
601
00:38:19,070 --> 00:38:21,970
plus beta-2 times d,
602
00:38:21,970 --> 00:38:25,990
where d is the slope
of regressing X2 on X1.
603
00:38:25,990 --> 00:38:29,980
So it's how correlated are X1 and X2?
604
00:38:29,980 --> 00:38:31,230
What is the effect of it?
605
00:38:34,840 --> 00:38:39,780
And beta-hat-2 is the
slope from the real model,
606
00:38:39,780 --> 00:38:41,953
had we been able to run this.
607
00:38:47,130 --> 00:38:51,660
So here, the expected value of A1
608
00:38:51,660 --> 00:38:55,350
is the expected value of beta-1-hat
609
00:38:55,350 --> 00:38:58,400
plus beta-2-hat times d.
610
00:38:58,400 --> 00:39:01,890
So the bias is this second term,
611
00:39:01,890 --> 00:39:04,370
beta-2-hat times d.
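The omitted-variable-bias formula above can be checked with a quick simulation. These numbers are my own, purely illustrative: with beta-1 = 2, beta-2 = 3, and d = 0.5 (the slope of X2 on X1), the short-regression slope A1 should land near 2 + 3(0.5) = 3.5.

```python
# Toy simulation of A1 = beta1 + beta2*d: regress y on x1 alone
# when the true model also contains a correlated x2.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x1 = rng.normal(size=n)                  # e.g. education
x2 = 0.5 * x1 + rng.normal(size=n)       # e.g. ability, with d = 0.5
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Short regression of y on x1 alone; polyfit returns [slope, intercept]
a1 = np.polyfit(x1, y, 1)[0]

print(a1)  # close to beta1 + beta2*d = 3.5, not the true beta1 = 2
```

So omitting ability makes the education coefficient absorb part of ability's effect, exactly as the formula predicts.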
612
00:39:04,370 --> 00:39:06,200
Now, if
613
00:39:07,650 --> 00:39:10,140
beta-2-hat equals zero,
614
00:39:10,140 --> 00:39:13,860
so if ability has no effect on wages,
615
00:39:13,860 --> 00:39:15,960
or if d equals zero,
616
00:39:15,960 --> 00:39:20,500
that is, ability and
education are uncorrelated,
617
00:39:20,500 --> 00:39:22,560
then A1 is unbiased.
618
00:39:22,560 --> 00:39:23,800
Then we're fine.
619
00:39:23,800 --> 00:39:27,393
And we sorta talked about
that a few slides ago.
620
00:39:35,340 --> 00:39:37,350
So again, if d equals zero,
621
00:39:37,350 --> 00:39:38,993
then X1 and X2 are uncorrelated.
622
00:39:41,508 --> 00:39:43,091
And that would mean
623
00:39:45,430 --> 00:39:48,220
that the expected value of X2 given X1
624
00:39:48,220 --> 00:39:49,570
is just the expected value of X2.
625
00:39:49,570 --> 00:39:51,283
And then the error term,
626
00:39:52,910 --> 00:39:56,170
then X2
627
00:39:56,170 --> 00:40:00,047
is not correlated with X1,
628
00:40:00,047 --> 00:40:02,540
and X2 being in the error term
629
00:40:02,540 --> 00:40:07,540
does not violate our assumption
4, and everything is cool.
630
00:40:07,540 --> 00:40:11,390
But again, that's going to be rather rare.
631
00:40:11,390 --> 00:40:15,440
And in our example of omitting ability,
632
00:40:15,440 --> 00:40:17,900
I think, hope you can see,
633
00:40:17,900 --> 00:40:20,503
that that would not be
very good reasoning.
634
00:40:23,410 --> 00:40:27,740
So we can also think about
what the direction of the bias is
635
00:40:28,900 --> 00:40:31,940
and maybe even the magnitude.
636
00:40:31,940 --> 00:40:33,680
We can use our intuition.
637
00:40:33,680 --> 00:40:36,870
So if d is greater than zero,
638
00:40:36,870 --> 00:40:40,940
X2 and X1 are positively correlated,
639
00:40:40,940 --> 00:40:43,670
folks that have more
ability seek education,
640
00:40:43,670 --> 00:40:48,670
that would be my guess,
but maybe that isn't true,
641
00:40:48,880 --> 00:40:51,250
if it's less than zero,
642
00:40:51,250 --> 00:40:54,163
people with more ability
get less education.
643
00:40:55,490 --> 00:40:58,670
So you need to look at
the effect of X2 on Y
644
00:40:58,670 --> 00:41:02,460
as well as the effect of X2 on X1,
645
00:41:02,460 --> 00:41:04,293
and we can sort of intuit this.
646
00:41:08,120 --> 00:41:11,080
So again, this is our real model,
647
00:41:11,080 --> 00:41:15,540
where education and
ability are both included.
648
00:41:15,540 --> 00:41:18,820
So if beta-2 is greater than zero,
649
00:41:18,820 --> 00:41:22,660
which means more ability
leads to higher wage,
650
00:41:22,660 --> 00:41:23,980
that would be my guess,
651
00:41:23,980 --> 00:41:26,940
and d is also greater than zero,
652
00:41:26,940 --> 00:41:31,940
more ability leads to more education,
653
00:41:31,990 --> 00:41:35,380
and that means that the bias is positive,
654
00:41:35,380 --> 00:41:40,380
that the true effect of education
is smaller than our estimate.
655
00:41:44,727 --> 00:41:47,230
A1 is greater than beta-1,
656
00:41:47,230 --> 00:41:51,193
so we overstate the effect of education.
657
00:41:56,620 --> 00:42:00,770
When we have a k variable equation,
658
00:42:00,770 --> 00:42:05,320
it's gonna depend again on how correlated
659
00:42:05,320 --> 00:42:10,220
the omitted variable is with
the ones that we include,
660
00:42:10,220 --> 00:42:13,070
but know that every beta-hat may be biased,
661
00:42:13,070 --> 00:42:17,057
not just those correlated
with the omitted regressor.
662
00:42:20,410 --> 00:42:23,420
Here is how I approach it in general.
663
00:42:23,420 --> 00:42:26,710
I tend to use a big tent approach
664
00:42:26,710 --> 00:42:30,420
and include a lot of Xs, at
least in the early model.
665
00:42:30,420 --> 00:42:35,420
So you always wanna base
it on these three things,
666
00:42:35,610 --> 00:42:38,260
previous research, theory,
667
00:42:38,260 --> 00:42:40,870
and just sort of common
sense and introspection.
668
00:42:40,870 --> 00:42:41,770
So you think about
669
00:42:43,418 --> 00:42:46,940
what does previous research
suggest we include?
670
00:42:46,940 --> 00:42:48,940
What does theory suggest we include?
671
00:42:48,940 --> 00:42:52,823
And what does common sense
suggest that we include?
672
00:42:54,000 --> 00:42:58,240
And we will learn down the
road with hypothesis test
673
00:42:58,240 --> 00:43:02,480
how to pare down and how to
sort of come to the right model,
674
00:43:02,480 --> 00:43:05,900
to start with a lot of Xs,
675
00:43:05,900 --> 00:43:08,350
and there are tests to see,
676
00:43:08,350 --> 00:43:11,923
well, if we take these out,
which is the best model?
677
00:43:15,900 --> 00:43:18,810
We're going to talk about the variance
678
00:43:18,810 --> 00:43:20,260
of the OLS estimators.
679
00:43:20,260 --> 00:43:23,013
What is the variance of each beta-hat?
680
00:43:26,070 --> 00:43:28,580
We've spent a bunch of time
681
00:43:28,580 --> 00:43:32,290
looking at the assumptions
under which they are unbiased.
682
00:43:32,290 --> 00:43:35,430
And now we wanna think
about how efficient are they
683
00:43:35,430 --> 00:43:38,980
'cause we want the best
unbiased estimator.
684
00:43:38,980 --> 00:43:43,100
So beta-hat
685
00:43:43,940 --> 00:43:48,270
has variance because it
has the error term in it,
686
00:43:48,270 --> 00:43:50,523
because it has the observed Y in it.
687
00:43:51,670 --> 00:43:54,340
So each time you,
688
00:43:54,340 --> 00:43:57,560
so even if you've
specified the model right,
689
00:43:57,560 --> 00:44:01,450
each time you draw a sample and run it,
690
00:44:01,450 --> 00:44:05,810
you're going to get a
slightly different beta-hat
691
00:44:07,525 --> 00:44:11,083
because there's a new set of
error terms that are drawn.
692
00:44:13,510 --> 00:44:18,510
We're going to assume now that
there is homoscedasticity,
693
00:44:19,100 --> 00:44:24,100
which says that the variance of
the error term is a constant.
694
00:44:24,770 --> 00:44:26,680
We're gonna deal with that later
695
00:44:26,680 --> 00:44:28,620
of what do we do when that's not true,
696
00:44:28,620 --> 00:44:31,870
how to test for it and
how to account for it.
697
00:44:31,870 --> 00:44:34,100
But now we're going to assume
698
00:44:34,100 --> 00:44:37,820
that the variance of u,
given any value of X,
699
00:44:37,820 --> 00:44:41,053
equals a constant, which
we call sigma squared.
700
00:44:42,150 --> 00:44:46,030
That is, it's the same,
regardless of any value of X,
701
00:44:46,030 --> 00:44:50,860
so no value of X will change the variance
702
00:44:52,510 --> 00:44:57,510
of u, so sort of the
shape of the bell curve.
703
00:44:57,940 --> 00:45:00,640
We know it's centered over zero,
704
00:45:00,640 --> 00:45:03,640
but it could be a very
tall, skinny bell curve.
705
00:45:03,640 --> 00:45:06,250
It could be a very short, fat bell curve.
706
00:45:06,250 --> 00:45:08,350
The key here, and that's basically
707
00:45:08,350 --> 00:45:11,560
what we're trying to measure,
but the assumption here
708
00:45:11,560 --> 00:45:13,970
is that no matter what the value of X is,
709
00:45:13,970 --> 00:45:16,890
that the shape of that
bell curve is the same.
710
00:45:21,220 --> 00:45:24,410
Here's the formula for the
variance of beta-hat-j.
711
00:45:24,410 --> 00:45:29,410
So you take one particular regressor, Xj.
712
00:45:29,430 --> 00:45:32,200
What is the variance of its beta-hat?
713
00:45:32,200 --> 00:45:33,810
And here it is.
714
00:45:33,810 --> 00:45:37,483
So it has three parts, sigma squared,
715
00:45:38,320 --> 00:45:41,823
SSTj, and one minus R squared j.
716
00:45:42,770 --> 00:45:46,000
So we're gonna think about sigma squared,
717
00:45:46,000 --> 00:45:49,380
which is the variance of u,
718
00:45:49,380 --> 00:45:51,500
which we already saw, and
we're gonna see in a bit
719
00:45:51,500 --> 00:45:52,683
how to measure that.
720
00:45:55,150 --> 00:45:56,980
There's also SSTj,
721
00:45:56,980 --> 00:46:01,910
which is the total sample variation of X.
722
00:46:01,910 --> 00:46:06,870
So it is the sum of every X,
723
00:46:06,870 --> 00:46:10,910
so what everybody said on the survey
724
00:46:10,910 --> 00:46:13,660
minus the mean of that X,
725
00:46:13,660 --> 00:46:16,653
so the mean value from our sample,
726
00:46:17,730 --> 00:46:22,730
subtract every individual's
Xj from the mean of Xj,
727
00:46:23,350 --> 00:46:26,340
square it, and sum it up from one to N.
728
00:46:26,340 --> 00:46:29,390
So for each of our N respondents,
729
00:46:29,390 --> 00:46:34,390
it's the total sum of
squared deviations of X.
730
00:46:35,010 --> 00:46:38,620
And R squared j is the R squared
731
00:46:38,620 --> 00:46:41,010
of if you were to take Xj
732
00:46:41,010 --> 00:46:44,210
and regress it on all the other Xs.
733
00:46:44,210 --> 00:46:45,803
So if we're looking at X1,
734
00:46:48,510 --> 00:46:49,800
it would be the R squared.
735
00:46:49,800 --> 00:46:53,370
If we took X1 and put
it on the left-hand side
736
00:46:53,370 --> 00:46:56,603
and regressed X2, X3, X4,...,Xk,
737
00:46:58,750 --> 00:47:02,590
and look at that R squared,
that is what this is.
738
00:47:02,590 --> 00:47:06,160
So it's basically a
measure of how correlated
739
00:47:06,160 --> 00:47:09,360
is this Xj with all of the others?
740
00:47:09,360 --> 00:47:12,990
A high R squared implies that
they're highly correlated.
741
00:47:12,990 --> 00:47:15,883
A low R squared j means that they are not.
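The three-part formula can be sanity-checked numerically. This is a sketch under an assumed setup (toy data, a known sigma squared): sigma² / (SST_j · (1 − R²_j)) should match the j-th diagonal entry of the usual matrix expression sigma² (X'X)⁻¹, which I'm assuming here even though the lecture states only the component form.

```python
# Check Var(beta_j) = sigma^2 / (SST_j * (1 - R^2_j)) against
# the matrix form sigma^2 * [(X'X)^{-1}]_{jj}.
import numpy as np

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)        # deliberately correlated with x1
X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 4.0                               # assumed error variance

# Matrix form, entry for x1 (column index 1)
var_matrix = sigma2 * np.linalg.inv(X.T @ X)[1, 1]

# Component form: SST_1 and the auxiliary R^2 of x1 on (1, x2)
sst1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])
fit = Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
r2_1 = 1 - np.sum((x1 - fit) ** 2) / sst1
var_formula = sigma2 / (sst1 * (1 - r2_1))

print(np.isclose(var_matrix, var_formula))  # True
```

The agreement is exact (up to floating point) because SST_j(1 − R²_j) is just the residual variation left in X_j after the other regressors have explained what they can.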
742
00:47:19,890 --> 00:47:21,070
Why does this matter?
743
00:47:21,070 --> 00:47:23,870
Why do we care about variance?
744
00:47:23,870 --> 00:47:26,600
Well, if we have a high variance,
745
00:47:26,600 --> 00:47:27,630
so if you can think about it
746
00:47:27,630 --> 00:47:32,630
as kind of a short, fat bell curve,
747
00:47:34,080 --> 00:47:37,030
we have a less precise estimator.
748
00:47:37,030 --> 00:47:41,080
There's a need for larger
confidence intervals.
749
00:47:41,080 --> 00:47:44,440
And thus, we're less likely
to find significance.
750
00:47:44,440 --> 00:47:47,146
So if you run a regression,
751
00:47:47,146 --> 00:47:49,770
and you don't find anything significant,
752
00:47:49,770 --> 00:47:51,620
it's kind of deflating.
753
00:47:51,620 --> 00:47:53,300
It's like, er, this is,
754
00:47:53,300 --> 00:47:57,580
it's not very interesting.
755
00:47:57,580 --> 00:47:59,953
So even from a really practical matter,
756
00:48:03,010 --> 00:48:06,630
you wanna find things,
if they are significant,
757
00:48:06,630 --> 00:48:11,030
you want to find that
'cause that's sort of
758
00:48:12,160 --> 00:48:14,813
what's interesting to talk about.
759
00:48:18,250 --> 00:48:22,563
So I will go over each of the
three components of variance.
760
00:48:27,330 --> 00:48:29,853
First is sigma squared.
761
00:48:30,830 --> 00:48:33,880
Note that in the formula,
762
00:48:33,880 --> 00:48:36,710
higher sigma squared
means higher variance.
763
00:48:36,710 --> 00:48:39,820
It means that the error
terms are all over the place.
764
00:48:39,820 --> 00:48:43,693
It means that we have a short, fat,
765
00:48:45,710 --> 00:48:47,893
very spread out bell curve.
766
00:48:48,830 --> 00:48:50,040
Another way of thinking about it
767
00:48:50,040 --> 00:48:52,230
is more noise in the equation
768
00:48:52,230 --> 00:48:55,690
makes it harder to
predict partial effects.
769
00:48:55,690 --> 00:49:00,240
And know that it is a population measure.
770
00:49:00,240 --> 00:49:03,090
It's independent of the sample size.
771
00:49:03,090 --> 00:49:07,810
And it is unknown, but there
is a way to estimate it,
772
00:49:07,810 --> 00:49:09,210
which we'll see in a minute.
773
00:49:15,415 --> 00:49:16,960
SSTj, again,
774
00:49:16,960 --> 00:49:20,773
is the total variation in X.
775
00:49:22,160 --> 00:49:26,410
And the more variation in X,
776
00:49:26,410 --> 00:49:28,260
the smaller the variance.
777
00:49:28,260 --> 00:49:31,300
It's in the denominator,
so you want a big SSTj.
778
00:49:31,300 --> 00:49:35,280
So that means that you want Xs
779
00:49:35,280 --> 00:49:37,533
to have some variation.
780
00:49:38,410 --> 00:49:42,903
So if Xj here is age,
781
00:49:44,420 --> 00:49:46,163
you want your sample,
782
00:49:47,259 --> 00:49:49,680
you always want your sample to be random,
783
00:49:49,680 --> 00:49:52,623
but if you're drawing
784
00:49:54,660 --> 00:49:59,220
folks from across the age spectrum
785
00:49:59,220 --> 00:50:02,340
from, say, 18 to 99,
786
00:50:02,340 --> 00:50:05,840
the Xs are going to be more spread out.
787
00:50:05,840 --> 00:50:08,440
And the partial effect of age
788
00:50:08,440 --> 00:50:11,930
is going to be a lot easier to predict.
789
00:50:11,930 --> 00:50:14,030
So you can think about it that
790
00:50:14,030 --> 00:50:18,340
if you're trying to eyeball
the slope of a line,
791
00:50:18,340 --> 00:50:22,430
and all of the dots are
all sort of converged
792
00:50:22,430 --> 00:50:26,070
around a single X, so you only have folks
793
00:50:26,070 --> 00:50:30,230
that are 31, 30, 31, 32, 31, 30,
794
00:50:30,230 --> 00:50:31,670
it's gonna be hard to say,
795
00:50:31,670 --> 00:50:33,583
what is the slope of this line?
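The eyeballing intuition shows up directly in the single-regressor case, where the slope variance is sigma²/SST. A small sketch with hypothetical ages (my own numbers): ten respondents clustered around 31 versus ten spread from 18 to 99.

```python
# With one regressor, Var(beta_hat) = sigma^2 / SST, so more spread
# in X means a tighter slope estimate.
import numpy as np

sigma2 = 1.0
clustered = np.array([30, 31, 30, 31, 32, 31, 30, 31, 32, 31])
spread = np.array([18, 27, 36, 45, 54, 63, 72, 81, 90, 99])

def slope_variance(x, sigma2):
    sst = np.sum((x - x.mean()) ** 2)   # total sample variation in X
    return sigma2 / sst

print(slope_variance(clustered, sigma2) > slope_variance(spread, sigma2))  # True
```

Same sample size, same error variance: only the spread in X differs, and the clustered sample gives a far noisier slope.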
796
00:50:36,220 --> 00:50:37,800
This is another example
797
00:50:37,800 --> 00:50:41,250
where more info leads to lower variance.
798
00:50:41,250 --> 00:50:45,060
And increasing sample size unambiguously
799
00:50:46,620 --> 00:50:51,280
increases the variation in X
800
00:50:51,280 --> 00:50:55,850
because we're not dividing by N here.
801
00:50:55,850 --> 00:50:57,660
It's just summing them up.
802
00:50:57,660 --> 00:50:59,573
So the more Xs that you have,
803
00:51:00,900 --> 00:51:03,453
unless every single one is on the mean,
804
00:51:04,800 --> 00:51:08,750
adding N will make SST get bigger
805
00:51:08,750 --> 00:51:10,823
and will decrease your variance.
806
00:51:16,400 --> 00:51:21,400
R squared j is the R squared
you would get if you took Xj
807
00:51:21,780 --> 00:51:24,610
and put it on the left
side and run a regression
808
00:51:24,610 --> 00:51:26,250
with all the other Xs on the right side,
809
00:51:26,250 --> 00:51:27,740
what's the R squared?
810
00:51:27,740 --> 00:51:31,430
So if R squared j is one,
811
00:51:31,430 --> 00:51:34,343
we have a perfect linear combination.
812
00:51:36,844 --> 00:51:40,060
And you'll see that
we're dividing by zero.
813
00:51:40,060 --> 00:51:42,193
And again, it makes this blow up too.
814
00:51:43,260 --> 00:51:47,420
And in this case, Xj
adds no new information.
815
00:51:47,420 --> 00:51:51,040
So you want Xj
816
00:51:51,040 --> 00:51:54,850
to say something that
the other Xs don't say.
817
00:51:54,850 --> 00:51:59,290
And the more it says something
that the other Xs don't say,
818
00:51:59,290 --> 00:52:02,130
the more new information
that you're getting here,
819
00:52:02,130 --> 00:52:04,823
and the lower the variance.
820
00:52:05,670 --> 00:52:08,010
There's a way, and I will show you,
821
00:52:08,010 --> 00:52:13,010
to calculate the variance
inflation factor,
822
00:52:13,023 --> 00:52:14,203
or the VIF.
823
00:52:14,203 --> 00:52:18,363
And in SPSS, you'll find it
under collinearity diagnostics.
824
00:52:19,630 --> 00:52:24,233
And the VIF is one divided
by one minus R squared j.
825
00:52:25,430 --> 00:52:27,060
If it's less than four,
826
00:52:27,060 --> 00:52:30,700
this is just sort of a
rule of thumb I've learned,
827
00:52:30,700 --> 00:52:34,200
if your VIF is less than four,
it really isn't a problem.
828
00:52:34,200 --> 00:52:36,700
If it's less than 10, it
isn't a major problem.
829
00:52:36,700 --> 00:52:40,223
It's more than 10, you may have a problem.
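Outside of SPSS, the VIF is easy to compute by hand from its definition. A minimal sketch with illustrative data (not from the lecture): one regressor nearly unrelated to the others, and one that is almost a copy of X1.

```python
# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# X_j on all the other regressors (plus an intercept).
import numpy as np

def vif(X, j):
    """VIF for column j of X (X without the intercept column)."""
    n = X.shape[0]
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    xj = X[:, j]
    fit = others @ np.linalg.lstsq(others, xj, rcond=None)[0]
    r2 = 1 - np.sum((xj - fit) ** 2) / np.sum((xj - xj.mean()) ** 2)
    return 1 / (1 - r2)

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)                      # unrelated to x1
x3 = 0.95 * x1 + 0.05 * rng.normal(size=100)   # near-duplicate of x1

X = np.column_stack([x1, x2, x3])
print(vif(X, 1) < 4)    # True: x2 adds its own information, low VIF
print(vif(X, 2) > 10)   # True: x3 is almost collinear, high VIF
```

By the rule of thumb above, x2 is no problem at all, while x3's VIF well above 10 flags serious multicollinearity.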
830
00:52:46,130 --> 00:52:47,710
Here, I will show you
831
00:52:47,710 --> 00:52:51,980
that if you exclude a variable,
832
00:52:51,980 --> 00:52:56,630
it introduces bias, but it
also decreases the variance.
833
00:52:56,630 --> 00:52:59,190
So there's kind of a trade-off here.
834
00:52:59,190 --> 00:53:02,983
So think back to our k=2 example,
835
00:53:04,140 --> 00:53:06,853
where the real model includes
836
00:53:08,010 --> 00:53:10,293
both X1 and X2,
837
00:53:11,140 --> 00:53:14,890
but we run another model
where we exclude X2.
838
00:53:14,890 --> 00:53:17,920
So beta-1-hat is where we include it,
839
00:53:17,920 --> 00:53:22,533
and it's the formula that we know well,
840
00:53:25,039 --> 00:53:26,580
the bottom box here,
841
00:53:26,580 --> 00:53:31,580
the variance of A1 now
no longer has this term.
842
00:53:35,848 --> 00:53:37,233
Since R squared 1
843
00:53:38,130 --> 00:53:41,680
is always a number between zero and one,
844
00:53:41,680 --> 00:53:46,460
one minus that number is also
a number between zero and one.
845
00:53:46,460 --> 00:53:49,310
And by including it
846
00:53:49,310 --> 00:53:52,133
in the third box down,
847
00:53:53,860 --> 00:53:55,210
in the variance formula,
848
00:53:55,210 --> 00:53:58,840
you're dividing by a number less than one,
849
00:53:58,840 --> 00:54:02,350
which is like multiplying by
a number greater than one.
850
00:54:02,350 --> 00:54:06,380
So the variance of A1 is going to be less
851
00:54:06,380 --> 00:54:08,810
than the variance of beta-1-hat.
852
00:54:08,810 --> 00:54:11,853
So that's the trade-off,
and that's why it happens.
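The flip side of this trade-off, including a variable that is irrelevant but overlaps with X1, can be seen in a Monte Carlo sketch (toy model, my own numbers): across repeated samples, beta-1-hat stays unbiased either way, but its variance is inflated when the irrelevant X2 is in the model.

```python
# Repeatedly draw samples from y = 1 + 2*x1 + u, where x2 is
# correlated with x1 but has a true coefficient of zero.
import numpy as np

rng = np.random.default_rng(4)
n = 40
b_with, b_without = [], []
for _ in range(4000):
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + rng.normal(size=n)   # overlaps with x1, beta2 = 0
    y = 1.0 + 2.0 * x1 + rng.normal(size=n)

    X_long = np.column_stack([np.ones(n), x1, x2])
    X_short = np.column_stack([np.ones(n), x1])
    b_with.append(np.linalg.lstsq(X_long, y, rcond=None)[0][1])
    b_without.append(np.linalg.lstsq(X_short, y, rcond=None)[0][1])

print(abs(np.mean(b_with) - 2.0) < 0.05)   # True: still unbiased
print(np.var(b_with) > np.var(b_without))  # True: but noisier
```

So the irrelevant X2 costs efficiency without buying anything, which is why overspecifying is the milder of the two mistakes but still a mistake.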
853
00:54:17,080 --> 00:54:20,033
So here's more about that.
854
00:54:21,520 --> 00:54:26,117
And it depends on how much
new information X2 adds.
855
00:54:27,420 --> 00:54:30,290
The more new information,
856
00:54:30,290 --> 00:54:34,380
the smaller this R squared is,
857
00:54:34,380 --> 00:54:38,300
and the less effect that
it has on the variance.
858
00:54:38,300 --> 00:54:40,540
So you can play with the math
859
00:54:40,540 --> 00:54:44,423
and see if you can see
what is going on here.
860
00:54:48,760 --> 00:54:52,430
So the variance of A1 is always smaller
861
00:54:52,430 --> 00:54:55,150
than the variance beta-1-hat,
862
00:54:55,150 --> 00:54:58,280
unless X2 is uncorrelated with X1,
863
00:54:58,280 --> 00:55:00,123
and then it would be the same.
864
00:55:01,200 --> 00:55:05,780
And if X1 and X2
865
00:55:05,780 --> 00:55:08,513
are correlated,
866
00:55:13,070 --> 00:55:16,740
the bias trade-off depends on whether B2
867
00:55:18,270 --> 00:55:20,373
is zero or not.
868
00:55:21,220 --> 00:55:25,510
But the variance of beta-1-hat
869
00:55:25,510 --> 00:55:27,263
is always going to be greater.
870
00:55:33,500 --> 00:55:38,090
The bottom line is that
adding an irrelevant variable
871
00:55:38,090 --> 00:55:43,003
exacerbates multicollinearity
and increases variance.
872
00:55:44,520 --> 00:55:47,840
And adding observations
can decrease variance,
873
00:55:47,840 --> 00:55:49,903
but it doesn't address bias.
874
00:55:52,800 --> 00:55:55,120
To calculate the variance,
875
00:55:55,120 --> 00:55:58,150
we need to estimate sigma squared.
876
00:55:58,150 --> 00:56:02,140
So we want an unbiased estimator of that,
877
00:56:02,140 --> 00:56:04,140
which is sigma squared hat.
878
00:56:04,140 --> 00:56:08,417
And we do this by using the residuals,
879
00:56:09,400 --> 00:56:12,840
which is, for every person,
880
00:56:12,840 --> 00:56:15,300
ui-hat equals Yi minus Yi-hat,
881
00:56:15,300 --> 00:56:19,810
so what they actually said
versus the predicted value
882
00:56:19,810 --> 00:56:22,613
based on what their Xs are.
883
00:56:31,807 --> 00:56:34,170
Sigma squared hat is sum
884
00:56:34,170 --> 00:56:38,630
of the ui-hat squared
885
00:56:38,630 --> 00:56:41,630
normalized by degrees of freedom.
886
00:56:41,630 --> 00:56:46,520
So it's the sum of squared residuals
887
00:56:46,520 --> 00:56:49,600
divided by n minus k minus one.
888
00:56:49,600 --> 00:56:53,180
And note that as N grows,
889
00:56:53,180 --> 00:56:55,470
this estimate gets more
and more precise,
890
00:56:55,470 --> 00:56:57,280
and a bigger N
891
00:56:57,280 --> 00:57:00,760
also grows SSTj,
892
00:57:00,760 --> 00:57:04,270
which makes the overall variance
893
00:57:04,270 --> 00:57:08,163
of beta-1-hat smaller and smaller.
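That the estimator sigma²-hat = SSR/(n − k − 1) is unbiased can be seen in a quick Monte Carlo sketch (an assumed toy model with a true sigma² of 2): averaging the estimate over many redrawn error terms recovers the true value.

```python
# Monte Carlo check that SSR / (n - k - 1) is unbiased for sigma^2.
import numpy as np

rng = np.random.default_rng(3)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 2.0, -1.0])
sigma2 = 2.0                               # true error variance

estimates = []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    estimates.append(resid @ resid / (n - k - 1))

print(np.mean(estimates))  # close to the true sigma^2 of 2.0
```

Dividing by n − k − 1 rather than n is what corrects for the k + 1 parameters the residuals were fitted around; dividing by n would understate sigma² on average.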
894
00:57:15,040 --> 00:57:17,880
The standard error of the regression
895
00:57:17,880 --> 00:57:20,840
is the positive square
root of this estimate,
896
00:57:20,840 --> 00:57:25,840
of sigma-hat squared.
897
00:57:25,890 --> 00:57:29,510
And this estimate is unbiased,
898
00:57:29,510 --> 00:57:33,970
only if assumption 5,
899
00:57:33,970 --> 00:57:35,943
homoscedasticity, holds.
900
00:57:37,220 --> 00:57:40,600
So now you could, in theory,
901
00:57:40,600 --> 00:57:45,090
calculate each of the components
902
00:57:45,090 --> 00:57:50,090
and calculate the
variance for a beta-1-hat.
903
00:57:52,970 --> 00:57:57,940
So here is our final problem set for this,
904
00:57:57,940 --> 00:58:01,410
and I want you to think about collinearity,
905
00:58:01,410 --> 00:58:04,210
about omitted variable bias,
906
00:58:04,210 --> 00:58:07,990
and the variance of the beta-hat,
907
00:58:07,990 --> 00:58:12,070
and think about, as each one
increases or decreases,
908
00:58:12,070 --> 00:58:14,583
what happens to the variance and why?
909
00:58:17,530 --> 00:58:18,900
This is what we did.
910
00:58:18,900 --> 00:58:23,630
Thanks for watching this,
911
00:58:23,630 --> 00:58:25,533
and have a good day.