1
00:00:03,210 --> 00:00:06,270
- [Instructor] This is the first
part of a three-part series
2
00:00:06,270 --> 00:00:09,473
on time series econometrics.
3
00:00:10,720 --> 00:00:15,720
I hope you find this section useful.
4
00:00:22,450 --> 00:00:27,450
This section on time series
has three main components.
5
00:00:29,830 --> 00:00:31,080
This is the first one,
6
00:00:31,080 --> 00:00:34,990
and I'm going to do a little introduction.
7
00:00:34,990 --> 00:00:38,570
Talk about two common models,
8
00:00:38,570 --> 00:00:41,963
static and finite distributed lag.
9
00:00:42,880 --> 00:00:47,880
Talk about very well-behaved
time series data,
10
00:00:48,560 --> 00:00:50,230
which will remind you a lot
11
00:00:50,230 --> 00:00:54,010
of the very well-behaved
cross-sectional data
12
00:00:54,010 --> 00:00:55,830
we did many weeks ago.
13
00:00:55,830 --> 00:01:00,830
And what we can assume
about OLS based on those
14
00:01:01,470 --> 00:01:05,000
as far as being unbiased and efficient.
15
00:01:05,000 --> 00:01:09,370
And in the last part, we're
gonna talk about trends
16
00:01:09,370 --> 00:01:13,303
and seasonal effects and
how to account for those.
17
00:01:15,690 --> 00:01:17,580
So, so far, most of our time
18
00:01:17,580 --> 00:01:20,290
has been spent with cross-sectional data
19
00:01:20,290 --> 00:01:24,640
where we have a single
timeframe, T equals one,
20
00:01:24,640 --> 00:01:27,870
but many observations, N equals many.
21
00:01:27,870 --> 00:01:32,870
For example, doing a survey
over a specific time period,
22
00:01:34,730 --> 00:01:38,310
but administering it and gaining data
23
00:01:38,310 --> 00:01:40,670
from lots of different people.
24
00:01:40,670 --> 00:01:42,920
We also spoke a bit about panel data,
25
00:01:42,920 --> 00:01:47,920
where you have, usually
a big N, many respondents
26
00:01:49,030 --> 00:01:53,350
and more than one time period,
27
00:01:53,350 --> 00:01:56,860
but usually a fairly small number.
28
00:01:56,860 --> 00:02:00,963
Say two, three, maybe at most four times.
29
00:02:02,210 --> 00:02:05,610
Now we're looking at a time series
30
00:02:05,610 --> 00:02:08,670
where there's one N but many Ts.
31
00:02:08,670 --> 00:02:13,670
So we're looking at one
set of variables over time,
32
00:02:14,150 --> 00:02:18,620
such as GDP or sales of some good.
33
00:02:18,620 --> 00:02:21,220
It could be a sociological thing
34
00:02:21,220 --> 00:02:26,220
like voting, voting rate,
or many other examples,
35
00:02:27,170 --> 00:02:32,090
educational attainment in a given place,
36
00:02:32,090 --> 00:02:34,420
or a biophysical example,
37
00:02:34,420 --> 00:02:38,823
like phosphorus in the lake or land use.
38
00:02:40,110 --> 00:02:42,550
But in any case, it's looking at
39
00:02:42,550 --> 00:02:47,550
the same respondent, in
a sense, over many years.
40
00:02:48,290 --> 00:02:51,460
And since there's a logical ordering here,
41
00:02:51,460 --> 00:02:53,770
we have to bear in mind that
42
00:02:53,770 --> 00:02:58,770
we're not drawing a random
sample of responses.
43
00:02:59,500 --> 00:03:01,550
We're looking at the same one,
44
00:03:01,550 --> 00:03:06,200
and there's a clear direction of causality
45
00:03:06,200 --> 00:03:09,680
that we can in many cases assume
46
00:03:09,680 --> 00:03:11,240
and we have to account for.
47
00:03:11,240 --> 00:03:14,930
That the past influences the present
48
00:03:14,930 --> 00:03:18,123
and the present will influence the future.
49
00:03:21,540 --> 00:03:24,810
Again, we'll be looking
at a time series process
50
00:03:24,810 --> 00:03:29,810
that has both deterministic
and stochastic components,
51
00:03:30,290 --> 00:03:34,650
that there will be part
of our dependent variable,
52
00:03:34,650 --> 00:03:39,650
whose predicted value is
explained by the regressors.
53
00:03:40,280 --> 00:03:45,130
And also a stochastic
element where we have
54
00:03:45,130 --> 00:03:48,890
an error term capturing what
was sort of unexpected
55
00:03:48,890 --> 00:03:51,173
or cannot be measured each time as well.
56
00:03:55,860 --> 00:03:58,860
Looking at all of the topics
57
00:03:58,860 --> 00:04:00,320
in this time series
58
00:04:03,750 --> 00:04:06,610
section again today,
59
00:04:06,610 --> 00:04:10,160
we're gonna be looking at
very well behaved data.
60
00:04:10,160 --> 00:04:13,400
What are the assumptions,
which are in some cases
61
00:04:13,400 --> 00:04:18,400
sort of unrealistic and simplistic
and building from there,
62
00:04:18,610 --> 00:04:23,610
and then looking at a slightly
less well behaved data
63
00:04:24,000 --> 00:04:26,850
and transformations that we can make
64
00:04:26,850 --> 00:04:28,800
to make them better behaved.
65
00:04:28,800 --> 00:04:33,800
And last, how do we test for
these assumptions being true
66
00:04:33,970 --> 00:04:35,200
and what do we adjust
67
00:04:35,200 --> 00:04:38,560
and how do we adjust if
we find that the data
68
00:04:38,560 --> 00:04:42,463
are not as well behaved as we would like?
69
00:04:47,100 --> 00:04:49,520
There are two basic kinds of models
70
00:04:49,520 --> 00:04:53,010
in time series, static
and finite distributed lag
71
00:04:53,010 --> 00:04:54,913
and I will talk to you about each one.
72
00:04:58,420 --> 00:05:01,260
This is a static model and it's static
73
00:05:01,260 --> 00:05:06,260
because you can see in the
equation that starts at YT
74
00:05:08,290 --> 00:05:13,290
that we're looking at everything
over the same time period.
75
00:05:13,420 --> 00:05:18,420
So the T would be say
the years of the study.
76
00:05:19,500 --> 00:05:23,503
So we might have started
collecting data in 2001,
77
00:05:24,620 --> 00:05:29,010
in which case T equals 1 and continue on
78
00:05:29,010 --> 00:05:32,473
through the year 2020, when T equals 20.
79
00:05:35,633 --> 00:05:40,210
Here all of our variables
are contemporaneous,
80
00:05:40,210 --> 00:05:42,600
that they're happening at the same time,
81
00:05:42,600 --> 00:05:47,600
that we only assume
that the regressor, ZT,
82
00:05:47,740 --> 00:05:50,343
had any effect on that year's YT.
83
00:05:51,672 --> 00:05:56,672
That nothing that happened
in the past has any effect,
84
00:05:57,270 --> 00:05:59,080
and that is why it's static
85
00:06:00,040 --> 00:06:03,710
and that it's all
happening in a single time
86
00:06:03,710 --> 00:06:07,730
and this is very much
like cross sectional data.
87
00:06:07,730 --> 00:06:10,520
And just like there, we can assume
88
00:06:10,520 --> 00:06:15,100
with a single regressor
that the change in Y
89
00:06:15,100 --> 00:06:19,273
will be beta one times the change in Z.
90
00:06:20,150 --> 00:06:25,150
And assuming that the Z is exogenous
91
00:06:26,570 --> 00:06:29,060
and that the error term has no effect on it,
92
00:06:29,060 --> 00:06:33,170
that we can do this sort
of simple calculation.
93
00:06:33,170 --> 00:06:38,170
And the beta is then the change in Y
94
00:06:39,040 --> 00:06:43,733
as a result of the change in
Z, the slope of that line.
95
00:06:46,630 --> 00:06:50,843
We could, of course,
add more regressors.
96
00:06:53,623 --> 00:06:57,573
Z2, Z3, as many as
we think are correct.
97
00:06:59,700 --> 00:07:03,180
And this model
98
00:07:03,180 --> 00:07:05,510
is very much like cross section,
99
00:07:05,510 --> 00:07:08,670
except now we're looking across time
100
00:07:08,670 --> 00:07:10,470
instead of across space.
101
00:07:10,470 --> 00:07:15,280
We're looking at the different realizations
102
00:07:15,280 --> 00:07:17,210
of a variable over time,
103
00:07:17,210 --> 00:07:22,210
instead of that variable
across different respondents.
104
00:07:26,850 --> 00:07:30,610
In contrast here is a
finite distributed lag.
105
00:07:30,610 --> 00:07:34,760
So lag meaning that
regressors from past years
106
00:07:37,530 --> 00:07:40,810
are thought to have an
effect on this year's Y.
107
00:07:40,810 --> 00:07:43,100
So you see YT and ZT,
108
00:07:43,100 --> 00:07:46,800
but also ZT minus one, last year's Z,
109
00:07:46,800 --> 00:07:49,400
and ZT minus two, two years ago's Z.
110
00:07:49,400 --> 00:07:52,390
And since there are two lagged regressors,
111
00:07:52,390 --> 00:07:56,610
this is known as a finite distributed
112
00:07:56,610 --> 00:07:58,600
lag model of order two,
113
00:07:58,600 --> 00:08:01,710
because there are two lags
and we could have more lags
114
00:08:03,010 --> 00:08:05,963
if we think that that
is the correct model.
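As a supplement to the two models just described, here is a minimal Python sketch with simulated data. The numbers, seed, and variable names are all illustrative assumptions, not anything from the lecture; it only shows how the static and FDL(2) design matrices differ.

```python
import numpy as np

# Simulated yearly data: 20 observations of an outcome y and a regressor z.
# The true process here is static, with intercept 1.0 and slope 0.5 (assumed).
rng = np.random.default_rng(0)
T = 20
z = rng.normal(size=T)
y = 1.0 + 0.5 * z + rng.normal(scale=0.1, size=T)

# Static model: y_t = alpha_0 + delta_0 * z_t + u_t (everything contemporaneous).
X_static = np.column_stack([np.ones(T), z])
beta_static, *_ = np.linalg.lstsq(X_static, y, rcond=None)

# Finite distributed lag of order 2:
# y_t = alpha_0 + delta_0*z_t + delta_1*z_{t-1} + delta_2*z_{t-2} + u_t.
# Building the two lags costs the first two observations.
X_fdl = np.column_stack([np.ones(T - 2), z[2:], z[1:-1], z[:-2]])
beta_fdl, *_ = np.linalg.lstsq(X_fdl, y[2:], rcond=None)

print(beta_static)  # roughly the true intercept and slope, (1.0, 0.5)
```

The only real difference is the shifted copies of z in the FDL design matrix, which is why the lagged model estimates four coefficients instead of two.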
115
00:08:11,370 --> 00:08:12,890
Here are the assumptions
116
00:08:12,890 --> 00:08:17,890
of well-behaved OLS time series data.
117
00:08:18,310 --> 00:08:22,360
So the first is, and these
should look pretty familiar,
118
00:08:22,360 --> 00:08:26,600
the first is that it
is linear in parameters
119
00:08:26,600 --> 00:08:31,600
that we can model the
data generating process
120
00:08:32,090 --> 00:08:35,660
as a linear function.
121
00:08:35,660 --> 00:08:39,890
Next, as before, each regressor has
122
00:08:39,890 --> 00:08:43,280
to contribute some new information.
123
00:08:43,280 --> 00:08:47,150
That no Z or X can be a constant
124
00:08:47,150 --> 00:08:52,150
or a linear function of
the other regressors.
125
00:08:52,220 --> 00:08:56,963
Again, that it has to
provide some new information.
126
00:08:58,640 --> 00:09:03,640
Third is the exogeneity assumption
127
00:09:04,570 --> 00:09:09,570
that every error term is uncorrelated
128
00:09:10,240 --> 00:09:13,270
with each of the regressors.
129
00:09:15,040 --> 00:09:18,560
If this holds true, a weaker assumption
130
00:09:18,560 --> 00:09:21,450
that this year's error term
131
00:09:21,450 --> 00:09:25,840
is uncorrelated with
this year's regressors,
132
00:09:25,840 --> 00:09:30,180
this is called
contemporaneously exogenous,
133
00:09:30,180 --> 00:09:33,200
but there's also a stronger assumption
134
00:09:33,200 --> 00:09:37,237
that this year's error term
135
00:09:39,270 --> 00:09:44,270
is uncorrelated with any
regressors for any other time.
136
00:09:47,260 --> 00:09:50,250
And that the regressors are uncorrelated
137
00:09:50,250 --> 00:09:52,990
with the error term for any other time.
138
00:09:52,990 --> 00:09:55,540
This again is a stronger assumption
139
00:09:55,540 --> 00:10:00,463
and when this holds, this
is called strict exogeneity.
140
00:10:04,110 --> 00:10:08,650
More on the third assumption
that we don't really have
141
00:10:08,650 --> 00:10:12,680
to worry about this in
cross sectional data,
142
00:10:12,680 --> 00:10:15,520
that your error term,
143
00:10:15,520 --> 00:10:18,360
some strange thing that
may have happened to you,
144
00:10:18,360 --> 00:10:22,100
or some relatively small and minor thing
145
00:10:22,100 --> 00:10:25,060
that we are not able to measure
146
00:10:25,060 --> 00:10:27,770
is not affected by my regressors.
147
00:10:27,770 --> 00:10:32,600
So whether you found a dollar
148
00:10:32,600 --> 00:10:35,270
or got a flat tire
149
00:10:35,270 --> 00:10:40,270
is not affected by my income
and my age and my behaviors
150
00:10:41,760 --> 00:10:45,313
and so forth since it's a random sample.
151
00:10:46,330 --> 00:10:49,640
But we don't randomly
sample in time series
152
00:10:49,640 --> 00:10:52,380
because we want a series of data
153
00:10:53,670 --> 00:10:56,260
and they are all very intimately tied.
154
00:10:56,260 --> 00:10:58,870
That what happened last year
155
00:10:58,870 --> 00:11:02,940
might affect what happens to this year,
156
00:11:02,940 --> 00:11:06,273
because we're looking at the
same phenomenon over time.
157
00:11:12,270 --> 00:11:16,390
Assumption three fails the same way
in cross-sectional data
158
00:11:16,390 --> 00:11:18,860
in cases where there's a measurement error
159
00:11:18,860 --> 00:11:23,860
or omitted variables, as
well as in time series.
160
00:11:24,470 --> 00:11:27,840
If there is a lagged effect,
161
00:11:27,840 --> 00:11:30,047
where this year's error term
162
00:11:33,910 --> 00:11:38,910
may be affected by last year's regressor.
163
00:11:40,040 --> 00:11:44,990
And in this case, we want to
put last year's regressors,
164
00:11:44,990 --> 00:11:48,870
or the last two years'
regressors into our model.
165
00:11:48,870 --> 00:11:53,410
So we can specifically
account and control for them
166
00:11:53,410 --> 00:11:56,163
and to take them out of the error term.
167
00:11:59,610 --> 00:12:02,500
When these three assumptions hold,
168
00:12:02,500 --> 00:12:05,790
we can assume that OLS is unbiased.
169
00:12:05,790 --> 00:12:09,830
And when there is an omitted variable,
170
00:12:09,830 --> 00:12:14,830
the same kind of omitted variable
bias analysis that we did
171
00:12:15,440 --> 00:12:19,940
way back in the beginning
of class would hold true.
172
00:12:19,940 --> 00:12:24,940
The same basic themes and
analyses would hold true.
173
00:12:29,340 --> 00:12:31,140
The fourth assumption
174
00:12:31,140 --> 00:12:35,440
in well-behaved time series
data is homoskedasticity.
175
00:12:36,630 --> 00:12:40,100
That the variance of our error term,
176
00:12:40,100 --> 00:12:45,100
given any value of X, our
regressor, is a constant.
177
00:12:45,270 --> 00:12:50,270
So before it was that the variance
178
00:12:51,180 --> 00:12:53,663
of your error term and my error term,
179
00:12:55,040 --> 00:12:58,120
had a constant variance.
180
00:12:58,120 --> 00:13:02,730
Now it's that the variance
of this year's error term
181
00:13:02,730 --> 00:13:06,090
and last year's error
term, going back in time,
182
00:13:06,090 --> 00:13:08,550
all have a constant variance.
183
00:13:08,550 --> 00:13:12,610
That's what homoskedasticity
means in this context.
184
00:13:14,180 --> 00:13:17,940
And we could do the same
basic kinds of tests
185
00:13:17,940 --> 00:13:20,470
like we did in cross sectional data
186
00:13:20,470 --> 00:13:23,130
to test for homoskedasticity
187
00:13:23,130 --> 00:13:26,573
such as the White test and
the Breusch-Pagan test.
188
00:13:33,170 --> 00:13:38,050
The fifth assumption is
no serial correlation
189
00:13:38,050 --> 00:13:40,280
that this year's error term
190
00:13:40,280 --> 00:13:43,447
and last year's error
term are uncorrelated.
191
00:13:45,940 --> 00:13:50,490
And in fact, this year's
error term is uncorrelated
192
00:13:50,490 --> 00:13:54,343
with any error term from any past year.
193
00:13:55,820 --> 00:14:00,820
The violation is that something
abnormally high this year
194
00:14:02,960 --> 00:14:06,480
is also abnormally high last year,
195
00:14:06,480 --> 00:14:10,470
that the thing that we
sort of failed to measure,
196
00:14:10,470 --> 00:14:14,300
the minuscule thing that
we failed to measure
197
00:14:14,300 --> 00:14:18,830
or didn't account for this year
is that same minuscule thing
198
00:14:18,830 --> 00:14:21,740
that we didn't account for last year.
199
00:14:21,740 --> 00:14:26,013
And we will learn a
test on how to do that.
200
00:14:27,410 --> 00:14:31,480
I think it's a lot like in panel data,
201
00:14:33,940 --> 00:14:37,370
the idea that there's
this a-sub-i that goes along
202
00:14:37,370 --> 00:14:42,370
with every individual respondent
203
00:14:42,580 --> 00:14:47,580
that is very hard to measure
but that holds through
204
00:14:48,070 --> 00:14:53,020
and here again, I think
that the parts of the model
205
00:14:53,020 --> 00:14:57,170
that are just very hard to
measure will follow through
206
00:14:57,170 --> 00:15:00,200
because we're looking at the same thing
207
00:15:00,200 --> 00:15:04,950
and it's likely that the
errors will be correlated.
208
00:15:06,950 --> 00:15:09,640
And again, we will learn a test
209
00:15:09,640 --> 00:15:14,640
on how to see if this is
true later on in this series.
210
00:15:16,010 --> 00:15:21,010
Note that this doesn't rule out
that the Ys are correlated;
211
00:15:21,120 --> 00:15:23,160
they almost always are.
212
00:15:23,160 --> 00:15:26,400
That last year's GDP
213
00:15:26,400 --> 00:15:31,400
is highly correlated
with the year before that
214
00:15:32,540 --> 00:15:35,503
and the year before that, et cetera.
215
00:15:38,210 --> 00:15:40,990
When these five assumptions hold,
216
00:15:40,990 --> 00:15:45,990
the variance of the beta J has
the same formula as before.
217
00:15:46,040 --> 00:15:49,260
It is a function of the variance
218
00:15:49,260 --> 00:15:52,910
of the overall error term, sigma squared,
219
00:15:52,910 --> 00:15:57,910
as well as how spread out that XJ is
220
00:15:58,920 --> 00:16:02,430
and how collinear that XJ is
221
00:16:02,430 --> 00:16:05,253
with the other regressors in our model.
222
00:16:06,250 --> 00:16:09,490
And under those assumptions,
223
00:16:09,490 --> 00:16:11,790
those five assumptions,
the three unbiasedness assumptions
224
00:16:13,120 --> 00:16:17,900
and the sort of spherical errors
assumption for time series.
225
00:16:17,900 --> 00:16:22,030
If they all hold, if they are
all true, then OLS is BLUE.
226
00:16:22,030 --> 00:16:24,540
It's the best linear unbiased estimator.
227
00:16:24,540 --> 00:16:27,523
It's the most efficient,
unbiased estimator.
228
00:16:31,910 --> 00:16:35,770
Remember that when we started
to get into hypothesis tests
229
00:16:35,770 --> 00:16:38,400
that we added a new assumption,
230
00:16:38,400 --> 00:16:43,400
and that is the error term
is normally distributed
231
00:16:44,140 --> 00:16:46,100
with mean zero
232
00:16:46,100 --> 00:16:51,100
and variance sigma squared.
233
00:16:52,210 --> 00:16:56,600
And we are adding another
slightly stronger assumption
234
00:16:56,600 --> 00:17:01,600
that the Us are not only
independent of X as before,
235
00:17:02,350 --> 00:17:04,110
I think that was assumption four,
236
00:17:04,110 --> 00:17:06,747
but that they are independently
237
00:17:09,740 --> 00:17:12,903
and identically distributed, IID.
238
00:17:13,780 --> 00:17:18,150
So if the sixth assumption holds
239
00:17:18,150 --> 00:17:23,150
then the OLS estimators also
240
00:17:23,560 --> 00:17:27,630
have predictable statistical properties,
241
00:17:27,630 --> 00:17:31,200
and we can use the kind
of inference tests,
242
00:17:31,200 --> 00:17:33,270
T-tests, the F-tests,
243
00:17:33,270 --> 00:17:38,270
and we can create confidence
intervals the same way,
244
00:17:40,790 --> 00:17:43,250
and they have the same interpretation
245
00:17:43,250 --> 00:17:45,653
as we learned about in cross section.
246
00:17:48,810 --> 00:17:51,671
So just a bit more on IID.
247
00:17:51,671 --> 00:17:56,630
IID stands for independent
and identically distributed.
248
00:17:56,630 --> 00:18:00,320
So one way of thinking about this is
249
00:18:00,320 --> 00:18:05,250
the bingo ball wheel that
you're drawing the errors from.
250
00:18:05,250 --> 00:18:09,730
It does not change, that it's
the exact same distribution.
251
00:18:09,730 --> 00:18:14,730
That's what the identically
distributed means.
252
00:18:14,760 --> 00:18:19,670
It's the exact same bell curve probability
253
00:18:19,670 --> 00:18:23,313
of drawing any error term over every year.
254
00:18:24,590 --> 00:18:28,460
And the independent means
that it has no memory,
255
00:18:28,460 --> 00:18:32,020
that the ball is replaced each year,
256
00:18:32,020 --> 00:18:34,980
that the error term
that you drew this time
257
00:18:34,980 --> 00:18:39,273
has no effect on the error
term that you draw next time.
258
00:18:47,150 --> 00:18:51,560
We very often want to use
dummy variables in time series
259
00:18:51,560 --> 00:18:53,740
to account for whether
260
00:18:53,740 --> 00:18:58,740
an important event or policy or regime
261
00:18:59,360 --> 00:19:01,653
was in place during that time.
262
00:19:03,190 --> 00:19:07,590
A good example would be if there was
263
00:19:07,590 --> 00:19:12,590
a big recall of romaine,
which we've seen a few times,
264
00:19:13,940 --> 00:19:18,940
that the demand for romaine
would depend not only sort of
265
00:19:19,540 --> 00:19:24,540
on the supply and the prices
and the people's preferences
266
00:19:26,410 --> 00:19:30,410
and things, and their
incomes and things like that,
267
00:19:30,410 --> 00:19:35,140
but you would want to have
a dummy variable in place
268
00:19:35,140 --> 00:19:37,670
that accounts for the fact that there
was this huge scare
269
00:19:37,670 --> 00:19:39,690
and there was this huge recall
270
00:19:41,300 --> 00:19:43,873
that happened during that time.
271
00:19:45,920 --> 00:19:49,770
Otherwise you're not really
going to get a good account
272
00:19:49,770 --> 00:19:52,610
of what is the effect of income
273
00:19:52,610 --> 00:19:57,270
and price and supply and preferences.
274
00:19:57,270 --> 00:19:59,580
And I was thinking as well,
275
00:19:59,580 --> 00:20:03,670
that when we model time series GDP
276
00:20:03,670 --> 00:20:08,610
and other economic variables for right now
277
00:20:08,610 --> 00:20:12,940
that we will probably need
to have a dummy variable
278
00:20:12,940 --> 00:20:17,940
for the coronavirus that, you know,
279
00:20:19,200 --> 00:20:21,290
the intercept shift of what happened
280
00:20:21,290 --> 00:20:24,353
just as a result of that
holding all else equal.
281
00:20:27,820 --> 00:20:31,460
It's also very common in time series
282
00:20:31,460 --> 00:20:33,620
to acknowledge that the data
283
00:20:33,620 --> 00:20:36,433
have natural trends and seasons.
284
00:20:37,420 --> 00:20:42,420
So trends take into account that there's,
285
00:20:43,560 --> 00:20:47,420
especially in economics,
that there is a time trend,
286
00:20:47,420 --> 00:20:51,210
that our GDP increases over time,
287
00:20:51,210 --> 00:20:55,050
worker productivity increases over time
288
00:20:55,050 --> 00:20:59,540
sort of independent of
many, many other factors.
289
00:20:59,540 --> 00:21:02,000
That it adds this sort of a natural
290
00:21:02,000 --> 00:21:04,903
or recurring time trend built into it.
291
00:21:06,440 --> 00:21:10,390
And it's important that
we account for that
292
00:21:10,390 --> 00:21:15,390
because if productivity and
GDP are both trending upward,
293
00:21:18,250 --> 00:21:22,730
that failing to account for
that trend in both of them
294
00:21:22,730 --> 00:21:25,380
would make the relationship between them
295
00:21:25,380 --> 00:21:27,690
seem stronger than it actually is
296
00:21:27,690 --> 00:21:29,390
and we would get a biased estimate.
297
00:21:29,390 --> 00:21:31,690
That it would be in a sense,
298
00:21:31,690 --> 00:21:34,600
an omitted variable bias by failing
299
00:21:34,600 --> 00:21:37,183
to account for that trend.
300
00:21:40,710 --> 00:21:44,440
Here is a very simple example
301
00:21:47,890 --> 00:21:50,460
of a Y that has a time trend.
302
00:21:50,460 --> 00:21:54,070
So T would just be the actual year
303
00:21:54,070 --> 00:21:57,630
that the data were collected.
304
00:21:57,630 --> 00:22:02,630
So again, if we have 2001 to 2020 data,
305
00:22:03,270 --> 00:22:07,730
T equals 1 in year 2001, and so on,
306
00:22:07,730 --> 00:22:11,530
T equals 20 in year 2020,
307
00:22:13,760 --> 00:22:17,680
and in this model, holding all else equal,
308
00:22:17,680 --> 00:22:22,680
Y will increase or shrink
by the rate of A1 each year.
309
00:22:23,430 --> 00:22:26,550
So if A1 is greater than zero,
310
00:22:26,550 --> 00:22:31,550
the predicted value of Y will
increase by A1 each year.
311
00:22:39,290 --> 00:22:42,230
We very often model things with
312
00:22:42,230 --> 00:22:45,860
a number of regressors and a time trend.
313
00:22:45,860 --> 00:22:48,457
So you see here, there's two regressors,
314
00:22:48,457 --> 00:22:52,870
X1, X2, as well as this time trend, T.
315
00:22:52,870 --> 00:22:57,170
And again, that this
accounts for the trend
316
00:22:57,170 --> 00:23:02,170
that may be in either Y or in X1 or X2,
317
00:23:03,570 --> 00:23:08,570
and by leaving out this trend
by not controlling for it,
318
00:23:08,830 --> 00:23:12,520
it can cause omitted variable bias.
319
00:23:12,520 --> 00:23:17,480
And note that you should
include a time trend,
320
00:23:17,480 --> 00:23:22,010
even if the Xs trend and
the Y doesn't, since you want
321
00:23:22,010 --> 00:23:26,380
to take out the effect
of time on these Xs
322
00:23:26,380 --> 00:23:30,733
and, holding time equal,
see how X affects Y.
323
00:23:37,530 --> 00:23:42,530
The R squared of a time series
model tends to be quite high,
324
00:23:43,970 --> 00:23:48,890
that these kind of aggregated
data tend to explain
325
00:23:48,890 --> 00:23:52,910
the relationships and
result in a higher R squared
326
00:23:52,910 --> 00:23:55,130
than we get with something
327
00:23:55,130 --> 00:23:59,170
like doing a survey and asking
how the regressors
328
00:23:59,170 --> 00:24:04,170
across individual people
affect the dependent variable.
329
00:24:04,220 --> 00:24:06,680
So, one way to get a better sense
330
00:24:06,680 --> 00:24:11,680
of how much of the variation
in Y is explained by X,
331
00:24:13,260 --> 00:24:16,680
that is R squared independent of time
332
00:24:16,680 --> 00:24:19,860
is to detrend our data.
333
00:24:19,860 --> 00:24:20,840
And you do that
334
00:24:22,505 --> 00:24:25,083
by regressing YT only on T
335
00:24:27,310 --> 00:24:29,700
and saving those residuals,
336
00:24:29,700 --> 00:24:32,567
which I call this weird Y umlaut.
337
00:24:33,478 --> 00:24:37,283
And then R squared is this formula
338
00:24:38,930 --> 00:24:42,960
where SSR is that from regressing
339
00:24:44,000 --> 00:24:48,530
YT on X1, X2 and T.
340
00:24:48,530 --> 00:24:53,530
And it gives you a better idea
of taking away time trend,
341
00:24:54,810 --> 00:24:57,633
how well do the Xs explain our Y.
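The detrending procedure just described can be sketched in Python. The simulated trend, coefficients, and seed are illustrative assumptions; the point is only the mechanics: regress Y on the trend, save the residuals (the "Y umlaut"), and compute the detrended R-squared from the full model's SSR.

```python
import numpy as np

# Simulated data where both the regressor and the outcome trend upward.
rng = np.random.default_rng(2)
T = 50
t = np.arange(1, T + 1, dtype=float)
x = rng.normal(size=T) + 0.05 * t          # regressor with its own mild trend
y = 1.0 + 0.4 * x + 0.1 * t + rng.normal(scale=0.5, size=T)

# Full regression: y on an intercept, x, and the time trend; keep its SSR.
X_full = np.column_stack([np.ones(T), x, t])
b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
ssr = np.sum((y - X_full @ b_full) ** 2)

# Detrend y: regress y only on the trend and keep the residuals.
X_t = np.column_stack([np.ones(T), t])
b_t, *_ = np.linalg.lstsq(X_t, y, rcond=None)
y_detrended = y - X_t @ b_t

# Detrended R^2: share of the *detrended* variation in y that the model explains.
r2_detrended = 1.0 - ssr / np.sum(y_detrended ** 2)

# Conventional R^2 for comparison; the shared trend inflates it.
r2_plain = 1.0 - ssr / np.sum((y - y.mean()) ** 2)
```

Because the detrended total sum of squares is never larger than the raw one, the detrended R-squared never exceeds the conventional R-squared.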
342
00:25:02,640 --> 00:25:06,940
There's also many economic phenomena
343
00:25:06,940 --> 00:25:08,970
that have a clear seasonality,
344
00:25:08,970 --> 00:25:11,450
such as most Christmas trees
345
00:25:11,450 --> 00:25:14,540
are sold in November and December,
346
00:25:14,540 --> 00:25:19,540
hay is sold in the summer when it's fresh.
347
00:25:22,290 --> 00:25:27,230
Another example is
nursery flower transplants
348
00:25:28,540 --> 00:25:32,200
are much more often sold in the spring.
349
00:25:32,200 --> 00:25:35,610
So you can include dummy variables
350
00:25:35,610 --> 00:25:39,520
for specific months or quarters or seasons
351
00:25:39,520 --> 00:25:44,520
where you think that sales will
be particularly high or low,
352
00:25:45,720 --> 00:25:50,720
model it with them and you
can use a T or an F-test
353
00:25:51,280 --> 00:25:53,900
to determine whether that has
354
00:25:53,900 --> 00:25:56,920
a significant effect on our model
355
00:25:56,920 --> 00:26:01,920
and much like we did before,
we can deseasonalize our Ys,
356
00:26:03,200 --> 00:26:06,380
where we regress our YTs
357
00:26:06,380 --> 00:26:11,380
only on our seasonal dummy
variables, save those residual,
358
00:26:14,390 --> 00:26:19,390
and then regress those Y umlauts
359
00:26:19,560 --> 00:26:22,900
on our Xs again,
360
00:26:22,900 --> 00:26:26,030
and use R squared as another measure
361
00:26:26,030 --> 00:26:28,203
of the goodness of fit of the model.
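The deseasonalizing procedure works the same way as the detrending one; here is a minimal sketch with quarterly dummies. The seasonal effects and all numbers are made up for illustration (e.g. an assumed third-quarter spike), not taken from the lecture.

```python
import numpy as np

# Ten years of simulated quarterly data with a built-in seasonal pattern.
rng = np.random.default_rng(3)
quarters = np.tile(np.arange(4), 10)              # 0..3 repeating: Q1..Q4
n = quarters.size
x = rng.normal(size=n)
season_effect = np.array([0.0, 1.0, 3.0, -1.0])   # assumed Q3 spike
y = 2.0 + 0.5 * x + season_effect[quarters] + rng.normal(scale=0.3, size=n)

# Dummy variables for Q2-Q4; Q1 is the base category.
D = np.column_stack([(quarters == q).astype(float) for q in (1, 2, 3)])

# Deseasonalize: regress y on an intercept and the dummies, keep the residuals.
Xs = np.column_stack([np.ones(n), D])
bs, *_ = np.linalg.lstsq(Xs, y, rcond=None)
y_deseas = y - Xs @ bs

# Then regress the deseasonalized y on x: x's effect net of the seasons.
Xx = np.column_stack([np.ones(n), x])
bx, *_ = np.linalg.lstsq(Xx, y_deseas, rcond=None)
```

The slope from the final regression should land near the assumed 0.5 effect of x, since the seasonal swings have been taken out first.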
362
00:26:30,880 --> 00:26:34,050
Basically, again, that
many of the assumptions
363
00:26:34,050 --> 00:26:37,700
are unrealistic, too restrictive,
364
00:26:37,700 --> 00:26:40,000
don't really match the real world.
365
00:26:40,000 --> 00:26:43,100
So we're gonna look at those models
366
00:26:43,100 --> 00:26:48,100
and look at some common
ones and some principles
367
00:26:48,220 --> 00:26:50,300
that we have to abide by
368
00:26:50,300 --> 00:26:52,970
for those assumptions to fit and last,
369
00:26:52,970 --> 00:26:56,750
a transformation that
you're already familiar with
370
00:26:56,750 --> 00:26:57,850
that can deal with it.
371
00:27:02,270 --> 00:27:06,430
So two pieces of vocabulary
that we're gonna deal with,
372
00:27:06,430 --> 00:27:08,860
dependence and stationarity.
373
00:27:08,860 --> 00:27:10,500
And both of them
374
00:27:10,500 --> 00:27:13,023
cause issues with inference,
375
00:27:16,240 --> 00:27:20,700
that it can mess up our
hypothesis tests, our T and F tests
376
00:27:20,700 --> 00:27:22,823
that we would like to run on our data.
377
00:27:28,470 --> 00:27:32,400
We think about dependence and how much
378
00:27:32,400 --> 00:27:37,400
does the past affect the
present and the future.
379
00:27:39,150 --> 00:27:42,603
And we say that a time series process
380
00:27:46,150 --> 00:27:48,500
is weakly dependent,
381
00:27:48,500 --> 00:27:53,113
that as the distance between time periods
382
00:27:55,380 --> 00:27:58,360
gets larger, the relationship
383
00:27:58,360 --> 00:28:01,390
between those variables gets smaller.
384
00:28:01,390 --> 00:28:06,390
So basically we want weak dependence
385
00:28:06,690 --> 00:28:10,620
so that this year's
386
00:28:13,310 --> 00:28:18,310
realization of X and future years,
387
00:28:18,520 --> 00:28:23,033
that relationship gets
very small, very fast.
388
00:28:24,190 --> 00:28:28,290
You can think of independence as being like
389
00:28:28,290 --> 00:28:31,070
when we're drawing a random sample
390
00:28:31,070 --> 00:28:34,270
where the coin has no memory at all,
391
00:28:34,270 --> 00:28:38,100
that you and I
as respondents
392
00:28:38,100 --> 00:28:42,713
don't have anything
to do with each other.
393
00:28:45,030 --> 00:28:50,030
Dependence, really, is where they
are very intimately related
394
00:28:53,500 --> 00:28:57,360
and last year and past
years affect this year
395
00:28:57,360 --> 00:29:00,740
in a very strong way, or thinking about
396
00:29:00,740 --> 00:29:04,250
that the coin remembers every flip
397
00:29:06,912 --> 00:29:11,507
really messes up inference
and makes it impossible.
398
00:29:13,170 --> 00:29:17,793
Where with weak dependence,
this memory fades over time.
399
00:29:18,960 --> 00:29:23,960
Stationarity means that the variables
400
00:29:23,960 --> 00:29:28,960
have the same probability
distribution over time.
401
00:29:29,630 --> 00:29:32,970
So, much as we assumed that they have
402
00:29:32,970 --> 00:29:37,800
this normal IID distribution over time,
403
00:29:42,110 --> 00:29:45,100
we need to make assumptions
404
00:29:45,100 --> 00:29:49,143
and know that the probability distribution
405
00:29:50,397 --> 00:29:55,360
of our error terms and of
our dependent variables
406
00:29:55,360 --> 00:29:59,373
have a predictable, identical distribution
407
00:30:00,920 --> 00:30:05,350
in order for us to be able
to make any kind of inference
408
00:30:05,350 --> 00:30:07,410
and use T-tests and F-tests
409
00:30:07,410 --> 00:30:10,173
and those tests that
we have learned about.
410
00:30:13,180 --> 00:30:15,820
Basically weak dependence is needed
411
00:30:15,820 --> 00:30:18,280
for the law of large numbers
412
00:30:18,280 --> 00:30:22,100
and the central limit theorem to hold.
413
00:30:22,100 --> 00:30:25,570
That because we do not randomly sample,
414
00:30:25,570 --> 00:30:30,570
if the realizations are
too closely related,
415
00:30:31,600 --> 00:30:33,340
we can't do inference
416
00:30:33,340 --> 00:30:38,340
and strong dependence
totally messes it up.
417
00:30:38,640 --> 00:30:41,460
So we are going to look at models
418
00:30:41,460 --> 00:30:45,910
that have strong and weak dependence
419
00:30:45,910 --> 00:30:50,090
that we may encounter in time series data.
420
00:30:54,070 --> 00:30:57,920
So what we're finding is
these autoregressive models
421
00:30:57,920 --> 00:31:02,300
where last year's Y
422
00:31:02,300 --> 00:31:06,530
or past year's Y appear as regressors
423
00:31:06,530 --> 00:31:11,530
with this year's Y on the left side.
424
00:31:12,018 --> 00:31:13,800
And then two examples of that,
425
00:31:13,800 --> 00:31:18,800
the so-called unit root
process and the random walk.
426
00:31:19,660 --> 00:31:23,323
And we'll go through each one
over in the next few slides.
427
00:31:27,750 --> 00:31:28,940
Here's an example
428
00:31:28,940 --> 00:31:33,803
of an autoregressive process
of order one, an AR1.
429
00:31:34,700 --> 00:31:39,620
It's autoregressive,
because last year's Y,
430
00:31:39,620 --> 00:31:43,323
YT minus one, appears as a regressor.
431
00:31:44,210 --> 00:31:46,870
And it's of order one because only
432
00:31:46,870 --> 00:31:51,870
one lagged dependent variable
is a regressor in this model.
433
00:31:57,470 --> 00:32:01,690
The properties of this model
434
00:32:01,690 --> 00:32:04,770
are that Y starts at some sort
435
00:32:04,770 --> 00:32:08,030
of starting point, at time zero.
436
00:32:08,030 --> 00:32:13,030
Our error term is again,
IID zero sigma squared,
437
00:32:14,060 --> 00:32:19,060
and that the ETs are
independent of Y naught.
438
00:32:19,700 --> 00:32:23,240
And just for simplicity, we often assume
439
00:32:23,240 --> 00:32:27,003
that the expected value
of Y naught equals zero,
440
00:32:28,050 --> 00:32:33,050
or we could subtract
out that value each time
441
00:32:33,270 --> 00:32:37,100
to make the math easier.
442
00:32:37,100 --> 00:32:39,720
So we're gonna look at the value
443
00:32:39,720 --> 00:32:43,620
of this coefficient P
444
00:32:43,620 --> 00:32:47,230
or I believe in the
Wooldridge book it's rho,
445
00:32:47,230 --> 00:32:50,350
but I just made it a P
446
00:32:50,350 --> 00:32:55,203
because I have P on my
keyboard and not rho.
447
00:32:57,180 --> 00:33:01,550
So the value of rho is important for any kind of inference.
448
00:33:08,490 --> 00:33:12,100
So when rho is less than one,
449
00:33:12,100 --> 00:33:17,100
then we see that with the
expected value of YT plus H,
450
00:33:17,710 --> 00:33:22,133
given our starting point YT,
is rho to the H times YT,
451
00:33:23,360 --> 00:33:27,180
for any number of years greater than one.
452
00:33:27,180 --> 00:33:31,750
And we see that the influence
of our starting point,
453
00:33:31,750 --> 00:33:34,090
YT gets smaller and smaller over time,
454
00:33:34,090 --> 00:33:37,770
and it approaches zero overall
as H gets bigger and bigger.
455
00:33:37,770 --> 00:33:41,470
And this is what we want,
we have weak dependence.
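A quick numerical sketch of this weak dependence, using a hypothetical AR(1) with rho equal to 0.5 simulated in Python (assuming numpy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, T = 0.5, 10_000              # hypothetical |rho| < 1, so the process is weakly dependent
e = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + e[t]  # AR(1): y_t = rho * y_{t-1} + e_t

# E[y_{t+h} | y_t] = rho**h * y_t, so the starting point's influence dies out:
for h in (1, 5, 20):
    print(h, rho ** h)

# The sample autocorrelation at a long lag is correspondingly tiny:
print(np.corrcoef(y[:-20], y[20:])[0, 1])
```

Raising rho to higher and higher powers shrinks it toward zero, which is exactly the "influence of the starting point fades" behavior described above.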
456
00:33:41,470 --> 00:33:44,580
When rho equals one,
457
00:33:44,580 --> 00:33:48,390
the best prediction of
YT plus H is always YT.
458
00:33:52,050 --> 00:33:56,550
We call this a unit root
process when rho equals one,
459
00:33:56,550 --> 00:33:59,400
and this leads to strong dependence,
460
00:33:59,400 --> 00:34:03,180
and we need to deal with this
461
00:34:03,180 --> 00:34:08,180
in order to do estimation and inference.
462
00:34:10,200 --> 00:34:14,120
So again, the unit root process
463
00:34:14,120 --> 00:34:18,263
is when this P or rho equals one.
464
00:34:22,400 --> 00:34:27,400
One very common unit root process is the
465
00:34:28,790 --> 00:34:33,153
so-called random walk where Y,
466
00:34:34,830 --> 00:34:39,830
this year, equals last
year's Y plus an error term.
467
00:34:40,520 --> 00:34:43,100
And we assume that the error term, ET,
468
00:34:43,100 --> 00:34:47,860
has a mean of zero and constant variance
469
00:34:47,860 --> 00:34:52,860
and so Y starts at the
previous year's value
470
00:34:52,970 --> 00:34:54,830
and adds an error term.
471
00:34:54,830 --> 00:34:59,830
And every year just adds that error term
472
00:35:01,440 --> 00:35:05,783
to last year's value so
that there's a random walk.
473
00:35:06,810 --> 00:35:09,830
The random part is the error,
474
00:35:09,830 --> 00:35:12,640
and it might be high, it might be low,
475
00:35:12,640 --> 00:35:17,010
but it'll just sort of
walk around that mean.
476
00:35:17,010 --> 00:35:22,010
And if we assume, Y naught,
just for mathematical simplicity
477
00:35:22,360 --> 00:35:24,533
again equals zero,
478
00:35:27,030 --> 00:35:31,630
and the expected value of Y in any year
479
00:35:34,240 --> 00:35:36,980
equals this starting point.
480
00:35:36,980 --> 00:35:39,930
So you can imagine the line sort of
481
00:35:39,930 --> 00:35:44,930
meandering all around over
time, going up and down,
482
00:35:45,840 --> 00:35:50,840
but basically the
expected value of any year
483
00:35:50,900 --> 00:35:53,143
is that starting point.
484
00:35:56,930 --> 00:36:01,460
So the value of Y
485
00:36:03,140 --> 00:36:05,710
in year T plus H
486
00:36:05,710 --> 00:36:10,710
is just the sum of the error
terms over all those years,
487
00:36:11,150 --> 00:36:14,490
plus YT, the starting point.
488
00:36:14,490 --> 00:36:19,490
So the value of any year is the
sum of all those error terms
489
00:36:21,060 --> 00:36:26,060
plus the starting point.
490
00:36:26,274 --> 00:36:31,274
So if we assume a starting
point of zero, again,
491
00:36:31,700 --> 00:36:34,910
it'll just be a line that meanders around
492
00:36:36,140 --> 00:36:40,793
the zero line, the X axis.
493
00:36:42,060 --> 00:36:45,523
And that for any value of YT,
494
00:36:47,900 --> 00:36:51,517
that the expected value
of YT plus H equals YT.
495
00:36:54,780 --> 00:36:58,150
And you can see in the first equation
496
00:36:58,150 --> 00:37:01,440
that if we have a big enough H
497
00:37:01,440 --> 00:37:05,050
that if we add them
all up, then,
498
00:37:05,050 --> 00:37:08,450
since each error's expected value is
zero, the expected value
499
00:37:08,450 --> 00:37:12,150
of their sum is also zero.
500
00:37:12,150 --> 00:37:17,150
And so the expected value of YT plus H
501
00:37:18,390 --> 00:37:22,743
will always equal our starting point, YT.
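As a sketch of that point (a hypothetical simulation, assuming numpy), averaging many random-walk paths started at zero shows the expected value staying at the starting point even as individual paths wander far away:

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, T = 5_000, 200
e = rng.normal(size=(n_paths, T))
y = e.cumsum(axis=1)        # each row is a random walk from y_0 = 0

print(y[:, -1].mean())      # mean across paths stays near the start, 0
print(y[:, -1].var())       # but the variance grows with T (about T here)
```

The growing variance is the strong-dependence problem in miniature: any single path drifts arbitrarily far, even though the expectation never moves.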
502
00:37:27,720 --> 00:37:32,720
So another example is the
so-called random walk with drift.
503
00:37:32,900 --> 00:37:37,470
So this is again, a unit root process
504
00:37:37,470 --> 00:37:40,483
because our rho, or P, equals one.
505
00:37:44,640 --> 00:37:47,587
And in this case, there's a drift
506
00:37:47,587 --> 00:37:50,720
and that drift is A naught.
507
00:37:50,720 --> 00:37:55,720
So this year's Y is last year's Y
508
00:37:55,910 --> 00:38:00,030
plus this drift, A naught, plus an error term.
509
00:38:00,030 --> 00:38:03,903
So this has both persistence and trend.
510
00:38:05,630 --> 00:38:10,630
It's persistent because it
builds on the last year's value
511
00:38:12,621 --> 00:38:14,850
and the expected value
512
00:38:14,850 --> 00:38:18,167
of last year's value is carried over.
513
00:38:21,600 --> 00:38:25,170
And the trend is that every year
514
00:38:25,170 --> 00:38:27,463
it adds this factor A naught.
515
00:38:30,730 --> 00:38:32,780
So that lets us talk
516
00:38:32,780 --> 00:38:36,510
a bit more about persistence and trend.
517
00:38:36,510 --> 00:38:39,350
So we talked a bunch
about trend last time,
518
00:38:39,350 --> 00:38:43,890
and persistence is again,
519
00:38:43,890 --> 00:38:47,830
when something that happens
520
00:38:47,830 --> 00:38:51,453
is felt over time.
521
00:38:54,920 --> 00:38:58,420
So if we don't deal with them,
they'll mess up analysis.
522
00:38:58,420 --> 00:39:02,440
We learned last time, how trend can be
523
00:39:02,440 --> 00:39:05,800
an important source of omitted variable bias,
524
00:39:05,800 --> 00:39:10,800
and the persistence is when some shock
525
00:39:11,400 --> 00:39:16,400
continues to be felt into the future,
526
00:39:16,890 --> 00:39:21,620
or that this year's realization
527
00:39:21,620 --> 00:39:25,150
is very closely related to past years,
528
00:39:25,150 --> 00:39:29,980
that all of last year's
and previous years values
529
00:39:29,980 --> 00:39:33,720
are sort of remembered over time.
530
00:39:33,720 --> 00:39:36,380
And it's interesting to note
531
00:39:36,380 --> 00:39:39,350
that interest rates are persistent,
532
00:39:39,350 --> 00:39:44,350
that this quarter's interest rate
533
00:39:44,700 --> 00:39:48,920
is going to be very closely
related to last quarter's,
534
00:39:48,920 --> 00:39:51,930
but they don't really trend.
535
00:39:51,930 --> 00:39:56,930
That they're not steadily
upward or downward over time
536
00:39:57,250 --> 00:39:59,930
whereas GDP is likely
537
00:39:59,930 --> 00:40:02,563
both persistent and trending.
538
00:40:05,320 --> 00:40:10,320
That, you know, in most
years here in the US,
539
00:40:11,890 --> 00:40:14,690
you'll get a normal sort of growth,
540
00:40:14,690 --> 00:40:19,680
only about 2% annual growth.
541
00:40:19,680 --> 00:40:23,053
So last year's GDP
542
00:40:24,780 --> 00:40:28,090
plays a big part in what this year's is,
543
00:40:28,090 --> 00:40:33,090
but there's this sort of
overall trend over time.
544
00:40:33,160 --> 00:40:37,660
Now, the huge contraction of our economy
545
00:40:40,430 --> 00:40:43,520
that we're experiencing with the virus
546
00:40:43,520 --> 00:40:46,880
sort of throws it off,
but that's hopefully
547
00:40:46,880 --> 00:40:51,773
a once-in-a-lifetime event.
548
00:40:57,000 --> 00:41:02,000
So we can see that the
random walk with drift
549
00:41:02,200 --> 00:41:05,840
has both persistence and trend
550
00:41:05,840 --> 00:41:10,840
and so the A naught is the trend part
551
00:41:13,040 --> 00:41:16,740
and the fact that it's a unit root process
552
00:41:16,740 --> 00:41:19,263
makes it highly persistent.
553
00:41:20,820 --> 00:41:25,820
So for any year, T,
554
00:41:25,860 --> 00:41:27,003
that the YT
555
00:41:33,200 --> 00:41:36,940
equals this A naught added on T times.
556
00:41:36,940 --> 00:41:41,510
So A naught times T plus
all of the error terms,
557
00:41:41,510 --> 00:41:44,010
plus Y naught, our starting point.
558
00:41:44,010 --> 00:41:46,870
And if we, again, assume that
our starting point is zero,
559
00:41:46,870 --> 00:41:51,870
then the expected value
of YT is just A naught T.
560
00:41:52,510 --> 00:41:55,163
And if A naught is greater than zero,
561
00:41:56,160 --> 00:41:58,470
the expected value will grow over time.
562
00:41:58,470 --> 00:42:00,963
And if it's less, it
will shrink over time.
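A minimal sketch of a random walk with drift (hypothetical drift a0 of 0.5, assuming numpy), checking that the expected value of Y at time T is A naught times T:

```python
import numpy as np

rng = np.random.default_rng(2)
a0, T = 0.5, 1_000           # hypothetical drift a0 > 0, so the series trends upward
e = rng.normal(size=T)
y = np.cumsum(a0 + e)        # y_t = a0 + y_{t-1} + e_t, starting from y_0 = 0

print(y[-1], a0 * T)         # y_T wanders around its expected value a0 * T = 500
```

With a0 greater than zero the simulated path climbs on average by a0 each period, which is the trend component sitting on top of the unit-root persistence.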
563
00:42:05,030 --> 00:42:10,030
So again, this is a
highly persistent process.
564
00:42:10,530 --> 00:42:15,050
It's a unit root,
565
00:42:15,050 --> 00:42:19,080
it's trending and persistent,
566
00:42:19,080 --> 00:42:23,650
and we need to be able
to transform the data
567
00:42:23,650 --> 00:42:28,650
so that it becomes weakly dependent
568
00:42:29,130 --> 00:42:32,053
and we can work with these data.
569
00:42:35,260 --> 00:42:38,150
Not surprisingly the most
570
00:42:38,150 --> 00:42:42,360
common transformation is differencing.
571
00:42:42,360 --> 00:42:46,640
We saw that much like
we did with panel data
572
00:42:46,640 --> 00:42:48,940
where we subtract
573
00:42:50,940 --> 00:42:53,700
this year's minus last year's
574
00:42:53,700 --> 00:42:57,780
and that year's minus the
year before and so on.
575
00:42:57,780 --> 00:43:02,750
And in this way, it gets
rid of both the persistence
576
00:43:02,750 --> 00:43:07,330
because it's only measuring
the change over time.
577
00:43:07,330 --> 00:43:12,013
And the trend, because the A naught trend
578
00:43:13,080 --> 00:43:15,260
is subtracted out every time.
579
00:43:15,260 --> 00:43:20,260
And in that way, when we
difference a random walk,
580
00:43:20,730 --> 00:43:24,400
the change in Y
581
00:43:24,400 --> 00:43:28,070
then just becomes
the error term.
582
00:43:28,070 --> 00:43:30,580
And if we have other regressors
583
00:43:30,580 --> 00:43:33,590
much like we saw with panel data,
584
00:43:33,590 --> 00:43:37,530
we can regress how the change in Y
585
00:43:37,530 --> 00:43:40,740
is a function of the change
in Xs just like we did
586
00:43:40,740 --> 00:43:44,633
in the first differencing
model in panel data.
587
00:43:49,090 --> 00:43:53,810
So by way of overview in the first section
588
00:43:53,810 --> 00:43:58,810
we looked at well behaved data
with restrictive assumptions
589
00:43:59,330 --> 00:44:01,540
and what are their properties.
590
00:44:01,540 --> 00:44:03,460
Here we're looking at times
591
00:44:03,460 --> 00:44:07,010
when the data are less well behaved,
592
00:44:07,010 --> 00:44:09,610
but with more realistic assumptions
593
00:44:09,610 --> 00:44:13,660
and what we can do to transform it.
594
00:44:13,660 --> 00:44:17,160
And then last, in the next one,
595
00:44:17,160 --> 00:44:21,957
we begin to look at
what happens if we have
596
00:44:22,990 --> 00:44:27,513
either serial correlation
or heteroskedasticity.
597
00:44:30,600 --> 00:44:34,070
Now we're gonna get into section three
598
00:44:34,070 --> 00:44:36,630
and look at what happens
599
00:44:36,630 --> 00:44:41,200
when the last few assumptions don't hold.
600
00:44:41,200 --> 00:44:44,990
And specifically when assumption five,
601
00:44:44,990 --> 00:44:48,980
of no serial correlation,
602
00:44:49,860 --> 00:44:52,470
is violated.
603
00:44:52,470 --> 00:44:57,120
So again, under serial correlation,
604
00:44:57,120 --> 00:44:59,760
our assumption does not hold,
605
00:44:59,760 --> 00:45:04,760
and as we'll see much
like heteroskedasticity,
606
00:45:06,390 --> 00:45:09,770
it leads to faulty inference
607
00:45:09,770 --> 00:45:12,350
and biased standard errors and things like that.
608
00:45:12,350 --> 00:45:15,970
And also inefficient estimates.
609
00:45:15,970 --> 00:45:18,320
So we're gonna learn how to test for it
610
00:45:18,320 --> 00:45:19,900
and how to deal with it
611
00:45:21,380 --> 00:45:25,513
and we will also revisit
heteroskedasticity briefly.
612
00:45:28,570 --> 00:45:31,723
Much like we learned with
cross-sectional data,
613
00:45:33,437 --> 00:45:37,187
heteroskedasticity and
autocorrelated errors,
614
00:45:39,070 --> 00:45:44,070
give us a faulty variance
estimator of the OLS estimators,
615
00:45:46,330 --> 00:45:48,910
and therefore we need to deal with them
616
00:45:48,910 --> 00:45:52,790
and they tend to be fairly
common in time series,
617
00:45:52,790 --> 00:45:55,900
especially autocorrelated errors,
618
00:45:55,900 --> 00:45:59,570
because we are not
randomly sampling each time, so
619
00:46:02,780 --> 00:46:05,420
parts of the model that we don't estimate
620
00:46:05,420 --> 00:46:10,420
that we can't measure,
that are seemingly random,
621
00:46:10,950 --> 00:46:15,950
recur over time, and we
need to deal with it.
622
00:46:16,530 --> 00:46:19,460
So, as we've learned before,
623
00:46:19,460 --> 00:46:23,523
when we have autocorrelated errors,
624
00:46:24,520 --> 00:46:29,170
the good news is that there is no bias,
625
00:46:29,170 --> 00:46:32,367
but as we learned with heteroskedasticity,
626
00:46:34,500 --> 00:46:38,170
this model will yield a higher variance
627
00:46:38,170 --> 00:46:39,710
so it's no longer BLUE.
628
00:46:39,710 --> 00:46:41,940
It's no longer the most
efficient estimator
629
00:46:41,940 --> 00:46:45,350
and the T-test and F-tests don't work
630
00:46:45,350 --> 00:46:48,373
due to the biased estimate
631
00:46:51,980 --> 00:46:54,930
of the variance.
632
00:46:54,930 --> 00:46:58,570
So the good news is that if
633
00:46:58,570 --> 00:47:02,840
the dynamics are properly
specified, there are no problems.
634
00:47:02,840 --> 00:47:05,560
So if we have put the right lags in there
635
00:47:05,560 --> 00:47:08,670
and everything else, we can deal with it.
636
00:47:08,670 --> 00:47:10,740
And there's also techniques
637
00:47:10,740 --> 00:47:13,370
that we will learn about
now to test for it.
638
00:47:13,370 --> 00:47:17,283
And in some cases to transform the data.
639
00:47:23,600 --> 00:47:28,120
Again, the problem with serial correlation
640
00:47:30,070 --> 00:47:33,180
is that OLS is no longer BLUE
641
00:47:33,180 --> 00:47:37,170
and all of the tests that
we use no longer work.
642
00:47:37,170 --> 00:47:39,940
Very much like heteroskedasticity
643
00:47:39,940 --> 00:47:41,390
that we've dealt with before.
644
00:47:43,780 --> 00:47:45,930
So we learned last time about
645
00:47:49,451 --> 00:47:50,901
the autoregressive model of order one, AR1.
646
00:47:52,000 --> 00:47:54,830
And we can assume in some cases
647
00:47:54,830 --> 00:47:59,520
that the errors follow a similar pattern.
648
00:47:59,520 --> 00:48:03,853
That this year's error is a
function of last year's error,
649
00:48:05,080 --> 00:48:09,380
plus some new E, some new
sort of random information.
650
00:48:09,380 --> 00:48:13,883
So when P here, rho, equals zero,
651
00:48:16,120 --> 00:48:19,500
then the errors are not correlated
652
00:48:19,500 --> 00:48:21,340
and we don't have a problem.
653
00:48:21,340 --> 00:48:23,573
When it's greater than zero,
654
00:48:24,790 --> 00:48:27,950
OLS underestimates the variance,
655
00:48:27,950 --> 00:48:31,990
when P is less than zero it overestimates,
656
00:48:31,990 --> 00:48:36,990
and our T and F and
Lagrange multiplier test
657
00:48:38,270 --> 00:48:40,623
and all of those no longer work.
658
00:48:44,800 --> 00:48:48,690
So when we have strictly
exogenous regressors,
659
00:48:48,690 --> 00:48:51,130
where none of the regressors
660
00:48:51,130 --> 00:48:54,590
or the error terms over
time are correlated,
661
00:48:54,590 --> 00:48:56,800
it makes it a little bit easier.
662
00:48:56,800 --> 00:48:59,210
And let's assume that we have
663
00:49:00,400 --> 00:49:05,400
this AR1 model for serial correlation
664
00:49:05,470 --> 00:49:07,370
that we saw last slide,
665
00:49:07,370 --> 00:49:11,870
where this year's error is a
function of last year's error.
666
00:49:11,870 --> 00:49:15,280
So you could simply run a regression
667
00:49:15,280 --> 00:49:18,023
where you put this year's error,
668
00:49:19,180 --> 00:49:22,860
basically error T on the left side
669
00:49:22,860 --> 00:49:27,860
and the year before that, T
minus one on the right side.
670
00:49:28,650 --> 00:49:33,373
Run a regression, our null
hypothesis is that our P,
671
00:49:34,370 --> 00:49:38,447
the coefficient on UT
minus one, equals zero.
672
00:49:42,840 --> 00:49:46,440
And in this case,
673
00:49:46,440 --> 00:49:51,170
we hope that we get a small test stat,
674
00:49:51,170 --> 00:49:54,450
and we can fail to reject our null
675
00:49:54,450 --> 00:49:59,450
and assume that we do not
have serial correlation.
676
00:49:59,610 --> 00:50:02,250
But if we do, if we reject our null,
677
00:50:02,250 --> 00:50:03,700
then we have to deal with it.
678
00:50:06,520 --> 00:50:08,440
So here's how you would do it.
679
00:50:08,440 --> 00:50:12,730
Again, first regress YT
680
00:50:12,730 --> 00:50:16,410
on each of your regressors,
save the residuals,
681
00:50:16,410 --> 00:50:21,180
then regress each UT on the previous year.
682
00:50:21,180 --> 00:50:23,740
So you would have to
sort of go into Excel
683
00:50:23,740 --> 00:50:25,070
and cut and paste,
684
00:50:25,070 --> 00:50:29,500
and sort of just knock it down one row,
685
00:50:29,500 --> 00:50:34,500
and run a T-test on this coefficient.
686
00:50:34,524 --> 00:50:39,524
And hopefully you can
687
00:50:39,680 --> 00:50:42,720
fail to reject the null.
688
00:50:42,720 --> 00:50:44,850
And note that this only tests
689
00:50:44,850 --> 00:50:49,610
if the adjacent errors are correlated,
690
00:50:49,610 --> 00:50:51,910
since it's an AR1,
691
00:50:51,910 --> 00:50:55,780
it's only looking at the
previous year's error.
692
00:50:55,780 --> 00:51:00,510
So if UT and UT minus two
693
00:51:00,510 --> 00:51:05,510
are correlated, this test
wouldn't pick that up.
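The two-step test just described can be sketched as follows (hypothetical simulated data, assuming numpy): regress Y on X, save the residuals, then regress each residual on the previous one and t-test the slope:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 500
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.6 * u[t - 1] + rng.normal()   # errors follow an AR(1), rho = 0.6
y = 1.0 + 2.0 * x + u

# Step 1: regress y on x and save the residuals.
X = np.column_stack([np.ones(T), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
res = y - X @ b

# Step 2: regress u_t on u_{t-1}; t-test of H0: rho = 0.
u_now, u_lag = res[1:], res[:-1]
rho_hat = (u_lag @ u_now) / (u_lag @ u_lag)
se = np.sqrt(np.mean((u_now - rho_hat * u_lag) ** 2) / (u_lag @ u_lag))
print(rho_hat, rho_hat / se)   # rho-hat near 0.6, large t-stat: reject H0
```

Here the t-statistic comes out large, so we would reject the null of no serial correlation, which is the right answer for data built with correlated errors.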
694
00:51:09,630 --> 00:51:13,250
A very common test is a
so-called Durbin-Watson test,
695
00:51:13,250 --> 00:51:17,700
where we also use UT or the UT hat.
696
00:51:17,700 --> 00:51:19,970
So we save the residuals,
697
00:51:19,970 --> 00:51:24,530
our null hypothesis again,
is that P equals zero.
698
00:51:24,530 --> 00:51:29,530
And this stat sums up, in the numerator,
699
00:51:30,220 --> 00:51:34,230
UT minus UT minus one, squared,
700
00:51:34,230 --> 00:51:35,900
and in the denominator,
701
00:51:35,900 --> 00:51:40,830
just the sum of squared
residuals as we've come to know.
702
00:51:40,830 --> 00:51:43,220
And then we look at the stats,
703
00:51:43,220 --> 00:51:47,700
so this stat will be approximately equal
704
00:51:47,700 --> 00:51:52,700
to two times the quantity one minus P.
705
00:51:53,490 --> 00:51:58,283
So if P equals zero,
the stat is about two,
706
00:52:00,370 --> 00:52:04,730
then we can fail to reject the null
707
00:52:04,730 --> 00:52:08,560
and that's a good thing
that we don't have it.
708
00:52:08,560 --> 00:52:12,050
But if the number is far less than two,
709
00:52:12,050 --> 00:52:16,340
so P is
a number almost one.
710
00:52:16,340 --> 00:52:19,070
So one minus P is a very small number,
711
00:52:19,070 --> 00:52:22,610
multiplied by two is
still a very small number
712
00:52:22,610 --> 00:52:27,180
and implies that P is
significantly different than zero
713
00:52:27,180 --> 00:52:30,153
and we have serial correlation.
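A sketch of the Durbin-Watson statistic on simulated AR(1) residuals (hypothetical rho of 0.5, assuming numpy), checking the DW-approximately-2(1 - P) relationship:

```python
import numpy as np

rng = np.random.default_rng(5)
T = 2_000
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.normal()   # residuals with rho = 0.5

dw = np.sum(np.diff(u) ** 2) / np.sum(u ** 2)   # Durbin-Watson statistic
rho_hat = (u[:-1] @ u[1:]) / (u[:-1] @ u[:-1])
print(dw, 2 * (1 - rho_hat))   # close to each other, and well below 2
```

Because rho is positive here, the statistic lands around one rather than two, which is exactly the "far less than two" signal of positive serial correlation.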
714
00:52:32,730 --> 00:52:36,060
When we don't have strict exogeneity,
715
00:52:37,177 --> 00:52:39,700
and this will happen most often
716
00:52:39,700 --> 00:52:42,510
when we have lagged regressors,
717
00:52:42,510 --> 00:52:46,030
when we not only put XT into the model,
718
00:52:46,030 --> 00:52:49,333
this year's regressor, but
also put last years in.
719
00:52:50,560 --> 00:52:52,840
This is more common,
720
00:52:52,840 --> 00:52:57,840
and this could be valid for
any number of regressors.
721
00:52:58,070 --> 00:53:01,393
So this is the more sort of general case.
722
00:53:06,350 --> 00:53:09,190
The test is much the same,
723
00:53:09,190 --> 00:53:14,190
except now in the second step,
we include our regressors.
724
00:53:15,690 --> 00:53:20,690
So we start with regressing
YT just on our regressors
725
00:53:20,980 --> 00:53:23,050
and save the residuals.
726
00:53:23,050 --> 00:53:26,560
And then we regress the residuals
727
00:53:26,560 --> 00:53:31,560
on the original regressors
and on UT minus one,
728
00:53:31,640 --> 00:53:36,640
and run the T-test on P, the
coefficient of UT minus one.
729
00:53:38,600 --> 00:53:41,710
And by including the regressors,
730
00:53:41,710 --> 00:53:46,710
we are controlling for any correlations
731
00:53:47,590 --> 00:53:52,550
between our Xs and our error terms.
732
00:53:52,550 --> 00:53:57,490
So we don't need to make this assumption,
733
00:53:57,490 --> 00:54:02,407
but we might have to
calculate, use robust errors
734
00:54:03,410 --> 00:54:07,693
if we suspect that
heteroskedasticity is present.
735
00:54:14,410 --> 00:54:17,580
We can also test if there's
736
00:54:17,580 --> 00:54:21,520
an autoregressive process of order two
737
00:54:21,520 --> 00:54:26,520
where we regress UT on our regressors
738
00:54:26,700 --> 00:54:29,320
plus each of the past two years
739
00:54:29,320 --> 00:54:32,500
and do a joint F-test on P1 and P2,
740
00:54:32,500 --> 00:54:35,710
because here our null is that P1 and P2
741
00:54:35,710 --> 00:54:38,373
are jointly equal to zero.
742
00:54:43,130 --> 00:54:47,700
And you could do it for
any number of lags
743
00:54:47,700 --> 00:54:50,440
and some general number Q,
744
00:54:50,440 --> 00:54:54,910
where you regress YT
on the Xs, save the residuals UT,
745
00:54:54,910 --> 00:54:59,910
and then regress them
on Q lagged residuals,
746
00:55:00,760 --> 00:55:02,610
and use a joint F-test
747
00:55:02,610 --> 00:55:05,983
that all of these rhos, P1, P2, and so on,
748
00:55:07,330 --> 00:55:11,410
up to PQ are equal to zero.
749
00:55:11,410 --> 00:55:15,560
And again, we want to
fail to reject that null,
750
00:55:15,560 --> 00:55:18,793
hoping that we don't
have serial correlation.
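The AR(2) version of the test can be sketched like this (hypothetical white-noise residuals, assuming numpy, and skipping the first-stage regression for brevity); with no real serial correlation the joint F-stat should be small and we fail to reject:

```python
import numpy as np

rng = np.random.default_rng(7)
T = 1_000
u = rng.normal(size=T)               # white-noise residuals: no serial correlation

# Unrestricted: regress u_t on a constant, u_{t-1} and u_{t-2}.
Y = u[2:]
X = np.column_stack([np.ones(T - 2), u[1:-1], u[:-2]])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
ssr_u = np.sum((Y - X @ b) ** 2)
ssr_r = np.sum((Y - Y.mean()) ** 2)  # restricted: both lag coefficients set to 0

# Joint F-test of H0: p1 = p2 = 0 (two restrictions).
F = ((ssr_r - ssr_u) / 2) / (ssr_u / (len(Y) - 3))
print(F)                             # small F: fail to reject, no AR(2) correlation
```

The same pattern extends to any number of lags Q: regress the residuals on Q lagged residuals and F-test all Q coefficients jointly.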
751
00:55:25,600 --> 00:55:27,930
If these tests determine
752
00:55:27,930 --> 00:55:31,440
that you do have serial correlation,
753
00:55:31,440 --> 00:55:36,440
if the coefficient on last year's error
754
00:55:37,180 --> 00:55:39,980
is significant for this year's error,
755
00:55:39,980 --> 00:55:44,740
then we have to do some
sort of transformation.
756
00:55:44,740 --> 00:55:46,920
One of the things that we learned before
757
00:55:46,920 --> 00:55:51,670
was to difference and subtract
758
00:55:53,280 --> 00:55:56,070
this year minus last year and last year,
759
00:55:56,070 --> 00:55:58,893
minus the year before that and so on.
760
00:55:59,930 --> 00:56:02,060
But you lose an observation
761
00:56:02,060 --> 00:56:06,660
and this should remind you
a little bit of panel data,
762
00:56:06,660 --> 00:56:11,170
where differencing cost us an observation
763
00:56:11,170 --> 00:56:13,680
and therefore cost us efficiency.
764
00:56:13,680 --> 00:56:16,750
So there is a transformation,
765
00:56:16,750 --> 00:56:21,750
very similar to random effects models,
766
00:56:23,300 --> 00:56:27,330
where you can do a transformation.
767
00:56:27,330 --> 00:56:32,330
You need to know the value of
this P or rho, but if you do,
768
00:56:33,910 --> 00:56:38,280
then you can do a
transformation like this,
769
00:56:38,280 --> 00:56:39,483
and I'm gonna show you.
770
00:56:40,510 --> 00:56:43,190
And I just used a
771
00:56:45,580 --> 00:56:47,510
slide from the textbook slides
772
00:56:47,510 --> 00:56:50,140
'cause they make them look much more neat.
773
00:56:50,140 --> 00:56:55,140
So here, if you have
strictly exogenous regressors
774
00:56:55,660 --> 00:56:58,600
that you can do this
kind of transformation
775
00:56:58,600 --> 00:57:03,090
where it's sort of like differencing,
776
00:57:03,090 --> 00:57:08,090
but you only subtract
rho times the past year.
777
00:57:09,430 --> 00:57:13,350
So if you look at about
the third equation down
778
00:57:13,350 --> 00:57:15,590
with the red boxes all around it,
779
00:57:15,590 --> 00:57:20,590
instead of subtracting
out the full value
780
00:57:22,240 --> 00:57:26,730
of last year's Y,
you only subtract out rho
781
00:57:26,730 --> 00:57:31,730
times it, rho being some number
between zero and one,
782
00:57:32,640 --> 00:57:34,713
and only subtract out that much.
783
00:57:35,620 --> 00:57:37,721
The problem is that you need
784
00:57:37,721 --> 00:57:42,283
a good estimator for this value of rho.
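The quasi-differencing transformation can be sketched like this (hypothetical data where rho is known for the sake of illustration, assuming numpy; in practice rho would have to be estimated first):

```python
import numpy as np

rng = np.random.default_rng(6)
T, rho = 5_000, 0.7
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + rng.normal()   # AR(1) errors
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + u

# Quasi-difference: subtract rho times last period, not the full value.
y_star = y[1:] - rho * y[:-1]
x_star = x[1:] - rho * x[:-1]
X_star = np.column_stack([np.full(T - 1, 1 - rho), x_star])
b = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

res = y_star - X_star @ b
print(b[1])                                  # slope still near 2
print(np.corrcoef(res[:-1], res[1:])[0, 1])  # residual autocorrelation near 0
```

Unlike full differencing, subtracting only rho times the past value leaves the transformed errors serially uncorrelated while keeping the slope coefficient interpretable.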
785
00:57:49,640 --> 00:57:52,930
And you can do the same thing
786
00:57:52,930 --> 00:57:57,930
if you have an AR2 or an ARQ process,
787
00:57:59,740 --> 00:58:03,640
and most of these are available
788
00:58:03,640 --> 00:58:07,960
in more advanced stats packages.
789
00:58:07,960 --> 00:58:12,550
I don't believe SPSS has
an easy way to do this,
790
00:58:12,550 --> 00:58:14,040
but, you know, again,
791
00:58:14,040 --> 00:58:17,470
if you're working with time series data,
792
00:58:17,470 --> 00:58:18,550
you're probably gonna want
793
00:58:18,550 --> 00:58:23,550
to get and work with a more
sort of econometrics-specific
794
00:58:25,180 --> 00:58:28,833
statistical software package like SAS.
795
00:58:35,080 --> 00:58:40,080
So again, it becomes a trade-off here
796
00:58:42,910 --> 00:58:47,830
that if you have a static model like this,
797
00:58:47,830 --> 00:58:52,830
and you think that the UT
follows this AR1 process,
798
00:58:54,430 --> 00:58:58,010
that then you can just subtract it out.
799
00:58:58,010 --> 00:59:03,010
And then it basically
fixes a lot of our problems,
800
00:59:04,990 --> 00:59:09,840
because if UT is a random walk,
801
00:59:09,840 --> 00:59:14,220
then subtracting it out
and taking the difference
802
00:59:14,220 --> 00:59:19,220
makes it a zero-mean,
constant-variance error term.
803
00:59:22,120 --> 00:59:26,720
And we can use OLS and we
can use our F and T-tests
804
00:59:26,720 --> 00:59:27,810
and everything's good.
805
00:59:27,810 --> 00:59:29,933
But again, we lose an observation.
806
00:59:35,610 --> 00:59:40,610
There is also specific transformations.
807
00:59:42,200 --> 00:59:46,440
The so-called Newey-West corrections,
808
00:59:46,440 --> 00:59:51,160
most software that works specifically
809
00:59:51,160 --> 00:59:54,090
with time series will have this.
810
00:59:54,090 --> 00:59:58,600
This is
another FGLS transformation,
811
01:00:00,750 --> 01:00:04,197
and it does require strict exogeneity
812
01:00:05,092 --> 01:00:07,103
and ARQ correlation,
813
01:00:08,510 --> 01:00:13,510
but it can give you
814
01:00:13,610 --> 01:00:15,750
the correct standard errors,
815
01:00:15,750 --> 01:00:19,150
which will allow you to do better T-tests.
816
01:00:19,150 --> 01:00:22,740
It does need a large sample.
817
01:00:22,740 --> 01:00:27,740
And as before, if you really
have severe serial correlation,
818
01:00:30,130 --> 01:00:34,740
it might be the best idea to simply
819
01:00:37,030 --> 01:00:41,073
take the difference, and
use that model instead.
820
01:00:43,130 --> 01:00:47,040
There are also tests
for heteroskedasticity
821
01:00:47,040 --> 01:00:51,940
that you can use the ones
that we are familiar with,
822
01:00:51,940 --> 01:00:56,940
but we have to first rule out
823
01:00:57,780 --> 01:01:02,780
serial correlation, because
these tests will sort
824
01:01:03,120 --> 01:01:04,950
of give you a false positive:
825
01:01:04,950 --> 01:01:07,640
that if there is serial correlation,
826
01:01:07,640 --> 01:01:11,080
it might make you think that
there is heteroskedasticity.
827
01:01:11,080 --> 01:01:15,710
So you want to do the serial
correlation tests first
828
01:01:15,710 --> 01:01:17,600
and only if that's ruled out,
829
01:01:17,600 --> 01:01:22,500
then do the Breusch-Pagan or
White tests that we've learned
830
01:01:22,500 --> 01:01:24,470
and then we can use things
831
01:01:24,470 --> 01:01:26,983
like weighted least squares as before.