- [Instructor] This is the first part of a three-part series on time series econometrics. I hope you find this section useful.

This section on time series has three main components, and this is the first one. I'm going to give a short introduction, talk about two common models, the static model and the finite distributed lag model, and then talk about very well-behaved time series data, which will remind you a lot of the very well-behaved cross-sectional data we worked with many weeks ago, and what we can assume OLS does under those conditions as far as being unbiased and efficient. In the last part, we'll talk about trends and seasonal effects and how to account for them.

So far, most of our time has been spent with cross-sectional data, where we have a single time frame, T equals one, but many observations, n equals many. For example, running a survey over a specific time period, but administering it to and gathering data from lots of different people. We also spoke a bit about panel data, where you usually have a big N, many respondents, and more than one time period, but usually a fairly small number of them, say two, three, maybe at most four.

Now we're looking at a time series, where there's one N but many Ts. We're following one set of variables over time, such as GDP or sales of some good. It could be something sociological, like the voting rate or educational attainment in a given place, or a biophysical example, like phosphorus in a lake or land use. In any case, we're looking at the same respondent, in a sense, over many years. And since there's a logical ordering here, we have to keep in mind that we're not drawing a random sample of responses. We're looking at the same unit each time, and there's a clear direction of causality that we can in many cases assume and must account for: the past influences the present, and the present will influence the future.
Again, we'll be looking at a time series process that has both deterministic and stochastic components: part of our dependent variable, the predicted value, is explained by the regressors, and there is also a stochastic element, an error term capturing whatever was unexpected or could not be measured in each period.

Looking at all of the topics in this time series section: today we're going to look at very well-behaved data and ask what the assumptions are, which are in some cases unrealistic and simplistic, and build from there. Then we'll look at slightly less well-behaved data and transformations we can make to make it better behaved. And last, how do we test whether these assumptions hold, and what and how do we adjust if we find that the data are not as well behaved as we would like?

There are two basic kinds of models in time series, static and finite distributed lag, and I will talk about each one.

This is a static model. It's static because, as you can see in the equation that starts with YT, everything is measured in the same period. So T would be, say, the years of the study. We might have started collecting data in 2001, in which case T equals 1, and continued on through the year 2020, when T equals 20. Here all of our variables are contemporaneous, happening at the same time: we assume that only the regressor ZT has any effect on that year's YT, and that nothing that happened in the past has any effect. That is why it's static, everything happens in a single period, and this is very much like cross-sectional data. And just as there, with a single regressor we can say that the change in Y will be beta one times the change in Z. Assuming that Z is exogenous, so the error term has no relationship with it, we can do this simple calculation, and the beta is then the change in Y as a result of the change in Z, the slope of that line.
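To make the static model concrete, here's a minimal sketch in Python with statsmodels. The file name and the column names (y and a single regressor z) are hypothetical placeholders; the point is just that a static model regresses this year's Y on this year's Z only.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical annual data, one row per year t = 2001, ..., 2020
df = pd.read_csv("annual_data.csv")   # assumed columns: year, y, z

# Static model: y_t = beta0 + beta1 * z_t + u_t
X = sm.add_constant(df[["z"]])        # contemporaneous regressor only
static_model = sm.OLS(df["y"], X).fit()
print(static_model.summary())

# beta1 is interpreted as the change in y per unit change in z in the same period
```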
We could, of course, add more regressors: Z2, Z3, as many as we think are correct. This model is very much like cross section, except that now we're looking across time instead of across space. We're looking at different realizations of a variable over time, instead of that variable across different respondents.

In contrast, here is a finite distributed lag model. "Lag" means that regressors from past years are thought to have an effect on this year's Y. So you have YT and ZT, but also ZT minus one, last year's Z, and ZT minus two, the Z from two years ago. Since there are two lagged regressors, this is known as a finite distributed lag model of order two, and we could include more lags if we think that is the correct model.
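Here's a hedged sketch of estimating a finite distributed lag model of order two; the column names are again hypothetical, and the key step is building lagged copies of Z before running OLS.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("annual_data.csv")          # assumed columns: year, y, z
df = df.sort_values("year")

# FDL(2): y_t = alpha0 + delta0*z_t + delta1*z_{t-1} + delta2*z_{t-2} + u_t
df["z_lag1"] = df["z"].shift(1)              # last year's z
df["z_lag2"] = df["z"].shift(2)              # z from two years ago
fdl = df.dropna()                            # the first two years have no lags

X = sm.add_constant(fdl[["z", "z_lag1", "z_lag2"]])
fdl_model = sm.OLS(fdl["y"], X).fit()
print(fdl_model.params)
```

In the usual interpretation, the coefficient on ZT is the immediate effect of a change in Z, and the sum of the three coefficients is the long-run effect of a permanent one-unit change in Z.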
Here are the assumptions for well-behaved OLS time series data, and these should look pretty familiar. The first is that the model is linear in parameters: we can model the data-generating process as a linear function. Next, as before, each regressor has to contribute some new information; no Z or X can be a constant or a linear function of the other regressors. Third is the exogeneity assumption, that the error term is uncorrelated with the regressors. If a weaker version holds, that this year's error term is uncorrelated with this year's regressors, this is called contemporaneous exogeneity. There is also a stronger assumption: that this year's error term is uncorrelated with the regressors from any other time period, and that this year's regressors are uncorrelated with the error term from any other time period. When that holds, we have strict exogeneity.

A bit more on the third assumption. In cross-sectional data we don't really have to worry about it: your error term, some strange thing that may have happened to you, or some relatively small thing we're not able to measure, is not affected by my regressors. Whether you found a dollar or got a flat tire is not affected by my income, my age, my behaviors, and so forth, since it's a random sample. But we don't randomly sample in time series, because we want a series of data, and the observations are intimately tied together: what happened last year might affect what happens this year, because we're looking at the same phenomenon over time.

Assumption three fails in time series for the same reasons as in cross-sectional data, measurement error or omitted variables, but also when there is a lagged effect, so that this year's error term is related to last year's regressor. In that case we want to put last year's regressors, or the last two years' regressors, into the model, so we can specifically account and control for them and take them out of the error term.

When these three assumptions hold, OLS is unbiased. And when there is an omitted variable, the same kind of omitted variable bias analysis that we did way back at the beginning of class still applies; the same basic themes and conclusions hold true.

The fourth assumption for well-behaved time series data is homoskedasticity: the variance of the error term, given any value of X, our regressors, is a constant. Before, it was that your error term and my error term had the same constant variance. Now it's that this year's error term, last year's error term, and so on going back in time all have the same constant variance. That's what homoskedasticity means in this context.
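In symbols, the homoskedasticity condition just described is usually written as follows, with X standing for the regressors across all time periods:

$$\operatorname{Var}(u_t \mid X) = \sigma^2, \qquad t = 1, 2, \ldots, n.$$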
We can run the same basic kinds of tests we used with cross-sectional data to test for homoskedasticity, such as the White test and the Breusch-Pagan test.

The fifth assumption is no serial correlation: this year's error term and last year's error term are uncorrelated, and in fact this year's error term is uncorrelated with the error term from any past year. The violation is when something that was abnormally high this year was also abnormally high last year, when the small thing we failed to measure or didn't account for this year is the same small thing we didn't account for last year. We will learn a test for this. It's a lot like panel data, where there is an a_i that goes along with every individual respondent, something very hard to measure that carries through over time. Here again, the parts of the model that are just very hard to measure will tend to carry through, because we're looking at the same thing, so it's likely that the errors will be correlated. We will learn a test for whether this is true later on in this series. Note that this does not rule out the Ys being correlated; they almost always are. Last year's GDP is highly correlated with the year before that, and the year before that, et cetera.

When these five assumptions hold, the variance of beta J hat has the same formula as before. It is a function of the variance of the overall error term, sigma squared, as well as how spread out XJ is and how collinear XJ is with the other regressors in our model. Under those five assumptions, the three unbiasedness assumptions plus the spherical-errors assumptions for time series, OLS is BLUE: the best linear unbiased estimator, the most efficient unbiased estimator.
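Writing out the variance formula being described here (this is the standard expression, where SST_j is the total sample variation in regressor x_j and R_j^2 is the R squared from regressing x_j on the other regressors):

$$\operatorname{Var}(\hat\beta_j) = \frac{\sigma^2}{SST_j\,(1 - R_j^2)}, \qquad SST_j = \sum_{t=1}^{n} (x_{tj} - \bar{x}_j)^2.$$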
Remember that when we started to get into hypothesis tests, we added a new assumption: the error term is normally distributed with mean zero and variance sigma squared. Here we add another, slightly stronger assumption: the Us are not only independent of X, as before, but are also independently and identically distributed, IID. If this sixth assumption holds, then the OLS estimators have predictable statistical properties, and we can use the usual inference tools, t-tests, F-tests, and confidence intervals, and they have the same interpretation as we learned for cross-sectional data.

A bit more on IID, which stands for independent and identically distributed. One way of thinking about this is a bingo-ball wheel that you draw the errors from. The wheel does not change; it's the exact same distribution, the exact same bell-curve probability of drawing any error term in every year. That's what identically distributed means. Independent means it has no memory, that the ball is replaced each year, so the error term you drew this time has no effect on the error term you draw next time.

We very often want to use dummy variables in time series to account for whether an important event, policy, or regime was in place during a given period. A good example would be a big recall of romaine lettuce, which we've seen a few times. The demand for romaine would depend not only on supply and prices and people's preferences and incomes and so on; you would also want a dummy variable that accounts for the huge scare and recall that happened during that time. Otherwise you're not really going to get a good account of the effects of income, price, supply, and preferences.
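Here's a minimal sketch of what such an event dummy looks like in practice; the recall dates, file name, and column names are made up for illustration. The idea is just an indicator that equals one in periods when the event was in effect.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("romaine_demand.csv")       # assumed: month, quantity, price, income

# Event dummy: 1 during the (hypothetical) recall months, 0 otherwise
recall_months = ["2018-04", "2018-11"]       # illustrative dates only
df["recall"] = df["month"].isin(recall_months).astype(int)

X = sm.add_constant(df[["price", "income", "recall"]])
model = sm.OLS(df["quantity"], X).fit()
print(model.params["recall"])                # intercept shift during recall periods
```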
I was thinking as well that when we model GDP and other economic time series right now, we will probably need a dummy variable for the coronavirus, to capture the intercept shift of what happened just as a result of that, holding all else equal.

It's also very common in time series to acknowledge that the data have natural trends and seasons. Trends take into account that, especially in economics, there is a time trend: GDP increases over time, worker productivity increases over time, somewhat independent of many other factors, so there is a natural or recurring time trend built in. It's important that we account for that, because if productivity and GDP are both trending upward, failing to account for the trend in both of them would make the relationship between them seem stronger than it actually is, and we would get a biased estimate. It would be, in a sense, an omitted variable bias from failing to account for that trend.

Here is a very simple example of a Y that has a time trend. T would just index the year in which the data were collected. So again, if we have data from 2001 to 2020, T equals 1 in 2001, and so on, up to T equals 20 in 2020. In this model, holding all else equal, Y will grow or shrink at the rate A1 each year. If A1 is greater than zero, the predicted value of Y will increase by A1 each year.

We very often model things with a number of regressors and a time trend. Here there are two regressors, X1 and X2, as well as the time trend T. Again, this accounts for the trend that may be in Y or in X1 or X2, and leaving out this trend, not controlling for it, can cause omitted variable bias.
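Here's a hedged sketch of including a linear time trend alongside other regressors; the y, x1, x2 names and the file are placeholders, and t is just 1, 2, ..., n in the order of the years.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("annual_data.csv").sort_values("year")   # assumed: year, y, x1, x2

# Linear time trend: t = 1 for the first year, 2 for the second, ...
df["t"] = range(1, len(df) + 1)

# y_t = beta0 + beta1*x1_t + beta2*x2_t + a1*t + u_t
X = sm.add_constant(df[["x1", "x2", "t"]])
trend_model = sm.OLS(df["y"], X).fit()
print(trend_model.params)   # the coefficient on t is the estimated annual trend a1
```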
Note that you should include a time trend even if the Xs trend and the Y doesn't: you want to take out the effect of time on those Xs and ask, holding time equal, how does X affect Y?

The R squared of a time series model tends to be quite high. These kinds of aggregated data tend to produce a higher R squared than we get from something like a survey, where we ask how regressors across individual people affect the dependent variable. One way to get a better sense of how much of the variation in Y is explained by the Xs, an R squared independent of time, is to detrend the data. You do that by regressing YT only on T and saving the residuals, which I write as Y with two dots over it, a sort of Y umlaut. Then the detrended R squared uses the SSR from regressing YT on X1, X2, and T, compared against the total variation in the detrended Y. That gives you a better idea, once the time trend is taken away, of how well the Xs explain Y.

There are also many economic phenomena with clear seasonality: most Christmas trees are sold in November and December, hay is sold in the summer when it's fresh, and nursery flower transplants are sold much more often in the spring. So you can include dummy variables for specific months, quarters, or seasons where you think sales will be particularly high or low, and use a t- or F-test to determine whether they have a significant effect in the model. Much like we did before, we can also deseasonalize the Ys: regress the YTs only on the seasonal dummy variables, save the residuals, and then regress those deseasonalized Ys on the Xs, using that R squared as another measure of the goodness of fit of the model.
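Here's a sketch of the detrending recipe just described: regress Y on the trend alone, keep the residuals, then compare the full model's SSR to the variation in those residuals. Names are placeholders; the same recipe works for deseasonalizing if you swap the trend for seasonal dummies.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("annual_data.csv").sort_values("year")   # assumed: year, y, x1, x2
df["t"] = range(1, len(df) + 1)

# Step 1: detrend y by regressing it on the trend alone and keeping the residuals
y_detrended = sm.OLS(df["y"], sm.add_constant(df[["t"]])).fit().resid

# Step 2: full regression of y on x1, x2, and the trend
full = sm.OLS(df["y"], sm.add_constant(df[["x1", "x2", "t"]])).fit()

# Detrended R^2: 1 - SSR(full model) / total variation in the detrended y
detrended_r2 = 1 - full.ssr / np.sum(y_detrended ** 2)
print(detrended_r2)
```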
Basically, again, many of these assumptions are unrealistic and too restrictive; they don't really match the real world. So we're going to look at those models, some common ones, and some principles we have to abide by for the assumptions to hold, and last, a transformation that you're already familiar with that can deal with the problem.

There are two pieces of vocabulary we're going to deal with: dependence and stationarity. Both of them cause issues with inference; they can mess up the hypothesis tests, the t and F tests, that we would like to run on our data.

Dependence is about how much the past affects the present and the future. We say that a time series process is weakly dependent if, as the distance between time periods gets larger, the relationship between those variables gets small. Basically, we want weak dependence, so that the relationship between this year's realization of X and future years' realizations becomes very small, very fast. You can think of independence as being like drawing a random sample, where the coin has no memory at all, where you and I as respondents last year have nothing to do with each other. Strong dependence, where observations are intimately related and last year and past years affect this year in a very strong way, like a coin that remembers every flip, really messes up inference and can make it impossible. With weak dependence, this memory fades over time.

Stationarity means that the variables have the same probability distribution over time. Much as we assumed a normal IID distribution before, we need to know that the probability distribution of our error terms and of our dependent variables stays the same over time in order to make any kind of inference and use the t-tests, F-tests, and other tests we have learned about.

Basically, weak dependence is needed for the law of large numbers and the central limit theorem to hold.
Because we do not sample randomly, if the realizations are too closely related we can't do inference, and strong dependence totally messes it up. So we're going to look at models with strong and weak dependence that we may encounter in time series data.

What we find are autoregressive models, where last year's Y, or past years' Ys, appear as regressors, with this year's Y on the left side. We'll look at two related examples, the so-called unit root process and the random walk, over the next few slides.

Here's an example of an autoregressive process of order one, an AR(1). It's autoregressive because last year's Y, YT minus one, appears as a regressor, and it's of order one because only one lagged dependent variable is a regressor in the model.

The setup is that Y begins at some starting point at time zero, the error term is again IID with mean zero and variance sigma squared, and the ETs are independent of Y naught. Just for simplicity, we often assume that the expected value of Y naught equals zero, or we could subtract that value out each time to make the math easier. We're going to focus on the value of the coefficient, which I write as P; in the Wooldridge book it's rho, but I use P because I have P on my keyboard and not rho.

When rho is less than one, the expected value of YT plus H, given our starting point YT, is rho to the H times YT, for any horizon H of one or more years. The influence of the starting point YT gets smaller and smaller over time and approaches zero as H gets bigger and bigger. This is what we want: weak dependence. When rho equals one, the best prediction of YT plus H is always YT.
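To see the difference between weak dependence and a unit root, here's a small simulation sketch. It generates one AR(1) path with rho = 0.5 and one with rho = 1 (a random walk) from the same shocks; the sample size and rho values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
e = rng.normal(0, 1, n)          # IID shocks, mean 0, variance 1

y_weak = np.zeros(n)             # AR(1) with rho = 0.5: weakly dependent
y_unit = np.zeros(n)             # AR(1) with rho = 1: unit root (random walk)
for t in range(1, n):
    y_weak[t] = 0.5 * y_weak[t - 1] + e[t]
    y_unit[t] = 1.0 * y_unit[t - 1] + e[t]

# The weakly dependent series keeps returning toward zero; the unit root wanders.
print(np.corrcoef(y_weak[:-10], y_weak[10:])[0, 1])   # small at a 10-period gap
print(np.corrcoef(y_unit[:-10], y_unit[10:])[0, 1])   # stays large
```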
We call this a unit root process when rho equals one. It leads to strong dependence, and we need to deal with it in order to do estimation and inference. So again, a unit root process is when this P, or rho, equals one.

One very common unit root process is the so-called random walk, where this year's Y equals last year's Y plus an error term. We assume that the error term ET has a mean of zero and constant variance, so Y starts at the previous year's value and adds an error term, and every year just adds a new error term to last year's value; hence the random walk. The random part is the error: it might be high, it might be low, but the series just walks around. If we assume, again just for mathematical convenience, that Y naught equals zero, then the expected value of Y in any year equals that starting point. You can imagine the line meandering around over time, going up and down, but the expected value in any year is the starting point.

The value of Y in year T plus H is just the sum of the error terms over all those years plus YT, the starting point. So if we assume a starting point of zero, it will just be a line that meanders around the zero line, the x-axis, and for any value of YT, the expected value of YT plus H equals YT.

You can see from the first equation that, since each error term has an expected value of zero, the expected value of their sum is zero no matter how large H is, and so the expected value of YT plus H will always equal our starting point, YT.
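Putting the random walk just described into symbols (this matches the standard presentation, with e denoting the IID errors):

$$y_t = y_{t-1} + e_t, \qquad y_{t+h} = e_{t+h} + e_{t+h-1} + \cdots + e_{t+1} + y_t,$$
$$E(y_{t+h} \mid y_t) = y_t \quad \text{for all } h \ge 1.$$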
Another example is the so-called random walk with drift. This is again a unit root process, because rho, or P, equals one, but now there's a drift term, A naught: this year's Y is last year's Y plus an error term plus this A naught. So the process has both persistence and trend. It's persistent because it builds on last year's value, which is carried over in expectation, and it trends because every year it adds the factor A naught.

That lets us talk a bit more about persistence and trend. We talked a good deal about trend last time. Persistence is when something that happens is felt over time. If we don't deal with these features, they'll mess up the analysis. We learned last time how trend can create an important omitted variable bias, and persistence is when some shock continues to be felt into the future, or when this year's realization is very closely related to past years', so that last year's and previous years' values are, in a sense, remembered over time. It's interesting to note that interest rates are persistent, this quarter's interest rate will be very closely related to last quarter's, but they don't really trend; they don't move steadily upward or downward over time. GDP, on the other hand, is likely both persistent and trending. In most years here in the US, normal growth is only about 2% annually, so last year's GDP plays a big role in this year's, but there is also an overall trend over time. The huge contraction of our economy that we're experiencing with the virus throws that off, but that is hopefully a once-in-a-lifetime event.

So the random walk with drift has both persistence and trend: A naught is the trend part, and the fact that it's a unit root process makes it highly persistent.
For any year T, YT equals A naught accumulated over T periods, that is, A naught times T, plus all of the error terms, plus Y naught, our starting point. If we again assume that the starting point is zero, the expected value of YT is just A naught times T. If A naught is greater than zero, the expected value grows over time, and if it's less than zero, it shrinks over time.

So again, this is a highly persistent process. It's a unit root, it's trending and persistent, and we need to transform the data so that it becomes weakly dependent and we can work with it.

Not surprisingly, the most common transformation is differencing, much like we did with panel data: we subtract last year's value from this year's, the year before's from last year's, and so on. This gets rid of both the persistence, because we're only measuring the change over time, and the trend, because the A naught trend is subtracted out each time. When we difference a random walk, the change in Y becomes just the error term. And if we have other regressors, much like in panel data, we can regress the change in Y on the change in the Xs, just as we did in the first-differencing model for panel data.

By way of overview: in the first section we looked at well-behaved data with restrictive assumptions and their properties. Here we've looked at data that are less well behaved, under more realistic assumptions, and what we can do to transform them. Next we begin to look at what happens if we have either serial correlation or heteroskedasticity.
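Before moving on, here's a minimal sketch of the differencing transformation described above; the file and column names are placeholders.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("annual_data.csv").sort_values("year")   # assumed: year, y, x1, x2

# First differences: this year's value minus last year's
dy = df["y"].diff()
dX = df[["x1", "x2"]].diff()

# Regress the change in y on the changes in the x's (drop the first, undefined row)
data = pd.concat([dy, dX], axis=1).dropna()
diff_model = sm.OLS(data["y"], sm.add_constant(data[["x1", "x2"]])).fit()
print(diff_model.summary())
```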
Now we're going to get into section three and look at what happens when the last few assumptions don't hold, and specifically when assumption five, no serial correlation, is violated. Under serial correlation that assumption fails, and as we'll see, much like heteroskedasticity, it leads to faulty inference, biased standard errors, and inefficient estimates. So we're going to learn how to test for it and how to deal with it, and we will also revisit heteroskedasticity briefly.

Much as we learned with cross-sectional data, heteroskedasticity and autocorrelated errors give us a faulty variance estimator for the OLS estimators, and therefore we need to deal with them. They tend to be fairly common in time series, especially autocorrelated errors, because we are not drawing a random sample each time: parts of the model that we can't measure, that are seemingly random, recur over time, and we need to deal with that.

As we've learned before, when we have autocorrelated errors the good news is that there is no bias in the coefficients, but, as with heteroskedasticity, OLS is no longer BLUE. It's no longer the most efficient estimator, and the t-tests and F-tests don't work, due to the biased estimate of the variance.

The good news is that if the dynamics are properly specified, there's no problem; if we have put the right lags in, we can deal with it. There are also techniques, which we'll learn about now, to test for it and in some cases to transform the data.

Again, the problem with serial correlation is that OLS is no longer BLUE and the tests that we use no longer work, very much like the heteroskedasticity we've dealt with before.

We learned earlier about the AR(1) model, and we can assume in some cases that the errors follow a similar pattern.
This year's error is a function of last year's error plus some new E, some new random information. When P here, rho, equals zero, the errors are not correlated and we don't have a problem. When it's greater than zero, OLS underestimates the variance; when P is less than zero, it overestimates it; and our t, F, and Lagrange multiplier tests no longer work.

When we have strictly exogenous regressors, where the regressors are uncorrelated with the error terms in every time period, things are a little easier. Assume we have the AR(1) model for serial correlation from the last slide, where this year's error is a function of last year's error. Then you can simply run a regression with this year's error, UT, on the left side and the previous year's, UT minus one, on the right side. The null hypothesis is that P, the coefficient on UT minus one, equals zero. We hope to get a small test statistic so that we fail to reject the null and can assume we do not have serial correlation. But if we reject the null, then we have to deal with it.

Here's how you would do it. First regress YT on each of your regressors and save the residuals; then regress each UT on the previous year's residual. In Excel you would have to cut and paste, knocking the residual column down one row, and then run a t-test on this coefficient. Hopefully you fail to reject the null. Note that this only tests whether adjacent errors are correlated; since it's an AR(1), it only looks at the previous year's error, so if UT and UT minus two are correlated, this test won't pick that up.
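Here's a hedged sketch of that two-step test in Python rather than Excel; variable names are placeholders, and the t-statistic of interest is the one on the lagged residual.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("annual_data.csv").sort_values("year")   # assumed: year, y, x1, x2

# Step 1: regress y on the regressors and save the residuals u_t
step1 = sm.OLS(df["y"], sm.add_constant(df[["x1", "x2"]])).fit()
u = step1.resid

# Step 2: regress u_t on u_{t-1} (the "knock it down one row" step)
lagged = pd.DataFrame({"u": u, "u_lag1": u.shift(1)}).dropna()
step2 = sm.OLS(lagged["u"], sm.add_constant(lagged[["u_lag1"]])).fit()

print(step2.params["u_lag1"], step2.tvalues["u_lag1"])    # rho-hat and its t-stat
```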
A very common test is the so-called Durbin-Watson test, which also uses the saved residuals, the UT hats. The null hypothesis is again that P equals zero. The statistic has, in the numerator, the sum of the squared differences UT minus UT minus one, and, in the denominator, just the sum of squared residuals as we've come to know it. This statistic is approximately equal to two times the quantity one minus P. So if P equals zero, the statistic will be close to two, and we can fail to reject the null, which is a good thing: we don't have serial correlation. But if the statistic is far below two, then one minus P must be very small, meaning P is close to one; that implies P is significantly different from zero and we have serial correlation.

When we don't have strict exogeneity, which happens most often when we have lagged regressors, when we put not only this year's XT into the model but also last year's, we need a more general approach. This is the more common, more general case, and it is valid for any number of regressors.

The test is much the same, except that in the second step we include the regressors. We start by regressing YT on the regressors and saving the residuals. Then we regress the residuals on the original regressors and on UT minus one, and run the t-test on P, the coefficient on UT minus one. By including the regressors, we are controlling for any correlation between the Xs and the error terms, so we don't need the strict exogeneity assumption, but we might have to use robust standard errors if we suspect that heteroskedasticity is present.

We can also test for an autoregressive process of order two, where we regress UT on our regressors plus the residuals from each of the past two years and do a joint F-test, because here the null is that P1 and P2 are jointly equal to zero.
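Here's a sketch of that more general test in its AR(1) form, with the regressors included in the second step; the extension to AR(2) just adds a second lagged residual and a joint F-test. Names are placeholders, and statsmodels also ships a Breusch-Godfrey test (acorr_breusch_godfrey) that automates essentially this procedure.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("annual_data.csv").sort_values("year")   # assumed: year, y, x1, x2

# Step 1: regress y on the regressors, save the residuals
u = sm.OLS(df["y"], sm.add_constant(df[["x1", "x2"]])).fit().resid

# Step 2: regress u_t on the original regressors AND u_{t-1}
aux = df[["x1", "x2"]].copy()
aux["u_lag1"] = u.shift(1)
aux["u"] = u
aux = aux.dropna()
step2 = sm.OLS(aux["u"], sm.add_constant(aux[["x1", "x2", "u_lag1"]])).fit()
print(step2.tvalues["u_lag1"])   # t-test on rho; rejecting the null => serial correlation
```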
You can do this for any number of regressors and for some general number of lags Q: regress YT on the Xs, save the residuals, then regress those residuals on the Xs and on Q lagged residuals, and use a joint F-test that all of the rhos, P1, P2, up through PQ, are equal to zero. Again, we want to fail to reject that null, hoping that we don't have serial correlation.

If these tests determine that you do have serial correlation, if the coefficient on last year's error is significant for this year's error, then we have to do some sort of transformation. One thing we learned before is to difference: subtract last year from this year, the year before from last year, and so on. But you lose an observation, and this should remind you a little of panel data, where differencing cost us an observation and therefore cost us efficiency. There is a transformation, very similar in spirit to random effects models, that avoids this. You need to know the value of P, or rho, but if you do, you can transform the data as follows, and I'm going to show you; I've used a slide from the textbook slides because they look much neater. If you have strictly exogenous regressors, you can do this kind of quasi-differencing transformation, which is like differencing except that you subtract only rho times the previous year. If you look at roughly the third equation down, the one with the red boxes around it, instead of subtracting out the full value of last year's Y, you subtract out only rho times it, rho being some number between zero and one. The problem is that you need a good estimator for this value of rho.

You can do the same thing if you have an AR(2) or an AR(Q) process, and most of these procedures are available in more advanced stats packages.
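Here's a minimal sketch of the quasi-differencing idea just described, assuming rho has already been estimated (for example from the residual regression shown earlier); the rho value and column names are placeholders. Full FGLS procedures such as Cochrane-Orcutt or Prais-Winsten estimate rho and iterate this step.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("annual_data.csv").sort_values("year")   # assumed: year, y, x1, x2
rho = 0.5   # placeholder: in practice, use an estimate of rho

# Quasi-difference each variable: v_t - rho * v_{t-1}
qd = pd.DataFrame({
    "y":  df["y"]  - rho * df["y"].shift(1),
    "x1": df["x1"] - rho * df["x1"].shift(1),
    "x2": df["x2"] - rho * df["x2"].shift(1),
}).dropna()

# OLS on the quasi-differenced data (Cochrane-Orcutt style, dropping the first year)
qd_model = sm.OLS(qd["y"], sm.add_constant(qd[["x1", "x2"]])).fit()
print(qd_model.summary())
```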
I don't believe SPSS has an easy way to do this, but again, if you're working with time series data, you're probably going to want to work with a more econometrics-specific statistical software package like SAS.

So again, there's a trade-off here. If you have a static model like this and you think that UT follows this AR(1) process, then you can just subtract it out. That basically fixes a lot of our problems, because if UT is a random walk, subtracting it out and taking the difference leaves a zero-mean, constant-variance error term, and we can use OLS and our F- and t-tests and everything is fine. But again, we lose an observation.

There are also the so-called Newey-West corrections; most software that works specifically with time series will have them. These are serial-correlation-robust standard errors rather than a transformation of the data: they give you corrected standard errors, which allow you to do better t-tests, and they accommodate fairly general patterns of serial correlation. They do need a large sample. And as before, if you really have severe serial correlation, it might be best to simply take the difference and use that model instead.

There are also tests for heteroskedasticity, and you can use the ones we're already familiar with, but you have to rule out serial correlation first, because serial correlation can give these tests a false positive, making you think there is heteroskedasticity. So do the serial correlation tests first, and only if that's ruled out, run the Breusch-Pagan or White tests that we've learned; then we can use things like weighted least squares as before.
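To close, here's a hedged sketch of both ideas in Python: Newey-West (HAC) standard errors via statsmodels, and a Breusch-Pagan heteroskedasticity test run only after serial correlation has been addressed. The column names and the lag choice are placeholders.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

df = pd.read_csv("annual_data.csv").sort_values("year")   # assumed: year, y, x1, x2
X = sm.add_constant(df[["x1", "x2"]])

# Newey-West (HAC) standard errors: same OLS coefficients, corrected inference
nw = sm.OLS(df["y"], X).fit(cov_type="HAC", cov_kwds={"maxlags": 2})
print(nw.bse)            # serial-correlation-robust standard errors

# Breusch-Pagan test for heteroskedasticity (only after ruling out serial correlation)
ols = sm.OLS(df["y"], X).fit()
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(ols.resid, X)
print(lm_pval)           # a small p-value suggests heteroskedasticity
```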