1 00:00:04,540 --> 00:00:07,109 Welcome to Tools and Techniques. 2 00:00:07,109 --> 00:00:08,110 I'm Xana Wolf. 3 00:00:08,110 --> 00:00:11,257 I'm a web database developer at the FEMC, 4 00:00:11,257 --> 00:00:13,060 and I'm giving the first talk. 5 00:00:13,060 --> 00:00:14,693 I actually recorded it. 6 00:00:14,693 --> 00:00:16,766 I didn't need any more stress today. 7 00:00:16,766 --> 00:00:17,599 (audience laughing) 8 00:00:17,599 --> 00:00:19,300 (Xana laughing) 9 00:00:19,300 --> 00:00:20,860 I'm gonna take it off 10 00:00:20,860 --> 00:00:22,450 because it's a little bit long, 11 00:00:22,450 --> 00:00:24,790 but if you have questions 12 00:00:24,790 --> 00:00:26,080 that we don't get to at the end, 13 00:00:26,080 --> 00:00:27,790 I'm around for lunch 14 00:00:27,790 --> 00:00:28,780 and the poster session. 15 00:00:28,780 --> 00:00:30,230 So feel free to (indistinct). 16 00:00:36,970 --> 00:00:38,020 Welcome to a Tour 17 00:00:38,020 --> 00:00:40,690 of the Northeastern Forest Inventory Network 18 00:00:40,690 --> 00:00:42,040 or NEFIN. 19 00:00:42,040 --> 00:00:43,210 Many of you have probably heard 20 00:00:43,210 --> 00:00:44,650 of this project as it has been 21 00:00:44,650 --> 00:00:46,270 in the works for a few years now. 22 00:00:46,270 --> 00:00:49,690 But if not, what is NEFIN? 23 00:00:49,690 --> 00:00:53,067 Essentially NEFIN is a searchable database for CFI data 24 00:00:53,067 --> 00:00:55,660 from a variety of sources. 25 00:00:55,660 --> 00:00:56,650 So as you can see, 26 00:00:56,650 --> 00:00:57,490 we looked at a number 27 00:00:57,490 --> 00:01:00,100 of programs with studies on Northeast 28 00:01:00,100 --> 00:01:02,593 forests going back over 60 years. 
29 00:01:04,120 --> 00:01:05,020 Some of you may know 30 00:01:05,020 --> 00:01:06,963 about the precursor to NEFIN, 31 00:01:06,963 --> 00:01:09,100 an FEMC project from a few years ago 32 00:01:09,100 --> 00:01:11,770 called the CFI Program Comparison Tool, 33 00:01:11,770 --> 00:01:13,000 which provides details 34 00:01:13,000 --> 00:01:16,750 on the methodologies of various CFI studies. 35 00:01:16,750 --> 00:01:18,610 So if I clicked here under volume, 36 00:01:18,610 --> 00:01:21,820 biomass and carbon next to Mass CFI, 37 00:01:21,820 --> 00:01:23,410 you can see information about 38 00:01:23,410 --> 00:01:26,203 what Mass CFI is collecting in that area. 39 00:01:27,520 --> 00:01:28,810 And here's a view with all 40 00:01:28,810 --> 00:01:31,570 of the metadata from Mass CFI, 41 00:01:31,570 --> 00:01:33,370 plot density and layout. 42 00:01:33,370 --> 00:01:35,710 Things like expansion factors, 43 00:01:35,710 --> 00:01:39,130 codes for things like crown class 44 00:01:39,130 --> 00:01:40,480 thresholds and standards 45 00:01:40,480 --> 00:01:43,180 like minimum DBH and also things 46 00:01:43,180 --> 00:01:44,320 like sapling and seedling 47 00:01:44,320 --> 00:01:47,530 attributes and forest health indicators. 48 00:01:47,530 --> 00:01:49,720 So once we had collected all of this information 49 00:01:49,720 --> 00:01:51,250 and analyzed it for NEFIN, 50 00:01:51,250 --> 00:01:52,270 we were able to come up 51 00:01:52,270 --> 00:01:54,940 with a set of standard fields. 52 00:01:54,940 --> 00:01:56,920 We came up with about 60 of them 53 00:01:56,920 --> 00:01:58,840 and we correlate them to certain fields 54 00:01:58,840 --> 00:02:00,760 in a program's data set. 55 00:02:00,760 --> 00:02:03,550 Very important things like DBH and height, 56 00:02:03,550 --> 00:02:06,520 which you can see belong to a tree category. 57 00:02:06,520 --> 00:02:09,073 So each tree can have a DBH and a height. 
58 00:02:09,910 --> 00:02:11,950 One of the things we are also able 59 00:02:11,950 --> 00:02:12,910 to offer programs 60 00:02:12,910 --> 00:02:15,790 is the option to fuzz their plot locations. 61 00:02:15,790 --> 00:02:17,860 So a plot latitude and longitude aren't 62 00:02:17,860 --> 00:02:19,780 available to the public. 63 00:02:19,780 --> 00:02:21,760 This is a program level definition 64 00:02:21,760 --> 00:02:23,290 and you can see that these fields 65 00:02:23,290 --> 00:02:24,730 are marked private, 66 00:02:24,730 --> 00:02:26,930 so they won't show up on the public website. 67 00:02:27,880 --> 00:02:29,650 Then we have a few more tree metrics, 68 00:02:29,650 --> 00:02:32,500 species, sample year, ID, 69 00:02:32,500 --> 00:02:35,353 and then we have forest health impacts, 70 00:02:36,220 --> 00:02:38,560 which signifies an issue in forest health. 71 00:02:38,560 --> 00:02:40,180 So I'm gonna pull up a list of them 72 00:02:40,180 --> 00:02:41,290 from our database so you 73 00:02:41,290 --> 00:02:42,520 can see how these are done. 74 00:02:42,520 --> 00:02:44,770 So for example, you can see FEMC 75 00:02:44,770 --> 00:02:47,380 forest health monitoring has vigor, 76 00:02:47,380 --> 00:02:50,320 dieback, defoliation and discoloration, 77 00:02:50,320 --> 00:02:52,900 and a vigor that's in three, four, five 78 00:02:52,900 --> 00:02:57,850 or eight will identify a forest health impact 79 00:02:57,850 --> 00:02:59,200 or a forest health concern. 80 00:02:59,200 --> 00:03:01,210 A dieback of greater than 55 81 00:03:01,210 --> 00:03:03,610 or defoliation of greater than zero would 82 00:03:03,610 --> 00:03:06,673 also indicate a forest health impact. 83 00:03:08,110 --> 00:03:10,210 We go back to our list of metrics. 84 00:03:10,210 --> 00:03:13,330 We can see more things like plot sample year 85 00:03:13,330 --> 00:03:15,820 and plot ID that are plot categorized 86 00:03:15,820 --> 00:03:17,950 a sample year for saplings. 
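[Editor's sketch] The forest health impact rule read out above (vigor in three, four, five or eight, dieback greater than 55, or defoliation greater than zero) could be expressed roughly as follows. This is only an illustration under stated assumptions: the function and constant names are hypothetical, the units of dieback and defoliation are not given in the talk, and no threshold was stated for discoloration, so it is left out.

```python
# Hypothetical rule set modeled on the thresholds mentioned in the talk.
# None of these names come from NEFIN itself.
IMPACT_VIGOR_CODES = {3, 4, 5, 8}   # vigor codes that flag an impact
DIEBACK_THRESHOLD = 55              # impact if dieback is greater than this
DEFOLIATION_THRESHOLD = 0           # impact if defoliation is greater than this


def has_forest_health_impact(vigor=None, dieback=None, defoliation=None):
    """Return True if any one of the three rules flags the observation."""
    if vigor is not None and vigor in IMPACT_VIGOR_CODES:
        return True
    if dieback is not None and dieback > DIEBACK_THRESHOLD:
        return True
    if defoliation is not None and defoliation > DEFOLIATION_THRESHOLD:
        return True
    return False


# A tree with healthy vigor but heavy dieback is still flagged.
flagged = has_forest_health_impact(vigor=1, dieback=60, defoliation=0)
```

Each program would carry its own version of these rules, since the talk describes them as stored per program in the database.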
87 00:03:17,950 --> 00:03:18,850 And then we have a bunch 88 00:03:18,850 --> 00:03:21,070 of program level definitions. 89 00:03:21,070 --> 00:03:23,743 And like I said, we have about 60 of these fields. 90 00:03:24,970 --> 00:03:26,772 So here is a little visualization 91 00:03:26,772 --> 00:03:29,710 of the NEFIN processing workflow. 92 00:03:29,710 --> 00:03:30,970 So you can see these programs 93 00:03:30,970 --> 00:03:32,920 on the left have differing data. 94 00:03:32,920 --> 00:03:34,600 They have different IDs, 95 00:03:34,600 --> 00:03:36,010 they have a different field name 96 00:03:36,010 --> 00:03:36,970 for their species 97 00:03:36,970 --> 00:03:38,980 and different codes for their species. 98 00:03:38,980 --> 00:03:40,990 They could have different units 99 00:03:40,990 --> 00:03:42,733 that they're measuring things in. 100 00:03:44,170 --> 00:03:46,690 So the data is fed into the NEFIN 101 00:03:46,690 --> 00:03:48,790 pre-processor where we have scripts 102 00:03:48,790 --> 00:03:51,370 to match the program's fields with our standard 103 00:03:51,370 --> 00:03:53,200 fields and translate codes 104 00:03:53,200 --> 00:03:55,480 like crown class or species codes. 105 00:03:55,480 --> 00:03:58,930 We save a copy of the original uploaded data files, 106 00:03:58,930 --> 00:04:01,600 code list, et cetera as a snapshot. 107 00:04:01,600 --> 00:04:02,770 And then we process the data 108 00:04:02,770 --> 00:04:04,630 into our own standardized data 109 00:04:04,630 --> 00:04:07,540 tables with our standardized projected 110 00:04:07,540 --> 00:04:09,730 coordinate system units and codes. 111 00:04:09,730 --> 00:04:10,960 And we keep a record of how 112 00:04:10,960 --> 00:04:13,780 to match each program's codes to our codes. 
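[Editor's sketch] The pre-processing step described above (match a program's field names to standard fields, translate its codes, and convert units) might look something like this. It is a minimal sketch, not the actual NEFIN pre-processor: every field name, code, and function here is invented for illustration.

```python
INCH_TO_CM = 2.54  # exact conversion factor


def standardize_row(row, field_map, code_maps, unit_factors):
    """Return a new dict keyed by standard field names.

    field_map:    program field name -> standard field name
    code_maps:    standard field name -> {program code: standard code}
    unit_factors: standard field name -> multiplier into the standard unit
    """
    out = {}
    for source_name, standard_name in field_map.items():
        value = row.get(source_name)
        if value is not None and standard_name in code_maps:
            # Unknown codes pass through unchanged here; the talk describes
            # a separate step where new codes must be matched by the user.
            value = code_maps[standard_name].get(value, value)
        factor = unit_factors.get(standard_name)
        if value is not None and factor is not None:
            value = round(float(value) * factor, 2)
        out[standard_name] = value
    return out


# Hypothetical program row: DBH recorded in inches, species as a short code.
row = {"TreeNo": "17", "Spp": "PB", "Diam_in": "6.0"}
field_map = {"TreeNo": "tree_id", "Spp": "species", "Diam_in": "dbh_cm"}
code_maps = {"species": {"PB": "paper birch"}}
unit_factors = {"dbh_cm": INCH_TO_CM}

standardized = standardize_row(row, field_map, code_maps, unit_factors)
```

Keeping the three mappings as plain data, separate from the function, mirrors the point made in the talk that a record of how each program's codes map to the standard codes is retained.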
113 00:04:13,780 --> 00:04:15,460 The data can be loaded, again, 114 00:04:15,460 --> 00:04:17,560 if the original data needs to be updated 115 00:04:17,560 --> 00:04:20,350 because of QC or if we have additional years 116 00:04:20,350 --> 00:04:21,850 monitored as time goes on, 117 00:04:21,850 --> 00:04:24,220 we can just put that new additional data 118 00:04:24,220 --> 00:04:27,250 into the program and then at the end, 119 00:04:27,250 --> 00:04:30,520 the users front end users can access 120 00:04:30,520 --> 00:04:33,310 our website to search our standardized data 121 00:04:33,310 --> 00:04:35,863 or download copies of the original data. 122 00:04:37,210 --> 00:04:38,350 All right, so now let's look 123 00:04:38,350 --> 00:04:41,383 at how we actually load data onto the website. 124 00:04:43,720 --> 00:04:46,330 So NEFIN is not currently public facing yet. 125 00:04:46,330 --> 00:04:47,470 We're very close, 126 00:04:47,470 --> 00:04:49,900 but we're just wrapping up with our own QC 127 00:04:49,900 --> 00:04:51,130 and we haven't moved everything 128 00:04:51,130 --> 00:04:54,250 to its permanent home yet, but soon, 129 00:04:54,250 --> 00:04:55,390 and I'm sure we'll announce 130 00:04:55,390 --> 00:04:58,030 that through our FEMC social media and newsletter. 131 00:04:58,030 --> 00:04:59,320 So definitely subscribe 132 00:04:59,320 --> 00:05:00,913 to those if you don't already. 133 00:05:01,960 --> 00:05:03,280 So for loading data, 134 00:05:03,280 --> 00:05:04,480 the program manager would have 135 00:05:04,480 --> 00:05:06,490 to log in to get to this page 136 00:05:06,490 --> 00:05:08,050 and this is where they would be taken 137 00:05:08,050 --> 00:05:10,000 to begin uploading their data. 138 00:05:10,000 --> 00:05:11,680 You can see at the top the steps 139 00:05:11,680 --> 00:05:13,120 they'll have to go through. 
140 00:05:13,120 --> 00:05:17,380 You'll upload files, match fields, check code lists, 141 00:05:17,380 --> 00:05:20,770 quality checks, ancillary file uploads, 142 00:05:20,770 --> 00:05:22,783 and then review and submit. 143 00:05:23,920 --> 00:05:25,510 So in this example, 144 00:05:25,510 --> 00:05:27,430 this is a MEER test case 145 00:05:27,430 --> 00:05:29,170 that just means that the files 146 00:05:29,170 --> 00:05:30,910 we're uploading will be in the format 147 00:05:30,910 --> 00:05:32,290 that the Maine Ecological 148 00:05:32,290 --> 00:05:34,900 Reserves program provided to us. 149 00:05:34,900 --> 00:05:36,550 Right now we have to pre-set 150 00:05:36,550 --> 00:05:37,720 up the structure for each 151 00:05:37,720 --> 00:05:39,280 program prior 152 00:05:39,280 --> 00:05:40,870 to having their data loaded. 153 00:05:40,870 --> 00:05:43,810 So we know what format to expect. 154 00:05:43,810 --> 00:05:44,650 We're planning to make 155 00:05:44,650 --> 00:05:46,540 a generic template in the future so 156 00:05:46,540 --> 00:05:48,100 that a new program could format 157 00:05:48,100 --> 00:05:49,930 their data to fit our structure 158 00:05:49,930 --> 00:05:50,860 and we wouldn't have to do 159 00:05:50,860 --> 00:05:53,380 as much setup to bring in a new program. 160 00:05:53,380 --> 00:05:55,296 But for now, we need to know which columns 161 00:05:55,296 --> 00:05:58,510 in their data contain our 162 00:05:58,510 --> 00:06:00,430 required fields such as plot 163 00:06:00,430 --> 00:06:02,500 or tree IDs and sample years. 164 00:06:02,500 --> 00:06:05,500 So first we'll want to enter the years 165 00:06:05,500 --> 00:06:08,380 that our data contains that we're uploading. 166 00:06:08,380 --> 00:06:10,873 So I know that I have 2002, 167 00:06:13,030 --> 00:06:18,030 2004 and 2010 in this data set. 
168 00:06:18,940 --> 00:06:20,620 And then I'm gonna confirm 169 00:06:20,620 --> 00:06:23,020 that the data set field names 170 00:06:23,020 --> 00:06:25,750 that I uploaded prior still 171 00:06:25,750 --> 00:06:27,733 match these fields. 172 00:06:29,110 --> 00:06:29,943 In the future 173 00:06:29,943 --> 00:06:31,990 we are gonna move this around a little bit, 174 00:06:31,990 --> 00:06:33,340 so this step won't be necessary, 175 00:06:33,340 --> 00:06:34,900 but for now I'm just gonna confirm 176 00:06:34,900 --> 00:06:36,550 that that's all the same. 177 00:06:36,550 --> 00:06:38,830 And then we're taken to step two, 178 00:06:38,830 --> 00:06:40,330 upload our actual files. 179 00:06:40,330 --> 00:06:42,880 So you can see here we give you the name 180 00:06:42,880 --> 00:06:44,740 of the last file that was 181 00:06:44,740 --> 00:06:46,840 uploaded so that you can go 182 00:06:46,840 --> 00:06:48,220 and check that your 183 00:06:48,220 --> 00:06:52,210 field names are the same or you can match that 184 00:06:52,210 --> 00:06:53,980 these names are the same when uploading. 185 00:06:53,980 --> 00:06:56,500 So I already selected these. 186 00:06:56,500 --> 00:06:59,323 We're gonna upload and pre-process. 187 00:07:06,310 --> 00:07:09,520 Now that the files are pre-processed, 188 00:07:09,520 --> 00:07:11,443 we have to match fields. 189 00:07:13,030 --> 00:07:14,110 You can see our NEFIN 190 00:07:14,110 --> 00:07:17,260 standard fields are listed on the left. 191 00:07:17,260 --> 00:07:18,910 And then there's a dropdown here 192 00:07:18,910 --> 00:07:20,890 with all of the data field names. 193 00:07:20,890 --> 00:07:22,210 So these are all of the fields 194 00:07:22,210 --> 00:07:23,443 that were in the data. 
195 00:07:24,280 --> 00:07:26,920 So now if we had changed the name 196 00:07:26,920 --> 00:07:29,170 of our DBH field and there 197 00:07:29,170 --> 00:07:31,930 was no longer a DBH field in the data, 198 00:07:31,930 --> 00:07:32,830 this would be blank 199 00:07:32,830 --> 00:07:35,110 and it would have a red border around it 200 00:07:35,110 --> 00:07:36,520 'cause it's required 201 00:07:36,520 --> 00:07:38,170 and you wouldn't be able to go to the next 202 00:07:38,170 --> 00:07:41,113 step until you chose a field that matched DBH. 203 00:07:42,130 --> 00:07:45,130 So you can also set units for your field. 204 00:07:45,130 --> 00:07:46,900 You get a little visualization 205 00:07:46,900 --> 00:07:48,130 of the sample data there, 206 00:07:48,130 --> 00:07:49,690 and then you can put in notes 207 00:07:49,690 --> 00:07:52,210 or a description of your field. 208 00:07:52,210 --> 00:07:53,980 So these are all tree fields. 209 00:07:53,980 --> 00:07:56,180 Under this tree setting 210 00:07:57,070 --> 00:07:58,600 you can see one of these forest 211 00:07:58,600 --> 00:08:02,020 health impact fields like we talked about before, 212 00:08:02,020 --> 00:08:04,090 damage type in the MEER data 213 00:08:04,090 --> 00:08:05,470 if it's greater than zero, 214 00:08:05,470 --> 00:08:09,520 that would indicate a forest health impact. 215 00:08:09,520 --> 00:08:11,920 And then we can look at our plot fields, 216 00:08:11,920 --> 00:08:13,090 our sapling fields, 217 00:08:13,090 --> 00:08:14,350 our seedling fields, 218 00:08:14,350 --> 00:08:16,170 and then program information here. 219 00:08:16,170 --> 00:08:18,670 So you can update your expansion factors, 220 00:08:18,670 --> 00:08:20,140 your coordinate systems, 221 00:08:20,140 --> 00:08:22,660 make sure that everything looks good. 222 00:08:22,660 --> 00:08:24,250 And we're gonna say that that's good. 223 00:08:24,250 --> 00:08:26,200 And go to the next page. 
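[Editor's sketch] The match-fields step described above blocks progress while a required standard field, such as DBH, has no matching column in the uploaded data. A check along those lines could be sketched as below; the function and the field names are hypothetical and are not taken from the NEFIN code.

```python
def unmatched_required_fields(required, matches, data_columns):
    """List required standard fields with no valid column selected.

    required:     standard field names that must be matched
    matches:      standard field name -> column chosen in the dropdown
    data_columns: column names actually present in the uploaded file
    """
    missing = []
    for field in required:
        chosen = matches.get(field)
        if chosen is None or chosen not in data_columns:
            missing.append(field)
    return missing


# The saved match still points at "DBH", but the new file renamed that column.
required = ["dbh", "tree_id"]
matches = {"dbh": "DBH", "tree_id": "TreeNo"}
data_columns = {"Diameter", "TreeNo"}

missing = unmatched_required_fields(required, matches, data_columns)
can_continue = not missing
```

In the interface shown in the talk, a field in `missing` would be the one drawn blank with a red border.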
224 00:08:26,200 --> 00:08:27,850 On this page we are going 225 00:08:27,850 --> 00:08:29,680 to match any codes. 226 00:08:29,680 --> 00:08:31,630 So what this is telling us is 227 00:08:31,630 --> 00:08:33,040 that there are no new codes in 228 00:08:33,040 --> 00:08:34,330 our sapling data, 229 00:08:34,330 --> 00:08:37,540 our seedling data for size class for species. 230 00:08:37,540 --> 00:08:38,440 Oh, but look, 231 00:08:38,440 --> 00:08:40,720 we have a new code in our tree data. 232 00:08:40,720 --> 00:08:44,140 So I went in and I added this underscore test 233 00:08:44,140 --> 00:08:45,343 to one of the, 234 00:08:46,180 --> 00:08:48,400 I guess that's paper birch. 235 00:08:48,400 --> 00:08:51,070 So it came up as a new code. 236 00:08:51,070 --> 00:08:53,320 So I'm gonna enter paper birch test 237 00:08:53,320 --> 00:08:55,330 as my code meaning, 238 00:08:55,330 --> 00:08:59,353 and then I'm gonna match it to paper birch. 239 00:09:00,280 --> 00:09:02,740 And you can see I can search by common name 240 00:09:02,740 --> 00:09:04,123 or scientific name. 241 00:09:05,020 --> 00:09:06,600 And now I've matched those codes 242 00:09:06,600 --> 00:09:08,653 so we can go on to the next step. 243 00:09:11,740 --> 00:09:14,530 Okay, now this is quality control checks. 244 00:09:14,530 --> 00:09:16,840 So we've made decisions about what we want 245 00:09:16,840 --> 00:09:19,273 to allow into our standardized database. 246 00:09:20,470 --> 00:09:22,690 So here you can see we have a warning, 247 00:09:22,690 --> 00:09:23,980 non-numeric value, 248 00:09:23,980 --> 00:09:27,130 empty blank found in field 13 height 249 00:09:27,130 --> 00:09:27,963 first occurrence found 250 00:09:27,963 --> 00:09:30,730 in line 12 and occurs 16 times. 
251 00:09:30,730 --> 00:09:32,800 So warnings are just warnings 252 00:09:32,800 --> 00:09:33,940 and you can continue on 253 00:09:33,940 --> 00:09:35,200 and the data will be included 254 00:09:35,200 --> 00:09:37,333 in our database for better or worse. 255 00:09:38,920 --> 00:09:40,240 The next thing you can see 256 00:09:40,240 --> 00:09:43,600 is an error out of range value 257 00:09:43,600 --> 00:09:45,130 zero centimeters found 258 00:09:45,130 --> 00:09:48,280 in field five DBH range is five inches, 259 00:09:48,280 --> 00:09:50,920 12.7 centimeters to 300 centimeters. 260 00:09:50,920 --> 00:09:52,660 First occurrence found line eight 261 00:09:52,660 --> 00:09:54,670 and it occurs two times. 262 00:09:54,670 --> 00:09:56,740 So row with errors will be removed 263 00:09:56,740 --> 00:09:58,570 before entry into our database. 264 00:09:58,570 --> 00:10:00,340 You may fix the errors and return 265 00:10:00,340 --> 00:10:01,510 to step one and upload 266 00:10:01,510 --> 00:10:04,120 files to upload revised files 267 00:10:04,120 --> 00:10:06,463 or click next to accept removal. 268 00:10:07,570 --> 00:10:12,570 So what this is saying is that (indistinct) 269 00:10:13,000 --> 00:10:15,610 has defined a minimum tree DBH 270 00:10:15,610 --> 00:10:20,080 of five inches or 12.7 centimeters and we found a 271 00:10:20,080 --> 00:10:21,940 DBH of zero centimeters, 272 00:10:21,940 --> 00:10:24,400 which does not meet that criteria. 273 00:10:24,400 --> 00:10:26,950 It's not within five and 300 centimeters. 274 00:10:26,950 --> 00:10:28,180 So we are gonna say that 275 00:10:28,180 --> 00:10:29,920 that does not meet the definition 276 00:10:29,920 --> 00:10:32,290 of a tree as defined by the program. 277 00:10:32,290 --> 00:10:34,543 And so we're gonna remove those rows. 278 00:10:36,640 --> 00:10:37,473 Just to switch back 279 00:10:37,473 --> 00:10:39,433 to the PowerPoint for a minute, 280 00:10:41,920 --> 00:10:44,320 these are our NEFIN standard ranges. 
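[Editor's sketch] The quality check walked through above treats an out-of-range value as an error (the row is removed) and an empty or non-numeric value as a warning (the row is kept). A rough illustration follows, using the program-defined DBH range of 12.7 to 300 centimeters from the example. The function name, the message wording, and the choice to apply both rules to a single field are assumptions made for the sketch.

```python
def quality_check(rows, field, low, high):
    """Split rows into kept rows, errors, and warnings for one numeric field.

    Errors remove the row; warnings keep it. Line numbers start at 1.
    """
    kept, errors, warnings = [], [], []
    for line_no, row in enumerate(rows, start=1):
        raw = row.get(field)
        if raw is None or str(raw).strip() == "":
            warnings.append((line_no, "empty value"))
            kept.append(row)
            continue
        try:
            value = float(raw)
        except ValueError:
            warnings.append((line_no, "non-numeric value"))
            kept.append(row)
            continue
        if low <= value <= high:
            kept.append(row)
        else:
            errors.append((line_no, f"out of range value {value}"))
    return kept, errors, warnings


rows = [{"dbh_cm": "20"}, {"dbh_cm": "0"}, {"dbh_cm": ""}]
kept, errors, warnings = quality_check(rows, "dbh_cm", low=12.7, high=300)
```

Passing the range in as arguments leaves room for each program to supply its own limits.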
281 00:10:44,320 --> 00:10:46,180 So if a program does not 282 00:10:46,180 --> 00:10:48,670 provide ranges for tree DBH, 283 00:10:48,670 --> 00:10:50,320 we're gonna say that the DBH 284 00:10:50,320 --> 00:10:51,410 has to be between 10 285 00:10:51,410 --> 00:10:53,950 centimeters and 300 centimeters. 286 00:10:53,950 --> 00:10:57,210 Sapling DBH has to be between 2.54 centimeters 287 00:10:57,210 --> 00:10:59,920 to 10 centimeters and tree height 288 00:10:59,920 --> 00:11:01,600 has to be from zero centimeters 289 00:11:01,600 --> 00:11:03,610 to 5,000 centimeters. 290 00:11:03,610 --> 00:11:05,800 But if a program has a different definition, 291 00:11:05,800 --> 00:11:07,273 we will honor those. 292 00:11:13,060 --> 00:11:15,220 So I am gonna go ahead and accept 293 00:11:15,220 --> 00:11:16,930 that those rows with zero 294 00:11:16,930 --> 00:11:18,793 DBH will be removed. 295 00:11:20,590 --> 00:11:21,423 Next, we're allowed 296 00:11:21,423 --> 00:11:23,530 to upload any additional files, 297 00:11:23,530 --> 00:11:27,010 field protocols, information about your code lists, 298 00:11:27,010 --> 00:11:29,530 et cetera that you wanna include in this package. 299 00:11:29,530 --> 00:11:32,208 So I am going to include 300 00:11:32,208 --> 00:11:35,330 the monitoring plan from 2003 301 00:11:39,730 --> 00:11:41,260 and then I can add more files 302 00:11:41,260 --> 00:11:42,160 or remove this file. 303 00:11:42,160 --> 00:11:43,753 I'm just gonna upload that one. 304 00:11:45,640 --> 00:11:48,580 And then finally we can review and submit. 305 00:11:48,580 --> 00:11:50,236 So I will give this the date. 306 00:11:50,236 --> 00:11:53,800 (keyboard clicks) 307 00:11:53,800 --> 00:11:58,800 And we'll say years 2002, 2004 and 2010 loaded. 308 00:12:00,070 --> 00:12:01,573 And I'll put my name. 309 00:12:04,870 --> 00:12:08,410 Now we can review what our input files are. 310 00:12:08,410 --> 00:12:10,780 Oh, and look, we have a warning. 
311 00:12:10,780 --> 00:12:12,010 So the years we've specified 312 00:12:12,010 --> 00:12:13,630 for this version overlap with 313 00:12:13,630 --> 00:12:16,516 existing data already uploaded into NEFIN, 314 00:12:16,516 --> 00:12:19,900 so I guess the year 2002 was already in there. 315 00:12:19,900 --> 00:12:22,210 So what'll happen is if we proceed 316 00:12:22,210 --> 00:12:23,500 with clicking submit version, 317 00:12:23,500 --> 00:12:25,510 the existing data associated 318 00:12:25,510 --> 00:12:26,343 with these years will be 319 00:12:26,343 --> 00:12:28,570 removed and replaced with the data 320 00:12:28,570 --> 00:12:31,210 that we are about to upload. 321 00:12:31,210 --> 00:12:32,500 The previous version of data 322 00:12:32,500 --> 00:12:35,170 will still be preserved in the version package, 323 00:12:35,170 --> 00:12:36,730 that snapshot that was taken 324 00:12:36,730 --> 00:12:38,140 at the last upload. 325 00:12:38,140 --> 00:12:40,000 But the standardized data 326 00:12:40,000 --> 00:12:41,200 that you can search in the front 327 00:12:41,200 --> 00:12:42,940 end of NEFIN is gonna be updated 328 00:12:42,940 --> 00:12:44,830 to what we are uploading now. 329 00:12:44,830 --> 00:12:46,693 And I'm gonna say that that's okay. 330 00:12:48,106 --> 00:12:50,350 And then we can review our standard metrics, 331 00:12:50,350 --> 00:12:53,350 our program definitions, all of these plot trees, 332 00:12:53,350 --> 00:12:54,613 sapling metrics, 333 00:12:55,540 --> 00:12:57,820 our new codes that are going in 334 00:12:57,820 --> 00:12:59,950 and our ancillary files that we upload. 335 00:12:59,950 --> 00:13:01,873 So I'm gonna submit that version. 336 00:13:03,280 --> 00:13:04,510 So we can check our upload 337 00:13:04,510 --> 00:13:07,090 status and once it's finished 338 00:13:07,090 --> 00:13:08,110 we'll be allowed to upload 339 00:13:08,110 --> 00:13:09,703 another version if we want. 
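[Editor's sketch] The overlap behavior described above (standardized data for the submitted years is replaced, while earlier uploads stay preserved as snapshots) can be pictured with in-memory lists. This is a toy model under stated assumptions, not how NEFIN stores its data; all names and the row layout are hypothetical.

```python
def submit_version(standardized, snapshots, new_rows, years, label):
    """Replace standardized rows for the given years and record a snapshot.

    Returns the updated standardized rows and the list of years that
    overlapped with data already loaded. Earlier snapshots are untouched.
    """
    years = set(years)
    overlap = sorted({r["year"] for r in standardized if r["year"] in years})
    snapshots.append(
        {"label": label, "years": sorted(years), "rows": list(new_rows)}
    )
    retained = [r for r in standardized if r["year"] not in years]
    return retained + list(new_rows), overlap


# Existing state: 1998 and 2002 are already loaded from an earlier version.
old_rows = [{"year": 1998, "plot": "A"}, {"year": 2002, "plot": "A"}]
snapshots = [{"label": "v1", "years": [1998, 2002], "rows": list(old_rows)}]

new_rows = [{"year": 2002, "plot": "A"}, {"year": 2004, "plot": "A"},
            {"year": 2010, "plot": "A"}]
updated, overlap = submit_version(old_rows, snapshots, new_rows,
                                  [2002, 2004, 2010], "v2")
```

Here `overlap` is what would drive the warning shown before submission, and the first snapshot still holds the superseded 2002 rows.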
340 00:13:10,540 --> 00:13:11,770 So that's how a program will 341 00:13:11,770 --> 00:13:13,540 upload data into the system. 342 00:13:13,540 --> 00:13:15,490 So let's now check out what it's like 343 00:13:15,490 --> 00:13:17,773 to search the data on the front end. 344 00:13:19,750 --> 00:13:23,773 So again, this isn't public yet, but stay tuned. 345 00:13:24,610 --> 00:13:26,650 So the homepage just gives 346 00:13:26,650 --> 00:13:28,750 a few summary statistics like 347 00:13:28,750 --> 00:13:32,260 how many plots, trees, and species we have. 348 00:13:32,260 --> 00:13:33,820 And these numbers will update 349 00:13:33,820 --> 00:13:36,160 every time data is loaded. 350 00:13:36,160 --> 00:13:38,140 And then we have a little chart below 351 00:13:38,140 --> 00:13:40,900 that for top 10 species by basal area, 352 00:13:40,900 --> 00:13:43,543 which is just a whole lot of sugar maple. 353 00:13:45,400 --> 00:13:47,530 So the bulk of the utility of this site 354 00:13:47,530 --> 00:13:49,573 is in the get data page. 355 00:13:52,030 --> 00:13:53,080 Here you can search through 356 00:13:53,080 --> 00:13:56,470 our various entities, programs, plots, trees, 357 00:13:56,470 --> 00:13:59,500 saplings and seedlings using 358 00:13:59,500 --> 00:14:01,720 different criteria on the left. 359 00:14:01,720 --> 00:14:03,100 And these criteria will change 360 00:14:03,100 --> 00:14:05,710 depending on what entity you're looking at. 361 00:14:05,710 --> 00:14:08,020 I'll expand this for now. 362 00:14:08,020 --> 00:14:10,780 So I can search by program, by state, 363 00:14:10,780 --> 00:14:12,130 or by time range as far 364 00:14:12,130 --> 00:14:14,020 as the programs are concerned. 365 00:14:14,020 --> 00:14:19,020 So say we want to search for everything in Vermont 366 00:14:21,670 --> 00:14:23,770 that gives us five programs. 
367 00:14:23,770 --> 00:14:28,770 And then let's say I only want from 2020 to 2022, 368 00:14:29,320 --> 00:14:32,410 that narrows it down to four programs. 369 00:14:32,410 --> 00:14:34,300 So if I wanted to download my data, 370 00:14:34,300 --> 00:14:36,640 everything from those four programs, 371 00:14:36,640 --> 00:14:38,320 I could either go to this download tab 372 00:14:38,320 --> 00:14:39,160 or this link where it 373 00:14:39,160 --> 00:14:41,170 says download results 374 00:14:41,170 --> 00:14:43,220 and it takes us to this download tab 375 00:14:44,080 --> 00:14:45,640 where I can download a package 376 00:14:45,640 --> 00:14:47,320 that has all of the original 377 00:14:47,320 --> 00:14:51,340 data, that snapshot that we uploaded, 378 00:14:51,340 --> 00:14:53,210 all of the standardized data 379 00:14:54,280 --> 00:14:55,300 that we were just searching 380 00:14:55,300 --> 00:14:58,690 for, and then all of those related files. 381 00:14:58,690 --> 00:15:00,460 And then this option is 382 00:15:00,460 --> 00:15:02,110 to include the original data in the 383 00:15:02,110 --> 00:15:05,170 standardized data file called a doc field. 384 00:15:05,170 --> 00:15:06,559 What we've done is we've taken 385 00:15:06,559 --> 00:15:08,470 this original data and if you 386 00:15:08,470 --> 00:15:13,210 have a tree observation, we've stuck a JSON object, 387 00:15:13,210 --> 00:15:15,760 basically a string of all that original data 388 00:15:15,760 --> 00:15:18,520 into one field in our standardized data. 389 00:15:18,520 --> 00:15:19,353 So if you wanted, 390 00:15:19,353 --> 00:15:21,220 you could only download the standardized 391 00:15:21,220 --> 00:15:23,470 data but include that original data 392 00:15:23,470 --> 00:15:26,350 in the doc field so that you would get both. 393 00:15:26,350 --> 00:15:27,610 Then you also have the option 394 00:15:27,610 --> 00:15:30,190 to just download these files. 
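[Editor's sketch] Since the doc field described above holds the original record as a JSON string inside the standardized file, a downloaded CSV could be split back into standardized and original values roughly like this. The column name `doc` follows the talk, but the exact column name in the real export, and every other field name here, are assumptions.

```python
import csv
import io
import json


def split_standard_and_original(csv_text, doc_column="doc"):
    """Yield (standardized_fields, original_record) pairs from CSV text."""
    for record in csv.DictReader(io.StringIO(csv_text)):
        original = json.loads(record.pop(doc_column))
        yield record, original


def make_sample_csv():
    """Build a one-row stand-in for a downloaded standardized tree file."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["tree_id", "dbh_cm", "doc"])
    writer.writeheader()
    writer.writerow({
        "tree_id": "17",
        "dbh_cm": "15.24",
        # Original program record, embedded as a JSON string.
        "doc": json.dumps({"TreeNo": "17", "Diam_in": "6.0"}),
    })
    return buf.getvalue()


pairs = list(split_standard_and_original(make_sample_csv()))
standard, original = pairs[0]
```

The `csv` module handles the quoting of the embedded JSON, so the string survives the round trip intact.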
395 00:15:30,190 --> 00:15:33,140 So these are grayed out these packages 396 00:15:34,000 --> 00:15:36,070 because these programs have chosen 397 00:15:36,070 --> 00:15:37,330 to fuzz their plot data. 398 00:15:37,330 --> 00:15:38,440 So you're not actually allowed 399 00:15:38,440 --> 00:15:40,930 to download the original data from them. 400 00:15:40,930 --> 00:15:42,400 You would have to contact us 401 00:15:42,400 --> 00:15:43,960 or contact the program to see 402 00:15:43,960 --> 00:15:46,150 if you could access that original data. 403 00:15:46,150 --> 00:15:49,420 But FEMC, we're not fuzzing our plot. 404 00:15:49,420 --> 00:15:51,160 So you can actually download the package. 405 00:15:51,160 --> 00:15:54,100 So this would be our original data package. 406 00:15:54,100 --> 00:15:56,890 And then you also have these scripts alongside. 407 00:15:56,890 --> 00:15:59,140 So these are our pre-processing scripts. 408 00:15:59,140 --> 00:16:00,730 There's one for each project. 409 00:16:00,730 --> 00:16:03,790 And then allfuncs just has common functions 410 00:16:03,790 --> 00:16:04,930 that are used by other scripts. 411 00:16:04,930 --> 00:16:07,870 So that would be downloaded as well. 412 00:16:07,870 --> 00:16:09,850 So I'm gonna wait to download 413 00:16:09,850 --> 00:16:11,380 until we refine our search a 414 00:16:11,380 --> 00:16:13,300 little bit more so we're 415 00:16:13,300 --> 00:16:15,790 not downloading quite as much stuff. 416 00:16:15,790 --> 00:16:19,873 So let us look at the plot table. 417 00:16:21,790 --> 00:16:23,320 You'll see when we change tabs, 418 00:16:23,320 --> 00:16:25,120 our search criteria remains. 419 00:16:25,120 --> 00:16:28,390 So we're still searching Vermont year 2020. 420 00:16:28,390 --> 00:16:30,040 And then this actually changed 421 00:16:30,040 --> 00:16:31,600 to 2020 because I know we 422 00:16:31,600 --> 00:16:34,690 don't have any plot observation data past 2020. 
423 00:16:34,690 --> 00:16:37,420 So it updated to the max year 424 00:16:37,420 --> 00:16:38,620 that's in the database right 425 00:16:38,620 --> 00:16:42,010 now, which happens to be 2020. 426 00:16:42,010 --> 00:16:44,770 So if we look at this little tool tip here, 427 00:16:44,770 --> 00:16:47,410 it says click link to see original data 428 00:16:47,410 --> 00:16:50,680 if program allows or program info. 429 00:16:50,680 --> 00:16:53,830 So I know that NAMP fuzzes their plots. 430 00:16:53,830 --> 00:16:55,960 So if we click on this, 431 00:16:55,960 --> 00:16:58,540 we're getting all this information about the program, 432 00:16:58,540 --> 00:16:59,740 we have program details, 433 00:16:59,740 --> 00:17:02,290 we have a link to the actual project, 434 00:17:02,290 --> 00:17:03,400 we get to see all of our 435 00:17:03,400 --> 00:17:06,040 program definitions here and then 436 00:17:06,040 --> 00:17:07,150 we see the files 437 00:17:07,150 --> 00:17:09,163 that we can download from this program. 438 00:17:10,090 --> 00:17:14,590 However, I know that FEMC does not fuzz plots, 439 00:17:14,590 --> 00:17:17,890 so I'm just gonna search for FEMC for a minute. 440 00:17:17,890 --> 00:17:20,470 And now since FEMC allows you 441 00:17:20,470 --> 00:17:22,300 to see your original data, 442 00:17:22,300 --> 00:17:24,830 when we click on that we get 443 00:17:25,924 --> 00:17:28,810 this original plot observation data. 444 00:17:28,810 --> 00:17:32,320 So this is a JSON of everything that was in the 445 00:17:32,320 --> 00:17:34,990 original data for this plot observation. 446 00:17:34,990 --> 00:17:36,490 So I did think it was strange 447 00:17:36,490 --> 00:17:37,540 that we have a field called 448 00:17:37,540 --> 00:17:39,430 year that gives an entire date, 449 00:17:39,430 --> 00:17:40,263 not just a year, 450 00:17:40,263 --> 00:17:43,000 but I went back and looked and that is how it is. 451 00:17:43,000 --> 00:17:45,313 So this is actually correct. 
452 00:17:47,380 --> 00:17:49,723 So now if we move on to trees, 453 00:17:50,830 --> 00:17:51,988 we can see that there are quite 454 00:17:51,988 --> 00:17:55,360 a few more filters in the 455 00:17:55,360 --> 00:17:57,070 trees that we can filter by. 456 00:17:57,070 --> 00:17:57,903 And you can also see 457 00:17:57,903 --> 00:17:59,410 that our search parameters change. 458 00:17:59,410 --> 00:18:01,630 So now we're searching for tree states, 459 00:18:01,630 --> 00:18:06,013 Vermont and tree year min instead of just the plots. 460 00:18:07,120 --> 00:18:09,760 So let's look at some of these filters. 461 00:18:09,760 --> 00:18:11,890 We can do species, 462 00:18:11,890 --> 00:18:15,580 we can do tree status, DBH, height, crown class, 463 00:18:15,580 --> 00:18:19,480 and then presence, absence of forest health indicator. 464 00:18:19,480 --> 00:18:22,150 So I'm going to search for some species, 465 00:18:22,150 --> 00:18:24,313 let's search for spruce. 466 00:18:26,260 --> 00:18:28,630 So now this is now filtered 467 00:18:28,630 --> 00:18:30,400 to a bunch of different spruce 468 00:18:30,400 --> 00:18:33,220 and I'm gonna select them all 469 00:18:33,220 --> 00:18:34,810 and search for them all. 470 00:18:34,810 --> 00:18:37,510 So now in our search parameters up here, 471 00:18:37,510 --> 00:18:39,340 you can see that we have a list 472 00:18:39,340 --> 00:18:42,880 of names and TSNs that we're 473 00:18:42,880 --> 00:18:44,100 searching in our database 474 00:18:44,100 --> 00:18:46,423 so you know exactly what's being searched. 475 00:18:49,270 --> 00:18:50,680 And this is similar where 476 00:18:50,680 --> 00:18:53,140 you'll get the original tree 477 00:18:53,140 --> 00:18:54,763 observation data here. 478 00:18:56,050 --> 00:18:58,390 So now let's look at saplings. 479 00:18:58,390 --> 00:19:00,490 You can see our search parameters changed. 
480 00:19:00,490 --> 00:19:01,780 So we're searching saplings 481 00:19:01,780 --> 00:19:04,393 for those species states in here. 482 00:19:05,470 --> 00:19:07,453 And then the same thing with seedlings. 483 00:19:09,340 --> 00:19:11,680 So let's download our results. 484 00:19:11,680 --> 00:19:13,600 I'm gonna download an entire package 485 00:19:13,600 --> 00:19:15,463 so we can see what that looks like. 486 00:19:17,620 --> 00:19:20,600 Okay, I'm gonna download our search results 487 00:19:22,122 --> 00:19:24,172 and I'll take a look at what we got here. 488 00:19:26,440 --> 00:19:29,860 Okay, so in our search results dated zip, 489 00:19:29,860 --> 00:19:32,890 you can see we have three CSV files, programs, 490 00:19:32,890 --> 00:19:34,780 plots and seedlings. 491 00:19:34,780 --> 00:19:36,550 Now if we were on the sapling 492 00:19:36,550 --> 00:19:38,380 or tree tab when we downloaded, 493 00:19:38,380 --> 00:19:41,680 we would have a tab sapling or tree CSV. 494 00:19:41,680 --> 00:19:44,740 If we were on the program or plots tab 495 00:19:44,740 --> 00:19:45,760 and we did not have a 496 00:19:45,760 --> 00:19:48,760 tree sapling or seedling filter set, 497 00:19:48,760 --> 00:19:50,920 then we would get all three tree sapling 498 00:19:50,920 --> 00:19:51,973 and seedling CSVs. 499 00:19:53,188 --> 00:19:54,190 But for now, let's take a look 500 00:19:54,190 --> 00:19:55,960 at what we have in our CSVs. 501 00:19:55,960 --> 00:19:57,880 So this is the program.csv. 502 00:19:57,880 --> 00:19:59,230 It just has our one program 503 00:19:59,230 --> 00:20:01,390 that we had returned in our 504 00:20:01,390 --> 00:20:04,780 search results, the FEMC forest health monitoring. 505 00:20:04,780 --> 00:20:07,120 Let's look at plots. 506 00:20:07,120 --> 00:20:09,190 So this is all of our FHM plots 507 00:20:09,190 --> 00:20:10,510 that came back in our search results.
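(Editor's sketch, not part of the talk.) The download described above is a zip with three top-level CSV files plus a folder per program. A minimal way to check which CSVs a package contains might look like the following. The archive is built in memory so the example runs on its own; every file and folder name here is a hypothetical stand-in, not the actual NEFIN naming.

```python
import io
import zipfile


def build_example_zip():
    """Build a small in-memory zip that mimics the described layout.

    All names are hypothetical placeholders chosen for illustration.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in ("programs.csv", "plots.csv", "seedlings.csv"):
            zf.writestr(name, "id\n1\n")
        # One folder per program returned by the search (assumed layout).
        zf.writestr("example_program/scripts/placeholder_script.txt", "placeholder\n")
    buf.seek(0)
    return buf


def list_top_level_csvs(zip_file):
    """Return the sorted names of CSV files at the root of the archive."""
    with zipfile.ZipFile(zip_file) as zf:
        return sorted(
            name
            for name in zf.namelist()
            if name.lower().endswith(".csv") and "/" not in name
        )


top_level_csvs = list_top_level_csvs(build_example_zip())
print(top_level_csvs)
```

With a real download, the path to the zip file on disk would be passed to `list_top_level_csvs` in place of the in-memory buffer.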
508 00:20:10,510 --> 00:20:11,350 And you can see we have 509 00:20:11,350 --> 00:20:13,322 this doc field, raise (indistinct) 510 00:20:13,322 --> 00:20:14,380 with all of the original data. 511 00:20:14,380 --> 00:20:15,640 So this is the original 512 00:20:15,640 --> 00:20:19,330 data observation that we uploaded. 513 00:20:19,330 --> 00:20:21,790 And then this row is actually 514 00:20:21,790 --> 00:20:22,930 our standardized data. 515 00:20:22,930 --> 00:20:24,550 So these (indistinct) longs, 516 00:20:24,550 --> 00:20:25,750 if they had been in a different 517 00:20:25,750 --> 00:20:27,250 coordinate system or if there 518 00:20:27,250 --> 00:20:30,400 was a DBH that had been calculated in inches 519 00:20:30,400 --> 00:20:32,500 and we store in centimeters, 520 00:20:32,500 --> 00:20:34,303 they would be converted. 521 00:20:35,320 --> 00:20:38,440 And then we can look at the seedling file. 522 00:20:38,440 --> 00:20:40,330 Same thing, we have our doc field 523 00:20:40,330 --> 00:20:42,190 with our original data and then we 524 00:20:42,190 --> 00:20:44,503 have our row with our standardized data. 525 00:20:47,740 --> 00:20:49,720 So let's take a look at this folder. 526 00:20:49,720 --> 00:20:52,360 So each of the programs returned in our search, 527 00:20:52,360 --> 00:20:53,620 will have a folder 528 00:20:53,620 --> 00:20:55,420 that has packages, scripts, 529 00:20:55,420 --> 00:20:56,500 and then there would be another 530 00:20:56,500 --> 00:20:58,780 folder here for files if we 531 00:20:58,780 --> 00:21:01,330 have ancillary files that we uploaded. 532 00:21:01,330 --> 00:21:03,370 So the scripts are those same two scripts, 533 00:21:03,370 --> 00:21:04,690 the pre-processing script 534 00:21:04,690 --> 00:21:06,880 and then the included functions 535 00:21:06,880 --> 00:21:08,500 with allfuncs.
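(Editor's sketch, not part of the talk.) The CSV layout described above pairs standardized columns with a doc field holding the original observation as JSON. A minimal way to read such a row and check an inches-to-centimeters conversion could look like this. Only the `doc` column name comes from the talk; the other column names, the JSON keys, and the sample values are assumptions made up for illustration.

```python
import csv
import io
import json

CM_PER_INCH = 2.54

# Hypothetical one-row extract. "doc" is the column named in the talk;
# "tree_id", "dbh_cm", "DBH_in" and "Species" are invented for this example.
SAMPLE_CSV = '''tree_id,dbh_cm,doc
1,25.4,"{""DBH_in"": 10.0, ""Species"": ""red spruce""}"
'''


def inches_to_cm(value_in):
    """Convert a length in inches to centimeters, rounded to 2 decimals."""
    return round(value_in * CM_PER_INCH, 2)


def read_rows(text):
    """Split each CSV row into standardized columns and the original JSON."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        original = json.loads(row["doc"])
        standardized = {key: value for key, value in row.items() if key != "doc"}
        rows.append({"standardized": standardized, "original": original})
    return rows


rows = read_rows(SAMPLE_CSV)
first = rows[0]
# The standardized DBH should match the original DBH converted to centimeters.
converted = inches_to_cm(first["original"]["DBH_in"])
print(converted, first["standardized"]["dbh_cm"])
```

Keeping the original record next to the standardized values, as the download does, lets a user re-check any conversion like this one against the source data.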
536 00:21:08,500 --> 00:21:09,333 And then packages, 537 00:21:09,333 --> 00:21:12,610 we get that snapshot data package 538 00:21:12,610 --> 00:21:14,560 of what was uploaded. 539 00:21:14,560 --> 00:21:16,270 So in the pre-processor folder 540 00:21:16,270 --> 00:21:17,873 we get those scripts 541 00:21:17,873 --> 00:21:20,140 once again in the program data folder though, 542 00:21:20,140 --> 00:21:21,520 we get these data sets. 543 00:21:21,520 --> 00:21:23,260 So these are the original files 544 00:21:23,260 --> 00:21:25,873 that we chose to upload from our program. 545 00:21:27,460 --> 00:21:29,350 Metadata gives us a metadata 546 00:21:29,350 --> 00:21:31,570 EML with more information, oh, 547 00:21:31,570 --> 00:21:33,190 I guess that's not gonna show up on our video, 548 00:21:33,190 --> 00:21:34,930 but it's a metadata EML, 549 00:21:34,930 --> 00:21:36,370 take my word for it 550 00:21:36,370 --> 00:21:38,680 and you'll be able to 551 00:21:38,680 --> 00:21:41,830 get a description of what metadata is being downloaded. 552 00:21:41,830 --> 00:21:44,170 And then in our raw files, 553 00:21:44,170 --> 00:21:48,370 these are our own standardized data files. 554 00:21:48,370 --> 00:21:50,470 So if you took those files 555 00:21:50,470 --> 00:21:52,180 in the data set folder 556 00:21:52,180 --> 00:21:54,970 and you ran those scripts that we included on these, 557 00:21:54,970 --> 00:21:57,463 you would come out with our raw files. 558 00:22:00,860 --> 00:22:04,783 Okay, so back to the website for one last thing, 559 00:22:05,800 --> 00:22:07,750 this inventory programs page, 560 00:22:07,750 --> 00:22:11,020 you can get to all of your program level information. 
561 00:22:11,020 --> 00:22:12,820 So if I just wanted 562 00:22:12,820 --> 00:22:15,820 to download data from FEMC Forest Health 563 00:22:15,820 --> 00:22:16,750 Monitoring and that's it, 564 00:22:16,750 --> 00:22:18,310 I could come here and there's 565 00:22:18,310 --> 00:22:19,690 a link right here to download 566 00:22:19,690 --> 00:22:21,710 standardized program data and files 567 00:22:22,900 --> 00:22:25,450 and get all of my program definitions here. 568 00:22:25,450 --> 00:22:27,280 I can get a history of the packages 569 00:22:27,280 --> 00:22:29,755 that were downloaded and the notes 570 00:22:29,755 --> 00:22:33,400 and then I can also get my files, 571 00:22:33,400 --> 00:22:35,893 packages and scripts to download separately. 572 00:22:37,210 --> 00:22:38,770 And then this little data summary 573 00:22:38,770 --> 00:22:41,770 section is just an example 574 00:22:41,770 --> 00:22:44,200 of what can be done with our data. 575 00:22:44,200 --> 00:22:47,150 Sorin put together an R Shiny app 576 00:22:49,690 --> 00:22:53,780 and you can make visualizations like plot maps 577 00:22:55,840 --> 00:23:00,840 or tree year sample, year histograms, 578 00:23:01,540 --> 00:23:02,650 things like that. 579 00:23:02,650 --> 00:23:04,900 I'm not gonna go into too much detail, 580 00:23:04,900 --> 00:23:07,903 but you can look through this when the site is live. 581 00:23:08,980 --> 00:23:11,323 So thank you all for your interest in NEFIN. 582 00:23:14,648 --> 00:23:18,504 (audience applauding)