1 00:00:01,110 --> 00:00:02,920 - [Instructor] Welcome back to module six. 2 00:00:02,920 --> 00:00:04,020 In this part, we'll look at 3 00:00:04,020 --> 00:00:08,423 Deterministic Interpolation Operations within ArcGIS. 4 00:00:10,250 --> 00:00:11,690 In case you stepped away and forgot 5 00:00:11,690 --> 00:00:13,820 what we were working with from the previous lectures, 6 00:00:13,820 --> 00:00:16,200 I thought I'd show a quick slide with the data. 7 00:00:16,200 --> 00:00:17,990 Once again, these are the recreation sites 8 00:00:17,990 --> 00:00:19,640 for the state of Vermont. 9 00:00:19,640 --> 00:00:22,870 And I display the point locations on the left 10 00:00:22,870 --> 00:00:24,660 and then use that acreage attribute 11 00:00:24,660 --> 00:00:28,760 that we've been working with and a graduated symbol size 12 00:00:28,760 --> 00:00:31,890 to reveal the differences in the size 13 00:00:31,890 --> 00:00:34,253 of the individual recreation sites. 14 00:00:36,950 --> 00:00:39,010 Now, the first of the deterministic methods 15 00:00:39,010 --> 00:00:41,740 that we'll look at is known as Natural Neighbor. 16 00:00:41,740 --> 00:00:43,760 And really this is an offshoot 17 00:00:43,760 --> 00:00:45,080 of something we've already done, 18 00:00:45,080 --> 00:00:47,410 which is to create Thiessen polygons. 19 00:00:47,410 --> 00:00:50,590 So the simplest of all methods for interpolation 20 00:00:50,590 --> 00:00:53,020 could be to create Thiessen polygons 21 00:00:53,020 --> 00:00:56,160 and then assign each cell the value 22 00:00:56,160 --> 00:01:01,010 of the sample location that it's closest to on the landscape. 23 00:01:01,010 --> 00:01:03,490 In this case, the Natural Neighbor approach 24 00:01:03,490 --> 00:01:05,030 takes it one step further. 25 00:01:05,030 --> 00:01:07,900 So first it constructs the Thiessen polygons 26 00:01:07,900 --> 00:01:10,310 based on those sample point locations. 
27 00:01:10,310 --> 00:01:13,670 We see those polygons here drawn in black. 28 00:01:13,670 --> 00:01:17,520 Then it draws a Thiessen polygon 29 00:01:17,520 --> 00:01:19,253 for the interpolation point, 30 00:01:20,310 --> 00:01:23,850 breaking down those individual polygons 31 00:01:23,850 --> 00:01:26,260 from the first construct. 32 00:01:26,260 --> 00:01:27,100 And the overlap 33 00:01:27,100 --> 00:01:29,840 between those two different polygon datasets 34 00:01:29,840 --> 00:01:32,680 defines the weights that are used for estimation. 35 00:01:32,680 --> 00:01:34,120 So the values that come out 36 00:01:34,120 --> 00:01:37,550 of each one of those individual Thiessen polygons 37 00:01:37,550 --> 00:01:40,730 are based on how much of that slice 38 00:01:40,730 --> 00:01:43,100 comprises the total Thiessen polygon 39 00:01:43,100 --> 00:01:45,840 from that round one calculation. 40 00:01:45,840 --> 00:01:49,660 Now, this produces polygons of irregular size. 41 00:01:49,660 --> 00:01:54,140 Where you have denser sampling, you'll get smaller polygons. 42 00:01:54,140 --> 00:01:55,873 And this is an exact interpolator. 43 00:01:57,030 --> 00:01:58,130 So, what's that look like 44 00:01:58,130 --> 00:02:00,750 in terms of the Geoprocessing interface? 45 00:02:00,750 --> 00:02:05,520 We enter our input point features here, recreation sites. 46 00:02:05,520 --> 00:02:08,530 In this case, it's a Z value field 47 00:02:08,530 --> 00:02:09,620 that we'll be working with. 48 00:02:09,620 --> 00:02:12,110 It needs to be a numeric data type, 49 00:02:12,110 --> 00:02:14,170 and we'll focus on the acreage attribute 50 00:02:14,170 --> 00:02:16,343 of that recreation sites dataset. 51 00:02:17,300 --> 00:02:19,480 If you switch over to the Environments tab, 52 00:02:19,480 --> 00:02:21,370 you'll notice that the Cell Size 53 00:02:21,370 --> 00:02:25,393 and Snap Raster Environments are honored, but not the Mask. 
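[Editor's sketch] The overlap-based weighting described for Natural Neighbor can be approximated on a grid: assign every cell to its nearest sample (the Thiessen step), then see which cells the interpolation point would "steal" from each sample. This is only a rough illustration under assumptions, not what ArcGIS runs internally. The function names, the 10 x 10 extent, and the four sample points are all hypothetical, and the weights are normalized by the total stolen area (the usual Sibson convention).

```python
import math

def nearest(points, x, y):
    """Index of the sample point closest to (x, y)."""
    return min(range(len(points)),
               key=lambda i: math.hypot(points[i][0] - x, points[i][1] - y))

def natural_neighbor_estimate(points, values, qx, qy, extent=10.0, step=0.1):
    """Grid approximation of natural neighbor (Sibson) weights.

    Counts the grid cells that switch from their original Thiessen owner
    to the query point once the query point is added to the tessellation.
    """
    stolen = [0] * len(points)
    n = int(round(extent / step))
    for row in range(n):
        for col in range(n):
            cx, cy = (col + 0.5) * step, (row + 0.5) * step
            owner = nearest(points, cx, cy)
            d_owner = math.hypot(points[owner][0] - cx, points[owner][1] - cy)
            d_query = math.hypot(qx - cx, qy - cy)
            if d_query < d_owner:
                stolen[owner] += 1
    total = sum(stolen)
    if total == 0:
        # Query sits on a sample point: return that sample's value (exact).
        i = nearest(points, qx, qy)
        return [1.0 if k == i else 0.0 for k in range(len(points))], values[i]
    weights = [s / total for s in stolen]
    return weights, sum(w * v for w, v in zip(weights, values))

# Hypothetical samples at the corners of a square, acreage-like values.
pts = [(2.0, 2.0), (8.0, 2.0), (2.0, 8.0), (8.0, 8.0)]
vals = [10.0, 20.0, 30.0, 40.0]
center_weights, center_value = natural_neighbor_estimate(pts, vals, 5.0, 5.0)
_, at_sample_value = natural_neighbor_estimate(pts, vals, 2.0, 2.0)
```

At the center of the square each sample contributes roughly a quarter of the weight, and at a sample location the sample's own value comes back, which is what "exact interpolator" means here.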
54 00:02:26,480 --> 00:02:30,310 I set my output cell size to 500 meters in this case 55 00:02:30,310 --> 00:02:32,690 since we're working on an interpolation 56 00:02:32,690 --> 00:02:33,993 for the entire state. 57 00:02:35,670 --> 00:02:37,950 And here's the output that we see. 58 00:02:37,950 --> 00:02:41,440 Again, note that the mask parameter 59 00:02:41,440 --> 00:02:44,760 from our environment settings has not been respected here 60 00:02:44,760 --> 00:02:47,430 even though I have the state of Vermont boundaries 61 00:02:47,430 --> 00:02:48,883 set as that mask. 62 00:02:51,230 --> 00:02:55,280 If we zoom in, we get a closer look at the islands 63 00:02:55,280 --> 00:02:59,783 that can form around the data from an approach like this. 64 00:03:04,260 --> 00:03:05,093 We could also use 65 00:03:05,093 --> 00:03:10,093 a fixed radius local averaging approach for interpolation. 66 00:03:10,470 --> 00:03:12,720 It's a bit more complex than Natural Neighbor, 67 00:03:12,720 --> 00:03:14,260 but certainly less complex 68 00:03:14,260 --> 00:03:16,230 than really the rest of the others 69 00:03:16,230 --> 00:03:19,150 that we have at our disposal. 70 00:03:19,150 --> 00:03:22,140 The fixed radius approach smooths the sample data 71 00:03:22,140 --> 00:03:24,160 when creating that surface. 72 00:03:24,160 --> 00:03:26,320 It's not unlike a moving window analysis 73 00:03:26,320 --> 00:03:30,060 that we've seen in previous exercises. 74 00:03:30,060 --> 00:03:32,440 It computes the average of all the sample points 75 00:03:32,440 --> 00:03:35,593 located within some search radius that you specify. 76 00:03:36,680 --> 00:03:38,100 Too small of a search radius 77 00:03:38,100 --> 00:03:39,480 and you'll have a lot of empty cells 78 00:03:39,480 --> 00:03:41,700 resulting in no data values. 79 00:03:41,700 --> 00:03:45,610 Too large and you might smooth out the dataset too much. 
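[Editor's sketch] The fixed radius averaging just described is simple enough to write out by hand. The snippet below is a minimal illustration, assuming samples stored as (x, y, value) tuples and using None as a stand-in for NoData; the function name and sample values are hypothetical.

```python
import math

def fixed_radius_mean(samples, cx, cy, radius):
    """Average the values of all samples within `radius` of the cell center.

    Returns None (standing in for NoData) when no sample falls inside.
    """
    inside = [v for (x, y, v) in samples
              if math.hypot(x - cx, y - cy) <= radius]
    return sum(inside) / len(inside) if inside else None

# Hypothetical samples: (x, y, acreage).
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (5.0, 5.0, 100.0)]

at_sample = fixed_radius_mean(samples, 0.0, 0.0, 1.5)   # averages 10 and 20
empty_cell = fixed_radius_mean(samples, 3.0, 3.0, 0.5)  # nothing in range
too_large = fixed_radius_mean(samples, 0.0, 0.0, 50.0)  # averages everything
```

Note that the estimate at the first sample location is 15, not the sample's own value of 10, that a radius that is too small leaves the cell empty, and that a very large radius pulls every cell toward the global mean.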
80 00:03:45,610 --> 00:03:49,000 So, we see over on the right-hand side in the lower image 81 00:03:49,000 --> 00:03:51,823 four cells for which we'll compute values. 82 00:03:53,170 --> 00:03:55,520 The algorithm draws the search radius 83 00:03:55,520 --> 00:03:58,593 around each one of those four cell locations, 84 00:03:59,550 --> 00:04:03,760 identifies all of the neighbors within that search radius, 85 00:04:03,760 --> 00:04:07,000 sums the value of those observations 86 00:04:07,000 --> 00:04:09,090 and then computes the average 87 00:04:09,090 --> 00:04:11,373 to assign to the output raster. 88 00:04:13,070 --> 00:04:17,370 Note that if there are no observations 89 00:04:17,370 --> 00:04:21,993 present within a search radius, a NoData value is assigned. 90 00:04:23,140 --> 00:04:27,840 Because the fixed radius approach computes an average value, 91 00:04:27,840 --> 00:04:30,260 it's an inexact interpolator, 92 00:04:30,260 --> 00:04:33,610 meaning that 93 00:04:33,610 --> 00:04:36,780 the interpolated value at a sample location 94 00:04:36,780 --> 00:04:38,803 might not match that sample value. 95 00:04:41,500 --> 00:04:42,370 Next up is the 96 00:04:42,370 --> 00:04:45,853 inverse distance weighted interpolation method. 97 00:04:47,550 --> 00:04:49,570 This approach has an explicit assumption 98 00:04:49,570 --> 00:04:52,000 that the things that are closer together are more alike 99 00:04:52,000 --> 00:04:53,810 than those that are further apart. 100 00:04:53,810 --> 00:04:56,853 Again, the basic tenet of spatial autocorrelation. 101 00:04:57,750 --> 00:05:00,470 Now, the weight assignment here 102 00:05:00,470 --> 00:05:02,380 is equal to the inverse distance 103 00:05:02,380 --> 00:05:06,603 between the unknown location and the sample point location. 104 00:05:07,820 --> 00:05:10,740 IDW also uses a power function. 
105 00:05:10,740 --> 00:05:12,330 The higher the power value, 106 00:05:12,330 --> 00:05:16,840 the more rapid the decrease in the weight with distance, 107 00:05:16,840 --> 00:05:20,570 meaning that you need to be closer to a location 108 00:05:20,570 --> 00:05:22,980 in order to play a contributing factor 109 00:05:22,980 --> 00:05:25,190 in the estimated value. 110 00:05:25,190 --> 00:05:27,650 Now, this is an exact interpolator, 111 00:05:27,650 --> 00:05:31,310 but you should use caution, with both sampling and testing, 112 00:05:31,310 --> 00:05:34,230 when designating the number of neighbors 113 00:05:34,230 --> 00:05:35,870 and the power function. 114 00:05:35,870 --> 00:05:38,180 And we'll see an example here of the effect 115 00:05:38,180 --> 00:05:41,633 that power function can have on the output surface. 116 00:05:44,610 --> 00:05:47,540 The IDW interface is pretty straightforward. 117 00:05:47,540 --> 00:05:50,970 Once again, I'm going to input my recreation sites 118 00:05:50,970 --> 00:05:52,860 as my input point features 119 00:05:52,860 --> 00:05:56,503 and specify acreage as my Z value field. 120 00:05:57,860 --> 00:06:00,883 I've got a search distance of 2,500, 121 00:06:01,780 --> 00:06:04,603 and then an output cell size of 500. 122 00:06:05,740 --> 00:06:10,740 Now, my search radius is fixed and I'm using a power of two. 123 00:06:12,740 --> 00:06:14,260 So that search radius is drawn 124 00:06:14,260 --> 00:06:17,633 around that unknown value location, 125 00:06:20,150 --> 00:06:24,983 the neighbors are identified and the value is assigned. 126 00:06:27,040 --> 00:06:32,040 Now, I could also use a variable radius for my search. 127 00:06:32,840 --> 00:06:35,970 In this case, I specify a minimum number of points 128 00:06:36,840 --> 00:06:38,893 and/or a maximum distance. 
129 00:06:40,366 --> 00:06:45,030 ArcGIS then looks for the, in this case nearest 12 points 130 00:06:45,030 --> 00:06:48,380 to that search location 131 00:06:48,380 --> 00:06:50,650 and selects them for calculating the value 132 00:06:50,650 --> 00:06:52,433 at that unknown point location. 133 00:06:55,240 --> 00:06:59,910 We see here the output from an IDW operation 134 00:06:59,910 --> 00:07:02,850 where we're looking for 12 neighbors 135 00:07:02,850 --> 00:07:05,493 for each one of those interpolated points. 136 00:07:08,530 --> 00:07:12,980 Zooming in, we see some more definition around the edges 137 00:07:12,980 --> 00:07:15,620 where we saw a smoother pattern previously 138 00:07:15,620 --> 00:07:17,420 with the Natural Neighbors approach. 139 00:07:18,450 --> 00:07:19,540 Let's look at the impact 140 00:07:19,540 --> 00:07:23,430 of the power parameter on the output. 141 00:07:23,430 --> 00:07:27,800 All of these examples use the same choropleth color scheme 142 00:07:30,160 --> 00:07:32,300 to render the values. 143 00:07:32,300 --> 00:07:34,380 The one difference is that I changed the power 144 00:07:34,380 --> 00:07:38,320 between each iteration of the IDW approach. 145 00:07:38,320 --> 00:07:40,583 So here we are with a power factor of 0.5, 146 00:07:42,810 --> 00:07:47,810 one, two, 10 and 25. 147 00:07:51,270 --> 00:07:53,993 Remember, as the power increases, 148 00:07:55,890 --> 00:07:59,340 the weight decreases quite rapidly, 149 00:07:59,340 --> 00:08:02,520 meaning you need to be closer to that unknown location 150 00:08:03,890 --> 00:08:07,520 to make a contribution to the estimated value. 
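[Editor's sketch] To see why a higher power shrinks the influence of distant points, here is a minimal hand-rolled IDW with a variable search (nearest n points). It is an illustration under assumptions, not the ArcGIS implementation: the function name, the two sample points, and the (x, y, value) layout are hypothetical. Each weight is 1 / distance ** power.

```python
import math

def idw(samples, qx, qy, power=2.0, n_neighbors=12):
    """Inverse distance weighted estimate at (qx, qy) from the nearest samples."""
    ranked = sorted(samples, key=lambda s: math.hypot(s[0] - qx, s[1] - qy))
    num = den = 0.0
    for x, y, v in ranked[:n_neighbors]:
        d = math.hypot(x - qx, y - qy)
        if d == 0.0:
            return v  # exact interpolator: a sample location keeps its value
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

# Two hypothetical sites: a small one nearby and a large one farther away.
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 100.0)]

# Estimate at x = 2 for the powers used in the lecture.
estimates = {p: idw(samples, 2.0, 0.0, power=p) for p in (0.5, 1, 2, 10, 25)}
```

As the power climbs from 0.5 to 25, the estimate at x = 2 slides from about 40 down toward the nearby sample's value of 10, which is the "island" effect around individual points. Because the result is a weighted average, it can never fall outside the range of the sample values.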
151 00:08:07,520 --> 00:08:10,390 That's why we see islands around some of these points 152 00:08:10,390 --> 00:08:14,130 that have larger acreages assigned to them 153 00:08:14,130 --> 00:08:16,500 and a lot of green area out there 154 00:08:16,500 --> 00:08:19,770 which is the low end value on the color scale in this case. 155 00:08:22,270 --> 00:08:26,860 Now, you can also use the Geostatistical Wizard 156 00:08:26,860 --> 00:08:29,950 to compute an inverse distance weighting surface. 157 00:08:29,950 --> 00:08:32,510 And this is slightly different and a bit more complex 158 00:08:32,510 --> 00:08:33,683 than what we just saw. 159 00:08:35,290 --> 00:08:36,543 In the previous version, 160 00:08:37,805 --> 00:08:40,710 ArcGIS makes quite a few assumptions for you. 161 00:08:40,710 --> 00:08:42,800 When you use the Geostatistical Wizard, 162 00:08:42,800 --> 00:08:44,120 you have a lot more control 163 00:08:44,120 --> 00:08:47,853 over the way that your interpolation approach proceeds. 164 00:08:49,230 --> 00:08:52,080 So, I've selected my inverse distance weighting 165 00:08:52,080 --> 00:08:53,963 and input my recreation sites. 166 00:08:55,900 --> 00:09:00,480 The next thing I see is a plot of the dataset 167 00:09:01,990 --> 00:09:03,030 where the hash mark 168 00:09:03,930 --> 00:09:07,770 is centered over an unknown point location 169 00:09:09,140 --> 00:09:11,520 and the radius is drawn around it. 170 00:09:11,520 --> 00:09:15,090 From that we can also see which of the other features 171 00:09:15,090 --> 00:09:19,080 have been selected to be included in the estimation 172 00:09:19,080 --> 00:09:21,183 of that value at the unknown location. 173 00:09:23,170 --> 00:09:28,170 I can change my sector type to four with a 45 degree offset, 174 00:09:28,270 --> 00:09:29,790 which now means that ArcGIS 175 00:09:29,790 --> 00:09:33,140 will look for a minimum number of 10 neighbors 176 00:09:33,140 --> 00:09:35,590 in each one of the quadrants that I've specified. 
177 00:09:37,770 --> 00:09:41,050 Next, we can look at the cross validation pane, 178 00:09:41,050 --> 00:09:43,670 and in this case see that the model that's been built 179 00:09:43,670 --> 00:09:45,960 does not necessarily fit all that well 180 00:09:45,960 --> 00:09:49,070 and it might be due to the wild distribution 181 00:09:49,070 --> 00:09:50,593 of the dataset itself. 182 00:09:52,130 --> 00:09:54,010 From here, I can click Finish 183 00:09:54,010 --> 00:09:55,860 and I can review my method report. 184 00:09:55,860 --> 00:09:57,280 This gives me all the details 185 00:09:57,280 --> 00:10:02,280 of the parameters that I set via the Geostatistical Wizard, 186 00:10:02,540 --> 00:10:05,803 and I can also see the output over on the right-hand side. 187 00:10:06,700 --> 00:10:07,960 Now, one thing to note, 188 00:10:07,960 --> 00:10:09,920 when you're using the Geostatistical Wizard, 189 00:10:09,920 --> 00:10:14,920 the output is a Geostatistical Analyst layer file. 190 00:10:15,820 --> 00:10:17,220 These layers are still easy to work with, 191 00:10:17,220 --> 00:10:19,060 but if you want to share them 192 00:10:19,060 --> 00:10:22,530 or do some further advanced calculations, 193 00:10:22,530 --> 00:10:25,343 you might need to export those to a raster dataset. 194 00:10:26,440 --> 00:10:27,273 If you right click, 195 00:10:27,273 --> 00:10:29,880 I think you can find the right path to achieving that. 196 00:10:32,490 --> 00:10:35,510 The next approach we'll look at is known as the Spline. 197 00:10:35,510 --> 00:10:37,760 Now, this is a mathematical function 198 00:10:37,760 --> 00:10:42,270 that attempts to minimize the overall surface curvature. 199 00:10:42,270 --> 00:10:45,070 So think of it as sort of bending a piece of rubber 200 00:10:45,070 --> 00:10:46,180 or a piece of paper 201 00:10:46,180 --> 00:10:49,490 to fit through all of the observation points. 
202 00:10:49,490 --> 00:10:51,790 It evaluates a specified number of neighbors, 203 00:10:51,790 --> 00:10:54,940 again, determined by the user. 204 00:10:54,940 --> 00:10:57,570 Now, there's two different kinds of Spline approaches. 205 00:10:57,570 --> 00:10:59,750 One is the regularized. 206 00:10:59,750 --> 00:11:03,130 This is a smooth, gradually changing surface 207 00:11:03,130 --> 00:11:06,540 whose values may exceed or fall below 208 00:11:06,540 --> 00:11:11,300 the maximum and minimum values in the dataset respectively. 209 00:11:11,300 --> 00:11:15,180 It also uses 1st, 2nd and 3rd order derivatives 210 00:11:15,180 --> 00:11:16,640 in the calculation, 211 00:11:16,640 --> 00:11:18,750 but don't worry, that happens behind the scenes. 212 00:11:18,750 --> 00:11:21,300 It's not something I'm expecting you to learn. 213 00:11:21,300 --> 00:11:23,530 When you specify the weight parameter here, 214 00:11:23,530 --> 00:11:24,740 the higher the weight, 215 00:11:24,740 --> 00:11:26,893 the smoother the surface that will result. 216 00:11:28,190 --> 00:11:32,020 The range is usually between zero and five, 217 00:11:32,020 --> 00:11:33,320 although it can go higher, 218 00:11:33,320 --> 00:11:36,090 but typically you'll see the weight value 219 00:11:36,090 --> 00:11:38,573 ranging between zero and 0.5. 220 00:11:39,440 --> 00:11:41,763 The tension approach only uses 1st 221 00:11:41,763 --> 00:11:43,900 and 2nd order derivatives. 222 00:11:43,900 --> 00:11:48,420 It considers more points and produces a smoother surface, 223 00:11:48,420 --> 00:11:51,440 but it increases the processing time. 224 00:11:51,440 --> 00:11:53,250 And in this case, the higher the weight, 225 00:11:53,250 --> 00:11:54,870 the coarser the surface. 226 00:11:54,870 --> 00:11:56,690 The tighter it's going to be snapped 227 00:11:57,610 --> 00:12:01,270 to those sample value locations. 228 00:12:01,270 --> 00:12:03,223 The weight must be greater than zero. 
229 00:12:04,430 --> 00:12:08,060 Now, the Spline operations are both exact interpolators, 230 00:12:08,060 --> 00:12:12,880 but the exactness can be relaxed. 231 00:12:12,880 --> 00:12:15,710 This is particularly helpful with the tension approach 232 00:12:15,710 --> 00:12:19,070 because of that increased processing time that is required 233 00:12:19,070 --> 00:12:21,310 if you want the surface to snap 234 00:12:21,310 --> 00:12:24,583 directly to the sample values. 235 00:12:27,110 --> 00:12:32,033 Here's a look at the regularized Spline of the acreage attribute 236 00:12:33,130 --> 00:12:35,830 from the recreation sites dataset 237 00:12:35,830 --> 00:12:37,830 where I'm using a weight of 0.1 238 00:12:37,830 --> 00:12:40,943 and looking for 12 points within my neighborhood. 239 00:12:42,670 --> 00:12:45,697 Contrast that with the Tension Spline. 240 00:12:46,946 --> 00:12:48,840 Same set up here with a weight of 0.1 241 00:12:49,700 --> 00:12:51,223 and 12 neighbor points. 242 00:12:53,030 --> 00:12:56,830 And lastly, if we look at all four of these approaches 243 00:12:57,750 --> 00:12:59,780 that we evaluated here, 244 00:12:59,780 --> 00:13:03,040 I'm using the same choropleth shading scheme 245 00:13:03,040 --> 00:13:07,150 for all four of these, and then I note the value ranges 246 00:13:07,150 --> 00:13:10,760 for each of the output datasets. 247 00:13:10,760 --> 00:13:13,030 So with our Natural Neighbor, 248 00:13:13,030 --> 00:13:18,030 the estimates range from five to 152,000, 249 00:13:18,378 --> 00:13:21,663 IDW, five to 154,000. 250 00:13:22,640 --> 00:13:26,670 My Regularized Spline, -91,000 acres. 251 00:13:26,670 --> 00:13:31,003 Not exactly sure what that means, all the way up to 512,000. 252 00:13:32,110 --> 00:13:34,793 And lastly, the Tension Spline, 253 00:13:34,793 --> 00:13:39,540 -72,000, all the way up to 229,000. 
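[Editor's sketch] The negative acreages from the Spline outputs follow from how a smooth curve behaves when it is forced through every point: next to a sharp peak it has to dip below the neighboring values. The snippet below shows this with a one-dimensional natural cubic spline, which is only an analogy for the two-dimensional regularized and tension splines in ArcGIS and not the same algorithm. The function name and the five sample values are hypothetical.

```python
def natural_cubic_spline(xs, ys):
    """Return a function that evaluates the natural cubic spline through (xs, ys)."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    # Tridiagonal system for the second derivatives m; the end rows force
    # m[0] = m[n-1] = 0 (the "natural" boundary condition).
    a = [0.0] * n  # sub-diagonal
    b = [1.0] * n  # diagonal
    c = [0.0] * n  # super-diagonal
    d = [0.0] * n  # right-hand side
    for i in range(1, n - 1):
        a[i] = h[i - 1]
        b[i] = 2.0 * (h[i - 1] + h[i])
        c[i] = h[i]
        d[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        f = a[i] / b[i - 1]
        b[i] -= f * c[i - 1]
        d[i] -= f * d[i - 1]
    m = [0.0] * n
    m[n - 1] = d[n - 1] / b[n - 1]
    for i in range(n - 2, -1, -1):
        m[i] = (d[i] - c[i] * m[i + 1]) / b[i]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i+1]] that contains x.
        i = min(n - 2, sum(1 for k in xs[1:-1] if x >= k))
        x0, x1, hi = xs[i], xs[i + 1], h[i]
        return (m[i] * (x1 - x) ** 3 / (6.0 * hi)
                + m[i + 1] * (x - x0) ** 3 / (6.0 * hi)
                + (ys[i] / hi - m[i] * hi / 6.0) * (x1 - x)
                + (ys[i + 1] / hi - m[i + 1] * hi / 6.0) * (x - x0))

    return evaluate

# One large site surrounded by zero-acre neighbors (hypothetical values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.0, 100.0, 0.0, 0.0]
spline = natural_cubic_spline(xs, ys)

at_peak = spline(2.0)     # passes through the sample: 100
between = spline(0.5)     # dips below zero, about -16.07
```

The curve honors every sample value, yet halfway between the first two zero-valued points it drops to roughly -16, below anything in the input. IDW and Natural Neighbor are weighted averages, so they cannot leave the range of the samples, which matches the value ranges reported above.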
254 00:13:39,540 --> 00:13:43,130 So, you can see the difference in the output 255 00:13:43,130 --> 00:13:45,190 from the various approaches. 256 00:13:45,190 --> 00:13:48,240 Of course, if you were doing this in a real-world setting, 257 00:13:48,240 --> 00:13:51,090 you'd want to get to know the data a lot more 258 00:13:51,090 --> 00:13:52,220 before making decisions 259 00:13:52,220 --> 00:13:54,473 about your neighborhood sizes and shapes. 260 00:13:55,500 --> 00:13:57,680 Well, that's it for deterministic methods. 261 00:13:57,680 --> 00:13:59,080 There's certainly a lot more options 262 00:13:59,080 --> 00:14:02,410 within the ArcGIS toolkit that you could explore, 263 00:14:02,410 --> 00:14:06,450 but I think this is a good place to start and stop for now. 264 00:14:06,450 --> 00:14:09,320 Next up, we'll look at some Geostatistical approaches 265 00:14:09,320 --> 00:14:11,053 to round out this week's lectures.