Send an email with your question and I will respond and then post the response if your question has general class interest—

 

…<click here> to review the Report Writing Tips

 

Part 2, Email Dialog 2010 covering discussions for weeks 6 through 10

 

<click here> for Part 1, Email Dialog 2010 covering discussions for weeks 1 through 5




___________________________

 

3/11/10

 

Jeremy—good to hear from you …responses embedded below.  Joe

 

Dr. Berry--Quick question regarding Exercise 9 question 6.  Regards, Jeremy

Discuss any relationships you detect among the three maps.  (Aside: don’t do the Composite analysis to determine how separable the clusters are).

 

The cluster algorithm compares the data values between maps at the same location and, depending on the number of cluster categories specified, calculates data distances and places map locations into that number of categories.  …you might mention something about the “iterative nature” of the algorithm and that it attempts to minimize the intra-cluster (within each cluster) data distances and maximize the inter-cluster (among the different clusters) data distances. 
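For anyone curious what the “iterative nature” looks like in practice, below is a minimal k-means-style sketch (Python, hypothetical P/K values); MapCalc’s ISODATA-style routine differs in its details, but it follows the same assign-then-update loop that shrinks intra-cluster data distances.

import numpy as np

def kmeans(points, k, iterations=20, seed=0):
    # points: N x 2 array of (P, K) data values, one row per map location
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iterations):
        # assign each location to its nearest cluster center (smallest data distance)
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # move each center to the mean of its members (reduces intra-cluster distance)
        centers = np.array([points[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

# hypothetical P and K values at 25 sample locations
rng = np.random.default_rng(1)
data = np.column_stack([rng.uniform(0, 102, 25), rng.uniform(88, 310, 25)])
labels, centers = kmeans(data, k=2)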

 

In the simple 2 category example the categories specify locations of similar and dissimilar data values.  As can be seen these areas are approximately evenly distributed over the map with similar regions (green) of approximately the same size as dissimilar locations (red). 

 

  …it is important to note how the “boundaries” change as another cluster is added.  For example, from the two- to three-cluster results there is minimal change in Cluster 1 (red) and 2 (green) with the new Cluster 3 (blue) moving in at the edges, but taking most of the locations from Cluster 1 (red).  What change in the three-cluster pattern do you see when it “morphs” to a four-cluster pattern (yellow cluster introduced)?  Also, note that I used the Shading Manager to carefully keep the color assignments consistent in the three displays.    

 

Is something like the above what is required, with additional mention of the 3- and 4-cluster results, or should more mention of ISODATA classification be added?  Is this what the question is asking?



___________________________

 

3/10/10

 

Elizabeth—sorry for the delayed response …meetings, meetings, meetings.  Responses embedded below.  Joe

 

Hello Joe-- A couple questions for you on exercise 9.   Thanks!  Elizabeth

 

1. For question 2, the coincidence summary map ends up with discrete data in classes 1-9.   I haven't figured out yet how to know which intersections are producing which resulting numbers, since it is not the 2-digit code we have seen before.  

 

The classes represent a binary progression (1, 2, 4 for 1997 yield classes and 8, 16, 32 for 1998 yield classes).  The addition of a binary progression of numbers results in a unique sum.  For example, the only way you can get a sum of 9 is by the coincidence of class 1 in 1997 and class 1 in 1998.   

 

2. Also, for curiosity's sake, why are the values for the 1998 yield classes map 8, 16, and 32?  And the highest value for the 1997 yield class map a 4 instead of a 3?   Does the pattern give an advantage in the results?

 

…yep—the values form a binary progression.  I purposely set the numbers up as a binary progression so some of you would take the subtle hint (an extension of the question) to use “Compute plus” as an alternate way to form the coincidence map.  Or others might renumber both maps to 1, 2, 3 and then use “Compute times 10 plus” to form the coincidence map. 
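A quick check (Python sketch, using just the class values above) shows why the binary progression guarantees a unique sum for every 1997/1998 pair, and why the renumber-then-“times 10 plus” alternative also yields unique codes.

# 1997 yield classes coded 1, 2, 4 and 1998 yield classes coded 8, 16, 32
sums = {(a, b): a + b for a in (1, 2, 4) for b in (8, 16, 32)}
print(sorted(sums.values()))       # 9, 10, 12, 17, 18, 20, 33, 34, 36
print(len(set(sums.values())))     # 9 -> every sum points to exactly one class pair

# alternative: renumber both maps to 1, 2, 3 and then "Compute times 10 plus"
codes = {(a, b): a * 10 + b for a in (1, 2, 3) for b in (1, 2, 3)}
print(sorted(codes.values()))      # 11, 12, 13, 21, 22, 23, 31, 32, 33 -> also unique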

 

3. I ran a cross-tab query to get the cell counts since I hadn't yet figured out the relation of 1-9 to the categories of intersections, described above.   In the results, shown on the table below, the total cell count did not match up between the two layers, and it seems it should.  So, I went back and looked at cell counts for each of the two input map layers, and found the strangeness of the attached screen shot---the 1997_Yield_Classes map shows blue cells on it for higher than 1 standard deviation, but the reported count for that category is 0.   Why would that be?   If the count for that value is truly 0, then there should be no blue specks on the map. 

 

    …the 3289 is the correct total for the number of cells—not sure where you got the 3285 figure???  …or the problem with the “blue cells”???

 

4. I'm not sure about the discrepancy in the count of the cells matching up in the table.  I'm also attaching the results of the cross-tab query so you can see if I interpreted it correctly.  It does seem like there should be 3289 cells (both input maps had that value). 

 

For those who followed the “Intersect completely” approach the resulting values are automatically assigned a sequential number to all value pairs between the two reclassified yield maps—1,8= 1 (low, low); 1,16= 2; 1,32= 3; 2,8 = 4; 2,16 = 5; 2,32= 6; 4,8= 7; 4,16= 8; 4,32= 9 (high, high).

 

 … intersected map (coincidence).

 

These are the coincidence table statistics I got—

 

 …where the total number of “matching cells” is 2626 out of a total of 3289 cells, which results in (2626 / 3289) * 100 = 79.8% matching cells.

 

 

___________________________

 

3/9/10

 

Hi Dr. Berry-- I have a question about Exercise 9.  Why, in Question #7 after we use the Scan function, does it say to make 34 user-defined ranges starting from 0?  This would mean that a value of 35 is not labeled in this map.  Why is that? 

 

I also am having a hard time with questions 8 and 9 because I have not had a statistics course yet, so I am struggling with residuals, R-squared, and regression.  Would you mind taking a look at my answers and seeing if I am on the right track?  Also, is there some time Thursday that we could meet to go over this?  Thank you, Luke

 

Luke—you are right, my mistake.  The “Number of ranges” needs to be 35 …coupled with a lot of patient number entry to get from 0-1 (light grey) to 34-35 (green) with a yellow color inflection point at 16-17 (see Shading Manager below).   

 

As to your concern about regression, residuals and r-squared in predictive spatial statistics, the following links…

 

-         http://www.innovativegis.com/basis/MapAnalysis/Topic10/Topic10.htm#Map_correlation, topics “Use Scatterplots to Understand Map Correlation” and “Can Predictable Maps Work for You?”      

-         Slides #34 through #39 that we didn’t get to during last week’s class (Wk9_lec.ppt) and

-         Online regression short primer at http://www.epa.gov/bioindicators/statprimer/regression.html  

 

…might help in your understanding of predictive spatial statistics using regression, residuals and r-squared.  Please review these materials before we get together on Thursday.  Joe

 

 

 

___________________________

 

3/7/10

 

Professor Berry-- I am currently working on question #5 of this week’s lab. My question regards the calculation of similarity values. I feel that I mostly understand the procedure, but I am not sure about the actual calculation.

 

In the question, we are using values of P and K content in the soil at a certain grid cell location as comparison values. When the final Similarity Value is calculated, does the algorithm assign an equal weight to the similarity between P values (comparison and location values) and K values? In other words, is the final Similarity Value calculated based on 50% of the K similarity and 50% of the P similarity? For example, if a location is 100% similar concerning K and 0% similar concerning P, will the final Similarity Value be 50% or not?

 

Also, how is the cutoff for a value to have 0% similarity determined?  I would have thought that it would have been if there was absolutely NO P or K in the soil, but the table in Question 5 suggests otherwise.  Any help would be great as I work on the response.  Curtis

 

Curtis—it sounds like you are comfortable with the concept of “data distance,” the multi-dimensional distance between two data patterns in numeric space.  If not, check out …http://www.innovativegis.com/basis/MapAnalysis/Topic16/Topic16.htm where Topic “Geographic Software Removes Guesswork from Map Similarity” describes the similarity index calculation.

 

In short, the procedure calculates the data distance between the data patterns at every map location in a project area.  The maximum data distance becomes the reference for 0 (percent similar), indicating the least similar location.  The calculation for all other locations is (1 – dataDistance / maxDistance) * 100 to express the result as a percent similar.  In the case of the location with the maximum data distance, the calculation becomes (1 – maxDistance / maxDistance) * 100 = 0%.  For the comparison location itself, the calculation becomes (1 – 0 / maxDistance) * 100 = 100%, identifying a data pattern exactly the same as the comparison point.  All of the other data distances are similarly processed to yield a similarity scale from 0 (least similar) to 100 (most similar).  Joe 
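P.S. For anyone who wants to see the 0-100 scaling in code, here is a minimal sketch (Python with hypothetical P and K grids; an illustration of the scaling described above, not MapCalc’s internal routine, which may also normalize each map before computing data distance).

import numpy as np

# hypothetical P and K concentration grids on the same analysis frame
rng = np.random.default_rng(0)
P = rng.uniform(0, 102, size=(25, 25))
K = rng.uniform(88, 310, size=(25, 25))

row, col = 12, 7                          # the comparison location
# data distance of every cell's (P, K) pattern from the comparison pattern
d = np.sqrt((P - P[row, col])**2 + (K - K[row, col])**2)
similarity = (1 - d / d.max()) * 100      # 0 = least similar ... 100 = identical pattern

print(similarity[row, col])               # 100.0 at the comparison point itself
print(similarity.min())                   # 0.0 at the most dissimilar location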

 

 

___________________________

 

3/7/10

 

Hey Joe-- quick question on question 1 in Exercise 9.  Since the range of P is from 0-102 and the range of K is from 88-310, using the same legend does not seem appropriate.  When you say “the same legend” do you mean the same number of ranges and the same color pattern, and not necessarily the same map value breaks within the ranges?  Thanks, Jason

 

Jason— good question.  By now in the course it should be readily apparent that the default displays are not “best”.  The question then is to determine the best surface map display strategy for the two maps viewed side-by-side.

 

The map surfaces characterize two different nutrient concentrations for a given field …sort of apples and oranges (related but different mapped phenomena).  This perspective would suggest that they should be displayed with a similar interval step but each color ramped to indicate 0 through its own maximum concentration (Option 1 below).

 

But the two maps can also be viewed as crop nutrients that have the same base unit (ppm) and just differ in their ranges …sort of orange and tangerine (the same mapped phenomenon of nutrient concentration).  This perspective would suggest that they should be displayed with the same interval step and color ramp indicating 0 through the maximum concentration (Option 2 below).

 

You must decide on the “best” display strategy for both maps (or you can introduce another display strategy …maybe utilizing 3-D?) and explain the reasoning for your choice in terms of the nature of the data and cartographic sensitivities.  The good part is that there isn’t a right/wrong answer to the question …just degrees of better reasoning.  Joe

 

Option 1 – both are in steps of 20 ppm and color ramped from 0 (green) to their individual Maximum value (red) with a yellow inflection at mid-range.

 

   

 

Option 2 – both are in steps of 20 ppm and color ramped from 0 (green) to the highest Maximum value (red) with a yellow inflection at mid-range.

 

   

 

 

___________________________

 

3/6/10

 

Professor Berry-- In question #3 of this week's lab, the last part asks us to “Briefly describe the statistical summary in the Percent_Difference Map’s Shading Manager.  Be sure to investigate the Statistics and Histogram tabs.”

 

Because there are some very extreme positive percent change values, the histogram shows that the majority of values are near 0% plus or minus change. Is it possible to change the display of the histogram so that I can explain it better?   Thank you, Curtis

 

Curtis—you have hit on the “nugget” of the question …the default display (Shading Manager summary) is not useful because of the extreme outliers.  The Statistics tab shows—

 

 …Min= -76.131% but Max= 4,158.209%.  You need to briefly explain how this can happen (more than a 40-fold increase in yield at a map location).
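A tiny numeric example may help; assuming the Percent_Difference map was computed as (1998 yield - 1997 yield) / 1997 yield * 100 (an assumption about the exercise formula), a near-zero 1997 yield is all it takes to produce a huge percentage.

# assumed formula: percent difference = (yield_1998 - yield_1997) / yield_1997 * 100
def pct_diff(y97, y98):
    return (y98 - y97) / y97 * 100

print(pct_diff(100.0, 90.0))   # -10.0  -> a modest decline
print(pct_diff(2.0, 85.0))     # 4150.0 -> a tiny 1997 yield makes the percentage explode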

 

  … switching to Equal Count helps with the display problem, but doesn’t help with the Histogram display. 

 

You can get a better Histogram display by selecting MapSet → New Graph → Histogram and choosing the Percent_Difference map…

 

    …but it still doesn’t get to the heart of the matter.  You want a better look at the “heart of the data” …say from -50 to 140. 

 

A quick and dirty way to get rid of the outliers would be to enter “RENUMBER Percent_Difference ASSIGNING 140 TO 140 THRU 5000  ASSIGNING -50 TO -100 THRU -50  FOR Percent_Difference_noOutliers”…
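In NumPy terms (a sketch with hypothetical values, not a MapCalc command) the same capping is a one-line clip.

import numpy as np

pct_diff = np.array([-76.1, -12.0, 3.5, 95.0, 890.0, 4158.2])  # hypothetical map values
capped = np.clip(pct_diff, -50, 140)   # pull both tails in to -50 and 140
print(capped)                          # the extreme values become -50 and 140; the rest are unchanged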

 

 

Since this is “Map Analysis” one can’t just “throw away” outliers (assign the values to Null or No Data), as that would leave holes in the map display, so the generally accepted method is to “cap” the upper and lower tails …-50 and 140% in this case.  Since N is so big, “capping” instead of “disregarding” outlier data is thought to be OK.   Joe

 

 

___________________________

 

3/4/10

 

Luke-- responses embedded below.  Joe 

 

Hi Dr. Berry-- I have three questions about GIS modeling:

 

1)  I guess I am a geek at heart.  I have a modeling question that I wanted to see how to solve using MapCalc.  How would one make a model that looks at which way is faster: driving to Durango, CO from DU, or driving from DU to the airport, waiting in line and going through security, and then finally flying to Durango, CO?  One would also have to calculate driving from the Durango airport to the end location because often the airport itself is not the end location.  How would one make a model using MapCalc to look at which way is faster?  Also, how could you make the model so that there was not only one start location?  For instance, what if instead of starting at DU one started in Conifer; how would that affect the model?  What commands would be needed to make this model in MapCalc?  I hope this makes sense.

 

MapCalc’s Spread operation (Costdistance in ESRI) is the key.  You would need to 1) convert a vector road map to grid (PolyGrid in ESRI); 2) use Renumber (Reclassify in ESRI) to assign friction values (typical time to cross a grid cell of each road type) to the different road types; 3) create a “starter” map for DIA; and 4) Spread the starter thru the Friction map for a DIA travel-time surface that contains the estimated driving time to the airport.  Repeat for the Durango airport.  Identifying a Begin and an End location (any road location) on the travel-time surfaces estimates the time to and from the airports; then add a time estimate for parking, shuttle time, check-in, security, train, gate wait, flight time and any other times you can think of after leaving the car.
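For the curious, the core of a Spread/Costdistance accumulation can be sketched as a shortest-path calculation over the friction grid.  Below is a simplified 4-neighbor version (Python, hypothetical friction values); production implementations also handle diagonal moves and other refinements.

import heapq
import numpy as np

def accumulate(friction, starters):
    # friction: minutes to cross each cell; starters: list of (row, col) start cells
    rows, cols = friction.shape
    acc = np.full(friction.shape, np.inf)
    heap = []
    for r, c in starters:
        acc[r, c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        t, r, c = heapq.heappop(heap)
        if t > acc[r, c]:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):      # 4-neighbor steps
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # average of the two cells' friction approximates the cost of the step
                step = 0.5 * (friction[r, c] + friction[nr, nc])
                if t + step < acc[nr, nc]:
                    acc[nr, nc] = t + step
                    heapq.heappush(heap, (t + step, nr, nc))
    return acc   # estimated travel-time from the nearest starter cell to every cell

friction = np.full((5, 5), 10.0)   # hypothetical: slow off-road travel everywhere
friction[2, :] = 1.0               # ...except a fast road across the middle
travel_time = accumulate(friction, [(2, 0)])   # the "starter" map is the airport cell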

 

The accumulated surfaces contain the estimated travel-time from any location to the airport on both the departing end (e.g., DU to DIA) and the destination end (e.g., Durango airport to home).  You calculate this surface “once and use many” times as different departure and destination addresses are entered (geo-coding converts a street address to Lat/Lon coordinates in ESRI).  The information provided in the grid approach is radically different from vector’s Network Analysis.  It has the major advantages of 1) generating a continuous solution from a location to everywhere instead of a point-to-point solution that has to be calculated for every departing and destination address, 2) including off-road travel, and 3) being less demanding of input data (no requirement for defining “turntable” specifications for every intersection).  On the other hand, vector-based Network Analysis 1) responds to the very real one-way streets, left turn delays, etc. and 2) provides navigation information (take this street, then turn left, etc.).

 

The bottom line is that vector’s Network Analysis is best for point-to-point navigation and raster’s Accumulation Surface Analysis is better for large data sets, such as modeling LA basin retail store travel-time for thousands and thousands of customers and their purchases.  Remember “raster is faster but vector is corrector.”

 

2)  In the reading it states that typically in kriging the values are brought down below IDW. What is the math behind this?  I know the kriging uses the statistics of surrounding values, but how does that pull down the values?  I also have read Introduction to GIS by Chang and I am still unclear about why this is.  Do you have a math table to explain the numbers better?

 

Krig’s approach can generate interpolated estimates that are outside the range (min to max) of the original sample data set.  This usually happens 1) in “extrapolation” for areas outside the extent of the data, where it carries forward the localized trend of the data, or 2) in areas where the data is rapidly changing (e.g., going from a peak to a valley and back to a peak, Krig will over-shoot the valley).  I do not have a data table that explains the effect as that requires “human calculation” of a lot of ugly equations.  However, the effect can be seen in slide #28 of the week 8 PowerPoint where the minimum value is -1 (impossible for the data). 

 

3)  For question four in this week’s lab can we combine a post map with the IDW and kriging maps?  I think in order to do this one needs a file called sample3.dat which is not in the file list.  Does this mean that we cannot do this?  Visually it would have made the maps easier to look at because one would know the values of the valleys and peaks.

 

Someone in your group must have found the Sample3.dat file or you couldn’t have created the surfaces in the exercise.  In part 2, “…then select the SAMPLE3.dat file in the \Samples folder.”

 

Thank you, Luke

 

 

___________________________

 

3/3/10

 

Joe-- I was trying to figure out what the cell size is in Surfer.  I was asking help and read a bit about cell properties, but I still haven't stumbled across the cell size.   Could you point me in the right direction?  Thanks!  Elizabeth

 

Elizabeth— you have hit on a really gnarly question.  “Cell size” is determined by “Grid Line Geometry” during gridding—

 

Grid line geometry defines the grid limits and grid density. Grid limits are the minimum and maximum X and Y coordinates for the grid.  Grid density is usually defined by the number of columns and rows in the grid.  The # of Lines in the X Direction is the number of grid columns, and the # of Lines in the Y Direction is the number of grid rows.  By defining the grid limits and the number of rows and columns, the Spacing values are automatically determined as the distance in data units between adjacent rows and adjacent columns.

 

From the Grid Line Geometry fields in the Grid dialog box for the Sample3 data set…

 

 

…it appears that the XY coordinates likely extend from 0 to 50 in both directions (just no samples at the borders of the project area).  But the process uses the min/max in the sample data set to set the cell size, so the values are slightly different in the two planar directions.  This might work for simply making a good plot of the data, but it is hardly sufficient for a consistent grid-based “analysis frame,” which needs to be the same distance in both the X and the Y directions for the analytical functions to work …a basic assumption of grid-based map analysis.

 

The bottom line is that the default Grid Line Geometry does not generate true raster/grid data.  To be true grid data the analysis frame should be configured as 100 columns by 100 rows with a cell size (spacing) of 0.5 in both the X and the Y directions.  If my assumptions are correct, the user needs to “force” the X/Y min and max to 0 and 50, respectively, and set the spacing to 0.5 to form a true analysis frame that can be used in grid-based map analysis.  As for the X,Y units (Z for that matter), I don’t know …I haven’t found where that metadata is kept in Surfer.
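If it helps, the spacing arithmetic is simple enough to check by hand (Python; the “default” limits below are hypothetical values read off a sample extent, not the actual Sample3 numbers).

def spacing(min_coord, max_coord, n_lines):
    # distance between adjacent grid lines (columns or rows)
    return (max_coord - min_coord) / (n_lines - 1)

# defaults taken from the sample data extent give slightly different X and Y spacing
print(spacing(0.2, 49.6, 100))   # ~0.499 in X
print(spacing(0.9, 49.7, 100))   # ~0.493 in Y ... not a true square-cell grid

# forcing the limits to 0 and 50 with 101 grid lines (100 cells) restores a true 0.5 cell
print(spacing(0.0, 50.0, 101))   # 0.5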

 

Joe

 

___________________________

 

2/28/10

 

Exercise 8, Question 4.  What is the maximum, minimum and average difference between the two interpolated surfaces?  Be sure your answer discusses the interpretation of the “sign” and “magnitude” of the differences. 

 

I am having trouble answering this question because I do not understand what the units are on the Z-values. The class discussion led me to believe that it was not possible to have such a large difference (almost 2000) between the two models. Could you please clarify how to interpret the values in this table?  Thanks, Curtis

 

Curtis—actually I don’t know the nature of the data either …knowing Surfer it is likely environmental data like parts per million of lead in the soil.  However, numbers are numbers and the computer never knows the “real-world” legacy of the numbers it analyzes (seems only fair that we don’t know either).  The descriptive statistics for the “Sample3, Z1 column” data that you interpolated using IDW and Krig DEFAULT SETTINGS are…

 

…indicating that the range of the sample data is Min= 9.3 and Max= 5277.7.  And from the difference surface statistics for IDW – Krig you had attached to your email…

 

 …indicates that Min/Max difference values of around plus/minus 1000 aren’t outrageous (+/- about 20% of the original data range).  Also, the Median is only -33.5, which is fairly small (about 0.6% of the range).  This suggests that the two surfaces aren’t “radically” different overall but are “widely” different in a few areas.

 

  Further noodling notices that the “wildly” different locations seem to share the same spatial pattern as the input sample data points …hummmm, maybe the different spatial interpolation approaches treat sampled locations differently.  But the big positive differences (blue tones) occurring in the SW portion seem very odd …hummmm, maybe something to do with how the different spatial interpolation approaches handle sparse data (particularly if the window size is fixed fairly small).  Finally, the large negative differences don’t occur very often and seem to be contained in the “holes” around the sample locations …hummmm, interesting.

 

Hopefully these ramblings stir some thinking.  Class discussion suggested that there usually aren’t large differences (which the preponderance of tan tones supports), but in this case that “blue plateau” seems to suggest differently.  Your charge is to suggest some possible reasons based on the sample data pattern and differences in the spatial interpolation approaches of IDW and Krig.

 

Joe  

 

 

___________________________

 

2/26/10

 

Dr Berry-- A couple questions regarding Question 2 and 5.   Regards, Jeremy.

What is the visual effect of perceived precision/accuracy by decreasing the contour interval from 5 to 10?

 

Question 2.  Decreasing the contour interval from 5 to 10 feet produces contours with greater distance between them.  The contour lines describe elevation points.  Any point on a line is given the elevation value of that line.   A decrease in contour intervals decreases the precision and accuracy of the contour map.  A point’s elevation not on a contour line will be estimated from the nearest contour line.  The greater the contour interval the greater the distance between contour lines.  The greater the contour interval the lower the accuracy and precision of the estimation of the elevation point on the map surface.  Are both accuracy and precision lowered or does there need to be a distinction between the two? 

 

Keep in mind that “Precision” refers to the correct positioning of spatial information (contour bands, in this case) and “Accuracy” refers to the correct classification of spatial information (contour interval ranges).  In “noodle-ing” this question, also keep in mind that a contour map is a discrete map representation of a continuous surface and not a traditional polygon map of property ownership or cover type parcels.

 

Is the actual “level of detail” of the elevation data in the display increased in a contour map with more intervals? 

 

The actual detail of the elevation data is unchanged.  The estimation of points of elevation on the surface map is increased with increased contour intervals.  Elevation locations are more accurately estimated with increased contour intervals.  True?

 

I believe you are on the right track …the elevation data is unchanged, just the visual rendering of the continuous surface as discrete map features (contour bands) is changed.  Merge this thought with your “noodle-ing” above and I think it might help you formulate your responses to both questions. 

 

What defines the actual “spatial resolution” contained in any map surface (data)? 

 

The actual spatial resolution is determined by the density of the spatial distribution of the elevation data points.  The greater the density of the elevation points over the map surface the higher the spatial resolution of the map surface.  Accurate?   

 

However, the elevation data is a grid-based representation (DEM) of a continuous surface, not a set of points.  So what characteristic determines the spatial resolution of grid data?  If you want to extend the discussion a bit you might see if you can figure out the actual resolution for this data set—requires some snooping about with Surfer to locate the metric.

Question 5.  I wish to create a difference map of Average minus IDW or Kriging interpolation.  How do you create an average map?   I don't see this option in the Grid Methods provided.  

 

Not an “Average map” …just identify the “maximum, minimum and average difference” statistics (three scalar numbers) between the two interpolated surfaces you just created.  These are “descriptive statistics” summarizing the map values comprising the Difference Surface; not a new map layer.

 

Also how do you add a legend to the difference map?

 

 …check the “Show Color Scale” box to draw a color scale bar.  Double-click on the scale bar to modify.  If you want to add a geographic Bar Scale (distance) and North Arrow I have never done that …but I have seen some pretty cool “finished” Surfer plots, so there must be a way.

 

___________________________

 

2/24/10

 

Dr. Berry-- It seems as though you have two versions of Surfer out there, Surfer 8 and 9. The Surfer 9 demo version from the website does not have the full functionality that 8 has. Could you confirm this before I waste another evening trying to create an Overlay Map?  Regards, Jeremy.

 

Jeremy—I didn’t know they had released a version 9 …must be their new upgrade.  I recommend working with version 8 as that is what the exercise write-up anticipates.  Version 8 is on the Map Analysis book CD, in the class folder in the GIS lab, and available for download from my website—

 

s8demo.exe — click on this link and select Open.  Follow the onscreen installation instructions.  It is recommended that you accept the default specifications as the exercise write-ups assume this installation location.

 

Joe

 

___________________________

 

2/22/10

 

Dr. Berry-- is there an exercise 7?  Regards, Jeremy

 

Jeremy—no Exercise 7 due this Thursday.

 

In the original materials I referred to Exercise #7 as following exercise #6 (mini-project).  Since there isn’t an exercise for week 7, the new naming scheme refers to the last two exercises as Exercise #8 (week 8) and Exercise #9 (week 9).  The exercise number now refers to the week the material is introduced …not the sequence order; I apologize for the confusion. 

 

The next Exercise #8 will cover the material presented in this week’s class (week 8) on Spatial Interpolation considerations and is due the following class.  You will form your own 1-3 member teams.

 

Or you can do a short paper (4-8 pages) on a GIS Modeling related topic of your choosing in place of Exercise 8 and/or Exercise 9.  For an even more extended experience, you could design your own mini-project.  For example, you might be interested in developing a GIS Modeling exercise for Junior High or High School students.  In this case, the week 8 “special project” (Part 1) would be to develop the plan and sketch out what needs to be done; the week 9 “special project” (Part 2) would be to complete the exercise(s) and create a self-contained CD with all the materials needed for teachers. Make me an “offer I can’t refuse” about what you want to do.

 

However, just for fun (we’re still having fun, right?) Exercise 8 for this week uses Surfer software to investigate concepts and practice in surface modeling (spatial interpolation)—what could possibly be more fun.  Surfer is installed in the GIS lab, or you can download and install the Surfer software on your own computer from the book CD, from the class website, from the class materials in the GIS Modeling course folder, or by downloading directly from Golden Software’s site …http://www.goldensoftware.com/demo.shtml.

 

Joe

 

___________________________

 

2/17/10

 

Folks—some of you are “over-driving” the purpose of a Prototype Model—to demonstrate a viable approach and stimulate discussion.  It is important to keep it simple, stupid (KISS) to ensure clients focus on model approach and logic during the early discussion phase of a project. 

 

Anticipated refinements are reserved for the “Further Considerations” section ...discussion only of additional enhancements at this stage, not implementation. 

 

If model refinement simultaneously accompanies prototype development, there isn’t a need for a prototype.  But that is the bane of a “waterfall approach” to modeling …you can easily drown by jumping off the edge at the onset; whereas calmly walking into the pool with your client (baby steps) engages and involves them, as well as presenting a manageable first cut of the approach and logic for discussion. 

 

This type of thinking is the foundation of the “Agile” project management approach that is sweeping the software development and business worlds (http://en.wikipedia.org/wiki/Agile_software_development) …baby steps with a client, not top-down GIS’er solutions out-of-the-box.  --Joe

 

___________________________

 

2/16/10

 

Jeremy—I am not sure what might cause the problem …maybe there is some preference setting that accounts for the auto-resize.  I mostly use Office 2007 on both XP and Vista machines and don’t have the problem.  You might try “forcing” fixed dimensions for the table and cells—

 

    The safest way to “insert” the graphics (picture in MS Word-speak) in a table cell is to paste it there, double-click to pop-up the Picture Format tab, then access the Size dialog box and enter the exact width you want (be sure “Lock aspect ratio” is turned on; it always will be with a Snagit screen grab).  “Dinking” with a picture’s handles can be troublesome— “diagonal dragging” keeps the fixed aspect ratio, but the side handles “squish” things.     

 

What is the optimal size of the MapCalc window to convey the map images appropriately?  I don’t think there is an “optimal” size window BUT BE SURE it is consistent when grabbing a series of screen shots.  MapCalc will resize to the window, so if it is not consistent the aspect ratios of the maps won’t be consistent.  I suppose the best thought is to always maximize the MapCalc window to insure consistency (and highest resolution captures).  It is best to not use the “Restore down” window and “dink” with its sizing (might be hard to do if you are constantly moving to different computers).  MapCalc keeps the display of the grid cells “square” but attempts to maximize the screen “real estate” hence you see different display relationships among the map, title and legend— a screen display, not a stable paper map.

 

I personally think the one on the right looks more appropriate.  I disagree.  The Tutor25.rgs analysis frame is 25 columns x 25 rows so it is a square, not a vertically elongated rectangle.  You must be used to topographic sheets that have elongated borders, but the UTM and Lat/Lon coordinates do their best to be square.  So in grid terms, each cell is a “square” and the grid lines have to be displayed that way (except for 3D perspective of course).  --Joe

 

Dr. Berry-- I have a question regarding copying images from MapCalc into tables.  

 

Here are two examples of the same map that are the same size in Word (2.5 x 2.5).  But obviously they are different because the MapCalc windows they were taken from are different sizes. What is the optimal size of the MapCalc window you require so as to convey the map images appropriately?

With the aspect ratio turned on they look different, and they are not 2.5 x 2.5. With the aspect ratio turned off they are 2.5 x 2.5 but are "distorted". I personally think the one on the right looks more appropriate.

Regards, Jeremy

 

___________________________

 

2/15/10

 

Folks—I have received some emails about solutions to the “word problems” in Question 5 of Exercise #4.  As mentioned in class, I purposely ordered the questions from easiest to progressively harder.  However, some of you correctly noted that the “hardest part” was interpreting what a question was asking.

 

First, as a general rule in solving word problems you need to state any “assumptions” you make in its solution.  With MapCalc and Surfer it is “fair” to assume that the question intends to use system defaults, unless there is a specific note in the question to use a different option.  Don’t over-reach a question by cluttering it with a lot of “yes, but I could…” variations—like Luke Skywalker, go with your force, but state any assumptions.

 

A particularly useful tool that most you probably learned in grade school is to use a “parsing” outline.  This means to break the question into a set of specific tasks required, and then order the tasks into a logical sequence of steps leading to the solution. 

 

Generally speaking, most GIS modeling problems are a good fit for an ordered “Parsing Outline” as the solutions are usually linear with few logical jumps (rarely are there “Do While” loops or “If <this>, Then <that>” jumps in modeling logic) ...the models are more like a recipe for Banana Bread—do this, then this, then this and bake until brown.  (Aside: don’t confuse this statement with programmer’s coding of algorithms for the basic map analysis tools …lots and lots of loops, tests and stacks).

 

Below are the solutions for all five word problems with discussion that ought to help in understanding the spatial reasoning behind their solutions.  (Aside: hence the Optional Question 4-1 about completing the other two is off the table, but send me an email and I will substitute a couple of new ones …I have a million of them).

 

Joe

________________________________________ 

 

Exercise #4 – Question 5

   

·         Q5-1) Using the Tutor25.rgs database, determine the average visual exposure (Vexposure map from question #4) for each of the administrative districts (Districts map).  Screen grab the important map(s) and briefly describe your solution as a narrative flowchart.

 

Parsing the question into processing considerations:

1) Visual exposure— derived in Question 4 by

                ANALYZE Ve_roads_sliced with Ve_housing_sliced Mean for Vexposure

…involved a process sequence that used visual exposure to Roads and weighted visual exposure to Houses to create an overall visual exposure index to human activity with a potential range of 1 (low) to 5 (high).  The question did not ask you to re-compute the average VE index, just use the existing map.

2) Districts—a given base map with eight contiguous regions defining the Districts in the project area.

3) Average VE per District—use region-wide overlay to calculate the average VE index (data map) within each of the Districts (template map) by

                COMPOSITE Districts with Vexposure average for AvgVE_Districts

 

The hardest part was recalling that COMPOSITE is the region-wide operator in MapCalc (ZONALmean in ESRI Grid/Spatial Analyst).
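Conceptually the region-wide overlay is just a zonal mean; below is a minimal sketch (Python/NumPy with small hypothetical grids) of what COMPOSITE … average computes.

import numpy as np

# hypothetical grids: district id (template map) and visual exposure index (data map)
districts = np.array([[1, 1, 2, 2],
                      [1, 1, 2, 2],
                      [3, 3, 4, 4],
                      [3, 3, 4, 4]])
vexposure = np.array([[1, 2, 5, 4],
                      [2, 3, 4, 5],
                      [1, 1, 3, 2],
                      [2, 1, 2, 3]])

# average exposure within each district (one summary value per region)
for d in np.unique(districts):
    print(d, vexposure[districts == d].mean())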

 

·         Q5-2) Using the Tutor25.rgs database, identify the visual exposure to roads (Vexpose map) within a 300m simple buffer (3 cells) around roads (Roads map).  Screen grab the important map(s) and briefly describe your solution as a narrative flowchart.

 

Parsing the question into processing considerations:

1) Visual exposure to roads— use the “completely” option to count the number of viewer cells connected to each map location by

                RADIATE Roads to 100 Completely For Vexpose

2) 300m proximity buffer around roads— must calculate a simple proximity map that extends to 4 (or more) 100-meter cell lengths away from roads so you can renumber to a 300 meter (3 cell lengths) binary buffer

                SPREAD Roads to 4 For Road_prox4

                RENUMBER Road_prox4 Assign 0 to 0 thru 4 Assign 1 to 0 thru 3 For Road_buffer3

3) Identify Vexpose values within Road_buffer3— use grid math to multiply the binary buffer by the visual exposure values

                COMPUTE Road_buffer3 Times Vexpose for VE_Road3    (could use CALCULATE)

 

The hardest part of this question was interpreting the question. One interpretation was to determine the visual exposure within just a 300 meter reach of a road by RADIATE Roads to 3 Completely For Vexpose3.  The problem with this solution is that there might be locations within the 300 meter buffer that are seen from road locations farther away than 300 meters.  Another interpretation just calculated the viewshed (a binary map of seen or not seen), not visual exposure.
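For reference, the three parsing steps above (proximity, binary buffer, mask) can be sketched outside MapCalc as well (Python with SciPy; the roads and exposure grids are hypothetical).

import numpy as np
from scipy.ndimage import distance_transform_edt

rng = np.random.default_rng(0)
roads = np.zeros((25, 25), dtype=int)
roads[12, :] = 1                                # a hypothetical east-west road
vexpose = rng.integers(0, 40, size=(25, 25))    # hypothetical visual-exposure counts

# proximity: distance (in cell lengths) to the nearest road cell
prox = distance_transform_edt(roads == 0)

# binary 3-cell (300 m) buffer, then mask the exposure values to the buffer
buffer3 = (prox <= 3).astype(int)
ve_in_buffer = buffer3 * vexpose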

 

·         Q5-3) Using the Tutor25.rgs database, create a map (not just a display) that shows the locations that have the highest (top 10%) visual exposure to houses (Housing map).  Screen grab the important map(s) and briefly describe your solution as a narrative flowchart.

 

Parsing the question into processing considerations:

1) Visual exposure to Houses— some confusion in interpretation as to whether to use “completely” that counts the number of Housing locations visually connected, or “weighted” that sums the Housing values for the total number of Houses

                RADIATE Housing to 100 Weighted For Vexpose_houses

2) Determine top 10%— need to use the Shading Manager to set 10 ranges that will identify the top 10% (1/10).  Confusion can arise as to whether the question is asking to use “equal count” that identifies intervals of 10% of the area, or “equal ranges” that identifies intervals of 10% of the value range.  Since area wasn’t specified, the map value for the lower bound of the tenth interval is best to use.

3) Create a map— reclassify the VE map to a binary map that identifies the top 10% by

                RENUMBER Vexpose_houses Assigning 0 to 0 thru <maxValue> Assign 1 to <lower 10% cutoff> thru <maxValue> For Vexpose_houses_top10pct

 

To just use the Shading Manager to set the color for the first 9 intervals to light grey and the 10th interval to red creates the same visual display, BUT it is not a binary map that the computer can use in further processing, such as determining the average elevation for the most visually exposed areas (COMPOSITE Vexpose_houses_top10pct with Elevation Average).
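Under the “equal ranges” interpretation, the reclassification boils down to a simple threshold at 90% of the value range; a small sketch (Python/NumPy, hypothetical exposure values):

import numpy as np

vexpose_houses = np.random.default_rng(3).uniform(0, 120, size=(25, 25))  # hypothetical

# lower bound of the tenth "equal ranges" interval = min + 0.9 * (max - min)
cutoff = vexpose_houses.min() + 0.9 * (vexpose_houses.max() - vexpose_houses.min())
top10pct = (vexpose_houses >= cutoff).astype(int)   # a binary map usable in further analysis

# the "equal count" interpretation would use a percentile instead:
# cutoff = np.percentile(vexpose_houses, 90)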

 

·         Q5-4) Using the Island.rgs database, create a map that identifies locations that are fairly steep (15% or more on the Slope map) and are westerly oriented (SW, W and NW on the Aspect map) and are within 1500 feet inland from the ocean (hint: the analysis frame cell size is a “property” of any map display).  Screen grab the important map(s) and briefly describe your solution as a narrative flowchart.  

 

Parsing the question into processing considerations:

1) Fairly steep— some confusion could arise about what slope algorithm to use.  Best to use the default “fitted” (and state it in your assumptions) unless otherwise directed to use another method (maximum, minimum, average).

                SLOPE Elevation Fitted For Slopemap

                RENUMBER Slopemap Assign 0 to 0 thru <maxValue> Assign 1 to 15 thru <maxValue> For Fairly_steep

2) Westerly oriented— easiest to use “octants” as the question implies octants, but you could use “Precisely” and then determine the azimuth cutoffs for the bearings (1=N, 2=NE, 3=E, 4=SE, 5=S, 6=SW, 7=W, 8=NW, 9=Flat).  Since the question didn’t specify what to do for flat areas without a dominant terrain orientation, it seems safe to assume (and state your assumption) that such areas are not of interest.

ORIENT Elevation Octants For Aspectmap

                RENUMBER Aspectmap Assign 0 to 0 thru 9 Assign 1 to 6 thru 8 For Westerly_oriented

3) Ocean— the tricky part here was determining which map to use in the calculation …Land_mask.  Mask_all has “coast” as a feature but that would mean establishing proximity from the coast cells (one cell inland) not the ocean cells.

                RENUMBER Land_mask Assigning 1 to -1  For Ocean

4) Within 1500 feet inland— checking the Properties → Source tab as suggested identifies the cell size of the Island.rgs database as 82.5 feet.  Thus the “reach” inland is 1500/82.5 = 18.18, rounded to 18 cell lengths (plus one for “or more” that needs to be eliminated when renumbering for the binary mask); a quick check of this arithmetic appears in the sketch at the end of this question’s discussion.

SPREAD Ocean to 19 For Inland_1500plus

                RENUMBER Inland_1500plus Assign 1 to 0 thru 18 Assign 0 to 18 thru 19 For Inland_1500    (could use 18.0001 thru 19 if concerned about an exact “tie”)

5) Locations meeting all three criteria—

                COMPUTE Fairly_steep Times Westerly_oriented Times Inland_1500 For All_three    (could use CALCULATE)

 

Displaying the result on a 3D Grid or Lattice plot of the Elevation surface is best as it lets the viewer “see” the terrain relationships.
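The cell-length arithmetic from step 4 and the final “all three” overlay can be double-checked the same way (Python/NumPy sketch; the binary criterion grids are hypothetical).

import numpy as np

cell_size_ft = 82.5
reach_cells = 1500 / cell_size_ft     # 18.18..., truncated to 18 cell lengths
print(round(reach_cells, 2))

# hypothetical binary criterion maps (1 = the cell meets the criterion)
rng = np.random.default_rng(5)
fairly_steep      = rng.integers(0, 2, (10, 10))
westerly_oriented = rng.integers(0, 2, (10, 10))
inland_1500       = rng.integers(0, 2, (10, 10))

# multiplying binary maps keeps only the cells that satisfy all three criteria
all_three = fairly_steep * westerly_oriented * inland_1500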

 

·         Q5-5) Using the Agdata.rgs database, create a map that locates areas that have unusually high phosphorous levels (one standard deviation above the mean on the 1997_Fall_P map) and are within 300 feet of a “high yield pocket” (top 10% on the 1997_Yield_Volume map).  Screen grab the important map(s) and briefly describe your solution as a narrative flowchart.

 

Parsing the question into processing considerations:

1) Unusually high Phosphorous levels— need to use the Shading Manager’s “Statistics Tab” to identify the Mean and StDev of the 1997_Fall_P mapped data.  The lower cutoff for “unusually high levels” is Mean + 1 StDev.

                RENUMBER 1997_Fall_P map Assign 0 to 0 thru <maxValue> Assign 1 to <Mean + 1StDev> thru <maxValue> For High_P

2) High yield pocket— need to display the 1997_Yield_Volume and use the Shading Manager to set 10 ranges that will identify the top 10% (1/10).  Confusion can arise as to whether the question is asking to use “equal count” that identifies intervals of 10% of the area, or “equal ranges” that identifies intervals of 10% of the value range.  Since area wasn’t specified, the map value for the lower bound of the tenth interval is best to use.

RENUMBER 1997_Yield_Volume Assigning 0 to 0 thru <maxValue> Assign 1 to <lower 10% cutoff> thru <maxValue> For High_yield

 

3) Within 300 feet of a high yield pocket— checking the Properties → Source tab as suggested identifies the cell size of the AgData.rgs database as 50.0 feet.  Thus the “reach” away from high yield pockets is 300/50.0 = 6 cell lengths (plus one for “or more” that needs to be eliminated when renumbering for the binary mask)

SPREAD High_yield to 7 For Pockets_300plus

                RENUMBER Pockets_300plus Assign 1 to 0 thru 6 Assign 0 to 6 thru 7 For Pockets_300    (could use 6.0001 thru 7 if concerned about an exact “tie”)

 4) Locations meeting both criteria—

                COMPUTE High_P Times Pockets_300 For HighP_HighYield    (could use CALCULATE)

 

Displaying the result on a 3D Grid or Lattice plot of the 1997_Fall_P surface is best as it lets the viewer “see” the relationship with P levels.
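Parsing step 1’s “unusually high” cutoff is simply the map’s mean plus one standard deviation; a quick sketch (Python/NumPy, hypothetical phosphorous values):

import numpy as np

fall_p = np.random.default_rng(7).normal(60, 15, size=(40, 40))  # hypothetical 1997_Fall_P values

cutoff = fall_p.mean() + fall_p.std()      # Mean + 1 StDev from the Statistics tab
high_p = (fall_p >= cutoff).astype(int)    # binary map of unusually high phosphorous
print(round(cutoff, 1), int(high_p.sum()), "cells flagged")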

 

___________________________

 

2/13/10

 

Katie, Jason and Luke— great scoping session yesterday!!!  I think our clients are comfortable with the proof-of-concept prototype that emerged from the meeting.  Below is a somewhat more refined version of the draft flowchart for the Basic and Extended Forest Access models you sent (aside for all GIS Modeling students: this is a good flowchart for “Techy-guy”— note how the boxes align and match the levels of abstraction from Descriptive mapped data to Prescriptive map solution).  Hopefully you captured the “Future Considerations” thoughts and discussion as well—cover type, soils, effective proximity buffer, etc. 

 

Because this is a “mini-Project” I have decided not to require you to implement and report the additional models for identifying the Landings and characterizing their Timbersheds as we outlined on the whiteboard …these are fairly complex solutions and outside the workload bounds of a mini-Project.  All you need to do is implement the Basic and Extended models flowcharted above …and generate a comparison between the output from the Basic model alone and the Extended model.  Your comparison should include a map showing the difference and a table summarizing the difference (easier than you think …hint: Shading Manager table).

 

 

The roughest conceptual part is in understanding the extended “adjustments” and how they affect the Basic Access Model.  The idea is that the Basic model considers direct factors determining absolute and relative barriers to harvesting— landscape features (Ownership, Water and Sensitive Areas) and machinery operating conditions (Slope).  The Extended Model considers indirect human-related factors that suggest it is more attractive to harvest in certain circumstances—areas of low visual exposure and low housing density (out of sight and away) have the effect of lowering the direct factor “impedance” by as much as 50% (.5).  The effect is to favor harvesting in areas with less people impact.

 

Another conceptual hurdle is the final step that summarizes the total forest accessible area within each of the watersheds.  The “Renumber” provides for infusing end user expertise and judgment.  If the renumbering for the binary map is to one unit below the maximum “To <value>” it will identify all of the accessible forested areas, and the “Composite” will count the total number of accessible cells.  However, if the user wants just the “most” accessible forested areas (i.e., low hanging fruit) they would renumber to a smaller relative access value—less of a costly reach into the woods.  The procedure is like saying “we only want to harvest within a quarter mile of a road,” except the harvesting reach from roads is in realistic terms that consider the availability and accessibility dictated by the intervening terrain. 

 

It looks like all of the maps are in the Bighorn.rgs database except “Ownership” and “Watershed” maps that I will create and send to you as a text file for importing …by tomorrow 5:00 pm (earlier if all goes well).

 

I will complete the processing for the MBA Team client and forward to them with your report a week from Sunday.  I’ll share the Landings/Timber-sheds Addendum with you and the rest of the class ...and who knows how many others as I plan to use the model in my Beyond Mapping column for GeoWorld as a three-part series to illustrate the anatomy of a GIS access model.

 

Have a great weekend, Joe

 

___________________________

 

1/18/10

 

Folks—in grading your Exercise #1 reports I have made numerous “red-line comments” embedded in your responses.  When you get the reports back, PLEASE take note.

 

The most flagrant mistake was failing to link the model logic to the commands and results by thoroughly considering the map values.  For example,

 

“…looking at the interpretation results shows that the Gentle Slope criterion is the least selective—just about everywhere is rated as ‘pretty good.’  However, if the model is moved to another project area that is predominantly east-facing, the aspect consideration might be the most restrictive.  You missed a chance to comment on which criteria map layer was the most restrictive and which was least restrictive.”  You need to “fully digest and think about” the map values, not simply “complete” the exercise.  

 

Just to reinforce the “deadline policy” for your reports—

 

…the “deadline etiquette” spiel in class is part of the sidebar education opportunities that comes with the course …more bang for your tuition buck.  In the “real-world” there are often a lot of folks counting on a sequence of progressive steps—if one is missed the procession can get off kilter. Outside factors are part of reality but a “heads-up” of likely missing a deadline lessens the impact, as others appreciate the courtesy and, provided you announce a new expected deadline, they can adjust their schedules as needed …softens the blow by recognizing others and demonstrates you are a team player.  The opposite reaction occurs if the deadline is disregarded …hardens the blow by ignoring others and suggests that you are a soloist. 

 

Also, below is a list of general “report writing tips” that might be useful in future exercises.  Hopefully these tips will help the “final” polishing of your Exercise #2 reports (and beyond!!!)  --Joe

___________________________

Underlying Principle: Report writing is all about helping the “hurried” reader 1) see the organization of your thinking, as well as 2) clearly identify the major points in your discussion.

 

Report Writing Tip #1: enumeration is useful in report writing as the reader usually is in a hurry and wants to “see” points in a list.

 

Report Writing Tip #2: when expanding on an enumerated list you might consider underlining the points to help the hurried reader “see” your organization of the extended discussion/description.

 

Report Writing Tip #3: avoid long paragraphs with several major points—break large, complex paragraphs into a set of smaller ones, with each smaller paragraph containing a single idea and descriptive sentences all relating to that one thought. Don’t be “afraid” to have a paragraph with just one sentence.

 

Report Writing Tip #4: it is a good idea to use two spaces to separate sentences as it makes paragraphs less dense …makes it easier to “see” breaks in your thoughts—this goes with the “tip” to break up long paragraphs, as dense text and long paragraphs are both distracting/intimidating to a hurried reader and make your writing seem overly complex and difficult to decipher.  Most professional reports do not indent paragraphs—indenting appears more “essay-like” than report-like.  A report is not a literary essay.

 

Report Writing Tip #5: avoid using personal pronouns (I, we, me, etc.) in a professional report.  A report is not a letter (or a text message).   

 

Report Writing Tip #6: “In order to…” is a redundant phrase and should be reduced to simply “To…”  For example, “In order to empirically evaluate the results …” is more efficiently/effectively written as “To empirically evaluate the results…”  This and two other points of grammar are often used to “differentiate” the Ivy scholar from the inferior educated masses.  The other two are 1) the split infinitive (e.g., “This thing also is going to be big,” not “…is also going to be…”; don’t stick adjectives or adverbs in the middle of a compound verb) and 2) extraneous hyperbole (e.g., “That’s a really good map for…” versus “That’s a good map for…”; avoid using “really”).

 

Report Writing Tip #7: need to ALWAYS include a caption with any embedded graphic or table.  Also, it is a general rule that if a figure is not discussed in the text it is not needed—therefore, ALWAYS direct the reader’s attention to the graphic or table with a statement of its significance to the discussion point(s) you are making.

 

Report Writing Tip #8: ALWAYS have Word’s Spelling and Grammar checkers turned on. When reviewing a document, right click on Red (spelling error) and Green (grammar error) underlined text and then correct.

 

Report Writing Tip #9: it is easiest/best to construct (and review) a report in “Web Layout” as page breaks do not affect the placement of figures (no gaps or “widows”).  Once the report is in final form and ready for printing, you can switch to “Print Layout” and cut/paste figures and captions as needed.

 

Report Writing Tip #10: be sure to use a consistent font and pitch size throughout the report.  Change font only to highlight a special point you are making or if you insert text from another source (include the copied section in quotes).

 

Report Writing Tip #11: don’t use “justify” text alignment as it can cause spacing problems when a window is resized in “Web Layout” view; the document will not be printed ...it’s the “paperless society,” right?  Also, be consistent with line spacing …usually single space (or 1.5 space) is best …avoid double spacing as it takes up too much “screen real estate” when viewing a report.

 

Report Writing Tip #12: it is easier (and more professional) to use a table for multiple screen grabs and the figure #/title/caption as everything is “relatively anchored” within the table and pieces won’t fly around when resizing the viewing window—

 

    …be sure to keep the table width within page margin limits if you plan to print (also for easier viewing in Web Layout).

 

CoverType map

CLUMP dialog box

CLUMPED CoverType map

Figure 2-1.  Script construction and map output for the CLUMP operation.  The left inset shows the CLUMP operation settings.  The CoverClumps output map on the right, identifying unique map values for each “contiguous Covertype grouping,” is displayed in discrete 2D grid format with layer mesh turned on.

 

                

The easiest (and best) way to center items in the table is to click on each item and choose “Center” from the Paragraph tools; to create upper and lower spacing, select the entire table, then Table Properties → Cell tab → Cell Options → uncheck the Cell Margins box → specify .08 as both top and bottom margins.

 

 

 

___________________________