How much data is enough? Investigating how spatial data resolution impacts conservation decision making

What information do you need to make a good decision? It might seem like an obvious question, but given the increasing array of remote data sources, it can be hard to pick one. On the one hand, you don’t want to make the wrong decision based on incomplete data, but working with very high resolution data can be costly and slow (and isn’t always necessary). If your decision is relatively clear, it may not matter how much better the best option is than the others; you’ll take the same action regardless.

A new paper published in Remote Sensing in Ecology and Conservation tackles this question in a conservation context. We were doing a detailed analysis (using sub-meter satellite imagery from Digital Globe) in support of the Camboriú water fund in Brazil (see https://global.nature.org/content/how-to-assess-the-return-on-investment-in-watershed-conservation for details), examining how planned conservation work would impact water quality in the river (reducing treatment costs and the risk of water shortages). The analysis involved classifying land cover, converting that to land use, predicting future land use change, and then running a SWAT model to compare the sediment load under different land use scenarios (e.g. comparing a baseline to a conservation option).

Hydrologic monitoring at the study site. Photo by Timm Kroeger, The Nature Conservancy

You can read more about the context of the project at https://blog.nature.org/science/2017/08/24/camboriu-data-for-water-funds/, but the key point for spatial scientists is that when each land use change modeling run took two weeks to complete, we started wondering whether a simpler model would have been good enough (and how to pick the right data in future analyses).

So we compared the impact of beginning with either 1 m imagery or 30 m Landsat imagery on subsequent derivatives: land use (see Figure 1 for a comparison), water quality (via SWAT models), return on investment (ROI) for the water fund, and the time and costs of analysis. We found that the simpler model would have led to the same decision in Camboriú (to invest in conservation), but that in other contexts the higher-resolution model could have made a difference. Since the 30 m data was free and led to a land use change model that took only a day to run instead of two weeks, using it could have saved a lot of time and money.

Figure 1. Map showing 1 m pixels of agreement (gray) and disagreement (red) between 30 m and 1 m land use. 22.8% of the 1 m pixels within the study area had a different land use class in the 30 m data. First published in Fisher et al. (2017).
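For readers who want to run a similar agreement check on their own maps, here is a minimal sketch of the idea behind Figure 1: compare each fine pixel with the coarse pixel that covers it and count the mismatches. This is not the workflow used in the paper. The function name, the toy grids, and the class labels are all made up for illustration, and it assumes the two grids are perfectly aligned, with the coarse pixel size an exact multiple of the fine one (a factor of 30 for 30 m vs. 1 m data).

```python
def disagreement_fraction(fine, coarse, factor):
    """Share of fine-grid pixels whose class differs from the coarse pixel covering them.

    fine and coarse are lists of rows; factor is the number of fine pixels
    per coarse pixel along each axis (hypothetical helper, not from the paper).
    """
    total = 0
    differing = 0
    for row_index, row in enumerate(fine):
        for col_index, fine_class in enumerate(row):
            # Integer division finds the coarse pixel that contains this fine pixel.
            coarse_class = coarse[row_index // factor][col_index // factor]
            total += 1
            if fine_class != coarse_class:
                differing += 1
    return differing / total


# Toy example: a 4 x 4 fine grid and a 2 x 2 coarse grid (factor of 2).
fine_grid = [
    ["forest", "forest", "pasture", "pasture"],
    ["forest", "road", "pasture", "pasture"],
    ["rice", "rice", "forest", "forest"],
    ["rice", "pasture", "forest", "forest"],
]
coarse_grid = [
    ["forest", "pasture"],
    ["rice", "forest"],
]

print(disagreement_fraction(fine_grid, coarse_grid, 2))  # 0.125 (2 of 16 pixels differ)
```

In the toy grids, the small road and the single pasture pixel are exactly the kind of features that disappear at the coarser resolution.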

While much of this paper is fairly technical and narrowly focused on water quality modeling in Camboriú, we believe that the discussion offers insights that apply much more broadly to anyone deciding which imagery source best fits their needs. To get a sense of how available information shifts with imagery source, Figure 2 shows what a 90 m x 90 m area looks like using (from left to right) Landsat (30 m pixels), WorldView 2 (0.5 m pixels), and a drone (0.07 m pixels). The Landsat and WorldView 2 images show the identical area in Camboriú, but we didn’t have drone imagery in our study area, so the final image is from a different farm in Maryland.

Figure 2. Comparison of imagery from Landsat (30 m), WorldView 2 (0.5 m), and a drone (0.07 m)
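To put rough numbers on that difference, the short sketch below counts how many pixels cover the same 90 m x 90 m square at each of the three pixel sizes quoted above. The helper name is made up for illustration, and the drone figure is approximate because 90 m is not a whole multiple of 0.07 m.

```python
def pixels_in_square(side_m, pixel_m):
    """Approximate number of square pixels needed to cover a square area."""
    per_side = side_m / pixel_m
    return per_side ** 2


# Pixel sizes are the ones quoted for Figure 2.
for name, pixel_m in [("Landsat", 30.0), ("WorldView 2", 0.5), ("drone", 0.07)]:
    count = pixels_in_square(90.0, pixel_m)
    print(f"{name}: about {count:,.0f} pixels")
```

That works out to 9 pixels for Landsat, 32,400 for WorldView 2, and roughly 1.65 million for the drone, which helps explain why storage and processing time grow so quickly with resolution.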

While it is apparent that much more detail is visible in the higher-resolution images (e.g. tree branches, farm rows or even individual leaves of corn), note that there is also more shadow, and that visible bare branches on trees and bare ground within fields may complicate land cover / land use classification.

Table 1 provides several factors to consider in deciding whether higher-resolution or lower-resolution data is appropriate for a given project. For example, for a project with a large watershed that is fairly homogeneous and will require frequent updates despite a low budget, starting with free lower-resolution data probably makes sense. But if there is low tolerance for error (and a high risk of making the wrong decision), higher-resolution data is likely warranted, especially if an initial model produces estimates close to an important decision-making threshold (like an ROI close to 1).
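The threshold consideration can be written as a simple screening rule. The sketch below is only an illustration: the function names are hypothetical, the 0.25 margin is an arbitrary placeholder and not a value from the paper, and the benefit and cost figures are invented. In practice the margin should reflect how uncertain your low-resolution estimates are.

```python
def roi(benefits, costs):
    """Return on investment as a simple benefit-to-cost ratio."""
    return benefits / costs


def near_threshold(roi_estimate, threshold=1.0, margin=0.25):
    """True if a first-pass ROI estimate falls within `margin` of the threshold.

    The default margin is an arbitrary placeholder for illustration only.
    """
    return abs(roi_estimate - threshold) <= margin


# Invented first-pass results from a low-resolution model.
close_call = roi(benefits=1_100_000, costs=1_000_000)  # 1.1
clear_case = roi(benefits=2_000_000, costs=1_000_000)  # 2.0

print(near_threshold(close_call))  # True: consider rerunning with higher-resolution data
print(near_threshold(clear_case))  # False: the lower-resolution result is probably sufficient
```

A rule like this covers only one row of Table 1; budget, patch size, and the other considerations still need to be weighed alongside it.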

It’s tempting to always go for what you think is the “best” data, and in conservation and remote sensing that often means drones or new satellites with very high resolution. But as we have learned here and in other projects, those data sources can be both expensive and hard to work with. With a bit of thought and planning, analysts should be able to save time and money in some cases, and in others identify where extra work is really necessary to make the right decision.

Jon Fisher

Clean Camboriú river entering the floodplain. Photo by Timm Kroeger, The Nature Conservancy

Table 1. Considerations for selecting the appropriate spatial data resolution for a given analysis.

Consideration	Use higher-resolution data	Use lower-resolution data
Project budget	Higher budget	Lower budget
Study area size (affecting imagery cost & processing time)	Smaller study area	Larger study area
Required accuracy / precision (and risk associated with error)	High accuracy & precision needed, low tolerance for error	Accuracy & precision less critical, more error acceptable
Need to explore / refine model (affecting elapsed time for multiple model runs)	Model inputs and process well known, few model runs likely required	Exploration of model development needed, many model runs and refinements expected
Thresholds in decision making (e.g. if the estimates change slightly, a different decision is needed)	Initial estimates are close to an important threshold (e.g. ROI near 1)	Initial estimates appear to be safely distant from key thresholds
Size of land cover / land use patches in the landscape	Smaller patches	Larger patches
Size distribution of individual land cover / land use change patches	Most land cover change occurs in patches smaller than lower-resolution pixels	Most land cover change occurs in patches larger than lower-resolution pixels
Size (and heterogeneity) of parcels identified for conservation interventions	Small and/or heterogeneous parcels	Large and/or homogeneous parcels
Scale of variations in elevation / slope	Elevation varies considerably across small areas / steep slopes	Relatively gradual elevation changes / gentle slopes
Presence of small but important features or management practices that could impact your results (e.g. small dirt roads, water control bars on roads, strip-tillage, drainage ditches, thin riparian buffers or grass strips)	Important small features present	Small features absent or unimportant
Frequency of data updates needed (temporal resolution)	Infrequent updates acceptable	Need for frequent updates
Other uses for spatial data	The data can be used for multiple analyses	The data will solely be used for a single analysis
