
On the effects of spatial resolution on effective distance measurement in digital landscapes

Abstract

Background

Connectivity is an important landscape attribute in ecological studies and conservation practices and is often expressed in terms of effective distance. If the cost of movement of an organism over a landscape is effectively represented by a raster surface, effective distances can be equated with the cost-weighted distance of least-cost paths. It is generally recognized that this measure is sensitive to the grid’s cell size, but little is known about whether it is always sensitive in the same way and to the same degree and, if not, what makes it more (or less) sensitive. We conducted computational experiments with both synthetic and real landscape data, in which we generated and analyzed large samples of effective distances measured on cost surfaces of varying cell sizes derived from those data. The particular focus was on the statistical behavior of the ratio (referred to as the ‘accuracy indicator’) of the effective distance measured on a lower-resolution cost surface to that measured on a higher-resolution cost surface.

Results

In the experiment with synthetic cost surfaces, the sample values of the accuracy indicator were generally clustered around 1, but were slightly greater in the absence of linear sequences (or barriers) of high-cost or inadmissible cells and smaller in the presence of such sequences. The latter tendency was more dominant, and both tendencies became more pronounced as the difference between the spatial resolutions of the associated cost surfaces increased. When two real satellite images (of different resolutions with fairly large discrepancies) were used as the basis of cost estimation, the variation of the accuracy indicator was found to be substantially large in the vicinity (within about 1500 m) of the source but decreased quickly with increasing distance from it.

Conclusions

Effective distances measured on lower-resolution cost surfaces are generally highly correlated with—and useful predictors of—effective distances measured on higher-resolution cost surfaces. This relationship tends to be weakened when linear barriers to dispersal (e.g., roads and rivers) exist, but strengthened when moving away from sources of dispersal and/or when linear barriers (if any) are detected by other presumably more accessible and affordable sources such as vector line data. Thus, if benefits of high-resolution data are not likely to substantially outweigh their costs, the use of lower resolution data is worth considering as a cost-effective alternative in the application of least-cost path modeling to landscape connectivity analysis.

Background

With the continuing loss and fragmentation of wildlife habitats worldwide over the past decades, landscape connectivity—i.e., “the degree to which the landscape facilitates or impedes movement among resource patches” (Taylor et al. 1993)—has received much attention in ecological research (e.g., Merriam 1984; Taylor et al. 1993; With et al. 1997; Tischendorf and Fahrig 2000; Fletcher Jr et al. 2018; Wilkinson et al. 2018; Sullivan et al. 2019). Since organisms need to move routinely for resource exploitation (Van Dyck and Baguette 2005) and interact with different populations for reproduction (Stevens et al. 2006; Michels et al. 2001; Wang et al. 2009; Cushman and Lewis 2010; Spear et al. 2010), weakened landscape connectivity may lead to the reduction of populations, or even local extinction, of vulnerable native species (Rudnick et al. 2012).

While distance between patches of habitats is a key parameter to quantify landscape connectivity, its concept and measurement have been defined in different ways depending on context. More specifically, the level of complexity and relevance of any distance defined depends on how much we know about a target species and its environment—from the species’ mobility (e.g., by walking or flying), diet, habitat, and its interaction (e.g., predation) with other species to the environment’s vegetation, topography, climate, and spatial structure of its components. The simplest one is Euclidean distance along a straight line and its use, implicitly or explicitly, assumes that organisms move at a constant rate across a landscape. If the landscape contains impenetrable features (e.g., water bodies and highways for terrestrial animals), this assumption may be modified such that organisms move at a constant rate inside the penetrable part of the landscape but do not move at all outside it. The distance measured as such is useful to evaluate structural connectivity (Tischendorf and Fahrig 2000; Adriaensen et al. 2003; Chardon et al. 2003). More recently, however, it has become increasingly common to acknowledge heterogeneity in “the behavioral responses of an organism to the various landscape elements” (Tischendorf and Fahrig 2000) and modify Euclidean distance based on the organism’s mobility through each landscape element (e.g., Taylor et al. 1993; Graham 2001; Rayfield et al. 2010). The resulting distance is referred to herein as effective distance (Tischendorf and Fahrig 2000; Ferreras 2001; Adriaensen et al. 2003; Verbeylen et al. 2003; Broquet et al. 2006; Spear et al. 2010) but may be called effective geographical distance (Michels et al. 2001), ecological distance (Royle et al. 2013; Sutherland et al. 2015), or functional distance (Petit and Burel 1998) elsewhere.

One approach to measuring effective distance employs geospatial technologies such as Geographic Information Systems (GIS). In particular, a raster-based GIS may be used to discretize a landscape thematically into a set of single-attribute layers, each corresponding to one landscape variable, and spatially into a grid of (typically equal-sized square) cells, each representing a location on the Earth’s surface. For example, a land cover layer assigns each cell a value indicating the dominant (or most critical, depending on context) land cover type. If a layer is given in which each cell is assigned a ‘cost’ (or additive, undesirable quantity) per unit length for a certain use or activity within that cell, another useful function of GIS computes the cost-weighted length of a path (represented by a sequence of cells), i.e., the sum, over each pair of consecutive cells in it, of the length between the two cells multiplied by their average cost value, and finds a least-cost path from a source to a destination, i.e., one that has the minimum cost-weighted length of all paths connecting them. See Sawyer et al. (2011) for a critical review on the application of the least-cost path function to landscape connectivity analysis and linkage design.
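The definition of cost-weighted length given above can be illustrated with a minimal sketch. It assumes that a path may step between any of the eight neighboring cells (the text does not specify the connectivity rule), and the function name `cost_weighted_length`, the `cell_size` parameter, and the small example grid are hypothetical.

```python
import math


def cost_weighted_length(cost, path, cell_size=1.0):
    """Sum, over consecutive cell pairs, of step length times mean cost.

    cost      -- 2D list of cost values per unit length
    path      -- list of (row, col) cells; consecutive cells must be adjacent
                 (8-neighbor adjacency is an assumption of this sketch)
    cell_size -- cell edge length in map units (hypothetical parameter)
    """
    total = 0.0
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        dr, dc = abs(r1 - r0), abs(c1 - c0)
        if max(dr, dc) != 1:
            raise ValueError("consecutive cells must be adjacent")
        # Center-to-center distance: 1 cell width, or sqrt(2) on a diagonal.
        step = cell_size * math.hypot(dr, dc)
        total += step * (cost[r0][c0] + cost[r1][c1]) / 2.0
    return total


# Hypothetical 3 x 3 cost surface with a high-cost diagonal band.
cost = [[1, 1, 5],
        [1, 5, 1],
        [1, 1, 1]]
straight = [(0, 0), (1, 1), (2, 2)]        # crosses the high-cost cell
detour = [(0, 0), (1, 0), (2, 1), (2, 2)]  # goes around it

print(cost_weighted_length(cost, straight))  # 6 * sqrt(2), about 8.49
print(cost_weighted_length(cost, detour))    # 2 + sqrt(2), about 3.41
```

In this toy case the geometrically longer detour has the smaller cost-weighted length, which is why a least-cost path need not be a straight line.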

A cost surface is often created from a land cover layer by translating each land cover type into a cost value according to its assumed resistance to the movement of target species. The literature contains many examples of cost evaluation methods (Knaapen et al. 1992; Ray et al. 2002; Chardon et al. 2003; Nikolakaki 2004; Kautz et al. 2006; Driezen et al. 2007; Gonzales and Gergel 2007; LaRue and Nielsen 2008; Spear et al. 2010; Stevenson-Holt et al. 2014; Ziółkowska et al. 2014). However, there are potential problems associated with them. Those include subjectivity in the choice of parameters (Beier et al. 2008; Rayfield et al. 2010; Spear et al. 2010; Zeller et al. 2012; Ligmann-Zielinska and Jankowski 2014) and sensitivity to their changes (Schadt et al. 2002; Larkin et al. 2004; Driezen et al. 2007; Gonzales and Gergel 2007; Rayfield et al. 2010; Murekatete and Shirabe 2018). Nevertheless, if a cost surface is available that simulates the mobility of target species accurately, GIS will enable one to find a least-cost path between any two locations and use its cost-weighted length, which may be called “least-cost distance” (Etherington 2016), as the effective distance between the two locations (Ferreras 2001; Adriaensen et al. 2003; Chardon et al. 2003; Verbeylen et al. 2003; Broquet et al. 2006; Royle et al. 2013; Ziółkowska et al. 2014).

Effective distance is scale dependent, as is the case with other natural phenomena on the Earth’s surface (Wiens 1989, Wu et al. 2002, Liu et al. 2007, Cushman and Landguth 2010 for examples in ecology; Deng et al. 2007, Smith et al. 2019 for examples in geomorphology; Ghaffari 2011, Goulden et al. 2014, Buakhao and Kangrang 2016, Thomas et al. 2007 for examples in hydrology). It is generally known that the cost-weighted length of a least-cost path (Rae et al. 2007; Etherington 2016) as well as its geometric length (Broquet et al. 2006) are affected by the spatial resolution of the input cost surface, partly because some landscape elements are too small to be detected at low (or coarse) resolutions. This, however, does not always justify the acquisition of expensive high-resolution data for the analysis of landscape connectivity, since the most appropriate scale of a relevant ecological process is not necessarily attained by the highest possible resolution of a grid (see Wiens (1989) for a general discussion and Rae et al. (2007) and Cushman and Landguth (2010) for examples). Even if it is, effective distances measured at lower resolutions may suffice if they are similar to, or good predictors of, those measured at higher resolutions.

To see this, imagine two extreme types of cost surfaces, uniform and random, instances of which are here created by assigning each cell of a 1-m-resolution grid a value of 1 and an integer chosen from 1 to 10 with an equal probability, respectively. Suppose that each of these cost surfaces is resampled to resolutions of 3, 9, and 27 m with majority filters of dimensions of 3 × 3, 9 × 9, and 27 × 27, respectively. In this procedure, a filter is placed at each cell on the original cost surface, and the most frequent value within the filter is assigned to that cell, with ties broken randomly. As a result, we have four uniform and four random cost surfaces with four different resolutions. Let us place two points at the centers of the top-left and bottom-right cells of the coarsest cost surface (i.e., one with a cell size of 27 m), designate them as a source and sink, and find a least-cost path between them on each of the original and resampled cost surfaces. On all four uniform cost surfaces, a straight line segment is the least-cost path between the source and sink. On the other hand, different least-cost paths connect the source and sink on the four random cost surfaces (see Fig. 1), and interestingly, their cost-weighted lengths are found to increase more or less linearly with their cell sizes. Is this just a coincidence? Certainly, real landscapes are not completely uniform or random. Yet, we suspect that there may exist some relationships between effective distances measured with different spatial resolutions and that these relationships may be strengthened or weakened depending on how the cost values are distributed and/or disturbed.
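The majority-filter resampling described above might be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: it aggregates non-overlapping k × k blocks (one output cell per block), and the function name `majority_resample` and the seeded random generators are hypothetical.

```python
import random
from collections import Counter


def majority_resample(grid, k, rng=None):
    """Coarsen a grid by a factor k, keeping the most frequent value per block.

    Assumes non-overlapping k x k blocks and grid dimensions that are
    multiples of k; ties are broken at random, as in the text.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed only to make the sketch repeatable
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % k or n_cols % k:
        raise ValueError("grid dimensions must be multiples of k")
    out = []
    for r in range(0, n_rows, k):
        row = []
        for c in range(0, n_cols, k):
            counts = Counter(grid[i][j]
                             for i in range(r, r + k)
                             for j in range(c, c + k))
            top = max(counts.values())
            modes = sorted(v for v, n in counts.items() if n == top)
            row.append(rng.choice(modes))  # random tie-breaking
        out.append(row)
    return out


# A 27 x 27 random cost surface (integers 1 to 10) and a uniform one.
rng = random.Random(42)
fine = [[rng.randint(1, 10) for _ in range(27)] for _ in range(27)]
uniform = [[1] * 27 for _ in range(27)]

coarse3 = majority_resample(fine, 3, rng)   # 9 x 9 cells
coarse9 = majority_resample(uniform, 9)     # 3 x 3 cells, all equal to 1
```

A uniform surface is unchanged in value by this operation, whereas a random surface loses most of its fine-grained variation, which is the contrast the thought experiment relies on.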

Fig. 1

Least-cost paths (colored in green) between a source and sink on random cost surfaces with resolutions of (a) 1 m, (b) 3 m, (c) 9 m, and (d) 27 m. Note that darker shades represent higher cost values

A major hypothesis of this study is that the effective distance between two locations measured on a lower-resolution cost surface can be a useful predictor of the corresponding distance measured on a higher-resolution cost surface. We expect that this hypothesis holds at least when the spatial distribution of cost values does not contain abrupt discontinuities. We do not yet know whether any general characteristics of the underlying distribution of cost values affect the relationships between effective distances of different resolutions, although some of them (e.g., range and grain) may well affect the locations and lengths of individual least-cost paths (see, e.g., Larkin et al. 2004; Schadt et al. 2002; Driezen et al. 2007; Gonzales and Gergel 2007; Murekatete and Shirabe 2018). To address these questions, we designed computational experiments with both synthetic and real landscape data, in which we measured effective distances on cost surfaces derived from these data with varied cell sizes and examined how they differ.

Methods

We conducted two experiments. In the first experiment, we used computer-generated random landscapes of a variety of spatial configurations. This allowed us to obtain a large sample of independent cost surfaces and observe the statistical behavior of effective distance measures in response to the change of spatial resolution. In the second experiment, we used a single instance of real-world geographic data to test the relevance of findings of the first experiment to an actual landscape. For both experiments, we coded algorithms for creation of cost surfaces and search for least-cost paths in the Java programming language and ran them on a 2.80 GHz Intel Core i7-7600U processor with 32.8 GB of RAM. In addition, we used the ArcGIS 10.6 software, the ENVI 5.3 image processing software, and the NLMpy 1.0.0 Python module for data preparation and presentation.

Data

Experiment 1

We first used the NLMpy Python software package (Etherington et al. 2015) to generate a large number of neutral landscape models, which are, as originally introduced by Gardner et al. (1987), computer-generated landscapes encoded in raster format. In particular, with the NLMpy adaptation of the mid-point displacement method (Fournier et al. 1982), we generated 1000 grids of 729 × 729 cells with values ranging from 0 to 1 and having various levels of autocorrelation (Anselin 1995) controlled by a parameter, h, to which random decimal numbers between 0 and 1 were assigned. For ease of presentation and discussion, we assumed that the resolution of all the grids was 1 m.

Then, we converted each of the 1000 neutral landscape models generated in the previous step into a cost surface by reclassifying its decimal numbers to a set of integers defined by four parameters, dist_type, #_of_classes, min_value, and max_value, whose values were also randomly set. dist_type specifies one of the four types of frequency distribution of cost values including uniform, symmetric, left skewed, and right skewed. #_of_classes specifies the number of unique cost values, i.e., the number of land cover types, whose range was set to 3 to 20. min_value and max_value respectively specify the minimum and maximum cost values such that 1 ≤ min_value < max_value ≤ 100.

Next, to each of the 1000 cost surfaces generated in the previous step, we added one linear barrier after another and created two additional cost surfaces: one with one barrier and the other with two barriers. A barrier was generated by randomly selecting two cells, drawing a line segment between them, and creating a buffer around it with a distance of 3 m on each side (i.e., the barrier width = 6 m). Note that, in order to prevent creating disconnected areas (which could result in too many unreachable cells), we enforced a rule that no buffers would intersect the grid boundary.
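The barrier construction described above could be sketched along the following lines. The sketch assumes 1-m cells whose centers sit at integer coordinates and treats a cell as part of the barrier when its center lies within 3 m of the segment; the helper names `point_segment_distance`, `barrier_cells`, and `touches_boundary` and the small 21 × 21 grid are hypothetical.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point (px, py) to segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def barrier_cells(n_rows, n_cols, a, b, half_width=3.0):
    """Cells (row, col) whose centers lie within half_width of segment a-b.

    a and b are (row, col) end cells; x is taken as the column index and
    y as the row index (an assumption of this sketch).
    """
    cells = set()
    for r in range(n_rows):
        for c in range(n_cols):
            if point_segment_distance(c, r, a[1], a[0], b[1], b[0]) <= half_width:
                cells.add((r, c))
    return cells


def touches_boundary(cells, n_rows, n_cols):
    """True if any barrier cell lies on the outer edge of the grid."""
    return any(r in (0, n_rows - 1) or c in (0, n_cols - 1) for r, c in cells)


# Horizontal barrier across the middle of a hypothetical 21 x 21 grid.
barrier = barrier_cells(21, 21, (10, 5), (10, 15))
accepted = not touches_boundary(barrier, 21, 21)  # rule: reject if it hits the edge
```

A candidate barrier that fails the boundary check would simply be discarded and redrawn from a new pair of random cells.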

Finally, we resampled each of the 3000 cost surfaces (i.e., 1000 each with 0, 1, and 2 barriers) generated in the previous step to resolutions of 3, 9, and 27 m by using majority filters of dimensions of 3 × 3, 9 × 9, and 27 × 27, respectively. As a result, we obtained 12,000 cost surfaces, i.e., 12 cost surfaces from each of the 1000 neutral landscape models (see Fig. 2 for an example).

Fig. 2

Twelve cost surfaces with different resolutions and different numbers of linear barriers (colored in red) derived from a common neutral landscape model. Note that darker shades represent higher cost values. Note also that (parts of) some linear barriers have disappeared at lower resolutions

Two notes are in order concerning the assumptions and limitations of our experimental design. First, the barrier width was set to 6 m so that the three resampling filters would effectively simulate the gradual loss of linear barriers with decreasing spatial resolution. In fact, at resolutions of 1 m and 3 m, all barriers were thick enough to correctly block passage (see Fig. 2e, f, i, and j), but at resolutions of 9 m and 27 m, some barriers were so thin as to partially or completely disappear and allow false penetration (see Fig. 2g, h, k, and l). Second, while the neutral landscape models were, by design, generic, the following assumptions were made to give them some ecological context: (1) they were heterogeneous in terms of land covers, some of which were natural (e.g., forest and grassland) and others were manmade (e.g., settlements and roads), (2) they were inhabited by a terrestrial target species, whose mobility varied only with land cover types, and (3) the added barriers were major roads that were not supposed to be crossed by the target species.

Experiment 2

We selected an approximately 15 × 16 km rectangular area of northern Rwanda that lies between two national parks, the Volcanoes National Park (VNP), and Gishwati Mukura National Park (GMNP), as the study area of the second experiment. Both parks are home to various taxa including primates (Grueter et al. 2013), birds (Vande Weghe and Vande Weghe 2011), amphibians (van der Hoek et al. 2019a), and plants (van der Hoek et al. 2019b). The remaining part of the study area is predominantly occupied by human settlements and their surrounding farms for subsistence and cash crops as it is covered with fertile volcanic soil. These land uses have been putting stress on the otherwise “prime area for biodiversity conservation and tourism” (Akinyemi 2017; Kanyamibwa 2013).

As target species, we continued to use the hypothetical terrestrially moving animals. For this particular study area, they might be exemplified by golden monkeys (Cercopithecus mitis kandti), which are half-meter-long primates that feed on bamboo, leaves, and fruits and are found in different types of vegetation, mostly forest and bamboo (Twinomugisha and Chapman 2008). They are listed as an endangered subspecies (of the blue monkey) on the IUCN Red List, and their population and range are declining due to habitat degradation, loss, and fragmentation as a result of human activities (Butynski and de Jong 2020).

To estimate the cost of movement for the target species over the study area, we acquired two satellite images from the data catalogue of Google Earth Engine and used their visible (i.e., red, green, and blue) and near-infrared bands to detect land cover types of the study area. They were a PlanetScope image (Planet 2016) with a resolution of 3 m (captured on August 15, 2019) and a Sentinel-2 MultiSpectral Instrument image (European Space Agency 2015) with a resolution of 10 m (captured between June and September 2019). In addition, we used a digital elevation model (DEM) with a resolution of 30 m created by the Shuttle Radar Topography Mission (Farr et al. 2007) to help identify some vegetation types known to have topographic preferences. All the data were georeferenced to the Universal Transverse Mercator projection on the World Geodetic System 1984. Using the ENVI image analysis software, we applied to the two images a pixel-based land cover classification tool, which was an implementation of a non-parametric machine-learning algorithm called Support Vector Machine (Vapnik 1995) and generated two raster land cover layers (see Fig. 3).

Fig. 3

Land cover layers with resolutions of (a) 3 m and (b) 10 m. Note that the water bodies were too small to be seen

To each land cover type, we assigned a value from 1 to 50 representing the cost per unit distance for moving through that land cover type (Table 1), except that water bodies (which were very few and small) were considered impermeable. This was based on an assumption that the target species would not swim, would move most easily in the forests, and would expend more effort to move over other types of land cover. Accordingly, the 3-m and 10-m land cover layers were converted into two cost surfaces of the respective resolutions, which are referred to as COST3 and COST10, respectively.

Table 1 Land cover types and their cost values

Notice that some human settlements were expected to serve as linear barriers as they were given the highest cost value (50) and densely clustered along major roads, which were assumed to be as costly as human settlements. The linear barriers induced by human settlements and major roads actually had many gaps in them and allowed unlikely passage. This was because major roads were not detected in the satellite images, as many of them were confused with the cropland and bare land surrounding human settlements. To plug these gaps, we used the ArcGIS software to identify all cells intersected by vector line data representing a road network (which was initially digitized from a 1:5000 topographic map of 1988 and updated by Rwanda Land Management and Use Authority based on 25-cm-resolution aerial photographs captured in 2008) and assign them the highest cost value (50). The use of this high but finite cost value did not make the linear barriers completely impermeable but still made them difficult to cross, which seemed to be a realistic assumption for many terrestrial animals such as golden monkeys. The resulting two cost surfaces (with resolutions of 3 m and 10 m) reflected the location of major roads, and are referred to as COST3_V and COST10_V, respectively.

Effective distance measurement

Experiment 1

For each of the 1000 sets of 12 cost surfaces, we randomly selected two locations that coincided with the centers of cells at all four resolutions as a source and sink, with conditions that they would not be inside linear barriers. Then, we applied the shortest path algorithm of Dijkstra (1959) to each cost surface to generate a least-cost path between the source and sink (see Fig. 4 for examples), and measured its cost-weighted length (CWL) as the effective distance between them. In this experiment, effective distances measured on a cost surface with a resolution of 1, 3, 9, and 27 m were referred to as CWL1, CWL3, CWL9, and CWL27, respectively. As a result, we obtained a sample of 1000 sets of CWL1, CWL3, CWL9, and CWL27 values for each case of no barriers, one barrier, and two barriers.
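As an illustration of this step, the following is a minimal sketch of Dijkstra's algorithm on a raster cost surface. It is not the Java implementation used in the experiments: the eight-neighbor move set, the use of `None` to mark inadmissible cells, and the function name `least_cost_distance` are assumptions made here for the example.

```python
import heapq
import math


def least_cost_distance(cost, source, sink, cell_size=1.0):
    """Cost-weighted length of a least-cost path from source to sink.

    cost   -- 2D list of cost values; None marks an inadmissible cell
              (a convention assumed for this sketch)
    source -- (row, col) of the start cell
    sink   -- (row, col) of the end cell
    Returns math.inf if the sink cannot be reached.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    moves = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, (r, c) = heapq.heappop(heap)
        if (r, c) == sink:
            return dist
        if dist > best.get((r, c), math.inf):
            continue  # stale queue entry
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                continue
            if cost[nr][nc] is None:
                continue
            # Step length times the mean cost of the two cells.
            step = cell_size * math.hypot(dr, dc)
            new = dist + step * (cost[r][c] + cost[nr][nc]) / 2.0
            if new < best.get((nr, nc), math.inf):
                best[(nr, nc)] = new
                heapq.heappush(heap, (new, (nr, nc)))
    return math.inf


# Hypothetical 3 x 3 cost surface: going around the costly band is cheaper.
surface = [[1, 1, 5],
           [1, 5, 1],
           [1, 1, 1]]
cwl = least_cost_distance(surface, (0, 0), (2, 2))  # 2 + sqrt(2), about 3.41
```

Running the same function on each resampled version of a surface, with the same source and sink, would yield the CWL1, CWL3, CWL9, and CWL27 values for one sample.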

Fig. 4

Least-cost paths (colored in green) on the 12 cost surfaces with different resolutions and different numbers of linear barriers (colored in red) derived from a common neutral landscape model shown in Fig. 2

Although four of the 12 cost surfaces derived from each neutral landscape model were supposed to be replicas of each other but of different resolutions (Fig. 2), their associated least-cost paths (Fig. 4) were generated independently of each other. This does not mean, however, that their cost-weighted lengths—representing the effective distance between their common source and sink—were as different (or similar) as their forms might suggest, because of the heterogeneity in the underlying distribution of cost values. Also, it is important to note that computationally optimal paths (especially those that went around barriers) would not necessarily be taken (or even recognized) by actual animals, verification of which was beyond the scope of the present experiment.

Then, for each of the 1000 sets of CWL1, CWL3, CWL9, and CWL27 values in each case with no barriers, one barrier, and two barriers, we calculated the ratio of the effective distance measured on each of the lower-resolution cost surfaces (i.e., CWL3, CWL9, or CWL27) to that on the highest-resolution cost surface (i.e., CWL1), which is referred to herein as the ‘accuracy indicator’ of the former against the latter. It takes a value greater than 1 if the former distance overestimates the latter distance (which is assumed to be more accurate), smaller than 1 if the former distance underestimates the latter distance, and equal to 1 otherwise. We analyzed the sampling distributions of those accuracy indicators and also performed a simple linear regression analysis of each accuracy indicator against each of the five parameters characterizing cost surfaces, h, dist_type, #_of_classes, min_value, and max_value (see the subsection Experiment 1 of the Data section), as well as straight-line source-to-sink distance.
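The accuracy indicator and a few summary statistics of its sample could be computed as in the sketch below. The function names, the 0.9 threshold default (echoing the "# (< 0.9)" entry mentioned for Table 2), and the four pairs of distances are hypothetical illustrations, not values from the experiments.

```python
from statistics import mean, stdev


def accuracy_indicator(cwl_low, cwl_high):
    """Ratio of a lower-resolution effective distance to a higher-resolution one.

    > 1 means the lower-resolution distance overestimates, < 1 underestimates.
    """
    if cwl_high <= 0:
        raise ValueError("higher-resolution distance must be positive")
    return cwl_low / cwl_high


def summarize(ratios, threshold=0.9):
    """Mean, sample standard deviation, and count of ratios below a threshold."""
    return {
        "mean": mean(ratios),
        "sd": stdev(ratios),
        "n_below": sum(r < threshold for r in ratios),
    }


# Made-up effective distances for four source-sink pairs.
cwl1 = [100.0, 250.0, 80.0, 400.0]
cwl9 = [104.0, 255.0, 60.0, 408.0]

ratios = [accuracy_indicator(low, high) for low, high in zip(cwl9, cwl1)]
summary = summarize(ratios)
```

In this toy sample three pairs show slight overestimation and one shows strong underestimation, the kind of pattern that a barrier lost at coarse resolution would produce.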

Experiment 2

Assuming that the natural forest patch occupying the northeast of the study area served as a source of dispersion of the hypothetical species, we applied Dijkstra’s algorithm to the COST3 and COST10 surfaces to generate two layers representing effective distances from the source patch, which were referred to as CWL3 and CWL10, respectively. Since, unlike in Experiment 1, the two cost distance surfaces did not align with each other (i.e., did not share grid lines or cell centers), we sampled points from the study area at an equal interval of 150 m, which totaled 9203 points after excluding those within the source patch and water bodies. At each of those sample points, we recorded a pair of CWL3 and CWL10 values and calculated the accuracy indicator CWL10/CWL3. Note that we could have sampled more points at finer intervals (but not finer than the resolution of the CWL10 layer), but we considered that the sample size of 9203 was large enough to capture the variation of the accuracy indicator in this study area.
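Sampling two misaligned distance layers at common points might look like the following sketch. It assumes each layer is described by a grid plus a top-left origin and a cell size, reads the value of the cell containing each sample point, and skips points with no data or a zero distance (as inside the source patch); all names, grids, and coordinates are hypothetical.

```python
def cell_value(grid, origin_x, origin_y, cell_size, x, y):
    """Value of the cell containing map point (x, y), or None if outside.

    (origin_x, origin_y) is the top-left corner of the grid; rows run
    downward from it (a convention assumed for this sketch).
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


def sample_ratios(low, high, xs, ys):
    """Accuracy indicator low/high at every (x, y) in the sample lattice.

    low and high are (grid, origin_x, origin_y, cell_size) tuples.
    """
    ratios = {}
    for y in ys:
        for x in xs:
            v_low = cell_value(*low, x, y)
            v_high = cell_value(*high, x, y)
            if v_low is None or v_high is None or v_high == 0:
                continue  # outside a layer, no data, or inside the source
            ratios[(x, y)] = v_low / v_high
    return ratios


# Two toy distance layers with different cell sizes and shifted origins.
cwl3 = ([[10.0 + c for c in range(10)] for _ in range(10)], 0.0, 30.0, 3.0)
cwl10 = ([[9.0 + 3 * c for c in range(3)] for _ in range(3)], -1.0, 31.0, 10.0)

sampled = sample_ratios(cwl10, cwl3, xs=[7.5, 22.5], ys=[7.5, 22.5])
```

Because the lookup is by map coordinate rather than by cell index, the two layers do not need to share grid lines or cell centers.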

Similarly, from the COST3_V and COST10_V surfaces, we generated two layers of effective distances from the source patch. They were referred to as CWL3_V and CWL10_V, respectively. Then, we recorded a pair of CWL3_V and CWL10_V at each of the 9203 sample points and analyzed the resulting sampling distribution of CWL10_V/CWL3_V.

Results

Experiment 1

The frequency distributions of 1000 sample values of CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 on cost surfaces with no barriers, one barrier, and two barriers are presented in Fig. 5, and their summary statistics are reported in Table 2.

Fig. 5

Frequency distributions of 1000 sample values of CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 on cost surfaces with (a–c) no barriers, (d–f) one barrier, and (g–i) two barriers

Table 2 Summary statistics of 1000 sample values of CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 on cost surfaces with no barrier, one barrier, and two barriers

When there were no barriers, the sampling distribution of CWL3/CWL1 was overall clustered around 1 with a high peak and little deviation but slightly skewed to the right. CWL9/CWL1 and CWL27/CWL1 had similar sampling distributions to that of CWL3/CWL1 but with a little longer right tails. The mean of CWL3/CWL1 was slightly smaller than that of CWL9/CWL1, which was, in turn, slightly smaller than that of CWL27/CWL1. Their differences were found statistically significant by two t tests, one for the difference between the means of CWL3/CWL1 and CWL9/CWL1 (p value < 0.001) and the other for the difference between the means of CWL9/CWL1 and CWL27/CWL1 (p value < 0.001).

When there was one barrier, CWL3/CWL1 had an almost identical distribution to that with no barriers. On the other hand, CWL9/CWL1 extended its distribution to the left, and so did CWL27/CWL1 to an even greater extent (notice in particular the increased # (< 0.9) values in Table 2). Two t tests showed that the mean of CWL3/CWL1 was now significantly larger than that of CWL9/CWL1 (p value < 0.001), but there was no significant difference between the means of CWL9/CWL1 and CWL27/CWL1 (p value = 0.258).

When there were two barriers, all three accuracy indicators generally behaved similarly to those with one barrier, but CWL9/CWL1 and CWL27/CWL1 extended their distributions even further to the left. Two t tests showed that the mean of CWL3/CWL1 was still significantly larger than that of CWL9/CWL1 (p value < 0.001), and there was no significant difference between the means of CWL9/CWL1 and CWL27/CWL1 (p value = 0.824).

The results of the simple linear regression of each of CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 against the six variables, namely the five parameters h, dist_type, #_of_classes, min_value, and max_value and the straight-line source-to-sink distance, showed that the three accuracy indicators had weak linear relationships with h and min_value but not with the other variables (Table 3).

Table 3 R2 (above) and p value (below) for the linear regression of each of CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 against h, dist_type, #_of_classes, min_value, max_value, and straight-line source-to-sink distance

A visual inspection of the regression lines of the three accuracy indicators against straight-line source-to-sink distance found that when there were barriers, CWL9/CWL1 and CWL27/CWL1 tended to vary more for shorter straight-line source-to-sink distances (see Fig. 6 for the case of two barriers), but no such tendency was seen in CWL3/CWL1 (which was overall tightly clustered). To verify this, we divided the sample for each accuracy indicator into two subsamples with varying cutoff distances (50, 100, 200, 300, and 400 m) and performed an F test for the difference between their variances. It showed that there was a distance (somewhere around 200 to 300 m) beyond which the variance of the accuracy indicator dropped significantly (see Table 4 for the case of two barriers).
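The subsample comparison described above can be sketched as a variance ratio. This minimal version returns only the F statistic and its degrees of freedom; obtaining a p value would additionally require the F distribution (for example from a statistics library), which is left out here. The function name, the assignment of pairs at exactly the cutoff to the closer group, and the six data points are hypothetical.

```python
from statistics import variance


def variance_ratio_by_cutoff(distances, ratios, cutoff):
    """F statistic comparing the variance of ratios for near vs. far pairs.

    Pairs with distance <= cutoff form the 'near' subsample (an assumption
    of this sketch); each subsample needs at least two values, and the far
    subsample must have nonzero variance.
    Returns (F, df_near, df_far).
    """
    near = [r for d, r in zip(distances, ratios) if d <= cutoff]
    far = [r for d, r in zip(distances, ratios) if d > cutoff]
    if len(near) < 2 or len(far) < 2:
        raise ValueError("each subsample needs at least two values")
    return variance(near) / variance(far), len(near) - 1, len(far) - 1


# Made-up source-to-sink distances (m) and accuracy indicator values.
distances = [50, 80, 120, 300, 450, 600]
ratios = [0.6, 1.1, 0.8, 0.98, 1.0, 1.02]

f_stat, df_near, df_far = variance_ratio_by_cutoff(distances, ratios, cutoff=200)
```

Repeating the call for each cutoff (50, 100, 200, 300, and 400 m) would show where the variance of the farther subsample becomes markedly smaller than that of the closer one.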

Fig. 6

Plots of 1000 sample values of (a) CWL9/CWL1 and (b) CWL27/CWL1 against straight-line source-to-sink distance in the case of two barriers

Table 4 Variances of each of CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 for closer source-sink pairs and the remaining source-sink pairs in the case of two barriers

Experiment 2

While the sampling distributions of the two accuracy indicators were similar in shape (Fig. 7), CWL10_V/CWL3_V had a slightly greater mean than CWL10/CWL3 (Table 5), a difference that was found to be statistically significant by a t test (p value < 0.001). In fact, CWL10_V/CWL3_V was greater than CWL10/CWL3 at all but 103 sample points. This implies that CWL10_V underestimated CWL3_V to a smaller degree than CWL10 underestimated CWL3.

Fig. 7

Frequency distributions of 9203 sample values of (a) CWL10/CWL3 and (b) CWL10_V/CWL3_V

Table 5 Summary statistics of 9203 sample values of CWL10/CWL3 and CWL10_V/CWL3_V

Many of the accuracy indicator values deviating from their respective means were found to cluster in a short range of straight-line distances from the source (Fig. 8). The large variation caused by them, however, diminished sharply with increasing distance from the source. This was well visualized by the spatial distributions of the sample values of CWL10/CWL3 and CWL10_V/CWL3_V over the study area (Fig. 9).

Fig. 8

Plots of 9203 sample values of (a) CWL10/CWL3 and (b) CWL10_V/CWL3_V against straight-line distance from the source patch. Note that the portion of the plot (containing 14 values) where CWL10/CWL3 > 1.60 is not shown here for ease of illustration

Fig. 9

Spatial distributions of 9203 sample values of (a) CWL10/CWL3 and (b) CWL10_V/CWL3_V

As for CWL10/CWL3, both very high values and very low values were found only near the source patch, roughly within a distance of 1500 m from it. Also, there were two southward-elongated clusters of relatively high and low values in the east of the study area. The rest of the study area was largely homogeneous. In fact, when the study area was divided into two areas, one within 1500 m from the source patch (containing 822 sample points) and the other beyond 1500 m from the source patch (containing 8381 sample points), their means (0.862 and 0.867, respectively) were almost identical, but their standard deviations (0.982 and 0.063, respectively) were very different. An F test for the difference between the corresponding variances found that the difference was statistically significant (p value < 0.001).

The spatial distribution of CWL10_V/CWL3_V was generally similar to that of CWL10/CWL3. In fact, CWL10_V/CWL3_V took the same value as CWL10/CWL3 at all but 48 sample points within a distance of 1500 m of the source patch, because those points could be reached before needing to cross linear barriers and were thus unaffected by them. Elsewhere, CWL10_V/CWL3_V was generally slightly higher than CWL10/CWL3, which explains why, while the elongated cluster of relatively low values in the west had shrunk, the elongated cluster of relatively high values in the west was more pronounced.

Discussion

The major finding of this study was that effective distances measured on lower-resolution cost surfaces are generally strongly related to those measured on higher-resolution cost surfaces. However, whether the former can serve as a useful predictor of the latter depends on other conditions, which are discussed in detail below.

Overestimation and underestimation

Assuming that higher-resolution data enable more accurate effective distance measurement, we have demonstrated that a decrease in spatial resolution generally has two opposite effects: overestimation and underestimation of effective distances. In Experiment 1, the comparison of the sampling distributions of the three accuracy indicators CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 (Fig. 5 and Table 2) suggests that effective distances tend to be longer on lower-resolution cost surfaces if the underlying distribution of cost values is positively spatially autocorrelated (thus smoothly varying) across the study area. We interpret this as a computational artifact as follows. Given a source and destination, a higher-resolution cost surface allows more alternative paths to connect them than a lower-resolution cost surface does because it contains a greater number of cells. This difference in the number of alternative paths makes the least-cost path on the lower-resolution cost surface tend to have a greater cost-weighted length than that on the higher-resolution cost surface. This effect is most pronounced when the underlying distribution of cost values is random, exhibiting a salt-and-pepper pattern (as was the case with the four least-cost paths in Fig. 1). In contrast, no overestimation should occur on completely homogeneous cost surfaces. Autocorrelated surfaces sit between these two extreme spatial patterns, being closer to the latter, which explains the relatively small overestimation observed in Experiment 1.
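The two extreme cases described above can be explored with a minimal sketch. It is not the code used in the experiments: it assumes 8-neighbour moves, a move cost equal to the average of the two cell costs times the centre-to-centre distance, and majority resampling with arbitrary tie-breaking. All function names, the grid size, and the cost values are hypothetical, and the grid is assumed square with a side divisible by the resampling factor.

```python
import heapq
import math
import random
from collections import Counter

def least_cost_length(cost, cell, src, dst):
    """Cost-weighted length of the least-cost path (Dijkstra, 8 neighbours)."""
    rows, cols = len(cost), len(cost[0])
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == dst:
            return d
        if d > best[(r, c)]:
            continue  # stale heap entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    step = cell * math.hypot(dr, dc)
                    nd = d + step * (cost[r][c] + cost[nr][nc]) / 2.0
                    if nd < best.get((nr, nc), math.inf):
                        best[(nr, nc)] = nd
                        heapq.heappush(heap, (nd, (nr, nc)))
    return math.inf

def majority_resample(cost, k):
    """Coarsen by factor k, keeping the most frequent value in each k-by-k block."""
    n_r, n_c = len(cost) // k, len(cost[0]) // k
    return [[Counter(cost[i * k + a][j * k + b]
                     for a in range(k) for b in range(k)).most_common(1)[0][0]
             for j in range(n_c)] for i in range(n_r)]

def accuracy_indicator(fine, k):
    """Ratio of coarse to fine cost-weighted length between opposite corners."""
    coarse = majority_resample(fine, k)
    n, mid = len(coarse), k // 2
    # Fine-grid endpoints are the cells at the centres of the corner blocks,
    # so both paths connect the same two locations.
    far = (n - 1) * k + mid
    cwl_fine = least_cost_length(fine, 1.0, (mid, mid), (far, far))
    cwl_coarse = least_cost_length(coarse, float(k), (0, 0), (n - 1, n - 1))
    return cwl_coarse / cwl_fine

rng = random.Random(0)
uniform = [[1] * 27 for _ in range(27)]
salt_pepper = [[rng.choice([1, 5, 10]) for _ in range(27)] for _ in range(27)]
print(accuracy_indicator(uniform, 3))      # homogeneous: 1 up to rounding
print(accuracy_indicator(salt_pepper, 3))  # random: depends on the seed
```

On the homogeneous surface both paths follow the same diagonal, so the ratio is 1. On a random surface the ratio varies from one realization to another, so a single run should not be read as evidence for or against the tendency; a large sample of surfaces and source-destination pairs, as in Experiment 1, would be needed.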

The effect of overestimation is expected to remain even if there are patches of high cost values, which make the spatial distribution of cost values discontinuous at their boundaries. This is implied by the results of the regression analysis of Experiment 1 in which we varied the level of patchiness using the #_of_classes parameter but did not find it to affect any of the accuracy indicators (Table 3). We give a computational account of this as follows. When patches are sufficiently wide to survive a reduction of spatial resolution, or sufficiently short to be easily dodged, least-cost paths on cost surfaces of different resolutions tend to take similar turns or different but small turns in going around high-cost patches.

The effect of underestimation comes in and dominates the effect of overestimation, however, if the cost surface contains linear barriers formed by long and narrow (rather than short and wide) high-cost or impermeable features such as rivers and roads. From Experiment 1, we have learned that when there are linear barriers, effective distances tend to be shorter on lower-resolution cost surfaces and that this tendency becomes stronger as the number of linear barriers increases. This agrees with a result of the sensitivity analysis by Rae et al. (2007), which found “the dependence of the least-cost inter-patch distance calculations on the presence or absence of costly linear barriers to movement”. Again, this is a computational artifact, because lower-resolution data have higher chances of completely or partially overlooking linear barriers, and the cost surfaces derived from them have greater risks of allowing least-cost paths to take false shortcuts. Such errors would be even more problematic when model outputs are to be utilized for decision making on actual operations or activities, for example, acquisition of land or planting of trees to establish connectivity corridors or stepping stones.
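How a narrow barrier can vanish under coarsening is easy to show on a toy raster. The sketch below assumes majority resampling with a factor of 3 and two hypothetical cost values; it is an illustration of the mechanism, not a reproduction of the experiment.

```python
from collections import Counter

BARRIER, OPEN = 100, 1  # hypothetical cost values

def majority_resample(cost, k):
    """Coarsen by factor k, keeping the most frequent value in each k-by-k block."""
    n_r, n_c = len(cost) // k, len(cost[0]) // k
    return [[Counter(cost[i * k + a][j * k + b]
                     for a in range(k) for b in range(k)).most_common(1)[0][0]
             for j in range(n_c)] for i in range(n_r)]

# 9 x 9 surface with a one-cell-wide barrier running north-south in column 4.
fine = [[BARRIER if c == 4 else OPEN for c in range(9)] for r in range(9)]
coarse = majority_resample(fine, 3)

barrier_in_fine = any(BARRIER in row for row in fine)
barrier_in_coarse = any(BARRIER in row for row in coarse)
print(barrier_in_fine, barrier_in_coarse)  # True False
```

Each 3 × 3 block that the barrier passes through holds three barrier cells and six open cells, so the majority value is the open one and the coarse surface offers an unobstructed crossing that does not exist on the fine surface.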

It is important to note that in Experiment 1, we resampled cost surfaces to create coarser cost surfaces using a method that takes the most frequent value in each sampling filter (as described in the subsection Experiment 1 of the Data section) so that the original and resampled cost surfaces had similar compositions of cost values. This may not be the case with other resampling methods as demonstrated in an experiment by Liu et al. (2007). For example, if we had used a method that takes the mean value in each sampling filter (useful when cost values are to be derived from quantitative attributes rather than categorical ones), the resampled cost surfaces would have had narrower ranges of cost values than the original, which, in turn, would have systematically caused least-cost paths to accumulate more cost on the resampled cost surfaces. In our experiment, we avoided such a bias by assuming that cost values were mapped from categorical values and using the majority filter.
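The contrast between the two resampling rules can be seen on a small example. The grid and cost values below are invented for illustration, and the helper names are hypothetical; the sketch only shows how the range of cost values changes, not how least-cost paths respond.

```python
from collections import Counter

def resample(cost, k, reducer):
    """Coarsen by factor k, applying reducer to the values of each k-by-k block."""
    n_r, n_c = len(cost) // k, len(cost[0]) // k
    return [[reducer([cost[i * k + a][j * k + b]
                      for a in range(k) for b in range(k)])
             for j in range(n_c)] for i in range(n_r)]

def majority(values):
    return Counter(values).most_common(1)[0][0]

def mean(values):
    return sum(values) / len(values)

# Hypothetical 6 x 6 categorical cost surface with cost values 1, 10 and 100.
fine = [
    [1, 1, 1, 100, 100, 100],
    [1, 1, 100, 100, 100, 1],
    [1, 1, 1, 100, 1, 100],
    [10, 10, 10, 1, 1, 1],
    [10, 1, 10, 1, 10, 1],
    [10, 10, 100, 1, 1, 1],
]

by_majority = resample(fine, 3, majority)
by_mean = resample(fine, 3, mean)

def value_range(grid):
    flat = [v for row in grid for v in row]
    return min(flat), max(flat)

print(value_range(by_majority))  # (1, 100): same range as the original
print(value_range(by_mean))      # (2.0, 78.0): narrower than the original
```

A block mean can never fall outside the minimum and maximum of the block, so mean resampling can only preserve or narrow the range, whereas the majority rule returns values that already exist in the original surface.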

Variation

Ideally, effective distances measured on a lower-resolution cost surface would accurately estimate those measured on a higher-resolution cost surface. We argue, however, that overestimation or underestimation is not necessarily a problem if its rate is constant across a study area. For example, the absolute values of effective distances may not be meaningful if cost is measured on a subjective, dimensionless scale, which in practice may be quantified according to expert opinions (e.g., Knaapen et al. 1992; Chardon et al. 2003; Gonzales and Gergel 2007; LaRue and Nielsen 2008; Spear et al. 2010) or take the form of an inverse of “suitability” (e.g., Ferreras 2001; Wang et al. 2008; Chetkiewicz and Boyce 2009; Poor et al. 2012; Trainor et al. 2013; Reding et al. 2013; Ziółkowska et al. 2014).

Results of Experiment 1 support the possibility of constant rates of overestimation/underestimation under certain conditions. The comparison of the sampling distributions of CWL3/CWL1, CWL9/CWL1, and CWL27/CWL1 (Fig. 5 and Table 2) suggests that the variation of the accuracy indicator tends to increase with the difference between the resolutions of its associated cost surfaces and with the number of linear barriers. Interestingly, however, even if the variation of the accuracy indicator is high overall, high variation tends to be concentrated near the source (Table 4). We consider this yet another computational artifact: generally, when two locations are closer to each other in terms of number of cells (or, more correctly, with respect to the cell size as the unit of length), a least-cost path between them comprises fewer cells and its cost-weighted length is thus more affected by the error associated with a single cell. This error propagation mechanism is similar to that of topographic characterization with a raster DEM, in which a terrain attribute (e.g., slope and aspect) of each cell is derived by combining elevation values within its immediate neighborhood (typically limited to nine cells including itself) (see Zhang et al. 1999; Deng et al. 2007, and Smith et al. 2019).
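The dilution of a single-cell error along a longer path can be illustrated with simple arithmetic. The sketch below makes strong simplifying assumptions that are not from the experiments: the path is fixed and made of orthogonal moves, every cell has the same true cost, exactly one cell is misclassified, and the cost values and cell size are hypothetical.

```python
def relative_error(n_cells, true_cost=1.0, wrong_cost=10.0, cell=10.0):
    """Relative change in cost-weighted length when one of n_cells is misclassified."""
    true_length = n_cells * true_cost * cell
    observed_length = true_length + (wrong_cost - true_cost) * cell
    return (observed_length - true_length) / true_length

# The same single-cell error shrinks in relative terms as the path gets longer.
for n in (5, 50, 500):
    print(n, round(relative_error(n), 3))
```

Under these assumptions the relative error is inversely proportional to the number of cells on the path, which is consistent with high variation of the accuracy indicator being concentrated near the source.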

Application

In Experiment 2 with actual satellite imagery, the sampling distribution of CWL10/CWL3 shows that effective distances measured on the 10-m-resolution cost surface underestimated effective distances measured on the 3-m-resolution cost surface (Fig. 7 and Table 5). The compositions of cost values of the two cost surfaces were similar except that the 10-m-resolution cost surface had a larger percentage of cells with the highest cost value than the 3-m-resolution cost surface (Table 1), which should have had the effect of making effective distances longer on the 10-m-resolution cost surface. Thus, a possible cause of the observed underestimation was the presence of linear barriers, which were found in the clusters of human settlements along major roads (Fig. 3), and of false gaps in them due to insufficient spatial resolution. The comparison of the sampling distribution of CWL10_V/CWL3_V with that of CWL10/CWL3 (Fig. 7 and Table 5) seems to support this, as the former was generally greater than the latter, which implies that there were more gaps in the linear barriers on the 10-m-resolution layer than on the 3-m-resolution layer. We acknowledge, however, that the statistical significance of the difference between the means of CWL10/CWL3 and CWL10_V/CWL3_V (p value < 0.001) might need to be questioned, since the 9203 observations of each of CWL3, CWL10, CWL3_V, and CWL10_V were not independent (because all the corresponding paths were derived from the same cost surface).

It was also found that both CWL10/CWL3 and CWL10_V/CWL3_V had fairly high variation. Their standard deviations were both 0.300, much greater than those observed in Experiment 1. This may imply that effective distances measured on the 10-m-resolution cost surface are not strong predictors of effective distances measured on the 3-m-resolution cost surface. For both indicators, however, high variation was seen only in the vicinity of the source patch and quickly diminished with distance from it (Figs. 8 and 9). In fact, the standard deviations of CWL10/CWL3 and of CWL10_V/CWL3_V beyond 1500 m (equivalent to 150 cell sides on the 10-m-resolution cost surface) from the source patch were 0.063 and 0.064, respectively. So, if the study area had been as small as, say, a 150 × 150 grid, we could not have expected CWL10 to be a useful predictor of CWL3. An example by Etherington (2016) can be considered one such case, where least-cost paths on a 21 × 21 cost surface appeared to deviate substantially from those on its original, finer grid.

Lastly, we have seen that both the effects of underestimation/overestimation and the variation of their indicators were more dramatic in Experiment 2 than in Experiment 1. This can be ascribed to the difference between synthetic data and real-world data. In Experiment 1, each of the lower-resolution cost surfaces was a result of resampling of a higher-resolution cost surface, so that the two cost surfaces certainly had different precision, but their accuracy should not be considered different. In Experiment 2, on the other hand, the lower-resolution cost surface and higher-resolution cost surface were generated from different satellite images captured by different devices at different times, so that their levels of accuracy were not expected to be the same.

Conclusions

Application of Geographic Information Systems (GIS) has become commonplace in ecological research. In the context of landscape connectivity, their capability of computing least-cost paths on a raster cost surface is particularly useful for estimation of effective distances. While a GIS takes any cost surface encoded in raster format as input to this function, it is the user’s responsibility to ensure that it has an appropriate spatial resolution, that is, one in accordance with the grain of the ecological process affecting the movement of the target species. Thus, it seems conservative yet reasonable to attempt to obtain data of as high a resolution as possible, considering that an option is available to downsample them. This is, however, not always feasible in real-world applications, because high-resolution data tend to have a high price and high volume, and the availability of financial and computational resources is often limited.

Through computational experiments with neutral landscape models and actual satellite images, we have demonstrated that when certain conditions are met, effective distances measured on a cost surface with a higher resolution are strongly related (or even substantially similar) to those measured on a cost surface with a lower resolution, and it is possible to estimate the former from the latter. These conditions include the absence of linear barriers to dispersal, the availability of ancillary information (e.g., through vector line data) on the location of linear barriers, if present, and a large distance (with respect to the cell size) between locations for which effective distances are to be measured. Still, we acknowledge that the results and findings of our experiments may not be universal or even applicable to any particular species because of the use of synthetic landscapes and/or a hypothetical species and the simplistic assumptions about their relationships.

A practical implication of the results of this study is that if it is known in advance how much detail must be considered in estimation of effective distances and landscape connectivity, it is still important to use data with a spatial resolution high enough to capture the required amount of detail. However, if their benefits are not expected to outweigh their costs substantially, the use of lower-resolution data is worth considering as a cost-effective alternative. Thus, considering that remote sensing data are not the only means for detecting the location of dispersal barriers, local geographic knowledge and information remain critical in the application of GIS to landscape connectivity analysis.

Availability of data and materials

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

CWL:

Cost-weighted length

GIS:

Geographic Information Systems

GNMP:

Gishwati Mukura National Park

VNP:

Volcanoes National Park

References


Acknowledgements

This work was jointly supported by the Swedish Research Council for Sustainable Development Formas (grant number: 942-2015-1513) and the Swedish International Development Cooperation Agency (Sida) (grant number: 51160059-06). The authors thank Theodomir Mugiraneza for obtaining the two satellite images and creating land cover layers. The authors also thank the editor and anonymous reviewers for their valuable and constructive comments on an earlier draft of the article. Any errors remain, of course, the sole responsibility of the authors.

Funding

This work was part of the doctoral study of Rachel Mundeli Murekatete, which was supported by the Swedish International Development Cooperation Agency (Sida) (grant number: 51160059-06). This work was also part of the research project “Drawing with Geography”, which was fully supported by the Swedish Research Council for Sustainable Development Formas (grant number: 942-2015-1513).

Author information

Authors and Affiliations

Authors

Contributions

RMM conceptualized and conducted computational experiments and analyzed the results and was a major contributor in writing the manuscript. TS coded algorithmic procedures for creation of the synthetic data and generation of the least-cost paths and edited earlier drafts of the manuscript. Both authors contributed to the experimental design and interpretation of the results and read and approved the final manuscript.

Authors’ information

Rachel Mundeli Murekatete is a PhD student in Geoinformatics at the KTH Royal Institute of Technology, Sweden. She is also Assistant Researcher at the Centre for Geographic Information Systems and Remote Sensing (CGIS), University of Rwanda. She has a degree in Geoinformation Science for Urban and Regional Planning from the University of Twente, Faculty of Geoinformation Science and Earth Observation (ITC), Netherlands. Her current research focus is on the design, evaluation, and application of GIS-based spatial optimization models and methods for spatial planning practices with emphasis on raster-based models.

Takeshi Shirabe is currently an Associate Professor of Geoinformatics at the Royal Institute of Technology (KTH), Sweden. He has a Ph.D. in City and Regional Planning from the University of Pennsylvania, USA and a habilitation of Geoinformation from the Vienna University of Technology, Austria. His main research area is spatial optimization and its application to planning and design.

Corresponding author

Correspondence to Rachel Mundeli Murekatete.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Murekatete, R.M., Shirabe, T. On the effects of spatial resolution on effective distance measurement in digital landscapes. Ecol Process 10, 50 (2021). https://doi.org/10.1186/s13717-021-00296-3


Keywords