
Exclusion of tourist species from assemblages in ecological studies: a methodological approach using spiders

Abstract

Background

Excluding tourist species from samples is important to avoid bias in community analyses. In practice, however, this is a very difficult task. When the habitat of a species is not known, the methods commonly used by researchers have several shortcomings: first, they exclude not only pseudo-rare species but also genuinely rare species; second, their results depend on the abundance of the sampling; and third, they follow very subjective rules. The aims of this study were: (i) to propose a methodology to detect and exclude habitat-tourist species from the database used to carry out analyses in community ecology studies, (ii) to evaluate how the presence of habitat-tourist species affects the richness estimates, and (iii) to assess the effect of including juvenile spiders in the detection of tourist species and the effect of removing them from the richness estimates.

Results

When the adult + juvenile dataset was considered, twenty-one habitat-tourist species were detected: 8 in forest foliage, 11 in forest leaf litter, and 2 in grassland. When habitat-tourist species were considered with this dataset, richness overestimation was significant in foliage and in leaf litter, and the final slopes of the richness estimation curves were significantly steeper in leaf litter. When only the adult dataset was considered, eight habitat-tourist species were detected: 3 in forest foliage, 4 in forest leaf litter, and just one in grassland. The inclusion of habitat-tourist species in this dataset showed an overestimation of richness, but this was not significant.

Conclusions

The proposed methodology contributes to solving the problem of tourist species, which has been recognized as one of the great problems in biodiversity studies. This study showed that common estimators overestimate species richness when habitat-tourist species are included, which can lead to erroneous conclusions. In addition, this research showed that the inclusion of juveniles (e.g. of spiders) could improve the analysis outputs because it allowed the detection of more habitat-tourist species.

Background

Ecosystems are characterized by an exchange of matter and energy (Willis 1997). Since these systems are open, communities of species from a particular biogeographic region or habitat patch within this region can be expected to interact with communities from surrounding regions or habitats. One kind of interaction arises from dispersal, which is one of the factors that regulate the characteristics of a metacommunity or a particular set of communities (Chase et al. 2020). Individuals of a species need to colonize new places to ensure the long-term persistence of metapopulations (Hanski 1998). As a consequence, some individuals are expected to arrive in habitats or regions that are not suitable for them (Novotný and Basset 2000; Archer 2003; Hadidian 2015).

Several species adapted to a particular biogeographic region or habitat, where they have their normal distribution, may occur in other habitats or regions in a very low abundance. These species are known as “tourist”, “transitory”, or “occasional” species. Tourist species (TS) are classified within the pseudo-rare species. The term pseudo-rare species was coined by Rabinowitz (1981) to refer to species that are rare in only one or a few parts of their distribution range. Pseudo-rare species also include those that are assumed to be the product of sampling bias. It is believed that pseudo-rare species appear due to a continuous flow of immigrants, a phenomenon known as the "mass" or "source-sink" effect (Novotný and Basset 2000; Coddington et al. 2009). Furthermore, it is believed that they are not capable of sustaining populations long-term when isolated from the source populations in the primary habitat (Barlow et al. 2010).

Ribeiro and Borges (2010) coined the term “habitat-tourist species” (HTS) to refer to species that are rare in one habitat but abundant and specialists in another. Examples of HTS are certain herbivorous insects that appear on the foliage of a particular plant species only by accident but do not feed on it (Novotný and Basset 2000). HTS are therefore a particular case of TS.

According to Novotný and Basset (2000), excluding TS from the samples is the first logical step of any community analysis. These authors did not state the reason for removing these species, but they seem to be the first to make this suggestion. However, no methodology to do it properly has been found. According to Barlow et al. (2010), the main methodological obstacle to excluding TS is differentiating them from genuinely rare species (i.e. those that have a small geographic range, small habitat range, or low local density), because this distinction is highly context-dependent and requires detailed biological information that remains unavailable for the vast majority of species. This difficulty in separating TS from genuinely rare species was also noted by Archer (2003).

One of the methods used to exclude TS or pseudo-rare species from the data, or to limit their possible effect on the analyses, is to use richness and diversity indices that are not very sensitive to species that are rare in abundance or in frequency of occurrence in the samples (Magurran 2004). Some of these indices are the Simpson diversity index (Simpson 1949), the Abundance-based Coverage Estimator (ACE), and the Incidence-based Coverage Estimator (ICE) (Chao and Yang 1993; Lee and Chao 1994). Another method, which is often applied, consists of cleaning the data by removing rare species before carrying out the analyses. For example, Edinger and Risk (2000) exclude species with fewer than three occurrences; Duffy-Anderson et al. (2006) exclude species that are not present in at least 5% of the samples; Arnold and Lutzoni (2007), Zhan et al. (2014), and Pinto et al. (2021) remove singletons (species represented by a single individual); and Barlow et al. (2010) exclude species with fewer than 10 records. These methods have several shortcomings. First and foremost, removing species that are rare in abundance (or not considering them, in the case of the indices mentioned above) excludes not only pseudo-rare species but genuinely rare species as well. This implies the risk of losing valuable information on conservation values (Barlow et al. 2010). Second, the results obtained from these methods depend on the abundance of the sampling, and in particular, on the abundance obtained for each species, which can fluctuate from year to year or from season to season (Bonte et al. 2002; Nadal et al. 2018). Third, as noted by Barlow et al. (2010), the methods follow very subjective rules about how many species to remove or where to remove them from (e.g. rare species are removed from the samples or from the entire sample). Archer (2003) and Eggleton et al. (2021) used different techniques to remove TS and HTS, respectively, based on empirical knowledge of the natural history of each species. Although these techniques are more appropriate than the methods mentioned before, they require prior knowledge of the natural history or habitat preferences of all species, and these authors have not suggested a specific technique to assess habitat or region preference when this information is not available.

Some of the ecological indices that may be most affected by the inclusion of HTS are the estimators of richness, since several of these are sensitive to species that are rare in abundance (Magurran 2004). Eggleton et al. (2021) have confirmed this with litter-dwelling beetles and Archer (2003) with wasps and bees. Spiders are a taxon where estimation curves almost never reach an asymptote (Toti et al. 2000; Sørensen 2004; Rico-G et al. 2005). This shortcoming has been attributed to the high species richness of the taxon and to a sampling abundance that is never enough to complete inventories (Toti et al. 2000). It has also been attributed to the discarding of juveniles from the analyses, which wastes a large amount of information (Jiménez-Valverde and Lobo 2006). However, the effect that TS or HTS could have on richness estimates in this taxon has not been assessed.

The Salticidae are the most species-rich family of the order Araneae, with more than 6000 described species (World Spider Catalog 2022). They hunt during the day by jumping on their prey (i.e. without using webs) and occupy a wide variety of habitats (Cumming and Wesołowska 2004; Foelix 2011; Nadal et al. 2018). Herein, a spider assemblage is presented as a case study in order to: (i) propose a methodology to detect and eliminate HTS from the database used to carry out analyses in community ecology studies, particularly analyses of assemblages of a given taxon; (ii) evaluate how the inclusion/exclusion of HTS affects richness estimates; and (iii) assess the effect of including juvenile spiders in the detection of HTS and the effect of removing them from the richness estimates. Regarding the second aim, an overestimation of richness is expected when HTS are included. Regarding the third aim, a higher number of HTS and richness estimates closer to the asymptote are expected when juveniles are included.

Materials and methods

Study area

The study area was located in the Chaco Biogeographic Province of Argentina, specifically in the Eastern Chaco District (Arana et al. 2021). Four sites were selected in this area: two protected natural parks and two unprotected semi-natural sites located between them (Fig. 1). The two protected areas selected were the Chaco National Park (CNP) (26° 50' S, 59° 48' W), which extends through the Presidencia de la Plaza and the Sargento Cabral departments, and the Pampa del Indio Provincial Park (PIPP) (26° 16' S, 59° 58' W), located in the Libertador General San Martin department. The two unprotected areas were the semi-natural site 1 (SN1) (26° 34' S, 59° 46' W), located in the Sargento Cabral department, and the semi-natural site 2 (SN2) (26° 27' S, 59° 59' W), located in the Veinticinco de Mayo department. The map in Fig. 1 was made with QGIS version 3.16.7 (QGIS Development Team 2021) and the satellite images were downloaded from SAS Planet (SAS.Planet Development Team 2019).

Fig. 1 Study area in Eastern Chaco District of Chaco Biogeographic Province, Argentina. PIPP Pampa del Indio Provincial Park, CNP Chaco National Park, SN1 semi-natural site 1, SN2 semi-natural site 2

The main plant communities of the Eastern Chaco District consisted of xerophytic forests of Schinopsis balansae, Aspidosperma quebracho-blanco, Bulnesia spp., and other species (Cabrera 1971, 1976). However, these have lost quality due to the selective logging of hardwood trees, mainly S. balansae, conducted since the nineteenth century (Cabrera 1971; Morello and Rodrigues 2009). Additionally, in the last four decades, the forests have lost coverage due to clearcutting (Morello and Rodrigues 2009). The remaining communities are represented by riparian forests, shrublands, palm groves, grasslands, savannahs, and other plant communities associated with wetlands (Cabrera 1971, 1976; Morello and Rodrigues 2009; Arana et al. 2021). There is evidence that grassland coverage has decreased in Chaco Province as a consequence of the introduction of cattle, which has favored shrub encroachment via selective grazing (Grau et al. 2015).

The climate of the Eastern Chaco District according to the Köppen-Geiger climate classification (Peel et al. 2007) is humid subtropical without dry seasons and with hot summers. The mean annual temperature follows a N-S gradient, ranging from approximately 23 °C to 24 °C in the north and 18 °C to 19 °C in the south (Morello et al. 2012). The climate, in general, is more humid than in the other districts, with precipitation increasing from west to east. Thus, the mean annual rainfall in the west (the border region with the Western Chaqueño District or Chaco Seco) is around 750 mm yr−1, while in the east it is approximately 1300 mm yr−1 (Cabrera 1971; Morello et al. 2012; Arana et al. 2021).

Fieldwork

Sampling was conducted seasonally: from March 6 to 9 of 2017 (summer), from August 7 to 10 of 2017 (winter), from December 4 to 7 of 2017 (spring), and from May 28 to 31 of 2018 (autumn). The total sampling comprised 720 samples (4 dates × 4 sites × 3 transects × 5 samples per transect × 3 techniques). Samples were never pooled.

Sampling was conducted systematically, covering three types of habitats to achieve the greatest representativeness of the Salticidae species: forest foliage, forest leaf litter, and grassland. In forests and grasslands, three 200-m transects were drawn, separated by a distance of no less than 2 km to increase the representativeness of spider species. A sample was taken every 50 m from the grassland transects. Two samples were taken every 50 m from the forest transects, one for leaf litter and one for foliage. Each leaf litter sample was taken before the foliage sample to avoid contamination of the leaf litter samples with spiders that might have fallen from foliage. The presence of HTS as a sampling artifact is therefore unlikely in this study. The distance of 50 m between samples was established to avoid problems of non-independence of the samples and follows the sampling standards for spiders in northeastern Argentina (Avalos et al. 2018; Nadal et al. 2018; Achitte-Schmutzler et al. 2022).

Three collection techniques were applied in this study. In the forest, epigeal spiders were obtained from leaf litter samples from an area of 1 × 1 m each. The leaf litter samples were passed through a sieve with a 1 cm mesh opening onto a white canvas. Foliage spiders were collected by beating the vegetation, with 15 beatings on the shrubby vegetation and the lower portion of the tree stratum up to a height of 2.5 m, and collecting the material on a 2 × 2 m white canvas. In grasslands, samples were taken by vacuuming, using a G-Vac garden vacuum cleaner (NIWA SNW260, China). For each sample, the vegetation was sucked into the vacuum in an area of 2 × 2 m for a designated time of 1 min. The material collected with the vacuum cleaner was scattered on a white canvas. In the three sampling types, once the spiders were detected on the canvas, they were captured with entomological forceps and placed in bottles with 70% ethyl alcohol. All samples were taken during daytime hours (from 8 AM until 5 PM), when Salticidae spiders are more active. The techniques applied herein are similar to those employed in Avalos et al. (2018), Nadal et al. (2018), and Achitte-Schmutzler et al. (2022).

Laboratory work

Adult and juvenile spiders were identified using the information provided by the genitalia or the habitus of each specimen, respectively. The information provided by the habitus is the general appearance of the specimen, including the shape and color of the body parts. Adult specimens were identified to the genus or species level using the original descriptions available in the World Spider Catalog (2022) and the Metzner Catalog of Salticidae Spiders (Metzner 2022). Juvenile specimens were identified to the genus or species level based on the habitus of adults and a set of supporting evidence or information, as follows: (i) geographical distribution of the potential species, (ii) occurrence of juvenile and adult individuals in the same sample, and (iii) comparison with specimens previously collected in the Eastern Chaco District. The specimens that could not be identified, not even as morphospecies, were excluded from the analyses, e.g. (i) juveniles at a very early stage of development and juveniles of two Gastromicans species that had very similar habitus; and (ii) adult males of the genus Gastromicans, which could not be differentiated since their palps (male genitalia) and chelicerae were very similar. Unidentified specimens were presented in the species list as “not determinable” and “Gastromicans spp.”, and they represented 8% of the total.

For the identification of the material, different stereoscopic microscopes available at the Biología de los Artrópodos Laboratory of the Facultad de Ciencias Exactas y Naturales y Agrimensura (FACENA), Universidad Nacional del Nordeste (UNNE) were used: Olympus SZ51, Olympus SZ40, Leica ES2 and Leica EZ4. The identified material was deposited in the CARTROUNNE collection (Facultad de Ciencias Exactas y Naturales y Agrimensura, Corrientes, Argentina) and the IBSI-Ara collection (Instituto de Biología Subtropical, Misiones, Argentina).

Data analysis

The following analyses were performed separately with adult and adult + juvenile datasets:

(i) In order to detect HTS, the habitat (or habitats) to which the species should be adapted or be resident (sensu Archer 2003 and Eggleton et al. 2021) was assessed by analyzing the habitat specificity of the species. For this purpose, the IndVal index (Dufrêne and Legendre 1997) was used with the addition of the square root by De Cáceres and Legendre (2009) and the combinations of groups proposed by De Cáceres et al. (2010), through the following formula:

$$\sqrt{{\mathrm{IndVal}}^{\mathrm{g}}}=\sqrt{{\mathrm{A}}^{\mathrm{g}}\times \mathrm{B}}=\sqrt{\frac{{\mathrm{a}}_{\mathrm{C}}^{\mathrm{g}}}{{\mathrm{a}}^{\mathrm{g}}}\times \frac{{\mathrm{n}}_{\mathrm{C}}}{{\mathrm{N}}_{\mathrm{C}}}}$$
$${\mathrm{a}}_{\mathrm{C}}^{\mathrm{g}}=\frac{\mathrm{N}}{\mathrm{k}}\sum_{i\in \mathrm{C}}\frac{{\mathrm{a}}_{i}}{{\mathrm{N}}_{i}}$$
$${\mathrm{a}}^{\mathrm{g}}=\frac{\mathrm{N}}{\mathrm{k}}\sum_{i\in \mathrm{K}}\frac{{\mathrm{a}}_{i}}{{\mathrm{N}}_{i}}$$

where \(\sqrt{{\mathrm{IndVal}}^{\mathrm{g}}}\) is the indicator value of a species in a group (in this paper, a habitat) of sites (in this paper, samples), or the indicator value of a species in a combination of groups; \({\mathrm{A}}^{\mathrm{g}}\) is the specificity calculated with combinations of groups, that is, the average abundance of a species in the sites of a group or a combination of groups compared to all groups in the study; B is the fidelity, that is, the relative frequency of occurrence of a species in the sites of a group; a is the sum of abundances of the target species over all sites; \({\mathrm{a}}_{i}\) is the sum of abundances of the target species over all sites in site group i; C is a set of c site groups forming a particular group of sites; K is the set of all k site groups; n is the number of sites where the target species occurs; N is the number of sites in the data set; and \({\mathrm{N}}_{i}\) is the number of sites belonging to site group i. The subscript C in \({\mathrm{n}}_{\mathrm{C}}\) and \({\mathrm{N}}_{\mathrm{C}}\) restricts these counts to the sites belonging to C. \({\mathrm{A}}^{\mathrm{g}}\) is maximum (1 or 100%) when a species is only present in a group or only in a group association. B is maximum when a species is present at all sites in a group. The IndVal index is maximum when individuals of a species are observed at all sites of a single group or at all sites of a group association.
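The formula can be illustrated with a minimal Python sketch for a single species. This is not the indicspecies implementation used in the study; the function name `indval_g`, its arguments, and the abundance values are hypothetical and chosen only to make the arithmetic visible.

```python
import math

def indval_g(abundances, labels, combo):
    """Return (A_g, B, sqrt_indval) for one species.

    abundances: abundance of the species at each site (sample)
    labels:     site group (habitat) of each site, in the same order
    combo:      set of site groups forming the combination C
    """
    groups = sorted(set(labels))
    N, k = len(labels), len(groups)
    # a_i / N_i: mean abundance of the species in site group i
    mean_abund = {}
    for g in groups:
        values = [a for a, lab in zip(abundances, labels) if lab == g]
        mean_abund[g] = sum(values) / len(values)
    a_C = (N / k) * sum(mean_abund[g] for g in combo)
    a_all = (N / k) * sum(mean_abund.values())
    if a_all == 0:
        return 0.0, 0.0, 0.0  # species absent from every site
    A = a_C / a_all  # specificity
    in_C = [a for a, lab in zip(abundances, labels) if lab in combo]
    B = sum(1 for a in in_C if a > 0) / len(in_C)  # fidelity
    return A, B, math.sqrt(A * B)

# Invented data: 4 samples per habitat, species mostly found in foliage.
labels = ["foliage"] * 4 + ["leaf litter"] * 4 + ["grassland"] * 4
abundances = [3, 0, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0]
A, B, value = indval_g(abundances, labels, {"foliage"})
print(round(A, 3), B, round(value, 3))  # 0.857 0.75 0.802
```

Because the factor N/k appears in both the numerator and the denominator of the specificity, it cancels, so the specificity reduces to a ratio of summed group means.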

The IndVal index was computed in R version 4.0.2 (R Core Team 2020) using the indicspecies and stats packages, with a significance level of 0.05 and 999 permutations. The IndVal index is normally used to evaluate the preference of a species for a characteristic (Dufrêne and Legendre 1997; De Cáceres and Legendre 2009), so that a species with a high IndVal value (e.g. > 0.70) can serve as an indicator of that condition. For this purpose, one characteristic often considered is habitat disturbance (Larrivée et al. 2008). Thus, if a disturbance indicator species is found at a place, that place is probably disturbed. The combination of groups is an extension of the original index but its use is optional.
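The role of the 999 permutations can be sketched as follows. This is a generic label-permutation test written in Python under stated assumptions, not the routine of the indicspecies package, whose details may differ. The helper names, the seed, and the data are hypothetical, and the statistic is restricted to a single site group for brevity.

```python
import math
import random

def sqrt_indval_single(abundances, labels, target):
    """Square-root IndVal of one species for a single site group."""
    groups = set(labels)
    mean = {g: sum(a for a, lab in zip(abundances, labels) if lab == g)
               / labels.count(g) for g in groups}
    total = sum(mean.values())
    if total == 0:
        return 0.0
    in_target = [a for a, lab in zip(abundances, labels) if lab == target]
    specificity = mean[target] / total
    fidelity = sum(1 for a in in_target if a > 0) / len(in_target)
    return math.sqrt(specificity * fidelity)

def permutation_p_value(abundances, labels, target, n_perm=999, seed=0):
    """Shuffle habitat labels among sites and count how often the permuted
    statistic is at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    observed = sqrt_indval_single(abundances, labels, target)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if sqrt_indval_single(abundances, shuffled, target) >= observed:
            hits += 1
    # The observed arrangement counts as one of the n_perm + 1 outcomes.
    return observed, (hits + 1) / (n_perm + 1)

# Invented data: a species found in every foliage sample and nowhere else.
labels = ["foliage"] * 10 + ["leaf litter"] * 10
abundances = [5] * 10 + [0] * 10
observed, p_value = permutation_p_value(abundances, labels, "foliage")
print(observed, p_value)
```

With 999 permutations the smallest attainable p-value is 1/1000, which is why the number of permutations bounds the resolution of the test.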

In this study, the aim was not to identify indicator species but only to determine the specificity of species for habitats. Therefore, species adapted to a certain habitat (or combination of habitats) were those that showed a significant IndVal value (p < 0.05) (hereafter termed IndVal-significant species) and a specificity value (component A of the IndVal index) greater than 0.75. Only the specificity component was considered for two reasons: (1) The fidelity component of the IndVal index only makes sense when species will be used as indicators in the field, because indicator species have to be easily found in the group that they indicate. If the only aim is to know the habitat specificity of a particular species, the fidelity component is not useful. (2) It is difficult for all species related to a particular habitat to be homogeneously distributed and occur in most samples, which is a requirement to achieve high fidelity (see formula above). Therefore, excluding the fidelity component allows the proposed methodology to be used with species without a homogeneous distribution (e.g. grouped distribution). The specificity component of the IndVal index is similar to the Species Specialization Index (SSI) (Julliard et al. 2006), as both account for the degree of habitat specialization. However, the SSI differs from the specificity component of IndVal in that it computes the variation in average densities across habitat classes via the standard deviation. Although SSI is a powerful index, its disadvantage relative to the specificity component of IndVal is that it does not vary from zero to one. Therefore, a standard threshold value of high specialization, as established herein with a value of 0.75, cannot be established with SSI.

(ii) Those species that were adapted to another habitat (i.e. species considered residents of another habitat) were removed from the target habitat data, on the assumption that species occurring in a habitat to which they are not adapted were HTS.
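The decision rule of step (i) and the removal of step (ii) can be sketched in Python as follows. The species codes, the IndVal results, and the abundance records are invented for illustration, and the function names are hypothetical. Only the thresholds (p < 0.05, specificity > 0.75) come from the text.

```python
ALPHA = 0.05            # significance level stated in the text
MIN_SPECIFICITY = 0.75  # specificity threshold stated in the text

def resident_habitats(indval_results):
    """Map each habitat-adapted species to the set of habitats it is
    associated with (a combination of groups yields several habitats)."""
    residents = {}
    for species, (habitats, specificity, p_value) in indval_results.items():
        if p_value < ALPHA and specificity > MIN_SPECIFICITY:
            residents[species] = set(habitats)
    return residents

def remove_tourists(records, residents):
    """Split (species, habitat, count) records into kept records and
    habitat-tourist records."""
    kept, tourists = [], []
    for species, habitat, count in records:
        home = residents.get(species)
        if home is not None and habitat not in home:
            tourists.append((species, habitat, count))  # resident elsewhere
        else:
            kept.append((species, habitat, count))
    return kept, tourists

# Hypothetical IndVal output: species -> (habitats, specificity A, p-value)
indval_results = {
    "sp_A": (("foliage",), 0.95, 0.001),
    "sp_B": (("leaf litter", "grassland"), 0.90, 0.010),
    "sp_C": (("grassland",), 0.60, 0.020),  # specificity too low
    "sp_D": (("foliage",), 0.99, 0.200),    # not significant
}
records = [
    ("sp_A", "foliage", 40), ("sp_A", "leaf litter", 2),
    ("sp_B", "grassland", 15), ("sp_B", "leaf litter", 9),
    ("sp_B", "foliage", 1), ("sp_C", "foliage", 3),
    ("sp_D", "grassland", 1),
]
kept, tourists = remove_tourists(records, resident_habitats(indval_results))
print(tourists)  # [('sp_A', 'leaf litter', 2), ('sp_B', 'foliage', 1)]
```

Species without a significant, highly specific association (sp_C and sp_D here) are never flagged, so genuinely rare species with no detectable home habitat remain in the data.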

(iii) Once these data were cleaned, richness estimation analyses were carried out with the EstimateS software version 9.1.0 (Colwell 2013) in order to show how the presence of HTS affected the richness estimates. Species richness was estimated for each of the four sites, separately for each type of habitat: foliage, leaf litter, and grassland. To assess the effect of HTS, this analysis was performed both including and excluding HTS. The following non-parametric estimators were used: ACE (Chao and Yang 1993; Lee and Chao 1994; Chazdon et al. 1988; Magurran 2004; Gotelli and Colwell 2011), ICE (Chao and Yang 1993; Lee and Chao 1994; Chazdon et al. 1988; Gotelli and Colwell 2011), Jackknife1 (Burnham and Overton 1978, 1979; Heltshe and Forrester 1983), Jackknife2 (Burnham and Overton 1978; Smith and Belle 1984; Gotelli and Colwell 2011), Chao1 (Chao 1984, 1987; Magurran 2004; Gotelli and Colwell 2011), and Chao2 (Colwell and Coddington 1994; Gotelli and Colwell 2011). The last two estimators were used in their bias-corrected versions. ACE and Chao1 are based on abundance, while ICE, Jackknife1, Jackknife2, and Chao2 are based on presence-absence. The difference between observed and estimated richness was also calculated, and the values obtained were compared with a parametric (t-test) or a nonparametric (Kruskal–Wallis) test depending on whether the data subset followed a normal or a non-normal distribution, respectively. The distribution of the data was evaluated with the Shapiro–Wilk test. In addition to the estimated and observed richness values, estimation curves were plotted. These curves were expressed as a function of the number of samples. The values of the indices and the observed richness used to plot the curves were obtained from the mean of 100 randomizations. To compare the curves more rigorously, the value of the final slope (between the last 10 samples of the abscissa) of each curve was calculated. This calculation has been used in other works (e.g. Hortal and Lobo 2005). The values of the slopes were compared using the same tests mentioned above.
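To make the sensitivity of these estimators to rare species concrete, the Python sketch below computes three of them (bias-corrected Chao1, bias-corrected Chao2, and Jackknife1) from a small sample-by-species matrix, with and without one tourist species. The formulas are the standard published forms. The study itself used EstimateS, and the matrix, the function names, and the tourist column are invented for illustration.

```python
def summarize(matrix):
    """matrix: rows are samples, columns are species counts."""
    m = len(matrix)
    n_species = len(matrix[0])
    totals = [sum(row[j] for row in matrix) for j in range(n_species)]
    incidences = [sum(1 for row in matrix if row[j] > 0)
                  for j in range(n_species)]
    s_obs = sum(1 for t in totals if t > 0)
    f1, f2 = totals.count(1), totals.count(2)          # singletons, doubletons
    q1, q2 = incidences.count(1), incidences.count(2)  # uniques, duplicates
    return m, s_obs, f1, f2, q1, q2

def chao1_bc(matrix):
    """Bias-corrected Chao1 (abundance-based)."""
    _, s_obs, f1, f2, _, _ = summarize(matrix)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def chao2_bc(matrix):
    """Bias-corrected Chao2 (incidence-based)."""
    m, s_obs, _, _, q1, q2 = summarize(matrix)
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))

def jackknife1(matrix):
    """First-order jackknife (incidence-based)."""
    m, s_obs, _, _, q1, _ = summarize(matrix)
    return s_obs + q1 * (m - 1) / m

# Invented data: columns are species A, B, C and a habitat tourist T
# that occurs as a single individual in one sample.
with_hts = [
    [3, 1, 0, 0],
    [2, 0, 1, 0],
    [4, 1, 0, 1],
    [1, 0, 0, 0],
]
without_hts = [row[:3] for row in with_hts]  # drop the tourist column

print(chao1_bc(with_hts), chao1_bc(without_hts))      # 4.5 3.0
print(chao2_bc(with_hts), chao2_bc(without_hts))      # 4.375 3.0
print(jackknife1(with_hts), jackknife1(without_hts))  # 5.5 3.75
```

In this toy case a single tourist individual raises the observed richness by one species but raises each estimate by more than one, because it also adds a singleton and a unique to the counts that drive the correction terms.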

Results

Association of the species with the assessed habitats

A total of 1,697 spiders from the Salticidae family were collected, 27% of them in the adult stage. Of the 69 species identified, 59 were represented by adults (Additional file 1: Table S1). Considering adults + juveniles, the IndVal analysis identified associations with the three groups tested and with one combination: foliage, leaf litter, grassland, and leaf litter + grassland. Within these groups, the analysis showed 41 IndVal-significant species: 17 for foliage, two for leaf litter, 21 for grassland, and one for leaf litter + grassland. Considering only adults, the IndVal analysis identified associations with the three groups tested: foliage, leaf litter, and grassland. Within these groups, the analysis showed 27 IndVal-significant species: 11 for foliage, two for leaf litter, and 14 for grassland (Table 1).

Table 1 IndVal-significant species in three habitats of the studied area, Eastern Chaco District, Argentina

Detection of HTS

When adults + juveniles were considered, eight HTS were detected in foliage, 11 in leaf litter, and two in grassland. In contrast, when only adults were considered, three HTS were detected in foliage, four in leaf litter, and one in grassland (Table 2).

Table 2 HTS in the studied area, Eastern Chaco District, Argentina

Estimation of species richness with and without HTS

In all cases where HTS were present, there was an overestimation of species richness. Analysis of the adult + juvenile dataset revealed an overestimation of species richness, mainly in leaf litter, where it was highest in SN1 (67%), followed by SN2 (52%), CNP (50%), and finally PIPP (45%). In foliage, the overestimation of richness was highest in SN2 (28%), followed by PIPP (24%) and SN1 (22%), while in grassland the highest overestimation occurred in PIPP (13%), followed by CNP (6%). Analysis of the adult dataset also showed an overestimation of species richness, especially in leaf litter, with an overestimation of up to 52% in SN2, followed by 41% in SN1. In foliage, richness overestimation was highest in PIPP and SN1 (both with 36% overestimation), followed by CNP (19%) and finally SN2 (11%). In grassland, the overestimation was up to 10% in PIPP (Tables 3 and 4, Appendix Figs. 2, 3, 4, 5, 6, 7, and 9). Richness overestimation due to the inclusion of HTS was significant in foliage and leaf litter when the adult + juvenile dataset was considered (Additional file 1: Table S2). The final slopes of the curves that included HTS were steeper than those that excluded HTS in all cases (Appendix Figs. 2, 3, 4, 5, 6, 7, and 9, Additional file 2: Tables S4 and S5). These differences were significant in leaf litter when the adult + juvenile dataset was considered (Additional file 1: Table S3).

Table 3 Estimation of richness with HTS in the studied area, Eastern Chaco District, Argentina
Table 4 Estimation of richness without HTS in the studied area, Eastern Chaco District, Argentina

Estimation of species richness considering adults + juveniles or adults

With few exceptions, richness estimates were higher when adults + juveniles were considered than when only adults were considered (Tables 3 and 4). However, the difference in richness overestimation between the two datasets was significant only in leaf litter when HTS were included (Additional file 1: Table S4). The richness estimation curves did not reach an asymptote in most cases. However, there were several exceptions, such as the leaf litter of CNP with adult and adult + juvenile datasets without HTS, and the grassland of SN2, where asymptotes were obtained from the adult dataset (Appendix Figs. 2D, 3C–D and 7E–F).

In grassland and in foliage with HTS, and in leaf litter without HTS, the curves made with the adult dataset were further from reaching the asymptote than the curves made with the adult + juvenile dataset (Appendix Figs. 2, 3, 4, 5, 6, 7, and 9, Additional file 1: Table S5). However, these differences were not significant (Table S6). In contrast, in leaf litter with HTS and in foliage without HTS, the final slopes of the curves were steeper when adults + juveniles were considered (Figs. 2, 3, 4, 5, 6, 7, and 9, Additional file 1: Table S5). In leaf litter with HTS, these differences were significant (Additional file 1: Table S6).

Discussion

This work proposes a simple methodology to detect and eliminate HTS from the database before performing any kind of analysis of assemblages of a given taxon. The proposed methodology relies on a well-known index, the Indicator Value Index or IndVal. As an example, it was shown how HTS affect the richness estimates in spider assemblages of the Salticidae family in one study area of Eastern Chaco, Argentina. Knowledge of the primary habitat of these species made it possible to detect them and remove them from the habitats in which they appeared as HTS. Analyses of richness estimates with and without HTS showed that including HTS led to an overestimation of richness, although not always significantly.

According to Eggleton et al. (2021), HTS are expected to be found among highly dispersive taxa. Spiders of the Salticidae family are able to disperse by various mechanisms. One of the most important is ballooning (Horner 1975), which consists of dispersal by air (Richman and Jackson 1992). The presence of HTS in the studied habitats could be related to dispersal by this mechanism. The higher number of HTS in leaf litter could be because, in addition to this type of dispersal, spiders may fall from the foliage. Although there seem to be no reports on foliage-dwelling spiders falling to the ground, it can be deduced that this process does occur, since one of the widely used collection techniques for foliage-dwelling spiders is foliage beating, which makes the spiders fall to the ground (Nadal et al. 2018). Therefore, this could occur naturally due to the effect of gravity or the movement of foliage by the wind.

The richness overestimates obtained by including HTS herein agree with the findings of Archer (2003) for wasps and bees and of Eggleton et al. (2021) for beetles. These results have important implications for conservation policies because it is very common to use species richness and other diversity indices as indicators of habitat quality (Avalos et al. 2018; Nadal et al. 2018; Nether et al. 2019; Pinto et al. 2021). In this sense, HTS could increase species richness in habitats that are not necessarily of good quality and lead to wrong conclusions. The steeper curves obtained by Jiménez‐Valverde and Lobo (2006) when considering only adult spiders instead of adults + juveniles do not match what was found herein, as the only significant difference found here pointed in the opposite direction. However, those authors worked with the family of crab spiders (Thomisidae), which could partly explain the differences.

One of the limitations of the proposed methodology is that the detection of IndVal-significant species by the IndVal index requires the species to be abundant enough (Dufrêne and Legendre 1997). Julliard et al. (2006) detected the same problem with the SSI index. However, this drawback is offset by the low probability that a species that is rare in one habitat will appear as HTS in another, due to its lower abundance. Despite the aforementioned limitation, this is the first study to propose an unbiased way of detecting habitat-tourist species, by discriminating them from the rest of the data. To our knowledge, Barlow et al. (2010) were the only authors who proposed a numerical way to treat TS. However, their methodology retains the drawbacks inherent to other methods, which they themselves criticize and which are also criticized herein. Thus, these authors propose to exclude species with fewer than 10 records in the sample, but they do not properly distinguish genuinely rare species from TS. Archer (2003) and Eggleton et al. (2021) used more appropriate procedures than those of previous authors, based on knowledge of the habitat preference of the species, but not in a very objective way.

Conclusions

The methodology proposed in this work contributes to solving the problem of HTS, which was recognized a long time ago but for which an effective solution had not been found. This method corrects the bias of other authors' methods, in which less abundant species are eliminated using subjective rules. However, it can be used as a complement to methods that rely on empirical knowledge of the natural history or habitat preference of species. In this sense, such empirical knowledge could provide a second confirmation of whether a species is an HTS or TS.

This study demonstrates that estimators overestimate species richness when HTS are included, which can lead to erroneous conclusions in ecological studies. This overestimation reinforces the importance of removing these species before performing ecological analyses. Contrary to expectations, including juveniles did not make the richness estimation curves more asymptotic than excluding them. However, this study showed that more HTS can be detected with the adult + juvenile dataset than with the adult dataset. Thus, the inclusion of juveniles could improve the analyses, at least in spiders. Although the proposed methodology focused on HTS and how they affect richness estimation, it can be applied to TS at broader scales (e.g. at the level of biogeographic regions), to other community ecology analyses (e.g. alpha and beta diversity indices), or to other single or multiple taxa. In this sense, it is recommended to analyze species preference for regions with the same procedure used herein to evaluate habitat preference. Likewise, it is recommended to remove HTS or TS before analyzing diversity indices in a particular habitat or region, respectively. Regarding future prospects, knowing the identity of HTS can be very useful in upcoming studies that assess the effect of habitat fragmentation, since more HTS are expected the more fragmented the habitats are.
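As an illustration of why HTS can inflate richness estimates, the sketch below compares an estimate computed with and without a set of flagged species. It is a toy under stated assumptions: it uses the bias-corrected Chao1 estimator only because it is compact, not ACE or ICE as listed in the Abbreviations, and the species names, counts, and HTS flags are hypothetical.

```python
def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    present = [c for c in counts if c > 0]
    f1 = sum(1 for c in present if c == 1)  # species recorded once (singletons)
    f2 = sum(1 for c in present if c == 2)  # species recorded twice (doubletons)
    return len(present) + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical abundances of nine species in one habitat.
abundance = {"sp_a": 12, "sp_b": 9, "sp_c": 7, "sp_d": 5, "sp_e": 2,
             "sp_f": 1, "sp_g": 1, "sp_h": 1, "sp_i": 1}
# Hypothetical set of species flagged as habitat tourists in this habitat.
flagged_hts = {"sp_g", "sp_h", "sp_i"}

kept = {sp: n for sp, n in abundance.items() if sp not in flagged_hts}

est_with = chao1(abundance.values())
est_without = chao1(kept.values())

# Gap between estimated and observed richness in each case.
gap_with = est_with - len(abundance)
gap_without = est_without - len(kept)
print("with HTS:", est_with, "gap:", gap_with)
print("without HTS:", est_without, "gap:", gap_without)
```

With these made-up numbers, the gap between estimated and observed richness is 3 species when the flagged singletons are kept and 0 when they are removed, because tourists recorded only once add to the singleton count that drives the estimator upward.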

Availability of data and materials

Relevant information is available in the supporting information. The datasets used in this paper will become public in the Figshare repository (https://figshare.com) on May 9, 2024. The DOIs are as follows: Data to run in EstimateS: 10.6084/m9.figshare.19733683. Data to run in R with the indicspecies package: 10.6084/m9.figshare.19733716.

Abbreviations

A:

Specificity component of the Indicator Value Index

ACE:

Abundance-based Coverage Estimator

B:

Fidelity component of the Indicator Value Index

CARTROUNNE:

Colección de Artrópodos de la Universidad Nacional del Nordeste

CNP:

Chaco National Park

FACENA:

Facultad de Ciencias Exactas y Naturales y Agrimensura

HTS:

Habitat-tourist species

ICE:

Incidence-based Coverage Estimator

IBSI-Ara collection:

The arachnological collection of the Instituto de Biología Subtropical, Misiones, Argentina

IndVal:

Indicator Value

p:

Statistical significance value

PPIP:

Pampa del Indio Provincial Park

S:

Species richness

SN1:

Semi-natural site 1

SN2:

Semi-natural site 2

SSI:

Species Specialization Index

TS:

Tourist species

UNNE:

Universidad Nacional del Nordeste

Acknowledgements

First, we thank the reviewers and the editor for their valuable comments that improved this article. Second, we thank colleagues I. Zanone, P. Gonzalez, C. Achitte, G. Rubio, P. Cuaranta, E. Toledo, R. Aguirre, M. De los Santos, D. Larrea and A. Raimundo, who collaborated with the fieldwork. Third, we thank G. D. Rubio for collaborating with the identification of several species in the early stages of the study (e.g. Mburuvicha sp., Neonella spp., S. cathaphracta, T. yungae, W. punctata). Fourth, we thank Alejandra Scotti for the language editing. Finally, we thank the administrators and park rangers of Pampa del Indio Provincial Park and Chaco National Park, as well as the owners and foremen of the fields for their permission to carry out the sampling in those places.

Funding

This research was financed by project PI F 003/2015 of the SGCyT—UNNE (Argentina) and by a CONICET doctoral fellowship awarded to the first author.

Author information

Contributions

MFN and GA performed the field sampling. MFN determined the species, performed the data analysis, and wrote the manuscript. GA and AG edited the manuscript. All authors read and approved the final manuscript.

Authors' information

MFN is concluding her doctoral thesis, financed by a doctoral fellowship from CONICET and supervised by AG and GA. Her research focuses on the ecology and taxonomy of spiders of the family Salticidae. AG is an Ad-Honorem Independent Researcher of CONICET with experience in spider ecology, and GA is a professor and researcher at FACENA-UNNE with experience in spider ecology.

Corresponding author

Correspondence to María Florencia Nadal.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1.

Species of the family Salticidae and their abundance in Pampa del Indio Provincial Park (PPIP), Chaco National Park (CNP) and semi-natural sites (SN1 and SN2) in Chaco, Argentina. Table S2. Significance assessment of the differences between the richness overestimation data, calculated as the difference between observed and estimated richness, when habitat-tourist species were considered or not. Table S3. Significance assessment of the differences between the data of final slopes of richness estimation curves, when habitat-tourist species were considered or not. Table S4. Significance assessment of the differences between the richness overestimation data, calculated as the difference between observed and estimated richness, when adults were considered or not. Table S5. Average values of the final slopes of the richness estimation curves, when adults were considered or not. Table S6. Significance assessment of the differences between the data of final slopes of richness estimation curves, when adults were considered or not.

Additional file 2: Table S1.

Difference between observed and estimated richness when habitat-tourist species were considered. Table S2. Difference between observed and estimated richness when habitat-tourist species were not considered. Table S3. Verification of the normality assumption of the richness overestimation data, calculated as the difference between observed and estimated richness. Table S4. Values of the final slopes of the richness estimation curves when habitat-tourist species were considered. Table S5. Values of the final slopes of the richness estimation curves when habitat-tourist species were not considered. Table S6. Verification of the normality assumption of the final slope data of the richness estimation curves.

Appendix

See Figs. 2, 3, 4, 5, 6, 7, 8 and 9 below.

Fig. 2

Species richness estimation curves for Chaco National Park, Chaco, Argentina, with adult + juvenile data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Fig. 3

Species richness estimation curves for Chaco National Park, Chaco, Argentina, with adult data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Fig. 4

Species richness estimation curves for semi-natural site 1, Chaco, Argentina, with adult + juvenile data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Fig. 5

Species richness estimation curves for semi-natural site 1, Chaco, Argentina, with adult data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Fig. 6

Species richness estimation curves for semi-natural site 2, Chaco, Argentina, with adult + juvenile data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Fig. 7

Species richness estimation curves for semi-natural site 2, Chaco, Argentina, with adult data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Fig. 8

Species richness estimation curves for Pampa del Indio Provincial Park, Chaco, Argentina, with adult + juvenile data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Fig. 9

Species richness estimation curves for Pampa del Indio Provincial Park, Chaco, Argentina, with adult data. A, B: foliage; C, D: leaf litter; E, F: grassland. Panels A, C, E: analyses with HTS; panels B, D, F: analyses without HTS

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Nadal, M.F., González, A. & Avalos, G. Exclusion of tourist species from assemblages in ecological studies: a methodological approach using spiders. Ecol Process 11, 59 (2022). https://doi.org/10.1186/s13717-022-00398-6

  • DOI: https://doi.org/10.1186/s13717-022-00398-6

Keywords