
Non-native plant species richness and influence of greenhouses and human populations in the conterminous United States



One issue in invasive plant ecology is identifying the factors related to the invasion process that increase the number of non-native species. As invasion by non-native species increases, so does the probability that some non-native species will become harmful and be classified as invasive species, which disrupt natural ecosystems with attendant economic and social costs. I quantified how non-native species richness varied with vegetation types and human populations. To evaluate the relative importance of different predictor variables for invasion pathways in the conterminous United States, I modeled non-native plant species richness by county compared to current and historical human populations; greenhouses and nurseries; railroads, pipelines, transmission lines, and oil and gas wells; and land covers of impervious surface, development intensity categories, agriculture, and vegetation types. I also modeled these variables within vegetation types, excluding vegetation variables.


To summarize patterns, non-native plant species richness increased from 72 to 200 with increasing human population density classes. Forests and forest land use mosaics had the greatest mean number of non-native plant species, ranging from 121 to 166, whereas grasslands and grassland mosaics had the least number of non-native plant species, about 70. For modeling variable importance, all combined variables had R2 values of 56% (random forests regressor) and 54% (cubist regressor) for predictions of withheld observations of non-native plant species richness, with greenhouse density and percent forestlands as most influential variables. Single variables of greenhouses (R2 = 29%), historical and current human populations (R2 = 27% and 23%), impervious surface (25%), and medium intensity development (23%) were most associated with non-native plant species richness. For vegetation types, greenhouse and historical human population densities were influential variables particularly in forestlands, shrublands, and wetlands.


Based on these models, human population measures and horticultural locations of greenhouses and plant nurseries may have stronger relationships than measures of land use disturbance and transport with non-native plant species richness.


Humans have deliberately or accidentally moved many species beyond their native ranges, particularly by importing and cultivating plants for ornamental or agricultural purposes. Most plant invaders have been introduced deliberately through horticulture (Reichard and White 2001; Liebhold et al. 2012; van Kleunen et al. 2018). Imported nursery stock also is a common pathway for non-native pathogen and insect introductions (Brasier 2008; Liebhold et al. 2012). Selected plants are easy to propagate, with rapid establishment, growth, and reproduction under a variety of conditions (van Kleunen et al. 2018). After human assistance to reach new locations, these traits position introduced species for invasion of ecosystems outside of controlled garden settings (Theoharides and Dukes 2007), albeit without a close relationship between the rate of spread and traits of invading species (Pyšek and Hulme 2005).

Some of the non-native species become classified as invasive, as measured by impacts on native ecosystems and reduced production of ecosystem services (Mayfield et al. 2021). Invasive non-native plants directly out-compete native species for growing space, reducing abundance of native species and changing composition and structure of ecosystems (Pearson et al. 2016). Other impacts include alterations of primary productivity, floral resources, and ecosystem processes by invasive plant species. Invasive plants may shift wildfire regimes by increasing or decreasing fire frequency and severity; for example, some invasive grasses increase horizontal fuel continuity or become dry earlier in the season than native plants, increasing the chance of wildfire (D’Antonio and Vitousek 1992). The transformation from non-native species to invasive species is one active area of research in invasion ecology, including identification of the non-native species that will become harmful ecologically or socioeconomically, measurement of damage, and how climate change may activate the transformation from non-native species to invasive species (Clements and Ditommaso 2011; Courchamp et al. 2017).

Another research issue encompasses isolation of the factors that contribute to successful invasion by non-native species into new areas, because eventually some successful non-native species will become harmful (Moles et al. 2012). The invasion sequence of introduction, establishment (i.e., survival and growth), and spread (i.e., reproduction) is complex and unpredictable for each ecological context, and along the invasion sequence, non-native plants face constraints and filters (Lockwood et al. 2005; Von Holle and Simberloff 2005; Colautti et al. 2006; Pauchard and Shea 2006). Propagule pressure describes the number of individuals of an introduced non-native species (Lockwood et al. 2005), including introductions directly through horticultural sources (Reichard and White 2001; Liebhold et al. 2012) and accidental imports through trade and transportation. As propagule pressure increases, the probability of successful establishment into suitable locations in time and space increases (Lockwood et al. 2005). For establishment and spread, disturbances open growing space and provide associated resources necessary for survival, growth, and reproduction. Land uses of urbanization, agriculture, transportation, mining and energy development, harvesting for forest products, and drainage of wetlands are types of anthropogenic disturbance that remove and fragment native vegetation, allowing opportunities for establishment of non-native species. Native species may provide resistance to non-native species establishment and spread through competition, herbivory, and pathogens (Mitchell et al. 2006). Classical invasion concepts focused on invasion facilitated by disturbance and inhibited by communities richer in species, whereas more recent evidence, including a meta-analysis of 56 studies, indicates the importance of greater propagule pressure as a consistent factor for successful invasion of new locations by non-native species (Von Holle and Simberloff 2005; Colautti et al. 2006; Cassey et al. 2018; Stringham and Lockwood 2021).

Non-native plant species richness may increase with many factors related to the invasion process, representing categories of propagule sources, transport, land use disturbance, and ecosystem resistance, with overlap in categories, particularly in land use disturbance. For example, non-native plant species richness may increase along a continuum with human population density, housing, economic activity, and road density (McKinney 2001, 2002; Gavier-Pizarro et al. 2010; Pyšek et al. 2010). Human population and housing measures may best indicate non-native plant source introductions, such as to and from residential gardens (van Kleunen et al. 2018), but also may indicate urbanization disturbance and associated transport networks. However, greenhouses and nurseries directly represent sources of non-native plant material (Reichard and White 2001; Liebhold et al. 2012). Railways, pipelines, transmission lines, and oil and gas wells, which result in road networks, are transport corridors with associated disturbance, primarily related to the energy sector (Ott et al. 2021).

Non-native plant species richness may vary by vegetation types and rural–urban gradients, similar to regional variation in non-native plant species richness (Allen and Bradley 2016; Fig. 1A). Variation perhaps reflects differences in ecosystem resistance to invasion by non-native species and loss of ecosystem resistance due to intermixing of residential land use with forested land cover in more heavily populated regions (Gavier-Pizarro et al. 2010). Evidence indicates that non-native species abundance and richness generally increase with human population and housing densities in forests, but limited research exists about widespread differences in non-native plant species abundance and richness in other vegetation types and rural lands, particularly at regional scales (Bock and Bock 2009). Larger areas of relatively intact native ecosystems may contain limited number of non-native plant species (McKinney 2002; Hansen et al. 2014). Conversely, land use disturbances of the agricultural sector and mining and energy extraction often are concentrated in rural locations. Agriculture is the dominant form of land use disturbance, occurring over more than a third of the terrestrial land surface (IPBES 2019). In addition to removal of native plant cover for crops, livestock production has introduced novel herbivory regimes and non-native plants (Shaffer and DeLong 2019). Fast-growing non-native forage grasses (e.g., smooth brome, Bromus inermis) have been planted for forage improvement and soil stabilization after overgrazing (Kennedy 1899; Shaffer and DeLong 2019). Non-native species that are considered valuable for livestock production (e.g., smooth brome; Kentucky bluegrass, Poa pratensis; crested wheatgrass, Agropyron cristatum; Johnsongrass, Sorghum halepense) have increased non-native grass species, with presence on 55% of non-federal rangeland and ≥ 50% cover on 9% of area (NRCS 2018; Shaffer and DeLong 2019). 
While areas of relatively intact native ecosystems have greatest richness of native plant and animal species, degradation of native ecosystems results in declines in native plants and wildlife (Shaffer and DeLong 2019; Hanberry et al. 2021).

Fig. 1

Number of recorded non-native plant species by county (A Center for Invasive Species and Ecosystem Health 2020) and location of vegetation types used in modeling (B blank counties are complex mosaics of vegetation type; Homer et al. 2020)

Nevertheless, some land use disturbances may not be relevant. All land use disturbances are not applicable throughout the U.S., such as forest harvest in the shrublands and grasslands of the western U.S. Moreover, in the eastern U.S., disturbances from tree removals, wildfire, and deer herbivory had overall negative or minimal relationships with non-native species richness (Hanberry 2022a). Indeed, disturbances from fire, herbivores, and mechanical treatments such as mowing are treatments for non-native plant species removal and for native plant species restoration (Hanberry et al. 2021). Effects from these types of disturbance regimes will vary with frequency and timing, location, ecosystem, and the taxa in question, but changes in disturbance regimes generally do not explain non-native species richness (Moles et al. 2012).

Non-native plant species richness has not been assessed with comprehensive coverage at a national scale to summarize how non-native plant species richness varies by vegetation types and human population density classes and whether non-native plant species richness is associated with potential pathways of invasion. Few studies have estimated non-native plant species richness across urban gradients (Cadotte et al. 2017). Therefore, my objective was to characterize patterns and associations of non-native plant species richness in the conterminous U.S. Based on current evidence, I hypothesized that non-native species richness will increase with greater human densities and cover of forested vegetation, and non-native species richness will be associated with human population measures and number of greenhouses and nurseries, which represent propagule pressure. At the scale of U.S. counties, my research questions were: (1) how does non-native plant species richness vary with vegetation types and human population density classes and (2) which proxies related to propagule sources, transport, and land use disturbances best predict non-native plant species richness. I modeled non-native plant species richness by county for the conterminous U.S. and vegetation types compared to several metrics that may represent invasion categories of propagule sources, transport, and land use disturbance, with some overlap in categories: current and historical human populations; greenhouses and nurseries; railroads, pipelines, transmission lines, and oil and gas wells; and land covers of impervious surface, development intensity categories, agriculture, and vegetation types. I also modeled these variables within vegetation types, excluding vegetation variables. These results will expand limited research about differences in non-native plant species richness in rural–urban gradients at regional scales and contribute to evidence about potential pathways of invasion by non-native plant species.


Study area

The conterminous United States consists of contiguous land area, excluding the states of Hawaii and Alaska, covering about 8 million km2 (Fig. 1). Vegetation types consist of primarily forests in the eastern U.S., grasslands in the central U.S., and shrublands in the western U.S. with forests in wetter coastal or higher elevation locations (Fig. 1; Homer et al. 2020). Precipitation generally follows a gradient that decreases from the east to the west, with more wetlands in the eastern half of the U.S. than the western half. Historical grasslands, wetlands, and forests (i.e., open forests of savannas and woodlands) have been converted to crops and pastures, and grasslands and savannas have become forests over time, due to changed disturbance regimes. Populations generally are greater in coastal regions than in interior grasslands, shrublands, and croplands (Fig. 2).

Fig. 2

Human population count (A) and number of greenhouses and nurseries (B) by county

Data sets

Non-native plant species occurrences were from the EDDMapS database (Early Detection and Distribution Mapping System; Center for Invasive Species and Ecosystem Health 2020). EDDMapS aggregates non-native species occurrences from databases, organizations, and citizen observers, and occurrences frequently are reported by U.S. county, with > 6.6 million county records and > 5.3 million point records. Therefore, the EDDMapS database is suited for county-scale analysis, and I summarized the number of non-native plant species records by county for the conterminous U.S. (Fig. 1A; all maps produced with ArcGIS, ESRI, Redlands, CA).

Caveats for this data set include that survey effort likely is imbalanced, similar to other landscape studies. Formal or informal vegetation surveys may occur in areas that are accessible to human population centers. However, offsetting the survey effort related to human population density, public lands, which typically are rural, may receive a disproportional number of surveys and observations and also remote counties may be larger in area; larger areas tend to contain larger numbers of species. Following Gavier-Pizarro et al. (2010:1914–1915), non-native plant species richness and county area for the U.S. and vegetation types did not have a relationship, with the exception of shrublands. For modeling in shrublands, I adjusted non-native plant species counts to densities.

Predictor variables for modeling represented categories of propagule sources, transport, land use disturbance, and ecosystem resistance. Propagule sources were human population variables of year 2015 human population count (resolution of 30 arc second or < 1 km grid cells; 2015 LandScan, Bright et al. 2016) that I converted to mean values by U.S. county and 1790–2010 human population count (1 km grid cells; Fang and Jawitz 2018), converted to density by county. Propagule sources also were greenhouse and plant nurseries (30,400 locations, restricted access to these data; Homeland Infrastructure Foundation-Level Data 2021), converted to densities by county. Transport variables, with associated land use disturbance primarily from energy extraction, were railroads (km), pipelines (km), transmission lines (km), and oil and gas wells (number; Homeland Infrastructure Foundation-Level Data 2021), converted to densities by county. Disturbance by agriculture included agriculture mean percent area by county during 1850–1987 (Maizel et al. 1998) and cover of croplands and cover of pasture during 2016 (30 m grid cells from the National Land Cover Database; Homer et al. 2020), converted to percentage area by county. I described the built environment with 2016 development intensity classes, which represent both propagule sources and land use disturbance (open, low density, medium density, high density; Homer et al. 2020) and 2016 percent impervious surfaces, which represent invasion categories of propagule sources, transport, and land use disturbance (i.e., commingling of roads, core urban areas, and energy production sites; Homer et al. 2020); I converted these variables to percentage area by county. For a coarse measure of ecosystem resistance (that is, simply presence of the ecosystem type), I used 2016 cover of forest, grassland, shrubland, wetland, and combined wildlands (i.e., forest, grassland, shrubland, wetland; Homer et al. 2020), converted to percentage area of each vegetation type by county (Fig. 1B). The primary vegetation type was assigned based on vegetation types > 50% of wildland vegetation.

Data analysis

I performed summary statistics of non-native plant species richness for counties by human population density classes and vegetation types (Fig. 1B). For human population counts (from 2015 LandScan, Bright et al. 2016), I determined density by dividing counts by cell area (in the GCS WGS 1984 geographic, or longitude and latitude, coordinate system), and then quantified mean density by county, which I grouped into five human population density classes ranging from rural at < 15 humans per km2 to combined suburban and urban at ≥ 550 humans per km2, with intermediate exurban classes (Fig. 4; Hanberry 2022b). Vegetation types were 2016 cover of forest, grassland, shrubland, wetland, and combined land cover and land use mosaics (30 m grid cells from the National Land Cover Database; Homer et al. 2020), converted to percentage area of each vegetation type or land use mosaic by county. The primary vegetation type was assigned based on vegetation types > 50% of wildland vegetation (i.e., forest, grassland, shrubland, wetland land cover rather than land use) by county, but cropland and urban could be the predominant land cover. Combined crop and pasture were assigned as crop if these land covers were > 50% of area by county. Combined medium density and high density development were assigned as urban if these land covers were > 50% of area by county. I matched these land uses with the primary vegetation type, resulting in a mosaic of agricultural or urban land use and wildland land cover.
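The assignment rules above can be illustrated with a minimal Python sketch. This is not the code used in the study; the function names, the dictionary keys, and the "complex mosaic" label for counties without a majority wildland type (borrowed from the Fig. 1 caption) are assumptions made for illustration, and cover values are assumed to be percentages of county area.

```python
# Illustrative sketch only; names and dictionary keys are hypothetical.
WILDLAND_TYPES = ("forest", "grassland", "shrubland", "wetland")


def primary_vegetation(cover):
    """Return the wildland type that exceeds 50% of wildland cover, else None."""
    wildland_total = sum(cover.get(t, 0.0) for t in WILDLAND_TYPES)
    if wildland_total == 0:
        return None
    for veg_type in WILDLAND_TYPES:
        if cover.get(veg_type, 0.0) / wildland_total > 0.5:
            return veg_type
    return None


def dominant_land_use(cover):
    """Return 'crop' or 'urban' when that land use exceeds 50% of county area."""
    if cover.get("crop", 0.0) + cover.get("pasture", 0.0) > 50.0:
        return "crop"
    if cover.get("dev_medium", 0.0) + cover.get("dev_high", 0.0) > 50.0:
        return "urban"
    return None


def classify_county(cover):
    """Match the dominant land use with the primary vegetation type."""
    veg = primary_vegetation(cover)
    use = dominant_land_use(cover)
    if veg and use:
        return f"{veg} and {use}"
    # Label for counties with no majority type is an assumption.
    return veg or use or "complex mosaic"


example = {"forest": 30.0, "wetland": 5.0, "crop": 40.0, "pasture": 15.0}
print(classify_county(example))
```

Because crop plus pasture and medium plus high density development cannot both exceed 50% of the same county, the order of the two land use checks does not affect the result.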

For modeling the relationship between variables and non-native plant species richness, because this is not a controlled experiment particularly suited for inferential statistics and relationships are not always linear for complex data sets, I applied a machine learning or algorithm modeling approach with two ensemble models based on decision trees or rule sets, the random forests and cubist regressors (Breiman 2001; Bzdok et al. 2018). Machine learning encompasses a variety of different algorithms to solve data problems (Sarker 2021). The main characteristics that differentiate these two algorithms from other machine learning options are the ability to perform regression analysis for a numerical response rather than a class response and use of ensemble models of decision trees with resampling methods. Ensemble models combine results of many decision trees or rule sets into a single output, helping to minimize the influence of error. Each ensemble model applies unique algorithms. Random forests develop regression trees in parallel from bootstrap samples, or random samples with replacement, by selecting the best split among a randomly selected subset of different predictors tested at each node, and average results of all individual trees (Zhou et al. 2019; Sarker 2021). The cubist regressor is a rule-based model that develops a series of trees composed of linear regression models and corrects (boosts) trees using information from prior trees, but averages results of all individual trees rather than weighting results based on performance (Zhou et al. 2019). Random forests is widely used, at least as a classifier, and both regressors, cubist particularly, have been documented to have better performance overall than generalized linear models for 83 data sets (Fernández-Delgado et al. 2019). I applied both algorithms to provide dual lines of support for results. If both algorithms select the same important variable, then the variable is more likely to be important than if only one algorithm selects the variable as important.
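To make the bootstrap-and-average idea behind random forests concrete, the following Python sketch builds a deliberately simplified ensemble. It is an assumption-laden illustration, not the random forests implementation used in this study: each "tree" has only a single split (a stump), so the random subset of candidate predictors is drawn once per tree, and all function names are hypothetical. The cubist regressor is not sketched.

```python
import random


def fit_stump(rows, targets, feature_ids):
    """Fit a one-split regression tree using only the candidate features."""
    best = None
    for f in feature_ids:
        values = sorted(set(row[f] for row in rows))
        for cut in values[:-1]:  # every cut leaves both sides non-empty
            left = [t for row, t in zip(rows, targets) if row[f] <= cut]
            right = [t for row, t in zip(rows, targets) if row[f] > cut]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((t - left_mean) ** 2 for t in left)
                   + sum((t - right_mean) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, cut, left_mean, right_mean)
    if best is None:  # all candidate features were constant in this sample
        mean = sum(targets) / len(targets)
        return (None, 0.0, mean, mean)
    return best[1:]


def predict_stump(stump, row):
    f, cut, left_mean, right_mean = stump
    if f is None:
        return left_mean
    return left_mean if row[f] <= cut else right_mean


def fit_forest(rows, targets, n_trees=50, n_candidates=1, seed=0):
    """Grow stumps on bootstrap samples with random candidate predictors."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # with replacement
        candidates = rng.sample(range(p), n_candidates)
        forest.append(fit_stump([rows[i] for i in sample],
                                [targets[i] for i in sample],
                                candidates))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all individual trees."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)
```

A full random forest grows deeper trees and redraws the candidate predictors at every node; the sketch keeps only the two ingredients named in the text, bootstrap samples and averaging across trees.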

Correlated variables can make it challenging to measure the influence of variables. Typically, ensemble models are relatively robust in isolating influence of redundant, highly correlated predictors, compared to linear models (Kuhn and Johnson 2019). That is, ensemble models distinguish important predictor variables that improve model accuracy, as opposed to splitting the explanatory value of correlated strong predictors into intermediate importance, whereas omission of relevant correlated variables reduces accuracy (Hanberry 2023). I examined model influence of paired variables with the Pearson’s correlation coefficient.

For the twenty predictor variables and number of non-native plant species by county in the conterminous United States, I applied random forests and cubist nonlinear regressors in the caret package (Kuhn 2008; R Core Team 2021) to train the model and then predict to withheld testing data (i.e., 30% of 3110 counties) to determine R2 values for predicted richness of non-native plants (Fig. 3). The caret package automatically chooses values for the necessary parameters associated with the best model fit (i.e., model tuning) based on resampling, in this case tenfold cross-validation repeated three times, with default values for parameters that do not require tuning (e.g., Probst and Boulesteix 2017). Importance of variables is based on the contribution of each variable to model accuracy, with importance values scaled up to 100. To isolate the influence of single variables, I modeled each variable alone.
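The resampling scheme described above can be sketched in plain Python. This is an illustration under stated assumptions, not the caret workflow itself: the function names and seeds are hypothetical, and R2 is computed here as the squared Pearson correlation between observed and predicted values, which is one common definition (the form 1 − SSres/SStot is another) and is an assumption about the definition used.

```python
import random
from math import sqrt


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle county indices and withhold a fraction for testing."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def repeated_kfold(indices, k=10, repeats=3, seed=0):
    """Yield (fit, holdout) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i, holdout in enumerate(folds):
            fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield fit, holdout


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return (cov / sqrt(var_o * var_p)) ** 2


train, test = train_test_split_indices(3110)
print(len(train), len(test))
print(sum(1 for _ in repeated_kfold(train)))  # number of tuning resamples
```

Tuning uses only the training counties; the withheld counties are reserved for the final R2, so the reported accuracy is not inflated by the parameter search.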

Fig. 3

Modeling steps for the conterminous U.S. and six vegetation types or land use-vegetation type mosaics

In addition to modeling for the conterminous U.S., I modeled non-native plant species richness by four vegetation types and two land use mosaics, but excluding predictor variables of the six vegetation types (i.e., percent area of forest, grassland, shrubland, wetland, and percent area of croplands and percent area of pasture). Vegetation types were forestlands, grasslands, shrublands (adjusting number of non-native plant species to number of non-native plant species per km2 for this vegetation type), and wetlands. Land use mosaics were crop and pasture with primarily grassland vegetation and crop and pasture with primarily forest or wetland vegetation (Fig. 1B). Modeling within smaller extents of vegetation types reduces variation compared to differences across the conterminous U.S.


The non-native plant data set had 2253 unique species or subspecies. Non-native plant species richness in the United States was greatest along the western coastal region and in the northeastern region (Fig. 1A). Non-native plant species richness steadily increased with increasing human population densities, from 87 non-native plant species at the lowest density (in 1450 counties) to about 225 non-native plant species in the two greatest human population density classes (98 and 102 counties; Fig. 4). For vegetation types or land use mosaics, the forest vegetation type and land use mosaics with primarily forest vegetation had the greatest mean number of non-native plant species (Fig. 5). The forest and urban mosaic (65 counties) had a mean value of 166 non-native plant species, forest (1299 counties) had a mean value of 137 non-native plant species, and forest with crop and pasture (455 counties) had a mean value of 121 non-native plant species. In contrast, grasslands and grassland land use mosaics had the least number of non-native plant species. Grasslands (258 counties) and grasslands with crop and pasture (178 counties) had a mean value of 73 and 70 non-native plant species, respectively. Wetlands (159 counties) had a mean value of 121 non-native plant species and wetlands with crop and pasture (168 counties) had a mean value of 96 non-native plant species. Shrublands (244 counties) had a mean value of 95 non-native plant species. Crop and pasture (68 counties) had a mean value of 83 non-native plant species.

Fig. 4

Mean number of recorded non-native plant species by county and number of counties for increasing human population density classes

Fig. 5

Mean number of recorded non-native plant species by county and number of counties by vegetation type

For the conterminous U.S., the full model with all 20 variables had R2 values of 56% (random forests regressor) and 54% (cubist regressor) for predictions of withheld observations of non-native plant species richness (Table 1). The most influential variables included greenhouse and plant nursery density and percentage forestlands for both regressors. The single variable of greenhouse and nursery density had the greatest R2 value of 29%, modeled by the cubist regressor, with slightly lower R2 values for historical human population densities (27%), impervious surface (25%), and current human population and medium intensity development (23% for both variables), and low and high intensity development (21% and 20%, respectively). Railways had the greatest R2 value (11%) of the transport category and forests had the greatest value (9%) of the vegetation types. The random forests regressor identified similar importance of variables, but with weaker R2 values.

Table 1 Modeling results of most important variables, importance value (value) and R2 (predictions of withheld samples) for cubist and random forests regressors of non-native plant species richness in the conterminous United States and by different vegetation types (pop = population, dev = development intensity), and with results of single variable models in the United States

For the six vegetation types or land use mosaics, these same variables of greenhouse and nursery density, current and historical human population measures, development, and impervious surfaces generally were most related to non-native species richness. In particular, non-native plant species richness was most related to greenhouse and historical human population densities in vegetation types of forestlands, shrublands, and wetlands. Non-native plant species richness was most associated with low intensity development and impervious surfaces in the crop landscapes. The exception was in grassland landscapes, where pipeline density was most related to number of recorded non-native species. The R2 values were lower for vegetation types than for the conterminous U.S. (31–50% by vegetation type with the cubist regressor and 28–51% by vegetation type with the random forests regressor). The two regressors generally identified similar importance of variables.

With correlated predictor variables, strong predictors can end up with lower importance values than if all but one correlated variable were excluded from modeling. Equally, correlated variables can confound distinguishing which of the individual factors is important. All the variables were relevant and Pearson’s correlation coefficients showed that out of 190 unique pairwise correlations for the 20 variables, only 12 had r values greater than 0.7. The two human population measures are relatively interchangeable variables despite different sources and time intervals, with an r = 0.98, and likewise, impervious surfaces and medium intensity development had an r = 0.97. Otherwise, the maximum r values were 0.70 and 0.76 between either historical human population densities or current human population, respectively, and greenhouse densities. These variables were identified as relatively influential for the conterminous U.S., and I isolated their influence as single variable models (Table 1). Moreover, when excluding the current or historical human population measures, impervious surfaces, or medium intensity development, R2 values remained nearly identical for both regressors and the most influential variables retained the same importance value rank. That is, both regressors were able to correctly distinguish importance of strong predictor variables.
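The count of 190 unique pairs follows from choosing 2 of the 20 variables, and the screening for r greater than 0.7 can be sketched as below. The function names and the toy column names are hypothetical, and the values in any real application would be the county-level predictor columns.

```python
from itertools import combinations
from math import comb, sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every unique pair with r > threshold."""
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson_r(columns[a], columns[b])
        if r > threshold:
            pairs.append((a, b, r))
    return pairs


print(comb(20, 2))  # 190 unique pairs among 20 variables
```

The sketch follows the wording of the text and flags only positive correlations above the threshold; screening on the absolute value of r would be a different, stricter choice.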


Propagule pressure is an important predictor for successful invasion (Von Holle and Simberloff 2005; Colautti et al. 2006) and the major source of non-native plant propagules is ornamental horticulture, including domestic gardens (van Kleunen et al. 2018). The models in this study highlight the spatial relationship of increased number of non-native plant species with propagule sources from greenhouses and plant nurseries, human populations, and the associated built environment of development and impervious surfaces, rather than transport or land use disturbance from agriculture and energy extraction. Medium intensity development of primarily single-family housing, and the highly correlated impervious surfaces, represent the building infrastructure for the human populations, although these predictors also contribute to land use disturbance and transport. Models were consistent for both regressors and a range of vegetation types and human population settings, from western shrublands in rural counties to eastern forestlands in urbanized counties. The findings from these models, as well as elevated non-native plant species richness with greater population densities, align with existing research. Disturbance and change in disturbance regime were weak predictors of non-native species richness, in a global meta-analysis of 200 sites (Moles et al. 2012). In contrast, the connection between non-native plants and horticulture and human activity has been well-developed (Reichard and White 2001; Liebhold et al. 2012; van Kleunen et al. 2018). Similarly, urban areas, and related measures of human populations, housing, and impervious surfaces, are becoming recognized as key locations for non-native species invasion and richness (Bock and Bock 2009; Gaertner et al. 2017).

Greenhouse and nursery density logically is related to human populations. However, the correlation between greenhouse density and current and historical human population measures (r = 0.70 and 0.76; Fig. 2) may be different enough to indicate two different yet critical pathways of propagule pressure. That is, the horticultural industry is providing a direct stream of non-native plant products to consumers, who are vectors of non-native plant spread (Gaertner et al. 2017; van Kleunen et al. 2018). In addition, humans are accidentally introducing new non-native plants through urban trade and transport hubs and roads connecting buildings in towns and cities (Gaertner et al. 2017).

Moreover, forests were distributed where human population densities and number of greenhouses and nurseries were greater. This meant that forests contained more non-native plants than other vegetation types (Fig. 5). The northeastern region in particular is an intermix of forest cover and higher density human populations, with numerous horticultural locations. However, forestland alone was not a good predictor of non-native species, due to the southeastern U.S., which also is forested but with lesser non-native plant species richness.

Some locations, concentrated in the southeastern U.S., had moderate to high human population densities (years 1790–2010) and greenhouse and nursery densities yet lower non-native plant species richness (< 100 non-native plant species; Fig. 6). Time lags do not seem a likely explanation for why the southeastern region did not conform to the trend, because of the long history of global trade in the Southeast. These mismatches are important locations for additional study to determine whether they primarily are errors of omission due to sampling bias and, if not, which factors may be affecting ecosystem invasibility and resistance to invasion. To speculate on factors, the southeastern and central regions of the U.S., which both have lower non-native species richness, have at least two characteristics in common. They have an abundance of herbaceous species, and both are agricultural centers, either for crops and pasture or for agroforestry of intensively managed pine plantations. Perhaps biotic resistance combined with herbicide applications provides additional protection against the spread of non-native plants. Conversely, some regional studies have found that native biodiversity is correlated with non-native biodiversity (Stohlgren et al. 2003; Fridley et al. 2007; Moles et al. 2012). Other possible explanations for regional differences include the success of invasive species eradication, biocontrol, and port detection programs.

Fig. 6

Counties with < 100 non-native plant species and moderate to high historical human population densities (A) and ≥ 5 greenhouses and nurseries (B)
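The selection behind panel B of Fig. 6 can be sketched as a simple filter on county records. The example below is a minimal illustration using the two thresholds stated in the caption; the `County` class, the `is_mismatch` helper, and all county values are made up for the example and are not data from the study.

```python
from dataclasses import dataclass


@dataclass
class County:
    name: str
    non_native_species: int  # non-native plant species richness
    greenhouses: int         # greenhouses and nurseries in the county


# Thresholds taken from the Fig. 6 caption.
MAX_SPECIES = 100      # counties with < 100 non-native plant species
MIN_GREENHOUSES = 5    # counties with >= 5 greenhouses and nurseries


def is_mismatch(c: County) -> bool:
    """True when richness is low despite many greenhouses and nurseries."""
    return c.non_native_species < MAX_SPECIES and c.greenhouses >= MIN_GREENHOUSES


# Hypothetical records for illustration only.
counties = [
    County("A", 60, 12),
    County("B", 180, 9),
    County("C", 40, 1),
    County("D", 99, 5),
]

mismatches = [c.name for c in counties if is_mismatch(c)]
print(mismatches)  # ['A', 'D']
```

The same pattern would apply to panel A by swapping the greenhouse threshold for a historical population density class, which the caption describes only as moderate to high.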

Elevated land use disturbances of agriculture and energy extraction do not seem to be influential for number of non-native plant species. Models did not identify the influence of the dominant form of disturbance in agriculture and aspects of energy extraction and associated transport networks (pipelines, transmission lines, and oil and gas wells) that occur throughout the U.S. Other lines of evidence support limited importance of disturbance on non-native species richness (Moles et al. 2012). Tree harvest, either due to current tree removals or resulting from the intensive and extensive clearcut era of Euro-American settlement (circa 1880 to 1920), has not increased non-native plant species in the forested southeastern U.S. relative to other regions (Fig. 1; Hanberry 2022a). Nonetheless, the southeast region has the most frequently disturbed forests in the U.S., according to tree age and harvest of plantations every 20–30 years (Pan et al. 2011). Core areas of vegetation without roads and other transport corridors are not abundant, even in heavily forested, rural landscapes (Hanberry et al. 2013). Likewise, fewer non-native plant species occurred in the agricultural central region of the U.S., where intense land use disturbance occurs even at low human population densities. It may be that most regions have a saturated amount of land use disturbance and additional stressors. Consequently, more disturbance beyond a certain threshold does not result in more invasion, even if non-native species are a symptom of disturbance. This would require an appropriate study design to be tested.

Gaertner et al. (2017) noted that one obstacle to studying invasions in urban settings is the lack of globally applicable urban definitions. However, I applied globally applicable human population density class definitions (Hanberry 2022b). These classifications demonstrated that non-native plant species richness steadily increased with increasing human population density classes, from 87 non-native plant species at the lowest human population density class to about 225 non-native plant species in the two greatest human population density classes (Fig. 4). Use of global human population models or percent urban land cover may help standardize urban definitions to fill the research gap of limited studies that estimate non-native plant species richness or abundance along human population gradients (Cadotte et al. 2017).

The implication of a strong link between propagule sources and invasion, relative to rural disturbance and transport, is that reducing propagule pressure through prevention of non-native species introduction may be a more beneficial and cost-effective strategy than reducing disturbance for lessening number of non-native plant species (Reaser et al. 2008; Liebhold et al. 2012). Reducing land use disturbance may be desirable for numerous other reasons. Nevertheless, it is straightforward to focus on sources of the introduction, although choice of the most efficient management strategy depends on the context, stage of invasion, scale, and management objectives (Simberloff et al. 2013). Invasions are governed by a complex hierarchy of processes occurring simultaneously at various spatiotemporal scales, which requires evaluation of the cost-effectiveness of control methods. Non-native species cost the U.S. economy an estimated US$120 billion annually, which increases with every additional invasive species (Pimentel et al. 2005). Despite economic and ecological costs due to non-native ornamental plants and associated species, little regulation of plant introduction occurs (Reichard and White 2001; Brasier 2008; Liebhold et al. 2012).

General approaches to limit invasion are prevention of non-native species introduction as the most immediate possible intervention, monitoring for early detection, prioritization of harmful species for management, rapid response to treat species using integrated pest management techniques, and restoration of native ecosystems to prevent invasion (Hulme 2006). In addition, outreach and coordination among agencies, the horticultural industry, and landowners; availability of native horticultural options; and federal or state regulations including restrictions, quarantines, weed-free certifications, and prohibition of selling invasive species will help reduce introduction and spread (Peterson and Diss-Torrance 2012). Public transparency of biosecurity breaches and cost-sharing with the horticultural industry are additional strategies (Brasier 2008). Municipalities, communities, associations, or other local networks can encourage greater invasive species control by sharing information and providing support to landowners (Graham 2013).

Sampling unit size and shape selection is a problem for all studies, resulting in the modifiable areal unit problem, whether at stand (Greig-Smith 1952) or landscape scales (Jelinski and Wu 1996). Continuous grids (Greig-Smith 1952) or basic ecological entities (Jelinski and Wu 1996) are solutions to the problem. The spatial unit for this study was dictated by the non-native species data set, because over half of the reporting for non-native species was by U.S. county. However, counties are similar to continuous grids, albeit with some randomness in sizes and shapes. Typically, randomness is incorporated into studies to avoid inadvertently matching patterns. County area and non-native plant species richness for the U.S. and different vegetation types did not have a relationship (R2 values ranging from 0 to 8%), excepting shrublands (R2 values of 33% and 62%). Nevertheless, western shrublands, with larger counties (mean = 8185 km²), had the same important variables as eastern forests, with their smaller counties (mean = 1813 km²). Furthermore, I did not perform conventional significance testing, which is very sensitive to the modifiable areal unit problem (Jelinski and Wu 1996). In any event, the county scale is a relevant scale for non-native species management, as evidenced by the reporting scale, and modeling produces valid associations for the size and shape of the modeled spatial units (Hanberry 2013).
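A check of this kind, whether county area alone explains species richness, can be sketched with a coefficient of determination. The example below substitutes an ordinary least-squares line for the random forests and cubist regressors used in the study, so it is a simplified stand-in and not the author's procedure. The function name `r_squared` and the area and richness values are hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Coefficient of determination for a least-squares line y ~ a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # For a simple linear fit, R2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Made-up county areas (km2) and richness counts, for illustration only.
area_km2 = [900, 1500, 1800, 2400, 8000, 12000]
richness = [120, 60, 150, 90, 110, 70]

print(f"R2 = {r_squared(area_km2, richness):.2f}")
```

A value near zero would indicate that unit size by itself carries little information about richness, which is the pattern the text reports for most vegetation types.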

Spurious correlations likewise can occur for any study and a total weight of evidence is necessary for assessment. One correlative study does not establish causation or cover all complexities. However, results from this study aligned with research that establishes human activities and horticulture as primary sources of non-native species, and consequently of non-native species invasion (Reichard and White 2001; Liebhold et al. 2012; van Kleunen et al. 2018), with limited influence by disturbance (Moles et al. 2012). Other factors, resolutions, and scales are important for spread of non-native species (Pyšek and Hulme 2005; Pauchard and Shea 2006; Theoharides and Dukes 2007; Cadotte et al. 2017), as indicated by the importance of pipelines for the grassland vegetation type.

Furthermore, I examined only number of non-native species, which was most relevant for invasion pathways, but impact resulting from invasive species is another issue that probably is related to different variables (Theoharides and Dukes 2007). For example, western shrublands may be the ecosystem in the U.S. most affected by non-native plant species, due to abundance of cover rather than number of non-native plant species (Poessel et al. 2022; Chambers et al. 2023). Invasive annual grasses, including cheatgrass (Bromus tectorum), red brome (Bromus rubens), and medusahead rye (Taeniatherum caput-medusae), that burn repeatedly are changing shrublands to non-native grasslands, which is challenging for management of native plants and wildlife, such as the greater sage-grouse (Centrocercus urophasianus; Poessel et al. 2022). While this transition is complex, factors may include the non-native ecological context relative to the native context, such as disturbance regimes (i.e., levels of livestock grazing relative to historical levels of grazing by native large herbivores; fire regimes), disturbance responses of native and non-native plants, and alteration of disturbance regimes by non-native plants (Porensky 2021). That is, classical invasion concepts focused on invasion facilitated by disturbance and traits of non-native species and inhibited by native species may inform transformation from non-native species to invasive species.

Specific caveats exist for these models. Non-native species survey effort may increase closer to urban areas, but plant surveys also occur in rural locations, away from urban and developed environments with limited vegetation. In particular, public lands that are tourist locations appear to receive greater survey effort and may help balance survey effort between urban and rural lands. The tested variables did not account for all variation. However, the invasion sequence of non-native species is complex and idiosyncratic, with random components, and non-native plant species richness is not at equilibrium. Overlap existed among propagule source, transport, and disturbance pathways of the predictor variables. I did not examine all possible predictors; for example, I excluded roads because of the nuances involved. Roads on the surface appear to be primarily a transport variable, yet they also increase with human population densities and energy extraction, which confounds what they represent and invites misinterpretation. Roads also have a hierarchy from dirt roads to controlled access highways. The influence of ecosystem resistance may have been limited in the models, because I only approximated ecosystem resistance with the proportion of wildland types per county; nonetheless, the majority of native biota rely on and occur in natural habitats (Cadotte et al. 2017).


The major pathway of non-native plant propagules flows through the horticultural trade to domestic gardens. Supporting the larger context of already existing studies, greater human population densities and greenhouse and nursery densities as propagule sources of non-native plants appeared to have a stronger spatial relationship with the number of non-native species than proxies of disturbance or transport. Caveats for this modeling include that survey effort likely is imbalanced in terms of population densities; formal or informal vegetation surveys may occur in areas that are accessible to population centers. Equally, one correlative study cannot establish causation or incorporate all spatiotemporal complexities in predictive factors and scales for analysis. Characterizing variation in non-native species richness is challenging. However, results from this study were consistent with research that connects human activities and horticulture with number of non-native species. Both transport corridors and disturbance from land use already may be saturated in most regions, so these may not be limiting factors of dispersal and establishment. Therefore, prevention of introduction sources may be the most effective approach to invasive species management, including working with the horticultural industry to prevent initial introduction.

Availability of data and materials

All data are publicly available unless noted.



EDDMapS: Early Detection and Distribution Mapping System database


  • Allen JM, Bradley BA (2016) Out of the weeds? Reduced plant invasion risk with climate change in the continental United States. Biol Cons 203:306–312

  • Bock CE, Bock JH (2009) Biodiversity and residential development beyond the urban fringe. In: Esparza AX, McPherson G (eds) The planner’s guide to natural resource conservation: the science of land development beyond the metropolitan fringe. Springer Science & Business Media, New York

  • Brasier CM (2008) The biosecurity threat to the UK and global environment from international trade in plants. Plant Pathol 57:792–808

  • Breiman L (2001) Statistical modeling: the two cultures (with comments and a rejoinder by the author). Stat Sci 16:199–231

  • Bright EA, Rose AN, Urban ML (2016) LandScan 2015. Oak Ridge National Lab, Oak Ridge, TN. Available at Accessed 28 February 2021.

  • Bzdok D, Altman N, Krzywinski M (2018) Statistics versus machine learning. Nat Methods 15:233–234

  • Cadotte MW, Yasui SL, Livingstone S, MacIvor JS (2017) Are urban systems beneficial, detrimental, or indifferent for biological invasion? Biol Invasions 19:3489–3503

  • Cassey P, Delean S, Lockwood JL, Sadowski JS, Blackburn TM (2018) Dissecting the null model for biological invasions: a meta-analysis of the propagule pressure effect. PLoS Biol 16:e2005987

  • Center for Invasive Species and Ecosystem Health (2020) EDDMapS Invasive Species Database. Warnell School of Forestry and Natural Resources, College of Agricultural and Environmental Sciences, University of Georgia, Athens, GA. Available at Accessed 28 February 2021

  • Chambers JC, Brown JL, Bradford JB, Board DI, Campbell SB, Clause KJ, Hanberry B, Schlaepfer DR, Urza AK (2023) New indicators of ecological resilience and invasion resistance to support prioritization and management in the sagebrush biome, United States. Front Ecol Evol 10:1009268

  • Clements DR, Ditommaso A (2011) Climate change and weed adaptation: can evolution of invasive plants lead to greater range expansion than forecasted? Weed Res 51:227–240

  • Colautti RI, Grigorovich IA, MacIsaac HJ (2006) Propagule pressure: a null model for biological invasions. Biol Invasions 8:1023–1037

  • Courchamp F, Fournier A, Bellard C, Bertelsmeier C, Bonnaud E, Jeschke JM, Russell JC (2017) Invasion biology: specific problems and possible solutions. Trends Ecol Evol 32:13–22

  • D’Antonio CM, Vitousek PM (1992) Biological invasions by exotic grasses, the grass/fire cycle, and global change. Annu Rev Ecol Syst 23:63–87

  • Fang Y, Jawitz J (2018) High-resolution reconstruction of the United States human population distribution, 1790–2010. Scientific Data 5:180067

  • Fernández-Delgado M, Sirsat MS, Cernadas E, Alawadi S, Barro S, Febrero-Bande M (2019) An extensive experimental survey of regression methods. Neural Netw 111:11–34

  • Fridley JD, Stachowicz JJ, Naeem S, Sax DF, Seabloom EW, Smith MD, Stohlgren TJ, Tilman D, Holle BV (2007) The invasion paradox: reconciling pattern and process in species invasions. Ecology 88:3–17

  • Gaertner M, Wilson JR, Cadotte MW, MacIvor JS, Zenni RD, Richardson DM (2017) Non-native species in urban environments: patterns, processes, impacts and challenges. Biol Invasions 19:3461–3469

  • Gavier-Pizarro GI, Radeloff VC, Stewart SI, Huebner CD, Keuler NS (2010) Housing is positively associated with invasive exotic plant species richness in New England, USA. Ecol Appl 20:1913–1925

  • Graham S (2013) Three cooperative pathways to solving a collective weed management problem. Aust J Environ Manag 20:116–129

  • Greig-Smith P (1952) The use of random and contiguous quadrats in the study of the structure of plant communities. Ann Bot 1:293–316

  • Hanberry BB (2013) Finer grain size increases effects of error and changes influence of environmental predictors on species distribution models. Eco Inform 15:8–13

  • Hanberry BB (2022a) Non-native plant associations with wildfire, tree removals, and deer in the eastern United States. Landscape Online 97:1104

  • Hanberry BB (2022b) Imposing consistent global definitions of urban populations with gridded population density models: Irreconcilable differences at the national scale. Landsc Urban Plan 226:104493

  • Hanberry BB (2023) Shifting potential tree species distributions from the Last Glacial Maximum to the Mid-Holocene in North America, with a correlation assessment. J Quat Sci.

  • Hanberry BB, Hanberry P, Demarais S (2013) Birds and land classes in young forested landscapes. Open Ornithol J 6:1–8

  • Hanberry BB, DeBano SJ, Kaye TN, Rowland MM, Hartway CR, Shorrock D (2021) Pollinators of the Great Plains: disturbances, stressors, management, and research needs. Rangel Ecol Manage 78:220–234

  • Hansen AJ, Piekielek N, Davis C, Haas J, Theobald DM, Gross JE, Monahan WB, Olliff T, Running SW (2014) Exposure of US National Parks to land use and climate change 1900–2100. Ecol Appl 24(3):484–502

  • Homeland Infrastructure Foundation-Level Data (2021) Available at: Accessed 28 February 2021

  • Homer C, Dewitz J, Jin S, Xian G, Costello C, Danielson P, Gass L, Funk M, Wickham J, Stehman S, Auch R (2020) Conterminous United States land cover change patterns 2001–2016 from the 2016 national land cover database. ISPRS J Photogramm Remote Sens 162:184–199

  • Hulme PE (2006) Beyond control: Wider implications for the management of biological invasions. J Appl Ecol 43:835–847

  • Intergovernmental Science‐Policy Platform on Biodiversity and Ecosystem Services [IPBES] (2019) Global assessment report on biodiversity and ecosystem services of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services. IPBES secretariat.

  • Jelinski DE, Wu J (1996) The modifiable areal unit problem and implications for landscape ecology. Landscape Ecol 11:129–140

  • Kennedy PB (1899) Smooth bromegrass (Bromus inermis). U.S. Department of Agriculture Division of Agrostology. Circular 18. Available at Accessed 25 May 2023

  • Kuhn M (2008) Building predictive models in R using the caret package. J Stat Softw 28:1–26

  • Kuhn M, Johnson K (2019) Feature engineering and selection: A practical approach for predictive models. CRC Press. Available at Accessed 6 July 2022

  • Liebhold AM, Brockerhoff EG, Garrett LJ, Parke JL, Britton KO (2012) Live plant imports: the major pathway for forest insect and pathogen invasions of the United States. Front Ecol Environ 10:135–143

  • Lockwood JL, Cassey P, Blackburn T (2005) The role of propagule pressure in explaining species invasion. Trends Ecol Evol 20:223–228

  • Maizel M, White RD, Root R, Gage S, Stitt S, Osborne L, Muehlbach G (1998) Historical interrelationships between population settlement and farmland in the conterminous United States, 1790 to 1992. Available at: Accessed 21 May 2020

  • Mayfield AE, Seybold SJ, Haag WR, Johnson MT, Kerns BK, Kilgo JC, Larkin DJ, Lucardi RD, Moltzan BD, Pearson DE, Rothlisberger JD (2021) Impacts of invasive species in terrestrial and aquatic systems in the United States. In: Poland TM, Patel-Weynand T, Finch DM, Miniat CF, Hayes DC, Lopez VM (eds) Invasive species in forests and rangelands of the United States: a comprehensive science synthesis for the United States forest sector. Springer International Publishing, Heidelberg, pp 5–39

  • McKinney ML (2001) Effects of human population, area, and time on non-native plant and fish diversity in the United States. Biol Cons 100:243–252

  • McKinney ML (2002) Influence of settlement time, human population, park shape and age, visitation and roads on the number of alien plant species in protected areas in the USA. Divers Distrib 8:311–318

  • Mitchell CE, Agrawal AA, Bever JD, Gilbert GS, Hufbauer RA, Klironomos JN, Maron JL, Morris WF, Parker IM, Power AG, Seabloom EW (2006) Biotic interactions and plant invasions. Ecol Lett 9:726–740

  • Moles AT, Flores-Moreno H, Bonser SP, Warton DI, Helm A, Warman L, Eldridge DJ, Jurado E, Hemmings FA, Reich PB, Cavender-Bares J et al (2012) Invasions: the trail behind, the path ahead, and a test of a disturbing idea. J Ecol 100:116–127

  • Natural Resources Conservation Service (NRCS) (2018) National Resources Inventory rangeland resource assessment. Available at Accessed 29 August 2019

  • Ott JP, Hanberry BB, Khalil M, Paschke MW, Van Der Burg MP, Prenni AJ (2021) Energy development and production in the Great Plains: implications and mitigation opportunities. Rangel Ecol Manage 78:257–272

  • Pan Y, Chen JM, Birdsey R, McCullough K, He L, Deng F (2011) Age structure and disturbance legacy of North American forests. Biogeosciences 8:715–732

  • Pauchard A, Shea K (2006) Integrating the study of non-native plant invasions across spatial scales. Biol Invasions 8:399–413

  • Pearson DE, Ortega YK, Eren Ö, Hierro JL (2016) Quantifying “apparent” impact and distinguishing impact from invasiveness in multispecies plant invasions. Ecol Appl 26:162–173

  • Peterson K, Diss-Torrance A (2012) Motivation for compliance with environmental regulations related to forest health. J Environ Manage 112:104–119

  • Pimentel D, Zuniga R, Morrison D (2005) Update on the environmental and economic costs associated with alien-invasive species in the United States. Ecol Econ 52:273–288

  • Poessel SA, Barnard DM, Applestein C, Germino MJ, Ellsworth EA, Major D, Moser A, Katzner TE (2022) Greater sage-grouse respond positively to intensive post-fire restoration treatments. Ecol Evol 12(3):e8671

  • Porensky LM (2021) Embracing complexity and humility in rangeland science. Rangelands 43(4):142–150

  • Probst P, Boulesteix AL (2017) To tune or not to tune the number of trees in random forest. J Mach Learn Res 18:6673–6690

  • Pyšek P, Hulme PE (2005) Spatio-temporal dynamics of plant invasions: linking pattern to process. Ecoscience 12:302–315

  • Pyšek P, Jarošík V, Hulme PE, Kühn I, Wild J, Arianoutsou M, Bacher S, Chiron F, Didžiulis V, Essl F, Genovesi P et al (2010) Disentangling the role of environmental and human pressures on biological invasions across Europe. Proc Natl Acad Sci 107:12157–12162

  • R Core Team (2021) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria

  • Reaser JK, Meyerson LA, Von Holle B (2008) Saving camels from straws: how propagule pressure-based prevention policies can reduce the risk of biological invasion. Biol Invasions 10:1085–1098

  • Reichard SH, White P (2001) Horticulture as a pathway of invasive plant introductions in the United States. Bioscience 51:103–113

  • Sarker IH (2021) Machine learning: algorithms, real-world applications and research directions. SN Comput Sci 2(3):160

  • Shaffer JA, DeLong JP (2019) The effects of management practices on grassland birds: an introduction to North American grasslands and the practices used to manage grasslands and grassland birds. Papers in Ornithology. 97.

  • Simberloff D, Martin JL, Genovesi P, Maris V, Wardle DA, Aronson J, Courchamp F, Galil B, García-Berthou E, Pascal M, Pyšek P (2013) Impacts of biological invasions: what’s what and the way forward. Trends Ecol Evol 28:58–66

  • Stohlgren TJ, Barnett DT, Kartesz JT (2003) The rich get richer: patterns of plant invasions in the United States. Front Ecol Environ 1:11–14

  • Stringham OC, Lockwood JL (2021) Managing propagule pressure to prevent invasive species establishments: propagule size, number, and risk–release curve. Ecol Appl 31(4):e02314

  • Theoharides KA, Dukes JS (2007) Plant invasion across space and time: factors affecting nonindigenous species success during four stages of invasion. New Phytol 176(2):256–273

  • van Kleunen M, Essl F, Pergl J, Brundu G, Carboni M, Dullinger S, Early R, González-Moreno P, Groom QJ, Hulme PE, Kueffer C et al (2018) The changing role of ornamental horticulture in alien plant invasions. Biol Rev 93:1421–1437

  • Von Holle B, Simberloff D (2005) Ecological resistance to biological invasion overwhelmed by propagule pressure. Ecology 86(12):3212–3218

  • Zhou J, Li E, Wei H, Li C, Qiao Q, Armaghani DJ (2019) Random forests and cubist algorithms for predicting shear strengths of rockfill materials. Appl Sci 9:1621



I thank the reviewers and C. Miniat for their helpful comments to develop the manuscript. This research was supported by the USDA Forest Service, Rocky Mountain Research Station. The findings and conclusions in this publication are those of the author and should not be construed to represent any official USDA or U.S. Government determination or policy.


No funding supported preparation of this manuscript.

Author information

BBH completed all authorship tasks.

Corresponding author

Correspondence to Brice B. Hanberry.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The author declares no conflicts of interest.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Hanberry, B.B. Non-native plant species richness and influence of greenhouses and human populations in the conterminous United States. Ecol Process 12, 27 (2023).

  • Agriculture
  • Energy
  • Forest
  • Grassland
  • Invasive
  • Propagule
  • Rural
  • Transport
  • Urban