Predictive distribution modeling of Swertia bimaculata in Darjeeling-Sikkim Eastern Himalaya using MaxEnt: current and future scenarios

As global temperatures continue to rise, species distribution modeling is a suitable tool for identifying rare and endangered species most at risk of extinction, along with tracking shifting geographical range. The present study investigates the potential distribution of Swertia bimaculata in the Darjeeling-Sikkim region of Eastern Himalaya in current and future climate scenarios of GFDL-CM3 (Geophysical Fluid Dynamics Laboratory-Climate Model 3) for the year 2050 and year 2070 through MaxEnt presence data modeling. Two sets of variables were used for modeling current scenario. The models were evaluated using AUC (area under the curve) values and TSS (true skill statistic). Habitat assessment of the species shows low and sporadic distribution within the study area. A significant decrease is observed in the possible range of the species in the future climate scenario with the habitat decreasing from 869.48 to 0 km2. Resultant maps from the modeling process show significant upward shifting of the species range along the altitudinal gradient. Still, results should be taken with caution given the low number of occurrences used in the modeling. The results thus highlight the vulnerability of the species towards extinction in the near future.


Introduction
The Intergovernmental Panel on Climate Change (IPCC) fifth assessment report (2014) reiterates that recent climate change caused by heightened concentrations of greenhouse gasses generated by anthropogenic activity consequently impacted both natural and human systems. The occurrences of climate change are easily observable from the increasing average global temperatures (Fischer and Knutti 2015), disappearances of mountainous glaciers (Roe et al. 2017), melting polar ice caps (Chen et al. 2006), increasing sea levels (Nerem et al. 2018), and desertification  to name a few. One of the impacts of climate change is human-driven biodiversity loss predicting the extinction of an overall 7.9% of total species on the planet (Urban 2015). Four representative concentration pathways (RCPs) have been described based on the mitigation scenarios fulfilled (IPCC 2014). The current climatic trend affects mountainous regions across the world including the Himalayas (Hamid et al. 2019), where several plant and animal species have had a notable upward altitudinal shift (Dullinger et al. 2012). Changes in climatic patterns have further complicated downstream effect on species conservation. This highlights the importance of predictive tools for quicker assessment of priority species.
Species distribution modeling (SDM) is a tool used for designing effective conservation strategies by generating predictive maps of the range of a species in a given area (Kumar and Stohlgren 2009). SDMs are often used to prioritize areas for conservation (Chunco et al. 2013) and assess the immediate and future impact of climate change under different scenarios (Ranjitkar et al. 2016). The main principle based on which most SDMs function is the correlation of presence and sometimes absence data, with different environmental variables to create a predictive map of the species range (Pearson et al. 2006). SDMs essentially estimate the geographic distribution of a species, both actual and potential which is done primarily by characterizing environmental conditions most suitable for the species and then, identifying where these environments are distributed across the geographic space on a spatial scale (Pearson 2007).
MaxEnt is one such SDM program widely used for distribution modeling both internationally, such as China (Yi et al. 2016), New-Caledonia (Kumar and Stohlgren 2009), and the USA (Mingyang et al. 2008), and also in the Indian context , Rawat et al. 2017). MaxEnt has been used previously in the Himalayas for modeling several species such as Justicia adhatoda (Yang et al. 2013), Rhododendrons (Kumar 2012), and Berberis aristata (Ray et al. 2011). MaxEnt is based on the principle of maximum entropy, a machine-learning technique that uses presence (occurrence of a species) and background environmental data. The information provided by the environment and the presence records of a species limit the probable distribution of a species (Pearson 2007). It can also predict the hypothetical habitat of a species in a future scenario. Such predictions have assessed possible habitat for invasive species like Arundina graminifolia (Kolanowska and Konowalik 2014) and endangered species like Thuja sutchuenensis (Qin et al. 2017). As the expected impact on biodiversity could be high, it is thus necessary to understand the exact effect of climate change over a set period of years at the species level.
The Eastern Himalaya is one of the extensions of the Himalaya biodiversity hotspot (CEPF 2020) in the Indian subcontinent with an estimated 5800 species of plants (Pande and Arora 2014). The Darjeeling and Sikkim Himalayas, a contiguous part of the Indian Eastern Himalaya (Starkel and Sarkar 2014), harbors 14 genera under the family Gentianaceae with about 70 species (Grierson and Long 1999). Swertia bimaculata is one of the few species under genus Swertia (Gentianaceae) found in the Darjeeling-Sikkim Himalayas (Grierson and Long 1999). Worldwide, this species has been recorded primarily in East Asian and Southeast Asian countries like Japan, China, Republic of Korea, Bhutan, and India. One of the factors negatively impacting S. bimaculata population is habitat degradation (Kanade andJohn 2018, Pal et al. 2016). Only seventeen occurrences have been recorded previously in India (GBIF 2020). While far less in demand than the medicinal plant Swertia chirayita (Pandey et al. 2012), the value of S. bimaculata as an adulterant has resulted in the decline of the population of the species making it less common in Darjeeling-Sikkim Himalaya. Das et al. (2013) report that S. bimaculata is a species with valuable bioactive compounds, with a rapidly declining population. Hence, it is crucial to understand the distribution pattern and assess the current and future spatial range of the species for conservation measures.

Location
The present study area focuses on the Darjeeling-Sikkim part of Eastern Himalaya (Fig. 1). The contiguous part of the Eastern Himalaya, the exclusively mountainous region of Darjeeling-Sikkim Himalaya, extends from 26°31′ to 28°10′ N Latitude and 87°5 9′ to 88°58′ E Longitude covering a total area of about 9020 km 2 . The region is composed of two transverse ranges Dongkya in the east and Singalila in the west with the countries Nepal, Bangladesh, Bhutan, and China bordering the region that occupies a 100-km-long segment of the highest mountain barrier on the globe (Starkel and Sarkar 2014). The altitudinal range varies from ca. 132 m at Sukna in Darjeeling to 8598 masl at Sikkim, the summit of the third highest peak of the world, Mount Kanchenjungha. A combination of factors like geographic location and climate makes the vegetation of Darjeeling-Sikkim Himalaya a meeting ground for the Indo-Malaysian and Indo-Chinese tropical lowland flora, the Sino-Himalayan east Asiatic, and the Western Himalayan flora comprising about 9000 species with a high percentage of endemic plant species (Das 1995). Due to the altitudinal variation, the area exhibits a characteristic monsoon climate, with wet summer and dry winter caused by the direct exposure to the moistureladen southwest monsoon flowing upwards from the Bay of Bengal that lies at proximity. This region is also a part of the designated Himalayan biodiversity hotspot as selected by Conservation International and the home to many endemic species (Conservation International 2005).

Swertia bimaculata
The genus Swertia of the family Gentianaceae is represented by nearly 150 species worldwide (WFO 2019) of which 40 species are distributed in India and 15 within the Eastern Himalaya (Samaddar et al. 2014). Taxonomically, S. bimaculata is identified by its ovate leaves with three distinct nerves in its vegetative state, flowers with black spots at the apex of each corolla lobe, and two usually distinct green orbicular glands in the middle of each corolla lobe (Grierson andLong 1999, Pandey et al. 2012). In India, 17 occurrences of the species have been recorded previously, with only two recorded from the eastern region (GBIF 2020).

Population assessment
The phytosociological assessment of the habitat for S. bimaculata was carried out with the help of quadrat sampling method. The quadrat size of 2 × 2 m was placed in the habitat where the S. bimaculata showed its occurrence at eleven occurrence points. The data collected for the associated species was pooled to compute and estimate frequency, density, abundance, and dominance using the following equations (Curtis andMcIntosh 1950, Phillips 1959 The relative values were summed up using IVI = ∑ [RF + RD + RA] to extract the importance value index (IVI) for each taxa associated in the habitat. Several different diversity indices for the associated species in the habitat of S. bimaculata were estimated using PAST version 3.24, namely, Shannon index H ′ = − ∑ [(n i /N) ln(n i /N)] (Shannon and Weaver 1963); Richness index D ¼ S= ffiffiffiffi N p (Menhinick 1964); Evenness index J = H ′ / ln S (Pielou 1966); and Index of dominance CD = ∑ (n i /N) 2  (Simpson 1949). The spatial distribution pattern of the species was also studied (Whitford 1949).

Species occurrence data
The occurrence data was collected for S. bimaculata through field investigation, and the coordinate data for the species was recorded using Garmin eTrexH. The population count for S. bimaculata was done with habitat assessment through studying the associated species in the ecological niche. Occurrence points for Sikkim Himalayas were also consulted (Das et al. 2013). Simultaneously, the coordinates were taken in the field for climate change modeling, which was used to determine the IUCN threat status of the species in the study area using GeoCAT (Bachman et al. 2011). During modeling, however, the two coordinates obtained from GBIF (GBIF 2020) were excluded as while they were within India, they were outside the study area. The coordinates obtained from field visit and literature review were combined and used and are presented in Table 1.

Environmental variables
Environmental variables (Table 2) for the current and future climate condition were sourced from the World-Clim database (Hijmans et al. 2005). The current climate data were obtained from the WorldClim database version 2.0 at~1 km 2 (30 arc second) resolution. In case of future climate change predictions, the nineteen bioclimatic variables were obtained for RCP 2.6, RCP 4.5, and RCP 8.5 for the year 2050 and the year 2070 based on the global climatic model GFDL-CM3 (Griffies et al. 2011, Chaturvedi et al. 2012. The data obtained were first clipped and then converted to the ASCII file format by using QGIS 3.4 Madeira. The elevation data were obtained from the Global Multi Resolution Terrain Elevation Data in 1-km 2 resolution (Danielson and Gesch 2011), and the data was used to generate slope and aspect layers in ASCII format in QGIS 3.4 Madeira.

Modeling procedure
The modeling was done using MaxEnt algorithm version 3.4.1 (Phillips et al. 2006). First, a complete model was run with all 22 bioclimatic variables with default settings applied except for using linear, quadratic, and hinge features as the number of data points was 16, along with 10 percentile training presence threshold rule and replicated 10 times. ENM Tools 1.3 (Warren et al. 2010) was used to perform multi-collinearity test to eliminate highly correlated variables, i.e., any correlation where the value of r was greater than 0.9 (Jueterbock et al. 2016), and the second model was run using nine bioclimatic variables. Both models were generated by setting the random test percentage to 30% (Thapa et al. 2018). For future climate predictions, the model was run using the nine uncorrelated bioclimatic variables and the occurrence points first calibrated with the current environment layers and then projected on future climate layers for RCP 2.6, RCP 4.5, and RCP 8.5 for both years 2050 and 2070. The variables used for each model are given in Table 3. The quality of all models was assessed using the average AUC (area under the curve) values and are recorded in Table 5. Models were also evaluated by calculating true skill statistic (TSS) (Allouche et al. 2006). By default, MaxEnt performs multivariate environmental similarity surface (MESS) analysis, which was also used (Elith et al. 2010).

Mapping
Mapping was prepared in the QGIS 3.4 Madeira software, and for each of the generated maps for each model, the corresponding 10th percentile training presence was set as the threshold in order to remove areas with low probability of occurrence (Young et al. 2011). Area was calculated from the generated maps using the QGIS 3.4 Madeira software.

Population assessment
The surveyed habitat revealed that the population of S. bimaculata (Fig. 2) is quite low with only 36 individuals (n=36) counted across eleven suitable habitats in the Darjeeling region. The lowest count of individual S. bimaculata was one whereas the maximum count within a quadrat was observed to be nine individuals. The estimated density was 3.27 with abundance score of 3.3. The abundance to frequency ratio was found to be 0.03, indicating random distribution of the taxa across the habitat. The phytosociological assessment of the habitat of S. bimaculata showed the occurrence of tree species like Acer campbellii, Acer sikkimense, Acer thomsonii, Cryptomeria japonica, Eriobotrya dubia, Eriobotrya petiolata, Eurya acuminata, Exbucklandia populnea, Macaranga denticulata, Neolitsea impressa, and species of Symplocos. The estimation of different phytosociological parameters in recognized habitat of S. bimaculata revealed 45 species of shrubs and herbs that belonged to 38 genera and 27 families with a total of 183 individuals in eleven recognized sites from Darjeeling Himalaya. The family Rosaceae showed highest number with 8 species under 4 genera followed by Compositae and Urticaceae with 3 species each. Acanthaceae, Araliaceae, Balsaminaceae, Hydrangeaceae, Primulaceae, and a fern family Pteridaceae showed 2 species each within the habitat.
The total density calculated for the shrubs and herbs was 16.64 with highest density of 1.45 expressed by Melissa axillaris and Oplismenus compositus. The total abundance of taxa associated with S. bimaculata in its niche was estimated to be 134.6 with highest abundance expressed by Melissa axillaris. The score for importance value index for the shrubs and herb taxa within the habitat ranged between 2.85 and 22.19 with highest score for Melissa axillaris and lowest for Helwingia himalaica, Lysimachia japonica, Maesa chisia, Osbeckia melastomacea, Rubus splendidissimus, Smilax aspericaulis, and Viburnum mullaha as recorded in Table 4.
The estimation for the diversity indices showed that the Shannon index was 3.52, the richness index was estimated as 3.32, the evenness index as 0.92, and the Simpson's index of dominance was recorded to be 0.96 for the associated species within the habitat.

Species distribution modeling
The predicted distribution of S. bimaculata for the complete model (all 22 variables) and the uncorrelated model (9 uncorrelated variables) is shown (Fig. 3).
Both of these models performed well in MaxEnt, since for each, the AUC values were higher than 0.9, with the mean AUC values being 0.982 and 0.980 for the complete and the uncorrelated model respectively ( Table 5). The TSS value was approximately 0.78 and 0.64 for the complete and uncorrelated model respectively (Table 5). For each of the complete and uncorrelated model, the spatial distribution of the species in the study area is 3.92% (451.89 km 2 ) and 7.55% (869.48 km 2 ) respectively (Table 5). In the complete model, the percentage of contribution was highest for BIO13 (precipitation of wettest month) at 39.1%. In the uncorrelated model, BIO18 (precipitation of warmest quarter) showed the highest contribution at 40.2% (Table 5). Figure 4 depicts the ROC curves along with the AUC value. Overall, both the complete and uncorrelated models show very similar curves and AUC   values. Figure 5 illustrates the Jackknife of the complete and the uncorrelated model. The potential spatial distribution of each generated model for future distribution in RCP 2.6, RCP 4.5, and RCP 8.5 in the year 2050 and year 2070 are shown in Fig. 6. The percentage of the study area where S. bimaculata occurs after applying the threshold is recorded in Table 5. For each of the six future predictions, the AUC value ranged from 0.982 to 0.985 making all six models well-performing. The TSS value ranged from approximately 0.63 to 0.79. Among the future models, the probable area of occurrence for S. bimaculata is highest for the year 2050 with 2.07% (238.38 km 2 ) in RCP 4.5 and the lowest estimated area is 0 (0 km 2 ) for the year 2070 in RCP 8.5 (Fig. 6). Figure 7 illustrates the ROC curves of the six future models with the graphs being very similar. The corresponding AUC values are given in Table 5. Figure 8 shows the jackknife of all future models computed. Here, elevation (elev) is the most influential variable while the coldest quarter precipitation (bio_19) is the most influential bioclimatic variable. Figure 9 shows the result of MESS analysis of the species for all future models. Figure 10 shows the response of the species to the different bioclimatic variables.

Discussion
The present study explores both the habitat assessment of S. bimaculata and its spatial distribution in the present and future climate change scenarios. Only 36 individual plants were recorded with abundance to frequency ratio of 0.05 indicating random distribution in the study area. Since the individual count of S. bimaculata was observed to be low in number, the species was modeled under the present and projected under the predicted future climate conditions in Darjeeling-Sikkim Himalaya. The species most dominant in the habitat of S. bimaculata were Melissa axillaris and a grass species Oplismenus compositus.
As S. bimaculata is considered endangered by area of occupancy in India (Bachman et al. 2011), the MaxEnt algorithm was used to assess habitat suitability of S. bimaculata using two different suites of environmental variables. Such species distribution modeling is guided by both spatial coordinates and ecological variables in MaxEnt (Phillips and Dudík 2008). Preceding studies involving species distribution modeling in India has explored medicinally significant species such as Berberis aristata (Ray et al. 2011), Justicia adhatoda (Yang et al. 2013), and Brucea mollis (Borthakur et al. 2018); endangered species such as Gymnocladus assamicus (Menon et al. 2010), Ilex khasiana (Adhikari et al. 2012), and Neottia cordata (Tsiftsis et al. 2019); invasive species such as Hyptis suaveolens (Padalia et al. 2014); and the bamboo Yushania maling (Srivastava et al. 2018) along with tree species like Fagus sylvatica (Castaño- Santamaría et al. 2019). In the present study, all the models had AUC values higher than 0.9. Similar AUC values are reported for Vincetoxicum arnottianum (Khanum et al. 2013), Hyptis suaveolens (Padalia et al. 2014), Scutellaria baicalensis , and Oxytenanthera abyssinica (Gebrewahid et al. 2020). TSS value is an alternative measure of model performance (Allouche et al. 2006). The TSS values ranged from 0.6 to 0.8 for all models, indicating that the models are not very robust.
The nine variables used in the uncorrelated model were utilized for future climate change projections of RCP 2.6, RCP 4.5, and RCP 8.5 for both the year 2050 and the year 2070. It should be noted that the threshold used to calculate area was 10th percentile training presence, which removes areas with very low probability of species occurrence. In RCP 8.5, no mitigation strategies are applied, and the temperature is expected to rise to 2.0°C above the pre-industrial level in the year 2050 and 3.7°C in the year 2070 (IPCC 2014). In such a scenario, suitable habitat disappears completely in the year 2070 (Table 5). In RCP 4.5, moderate mitigation strategies are adopted, and the temperature increases to 1.4°C in the year 2050 and 1.8°C in the year 2070 (IPCC 2014). The percentage of area for the year 2050 is 2.07%, and in the year 2070, the area is 1.28%. It is possible that due to moderate mitigation, warming results in opening up of previously colder regions. In RCP 2.6, all expected climate change alleviation goals have been fulfilled, and temperature increases by 1.0°C above pre-industrial levels in the year 2050 and does not further increase in  the year 2070 (IPCC 2014). The percentage of area in 2050 for RCP 2.6 is 0.73%, and it decreases to 0.03% in the year 2070. The smaller increase of global temperature may mean that areas of higher elevations remain unsuitable while other bioclimatic variables are influenced enough to decrease suitable habitat area. Overall, a similar trend is observed across all three RCPs wherein suitable habitat decreases with time. It was observed that the largest area for the year 2050 and the year 2070 was for RCP 4.5. This could be due to the warming climate, which lets the species spread towards higher altitudes, but more elevated temperatures of RCP 8.5 result in a restricted range for the species. Studies conducted by Parolo and Rossi (2008) and Matteodo et al. (2013) demonstrate that changing climate pushes plants towards higher elevations and latitudes. Similarly, S. bimaculata migrates northwards.
Overall, elevation had the most influence on species distribution with it having the highest training gain among all the variables both by itself and by its absence regardless of the combination of variables used. In MESS analysis of future models, the similarity of the future and the current bioclimatic variables are compared. Darker areas show that future data is outside the range of the current data. The species response curves demonstrate the response of S. bimaculata to each variable, particularly altitude (peaking at 2000 m AMSL), temperature (between 18 and 34°C) and precipitation. Climate change therefore perceptibly affects the spatial range of S. bimaculata in the future with the range first decreasing in the year 2050 and then further declining in the year 2070 (Remya et al. 2015). Samaddar et al. (2013) reports the presence of high concentrations of bioactive compounds such as swertiamarin and amarogentin, which makes this species further vulnerable to exploitation. The species of Swertia are traditionally precious in this part of the Himalaya, and S. chirayita, a critically endangered plant of the Himalaya, has numerous medicinal properties including anti-cancerous potential Dhawan 2005, Saha et al. 2004). Yonzone (2017) reports the use of S. bimaculata mainly as a substitute for S. chirayita. A decoction of either fresh or dried plant mass S. chirayita is orally administered to treat common ailments. The collection of S. chirayita has extensively reduced the population of the species, and therefore, as its substituent, S. bimaculata, locally known as "Bhaley chirauto," is used widely in this part of the Himalaya. Thus, overharvesting coupled with rapid urban development in the Darjeeling-Sikkim Himalaya makes this species a vulnerable one. Due to such activity, the population of S. bimaculata has reduced extensively with sporadic distribution and decline in population in natural vegetation. A decrease in the spatial range of a species increases the potential risk of local extinction (Thomas et al. 2004). Climate change is another hazard to this species as it can lead to further shrinkage of the potential habitat. Human-induced climate change is directly affecting the biological species from genes to ecosystems (Hoffmann et al. 2019). Active conservation efforts and management strategies are utmost necessary at this point to preserve the species in situ as the population of the species is seemingly declining at an alarming rate.
While MaxEnt is an efficient tool for modeling endangered species (Gebrewahid et al. 2020), the accuracy is limited by the methodology and data applied during modeling. For instance, the collection procedure of the occurrence data alongside the low number of occurrence points can make the modeling approach biased. Thus, it is necessary to use correct and reliable habitat models to reduce the biasness (Sobek-Swant et al. 2012). However, the models generated in this study provide an insight to the possible distribution of Swertia bimaculata in present and future.