Modeling fire ignition probability and frequency using Hurdle models: a cross-regional study in Southern Europe

Wildfires play a key role in shaping Mediterranean landscapes and ecosystems and in impacting species dynamics. Numerous studies have investigated the wildfire occurrences and the influence of their drivers in many countries of the Mediterranean Basin. However, in this regard, no studies have attempted to compare different Mediterranean regions, which may appear similar under many aspects. In response to this gap, climatic, topographic, anthropic, and landscape drivers were analyzed and compared to assess the patterns of fire ignition points in terms of fire occurrence and frequency in Catalonia (Spain), Sardinia, and Apulia (Italy). Therefore, the objectives of the study were to (1) assess fire ignition occurrence in terms of probability and frequency, (2) compare the main drivers affecting fire occurrence, and (3) produce fire probability and frequency maps for each region. In pursuit of the above, the probability of fire ignition occurrence and frequency was mapped using Negative Binomial Hurdle models, while the models’ performances were evaluated using several metrics (AUC, prediction accuracy, RMSE, and the Pearson correlation coefficient). The results showed an inverse correlation between distance from infrastructures (i.e., urban roads and areas) and the occurrence of fires in all three study regions. This relationship became more significant when the frequency of fire ignition points was assessed. Moreover, a positive correlation was found between fire occurrence and landscape drivers according to region. The land cover classes more significantly affected were forest, agriculture, and grassland for Catalonia, Sardinia, and Apulia, respectively. Compared to the climatic, topographic, and landscape drivers, anthropic activity significantly influences fire ignition and frequency in all three regions. When the distance from urban roads and areas decreases, the probability of fire ignition occurrence and frequency increases. 
Consequently, it is essential to implement long- to medium-term intervention plans to reduce the proximity between potential ignition points and fuels. In this perspective, the present study provides an applicable decision-making tool to improve wildfire prevention strategies at the European level in an area like the Mediterranean Basin where a profuse number of wildfires take place.


Background
The Mediterranean Basin represents a hotspot in terms of fire occurrence (Pausas et al. 2008). In the last four decades, nearly 2 million fire events occurred in the Mediterranean regions of Southern Europe, i.e., Portugal, Spain, France, Italy, and Greece, altogether affecting more than 15 million hectares (ha). In 2018, more than 10,000 fires covering an area of 44,643 ha were recorded in Italy and Spain (San-Miguel-Ayanz et al. 2018). The mounting spread of megafires across Southern Europe (San-Miguel-Ayanz et al. 2013a) has spurred the European Commission to implement large-scale strategies aimed at reducing fire ignition probability and frequency (Elia et al. 2016; Lafortezza et al. 2013; Alcasena et al. 2019). To this end, the role of the scientific community is crucial for the transmission of knowledge to operational agencies (e.g., Civil Protection, Forest Service). More accurate predictive models can help land management agencies understand where fires are more likely to ignite in a given landscape (Oliveira et al. 2012; Lafortezza et al. 2015).
A large body of literature exists on the estimation of fire ignition probability and frequency at global, continental, and regional scales (Miranda et al. 2012; Ganteaume et al. 2013; Guo et al. 2016; Costafreda-Aumedes et al. 2017; Oliveira et al. 2017; Viedma et al. 2018). In particular, a variety of studies have estimated the probability of fire ignition in the Mediterranean Basin using logistic regression (González-Olabarria et al. 2007; Catry et al. 2009; Martínez et al. 2009; Vilar del Hoyo et al. 2011), geographically weighted logistic regression (Koutsias et al. 2010; Oliveira et al. 2014; Rodrigues et al. 2014), and machine learning techniques (Oliveira et al. 2012; Martín et al. 2019). Fire ignition frequency has mainly been investigated by applying count models, such as Poisson regression (Faivre et al. 2014; Boubeta et al. 2015; Rodrigues et al. 2016) and negative binomial regression (Quintanilha and Ho 2006). Despite human manipulation, the estimation of fire ignition probability and frequency is characterized by a high degree of stochasticity (Elia et al. 2019). For this reason, a deeper understanding of wildfire ignition occurrence requires increasingly innovative approaches. We believe there is much room for improvement in this field, which can be pursued by exploring models used in other areas of research.
In this regard, the use of Hurdle models could substantially improve the understanding of fire ignition probability and frequency. In comparison with simpler models, Hurdle models can account for distributions deviating from normality, such as those of ignition points, where the number of zero values (i.e., absence of ignition) is considerably larger than the number of nonzero values (i.e., presence of ignition). For such data, the use of these models can improve accuracy (Xiao et al. 2015).
Currently, few studies in the literature employ Hurdle models to estimate fire ignition probability and frequency. Of those found, for example, Serra et al. (2014) used the Poisson Hurdle model to analyze the occurrence of megafires (> 50 ha) and their potential causes in Spain. Elia et al. (2020) investigated the likelihood and frequency of fire recurrence using a Negative Binomial Hurdle model in South Italy, while Xiao et al. (2015) employed several count models, including Hurdle models, to analyze fire occurrences in China.
Our study was intended to fill this gap; therefore, we developed a cross-regional study to estimate fire ignition probability and frequency employing a Hurdle model in three regions of the Mediterranean Basin: Catalonia (Spain), Sardinia, and Apulia (Italy). Within these geographical regions, we specifically aimed to (1) assess the soundness of Hurdle models in estimating fire ignition occurrence in terms of probability and frequency, (2) compare the most influential drivers of wildfires among the three study regions, and (3) produce fire probability and frequency maps.
Innovative approaches are crucial for identifying the similarities and differences between the drivers of fires and their environmental impacts across European regions (San-Miguel-Ayanz et al. 2013b). This study also intends to corroborate prior research and provide new insight for implementing cross-regional fire management strategies in Mediterranean ecosystems, particularly in the broader context of European wildfire management policy.

Study areas
This study focuses on three regions of the Mediterranean Basin: Catalonia (Spain), Sardinia, and Apulia (Italy) (Fig. 1). Catalonia is located between the geographic coordinates 40°31′ 22.5″ N, 0°09′ 58.9″ E and 42°51′ 40.5″ N, 3°19′ 14.3″ E in northeastern Spain. Along its coasts, the climate is typically Mediterranean with mild winters and hot summers and continental in hinterland areas characterized by warm and dry summers and cold winters. The average annual rainfall varies from 500 mm in the coastal zone to 1000 mm in the Pyrenees mountain range bordering with France.
The mean annual temperatures range from 17 to 0°C while the altitude ranges from sea level to greater than 3000 m. More than half of Catalonia's territory is covered by shrubland (38%) and forest (23%) (González-Olabarria et al. 2007). In the Pyrenees and Pre-Pyrenees, where the climate is more of the continental type, forests comprise Pinus uncinata Mill., Fagus sylvatica L., and P. sylvestris L., whereas in the Mediterranean areas, the main tree species are P. halepensis Mill., Quercus ilex L., P. nigra, Q. suber L., Q. humilis Mill., and P. pinea L. (Saura and Piqué 2006).
The regions of Sardinia and Apulia are located in Southern Italy between 38°51′ 55.3″ N, 8°08′ 11″ E and 41°18′ 28″ N, 9°49′ 40″ E, and 39°47′ 29.8″ N, 14°56′ 02.8″ E and 42°08′ 25.5″ N, 18°31′ 09.5″ E, respectively. Sardinia is a hilly region with high topographical variability while Apulia is a predominantly flat region with small hills in the northwest. The average elevation is 338 m and 565 m asl, respectively. Both Mediterranean regions are characterized by hot and dry summers and mild winters. Most of the annual rainfall occurs during the fall and winter, with an average varying between 400 and 1000 mm in Sardinia and 450 and 650 mm in Apulia. In Apulia, the mean annual temperature ranges from 12.0°C inland to 19.0°C along the coasts while in Sardinia, it varies from 11.6 to 18.0°C (Canu et al. 2015). Furthermore, Sardinia is a wooded region with 24% of forest cover (583,472 ha) compared to Apulia, which is one of the least wooded regions in Italy with only 7% of forest cover (145,889 ha) (Gasparini et al. 2013). In both regions, vegetation is typically Mediterranean, and most forests are represented by deciduous species belonging to the Quercus genus. The main species found in Apulia are Q. ilex L., Q. pubescens Willd., Q. cerris L., and Q. coccifera L. Another important component of the woodland in these regions is the Mediterranean maquis, which consists of Phillyrea spp., Ruscus aculeatus L., Pistacia lentiscus L., Asparagus acutifolius L., Paliurus spina-christi Mill., Cistus monspeliensis L., C. incanus L., and C. salviifolius L. Maquis and woodlands (mainly Q. ilex L., Q. suber L., and Q. pubescens Willd. stands) combined with pastures dominate Sardinia's hinterland, while agricultural areas cover about 45% of the island along the coasts and in the plains (Bachetta et al. 2009; Farris et al. 2013; Bajocco et al. 2019). Demographically speaking, Catalonia and Apulia are characterized by high population density compared to Sardinia.
In 2017, Catalonia counted more than 7 million inhabitants, making it the second most populated region in Spain with a share of 16.0% of the total population. Sardinia, instead, is the third largest region in Italy (24,106 km²), but one of the least populated with nearly 70 inhabitants per square kilometer. Although Apulia's surface area is smaller (19,354 km²) than Sardinia's, it is more densely populated, with approximately 4 million inhabitants (Eurostat 2018).

Data acquisition

Independent variables
The drivers that affect the occurrence of fire ignition points (FIPs) depend on climatic, topographic, and landscape conditions as well as anthropic effects (Costafreda-Aumedes et al. 2018;Bajocco et al. 2019). In this study, 15 predictors were identified and selected as independent variables (Table 1). For each of the three regions, climatic and topographical data were drawn from European databases.
Specifically, climatic data consisting of maximum temperatures and dry days for the 2000-2012 time period were selected from the E-OBS dataset of the EU-FP6 UERRA project (Cornes et al. 2018), while topographical data were obtained from the 30-m Digital Elevation Model (DEM) of the European Environment Agency. Land cover data were extracted from different sources made available by the regional agencies of Catalonia, Sardinia, and Apulia. For each region, land cover (percentage) was divided into eight classes: agriculture, forest, grassland, other lands, shrubland, urban, waterland, and wetland. Population density, distance from roads, and distance from urban areas were selected as anthropic drivers. For each region, the population density was acquired from the GEOSTAT 2011 grid dataset provided by the European Commission. The land cover dataset was used as the starting point for calculating the distances from roads and urban areas. Lastly, a 1-km² grid was created to extract the mean values of all predictors with geographic information systems (GIS) tools (QGIS and ESRI ArcMap).

Dependent variables
For each of the three study regions, FIPs were acquired for the 2000-2012 time period. These data represent the most similar and harmonized datasets of fire ignition among the regions within the considered period of study. FIPs were derived from different sources according to region. Fire ignition datasets were obtained from the Ministerio de Agricultura Alimentación y Medio Ambiente (Catalonia), Regional Forestry Corps (Sardinia), and Civil Protection Department (Apulia).
The bar chart in Fig. 2 illustrates the FIP trends for the period of reference for each region. Compared to Catalonia and Apulia, Sardinia shows the highest number of FIPs, which is never below 2000; the region peaks in 2010, exceeding 3500 ignitions. In the 2000 to 2006 time period, Catalonia registers a higher number of FIPs than does Apulia, but in 2007 and 2008, the number decreases below that of Apulia. In 2010 and 2011, the number of FIPs in Apulia is almost identical to that in Catalonia, but decreases in 2012. Table 2 provides some descriptive FIP statistics (sum, mean, max, min, standard deviation) for each study region during the 2000-2012 time period. The occurrence (presence/absence of ignitions) and frequency of FIPs (i.e., number of ignitions per cell) were extracted with a 1-km² grid as dependent variables (Zhang et al. 2016). In each region, FIP frequency is unevenly distributed (Fig. 3), with a high share of zero values (number of ignitions = 0) and a low share of positive counts (number of ignitions ≥ 1). These features led us to choose Hurdle models to investigate fire ignition occurrence and frequency in the study regions, given their capacity to analyze both the presence or absence of FIPs and their frequency.
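The extraction of occurrence and frequency per grid cell, and the zero excess it reveals, can be sketched as follows. This is a minimal illustration in plain Python, not the GIS workflow used in the study: the function names, the projected coordinates in metres, and the 3 × 3 block of cells are all hypothetical.

```python
from collections import Counter

CELL_SIZE_M = 1000.0  # 1-km grid cells, as in the text


def cell_index(x, y, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a point."""
    return (int(x // cell_size), int(y // cell_size))


def grid_counts(points, cells, cell_size=CELL_SIZE_M):
    """Return per-cell occurrence (0/1) and frequency (count) of ignition points."""
    counts = Counter(cell_index(x, y, cell_size) for x, y in points)
    frequency = {cell: counts.get(cell, 0) for cell in cells}
    occurrence = {cell: int(n > 0) for cell, n in frequency.items()}
    return occurrence, frequency


def zero_share_and_dispersion(values):
    """Return the share of zero counts and the variance-to-mean ratio."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance
    zero_share = sum(1 for v in values if v == 0) / n
    return zero_share, variance / mean


# Hypothetical ignition points (metres) and a hypothetical 3 x 3 block of cells.
points = [(120.0, 80.0), (950.0, 400.0), (1500.0, 200.0), (2999.0, 2999.0)]
cells = [(col, row) for col in range(3) for row in range(3)]

occurrence, frequency = grid_counts(points, cells)
zero_share, dispersion = zero_share_and_dispersion(list(frequency.values()))
print(frequency[(0, 0)], occurrence[(1, 1)], round(zero_share, 3), round(dispersion, 3))
```

A variance-to-mean ratio above 1 together with a large share of zero cells is the pattern that motivates a hurdle specification over a plain Poisson model.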

Negative Binomial Hurdle model
Under the Poisson assumption, the mean and the variance of the dependent variable are equal; when the variance exceeds the mean, the variable is overdispersed. For this reason, the Negative Binomial Hurdle model, following the hurdle approach introduced by Mullahy (1986), was chosen for this study. Hurdle models are a class of statistical models for count data with a preponderance of zeros. The model analyzes the distribution of the dependent variable in two parts: hurdle (Eq. 1) and count (Eq. 2).
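For reference, the two parts of the model are commonly written as below. This is the standard textbook form of a hurdle model with a logistic zero part and a zero-truncated negative binomial count part, written with the symbols used in this section (α_0, β_1…β_n, Γ, Θ, λ); it is a sketch and may differ in notation from the numbered equations cited in the text. Here π_i is shorthand introduced for P(Y_i > 0), and λ and Θ are assumed to be the mean and dispersion parameters of the negative binomial density.

```latex
\begin{align*}
\pi_i = P(Y_i > 0) &= \frac{1}{1 + e^{-(\alpha_0 + \beta_1 x_1 + \dots + \beta_n x_n)}} \\
P(Y_i = 0) &= 1 - \pi_i \\
P(Y_i = y) &= \pi_i \, \frac{F_{\mathrm{count}}(y)}{1 - F_{\mathrm{count}}(0)}, \qquad y = 1, 2, \dots \\
F_{\mathrm{count}}(y) &= \frac{\Gamma(y + \Theta)}{\Gamma(\Theta)\, y!}
  \left(\frac{\Theta}{\Theta + \lambda}\right)^{\Theta}
  \left(\frac{\lambda}{\Theta + \lambda}\right)^{y}
\end{align*}
```

Dividing by 1 − F_count(0) truncates the negative binomial at zero, so that the probabilities of the positive counts sum to π_i and the whole distribution sums to one.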
P is a probability mass function, and Y_i represents FIPs. In the hurdle part, logistic regression was chosen to determine the presence (Y = 1) or absence (Y = 0) of FIPs (F_zero, Eq. 3), where e is Euler's number, α_0 is the intercept, and β_1…β_n are the regression coefficients for each independent variable (x_1…x_n). Meanwhile, negative binomial regression provides the probability function for the count part (F_count, Eq. 4), used to determine FIP frequency (Y > 0).
The Negative Binomial Hurdle model is recommended when the observed outcome has a mean lower than its variance (Spano et al. 2019). In the count part (Eq. 4), Γ is the gamma function, and Θ and λ are the parameters of the function (Li et al. 2020). The hurdle() function from the R "pscl" package (Jackman 2017) was used for the analysis. The same explanatory variables were used for the hurdle and count parts, and a separate model was created for each of the three regions. For each part (hurdle and count), the predictions were extracted and used to produce the fire ignition maps.
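How the two parts combine into a single distribution can be checked numerically. The sketch below, in plain Python, evaluates the standard hurdle negative binomial probability mass function; it does not reproduce the fitting done by hurdle() in pscl, and the linear predictor eta, the mean lam, and the dispersion theta are hypothetical values chosen only for illustration.

```python
import math


def logistic(eta):
    """Probability of at least one ignition, given the linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def nb_pmf(y, lam, theta):
    """Negative binomial pmf with mean lam and dispersion theta (log scale for stability)."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + lam))
        + y * math.log(lam / (theta + lam))
    )
    return math.exp(log_p)


def hurdle_nb_pmf(y, eta, lam, theta):
    """Hurdle pmf: logistic zero part, zero-truncated negative binomial count part."""
    p_positive = logistic(eta)
    if y == 0:
        return 1.0 - p_positive
    return p_positive * nb_pmf(y, lam, theta) / (1.0 - nb_pmf(0, lam, theta))


def hurdle_nb_mean(eta, lam, theta):
    """Expected count: P(Y > 0) times the mean of the truncated count part."""
    return logistic(eta) * lam / (1.0 - nb_pmf(0, lam, theta))


# Hypothetical parameter values, not estimates from the study.
eta, lam, theta = -1.0, 1.5, 0.8

total = sum(hurdle_nb_pmf(y, eta, lam, theta) for y in range(500))
mean_by_sum = sum(y * hurdle_nb_pmf(y, eta, lam, theta) for y in range(500))
print(round(total, 6), round(mean_by_sum, 6), round(hurdle_nb_mean(eta, lam, theta), 6))
```

The probability of a zero depends only on the hurdle part, while the expected count depends on both parts, which is why separate probability and frequency maps can be derived from one fitted model.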

Model validation
To validate the Negative Binomial Hurdle models for Catalonia, Sardinia, and Apulia, the databases were split into two parts: training and test sets. The training set, formed by 70% of the data, was used to develop the model. The test set, formed by the remaining 30% of the data, was used to validate the models' results. To evaluate the performance of the models, the receiver operating characteristic (ROC) curve and area under the curve (AUC) were calculated for the hurdle part, while the Pearson correlation coefficient and root mean square error (RMSE) were calculated for the count part. This process was repeated five times with different samples of training and test sets to assess the stability of the models. In addition, metrics over the entire dataset were calculated.
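The validation metrics named above can be sketched in plain Python as follows. The function names and the toy inputs are hypothetical, the random 70/30 split is only one possible sampling scheme, and the AUC is computed with the pairwise (rank-based) definition rather than by tracing the ROC curve.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def rmse(observed, predicted):
    """Root mean square error between observed and predicted counts."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy inputs (hypothetical, not data from the study).
train, test = train_test_split(range(10))
print(len(train), len(test))
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 3 of 4 pairs ranked correctly: 0.75
```

Repeating the split with five different seeds and comparing the resulting metrics mirrors the stability check described in the text.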

Results
Fire ignition probability and frequency
Table 3 shows the coefficients of independent variables estimated by the Negative Binomial Hurdle model for Catalonia, Sardinia, and Apulia. The effect of independent variables on FIP occurrence was analyzed in the hurdle part, whereas the influence of these variables on FIP frequency was evaluated in the count part.

Fire ignition probability
A negative relationship between anthropic variables and FIP occurrence was shown in the hurdle part of the models. However, while the relationship between the occurrence of FIPs and the distance from urban areas is stronger than in the count part for Apulia (− 0.48) and Sardinia (− 0.54), the correlation with distance from roads is less strong. Compared to the count part, the landscape variables in the hurdle part were more significant in determining fire occurrence. In all three regions, a positive correlation was found between FIP occurrence and different land cover classes according to region. The most significant classes of land cover recorded were forest (13.7), agriculture (5.81), and grassland (0.50) for Catalonia, Sardinia, and Apulia, respectively. In general, landscape variables are good predictors, but the coefficient values of Catalonia were higher and more significant than those of Apulia and Sardinia. In addition, a negative correlation was found between FIP occurrence and DEM for Apulia and Catalonia, whereas in Sardinia, this correlation was positive. In the case of climatic data, there was a positive correlation between FIP occurrence and maximum temperature only for Sardinia; in particular, when the dry days (0.25) and temperature (0.11) increased, the occurrence of FIPs also increased.

Fire ignition frequency
FIP frequency was significantly linked to anthropic data. In all three regions, an inverse correlation was found between the number of FIPs and distance from roads. Therefore, when the distance from roads decreased, the number of FIPs increased (Table 3). In addition, FIP frequency was less strongly inversely correlated with distance from urban areas; moreover, this variable was significant only for Sardinia (−0.41) and Catalonia (−0.36). The population density was not significant for Apulia and Sardinia, whereas for Catalonia, it was positively correlated with FIPs (0.15). Consequently, the frequency of FIPs increased where population density was high (only for the region of Catalonia). The climatic and topographic data demonstrated contrasting effects among the regions. A negative correlation was shown between FIP frequency and topographical data (DEM and slope) for Apulia and Catalonia, whereas in Sardinia, this correlation was positive. The landscape variables were not significant for Sardinia and Catalonia, whereas a positive correlation was found between FIP frequency and the presence of forest, grassland, and shrubland in the Apulia region.

Validation
After calibrating the Hurdle models, we used the test set (30% of data) to evaluate their performances. For each region, cross-validations were carried out on five subsamples of training and test sets to assess potential overfitting. Table 4 shows the performance metrics calculated for the overall model and five subsamples. The results show that no overfitting was observed, as demonstrated by the similar metrics values among the subsamples.

Cross-regional comparison of fire drivers
We estimated the probability density function of fire ignition occurrence in relation to the six main explanatory variables (in the climatic, anthropic, and topographic categories) to compare the magnitude of each driver across the three study regions (Fig. 4). We found similar trends among the regions for the following four explanatory variables: mean maximum temperature, distance from roads, population density, and DEM. For example, Fig. 4a illustrates that the trend of fire ignition probability was similar among the regions and reached high values when the mean maximum temperature was in the 32 to 38°C range. When the distance from roads decreased, the fire ignition probability increased (Fig. 4c). We observed a similar trend among the regions for the mean distance from roads, population density, and elevation (DEM) (Fig. 4c, e, f). In particular, the curves suggest that fire ignition probability was high when elevation was between 0 and 500 m. In the case of Sardinia and Catalonia, the highest fire ignition probability occurred around 250 m, while in Apulia, two peaks were found at 0 and 500 m. Furthermore, fire ignition probability peaked when the mean distance from roads was approximately 300 m (Fig. 4c). Lastly, the population density trend was similar among the three regions, as high values of fire ignition probability occurred in the range between 0 and 100 inhabitants per square kilometer (Fig. 4e).
Different trends were observed for the two remaining drivers, one climatic and one anthropic: the number of dry days and the mean distance from urban areas (Fig. 4b, d, respectively). For example, in Apulia and Sardinia, fire ignition probability reached high values when the average number of dry days ranged between 65 and 80, while in Catalonia, the corresponding range was lower (40-50 dry days, Fig. 4b). Moreover, Fig. 4d suggests a difference between fire ignition probability trends in relation to distance from urban areas. In Apulia and Sardinia, the highest values of fire ignition probability corresponded to a distance between 0 and 500 m from urban areas, whereas in Catalonia, the highest values were recorded at a distance of 1000 m.

Fire ignition maps
One of the objectives of this study was to create a fire ignition probability map and fire frequency map for each of the three regions. Maps a-c are the output of the hurdle part, while maps d-f represent the output of the count part of the Negative Binomial Hurdle models (Fig. 5). According to the fire ignition probability maps produced, the region of Catalonia did not exhibit a high probability of fire ignition, although the east coast and central hollow are characterized by a medium-high probability (Fig. 5a). The lowest fire ignition probability was found in north Catalonia on the border with France. In contrast, the regions of Sardinia and Apulia presented high ignition probabilities. Sardinia showed the highest probability of fire ignition almost over the entire territory, especially on the west coast from north to south (Fig. 5b). In this region, the areas with low probability were scattered and mostly found in the southwest and along the east coast in a fragmented pattern. In Apulia, however, areas such as the Gargano Promontory, the Daunian Subapennines, and the central region showed reduced high fire ignition probabilities (Fig. 5c). In addition, this region has two very large areas with very low fire ignition probabilities, the "Piana Salentina" and the "Tavoliere delle Puglie", which are bounded by the Daunian Pre-Apennines in the west and the Adriatic Sea in the east.
Furthermore, maps containing the average fire frequency predicted by the models were also produced based on three frequency ranges: low (0-1), medium (1-2), and high (> 2). A correspondence was observed between the frequency of fires and the probability of fire occurrence; specifically, areas with high frequencies showed higher probabilities of fire occurrence (Fig. 5d-f). This correspondence was more evident for Sardinia, where more areas with high frequencies were observed (Fig. 5e). In contrast, the maps showed that Catalonia and Apulia had fewer areas with a fire frequency > 2 (Fig. 5d, f). In Catalonia, these areas are mainly located around the center of Barcelona, whereas in Apulia, they are located in the northeast and in the center of the region.
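The three frequency ranges used for the maps can be expressed as a small classification rule. The sketch below is illustrative only: the function name is hypothetical, and because the text does not say which class the boundary values 1 and 2 belong to, assigning them to the lower class is an assumption (chosen so that "high" means strictly greater than 2, as stated).

```python
def frequency_class(predicted_count):
    """Map a predicted mean ignition count per cell to a map class."""
    if predicted_count < 0:
        raise ValueError("predicted counts cannot be negative")
    if predicted_count <= 1:  # assumption: the boundary value 1 is treated as low
        return "low"
    if predicted_count <= 2:  # assumption: the boundary value 2 is treated as medium
        return "medium"
    return "high"  # strictly greater than 2, as in the text


# Hypothetical predicted counts for four cells.
classes = [frequency_class(v) for v in (0.2, 1.4, 2.0, 3.7)]
print(classes)
```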

Discussion
In this study, the Negative Binomial Hurdle model, an analytical tool rarely adopted in wildfire-related studies to assess fire ignition occurrence (presence/absence and frequency), was applied to three Mediterranean regions of Southern Europe: Catalonia, Sardinia, and Apulia. Our findings suggest that the model can be considered a sound approach for the prediction of probability and frequency of fire ignition points in all three study areas. Among the three regions, the differences in AUC values and accuracies are quite small, indicating a robust goodness of fit. According to the literature, AUC values of 0.5 to 0.7 are considered low accuracy, values of 0.7 to 0.9 suggest useful applications, and values around 0.9 indicate high accuracy (Swets 1988). Our results indicated AUC values of 0.71, 0.80, and 0.83 for Catalonia, Sardinia, and Apulia, respectively, between predicted probabilities and observed outcomes. These results are consistent with those of Xiao et al. (2015), who explored count data mixed models to predict fire occurrence in China. The authors found that when the data are dispersed, Hurdle models might give a more satisfactory fit to the data.
Fig. 4 The probability density function of fire ignition occurrence for mean max. temperature (a), mean dry days (b), mean distance from roads (c) and urban areas (d), mean population density (e), and mean elevation (DEM, f)
Fires are triggered by various anthropic, climatic, topographical, and landscape drivers. The topographical drivers employed in the study reveal a contrasting pattern among the regions. Elevation (i.e., DEM) was highly significant in all three regions: it was negatively correlated with fire ignition occurrence and frequency in Apulia and Catalonia, but positively correlated in Sardinia. According to the literature (González-Olabarria et al. 2015; Mancini et al. 2018b), evidence has been found for a relationship between altitude and fire occurrence. González-Olabarria et al. (2015) in Catalonia observed that as elevation increases, ignition density decreases. Mancini et al. (2018a) in Italy demonstrated that fire frequency decreases as elevation increases. In Sardinia, elevation is positively correlated with our response variables because the region is mostly characterized by mountainous areas with flammable vegetation rather than plains. Furthermore, the pastoral economy is widespread in Sardinia's hinterland, and pasture renewal takes place through stubble burning. Slope was significant, with a negative correlation for both our response variables, only in Catalonia, where the geomorphology of the landscape strictly affects flammable species, the microclimate, and ignition sources (Syphard et al. 2008).
Although Catalonia, Sardinia, and Apulia are characterized by different morphologies and landscapes, we found drivers with the same predictive power in explaining our dependent variables. For example, the anthropic drivers (e.g., distance from roads, distance from urban areas) suggest a negative relationship with fire ignition probability (presence/absence). This relationship was also found to be significant for fire ignition frequency. In the Mediterranean region, the strong influence of manmade infrastructures renders the landscape prone to fire ignition and spread (Martínez et al. 2009; Miranda et al. 2012; Martín et al. 2019). In landscapes with a predominance of anthropic activity, such as the Mediterranean countries of Southern Europe, such sources generate the majority of fires (Costa et al. 2011; Oliveira et al. 2014; Pavlek et al. 2016). In this geographical context, the main causes of fires are arson and negligence, which account for 98% of the total number of recorded events. The shorter the distance to roads, the greater the fire probability (Zambon et al. 2019). For instance, in all three regions, road distance showed the highest predictive power, representing one of the main drivers of fire occurrence. These results are in line with previous studies (Guo et al. 2016; Elia et al. 2020). Elia et al. (2019) showed that in Southern Italy, the number of fire ignitions increases as the distance to main roads decreases. This trend is particularly evident in Catalonia where the landscape, covered by a dense road network, is affected by elevated fire occurrence and frequency.
Fig. 5 Fire ignition maps: fire ignition probability (a-c) and frequency (d-f) for Catalonia, Sardinia, and Apulia, respectively. Low fire ignition probability and frequency are shown in green, while high fire probability and frequency are shown in red; gray indicates no data
In Sardinia as well, the highest fire ignition probability in terms of occurrence and frequency is recorded in correspondence with major roads. In Apulia, the situation is similar to that described above, where fire probability and frequency increase as urban interfaces increase (Badia et al. 2019;Ager et al. 2019). In our model, population density unexpectedly showed limited predictive power for all three regions; in Apulia and Sardinia, it was not significant. This is probably due to the fact that most fires are found in rural areas characterized by low population density, and fire ignition is triggered by pastoral activity and stubble burning (Lovreglio et al. 2010). Population density is higher in Catalonia than in the other two regions (Eurostat 2018). Fire ignition probability is concentrated along the coast where population density is higher and human pressure generates a greater amount of ignition sources (Badia et al. 2011;Serra et al. 2014). On the contrary, in the north of the region bordering France, the ignition probability and frequency of these events are considerably reduced. This is perhaps the result of low population density and the climatic factors in the Pyrenees mountains, which do not easily trigger fire ignition and spread (Faerber 2009).
Our study shows that among the land cover/uses, forest and shrubland represent the main drivers of fire probability in Apulia and Catalonia, while in Sardinia, agriculture has the highest predictive power. According to the Third Spanish National Forest Inventory (Alberdi et al. 2017), forests and other natural areas (e.g., shrublands, grasslands) cover almost 60% of Catalonia's territory. We found a positive relationship between the presence of forest and fire occurrence in this region; i.e., fire ignition probability increases with a high percentage of forests. The Joint Research Centre (JRC) Technical report (San-Miguel-Ayanz et al. 2018) supports our finding, according to which 68% of fires in 2018 occurred in the woodlands of Spain. Although Apulia is one of the least wooded regions in Italy, we found the maximum values of fire ignition probability and frequency in areas covered by woodlands. A previous study by Elia et al. (2019) suggests that the probability of ignition occurrence is relevant along the coast in the northern and southern parts of Apulia, especially in urban interfaces with a strong presence of shrubland and Mediterranean maquis. In addition, Sebastián-López et al. (2008) suggest that shrubs represent the main drivers of fire occurrence in their model for Southern Europe. Other studies (Nunes et al. 2005; Moreira et al. 2011) have demonstrated that the presence of shrubs is usually correlated with fire-prone Mediterranean ecosystems.
On the contrary, croplands, vineyards, and olive groves cover most of the northern Apulian region where fires are less likely to ignite and spread. In comparison with Apulia and Catalonia, agriculture is the main driver of fire occurrence and frequency in Sardinia. In this region's hinterland where forest stands dominate, fire ignition probability decreases, whereas along the coast where agricultural activity is predominant, fire ignition probability increases. The effect of agriculture on expected fire ignition probability has previously been reported by Bajocco and Ricotta (2008), who observed a positive relationship between pastoral practices, grazing pressure, and fire frequency and occurrence.
The prediction maps developed by the Hurdle models highlight the spatial effect of anthropic, climatic, and landscape drivers in detecting the most fire-prone areas in Apulia, Sardinia, and Catalonia. Forest managers and decision-makers can use these maps as tools for planning management interventions in a broader fire risk analysis. Although the objectives of this research were to estimate fire ignition probability and frequency by comparing the drivers that generate these patterns in three different regions of the Mediterranean Basin, potential limitations can be observed. To properly manage these events, which are becoming increasingly frequent and severe, building databases with similar acquisition methods at the European level is indispensable to study fire probability and frequency. Further, we intentionally avoided the inclusion of socioeconomic drivers of the three regions, since in Italy, these data are difficult to obtain, especially at the local scale. Nevertheless, this study represents the first approach to comparing fire occurrence and frequency and related drivers in different regions of the Mediterranean Basin. Future efforts are warranted to improve our understanding of fire occurrence and its drivers by comparing homogeneous regions (Giannico et al. 2018). Such efforts could support the European Commission in promoting good practice policies for monitoring and management as well as post-fire resilience activities.

Conclusions
In recent decades, fire regimes have changed, with fires becoming more frequent and severe due to climate change. In this regard, awareness of the similarities and differences among the drivers that trigger fires in different regions of Europe is fundamental for the implementation of a standard intervention protocol. To meet this requirement, it is urgent to understand which main drivers influence fire occurrence and its frequency. Our study shows that all drivers (i.e., climatic, topographic, landscape, and anthropic) contribute to contrasting results in terms of fire probability and frequency. However, of these drivers, anthropic activity strongly influences fire ignition and its frequency across landscapes. Consequently, it is essential to monitor areas which are close to urban settlements by implementing long- to medium-term intervention plans. In this perspective, our study represents a potential decision-making tool for land management agencies in the three regions to identify the most vulnerable areas where major interventions are needed. Additionally, this new approach can introduce different applications in the field of wildfire research. Innovative future studies could focus on comparing multiple count data models (Hurdle models, zero-inflated models, Poisson model, negative binomial model) to understand which of these is the most representative in describing wildfire occurrences.