Spatial point-pattern analysis as a powerful tool in identifying pattern-process relationships in plant ecology: an updated review

Ecological processes such as seedling establishment, biotic interactions, and mortality can leave footprints on species spatial structure that can be detected through spatial point-pattern analysis (SPPA). Widely used in plant ecology, SPPA is increasingly carried out to describe biotic interactions and interpret pattern-process relationships. However, some aspects are still debated, such as the required sample size (in terms of the number of points and plot area), the link between a low number of points and frequently observed random (or independent) patterns, and the relating of patterns to processes. In this paper, an overview of SPPA is given based on rich and updated literature, providing guidance for ecologists (especially beginners) on summary statistics, uni-/bi-/multivariate analysis, unmarked/marked analysis, types of marks, etc. Some ambiguities in SPPA are also discussed. SPPA has a long history in plant ecology and is based on a large set of summary statistics that aim to describe species spatial patterns. Several mechanisms known to be responsible for species spatial patterns are currently investigated in different biomes and for different species. Natural processes, plant environmental conditions, and human intervention are interrelated and are key drivers of plant spatial distribution. Although not recommended, small sample sizes are common in SPPA. In some areas, periodic forest inventories and permanent plots are scarce, although they are key tools for spatial data availability and for monitoring plant dynamics. The spatial position of plants is a valuable source of information that helps in formulating hypotheses about the processes responsible for plant spatial structures. Despite the continuous progress of SPPA, some ambiguities require further clarification.


Background
In its broad sense, structure is a central concept for describing relationships within a system and its patterns (Gadow et al. 2012). Forest structure commonly denotes the spatial distribution of tree attributes within a forest ecosystem and the association of their characteristics (Gadow et al. 2012; Hui et al. 2019). In these ecosystems, every single plant represents a structural component, with attributes such as species identity, abundance, size, and spatial arrangement (Hui et al. 2019). In spatial analysis, a forest stand is represented by a set of points or events (trees in our case), that is, a set of mapped point locations (x, y coordinates) within a study area (Wiegand and Moloney 2004; Pommerening et al. 2011). A spatial pattern is the organization of these points in space, which shows some degree of predictability (Dale 1999). Thus, the spatial structure of the stand can be described by that of the pattern (Goreaud 2000).
Ecological processes such as seedling establishment, biotic interactions, and mortality can leave footprints on species distributions that can be detected by spatial pattern analyses (Petritan et al. 2014; Law et al. 2009). In plant communities, the study of spatial pattern is motivated by the fact that understanding these communities rests first on the description and quantification of their spatio-temporal characteristics (Dale 1999). Because one species can have either a positive or a negative effect on the existence and spatial distribution of another species, one main consequence of a spatial pattern is its influence on other species (Dale 1999). The main aim and challenge of spatial point-pattern analysis (SPPA hereafter) is relating pattern and process, but generally, assessing plant spatial pattern is easier than identifying the underlying processes (Perry et al. 2006; Velázquez et al. 2016). The observed structure may reflect related processes: for example, a regular distribution could reflect a competitive interaction (Wiegand and Moloney 2004). Consequently, structure and process are interdependent: particular structures create specific processes of regeneration, growth, and mortality; conversely, these processes create specific structural arrangements (Gadow et al. 2012).
Spatial structure and tree sizes are related and generally affected by negative interactions, which can occur at two levels: below ground for water and nutrients, and/or above ground for light (Getzin et al. 2006; Yılmaz et al. 2019). Thus, in forest communities, competition, species growth, and mortality are powerful organizing processes which shape species spatial patterns (Dale 1999; Getzin et al. 2008b; Gadow et al. 2012; Petritan et al. 2014). However, one should be careful when interpreting the observed pattern, since the same pattern may be induced by several processes (Wiegand and Moloney 2004) such as regeneration, growth, competitive interaction, reproduction, and plant death mechanisms (Dale 1999). Similarly, depending on environmental factors, the same ecological process can produce different spatial patterns (Perry et al. 2006).
Environmental factors are commonly assessed by first-order summary statistics (or summary characteristics) and are known to act at a large scale, while biotic interactions are detected by second-order summary statistics (SOSS hereafter) and are recognized to operate at a small scale. These SOSS are based on the overall point-to-point distances in a mapped region; they are key tools for identifying spatial pattern types and scales (Wiegand and Moloney 2004). Indeed, species interactions were linked to short-scale patterns, while environmental variables affected large-scale ones (Ziegler et al. 2017). However, taking into account only one of these components may not allow the detection of the real mechanisms behind a given spatial structure (Zhao et al. 2015; Jia et al. 2016).
In spite of the continuous progress, the solid theoretical background, and the wide applications of SPPA in plant ecology, some difficulties may arise when conducting this type of analysis because of several ambiguities. For example, there is no clear consensus about the minimum sample size required (in terms of points and plot area). Independent patterns may occur when using a low number of trees (e.g. Wehenkel et al. 2015; Cordero et al. 2016), which can obscure the real spatial pattern. This motivated me to write this paper, which is particularly addressed to beginners who do not have sufficient background in SPPA. The main elements of SPPA applied to plant ecology are briefly reviewed, and then some related difficulties are discussed. Furthermore, this paper gathers a wide range of literature on different aspects of SPPA, such as analysis types, test functions, null models, and marks commonly used in ecological studies, which readers can consult directly according to their research question. Here, I do not focus on the rigorous mathematical formulas of test functions; for these, the reader should refer to the detailed textbooks (e.g. Stoyan and Stoyan 1994; Diggle 2003, 2014; Illian et al. 2008; Wiegand and Moloney 2004, 2014).

A simple overview on summary statistics
A spatial pattern can be represented by a point pattern (e.g. trees) which consists of a set of mapped point locations in a study area (Wiegand and Moloney 2004, Fig. 1). In a point process, each single tree can be considered a point. Therefore, SPPA studies the spatial arrangement of points. The sample plot is commonly called the observation window (symbolized by W) which is usually a rectangular or circular area and is selected to offer representative data of the investigated community (Pommerening et al. 2011).
Started in the 1970s, point process statistics for spatial structures has become a mature discipline that is increasingly used by researchers (Stoyan et al. 2017). Spatial point-pattern analysis has a long history in plant ecology and is based on a large set of test statistics known as summary statistics (Perry et al. 2006). They aim to evaluate and describe the statistical properties and spatial structure of point patterns (Wiegand and Moloney 2014). Point patterns are often influenced by two types of effect (Goreaud 2000; Wiegand and Moloney 2004, 2014):
1. First-order effects: produce a variation in the intensity of the point pattern (i.e. the density, often symbolized by λ) in response to some causal variable (e.g. influence of soil properties on the presence of a plant). Thus, they are evaluated by first-order statistics based on the point-pattern intensity, which represents the average number of points per unit area. Since it varies with position, the local intensity function depends on position x (i.e. λ(x)).
2. Second-order effects: result from interactions between points (e.g. facilitation by adult plants towards recruits). Thus, they are assessed by SOSS, which are based on the spatial relationships between pairs of points. In contrast to first-order statistics, many SOSS are available.
Fig. 1 Conversion of field data (a) to a list of point coordinates (b) then to a point pattern (c) (Modified from Goreaud 2000). In the sampled area, each plant is represented by its Cartesian coordinates (x, y), which serve as a basis to represent stand spatial structure
First-order statistics vary depending on environmental factors (e.g. topographic variables) and are known to act over a large scale, while SOSS are used to assess biotic interactions that are detected at a small scale (Ziegler et al. 2017).
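To make the notion of intensity concrete, the following minimal sketch estimates the global intensity λ as the number of points divided by the plot area, and a crude local intensity λ(x) from quadrat counts. It assumes a rectangular observation window with its origin at (0, 0); the function names, the grid size, and the five tree coordinates are hypothetical and serve only as an illustration.

```python
def intensity(points, width, height):
    """Global intensity estimate: number of points per unit area of W."""
    return len(points) / (width * height)


def quadrat_intensity(points, width, height, nx, ny):
    """Local intensity per quadrat on an nx-by-ny grid laid over the window."""
    cell_area = (width / nx) * (height / ny)
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        # Clamp so that points lying exactly on the upper edge fall in the last cell.
        col = min(int(x / width * nx), nx - 1)
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return [[c / cell_area for c in row] for row in counts]


# Hypothetical 10 m x 10 m plot with five mapped trees.
trees = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.5), (8.0, 9.0), (9.5, 3.0)]
lam = intensity(trees, 10.0, 10.0)
lam_local = quadrat_intensity(trees, 10.0, 10.0, 2, 2)
```

Here three of the five trees fall in the same quadrat, so the local estimate in that cell is well above the global one; this kind of contrast is what a first-order (large-scale) effect looks like in the data.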
Second-order statistics are based on the overall small-scale point-to-point distances within a mapped area; they are key tools for identifying types and scales of spatial patterns (Wiegand and Moloney 2004). Accordingly, the corresponding spatial analysis methods are divided into two groups:
1. Nearest neighbour analysis (first-order and refined): analyses the intensity λ of a point pattern and its variation at a large scale (Wiegand and Moloney 2004). These methods consist of indices such as the Clark and Evans index and of refined nearest neighbour analysis such as the nearest neighbour distribution function D_k(r) and the spherical contact distribution H_s(r) (Wiegand and Moloney 2004; Illian et al. 2008; also called G(y) and F(y), respectively, by Diggle 2003). They calculate, for each point, a spatial structure index with respect to its n nearest neighbours (generally n = 4). They can be applied to small sample plots and over limited distances (Goreaud 2000; Perry et al. 2006).
2. Second-order statistics: functions which rely on a distance variable r and measure correlations between all pairs of points separated by a distance r (Gadow et al. 2012). They are key tools for identifying spatial pattern types, the critical scales below which significant interactions arise, and the distances at which these interactions are neutral, positive, or negative (Perry et al. 2006; Wiegand and Moloney 2004; Wiegand et al. 2007a). These methods are very expensive (in terms of effort and time), since they require completely mapped points within a large area (Goreaud 2000). Among these statistics, Ripley's K-function K(r) (Ripley 1977) or its modified version, the L-function (Besag 1977), and the pair-correlation function g(r) (Stoyan and Stoyan 1994) are by far the most used by ecologists (Velázquez et al. 2016).
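As an illustration of how the second-order statistics named above work, the sketch below computes naive estimates of K(r), of the transformation L(r) = sqrt(K(r)/π) − r (which equals 0 under complete spatial randomness), and of a ring-based g(r). It is a simplified teaching example, not the estimator implemented in any particular software: it applies no edge correction, the function names and the ring half-width are assumptions, and r is assumed to be larger than the half-width.

```python
import math


def ripley_k(points, area, r):
    """Naive estimate of K(r) without edge correction (illustration only)."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs += 1  # ordered pairs (i, j) no farther apart than r
    return area * pairs / (n * (n - 1))


def besag_l(points, area, r):
    """L(r) = sqrt(K(r) / pi) - r, which is 0 for a random pattern."""
    return math.sqrt(ripley_k(points, area, r) / math.pi) - r


def pair_correlation(points, area, r, half_width):
    """Ring-based g(r): increment of K over a ring divided by the ring area."""
    k_outer = ripley_k(points, area, r + half_width)
    k_inner = ripley_k(points, area, r - half_width)
    ring_area = math.pi * ((r + half_width) ** 2 - (r - half_width) ** 2)
    return (k_outer - k_inner) / ring_area


# Hypothetical clump of four trees in a 10 m x 10 m plot (area = 100 m^2).
clump = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0), (5.0, 5.0)]
k_1m = ripley_k(clump, 100.0, 1.0)
l_1m = besag_l(clump, 100.0, 1.0)
g_1m = pair_correlation(clump, 100.0, 1.0, 0.25)
```

For this clumped toy pattern, L(r) is positive and g(r) exceeds 1 at r = 1 m, which is the signature of aggregation. The sketch also shows why K(r) is cumulative (it counts every pair up to r), whereas g(r) only uses pairs inside the ring around r.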
Both method groups have particular advantages and weaknesses. The first group is conceptually simple but short-sighted, that is, it only quantifies the relationship between a tree and its n nearest neighbours and ignores what lies beyond these neighbours (Stoyan and Penttinen 2000; Pommerening et al. 2011). However, keeping these limitations in mind, first-order nearest neighbour statistics are useful and may be considered a first step in spatial analysis (Perry et al. 2006). Because of the shortcomings of this method, SOSS are preferred when mapped data are available from a large observation window (Pommerening and Stoyan 2006). In the present paper, I focus on SOSS, which provide detailed information about forest structure, rather than on first-order statistics. Readers who are interested in the first-order methods can find rich information in Pommerening (2008), Gadow et al. (2012), and Pommerening and Sánchez Meador (2018), as well as some interesting applications such as Zhang et al. (2018), Li et al. (2017), and Nguyen et al. (2018a). Recently, several studies have introduced interesting methods in SPPA (e.g. Wälder and Wälder 2008; Ledo et al. 2011; Stoyan et al. 2017; Ballani et al. 2019).
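The nearest-neighbour indices of the first group can be illustrated with the Clark and Evans index, taken here in its usual form: the mean observed nearest-neighbour distance divided by the value expected for a random pattern of the same intensity, 1/(2·sqrt(λ)). The sketch below is a minimal version without edge correction; the function name and the two toy patterns are hypothetical.

```python
import math


def clark_evans(points, area):
    """Clark and Evans index R (no edge correction).

    R < 1 suggests clustering, R close to 1 randomness, R > 1 regularity.
    """
    n = len(points)
    nn_sum = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its single nearest neighbour.
        nn_sum += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = nn_sum / n
    expected = 1.0 / (2.0 * math.sqrt(n / area))
    return observed / expected


# Hypothetical regular grid (2 m spacing) and a tight clump, both in 100 m^2.
grid = [(x, y) for x in (1.0, 3.0, 5.0, 7.0, 9.0) for y in (1.0, 3.0, 5.0, 7.0, 9.0)]
clump = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0), (5.0, 5.0)]
r_grid = clark_evans(grid, 100.0)
r_clump = clark_evans(clump, 100.0)
```

The index only looks at the first neighbour of each tree, which is exactly the short-sightedness discussed above: it returns one number per plot and says nothing about the scale at which the pattern departs from randomness.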

Second-order summary statistics
Unmarked spatial pattern analysis
When only point locations (x, y) are considered, the corresponding analysis is commonly designated as "unmarked analysis". There are different levels of analysis: univariate analysis takes into account only one type of pattern, which can be one species, one size (or age) class, one life stage, etc. In the bivariate mode, two patterns are investigated (two different species, two size classes such as adults vs. seedlings, two life stages such as understory vs. overstory, etc.), while multivariate analysis studies more than two patterns. Univariate analysis is by far the most performed by ecologists (Velázquez et al. 2016). Currently, several SOSS are available (see Wiegand et al. 2013; Wiegand and Moloney 2014). Table 1 focuses on the most used SOSS found in the literature and related applications. In this table, the choice of the test functions was based on the range provided by the Programita software (Wiegand and Moloney 2014), since it is the most used by ecologists (Velázquez et al. 2016, Table 2), is quite easy to use, and has detailed descriptive documentation (e.g. Wiegand 2014) which provides several application examples with their comprehensive interpretation. However, given that Programita is a specialized software designed specifically for SPPA, ecologists who are familiar with R (R Development Core Team 2019) can conduct different SPPA types using the "spatstat" package (Baddeley et al. 2015) and the wide range of other packages available in R. Other interesting software is also available (see Szmyt 2014, p. 23, Table 2).
Table 1 Second-order summary statistics commonly used in spatial point-pattern analysis. In the univariate analysis, only one pattern is involved (e.g. one species, one size or age class, one life stage, etc.), while in the bivariate version two patterns (1 and 2) are investigated (e.g. two different species, two size classes, two life stages, etc.). For the mark correlation analysis, the most studied mark is by far tree diameter. In the case of random labelling analysis, the marks usually consist of tree status (e.g. dead vs living). For all analysis types, positive, negative, or absence of departure from the null model simulation envelopes occurs at a given scale r

Ripley's L-function L(r) (Ripley 1977; Besag 1977)
  L(r) = 0: Points of the pattern are randomly distributed
  L(r) > 0: Points of the pattern are aggregated
  L(r) < 0: Points of the pattern are segregated
  Applications: Kenkel 1988; Ward et al. 1996; Haase et al. 1996; Haase et al. 1997; Pélissier 1998; Eccles et al. 1999; Chen and Bradshaw 1999; Mast and Veblen 1999; Grau 2000 [...]

[...]
  The marks of points are similar to the mean marks (of the study plot)
  The marks of points that have another point nearby tend to be larger than the mean marks, i.e. positive correlation or mutual stimulation
  The marks of points that have another point nearby tend to be smaller than the mean marks, i.e. negative correlation or mutual inhibition

[...]
  The marks of neighbouring points do not show any spatial correlation
  The marks of a focal point that has another neighbour are larger than the mean mark, i.e. positive effect of nearby points on the marks
  The marks of a focal point that has another neighbour are smaller than the mean mark, i.e. negative effect of nearby points on the marks
  Applications: Raventós et al. 2011; Fedriani et al. 2015

r-mark correlation function k_.m1(r) (Illian [...])
  The marks of points do not show any spatial correlation
  The marks of points are larger than the mean if they are near a focal point
  The marks of points are smaller than the mean if they are near a focal point
  Applications: Raventós et al. 2011; Ziegler et al. 2017

[...]
  The marks of two pattern points are not spatially correlated
  The two pattern points tend to have larger marks than the mean mark (positive correlation)
  The two pattern points tend to have smaller marks than the mean mark

[...]
  There is no effect of a pattern 2 point on the mark of the pattern 1 point
  The mean mark of focal points of pattern 1 that have a pattern 2 neighbour is larger than the plot mean mark (positive correlation)
  The mean mark of focal points of pattern 1 that have a pattern 2 neighbour is smaller than the plot mean mark (negative correlation)
  Applications: Ribeiro et al. 2021

Mark variogram γ_m1m2(r) (Illian et al. 2008)
  The distribution of point patterns 1 and 2 is independent of their marks
  The points of patterns 1 and 2 tend to have similar marks (positive correlation)
  The points of patterns 1 and 2 tend to have dissimilar marks (negative correlation)
  Applications: Erfanifard and Stereńczak 2017

Qualitatively univariate marked analysis

g_11(r) (Stoyan and Stoyan 1994)
  g_11(r) = 1: Points of pattern 1 are randomly distributed
  g_11(r) > 1: Points of pattern 1 are aggregated
  g_11(r) < 1: [...]

[...]
  Density of patterns 1 and 2 around pattern 1 is similar to that around pattern 2, i.e. absence of density-dependent effect
  Pattern 1 occurs preferably in areas with high density of patterns 1 and 2, i.e. negative density dependence (density-dependent mortality)
  Pattern 1 occurs preferably in areas with low density of patterns 1 and 2, i.e. positive density dependence (density-dependent survival)
  Applications: Raventós et al. 2010, 2011; Velázquez et al. 2014; Jácome-Flores et al. 2016; Szmyt and Tarasiuk 2018; Miao et al. 2018

g_12(r) − g_11(r) (Getzin et al. 2006)
  Pattern 1 is surrounded by pattern 2 in the same way as pattern 1 surrounds pattern 1, i.e. patterns 1 and 2 have similar spatial distributions
  Pattern 2 is more frequent around pattern 1 than pattern 1 around pattern 1, i.e. pattern 1 is negatively correlated. Pattern 2 shows additional aggregation that is independent of pattern 1
  g_12(r) − g_11(r) < 0: Pattern 1 [...]

Until the end of the twentieth century, the most used summary statistic was by far Ripley's K-function K(r) (Wiegand and Moloney 2004) and its modified version, the L-function L(r) (Besag 1977). However, due to the cumulative characteristic of K(r) (Wiegand and Moloney 2004), the pair-correlation function g(r) (Wiegand and Moloney 2004, Table 1) is becoming increasingly used. However, K- and g-functions are designed for homogeneous patterns (in terms of intensity) (Pélissier and Goreaud 2001), and environmental heterogeneity produces a spatial variation in intensity, resulting in a systematic bias in the calculated functions (Schiffers et al. 2008) and causing a stronger positive autocorrelation than occurs in reality, called "virtual aggregation" (Wiegand and Moloney 2004). To deal with this problem, Schiffers et al.
(2008) derived a new test statistic, termed the K2-function, as an extension of existing summary statistics, but it is very little used (Table 1). Besides, K- and g-functions remain key tools in analysing point patterns and are the most used in either uni- or bivariate analysis (Velázquez et al. 2016; Table 1). Furthermore, using several test functions simultaneously reduces the risk of failing to detect an interaction between points and hence helps in understanding underlying processes (Raventós et al. 2010). Many other summary statistics exist (Wiegand and Moloney 2014), but I could not find their application in ecology (e.g. the proportion E(r) of points with no neighbour at distance r, the mean distance nn(k) to the kth neighbours). Refined nearest neighbour analyses (i.e. D_k(r) and its related function H_s(r); Diggle 2003; Illian et al. 2008) are less used by ecologists (Velázquez et al. 2016, Table 1). Besides, the reader is referred to Barot et al. (1999), [...].
Marked spatial pattern analysis
Marked analysis (MA hereafter) is noticeably advantageous in clarifying the interactions among tree species and the association between tree size variation and spatial scale (Hui et al. 2019). It is divided into two types: (1) quantitative marked analysis (QNA hereafter), which involves quantitative properties of plants (e.g. height, DBH, etc.), and (2) qualitative marked analysis (QLA hereafter), which involves categorical characteristics such as species identity or status (living vs. dead). Table S1 (see Additional file 1) shows the marks most used in the literature. Marked analysis is an important tool that aids in investigating distance- and density-dependent effects on trees (see below). Besides, in their evaluation of the state of SPPA in ecology, Velázquez et al. (2016) found that MA is rarely used by ecologists compared to unmarked analysis (see also Table 1). Furthermore, the authors found that QLA is more used than QNA.
In QNA, the mark correlation function, commonly symbolized by k_mm(r), is by far the most used by ecologists (Table 1) and the DBH is the most used mark (Table S1), while partial pair-correlation functions g_ij(r) (Stoyan and Stoyan 1994) are commonly used in QLA. For each analysis type, there is a set of statistical functions which allow testing different hypotheses (see Wiegand and Moloney 2014; Table 1). In the literature, QNA is commonly designated as mark correlation analysis since it is usually carried out using k_mm(r). Similarly, QLA is commonly referred to as random labelling analysis, given that this is the most frequently used null model in this analysis type. Like the unmarked analysis, MA can be applied to uni-, bi-, or multivariate data.
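The idea behind the mark correlation function can be illustrated with a naive estimator: for all pairs of points whose distance falls in a ring around r, the mean product of their marks is divided by the squared mean mark of the plot, so that a value of 1 indicates no correlation. This is a simplified sketch under the assumption of that usual normalisation; it uses no edge correction or kernel smoothing, and the function name, the ring half-width, and the DBH values are hypothetical.

```python
import math


def mark_correlation(points, marks, r, half_width):
    """Naive k_mm(r): mean mark product of pairs at distance ~r over mean_mark**2."""
    mean_mark = sum(marks) / len(marks)
    total, pairs = 0.0, 0
    for i in range(len(points)):
        for j in range(len(points)):
            if i == j:
                continue
            d = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if abs(d - r) <= half_width:
                total += marks[i] * marks[j]
                pairs += 1
    if pairs == 0:
        return None  # no pair of points at this distance
    return (total / pairs) / mean_mark ** 2


# Hypothetical stand: two close neighbours with small DBH, two isolated large trees.
trees = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
dbh_cm = [10.0, 10.0, 30.0, 30.0]
kmm_1m = mark_correlation(trees, dbh_cm, 1.0, 0.5)
```

In this toy stand the only trees that are about 1 m apart are the two small ones, so the estimate falls below 1: trees with a close neighbour tend to be smaller than the plot mean, which is the pattern usually read as mutual inhibition.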

Edge effect
The edge effect represents a frequent problem in SPPA (Wiegand and Moloney 2004). The issue is that neighbourhood interactions are not correctly considered at the sample plot boundary when potential neighbours lie outside the plot (Pommerening 2002

Null models
The selection of the suitable null model for the scientific question at hand is a crucial step for an appropriate interpretation of the results (Wiegand and Moloney 2004; Carrer et al. 2018). There is a set of null models which differ from one analysis to another (uni- or bivariate, marked or not) and allow testing different hypotheses (Table S2, Additional file 1). In the unmarked univariate analysis, the complete spatial randomness (CSR) null model is by far the most used by ecologists (Velázquez et al. 2016); it can be described as a homogeneous Poisson process (a reference distribution) which assumes that the pattern intensity λ is constant over the study area (Wiegand and Moloney 2004, 2014). If the pattern is not homogeneous, the dependence revealed by K(r) may be caused more by first-order heterogeneity than by interaction between points (Wiegand and Moloney 2004). In this case, the heterogeneous Poisson process (HP) null model is more appropriate, since it takes into account the effects of environmental heterogeneity and allows disentangling second-order effects and thus capturing small-scale spatial structures (Wiegand and Moloney 2004; Carrer et al. 2018). For the bivariate case, the most commonly used null models are (1) the independence null model (Goreaud and Pélissier 2003; also called random superposition by Illian et al. 2008), which assumes that the two point patterns were created by two independent processes (Wiegand and Moloney 2004) and is largely used to investigate the relationship between two different species, size classes, or life stages; and (2) the antecedent condition (Wiegand and Moloney 2004), which is usually used in the case of adult-young relationships, where the locations of pattern 1 (i.e. adults) are kept fixed while only pattern 2 (i.e. young) is randomized following a specific univariate null model (Wiegand 2004).
The toroidal shift null model (Dale 1999) can also be used to test the independence of the two point patterns; it is applied by keeping pattern 1 unchanged and shifting the whole of pattern 2, treating the study region (which has to be rectangular) as a torus (Wiegand 2004). If other pattern properties such as qualitative marks (also called labels) are integrated into the analysis, random labelling (Goreaud and Pélissier 2003; Tables 1 and S2) is the suitable null model (Wiegand and Moloney 2014). It assumes that the pattern points are generated by one process and that a subsequent process then created the marks, which are randomly assigned to the points; thus, the null model focuses on the process responsible for assigning labels to points (Wiegand and Moloney 2004, 2014). It is mostly used for assessing density-dependent mortality. A wide variety of qualitative marks are used, particularly tree status (living vs dead trees, see Table S2). Several test functions allow testing the random labelling hypothesis and can be used for either uni- or bivariate analysis (Wiegand and Moloney 2014; Table 1). When considering quantitative marks, independent marking is the suitable null model. Like random labelling, it keeps the locations of points fixed and randomly shuffles the marks among all points; this eliminates potential spatial structure in the studied marks (Wiegand and Moloney 2014). Similarly, this null model can be used in both uni- and bivariate cases for different marks (Wiegand and Moloney 2014; Table S2), of which plant size remains the most used. In the literature, QNA is usually designated as mark correlation analysis due to the frequent use of the mark correlation function k_mm(r), although several test functions are available (Wiegand and Moloney 2014; Table 1). Velázquez et al.
(2016) pointed out that many studies used random labelling correctly for QLA, but some studies confused the null models of independence and random labelling, or even random labelling and independent marking. More details on the choice of suitable null models can be found in Wiegand and Moloney (2004, p. 226-227). It is important to note that other null models can be used in advanced analyses to test more complex hypotheses (see Wiegand and Moloney 2004, 2014), such as the Poisson cluster process (Diggle 1983) and the hard-core process (Illian et al. 2008).
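The null models described in this section differ mainly in what they randomize, which a few lines of code can make explicit. The sketch below gives one possible realization of each: CSR and a heterogeneous Poisson pattern (both simulated with the number of points held fixed, which is an assumption of this example), a toroidal shift of a whole pattern, and the shuffling of marks that underlies both random labelling and independent marking. All function names, the intensity gradient, and the example data are hypothetical.

```python
import random


def simulate_csr(n, width, height, rng):
    """CSR with n fixed: n uniform points in a width x height window."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


def simulate_heterogeneous(n, width, height, intensity, max_intensity, rng):
    """Heterogeneous Poisson pattern with n fixed, by rejection sampling."""
    points = []
    while len(points) < n:
        x, y = rng.uniform(0, width), rng.uniform(0, height)
        # Keep the candidate with probability intensity(x, y) / max_intensity.
        if rng.random() * max_intensity < intensity(x, y):
            points.append((x, y))
    return points


def toroidal_shift(points, width, height, rng):
    """Shift a whole pattern by one random vector, wrapping around the edges."""
    dx, dy = rng.uniform(0, width), rng.uniform(0, height)
    return [((x + dx) % width, (y + dy) % height) for x, y in points]


def shuffle_marks(marks, rng):
    """Random labelling / independent marking: permute marks over fixed points."""
    shuffled = list(marks)
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(1)  # fixed seed so that the sketch is repeatable
csr = simulate_csr(100, 10.0, 10.0, rng)
# Hypothetical west-to-east gradient: intensity proportional to x.
gradient = simulate_heterogeneous(500, 10.0, 10.0, lambda x, y: x / 10.0, 1.0, rng)
saplings = [(1.0, 1.0), (9.0, 9.0)]
shifted = toroidal_shift(saplings, 10.0, 10.0, rng)
status = ["dead", "alive", "alive", "dead", "alive"]
relabelled = shuffle_marks(status, rng)
```

The toroidal shift moves every point of pattern 2 by the same vector, so the internal structure of that pattern is preserved while its position relative to pattern 1 is randomized; shuffling marks, in contrast, leaves all locations untouched and randomizes only which point carries which label or size.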
For all analysis types, the empirical curves of the statistical functions are compared to the Monte Carlo envelopes generated by multiple simulations of the null model (Wiegand and Moloney 2004). There is a departure from the null model when the empirical curves fall outside the simulation envelopes. In order to test for the significance of this departure, a goodness-of-fit (GoF) test is usually used (Loosmore and Ford 2006). The simultaneous use of several test statistics reduces the risk of failing to detect a departure from the null model and thus improves the understanding of underlying processes (Raventós et al. 2010; Wiegand et al. 2013).
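The comparison with simulation envelopes and the GoF test can be sketched as follows. The envelope at each distance r is taken as the k-th lowest and k-th highest simulated value, and the GoF p-value is computed in the spirit of Loosmore and Ford (2006): each curve is summarized by its summed squared deviation from the mean of all the other curves, and the observed summary is ranked among the simulated ones. This is a simplified illustration rather than a reproduction of any software implementation; the function names and the small hand-made curves are hypothetical, and a real analysis would use many more simulations.

```python
def envelopes(sim_curves, k=1):
    """Pointwise envelopes: k-th lowest and k-th highest simulated value per r."""
    lower, upper = [], []
    for values in zip(*sim_curves):
        ordered = sorted(values)
        lower.append(ordered[k - 1])
        upper.append(ordered[-k])
    return lower, upper


def gof_p_value(observed, sim_curves):
    """Rank-based GoF p-value over the whole range of distances."""
    curves = [observed] + list(sim_curves)
    m = len(curves)
    totals = [sum(column) for column in zip(*curves)]
    u = []
    for curve in curves:
        # Squared deviation from the mean of all other curves, summed over r.
        u.append(sum((c - (t - c) / (m - 1)) ** 2 for c, t in zip(curve, totals)))
    n_as_extreme = sum(1 for ui in u[1:] if ui >= u[0])
    return (n_as_extreme + 1) / m


# Hypothetical g(r) curves at three distances: four simulations and one observation.
sims = [
    [1.0, 1.0, 1.0],
    [0.9, 1.1, 1.0],
    [1.1, 0.9, 1.0],
    [1.0, 1.05, 0.95],
]
obs = [2.0, 1.5, 1.0]
lower, upper = envelopes(sims)
p_value = gof_p_value(obs, sims)
```

With only four simulations the smallest attainable p-value is 1/5, which shows why the number of simulations bounds the significance level that can be reached; the envelope, by contrast, shows at which distances the departure occurs.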

Description of SPPA results and their interpretation
Preliminary considerations about nomenclature
Different spatial patterns can be distinguished (Fig. 2). In the literature, there are multiple synonyms for each spatial pattern, with a certain distinction between those used in uni- and bivariate analyses. In the univariate case, aggregation, clumping, or clustering usually denotes positive interaction between points. Aggregation means that the pattern points are on average closer together than expected under the null model (Wiegand and Moloney 2014). In contrast, segregation, uniformity, regularity, or hyperdispersion commonly describes negative interaction. Segregation means that the points of the pattern are on average further apart than expected (Wiegand and Moloney 2014). Finally, the random pattern reflects the absence of interaction between points. In the case of bivariate analysis, which involves two different point patterns, attraction (positive interaction), repulsion (negative interaction), or independence (absence of interaction) is commonly used. Similarly, attraction refers to a tendency for points of two different patterns to be closer than expected under a null hypothesis, whereas repulsion denotes a tendency for points to be farther apart than expected (Peterson and Squiers 1995). In fact, there is no sharp boundary between the terminologies used for uni- and bivariate analyses, but some terms are more appropriate in some cases than in others. For example, segregation is usually used for designating spatial separation between points of the same pattern, instead of repulsion, which is more suitable for two different patterns. However, other criteria are sometimes considered; for example, Szmyt and Tarasiuk (2018) distinguished between the terms segregation and repulsion based on scale: the first term is used to describe interaction at a small scale, while the latter is defined for large scales. Additionally, some authors used unclear terminologies, such as Barot et al.
(1999) who used "spatial association" to describe positive interaction between two pattern types. Zhang et al. (2013) used the terms aggregation, regularity, and randomness for marked analysis, which do not reflect the spatial correlation of tree sizes. Suzuki et al. (2005) used the expression "complete spatial randomness" as a synonym of "complete spatial independence" in the univariate analysis. Li et al. (2020b) also used similar terms (i.e. aggregation, regularity, and randomness) in both uni- and bivariate analysis. Besides, in the literature, the spatial pattern of points is commonly designated as spatial structure, spatial distribution, spatial association, or interaction. However, Suzuki et al. (2005) used the expressions "spatial pattern" and "spatial association" for the uni- and bivariate analysis, respectively.

Description of SPPA results
Generally, the description of the resulting curves is relatively easy. For instance, if the g(r) function values fall below, above, or within the confidence envelopes, the pattern is designated as regular, aggregated, or random, respectively (Table 1). The description is similar for L(r) and the O-ring statistic O(r). For k_mm(r), if the curve falls below, above, or within the confidence envelopes, there is inhibition, stimulation, or absence of correlation between point marks, respectively. However, many differences exist from one function to another (see Table 1).
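The reading rule described here amounts to a simple comparison at each distance r, as in the following minimal sketch. The function name, the default labels, and the numerical values are hypothetical; the envelopes are assumed to have been obtained beforehand from simulations of the chosen null model.

```python
def describe_curve(observed, lower, upper,
                   labels=("regular", "random", "aggregated")):
    """Label each distance by the position of the curve relative to the envelopes."""
    below, inside, above = labels
    description = []
    for value, low, high in zip(observed, lower, upper):
        if value < low:
            description.append(below)
        elif value > high:
            description.append(above)
        else:
            description.append(inside)
    return description


# Hypothetical g(r) values and envelopes at three distances.
g_obs = [2.0, 1.0, 0.5]
g_low = [0.8, 0.8, 0.8]
g_high = [1.2, 1.2, 1.2]
g_pattern = describe_curve(g_obs, g_low, g_high)

# The same rule with the vocabulary used for the mark correlation function.
kmm_pattern = describe_curve(
    [0.6, 1.0], [0.8, 0.8], [1.2, 1.2],
    labels=("inhibition", "no correlation", "stimulation"),
)
```

The labels are passed as a parameter because, as noted above, the wording of the description changes from one function to another while the comparison itself stays the same.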

Unmarked analysis
Although the interpretation of the results may be a delicate step (Wiegand and Moloney 2004), the observed pattern may reflect the underlying processes: a negative association would indicate competition, while a positive association would be related to facilitation. Numerous studies explained positive spatial patterns by various mechanisms, particularly seed dispersal characteristics (De Luis et al. 2008; Martínez et al. 2010; Lan et al. 2012; Liu et al. 2014; Nguyen et al. 2018a) and/or shade tolerance (Hein et al. 2009; Wang et al. 2010b; Petritan et al. 2015; Erfanifard and Stereńczak 2017; Table S3, Additional file 1). Small-scale facilitation between species occurring at the same microsites can be due to their similar growth requirements (Martínez et al. 2010; Ledo et al. 2011), while Jia et al. (2016) found an opposite trend, as did Li et al. (2020a), who found that differences in species morphology and life characteristics reduce their direct competition. Many studies found that aggregation decreases and even disappears with increasing distance, which indicates a dispersal limitation effect (Nguyen et al. 2016). Seed dispersal mode directly influences the spatial distribution of tree recruitment and the spatial relationships with conspecifics (Seidler and Plotkin 2006). On the other hand, negative association between trees can be due to competitive effects. Intra- or interspecific competition starts immediately after the stand initiation stage (Yılmaz et al. 2019). Different growth rhythms of species may lead to interspecific repulsion (Comas et al. 2009). Ledo et al. (2011) reported that repulsion should be encountered between the young plants of two co-occurring species if they have dissimilar modes of seed dispersal, while Yang et al. (2018) pointed out that repulsion may occur between species which have similar strategies for resource use. Moreover, segregation may be due to the occurrence of seed production in different years, and recruits may establish where seeds fall on favourable seedbeds (Fajardo et al. 2006). As a result, suitable sites for recruitment in a given year are mostly taken by only one species (Comas et al. 2009).
[Figure caption fragment] Randomness, aggregation, and regularity are commonly reserved for univariate analysis involving one pattern type, while independence, attraction, and repulsion are confined to the bivariate mode
In fact, facilitative and competitive effects can succeed each other during plant life; that is, there is no perpetual facilitation or competition at either the intra- or interspecific level. Species spatial patterns are related to their biological attributes, and intraspecific associations at small scales change over life stages (Kang et al. 2014). As the forest grows, competition increases and leads to the death of weak members and a slight decrease in aggregation intensity with increasing life stage or tree size, resulting in a random or even uniform spatial distribution under the self-thinning process (Kenkel 1988; Kang et al. 2014; Nguyen et al. 2016). Accordingly, several studies showed a shift in species spatial pattern over life stages: from aggregation to regularity (Pseudotsuga menziesii var.  Table S3). Nevertheless, failure to detect a change towards regularity may merely reflect a weak competitive effect which does not lead to an important mortality rate but rather to a decline in growth (Gray and He 2009). In the case of plantations, the initial regular distribution pattern, which reflects the initial spacing between planted trees rather than the interaction between trees (Szmyt 2014; Li et al. 2020b), could persist after decades due to competition or shift towards aggregation as a result of natural regeneration. Ledo et al. (2014) reported that facilitation is the key process that dominates in the early life stage, whereas competition becomes more important in the later stages. Lan et al. (2012) found similar results and concluded that the intensity of interaction is a function of species, life stage, and inter-tree distance. In addition, environmental conditions control species distribution. According to the stress-gradient hypothesis (Maestre et al. 2009), facilitation dominates under high-stress conditions (abiotic or biotic), while competition is expected to be more intense in low-stress conditions (see Velázquez et al. 2014, Zheng et al. 2017, Bowman and Swatling-Holcomb 2017). Thus, interspecific competition and facilitation are incontestably powerful factors in shaping species spatial patterns (Dale 1999). Many studies which simultaneously investigated a set of species found mostly positive interactions. For example, among the 18 species studied by Nguyen et al. (2016), 16 showed aggregation at different scales, irrespective of their abundance. Du et al. (2017) also found that, among the 146 species they studied, 145 showed aggregation. The dominance of positive associations was also reported by Lan et al. (2012) among the 30 species they analysed. Motta and Lingua (2005) highlighted that aggregation is by far the natural situation, while a random or regular structure is related to earlier forest use such as livestock grazing and plantation. Tree spatial patterns were found to be strongly affected by thinning modes and harvesting intensity in planted forests. Wang et al. (2020a) found that Stipa grandis individuals were overdispersed in the ungrazed community, while they were clustered in the grazed community. Li et al. (2020b) found that planted tree species had a regular spatial pattern, while non-planted trees (i.e. natural regeneration species) experienced significant intraspecific aggregation. Baran et al. (2020) found that trees tend to be aggregated in unmanaged forests, while they showed random patterns in managed forests. However, independent patterns were found to be strongly dominant in interspecific interactions in many subtropical and tropical natural forests (Nguyen et al. 2018a; Li et al. 2020b), supporting the unified neutral theory (Hubbell 2006).
Ecosystems are characterized by spatio-temporal heterogeneity (Saunders et al. 2005). Environmental heterogeneity was found to play an important role in species spatial aggregation (Getzin et al. 2006;Nguyen et al. 2016;Du et al. 2017). Indeed, the same species can show different spatial patterns in different stands. Moreover, many studies revealed that spatial patterns are also driven by species-specific traits (Du et al. 2017). It is important to note that there are sometimes complex structures which comprise a mixed pattern, i.e. presence of regularity at a given distance r and aggregation at other distances (Goreaud 2000).
Although most of the studies examined by Velázquez et al. (2016) carried out univariate analysis, bivariate (or multivariate) analysis is more important, since in natural ecosystems several species coexist and share common resources. Hence, explaining the coexistence of co-occurring species is one of the major challenges of plant ecology, and the information offered by SPPA can help to understand species coexistence mechanisms (Jia et al. 2016; Wiegand et al. 2021). Though this question remains insufficiently studied, several co-occurring species have been studied and different mechanisms were proposed to explain their coexistence (Mori and Takeda 2004; De Luis et al. 2008; Hein et al. 2009; Raventós et al. 2010; Wang et al. 2010a, 2010b; Raventós et al. 2011; Nanami et al. 2011; Iszkuło et al. 2012; Lan et al. 2012; Liu et al. 2014; Ledo et al. 2014; Petritan et al. 2015; Erfanifard and Stereńczak 2017; Szmyt and Tarasiuk 2018; Li et al. 2020a; Table S3), as well as for congeneric species (Acer sp.: Zhang et al. 2010; Symplocos sp.: Yang et al. 2018; Quercus sp.: Collet et al. 2017; Yuan et al. 2018; Myrcia sp.: Ribeiro et al. 2021). Many theories were proposed to explain species coexistence; interesting information can also be found in Wilson (2011).

Marked analysis
Quantitatively marked analysis: distance-dependent effect
Facilitative or competitive effects not only result in species spatial aggregation or segregation but also influence plant growth (e.g. height, diameter, etc.). Thus, MA is an important tool for investigating distance-dependent effects on tree growth (Fedriani et al. 2015). In contrast to the unmarked analysis, the terms stimulation, inhibition, and independence are used in the case of positive, negative, or no correlation between tree marks, respectively. Indeed, positive correlation occurs at a distance r if the marks tend to have similar magnitudes; for example, larger (or smaller) trees are associated with each other. Negative correlation arises when neighbouring trees show some form of inhibition, that is, when a tree found close to a large tree tends to be small and vice versa (Wiegand and Moloney 2014; Pommerening and Särkkä 2013). In other words, if the closer trees are smaller than the mark average in the plot, there is an inhibition due to competition, while there is stimulation due to a facilitation effect if the neighbouring trees are larger than the mark average (Wiegand and Moloney 2014). Lastly, the absence of correlation between tree marks can be observed in the case of independence.
Two types of inter-tree competition effect can be distinguished. When small and large trees (i.e. suppressed and dominant trees) are clumped together, with or without a higher mortality of small trees compared to large ones, there is asymmetric or one-sided contest competition (Kenkel 1988; Raventós et al. 2010; Nanami et al. 2011; Fig. 3). Conversely, if small trees tend to be associated with small trees (i.e. clumps of suppressed trees), there is symmetric or two-sided scramble competition (Kenkel 1988; Raventós et al. 2010; Fig. 3). According to Goreaud (2000), symmetric competition occurs between two trees i and j when the influence of i on j is similar to that of j on i, while there is asymmetric competition when the influence of i on j differs largely from that of j on i.
In addition to natural processes, negative autocorrelation can result from anthropogenic disturbances (e.g. thinning, Pommerening and Särkkä 2013). In this case, the mark variogram is considered the suitable test function in QNA (Pommerening and Särkkä 2013, Table 1).
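To make the two mark-based test functions more concrete, the following Python sketch computes naive estimates of the mark correlation function k_mm(r) and of the mark variogram for one distance class. It is a simplified illustration and not the estimator of any particular software: no edge correction is applied, a simple distance band of half-width h replaces a smoothing kernel, the variogram is left unnormalized, and the function name and toy data are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Tree = Tuple[float, float, float]  # (x, y, mark), e.g. mark = DBH in cm


def mark_statistics(trees: List[Tree], r: float, h: float) -> Optional[Tuple[float, float]]:
    """Return (k_mm, mark variogram) for pairs whose distance lies in [r - h, r + h].

    Naive estimators without edge correction; returns None if no pair qualifies.
    """
    mean_mark = sum(m for _, _, m in trees) / len(trees)
    products, half_sq_diffs = [], []
    for i in range(len(trees)):
        xi, yi, mi = trees[i]
        for j in range(i + 1, len(trees)):
            xj, yj, mj = trees[j]
            d = math.hypot(xi - xj, yi - yj)
            if abs(d - r) <= h:
                products.append(mi * mj)                    # test function m_i * m_j
                half_sq_diffs.append(0.5 * (mi - mj) ** 2)  # test function (m_i - m_j)^2 / 2
    if not products:
        return None
    k_mm = (sum(products) / len(products)) / mean_mark ** 2  # normalized by the squared mean mark
    variogram = sum(half_sq_diffs) / len(half_sq_diffs)
    return k_mm, variogram


# Toy stand (made-up values): two close pairs, each with one large and one small tree.
toy_trees = [(0.0, 0.0, 40.0), (1.0, 0.0, 10.0), (10.0, 0.0, 40.0), (11.0, 0.0, 10.0)]
k_mm_1m, variogram_1m = mark_statistics(toy_trees, r=1.0, h=0.5)
print(round(k_mm_1m, 2), variogram_1m)  # prints 0.64 450.0
```

In this toy stand, k_mm is below 1 at r = 1 m because each large tree has a small neighbour, which is the kind of negative mark correlation discussed above. Whether such a value is significant would still have to be judged against simulation envelopes.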

Qualitatively marked analysis: density-dependent effect
When plant size and/or density increase, plant requirements (e.g. nutrients, light) increase as well, and competition becomes intense. As a result, the risk of dying is expected to increase, and the resulting death leads to a temporary reduction in tree density and competition intensity (Pommerening and Sánchez Meador 2018), which stimulates growth and again allows an increase in size (Fig. 4). This process is commonly known as density-dependent mortality or the self-thinning mechanism (Kenkel 1988). In vegetation communities, the self-thinning process is widely observed in large size classes but can start in the seedling establishment stage as well (Moeur 1997). Indeed, conspecific competition may be responsible for plant death in the recruitment stage (Getzin et al. 2006).
Fig. 3 Illustrations of the correlation between sizes (here DBH) of two neighbouring trees separated by a distance r. In the case of two-sided scramble competition (a), the inter-tree competition is equal and the sizes of both neighbouring trees are reduced: there is mutual inhibition. In the case of one-sided contest competition (b), one of the two trees is dominant and the other is suppressed due to asymmetric competitive abilities. In the absence of negative association, the neighbouring trees benefit from being close to each other and exercise facilitation (c), promoting mutual stimulation of their growth
Several studies reported the importance of self-thinning in regulating species communities. It is usually assessed by comparing the spatial patterns of pre-mortality (i.e. living and dead individuals) and post-mortality (i.e. living individuals) stages (He and Duncan 2000; Getzin et al. 2006; Omelko et al. 2018; Miao et al. 2018) using random labelling analysis. Negative intra- or interspecific density dependence arises when higher mortality (or lower survival) of trees is observed in denser patches of surviving conspecific or heterospecific trees, respectively (He and Duncan 2000). The self-thinning process can also be assessed by analysing spatial pattern changes over a relatively long time interval (e.g. Lutz et al. 2014). Moreover, it can be evaluated by comparing size (or age) classes over life stages, that is, between young and adult spatial patterns. Indeed, recruits are frequently found to be aggregated at a small scale, and a decrease in the degree of clustering can be observed with increasing plant size (from the young to the adult stage), indicating a self-thinning process (Wiegand et al. 2007b) and usually resulting in a regular distribution (Yao et al. 2016; Wang et al. 2017; Lv et al. 2019). However, failure to detect self-thinning may reflect very weak competition (Nguyen et al. 2016).
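The logic of a random labelling test for density-dependent mortality can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited studies: tree positions are kept fixed, the dead/alive labels are shuffled, and the summary statistic (mean number of neighbours within r around dead trees in the pre-mortality pattern) is a simple stand-in chosen for brevity. The function names and the toy data are hypothetical, and no edge correction is applied.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


def neighbour_counts(points: List[Point], r: float) -> List[int]:
    """Number of other points within distance r of each point (no edge correction)."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1]) <= r:
                counts[i] += 1
                counts[j] += 1
    return counts


def random_labelling_test(points: List[Point], dead: List[bool], r: float,
                          n_sim: int = 999, seed: int = 0) -> Tuple[float, float]:
    """One-sided Monte Carlo test: do dead trees sit in denser neighbourhoods?

    Returns the observed statistic and the Monte Carlo p-value.
    """
    rng = random.Random(seed)
    counts = neighbour_counts(points, r)  # fixed, because positions never change
    n_dead = sum(dead)

    def mean_dead(labels: List[bool]) -> float:
        return sum(c for c, d in zip(counts, labels) if d) / n_dead

    observed = mean_dead(dead)
    labels = list(dead)
    at_least = 0
    for _ in range(n_sim):
        rng.shuffle(labels)  # random labelling: permute dead/alive over fixed positions
        if mean_dead(labels) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_sim + 1)


# Toy plot (made-up coordinates): a clump of five trees, four of which died,
# plus seven isolated survivors.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5),
          (10, 0), (20, 0), (30, 0), (10, 10), (20, 10), (30, 10), (20, 20)]
dead = [True, True, True, True] + [False] * 8
observed, p_value = random_labelling_test(points, dead, r=2.0)
print(observed, p_value)
```

A small p-value indicates that dead trees had more neighbours than expected if mortality were independent of position, which is the signature of negative density dependence described above.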

Main interactive drivers of plant dynamic
Species spatial arrangement reflects the dynamic of plant communities (Fig. 5). According to Goreaud (2000), this dynamic is driven by the interaction between three main components:
1) Natural processes, such as seed dispersal, recruitment, growth, and mortality, are closely linked to the two other factors which follow (local environment of plants and human intervention). Considering natural processes alone, positive or negative interactions may be observed (see above).
2) Local environment of plants: here we distinguish between biotic and abiotic components. The biotic environment includes all types of intra- and interspecific relationships between plants. Facilitative and competitive effects are relevant examples. For instance, larger "nurse plants" may exert an important facilitative effect on seedlings, leading to a positive spatial association (Fajardo et al. 2006). Indeed, recruitment success and early survival seem to be improved by tree canopy shade at drier microsites, which provides protection against site stresses (Fajardo et al. 2006; LeMay et al. 2009). Besides, plant-herbivore (García-Cervigón et al. 2017) and plant-pest (Bassil et al. 2018) interactions have an important effect on species spatial patterns. Abiotic components such as soil conditions (Fajardo et al. 2006; Zheng et al. 2017; Das Gupta and Pinno 2018) and topography (Zhao et al. 2015) are also important in determining species spatial structure. Nevertheless, biotic and abiotic effects are synergistic, and the effect of one could be amplified or attenuated by the other.
3) Human intervention, such as thinning, likely leads to negative autocorrelation resulting in the association of small and large trees (Pommerening and Särkkä 2013). Gradel et al. (2017) studied the effect of thinning on tree growth and stand structure and found that the tree spatial pattern was mainly aggregated before thinning and became regular afterwards. They pointed out that before thinning negative interaction had a strong effect on tree growth, while a significant reduction in competition was detected after thinning, which promoted an important increase in species growth. Several studies also found a significant effect of anthropogenic land-use changes (Motta and Edouard 2005) and management practices (e.g. Motta and Lingua 2005; Bilek et al. 2011; Navarro-Cerrillo et al. 2013; Ghalandarayeshi et al. 2017), since the distribution of gaps formed after silvicultural management will influence species recruitment, diversity, and survival (Ghalandarayeshi et al. 2017). Here, the gap dynamic should be taken into account when interpreting management effects, particularly for shade-intolerant species. For example, Getzin et al. (2006) observed that Douglas fir regeneration depended on the presence of gaps within conspecifics, which reduces the self-thinning mechanism.
Briefly, a suitable interpretation of observed patterns should (1) take into account species biology and ecology, (2) consider environmental conditions, and (3) integrate the maximum information available on the disturbance or land-use history of the study site.
Fig. 4 The relationships between growth, requirements, competition, and death in plant communities. When neighbouring plants grow, their requirements (e.g. light, water, nutrients) increase, which leads to an increase in their competitive interaction and a rise in mortality rate. The space freed up by individual mortality contributes to a decrease in competition intensity and to the stimulation of plant growth and/or density
Fig. 5 The spatial structure of plant communities reflects its dynamic, which is driven by the synergistic effect of local environment, natural processes, and human operations. Spatial plant distribution reflects the community dynamic, which is continuously modified by the interaction between natural processes and abiotic and biotic factors as well as anthropogenic land uses

Sample plot size
In forest studies, the problem of spatial scale is linked to the sampling strategy, especially the size and number of sample plots (Carrer et al. 2018). In the literature, it is usually recommended to use a large plot size when performing SOSS, but there is a lack of consensus about the minimum sample size required. Large plots are often advised in order to minimize edge effects (Pommerening and Stoyan 2006; Wiegand et al. 2013), which are relatively greater in small plots (Wiegand and Moloney 2014). In a study evaluating the effect of plot size and sampling design in SPPA, Carrer et al. (2018) pointed out that the negative effects of small plots cannot be entirely avoided despite the efficiency of the edge corrections included in recent software (Velázquez et al. 2016). However, many studies were carried out in relatively small plots; for instance, Szmyt ( while the use of one small plot cannot allow detecting spatial patterns. The authors found that the accuracy of small plots (0.25 ha) was low and that they were less consistent with the reference plot (4 ha). Thus, to analyse large-scale spatial patterns, they suggested the use of plots larger than 1 ha in high-diversity forests. Nevertheless, in the absence of large plots, it is recommended to combine several small plots, whose accuracy is improved by using the HP null model instead of CSR in order to account for environmental heterogeneity and capture small-scale spatial structure (Carrer et al. 2018). Hence, small plots are widely used in replicated analyses (Riginos et al. 2005; De Luis et al. 2008; Comas et al. 2009; Raventós et al. 2011; Petritan et al. 2014, 2015; Erfanifard and Stereńczak 2017; Ziegler et al. 2017; Erfanifard et al. 2019; Ben-Said et al. 2020; Wang et al. 2020a), where many small plots are sampled over a large study area and then combined into one average function.
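The idea of combining replicate plots into one average function can be illustrated with a short Python sketch. Weighting each plot by its number of points is an assumption made here for illustration, since other weightings are possible and the cited studies may have used different ones. The function name and the numeric values are hypothetical.

```python
from typing import List, Sequence


def combine_replicates(curves: Sequence[Sequence[float]],
                       n_points: Sequence[int]) -> List[float]:
    """Weighted mean of per-plot summary functions, evaluated on a common r grid.

    Assumption for this sketch: each plot is weighted by its number of points.
    """
    if len(curves) != len(n_points):
        raise ValueError("one point count is needed per curve")
    n_bins = len(curves[0])
    if any(len(c) != n_bins for c in curves):
        raise ValueError("all curves must share the same distance bins")
    total = sum(n_points)
    return [
        sum(n * curve[k] for curve, n in zip(curves, n_points)) / total
        for k in range(n_bins)
    ]


# Hypothetical g(r) estimates from three small plots at two distances.
plot_curves = [[2.0, 1.0], [1.0, 1.0], [1.5, 0.5]]
plot_sizes = [20, 60, 20]  # number of points in each plot
combined = combine_replicates(plot_curves, plot_sizes)
print([round(v, 3) for v in combined])  # prints [1.3, 0.9]
```

The same weighting would have to be applied to each null-model simulation so that the observed average function and its envelopes remain comparable.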

Permanent plot scarcity
Most studies conducted SPPA in permanent plots (Wiegand et al. 2007a; Bilek et al. 2011; Zhao et al. 2015) or constitute an opportunity for the creation of such plots (Velázquez et al. 2014; Li et al. 2017; Carrer et al. 2018; Yang et al. 2018; Zhang et al. 2018   for large time periods allowing various analysis types and responding to different research, management, and conservation purposes. However, many regions of the world lack the continuous forest inventories, equipment, and human and/or financial resources needed to establish and monitor permanent plots. The absence of such plots constitutes a considerable gap which limits spatial data availability and thus the application of SPPA, which is very time-consuming and requires demanding fieldwork. Nowadays, the use of digital photography and geographical information systems (GIS) allows better data collection and analysis of vegetation spatial point patterns.

Number of pattern points
Despite the availability of rich literature on SPPA and its applications in ecology, the recommended minimum number of points is not always respected. Pommerening and Stoyan (2006) recommended the use of at least 100 points, while Wiegand and Moloney (2014) recommended 50-70 points. However, several studies used a small number of points (Table 2). This was also noted by Velázquez et al. (2016), who reported that approximately half of the studies they analysed performed SPPA with relatively few points (< 100). Cordero et al. (2016) used a very low number of trees in size-class analyses (e.g. 4 seedlings and 6 saplings). Wehenkel et al. (2015) used between 11 and 35 trees. It is important to note that some studies do not mention the number of trees used in their analyses (e.g. Navarro-Cerrillo et al. 2013; Zheng et al. 2017; Bassil et al. 2018). When long-term datasets are not available, most studies divide the investigated plants into successive life stages based on tree size (e.g. diameter, height), age, or growth form (e.g. overstory, understory, which are themselves based on tree size or age), then compare the spatial distribution between these classes (Li et al. 2020a), which further reduces the number of points per class. In this case, the problem is usually related to the high occurrence of random (or independent) patterns, which makes it difficult to conclude whether such a pattern reflects reality or results from the low number of points. For instance, Wehenkel et al. (2015) found almost exclusively independence between smaller and larger trees, which was suggested to be related to the low number of trees in these classes. Cordero et al. (2016) found similar results. Nevertheless, in other cases, non-random patterns may occur in spite of using few points. Indeed, Cordero et al. (2016) found non-random patterns within some tree classes (i.e. 13 adults), while they found a random distribution when using a relatively higher number (e.g. 75 adults). Thus, the correlation between random patterns and a low number of points is not always obvious. In this regard, Rajala et al. (2019) found that the power to detect biotic relationships is positively correlated with species abundance and with interaction scale and intensity, but negatively correlated with inequity in species abundances. Wehenkel et al. (2015) recommended using larger plot sizes (> 0.25 ha) in uneven-aged and species-rich forests to distinguish less apparent, but important, interactions between spatial pattern, diversity, and functioning in these ecosystems. However, independent associations can result from the cumulative effect of several complex processes (Getzin et al. 2014).
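One reason why few points often yield "random" results is that simulation envelopes become wide when the number of points is small, so that only strong departures fall outside them. The following Python sketch illustrates this under stated assumptions: patterns are simulated under CSR in a unit square, and a Clark-Evans-type nearest-neighbour ratio is used as a stand-in summary statistic only because it is short to compute. The function names, the number of simulations, and the envelope rank are arbitrary illustrative choices.

```python
import math
import random
from typing import List, Tuple


def nn_ratio(points: List[Tuple[float, float]], area: float = 1.0) -> float:
    """Clark-Evans-type ratio: mean nearest-neighbour distance / its CSR expectation."""
    n = len(points)
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        )
    expected = 0.5 / math.sqrt(n / area)  # 1 / (2 * sqrt(density)), edge effects ignored
    return (total / n) / expected


def csr_envelope_width(n: int, n_sim: int = 99, rank: int = 3, seed: int = 1) -> float:
    """Width of the CSR envelope: rank-th highest minus rank-th lowest simulated value."""
    rng = random.Random(seed)
    values = sorted(
        nn_ratio([(rng.random(), rng.random()) for _ in range(n)])
        for _ in range(n_sim)
    )
    return values[-rank] - values[rank - 1]


width_small = csr_envelope_width(10)   # pattern with 10 points
width_large = csr_envelope_width(100)  # pattern with 100 points
print(width_small, width_large)
```

With these settings the envelope for 10 points is expected to be clearly wider than the one for 100 points, which means that a moderate degree of aggregation or regularity that would be detected with 100 points may remain inside the envelope with 10 points.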
Plot size does not necessarily determine the number of individuals, which depends on the forest under investigation. For example, Ben-Said et al. (2020) established a plot with a radius of 20 m which contained 79 trees, while another plot with a radius of only 15 m contained 98 trees. Moreover, Comas et al. (2009) used circular plots with a radius of 20 m that contained approximately 100 pine trees, resulting in plots ranging from 0.04 to 0.16 ha. Thus, to deal with this problem, combining plot size and number of points seems to be a suitable alternative; that is, the sample size can be conditioned by the number of individuals involved. Despite being considered small samples, plots containing fewer than 50 points can be a first step, especially in exploratory studies, to make preliminary hypotheses for forests that have not been the subject of previous spatial pattern studies (Ben-Said et al. 2020), as well as when sufficient equipment and financial or human resources are lacking. Thus, choosing the sample size as a function of the number of individuals involved seems to be a trade-off between the minimum number required in SPPA and the difficulty of establishing large plots.
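The arithmetic behind the two plots of Ben-Said et al. (2020) mentioned above can be written out as a short Python sketch. The function names are hypothetical, and the second function simply inverts the area formula to show how a plot radius could be chosen from a target number of points, assuming that stem density is roughly known and homogeneous.

```python
import math
from typing import Tuple


def circular_plot_summary(radius_m: float, n_trees: int) -> Tuple[float, float]:
    """Return (area in ha, density in trees per ha) for a circular plot."""
    area_ha = math.pi * radius_m ** 2 / 10_000  # 1 ha = 10,000 m^2
    return area_ha, n_trees / area_ha


def required_radius(target_n: int, density_per_ha: float) -> float:
    """Radius (m) of a circular plot expected to contain target_n trees."""
    area_m2 = target_n / density_per_ha * 10_000
    return math.sqrt(area_m2 / math.pi)


area_a, density_a = circular_plot_summary(20, 79)  # plot with a 20 m radius
area_b, density_b = circular_plot_summary(15, 98)  # plot with a 15 m radius
print(round(area_a, 3), round(density_a))  # prints 0.126 629
print(round(area_b, 3), round(density_b))  # prints 0.071 1386
```

The smaller plot therefore holds more trees because its stem density is roughly twice as high, which is why a fixed plot size alone does not guarantee a given number of points.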

Conclusion
The spatial position of plants represents an interesting source of information and makes it possible to infer the processes responsible for plant spatial arrangements. Many researchers developed a range of statistics, known as summary statistics, which allow spatial stand structure to be characterized and interpreted. The availability of spatial pattern studies in plant ecology offers a rich basis to test several hypotheses and theories. Despite the solid theoretical background of SPPA and its wide range of applications, some aspects of SPPA remain unclear, such as the minimum number of points and the plot size required. The correlation between low tree abundance and random patterns remains controversial. Even in similar forest communities, inconsistent results have been reported. Indeed, the large variability observed in the sample size among studies offers more flexibility. In exploratory studies, small sample sizes can provide a basis to make preliminary hypotheses on the observed patterns. On the other hand, in many regions of the world, the lack (and even the absence) of permanent plots and/or periodic forest inventories constitutes a gap that hinders consistent monitoring of the spatial and temporal evolution of forest structure. Therefore, more attention should be paid to spatial pattern data, especially for the main forest tree species. This data type is also recognized to be important for management purposes.
Drawing on a rich literature, this paper offers a wide range of information and can help beginner ecologists in (1) taking into account some basic requirements of SPPA, (2) directly selecting scientific studies based on different SPPA types and related characteristics (uni-/bi-/multivariate analysis, unmarked/marked analysis, test functions, types of marks, null models, software, etc.), and (3) drawing attention to key ambiguities that are a source of difficulty in SPPA and need further clarification.