Checking the Consistency of Volunteered Phenological Observations While Analysing Their Synchrony

The increasing availability of volunteered geographic information (VGI) enables novel studies in many scientific domains. However, inconsistent VGI can negatively affect these studies. This paper describes a workflow that checks the consistency of Volunteered Phenological Observations (VPOs) while considering the synchrony of observations (i.e., the temporal dispersion of a phenological event). The geographic coordinates, day of the year (DOY) of the observed event, and the accumulation of daily temperature until that DOY were used to: (1) spatially group VPOs by connecting observations that are near to each other, (2) define consistency constraints, (3) check the consistency of VPOs by evaluating the defined constraints, and (4) optimize the constraints by analysing the effect of inconsistent VPOs on the synchrony models derived from the observations. This workflow was tested using VPOs collected in the Netherlands during the period 2003–2015. We found that the average percentage of inconsistent observations was low to moderate (ranging from 1% for wood anemone and pedunculate oak to 15% for cow parsley species). This indicates that volunteers provide reliable phenological information. We also found a significant correlation between the standard deviation of DOY of the observed events and the accumulation of daily temperature (with correlation coefficients ranging from 0.78 for lesser celandine, and 0.60 for pedunculate oak). This confirmed that colder days in late winter and early spring lead to synchronous flowering and leafing onsets. Our results highlighted the potential of synchrony information and geographical context for checking the consistency of phenological VGI. Other domains using VGI can adapt this geocomputational workflow to check the consistency of their data, and hence the robustness of their analyses.


VGI Attribute Consistency
Because of the progress in information, communication and mobile location-aware technologies, the use of volunteered geographic information (VGI) has dramatically increased in recent times [1,2]. VGI is useful since it permits citizens to act as human sensors and to, for instance, contribute to environmental research. However, the quality of the information provided by volunteers remains a concern [3,4]. In VGI, contributors produce geographic information without necessarily being experts in the study domain. Furthermore, volunteers have different levels of expertise in recognizing geographical phenomena [5]. As a result, the quality of VGI has always been a concern [6,7]. An essential VGI concern is information consistency [8].
VGI consistency refers to the degree of adherence to logical rules of data structure, attribution and relationships as described in the information specifications [4,9,10]. Attribute consistency refers to the consistency of attributes associated with specific locations, objects or events placed in geographic space, such as time (e.g., date). Traditionally, VGI attributed consistency is assessed using a direct comparison with reference data collected with defined quality standards [4,6,11]. For instance, Jokar Arsanjani et al. [7] investigated the attribute consistency of OpenStreetMap (www.openstreetmap.org) project data using the Corine Land Cover database and the pan-European Global Monitoring for Environment and Security Urban Atlas (GMESUA) dataset as the authoritative reference.
Yet, for time-critical domains such as disaster response [7] and nature observations, reference datasets are not always available [7]. An alternative approach consists of evaluating the available data along three intrinsic dimensions [12]: (1) crowdsourcing evaluation, (2) social measures and (3) geographic context. In crowdsourcing evaluation, VGI is manually checked and edited by multiple contributors to ensure the consistency. Social measures focus on the assessment of the contributors themselves as a proxy measure to the quality of their contributions. For example, the number of contributors [13] and positive and negative edits [14] are used as a measure of VGI consistency. Although both dimensions contribute to checking VGI consistency, they only rely on information about VGI contributors and disregard the geographic context in which the data was collected. The third dimension relates to Tobler's first law of geography, which states that "all things are related, but nearby things are more related than distant things" [15].
The geographic context approach allows considering spatial aspects in the consistency checks. For example, Schlieder and Yanenko [10] built a framework to determine the probable spatial inconsistencies in social reporting and Bonter and Cooper [16] tested the use of a smart filter system that uses the geographic location of species observations to identify counts of species that are too high or species that do not frequently appear on standard lists. Existing approaches to check consistency using the geographic context often use the geographic distance between VGI as the main and only parameter to include in the checks. Yet, contextual geoinformation are available now more than ever before, and they can be used to improve the consistency checks. For instance, checking VGI that is collected for ecological studies, such as phenology, can benefit from existing environment contextual geoinformation such as weather datasets.

Synchrony of Phenological VGI
Phenology is the study of periodic plant and animal life cycle events (phenophase) and how seasonal and inter-annual variations in weather conditions affect them [17][18][19]. Phenological studies benefit from VGI developments to acquire observations that support climate change studies at various scales [20][21][22]. National and regional phenological networks promote phenological research and curate ever-increasing collections of Volunteered Phenological Observations (VPOs) collected by large crowds of volunteers [1,2]. VPOs are available at fine spatial scales and provide timely and low-cost information [23][24][25]. Hence, VPOs support novel phenological studies that study the impact of climate change on plants and animals [26].
The analysis of the temporal dispersion of a phenophase across individuals of the same species, or phenological synchrony, could provide information about how context derives the timing of phenophase. Changes in phenological synchrony have ecological consequences for individual survival and ecosystem stability [27][28][29]. For example, low synchrony in plant flowering can hamper the expected random mating pattern because early bloomers will likely get pollinated by early plants, and late plants by late plants [30]. In regions with a marked seasonality, synchrony is strongly controlled by temperature variability [21,31]. Phenological synchrony is typically quantified by the standard deviation of Day of Year (DOY) of all the observations collected in a given area and year [21,[32][33][34]. This quantification method is sensitive to inconsistent VPOs [34,35]. Although VPOs support climate change studies such as synchrony analysis, the consistency of the observations provided by volunteers remains a concern.
Synchrony analysis is sensitive to anomalously early or late VPOs regarding their associated environmental conditions [35,36]. These so-called inconsistent VPOs can bias the results of trend analysis and modelling because they are not representative and decrease the ratio of signal to noise [25,37]. Inconsistent VPOs are expected because volunteers may perform observations at locations with environmental conditions that are not representative of the phenophases being monitored (e.g., they might report data for an individual plant growing under a unique micro-climate). Most VPO consistency checks rely on applying a threshold to identify outliers. These checks use only VPOs and they assume that most of the observations are consistent and can be used to estimate the (frequency) distribution of the reported dates and to recognize outliers [38]. However, this assumption is not always correct when working with volunteered observations.
The commonly used Tukey boxplot [39] uses 1.5 times the absolute value of the difference between the first and third quartiles of the annual DOYs to highlight outliers. This is a crucial step of abovementioned checks. The integration of historical and independent sources of geoinformation with VGI has been found to be an effective approach to improve the quality of VGI [40,41]. In phenology, there is contextual environmental information (e.g., temperature) that plays a key role in many phenological studies. Yet, this information is not used in consistency checking [37,42]. This contextual information has the potential to check the consistency of VPOs by integrating assumptions about the effect of context on VPOs.
This study checks the consistency of VPOs while taking the effect of inconsistent observation on phenological synchrony into account. In the next section, we describe a geocomputational workflow that uses the geographic location and DOY of VPOs, and the associated daily temperature at the locations where the observations were made. Here, we tested the workflow using VPOs from the Dutch phenological network and daily temperature time-series from the Dutch national weather service. The added value of our workflow is evaluated by comparing it with the results of a more classical outliers identification method, namely the Tukey boxplot.

VPOs and Temperature Datasets
This study is illustrated with VPOs from the Dutch phenological network Natuurkalender (www. natuurkalender.nl) (Nature's Calendar). This volunteer-based network was established in 2001 to monitor phenophases for a wide range of species in the Netherlands [43]. Every volunteer that can recognize plants and phenophases can contribute his/her observations via the website of the project. The number of observers and the locations of the VPOs vary from year to year. For more details regarding the VPOs and the available datasets, see Vliet et al., 2009. Considering the spatio-temporal coverage of the available observations, spring phenophases of four plants species were selected for this study: flowering of lesser celandine (Ficaria verna Huds), wood anemone (Anemone nemorosa L.), and cow parsley (Anthriscus sylvestris (L.) Hoffm), and leafing of pedunculate oak (Quercus robur L.). For each species, the annual number of VPOs is presented in Table 1. Lesser celandine and wood anemone are herbs that grow in forests, grasslands, and next to roads and waterways. Both herbs are intensively studied in Europe as an indicator of global environmental changes on the distribution and abundance of forest understorey [44]. Lesser celandine starts to flower early in March and wood anemone flowers around mid-March. Cow parsley is also an herb, which grows in rich, moist soils along roadsides and forest paths. It flowers from late March onwards. Pedunculate oak is a tree, often planted on sandy soils, which starts leafing around late April. Cow parsley and pedunculate oak have been found to be indicators of climate change by studies performed in Europe. For the period 2003-2015, the Natuurkalender database has a total of 3042, 1298 and 1811 flowering observations for lesser celandine, wood anemone and cow parsley, and 586 leafing observations for pedunculate oak. These observations only contain the geographic location provided by a volunteer (in the Dutch National Coordinate System; EPSG: 28992) and the date that a phenophase was first observed in a given year.
In the Netherlands and other temperate regions, temperature is the main driver of plant phenophases [33,45,46]. Hence, the annual temperature regimes of each observation were characterized using 1 km gridded estimates of daily average temperature, as downloaded from the Royal Netherlands Meteorological Institute (KNMI; https://data.knmi.nl/datasets). These grids, created by interpolating daily temperature records collected by 150 meteorological stations distributed across the country, were used to calculate Growing Degree Days (GDDs) by adding up the daily average temperature (T) above a base temperature (T b ) from the first of January of the year of the observation until the reported date of observation of each VPO [47]. We selected 0 • C as the base temperature (Equation (1)). This base temperature has been used for modelling the timing of leafing and flowering and their regional variation for several species in Western Europe [48][49][50].

Analysing Consistency and Synchrony of VPOs
The proposed workflow consists of four steps ( Figure 1) and uses the geographic location of the VPOs, their observed DOY and the GDDs accumulated until that DOY. The first three steps are used to check annual VPOs consistency. These steps connect, model and compare VPOs that are near to each other to identify sets of inconsistent observation. The fourth step focuses on modelling synchrony and optimizing the identification of inconsistent VPOs. The next paragraphs describe each of the steps in more detail. In the first step, the geographic locations of the annual VPOs were used to spatially group observations by connecting observations that are near to each other. This creates a "network graph". This approach was chosen because near observations locations are more likely to have similar weather and observations that are further apart. The spatial variability of meteorological parameters, other than temperature, is small across the Netherlands because of the relatively homogeneous topography (maximum difference in elevation of about 300 m), and a gradual and smooth steepness [52]. These annual graphs were made using a Delaunay triangulation and edges longer than a given threshold distance were pruned [53]. This triangulation method was chosen because it is computationally efficient and avoids long edges [9,54]. The pruning distance was set to 100 km because this value ensured good connectivity (each observation was connected to at least two other observations).
In the second step, the difference in the observed DOY of connected observations was modelled using their corresponding difference in GDDs. For each year, the Pearson product-moment correlation coefficient was calculated and a linear regression model was fitted [22,47]. The slopes of the regression lines (which represent the rate of spatial change in DOY per unit of GDDs) are used to define consistency constraints. More precisely, constraints were defined by fixing a maximum difference in DOY (ΔMax) for observations performed under similar environmental conditions. In In the first step, the geographic locations of the annual VPOs were used to spatially group observations by connecting observations that are near to each other. This creates a "network graph". This approach was chosen because near observations locations are more likely to have similar weather and observations that are further apart. The spatial variability of meteorological parameters, other than temperature, is small across the Netherlands because of the relatively homogeneous topography (maximum difference in elevation of about 300 m), and a gradual and smooth steepness [51]. These annual graphs were made using a Delaunay triangulation and edges longer than a given threshold distance were pruned [52]. This triangulation method was chosen because it is computationally efficient and avoids long edges [9,53]. The pruning distance was set to 100 km because this value ensured good connectivity (each observation was connected to at least two other observations).
In the second step, the difference in the observed DOY of connected observations was modelled using their corresponding difference in GDDs. For each year, the Pearson product-moment correlation coefficient was calculated and a linear regression model was fitted [21,46]. The slopes of the regression lines (which represent the rate of spatial change in DOY per unit of GDDs) are used to define consistency constraints. More precisely, constraints were defined by fixing a maximum difference in DOY (∆Max) for observations performed under similar environmental conditions. In practical terms, this indicates that for a given difference in GDDs, the difference in DOY should not exceed ∆Max, otherwise, the two reported DOYs are inconsistent to each other.
In the third step, the consistency constraint was checked by using a range of ∆Max values. Since ∆Max cannot be known a priori, the chosen range varies from one week to one month in steps of one day. The minimum ∆Max was set to one week to take genetic variation into account because two individuals of the same species that grow at the same location (i.e., under the same temperature regimes) may not start flowering (or leafing) at the same DOY. For each species and year, the inconsistent observations corresponding to each ∆Max value (i.e., VPOs refuted by more than one connected VPO) were highlighted and excluded. This resulted in 24 sets of consistent VPOs for each species. The optimal ∆Max for each species, and consequently the final set of inconsistent VPOs, was selected after performing the phenological synchrony analysis that takes place in the last step (step 4- Figure 1) of the workflow.
In the fourth step, the synchrony of consistent VPOs was analysed and modelled using the inter-annual variability in temperature as an explanatory variable. Wang et al. (2016) and Thackeray et al. (2016) proved that warmer days in late winter and early spring lead to less synchronous flowering and leafing onsets than colder transitions [34,54]. They showed that the annual standard deviations of flowering and leafing onset are significantly higher in years with a lower rate of change of temperature from winter to spring. In such years, the number of days prior to flowering and leafing onset that have high temperature increases. This results in a larger average GDD than in years with a higher rate of temperature change. For each set of consistent VPOs, the Pearson product-moment correlation coefficient between the annual standard deviation of the DOYs and the annual average of their GDD values was calculated. For each species, the optimal ∆Max corresponds to the value that maximizes the correlation. A linear regression to the annual standard deviation of the DOYs and average GDD of the final set of consistent VPOs was also fitted to model synchrony.
The fitness of this model was evaluated by its coefficient of determination (R-squared). The R-squared values of the synchrony model driven with consistent VPOs were compared with the R-squared values produced by the models fitted using: (1) the original and (2) the outlier-free (boxplot outliers excluded) VPOs. This comparison helps to understand the added value of the proposed workflow. The percentage of annual inconsistent and outlier VPOs was also calculated to compare the amount of data lost through the cleaning of observation by the proposed workflow and by the outlier-filtering method (based on the boxplot).

Results
The pruned Delaunay triangulation of VPO locations results in annual graphs of VPOs locations. For lesser celandine (Figure 2), wood anemone ( Figure A1) and cow parsley ( Figure A2), these graphs cover the whole of the Netherlands. However, the VPOs of pedunculate oak ( Figure A3) are mostly located in the centre of the country. For all species, a large dispersion of VPO locations from year to year was observed, which is intrinsic to volunteer monitoring network data. For example, the spatial variability found for lesser celandine shows that volunteers observed this species almost everywhere in the Netherlands however, in some years (e.g., 2007 and 2008) observations are more clustered in the western part of the country (Figures 3 and A4-A6). The four largest cities of the country (Amsterdam, Rotterdam, The Hague and Utrecht) are located in the western part. The level of spatial connectivity of VPOs is higher surrounding these cities. In these areas, the results of consistency check are more reliable because a larger number of VPOs are available to confirm or refute the consistency of the connected observations. Annual graphs for lesser celandine flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.
The correlation between the difference in DOYs and the difference in GDDs was significant for all species. The average correlation coefficient was 0.91 for lesser celandine (Figure 3), 0.90 for wood anemone ( Figure A4), 0.93 for cow parsley ( Figure A5) and 0.90 for pedunculate oak ( Figure A6). This indicates the considerable influence of the accumulated daily temperature on the timing of flowering. The slopes of the fitted regression lines show the rate of change of the difference in reported DOYs per unit of difference in GDD (i.e., the steeper the slope, the larger the difference in the timing of the phenophase). The comparison of the annually modelled and reported difference in DOYs is used to identify 24 sets of inconsistent observations.  The correlation coefficient between the annual standard deviation of consistent DOY corresponding to each ΔMax value and the annual average GDD were calculated for lesser celandine (Figure 4), wood anemone, cow parsley and pedunculate oak ( Figure A7). The ΔMax that lead to the phenological synchrony model with the largest coefficient of determination was 13 days for lesser celandine, 20 days for wood anemone, seven days for cow parsley and 15 days for pedunculate oak. For example, the optimal ΔMax value indicated that the maximum difference in the timing of flowering of two lesser celandine plants (located less than 100-km apart and growing under similar temperature regimes) was 13 days. The small value of ΔMax for cow parsley flowering, a plant with a shallow root system, might be caused by the impact of other environmental parameters such as soil moisture and light intensity [56]. The correlation between the difference in DOYs and the difference in GDDs was significant for all species. The average correlation coefficient was 0.91 for lesser celandine (Figure 3), 0.90 for wood anemone ( Figure A4), 0.93 for cow parsley ( Figure A5) and 0.90 for pedunculate oak ( Figure A6). This indicates the considerable influence of the accumulated daily temperature on the timing of flowering. The slopes of the fitted regression lines show the rate of change of the difference in reported DOYs per unit of difference in GDD (i.e., the steeper the slope, the larger the difference in the timing of the phenophase). The comparison of the annually modelled and reported difference in DOYs is used to identify 24 sets of inconsistent observations. The correlation coefficient between the annual standard deviation of consistent DOY corresponding to each ∆Max value and the annual average GDD were calculated for lesser celandine (Figure 4), wood anemone, cow parsley and pedunculate oak ( Figure A7). The ∆Max that lead to the phenological synchrony model with the largest coefficient of determination was 13 days for lesser celandine, 20 days for wood anemone, seven days for cow parsley and 15 days for pedunculate oak. For example, the optimal ∆Max value indicated that the maximum difference in the timing of flowering of two lesser celandine plants (located less than 100-km apart and growing under similar temperature regimes) was 13 days. The small value of ∆Max for cow parsley flowering, a plant with a shallow root system, might be caused by the impact of other environmental parameters such as soil moisture and light intensity [55]. For all species, optimal inconsistent VPOs show a large difference in the reported DOY compared to their surrounding VPOs ( Figure 5). For all species, except cow parsley, the annual percentages of VPOs, highlighted as possible inconsistent observations, is smaller than the annual percentages of boxplot outliers (Figure 6 and A8). Inconsistent VPOs refer to unusually early or late DOYs with respect to the regional temperature regime of the observation site whereas boxplot outliers only identify very early or late observations for a given set of annual observations and, in consequence, do not consider the effect of regional contextual information. The annual boxplots for lesser celandine (Figure 7d), wood anemone, cow parsley and pedunculate oak (Appendix F) show that outliers are mostly located below the lower whisker of the boxplot. This indicates that the distribution of observed DOYs is not normally (Gaussian) distributed as the boxplot assumes. As a result, boxplot filters out several early VPOs that are consistent observations. Such consistent early VPOs can provide reliable information about advancement in the timing of phenophases. For all species, optimal inconsistent VPOs show a large difference in the reported DOY compared to their surrounding VPOs ( Figure 5). For all species, except cow parsley, the annual percentages of VPOs, highlighted as possible inconsistent observations, is smaller than the annual percentages of boxplot outliers (Figures 6 and A8). Inconsistent VPOs refer to unusually early or late DOYs with respect to the regional temperature regime of the observation site whereas boxplot outliers only identify very early or late observations for a given set of annual observations and, in consequence, do not consider the effect of regional contextual information. The annual boxplots for lesser celandine (Figure 7d), wood anemone, cow parsley and pedunculate oak ( Figure A8) show that outliers are mostly located below the lower whisker of the boxplot. This indicates that the distribution of observed DOYs is not normally (Gaussian) distributed as the boxplot assumes. As a result, boxplot filters out several early VPOs that are consistent observations. Such consistent early VPOs can provide reliable information about advancement in the timing of phenophases.  The synchrony analysis resulted in models that predict the standard deviation of the DOY of the selected phenophases using the annual average GDD. For consistent VPOs, the correlation coefficient between the standard deviation of the DOY and the annual average GDD was 0.78 for lesser celandine, 0.63 for wood anemone, 0.61 for cow parsley and 0.60 for pedunculate oak. These results suggest that the timing of flowering and leafing onsets for the species under study is more   The synchrony analysis resulted in models that predict the standard deviation of the DOY of the selected phenophases using the annual average GDD. For consistent VPOs, the correlation coefficient between the standard deviation of the DOY and the annual average GDD was 0.78 for lesser celandine, 0.63 for wood anemone, 0.61 for cow parsley and 0.60 for pedunculate oak. These results suggest that the timing of flowering and leafing onsets for the species under study is more

Discussion
This study presents a workflow to check the consistency of VPOs. Unlike purely statistical methods, our workflow uses the geographic location and the corresponding accumulation of daily temperature as independent sources of information. The workflow defines and evaluates consistency constraints based on the correlation between VPOs synchrony and the rate of change of temperature from winter to spring. We used the workflow to filter out phenological observations that do not provide regionally representative species-specific flowering and leafing DOYs and labels them as "inconsistent". Unlike existing threshold-based outlier detection methods, our workflow is based on The synchrony analysis resulted in models that predict the standard deviation of the DOY of the selected phenophases using the annual average GDD. For consistent VPOs, the correlation coefficient between the standard deviation of the DOY and the annual average GDD was 0.78 for lesser celandine, 0.63 for wood anemone, 0.61 for cow parsley and 0.60 for pedunculate oak. These results suggest that the timing of flowering and leafing onsets for the species under study is more synchronous in cold late winters and early springs than in warm ones. The comparison between the synchrony models, made from original, outlier-free and consistent VPOs, shows that using boxplots negatively impacts the quality of the model (Figures 7 and A9-A11). For all species, the R-squared value of the models based on consistent VPOs is larger than that of the models based on outlier-free data (Table 2). Moreover, the models based on consistent VPOs are more in line with the models made using original VPOs. Removing a large number of outliers ( Figure 6) by using the Tukey boxplot method leads to strongly distorted models at large geographical scales (c.f. Figure 7b). The application of our workflow improves the quality of the model (notice the improved R-squared values in Figure 7c).

Discussion
This study presents a workflow to check the consistency of VPOs. Unlike purely statistical methods, our workflow uses the geographic location and the corresponding accumulation of daily temperature as independent sources of information. The workflow defines and evaluates consistency constraints based on the correlation between VPOs synchrony and the rate of change of temperature from winter to spring. We used the workflow to filter out phenological observations that do not provide regionally representative species-specific flowering and leafing DOYs and labels them as "inconsistent". Unlike existing threshold-based outlier detection methods, our workflow is based on a geographic context approach to identify neighbours of VPOs. Moreover, the workflow not only considers the geographic context of VPOs, as used by Schlieder and Yanenko [11] and Bonter and Cooper [18], but it integrates this information with environmental contextual information to check the VPOs consistency. As a result, the workflow avoids filtering unusually early or late observations that are consistent with their environment.
Inconsistent VPOs can be caused by either species and/or phenophase misidentification or they can be true observations influenced by a microscale temperature regime that is hard to model using 1 by 1 km temperature data. In either case, inconsistent observations are not representative of the phenology of this species in the Netherlands. The high correlation found between the difference in DOYs and the difference in GDD of near observations on flowering and leafing onsets indicates that daily temperature is indeed relevant for the analysis of the selected species and phenophases. This is in line with the fact that daily temperature is a dominant factor for plant phenology in temperate and boreal regions [56]. This highlights the importance of storing metadata about volunteered observations to improve the temporal consistency in phenological databases.
Considering ∆Max as a proxy for the spatial variability of the timing of a phenophase under similar temperature conditions helps to constrain the temporal window in which the occurrence of a VPO is consistent. As ∆Max takes into account the geographic context, a quality control mechanism of VPOs based on this metric outperforms alternative methods solemnly based on data distributions. Given the increasing popularity of citizen science networks, we expect to get better estimates of ∆Max in the near future. The ∆Max metric can also help to understand the phenology of species. For instance, the relatively small value of ∆Max for cow parsley (seven days) indicates that the flowering onset of this species is controlled more strongly by temperature as opposed to the other selected species or phenophases. In consequence, this species shows the highest correlation between the difference in DOYs and the difference in GDD of VPOs. The initial ∆Max value (i.e., one week) that sets a maximum difference in DOY for observations performed under similar environmental conditions is not known. This value might be different in other study areas or for other plant species.
In this study, we used a pruned Delaunay triangulation to connect VPOs to their closest neighbour [57][58][59]. This approach has advantages over using a distance-based method, which connects all VPOs within a specific distance to the under-check observation. In the distance-based approach, a distance threshold has to be chosen. However, such a distance threshold is not always known and selection of an arbitrary distance threshold could have a negative impact on the analysis. Moreover, a distance-based approach would potentially lead to VPOs with no neighbours within their vicinity. However, the applied Delaunay triangulation method is more flexible and data-driven. It uses closer neighbours when available (leading to smaller triangles) and more distant VPOs when needed (leading to larger triangles).
Here, we assume that GDD accumulations drive the synchrony of the selected phenophases within a 100-km distance. The annual number of observations does not substantially affect the synchrony model of species. For instance, the annual number of observations of cow parsley is higher than the number of observations of wood anemone and pedunculate oak. Yet, the R-squared of the synchrony model found for this species is smaller than that of wood anemone and pedunculate oak. In the Netherlands, the level of spatial connectivity of VPOs using this distance threshold is high, however, this might not be the case when analysing larger areas. In such areas, the analysis might be hampered by annual graphs with a low level of spatial connectivity. Thus, advanced spatial point pattern tests could help to evaluate the spatial distribution of VPOs as well as to identify the similarity of VPOs distributions over the study period. For example, Andresen and Malleson provide various tests to measure the degree of density and of similarity of spatial point patterns [60].
Other temperature-driven metrics than GDD such as the average temperature, could be tested with our workflow. For example, Calinger et al. [61] showed the suitability of average monthly temperatures during the month of the phenophase and some number of months prior to the event to model phenological responses to temperature across many species. Moreover, GDD accumulations may not be the only or main driver of flowering and leafing onset in other study areas. The length of the chilling period [62,63], photoperiod [24,63], and precipitation and elevation [64] might also drive flowering onset in spring. For example, Studer et al. [65] used a multivariate regression to model the timing of wood anemone flowering as a function of temperature and precipitation. A similar model could be used during the consistency check phase of our workflow because it is generic enough to accommodate other phenological drivers.
Our workflow works for events that are synchronized, yet, this is the case for several ecological phenomena. In citizen science, several networks monitor environmental events that are weather-driven. Examples of such types of monitoring are the reporting of tick and mosquitoes bites and observation of migrating birds. For these types of phenomena, different types of weather data can be used to find inconsistent observations. The developed workflow can also be useful in these domains.

Conclusions
VGI has greatly contributed to phenological studies, leading to an improved understanding of plant and animal seasonality across the globe. In this respect, checking the consistency of volunteered phenological observations or VPO is a pre-requisite to ensure the validity and representativeness of VPO-based results. In this paper, we present a workflow designed to use geographical and contextual information associated with phenological observations to check the consistency of observations while analysing their synchrony. This workflow was used to improve our knowledge of the local impact of inter-annual temperature variations on the consistency and synchrony of VPOs from various plant species in the Netherlands.
Our results reveal that the most common method (boxplot) to filter outliers in VPOs substantially biased synchrony analysis of the timing of the spring flowering and leafing. Our results indicate that climate change and inter-annual weather variability determine changes in the synchrony of spring plant phenology. Given that several national and international initiatives facilitate and actively support the collection of VGI for ecological studies and that the open data movement is resulting in more contextual environmental information becoming available, the proposed workflow provides a unique opportunity to check the consistency of volunteered observations. Considering its general character, we think that this geocomputational workflow could be adapted to other kinds of VGI, hence contributing to the curation of this interesting source of geospatial data. Appendix A Figure A1. Annual graphs for wood anemone flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned. Figure A1. Annual graphs for wood anemone flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.
ISPRS Int. J. Geo-Inf. 2018, 7, x FOR PEER REVIEW 15 of 24 Figure A2. Annual graphs for cow parsley flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned. Figure A2. Annual graphs for cow parsley flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.  Figure A3. Annual graphs for pedunculate oak flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned. Figure A3. Annual graphs for pedunculate oak flowering onset observations. The graphs were made using a Delaunay triangulation and edges longer than 100 km were pruned.   observations of cow parsley flowering onset (see Figure A2). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year. Figure A6. Linear regression fit between the difference in observed flowering DOY (number of days) and the difference in modelled GDD (number of degree days) for pairs of spatially connected observations of pedunculate oak leafing onset (see Figure A3). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year. Figure A6. Linear regression fit between the difference in observed flowering DOY (number of days) and the difference in modelled GDD (number of degree days) for pairs of spatially connected observations of pedunculate oak leafing onset (see Figure A3). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year. observations of cow parsley flowering onset (see Figure A2). Correlation coefficient and rate of spatial change in DOY per unit of GDDs (i.e., slope of regression line) are given for each observation year.