Land Cover Mapping in Cloud-Prone Tropical Areas Using Sentinel-2 Data: Integrating Spectral Features with Ndvi Temporal Dynamics

: Accurate remote sensing and mapping of land cover in the tropics remain di ﬃ cult tasks since data gaps and a heterogenic landscape make it challenging to perform land cover classiﬁcation. In this paper, we proposed a multi-feature classiﬁcation method to integrate temporal statistical features with spectral and textural features. This method is designed to improve the accuracy of land cover classiﬁcation in cloud-prone tropical regions. Sentinel-2 images were used to construct an NDVI stack for a time-series statistical analysis to characterize the temporal variance of land cover. Two statistical indices were calculated and used to represent the variation in annual vegetation. These indices included the mean (NDVI_mean) and coe ﬃ cient of variation (NDVI_cv) for the NDVI time series. The temporal statistical features were then integrated with spectral and textural features extracted from high-quality Sentinel-2 imagery for Random Forest classiﬁcation. The performance and contribution of di ﬀ erent combinations were assessed based on their classiﬁcation accuracies. Our results show that the time-series statistical analysis is an e ﬀ ective way to represent land cover category information contained in annual NDVI variance. The method uses clear pixels from dense low-quality images to obtain the NDVI statistical characteristics, thus, to reduce the inﬂuence of random factors such as weather conditions on single-date image. The addition of NDVI_mean and NDVI_cv can improve the separability among most types of land cover. The overall accuracy and the kappa coe ﬃ cient reached values of 0.8913 and 0.8514 when NDVI_mean and NDVI_cv were integrated. Furthermore, the time-series statistical analysis has less stringent requirements regarding image quality and features a high computational e ﬃ ciency, which shows its great potential to improve the overall accuracy of land cover classiﬁcation at regional scales in cloud-prone tropical regions.


Introduction
Land cover patterns denote fundamental land surface characteristics and underlying biophysical processes. Therefore, land cover has been regarded as an important climatic variable and environmental factor for modelling Earth surface processes related to anthropogenic activities and the natural This study was aimed at assessing the potential for discriminating land cover categories in cloud-prone tropical areas using NDVI statistical indicators derived from time-series Sentinel-2 images, combined with spectral and textural features derived from single-date Sentinel-2 imagery. We hypothesized that temporal statistical features can assist in discriminating between land cover categories with similar spectral reflectance in single-date imagery. Multi-feature classification was implemented using ensemble learning algorithms. In this paper, we detail the method and discuss its strengths and weaknesses.

Study Area
The study area was located in the Southeastern Mun River Basin, Thailand ( Figure 1). The area features a humid tropical monsoon climate with an annual temperature of 18 °C or greater and an average annual precipitation of 1,300~1,500 mm. The rainy season lasts from mid-May to the beginning of October, while the dry season lasts from November to April [29]. Natural and plantation forests are distributed in the southern mountainous area. Agricultural crops are cultivated on the northern plain. Crops can be cultivated at any time in the wet season due to the suitable hydrothermal conditions. They can also be cultivated in the dry season with the support of irrigation systems. The crops are harvested two or three times in a year in many areas. The spectral characteristics for crops can vary at different times and locations since the planting period is not limited by a fixed growing season [30]. The types of land cover were divided according to the natural conditions and land use in the study area. The first class was divided into forests (including natural and plantation forests as the second class), farmlands (including paddy fields and dry fields as the second class), water bodies, wetlands, and impervious surfaces.

Sentinel-2 Single-Date Image Data
The Sentinel-2 program is a land monitoring mission that is a part of the European Copernicus Program and provides unprecedented perspectives on terrestrial observations [31]. The Sentinel-2 platform is a twin-orbit constellation (Sentinel-2A/B) with a revisit time of approximately five days at the equator. The main MSI sensor provides 13 spectral bands at various spatial resolutions. These spectral bands include: four bands in the blue (b2), green (b3), red (b4), and infrared (b8) spectrum with a 10 m spatial resolution; six bands in the red-edge (b5, b6, b7), near-infrared (NIR) (b8a), and short-wave infrared (SWIR) (b11, b12) spectrum with a 20 m spatial resolution; and three atmospheric bands (b1, b9, and b10) with a 60 m spatial resolution. In the study area, a few

Sentinel-2 Single-Date Image Data
The Sentinel-2 program is a land monitoring mission that is a part of the European Copernicus Program and provides unprecedented perspectives on terrestrial observations [31]. The Sentinel-2 platform is a twin-orbit constellation (Sentinel-2A/B) with a revisit time of approximately five days at the equator. The main MSI sensor provides 13 spectral bands at various spatial resolutions. These spectral bands include: four bands in the blue (b2), green (b3), red (b4), and infrared (b8) spectrum with a 10 m spatial resolution; six bands in the red-edge (b5, b6, b7), near-infrared (NIR) (b8a), and short-wave infrared (SWIR) (b11, b12) spectrum with a 20 m spatial resolution; and three atmospheric bands (b1, b9, and b10) with a 60 m spatial resolution. In the study area, a few cloud-free images can be obtained in the dry season, while the rainy season lasts for a long period and features heavy cloud contamination. A cloud-free Sentinel-2A image (Figure 1) from Feb. 13, 2017, was downloaded Remote Sens. 2020, 12, 1163 4 of 18 from the European Space Agency's (ESA) Sentinel Scientific Data Hub (https://scihub.copernicus.eu/). The image is available as a Level-1C product with radiometric and geometric corrections. In data pre-processing, atmospheric correction was performed using the Sen2cor plug-in [32] to convert the Top-of-Atmosphere (TOA) reflectance to corrected Bottom-of-Atmosphere (BOA) reflectance values. Six bands were then resampled from a 20 m to 10 m resolution using the nearest neighbor sampling method within SNAP software. These bands were then stacked using four bands with a 10 m resolution for classification. The three atmospheric bands were excluded from analysis.

Stacking Time-Series Sentinel-2 NDVI
A total of 40 scenes composed of Sentinel-2A and Sentinel-2B images from 2017 were downloaded from the European Space Agency's (ESA) Sentinel Scientific Data Hub to calculate the NDVI temporal statistical characteristics. The imagery contained a cloud quality flag band, and the values were stored between 0-100. This value represented the probability of the pixel being covered by clouds from 0% to 100%. The pixels that were partially or fully covered by clouds were masked based on the cloud quality flag band of the image. The pixels with a cloud quality flag band value that was not 0 were then set as an invalid value to obtain all clear pixels in the 2017 period. The band math tool in ENVI software [33] was then used to calculate the NDVI value for the clear pixels in each image. Lastly, the final 40 NDVI image scenes were used to build the Sentinel-2 NDVI stack data for statistical analysis.

Ground Survey Data and Sample Datasets
The samples for training and validation were established using ground survey data and Google Earth [34] high-resolution remote sensing images. A total of 718 samples of typical land-use types were collected from ground surveys in February and August 2017. Moreover, 2598 samples were selected using visual interpretation within Google Earth. Stratified random sampling was carried out in ArcGIS 10.2 [35] based on the total number of samples. Additionally, 70% of samples (2,321) was used for model training. The remaining 995 samples, together with additional 3135 samples which were obtained by another researchers' independent visual interpretation of Google Earth imagery, were used for validation and accuracy assessment.

Spectral and Textural Feature Analysis
Spectral reflectance provides the basis for land cover classification. In the single-date imagery, four bands with a 10 m spatial resolution and six bands with a 20 m spatial resolution (after resampling) were used for land cover classification. The gray-level co-occurrence matrix (GLCM) method [36] was used to extract textural features. Using each spectral band to calculate texture will increase the feature space, and may hamper the classifier's performance. Before the texture calculation, a classical statistical method, the Principal Component Analysis (PCA) was applied to the cloud-free single-date imagery to decrease the high correlation between the spectral bands providing independent information [31]. After PCA, the original dataset was transformed into a new set of uncorrelated attributes called principal components (PCs). The cumulative contribution rate of the first PC reached 72.36% after transformation of ten spectral bands, and the first and second PCs contained over 95.58% of the original information. Therefore, the first and second PCs were used to extract textural features. Five texture indices were calculated for each PC in reference to Haralick [36] and Jensen et al. [37]. These indices included the angular second moment (ASM), contrast (CON), entropy (ENT), mean (MEAN), and correlation (COR). The window size was set as 5×5 and the step length was set as 1 in this study. A total of 10 textural features were obtained via this method.

Time-Series NDVI Statistical Indicator Analysis
In this study, the temporal features corresponded to annual NDVI statistical values and these were extracted from the time-series Sentinel-2 NDVI stack data. Two statistical indicators were employed, including the mean value (NDVI_mean) and coefficient of variation based on the NDVI (NDVI_cv).
(1) NDVI_mean The mean value reflects the concentrated trend in time-series NDVI data. The time-series dataset is represented as X = {x 1 , x 2 , · · · , x N }. N is the length of X, and the mean value (u X ) of X is calculated by (2) NDVI_cv The coefficient of variation reflects the dispersion degree and concentrated tendency of time-series NDVI data. This value is influenced by both the dispersion degree and the average population level. It is calculated by where u X is the mean value of the time-series data, and σ X represents the standard deviation and reflects the dispersion degree of time-series data. σ X is calculated as Both NDVI_mean and NDVI_cv were calculated for each pixel based on NDVI time-series stack data within a Python development environment.

Random Forest Classification
Random forest (RF) is an ensemble learning algorithm proposed by Breiman [38] for supervised classification with decision trees as the basic classifier. It is a non-parametric classification and regression method. Previous studies have demonstrated that the RF algorithm can process massive high-dimensional data while maintaining high accuracy [39,40]. The algorithm also has a more stable and robust response to noise and outliers in training data. The RF algorithm is more advantageous for estimating the importance of predictor variables compared with other machine learning algorithms [39,41]. The feature importance is calculated using the Mean Decrease in Gini (MDG), which measures how much a feature reduces the Gini Impurity metric in a class. The Out of Bag (OOB) error is then used for model evaluation to determine the importance of model features. In this study, a Python script was used to import the GDAL library [42] to read training sample data and the random forest classifier was imported from the Scikit-learn library. A total of 22 features (e.g., 10 spectral bands, 10 textural features, and 2 temporal features) were incorporated to construct the RF model for land cover classification. The parameter n tree (the number of trees created by randomly selecting samples from training samples) was set to 1000. The parameter m try (the number of variables used for tree node splitting) was set to 4 as the square root of the number of available layers. This method was developed in reference to previous studies [39,43].

Accuracy Assessment
In previous studies, an error or confusion matrix has mostly been used to assess the accuracy of remote sensing classification [44]. This approach can assist in identifying misclassification between different classes and potential sources of error. Furthermore, several metrics can be computed based on the confusion matrix to quantitatively assess the land cover classification accuracy. An accuracy Remote Sens. 2020, 12, 1163 6 of 18 assessment was conducted using three indicators, including the overall accuracy (OA), user's accuracy (UA), producer's accuracy (PA) [45], and the kappa coefficient [46].

Separability of NDVI_mean and NDVI_cv for Different Land Cover Types
Land cover samples were selected to calculate NDVI temporal statistics, and to assess the separability of NDVI_mean ( Figure 2) and NDVI_cv ( Figure 3) among all land cover categories ( Figure 3). To ensure inclusion of as much land cover variation as possible, we selected the points with a simple random sampling method for each land cover type for statistical analysis. The land cover types in these samples included natural forests (84), plantation forests (68), paddy fields (84), dry fields (84), water bodies (60), wetlands (48), and impervious surfaces (84).
Land cover samples were selected to calculate NDVI temporal statistics, and to assess the separability of NDVI_mean ( Figure 2) and NDVI_cv ( Figure 3) among all land cover categories ( Figure 3). To ensure inclusion of as much land cover variation as possible, we selected the points with a simple random sampling method for each land cover type for statistical analysis. The land cover types in these samples included natural forests (84), plantation forests (68), paddy fields (84), dry fields (84), water bodies (60), wetlands (48), and impervious surfaces (84).

NDVI_mean
In Figure 2, natural forests and plantation forests were observed to have similar NDVI_mean values, which fluctuated between 0.78 and 0.76, respectively. Both land cover types were significantly higher than other classes, indicating a good separability. The NDVI_mean values for paddy fields and dry fields were around 0.34 and 0.50, respectively. Overall, the NDVI_mean values for paddy fields were lower than the values for dry fields, with some overlap within the 0.35-0.45 range. The NDVI_mean values for both water bodies and wetlands varied significantly, with average values of -0.22 and 0.11, respectively. The values for these two land cover types could be partly used to differentiate them from other classes. However, wetlands were probably confused with water bodies in the -0.2-0.0 range and with paddy fields in the 0.2-0.4 range. The NDVI_mean value for impervious surfaces was around 0.19 and featured a better separability than most other land cover types except for wetlands and paddy fields.

NDVI_cv
The NDVI_cv statistics for all samples from each class are shown in Figure 3. Both natural and plantation forests featured low NDVI_cv values at around 0.25. This suggests that the annual NDVI value for these classes was more stable, and there was smaller spatiotemporal variation. Thus, it was beneficial to separate these classes from the other categories. Dry fields and paddy fields were characterized by the seasonal variation in NDVI values, with a larger NDVI_cv value of around 0.5. The NDVI_cv for wetlands was affected by the seasonal change in moisture and vegetation and varied significantly during the year. This led to higher NDVI_cv values compared to most other land cover types. The coefficient of variation was represented by the ratio of the standard deviation (σ) to the mean (μ) value. Some NDVI_cv outliers were found in wetlands and water bodies due to the small μ value. The NDVI_cv values for impervious surfaces were generally lower than 0.3, indicating a good separability from paddy fields and dry fields.

The Importance of Different Features
The results for the feature importance measure are listed in Table 1. The most important features included the NIR-2, Red edge-1, and Red bands, with values of 10.05%, 9.82%, and 7.82%, respectively. Green, SWIR-1 (B11), and Red edge-3 (B7) were also of high importance, ranking 5th, 7th, and 8th, respectively. MEAN_1 and MEAN_2 ranked 6th and ninth among all features, denoting their relatively high importance. However, other textural features were less important and

NDVI_mean
In Figure 2, natural forests and plantation forests were observed to have similar NDVI_mean values, which fluctuated between 0.78 and 0.76, respectively. Both land cover types were significantly higher than other classes, indicating a good separability. The NDVI_mean values for paddy fields and dry fields were around 0.34 and 0.50, respectively. Overall, the NDVI_mean values for paddy fields were lower than the values for dry fields, with some overlap within the 0.35-0.45 range. The NDVI_mean values for both water bodies and wetlands varied significantly, with average values of −0.22 and 0.11, respectively. The values for these two land cover types could be partly used to differentiate them from other classes. However, wetlands were probably confused with water bodies in the −0.2-0.0 range and with paddy fields in the 0.2-0.4 range. The NDVI_mean value for impervious surfaces was around 0.19 and featured a better separability than most other land cover types except for wetlands and paddy fields.

NDVI_cv
The NDVI_cv statistics for all samples from each class are shown in Figure 3. Both natural and plantation forests featured low NDVI_cv values at around 0.25. This suggests that the annual NDVI value for these classes was more stable, and there was smaller spatiotemporal variation. Thus, it was beneficial to separate these classes from the other categories. Dry fields and paddy fields were characterized by the seasonal variation in NDVI values, with a larger NDVI_cv value of around 0.5. The NDVI_cv for wetlands was affected by the seasonal change in moisture and vegetation and varied significantly during the year. This led to higher NDVI_cv values compared to most other land cover types. The coefficient of variation was represented by the ratio of the standard deviation (σ) to the mean (µ) value. Some NDVI_cv outliers were found in wetlands and water bodies due to the small µ value. The NDVI_cv values for impervious surfaces were generally lower than 0.3, indicating a good separability from paddy fields and dry fields.

The Importance of Different Features
The results for the feature importance measure are listed in Table 1. The most important features included the NIR-2, Red edge-1, and Red bands, with values of 10.05%, 9.82%, and 7.82%, respectively. Green, SWIR-1 (B11), and Red edge-3 (B7) were also of high importance, ranking 5th, 7th, and 8th, respectively. MEAN_1 and MEAN_2 ranked 6th and ninth among all features, denoting their relatively high importance. However, other textural features were less important and were placed at the end of the list. For the statistical indicators, the NDVI_mean ranked 4th in the list with a high importance value of 7.0%, but NDVI_cv only ranked 13th with a value of 4.48%. In Table 1, OOB accuracy was observed to increase as the number of features involved in the classification increased from 1 to 6. The OOB accuracy was observed to increase and then stabilize.

Comparison of Classification Results from Different Feature Combinations
All 22 features were selected as the input features for constructing an RF model to determine the contribution of spectral bands, textures features, and temporal indicators to land cover classification. Five combination schemes were designed for land cover classification, including:
Spectral bands, textural features, and NDVI_mean + NDVI_cv (SBTMC) The classification maps are shown in Figure 4. The overall land cover types featured similar patterns when classified using the five combination models. Paddy fields accounted for most of the study area. Natural forests were distributed in the south with well demarcated boundaries. Plantation forests were distributed on the periphery of natural forests and mainly in the transition zone between mountains and plains. There were a few patches of water bodies in the form of ponds and reservoirs. Impervious surfaces included clustered towns, scattered villages, and line-shaped impervious roads.
The confusion matrix for each combination was calculated to further compare the performance of different schemes (Tables 2-6).

Spectral Bands Only (SB)
The OA and kappa coefficient produced by the SB method reached 0.8051 and 0.7347, respectively. The PA value for the natural forests was 87.63%, and the UA value was 86.29%. Other land cover types such as plantation forests, paddy fields, and dry fields may have been wrongly classified as natural forests. The PA value for plantation forests was low (78.40%) compared to natural forests. The classification accuracy for dry fields was relatively low. The PA and UA for dry fields were 73.25% and 55.02%, respectively. Paddy fields were the most likely to be confused with impervious surfaces. The most confusion involved in dry field classification was caused by paddy fields, natural forests, and impervious surfaces. The UA value for water bodies was high (93.83%), but the PA was low (84.44%) due to the misclassification into paddy fields. Wetlands were prone to be misclassified as paddy fields and other types, resulting in a lower PA value (45.16%). Both the PA and UA for impervious surfaces were relatively low, with values of 59.12% and 56.97%, respectively. This was caused by the misclassification of impervious surfaces and farmlands.

Spectral Bands and Textural Features (SBT)
The addition of textural features improved the accuracy of most categories, and the OA and the kappa coefficient amounted to 0.8545 and 0.8010, respectively. The most obvious improvement occurred in impervious surfaces. PA increased from 59.12% to 81.13%, and UA increased from 56.97% to 79.38%. According to Table 3, textural features greatly reduced the confusion between impervious surfaces and farmlands. The proportion of impervious surfaces misclassified into paddy fields and dry fields decreased by 14.5% and 7.2%, respectively. Meanwhile, the proportion of paddy fields divided into impervious surfaces decreased by 3.71%. As a result, the PA and UA for paddy fields increased by 5.17% and 3.38%, respectively. Similarly, the UA for dry fields also increased from 55.02% to 60.24%. The addition of textural features also slightly reduced the confusion between natural and plantation forests. Moreover, the probability of misclassification between plantation forests and paddy fields also declined. It should be noted that not all categories showed improvement. The addition of textural features also caused some new confusion among specific land cover classes. For example, the PA for wetlands decreased from 45.16% to 29.03% since they were increasingly misclassified as paddy fields. Additionally, there was an increase in the number of paddy fields that were misclassified as natural forests.

SBT+NDVI_mean (SBTM) and SBT+NDVI_cv (SBTC)
The OA and kappa coefficient were further improved when temporal features (NDVI_mean or NDVI_cv) were added. This resulted in values of 0.8685 and 0.8204 for the SBTM combination, and 0.8714 and 0.8249 for the SBTC combination. According to Tables 4 and 5, each statistical indicator was observed to have a different contribution towards improving the classification accuracy.
The high NDVI_mean and low NDVI_cv values for natural and plantation forests further improved their distinction from other categories. NDVI_mean contributed more to the improvement of the UA for natural forests, while NDVI_cv was slightly effective in increasing the PA for natural forests. NDVI_mean and NDVI_cv also reduced the classification error for paddy fields. NDVI_mean was more effective for differentiating paddy fields from natural forests and dry fields, resulting in an increase in the PA value (90.02%). Moreover, NDVI_cv was better for correcting the misclassification of impervious surfaces as paddy fields, resulting in a higher UA value (92.65%). In dry fields, there was a greater improvement in the UA value after the addition of NDVI_cv. Impervious surfaces were much confused with paddy fields in both the SB and SBT models. There was less confusion following the addition of NDVI_cv. As a result, both the PA and UA values for impervious surfaces increased to 92.45% and 81.89%, respectively. Compared with the SB model, the PA and UA values for wetlands were observed to decrease after the addition of NDVI_mean or NDVI_cv ( Table 2). The UA values for water bodies also decreased in SBTC since wetland pixels were wrongly classified as water bodies.

SBT+NDVI_mean+NDVI_cv (SBTMC)
Both NDVI_mean and NDVI_cv were added to SBTMC. The OA and kappa coefficient was 0.8913 and 0.8514, or the highest among the five combinations (Table 6).
In most cases, SBTMC retained the advantages of both the SBTM and SBTC combinations. Additionally, NDVI_mean and NDVI_cv complemented each other, and improved the overall classification accuracy. For example, there was a reduced misclassification of paddy fields as natural forests or dry fields in SBTM. However, most paddy fields were misclassified as impervious surfaces (Table 4). In the SBTC model, most paddy fields were misclassified as natural forests or dry fields, but there were few misclassifications between paddy fields and impervious surfaces ( Table 5). The advantages of NDVI_mean and NDVI_cv were preserved when both variables were added into the SBTMC model. As a result, there was a reduction in the misclassification of paddy fields into other categories, and the PA increased to 91.01%.
Similarly, in SBTMC, NDVI_mean reduced the misclassification of dry fields as paddy fields. Moreover, NDVI_cv reduced the misclassification of impervious surfaces as paddy fields. As a result, the UA value for paddy fields reached 93.56% in the SBTMC model. Similar improvements could also be found in impervious surfaces and other categories.

The Advantages of Statistical Indicators for Land Cover Classification in Cloud-Prone Regions
Many studies have noted that the addition of time-series remote sensing data can improve the accuracy of land cover classification [17,40,47]. In this study, statistical indicators derived from time-series Sentinel-2 NDVI were proposed to represent the temporal variation of land cover, and were used to improve the capacity for discernment among different categories. These statistical indicators were observed to have several advantages.
First, the statistical indices have less stringent requirements regarding cloud contamination in time-series remote sensing images. Many studies have shown that high cloud cover in the tropics is the biggest obstacle to time-series analysis [8,28] In cloud-prone regions, few high-quality images are available throughout the year ( Figure 5) even for satellite platforms with a high revisit period [8] Furthermore, these few high-quality images are mostly acquired during the dry season. Additionally, it is difficult to use only a few images to extract key phenological information, and an accurate and complete phenology profile cannot be constructed due to data limitations [48,49]. However, temporal statistical analysis has lower image quality requirements. Some clear pixels can still be used for statistical analysis in the rainy season with severe cloud cover, which greatly improves the utilization rate of pixels.
Second, the statistical indicators can effectively represent the NDVI's annual variation characteristics. These indicators can reduce the influence of random factors on single-date imagery such as specific seasons or weather conditions. Natural vegetation grows vigorously throughout the year in the tropics due to the high moisture and heat conditions. Moreover, crop planting methods are flexible [16], and there is no fixed growing season for crops. This often leads to high variation in crop phenology across the region (Figure 6a), which results in different spectral characteristics for the same crop at different growing stages (Figure 6b). Moreover, the similar spectral characteristics among different land cover types makes it challenging for discriminating between classes using cost-effective and single-date remote sensing imagery data. For example, farmland can be planted or fallowed at any time of the year in tropical Southeast Asia. The spectral characteristics of fallowed farmland can easily be confused with impervious surfaces in clustered towns (Figure 6c). As a result, there is a large degree of misclassification between farmlands and impervious surfaces due to the spectral differences in imagery. However, the relatively low NDVI_cv values for impervious surfaces are different from the values for farmland, which is beneficial for reducing misclassification. Overall, temporal statistical analysis is more robust when participating in classification, especially for land cover types with a strong seasonal variation.
First, the statistical indices have less stringent requirements regarding cloud contamination in time-series remote sensing images. Many studies have shown that high cloud cover in the tropics is the biggest obstacle to time-series analysis [8,28] In cloud-prone regions, few high-quality images are available throughout the year ( Figure 5) even for satellite platforms with a high revisit period [8] Furthermore, these few high-quality images are mostly acquired during the dry season. Additionally, it is difficult to use only a few images to extract key phenological information, and an accurate and complete phenology profile cannot be constructed due to data limitations [48,49]. However, temporal statistical analysis has lower image quality requirements. Some clear pixels can still be used for statistical analysis in the rainy season with severe cloud cover, which greatly improves the utilization rate of pixels.  year in the tropics due to the high moisture and heat conditions. Moreover, crop planting methods are flexible [16], and there is no fixed growing season for crops. This often leads to high variation in crop phenology across the region (Figure 6a), which results in different spectral characteristics for the same crop at different growing stages (Figure 6b). Moreover, the similar spectral characteristics among different land cover types makes it challenging for discriminating between classes using cost-effective and single-date remote sensing imagery data. For example, farmland can be planted or fallowed at any time of the year in tropical Southeast Asia. The spectral characteristics of fallowed farmland can easily be confused with impervious surfaces in clustered towns (Figure 6c). As a result, there is a large degree of misclassification between farmlands and impervious surfaces due to the spectral differences in imagery. However, the relatively low NDVI_cv values for impervious surfaces are different from the values for farmland, which is beneficial for reducing misclassification. Overall, temporal statistical analysis is more robust when participating in classification, especially for land cover types with a strong seasonal variation. Moreover, temporal statistical computation is simpler and more efficient compared with other time-series analysis methods. For example, the NDVI values for forests in tropical areas are relatively stable throughout the year. As a result, the statistical indicators for a small number of available clear forest pixels can characterize high NDVI_mean values and distinguish forests from other land cover types.

Difficulty in Land Cover Classification at Fine Scales
In this paper, the integration effect for temporal information contained in time-series imagery was tested using spectral and textural features from single-date imagery. As a result, the discrimination was maximized for different land cover categories. However, misclassification still occurred for land cover types in the second classification category.

Natural Forest vs. Plantation Forest
It is difficult to differentiate natural forests from plantation forests due to their similar spectral and phenological characteristics [28,50]. In Figures 3 and 4, both NDVI_mean and NDVI_cv values were similar between natural and plantation forests, and the separability between them was insignificant. The main difference between natural and plantation forests was due to their textural features at the landscape scale. In the tropics, natural forests are characterized by rich species Moreover, temporal statistical computation is simpler and more efficient compared with other time-series analysis methods. For example, the NDVI values for forests in tropical areas are relatively stable throughout the year. As a result, the statistical indicators for a small number of available clear forest pixels can characterize high NDVI_mean values and distinguish forests from other land cover types.

Difficulty in Land Cover Classification at Fine Scales
In this paper, the integration effect for temporal information contained in time-series imagery was tested using spectral and textural features from single-date imagery. As a result, the discrimination was maximized for different land cover categories. However, misclassification still occurred for land cover types in the second classification category.

Natural Forest vs. Plantation Forest
It is difficult to differentiate natural forests from plantation forests due to their similar spectral and phenological characteristics [28,50]. In Figures 3 and 4, both NDVI_mean and NDVI_cv values were similar between natural and plantation forests, and the separability between them was insignificant. The main difference between natural and plantation forests was due to their textural features at the landscape scale. In the tropics, natural forests are characterized by rich species diversity and a stratified canopy. Conversely, plantation forests have fewer tree species with more apparent textural characteristics due to their regular row and plant spacing. Therefore, textural features have great potential for differentiating between the two types. However, the improvement in the misclassification of natural and plantation forests was observed to be limited when textural features were considered. According to the field surveys, there was a low canopy coverage in plantation forests during the early growth stages, which was greatly affected by understory vegetation. Moreover, the matured plantation forest had a dense canopy coverage that was quite similar to natural forests. The textural features extracted based on Sentinel-2 imagery with a 10 m resolution were not significant compared to natural forests. However, it should be noted that only a few textural features were tested in this study. As noted by other studies, textural characteristics are related to the size of the canopy, the window size for calculation, and the spatial resolution of remote sensing data [51,52]. The significance of textures and other features for classifying natural and plantation forests still requires further investigation.

Paddy Field vs. Dry Field
In this study, the addition of temporal statistical indicators greatly reduced the misclassification between farmlands and other land categories. However, it is still difficult to accurately classify paddy fields and dry fields within farmlands due to their similar spectral reflectance, textural features, and phenology patterns. Xiao et al. [53] used a time series of MODIS vegetation indices to map rice paddy fields in southern China based on the sensitivity of LSWI to increased surface moisture during periods of flooding. However, it is difficult to obtain images for these periods in cloud-prone areas. Moreover, there is a certain degree of saturated moisture both in paddy fields and dry fields during the rainy season. Some studies have confirmed the applicability of other data sources such as SAR for identifying paddy fields [54]. Radar signals are less affected by cloud cover and are sensitive to changes in moisture. The incorporation of radar backscatter information is expected to further improve the misclassification of paddy fields and dry fields.

Wetlands vs. Others
Wetlands in this study were defined as a shallow swamp with hydrophytic vegetation or forests that experienced seasonal flooding (e.g., mainly located in the flood plain). The highly dynamic variation in soil moisture and the seasonal variation in plant components can cause the NDVI_mean and NDVI_cv of wetlands to overlap with other land categories. In some cases, the addition of NDVI_mean and NDVI_cv may increase the difficulty of wetland identification. The difficulty in wetland discrimination is a result of both its highly dynamic nature and broad definition.

Uncertainty in Feature Importance Measures of the Random Forest Model
Past studies have demonstrated that the integration of multiple features is helpful for improving the detail and accuracy of land cover maps [55,56]. The contribution of each input feature is uncertain when working with remote sensing data at high spatiotemporal resolutions and with rich diverse features. In this study, the red-edge and shortwave infrared (SWIR) bands were found to be important for land cover mapping, which agrees with similar studies [57][58][59]. The near-infrared band (B8a) was also ranked as an important feature. Surprisingly, Immitzer et al. [20] found that Sentinel-2 s near infrared bands were among the least important channels. However, they also noted that only one set of images was analyzed. Other bands may be of high importance if the images are acquired in different seasons, or if the study area is characterized by different landscapes.
One of the common advantages of the random forest model is its ability to evaluate the importance of variables, which can be used as a basis for feature optimization or dimensionality reduction [56]. In this study, Mean Decrease in Gini (MDG) was used to calculate feature importance. However, the measure of variable importance should be considered with caution since it only represents the feature's prediction ability for the target variable in the current model. Further study is required to assess the role of variable importance on feature selection and optimization. Useful information can be omitted if features regarded with low importance are excluded in the process of optimum feature selection. Suppose that multiple features are associated with each other and any one of them can be used as an indicator (an excellent feature). Once a feature is selected, the importance of other associated features can drop sharply because the impurity is reduced by the selected feature. It is then difficult to select other features due to the lower impurity. In this approach, only the first feature has a high importance, and the other features tend to have a low importance. This leads to the misunderstanding that the features that were selected first have greater importance, while the other features are not as important. In this study, the NDVI_cv features were ranked as having a low importance. However, the classification results suggest that NDVI_cv can effectively improve the overall classification accuracy, especially for paddy fields and impervious surfaces. Therefore, when applying the random forest model, features should be selected according to both the order of importance and the relationship between different features.

Conclusions
This study demonstrates the advantages of using time-series Sentinel-2 NDVI statistical indicators combined with spectral and textural features from Sentinel-2 single-date imagery. This method can be used to accurately map dynamic land cover in cloud-prone tropical areas. The result also revealed that time-series statistical analysis is an effective way to represent temporal information contained in annual land cover variance. A time-series analysis can make use of clear pixels from dense low-quality images and reduce the influence of random factors such as weather conditions on single-date images. The discriminant analysis revealed that the NDVI_mean can effectively differentiate forests from other land categories and reduce the misclassification between farmlands and other land cover types to a large extent. NDVI_cv can also discriminate between impervious surfaces and fallowed farmlands. The overall accuracy and kappa coefficient were respectively 0.8545 and 0.8010 for spectral and textural features. When NDVI_mean and NDVI_cv were integrated, the overall accuracy and kappa coefficient reached 0.8913 and 0.8514, respectively. The time-series statistical analysis has less stringent requirements for analysis. It is also more robust in accounting for seasonal variation and is simple to calculate. The method has great potential when integrated with other features for mapping land cover in tropical cloud-prone areas at large scales with high accuracy.