A Head / Tail Breaks-Based Method for E ﬃ ciently Estimating the Absolute Boltzmann Entropy of Numerical Raster Data

: Shannon entropy is the most popular method for quantifying information in a system. However, Shannon entropy is considered incapable of quantifying spatial data, such as raster data, hence it has not been applied to such datasets. Recently, a method for calculating the Boltzmann entropy of numerical raster data was proposed, but it is not e ﬃ cient as it involves a series of numerical processes. We aimed to improve the computational e ﬃ ciency of this method by borrowing the idea of head and tail breaks. This paper relaxed the condition of head and tail breaks and classiﬁed data with a heavy-tailed distribution. The average of the data values in a given class was regarded as its representative value, and this was substituted into a linear function to obtain the full expression of the relationship between classiﬁcation level and Boltzmann entropy. The function was used to estimate the absolute Boltzmann entropy of the data. Our experimental results show that the proposed method is both practical and e ﬃ cient; computation time was reduced to about 1% of the original method when dealing with eight 600 × 600 pixel digital elevation models.


Introduction
Raster data representation offers a number of important advantages over other options, hence the technology predominates in satellite imagery, digital elevation modeling, landscape gradient mapping, and other applications. The advantages of raster data modeling include its simple data structure, the ease with which location-specific data may be collected, and its suitability for representing continuous surfaces [1]. The quantification of information of raster datasets has long been a challenging work and have proven useful in many applications. For example, information held in raster form has been used to evaluate the performance of image fusion [2][3][4], and it is regarded as an essential reference for band selection in hyperspectral imaging [5,6].
Information production may be quantified through Shannon entropy [7], which is commonly used in many domains, such as computer graphics [3] and landscape ecology [8,9]. However, the application of Shannon entropy to spatial data especially for raster data has been questioned because it is ill-suited to quantifying spatial information that describes the spatial distribution of raster data [10].
Two solutions have been proposed to address this problem: the Shannon entropy model can be improved [11][12][13] or Boltzmann entropy can be used in its place [14][15][16][17][18]. Boltzmann entropy is the measure of the degree of system disorder, and, in theory, it is capable of quantifying the spatial information of raster data [19]. The first method to calculate the Boltzmann entropy of numerical raster data was proposed recently [18]. This approach defines a proper macrostate and then determines

The Absolute Boltzmann Entropy of Numerical Raster Data: Computation and Problem
A method for computing the absolute Boltzmann entropy of a landscape gradient is described in detail in Gao et al. [18]. First, a proper set of microstates must be defined, which can be done by resampling the original landscape gradient with a sliding window of 2 × 2 pixels, as shown in Figure 1. The number of microstates for each sliding window in the original landscape was uniquely determined based on the related macrostate. The relative Boltzmann entropy (S R ) of the original landscape can be computed by the Boltzmann entropy equation as where K B denotes the Boltzmann constant 1.38 × 10 −23 , W i is the number of possible microstates for i th sliding window, and m is the number of sliding windows. The base of the logarithm is usually set to 2, 10, or the natural log base, e, depending on the context. raster data was proposed recently [18]. This approach defines a proper macrostate and then determines the number of microstates. By using absolute Boltzmann entropy, which has zero entropy as its point of reference, the proposed method is able to compare different raster data in an absolute sense. The experimental results of Gao et al. [18] showed that their method can capture the spatial information held within a raster data structure. However, this method is not efficient because it includes a series of processes that are computationally intensive and time-consuming. This study aims to develop a method that can estimate the absolute Boltzmann entropy of numerical raster data in a way that is efficient. To do so, we adopted a classification method for a heavy-tailed distribution with a scaling pattern, estimated the relative Boltzmann entropy of each class, and computed the absolute Boltzmann entropy based on these estimates. We tested the performance of this approach with several experiments using numerical raster data.

The Absolute Boltzmann Entropy of Numerical Raster Data: Computation and Problem
A method for computing the absolute Boltzmann entropy of a landscape gradient is described in detail in Gao et al. [18]. First, a proper set of microstates must be defined, which can be done by resampling the original landscape gradient with a sliding window of 2 2 pixels, as shown in Figure 1. The number of microstates for each sliding window in the original landscape was uniquely determined based on the related macrostate. The relative Boltzmann entropy of the original landscape can be computed by the Boltzmann entropy equation as where denotes the Boltzmann constant 1.38 10 , is the number of possible microstates for th sliding window, and is the number of sliding windows. The base of the logarithm is usually set to 2, 10, or the natural log base,ｅ, depending on the context. A given landscape gradient can be represented with a hierarchy that spans from the most detailed level (i.e., the original landscape) to the most abstract level (i.e., a single pixel), as shown in Figure 2. The sum of all values is the absolute Boltzmann entropy ( ) of the landscape gradients, as represented in the equation A given landscape gradient can be represented with a hierarchy that spans from the most detailed level (i.e., the original landscape) to the most abstract level (i.e., a single pixel), as shown in Figure 2. The sum of all S R values is the absolute Boltzmann entropy (S A ) of the landscape gradients, as represented in the equation where L j denotes the j th level in the hierarchy, S R L j is the relative Boltzmann entropy of L j , and m is the total number of levels. The base of the logarithm is usually set to 2, 10, or e.
ISPRS Int. J. Geo-Inf. 2020, 9, 103 3 of 13 (2) where denotes the th level in the hierarchy, is the relative Boltzmann entropy of , and is the total number of levels. The base of the logarithm is usually set to 2, 10, orｅ. The method used by Gao et al. quantifies the spatial information of raster data, such as a landscape gradient. However, the iterative process for determining the number of microstates for each sliding window is slow and computationally demanding. Moreover, as the dimensions of the original landscape gradient increases, the number of sliding windows increases exponentially. For example, as the size of the original landscape expands from 10 10 pixels to 100 100 pixels, the number of sliding windows increases from 81 (9 2 ) to 9801 (99 2 ).
Computing the absolute Boltzmann entropy is also a process of iteration because each level is derived from the level below it through resampling (e.g., is derived from . Each pixel in can be generated from four related pixels in by this method. For example, the average of the four pixels in is the value of the corresponding pixel in . Further, the value of all pixels in can be obtained. By using "resample" tools with bilinear interpolation option in ArcGIS software, we changed the image shown in Figure 3a from 512 512 pixels to 256 256 pixels (Figure 3b).

Figure 2.
A visual representation of summing the relative Boltzmann entropies (S R ) of all levels to get the absolute Boltzmann entropy (S A ) of a landscape gradient (modified from Gao et al. [18]).
The method used by Gao et al. quantifies the spatial information of raster data, such as a landscape gradient. However, the iterative process for determining the number of microstates for each sliding window is slow and computationally demanding. Moreover, as the dimensions of the original landscape gradient increases, the number of sliding windows increases exponentially. For example, as the size of the original landscape expands from 10 × 10 pixels to 100 × 100 pixels, the number of sliding windows increases from 81 (9 2 ) to 9801 (99 2 ).
Computing the absolute Boltzmann entropy is also a process of iteration because each level is derived from the level below it through resampling (e.g., L 1 is derived from L 0 ). Each pixel in L 1 can be generated from four related pixels in L 0 by this method. For example, the average of the four pixels in L 0 is the value of the corresponding pixel in L 1 . Further, the value of all pixels in L 1 can be obtained. By using "resample" tools with bilinear interpolation option in ArcGIS software, we changed the image shown in Figure 3a from 512 × 512 pixels to 256 × 256 pixels (Figure 3b).
We established a baseline comparison by computing the absolute Boltzmann entropy for two images of different sizes (Figure 3) in the same operating environment-MATLAB running in 64-bit Windows 10 with an Intel Core i7-8750H CPU (2.20 GHz, 12.00 GB RAM). We recorded the time taken to compute the number of microstates, as well as the total computation time. The results of the baseline computation are shown in Table 1. We found that the time required to compute the number of microstates accounted for a large proportion of the total computation time. In addition, the larger image (512 × 512 pixels, 1218.5s) required about seven times as much computation time as the smaller image (256 × 256 pixels, 173.5s), even though its area was only four times larger. each sliding window is slow and computationally demanding. Moreover, as the dimensions of the original landscape gradient increases, the number of sliding windows increases exponentially. For example, as the size of the original landscape expands from 10 10 pixels to 100 100 pixels, the number of sliding windows increases from 81 (9 2 ) to 9801 (99 2 ).
Computing the absolute Boltzmann entropy is also a process of iteration because each level is derived from the level below it through resampling (e.g., is derived from . Each pixel in can be generated from four related pixels in by this method. For example, the average of the four pixels in is the value of the corresponding pixel in . Further, the value of all pixels in can be obtained. By using "resample" tools with bilinear interpolation option in ArcGIS software, we changed the image shown in Figure 3a from 512 512 pixels to 256 256 pixels ( Figure 3b).

A Strategy for Efficiently Estimating the Absolute Boltzmann Entropy
Since the sum of all relative Boltzmann entropies is the absolute Boltzmann entropy, we simplified our approach by assuming that the relative Boltzmann entropy of each level could be plotted against its hierarchy level to generate a curve. The value of the variable Level stands for the level in the hierarchy of a landscape gradient. An example of this process is shown in Figure 4. By extrapolating this curve, the complex process of computing the absolute Boltzmann entropy was changed into the relatively easy process of adding the values of all points on the curve. We established a baseline comparison by computing the absolute Boltzmann entropy for two images of different sizes (Figure 3) in the same operating environment-MATLAB running in 64-bit Windows 10 with an Intel Core i7-8750H CPU (2.20 GHz, 12.00 GB RAM). We recorded the time taken to compute the number of microstates, as well as the total computation time. The results of the baseline computation are shown in Table 1. We found that the time required to compute the number of microstates accounted for a large proportion of the total computation time. In addition, the larger image (512 512 pixels, 1218.5s) required about seven times as much computation time as the smaller image (256 256 pixels, 173.5s), even though its area was only four times larger.

A Strategy for Efficiently Estimating the Absolute Boltzmann Entropy
Since the sum of all relative Boltzmann entropies is the absolute Boltzmann entropy, we simplified our approach by assuming that the relative Boltzmann entropy of each level could be plotted against its hierarchy level to generate a curve. The value of the variable stands for the level in the hierarchy of a landscape gradient. An example of this process is shown in Figure 4. By extrapolating this curve, the complex process of computing the absolute Boltzmann entropy was changed into the relatively easy process of adding the values of all points on the curve.  The tail of the curve always asymptotically approaches the x-axis in the positive direction, and most points have small relative Boltzmann entropy values. This curve is considered to have a heavytailed distribution, which means that it is significantly right-skewed and can be described as having a "scaling" or "hierarchical" pattern, with "far more smaller things than larger ones" [20].
By assuming that the shape of the curve, and therefore its function, can be estimated from several points, we were able to predict the values of the other points and use these to estimate the absolute Boltzmann entropy of the numerical raster data. The points that we used to estimate the shape of the curve were close to the y-axis. This was the result of a fundamental limitation of the iterative process, which always starts with the original gradient and proceeds in sequence. These low-level points have The tail of the curve always asymptotically approaches the x-axis in the positive direction, and most points have small relative Boltzmann entropy values. This curve is considered to have a heavy-tailed distribution, which means that it is significantly right-skewed and can be described as having a "scaling" or "hierarchical" pattern, with "far more smaller things than larger ones" [20].
By assuming that the shape of the curve, and therefore its function, can be estimated from several points, we were able to predict the values of the other points and use these to estimate the absolute Boltzmann entropy of the numerical raster data. The points that we used to estimate the shape of the curve were close to the y-axis. This was the result of a fundamental limitation of the iterative process, which always starts with the original gradient and proceeds in sequence. These low-level points have a high impact on the trajectory of the curve and on the sum of the entropies, which makes them more critical than those points with smaller values. It was reasonable to estimate the shape of the curve from these critical points. This solution was developed to avoid having to compute the relative Boltzmann entropies at all levels, which should improve the computational efficiency of the entire process.
However, estimating the function of a curve based on several points is challenging. As the shape of the curve becomes more complex, more data is required to represent it accurately [21]. When the available data is insufficient, breaking the curve into stepwise classifications is the best way to reduce its complexity. For example, MacDonald et al. [22] used three different classes to simplify the curve, a lower tail, a plateau, and an upper tail. We identified an appropriate classification method to break up the curve, with the additional objective of extracting the essential points with larger relative Boltzmann entropy values. A representative value was chosen to represent all points in a given class, and several representative values were substituted into a function to obtain the full expression of that function, which was then used to estimate the relative Boltzmann entropies of other classes. Finally, the absolute Boltzmann entropy of the numerical raster data was computed based on these estimated relative Boltzmann entropies.
Following this line of thought, this paper proposes a strategy for efficiently estimating the absolute Boltzmann entropy as follows: • Analyze the data distribution and adopt a suitable classification method.

•
Estimate a function through a representative value of the most essential class and obtain the estimated relative Boltzmann entropies of other classes through this function.

•
Compute the absolute Boltzmann entropy of the numerical raster data based on the estimated relative Boltzmann entropies.

The Calculation of Absolute Boltzmann Entropy Based on Head/Tail Breaks
In this section, the approach of head/tail breaking has been used for estimating the relative Boltzmann entropies of each class for a given numerical raster dataset. The total absolute Boltzmann entropy can be obtained by summing up the estimation values of relative Boltzmann entropy of each class.

Adopting a Classification Method for a Heavy-Tailed Distribution with a Scaling Pattern
Good classification results can reflect the patterns of data. The results of classification will be entirely dependent on the classification method that is used to divide the curve, so it is crucial to adopt an appropriate classification method.
A classification method for data with a heavy-tailed distribution has been proposed recently [20]. The method sorts the data into two imbalanced parts based on the arithmetic mean-a head (higher than the mean) and a tail (lower than the mean). Given that this imbalanced structure may repeat in the head, these points are further divided into two parts until the points remaining in the head do not follow a heavy-tailed distribution. The final division of data points between the head and tail is generally about 20% and 80%, respectively [23]. However, "this condition can be relaxed for many geographic features, such as 50 percent or even more" [24].
To estimate the number of points contained in the head and tail of each class, this paper used this idea of head and tail breaks and set a target of 40% of points in the head and 60% in the tail. For example, for a numerical raster that was 100 × 80 pixels, we obtained 80 levels through resampling, but the 80 th level consisted of only one unit, which did not meet the calculation conditions established by Gao et al. Therefore, the final number of points on the curve was 79, of which 31 were sorted into the head (decimals were rounded down). This division was repeated until the number of points in the head equaled the number of points in the tail, or the number of points in the head reached 1. Since the heavy-tailed distribution considered in this paper had a scaling pattern, it contained a small number of high relative Boltzmann entropies. The tail from each step was considered a class, as was the head in the last step, as shown in Figure 5. The classification method used in this study not only classifies all points into one of several classes, but it reflects the patterns of data with a heavy-tailed distribution. More importantly, it extracts the most important points-those with the largest relative Boltzmann entropy values.

Estimating the Relative Boltzmann Entropies of Each Class
Since the range of values in a given same class is small, this paper chose to use the average of the points in each class as its relative Boltzmann entropy value.
The points in each classification were derived from the head of the upper classification, and the points contained in the last classification should have the highest relative Boltzmann entropies. We computed the relative Boltzmann entropies of these points, which was expedited through the use of the method borrowed from Gao et al. [18]. The average of these accurate relative Boltzmann entropies is the relative Boltzmann entropy of the last class. In the same way, the accurate relative Boltzmann entropies of the 1 class and the 2 class can be computed, in theory. However, if we were to compute the accurate relative Boltzmann entropy of every class, then we would not have solved the inefficiency problem.
Instead, we used the accurate relative Boltzmann entropies of several classes to estimate the relative Boltzmann entropies of other classes. This is valid because these relative Boltzmann entropies can be described approximately through a linear function. This paper chose the linear function . Recall that the first class through the 1 class all belong to the tail, while the last class belongs to the head. We substituted the accurate relative Boltzmann entropy of the 1 class into the linear function (i.e., the independent variable is 1 , and the dependent variable is the accurate relative Boltzmann entropy of the 1 class). We inversely solved the parameter and derived the expression of the function. Then, we computed the relative Boltzmann entropies of the remaining classes, which belonged to the tails.

Computation of the Absolute Boltzmann Entropy Based on Estimates
Thus far, we have obtained the number of classes, the number of points contained in each class, and the relative Boltzmann entropies of each class. The absolute Boltzmann entropy of the numerical raster dataset of interest can be computed as The classification method used in this study not only classifies all points into one of several classes, but it reflects the patterns of data with a heavy-tailed distribution. More importantly, it extracts the most important points-those with the largest relative Boltzmann entropy values.

Estimating the Relative Boltzmann Entropies of Each Class
Since the range of values in a given same class is small, this paper chose to use the average of the points in each class as its relative Boltzmann entropy value.
The points in each classification were derived from the head of the upper classification, and the points contained in the last classification should have the highest relative Boltzmann entropies. We computed the relative Boltzmann entropies of these points, which was expedited through the use of the method borrowed from Gao et al. [18]. The average of these accurate relative Boltzmann entropies is the relative Boltzmann entropy of the last class. In the same way, the accurate relative Boltzmann entropies of the (N − 1) th class and the (N − 2) th class can be computed, in theory. However, if we were to compute the accurate relative Boltzmann entropy of every class, then we would not have solved the inefficiency problem.
Instead, we used the accurate relative Boltzmann entropies of several classes to estimate the relative Boltzmann entropies of other classes. This is valid because these relative Boltzmann entropies can be described approximately through a linear function. This paper chose the linear function y = a × x. Recall that the first class through the (N − 1) th class all belong to the tail, while the last class belongs to the head. We substituted the accurate relative Boltzmann entropy of the (N − 1) th class into the linear function (i.e., the independent variable is [N − 1], and the dependent variable is the accurate relative Boltzmann entropy of the [N − 1] th class). We inversely solved the parameter α and derived the expression of the function. Then, we computed the relative Boltzmann entropies of the remaining classes, which belonged to the tails.

Computation of the Absolute Boltzmann Entropy Based on Estimates
Thus far, we have obtained the number of classes, the number of points contained in each class, and the relative Boltzmann entropies of each class. The absolute Boltzmann entropy of the numerical raster dataset of interest can be computed as where S A denotes the estimated absolute Boltzmann entropy of the numerical raster data, S R (c) is the estimated relative Boltzmann entropy of class c, n c is the number of points in class c, and k is the number of classes. For example, S R (1) and S R (2) are the averages of all points in the N th class and the (N − 1) th class, respectively. They can be computed precisely through the Gao et al. method [18].

Experimental Testing
We conducted several tests of the proposed method. As shown in Figure 6, the experimental data, i.e., four pairs of digital elevation models (DEMs), was the same as that used by Gao et al. [18]. The size of each DEM was 600 × 600 pixels, with each pixel representing the elevation in an area. The DEMs in Figure 6b2,c2 are smoother than the other DEMs, indicating a relatively low degree of heterogeneity. They contain fewer impossible microstates, which lowers their absolute Boltzmann entropy. To better analyze our results, we compared the results of these experiments with those of Gao et al. [18], as shown in Table 2.
where denotes the estimated absolute Boltzmann entropy of the numerical raster data, is the estimated relative Boltzmann entropy of class , is the number of points in class , and is the number of classes.
For example, 1 and 2 are the averages of all points in the class and the 1 class, respectively. They can be computed precisely through the Gao et al. method [18].

Experimental Testing
We conducted several tests of the proposed method. As shown in Figure 6, the experimental data, i.e., four pairs of digital elevation models (DEMs), was the same as that used by Gao et al. [18]. The size of each DEM was 600 600 pixels, with each pixel representing the elevation in an area. The DEMs in Figure b2 and c2 are smoother than the other DEMs, indicating a relatively low degree of heterogeneity. They contain fewer impossible microstates, which lowers their absolute Boltzmann entropy. To better analyze our results, we compared the results of these experiments with those of Gao et al. [18], as shown in Table 2. Figure 6. Eight digital elevation models, paired by letter, taken from the experimental data of Gao et al. [18].  Figure 6. Eight digital elevation models, paired by letter, taken from the experimental data of Gao et al. [18].  Table 2 reveals that the use of our proposed method dramatically reduced the computation time by between 99.5% (from 1482.9 to 7.8 s) and 99.8% (from 2842.2 to 6.9 s). However, their relative errors were unstable with the smallest at 0% and the largest reaching 58.1%.

Discussion
In the process of estimating the relative Boltzmann entropy of each class, we chose a linear function (y = a × x) and substituted the accurate relative Boltzmann entropy of the (N − 1) th class into the linear function to inversely solve the parameter α (inverse solution). This allowed us to obtain the expression of the function. We tried other function models, such as quadratic functions, power law functions, and exponential functions, but linear functions yielded results closest to the actual values.
We used the accurate relative Boltzmann entropies of several classes to generate a fitted linear function (y = a × x), which has been used to compute the relative Boltzmann entropies of the remaining classes. To identify the most efficient number of points (classes) to use, we modeled the entropies based on two, three, four, and five points respectively. Supposing x is the number of points, the classes involved will range from the N th class to the (N − (x − 1)) th class. When applied to the experimental DEM, the results of absolute Boltzmann entropies obtained by models with varying number of points are shown in Figure 7, with the relative errors tabulated in Table 3 and the computation times displayed in Figure 8 and Table 4.  Table 2 reveals that the use of our proposed method dramatically reduced the computation time by between 99.5% (from 1482.9 to 7.8 s) and 99.8% (from 2842.2 to 6.9 s). However, their relative errors were unstable with the smallest at 0% and the largest reaching 58.1%.

Discussion
In the process of estimating the relative Boltzmann entropy of each class, we chose a linear function and substituted the accurate relative Boltzmann entropy of the 1 class into the linear function to inversely solve the parameter α (inverse solution). This allowed us to obtain the expression of the function. We tried other function models, such as quadratic functions, power law functions, and exponential functions, but linear functions yielded results closest to the actual values.
We used the accurate relative Boltzmann entropies of several classes to generate a fitted linear function , which has been used to compute the relative Boltzmann entropies of the remaining classes. To identify the most efficient number of points (classes) to use, we modeled the entropies based on two, three, four, and five points respectively. Supposing is the number of points, the classes involved will range from the class to the 1 class. When applied to the experimental DEM, the results of absolute Boltzmann entropies obtained by models with varying number of points are shown in Figure 7, with the relative errors tabulated in Table 3 and the computation times displayed in Figure 8 and Table 4.     We found that all models yielded estimates similar to the accurate values, and we were unable to identify a consistent relationship between the estimates of different methods for a given DEM. As more points were used to fit the linear function, the computation time increased. The relative errors using a fitting with four points were minimal and the computation time was acceptable. Hence, we argue that four-point fitting was the most suitable solution. As the size of a numerical raster becomes larger, it may not prove adequate to substitute only one point into the linear function to obtain the  We found that all models yielded estimates similar to the accurate values, and we were unable to identify a consistent relationship between the estimates of different methods for a given DEM. As more points were used to fit the linear function, the computation time increased. The relative errors using a fitting with four points were minimal and the computation time was acceptable. Hence, we argue that four-point fitting was the most suitable solution. As the size of a numerical raster becomes larger, it may not prove adequate to substitute only one point into the linear function to obtain the final curve. For any raster, there is an optimal number of points used to generate a fitted linear function that maximizes accuracy and also of high calculation efficiency. Table 3 reveals the systematic failure of the models to estimate the absolute Boltzmann entropies of DEM b2 and DEM c2. In order to identify the possible reasons for this consistently high error, we plotted level against relative Boltzmann entropy for all eight DEMs, as shown in Figure 9. The negative slopes of DEM b2 and DEM c2 are steeper than those of the other DEMs. From Figure 6, it is clear that DEM b2 and DEM c2 began with smoother topography than the other DEMs, indicating low heterogeneity. This may be exacerbated by the process of reducing the resolution through the resampling technique, which tends to decrease heterogeneity even further for those images with lower heterogeneity. In other words, with the decrease of spatial heterogeneity, the The negative slopes of DEM b2 and DEM c2 are steeper than those of the other DEMs. From Figure 6, it is clear that DEM b2 and DEM c2 began with smoother topography than the other DEMs, indicating low heterogeneity. This may be exacerbated by the process of reducing the resolution through the resampling technique, which tends to decrease heterogeneity even further for those images with lower heterogeneity. In other words, with the decrease of spatial heterogeneity, the abundance of impossible microstates diminishes, which led the relative Boltzmann entropy of each level to shrink.
Since the heterogeneity of DEM b2 and DEM c2 started off small, the relative Boltzmann entropy of each level decreased faster than for the other DEMs, and the number of points with small values in the tail was more than for other DEMs. For DEM b2 and DEM c2, the average of all points in the first class were meager, but the estimations of the first class through a linear function will be much higher than the accurate average. Gao and Li [17] found that their resampling-based method paid more attention to the central locations than to the edge ones when computing the absolute Boltzmann entropy of a landscape gradient, which is not consistent with thermodynamics. To address this problem, they proposed an aggregation-based method for computing the absolute Boltzmann entropy. This method is similar to the resampling-based method, and a proper macrostate can be produced by aggregating the original landscape gradient with a sliding window whose size is 2 × 2 pixels. However, this approach removes the overlap while sliding the window, as described in full by Gao and Li [17]. The method only works if both the length and the width of the landscape gradient are the 2 n , where n is a positive integer. Two DEMs obtained from the Geospatial Data Cloud (http://www.gscloud.cn), with dimensions of 512 × 512 pixels ( Figure 10) were analyzed to evaluate the calculation accuracy of the absolute Boltzmann entropy in Table 5. abundance of impossible microstates diminishes, which led the relative Boltzmann entropy of each level to shrink. Since the heterogeneity of DEM b2 and DEM c2 started off small, the relative Boltzmann entropy of each level decreased faster than for the other DEMs, and the number of points with small values in the tail was more than for other DEMs. For DEM b2 and DEM c2, the average of all points in the first class were meager, but the estimations of the first class through a linear function will be much higher than the accurate average. Gao and Li [17] found that their resampling-based method paid more attention to the central locations than to the edge ones when computing the absolute Boltzmann entropy of a landscape gradient, which is not consistent with thermodynamics. To address this problem, they proposed an aggregation-based method for computing the absolute Boltzmann entropy. This method is similar to the resampling-based method, and a proper macrostate can be produced by aggregating the original landscape gradient with a sliding window whose size is 2 2 pixels. However, this approach removes the overlap while sliding the window, as described in full by Gao and Li [17]. The method only works if both the length and the width of the landscape gradient are the 2 , where is a positive integer. Two DEMs obtained from the Geospatial Data Cloud (http://www.gscloud.cn), with dimensions of 512 512 pixels ( Figure 10) were analyzed to evaluate the calculation accuracy of the absolute Boltzmann entropy in Table 5.   Table 5 reveals that the relative errors generated by the aggregation-based method were lower than those by the resampling-based method. It is possible that the resampling-based method is poorly suited for DEMs with low heterogeneity because it does not conform to its thermodynamic analog.

Conclusions and Future Work
In this study, we have proposed and tested a method for efficiently estimating the absolute Boltzmann entropy of numerical raster data. The proposed method borrows the idea of head and tail breaks to classify all points into a few classes. This classification scheme not only reduces the complexity of the curve but it also reflects the scaling or hierarchy pattern of a dataset with a heavytailed distribution. Meanwhile, the classification scheme naturally determines the optimal number of   Table 5 reveals that the relative errors generated by the aggregation-based method were lower than those by the resampling-based method. It is possible that the resampling-based method is poorly suited for DEMs with low heterogeneity because it does not conform to its thermodynamic analog.

Conclusions and Future Work
In this study, we have proposed and tested a method for efficiently estimating the absolute Boltzmann entropy of numerical raster data. The proposed method borrows the idea of head and tail breaks to classify all points into a few classes. This classification scheme not only reduces the complexity of the curve but it also reflects the scaling or hierarchy pattern of a dataset with a heavy-tailed distribution. Meanwhile, the classification scheme naturally determines the optimal number of classes and class intervals. The average of the data values in a given class is able to serve as the representative value of that class because the differences among them are comparatively small. This proposed method substitutes the representative value into a linear function, and the expression of that function can be obtained through an inverse solution. The remaining representative values can be estimated from this function. After getting these estimates, the method computes the absolute Boltzmann entropy of the numerical raster data.
Our experimental results showed that the proposed method achieved higher computational efficiency than the previous method when applied to the same data as Gao et al. [18]. The new method saved about 99% of the computation time when applied to eight 600 × 600 pixel DEMs. However, two DEMs with low heterogeneity (DEM b2 and DEM c2) encountered high error rates, and the relative errors of the remaining six DEMs were not sufficiently stable. To solve this problem, we tested a range of numbers of points to fit the linear function. These results showed that four-point fitting was the most suitable solution. Computation time was reduced by about 98% when dealing with eight 600 × 600 pixel DEMs. Relative errors of the six heterogeneous DEMs were relatively stable. In addition, to improve calculation accuracy, we used an aggregation-based method [17] to replace the resampling-based method to improve the number of points involved in linear fitting. Results demonstrated that this method has the ability to improve the calculation accuracy for DEMs with low heterogeneity.
The proposed method is not perfect, and more extensive research is required. This paper assumed that the relative Boltzmann entropies of all points decrease at each step, and it ignored the situation in which the relative Boltzmann entropy of a given individual point is larger than the previous point. Moreover, this paper sets the percent of all heads and tails at 40% and 60%, respectively. It is possible that the number of points in the head was less than that in the tail but we rounded them off. Hence, the process of classification can only be terminated when the number of points in the head equals 1. An alternative termination condition would be when the number of points in the head is greater than or equal to that in the tail [20], which would make it possible to terminate the process of classification earlier.

Conflicts of Interest:
The authors declare no conflict of interest.