Blind Image Quality Assessment of Natural Scenes Based on Entropy Differences in the DCT Domain

Blind/no-reference image quality assessment aims to accurately evaluate the perceptual quality of a distorted image without any information from a reference image. In this paper, an effective blind image quality assessment approach based on entropy differences in the discrete cosine transform (DCT) domain for natural images is proposed. Information entropy is an effective measure of the amount of information in an image. We find that the DCT coefficient distribution of distorted natural images exhibits a pulse-shape phenomenon, which directly affects the entropy differences. A Weibull model is then used to fit the distributions of natural and distorted images, because the Weibull model sufficiently approximates the pulse-shape phenomenon as well as the sharp-peak and heavy-tail phenomena of natural scene statistics. Four features related to entropy differences and the human visual system are extracted from the Weibull model at three image scales. Image quality is assessed by support vector regression based on the extracted features. This blind Weibull statistics (BWS) algorithm is thoroughly evaluated on three widely used databases: LIVE, TID2008, and CSIQ. The experimental results show that the proposed method is highly consistent with human visual perception and outperforms state-of-the-art blind and full-reference image quality assessment methods in most cases.


Introduction
The human visual system (HVS) is important for perceiving the world. As an important medium of information transmission and communication, images play an increasingly vital role in human life. Since distortions can be introduced during image acquisition, compression, transmission and storage, image quality assessment (IQA) methods are widely studied for evaluating the influence of various distortions on perceived image quality [1,2].
In principle, subjective assessment is the most reliable way to evaluate the visual quality of images. However, it is time-consuming, expensive, and impossible to implement in real-world systems. Therefore, objective assessment of image quality has gained growing attention in recent years. Depending on the extent to which a reference image is used for quality assessment, existing objective IQA methods can be classified into three categories: full-reference (FR), reduced-reference (RR) and no-reference/blind (NR/B) methods. Accessing all or part of the reference image information is unrealistic in many circumstances [3][4][5][6][7][8]; hence, it has become increasingly important to develop effective blind IQA (BIQA) methods.
Weibull model to evaluate image quality. Furthermore, the proposed method has the advantages of high prediction accuracy and high generalization ability.
The rest of the paper is organized as follows. Section 2 presents the NSS model based on entropy differences. Section 3 presents the proposed BWS algorithm in detail. Section 4 evaluates the performance of the BWS algorithm from various aspects. Section 5 concludes the paper.

NSS Model Based on Entropy Differences
Information entropy indicates the amount of information contained within an image, and changes of entropy are highly sensitive to the degree and type of image distortion. Our method utilizes the distribution differences of DCT coefficients, which directly affect entropy changes, to assess image quality. As reported in the literature [35,36], natural images exhibit specific statistical regularities in the spatial and frequency domains, are highly structured, and their statistical distribution remains approximately the same under scale and content changes. These characteristics of natural images are essential for IQA.

Under-Fitting Effect of GGD
The GGD model, a typical NSS model, is widely used to fit the distribution of AC coefficients in the DCT domain for natural images [29], and it can well simulate non-Gaussian behaviors, including the sharp-peak and heavy-tail phenomena of the distribution of AC coefficients of natural images [35][36][37]. Figure 1 shows a natural image (i.e., "stream") from the LIVE database [38] and two of its JPEG-compressed versions. The subjective qualities are scored by the Difference Mean Opinion Score (DMOS); the values are 0, 63.649 and 29.739 for the three images, and a larger DMOS indicates lower visual quality. Figure 2 shows the distribution of AC coefficients for the corresponding images in Figure 1 together with the GGD fitting curves. As shown in Figure 2a, the GGD model is capable of fitting the natural image, especially the sharp-peak and heavy-tail phenomena. For the distorted images, however, distinct deviations occur around the peak, as shown in Figure 2b,c. The underlying reason for the misfits is that the structural information of the distorted images, including smoothness, texture, and edge information, has been altered. The number of AC coefficients at or near zero increases rapidly, which produces the pulse shape. The pulse-shape phenomenon becomes more pronounced as the distortion level increases. Thus, the GGD model fails to simulate the pulse-shape phenomenon and under-fits the distorted image distribution.

Weibull Distribution Model
To better fit the distribution of AC coefficients for a distorted image, the Weibull distribution model is employed, which is given by [39]:

f_X(x) = (a/m) (x/m)^(a−1) exp(−(x/m)^a), x ≥ 0, (1)

where a is the shape parameter and m is the scale parameter. The family of Weibull distributions includes the exponential distribution (a = 1) and the Rayleigh distribution (a = 2). When a < 1, f_X(x) approaches infinity as x approaches zero. This characteristic can be used to describe the pulse-shape phenomenon in the distribution of AC coefficients. Figure 3 shows how the Weibull distribution changes under different parameter settings. When m is fixed and a is less than 1, a larger shape parameter a corresponds to a slower change of f_X(x). When a is fixed, a larger scale parameter m corresponds to a faster change of f_X(x).

Figure 4 shows the distribution of the absolute values of the AC coefficients for the corresponding images in Figure 1 and the fitting curves of the Weibull model. The Weibull model fits the pulse-shape phenomenon well, in addition to the sharp-peak and heavy-tail phenomena. The faster the Weibull distribution changes, the larger the distortion. We use the Mean Square Error (MSE) to express the fitting error:

MSE = (1/N) Σ_{i=1}^{N} (P_i − Q_i)², (2)

where P_i is the value of the histogram of the absolute values of the AC coefficients, Q_i is the statistical distribution density of the fitted function, and N is the number of histogram bins. The MSE values for the Weibull model are 6.07 × 10^−6 and 3.38 × 10^−7 in Figure 4b,c, respectively; these values are much lower than the corresponding MSE values of the GGD model, which are 1.1 × 10^−4 and 3.36 × 10^−5 in Figure 2b,c, respectively.

We also fitted the Weibull model to images with five representative distortion types in the LIVE database: JP2K compression (JP2K), JPEG compression (JPEG), white noise (WN), Gaussian blur (GB), and fast-fading (FF) channel distortions [38]. Figure 5 presents the MSE comparison of the two models for each distortion type, and Table 1 lists the average MSE for each distortion type.
The fitting error of the Weibull model is clearly smaller than that of the GGD model. Therefore, the Weibull model is employed to evaluate image quality in this paper.
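As a concrete sketch of this fitting procedure, the two-parameter Weibull fit and its histogram MSE can be computed with SciPy's `weibull_min`. The sample below is synthetic stand-in data (shape 0.7, scale 2.0, in the pulse-shape regime a < 1), not AC coefficients from the LIVE images:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in for the absolute AC coefficients of one image:
# a Weibull sample in the pulse-shape regime (shape a = 0.7 < 1).
coeffs = stats.weibull_min.rvs(0.7, scale=2.0, size=20000, random_state=rng)

# Fit the two-parameter Weibull model (location fixed at 0).
a_hat, _, m_hat = stats.weibull_min.fit(coeffs, floc=0)

# Fitting error: MSE between the empirical histogram density P_i and the
# fitted density Q_i evaluated at the bin centres.
P, edges = np.histogram(coeffs, bins=100, density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
Q = stats.weibull_min.pdf(centres, a_hat, loc=0, scale=m_hat)
mse = np.mean((P - Q) ** 2)
print(a_hat, m_hat, mse)
```

With 20,000 samples the recovered shape and scale are close to the true values, and the histogram MSE is small, mirroring the low Weibull fitting errors reported for Figure 4.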

Proposed BWS Method
In this section, we describe the proposed BWS method in detail. The framework is illustrated in Figure 6. First, block DCT processing is applied to images of different scales. The goal is not only to conform to the decomposition process of local information in the HVS [40] but also to reflect the image structure correlation [4]. In the DCT domain, the magnitudes of the block AC coefficients and those in the orientation and frequency sub-regions are extracted to describe HVS characteristics. Then, a Weibull model is employed to fit these values. From the Weibull model, the following four perceptual features are extracted: the shape-scale parameter feature, the coefficient of variation feature, the frequency sub-band feature, and the directional sub-band feature. Finally, by using the SVR learning method, the relationship between the image features and the subjective quality scores in a high-dimensional space is obtained. In the BWS algorithm, we consider only the change of luminance information because neuroscience research shows that the HVS is highly sensitive to changes in image luminance [37]. In addition, DCT processing is useful in IQA: the representation of features can be enhanced because different frequency components are affected by distortion to different degrees, and computational convenience is another advantage [41].

Relationship Between HVS and Perceptual Features
IQA modeling must satisfy human perceptual requirements, which are closely related to the HVS. Designing an HVS-based model that directly predicts image quality is infeasible because of its complexity. Therefore, in this paper, features related to the HVS are extracted from the Weibull model and used to predict perceptual image quality scores by a learning method. The visual perception system has been shown to be highly hierarchical [42]. Visual properties are processed in areas V1 and V2 of the primate neocortex, which occupy a large region of the visual cortex. V1 is the first cortical area of visual perception, and the neurons in V1 achieve succinct descriptions of images in terms of local structural information pertaining to spatial position, orientation, and frequency [43]. Area V2 is also a major processing area in the visual cortex [44], and the neurons in V2 have the property of scale invariance [45]. Therefore, our proposed model is related to the properties of the HVS.

Shape-Scale Parameter Feature
Before extracting features, we divide the image into 5 × 5 blocks with a two-pixel overlap between adjacent blocks to remove the redundancy of image blocks and better reflect the correlation information among blocks. For each block, the DCT is performed and the absolute AC coefficients are extracted. Then, the Weibull model is used to fit the magnitudes of the AC coefficients of each block. Fragmentation theory posits that the scale parameter m and the shape parameter a of the Weibull distribution are strongly correlated with brain responses. Experiments on brain responses showed that a and m explain up to 71% of the variance of the early electroencephalogram signal [39]. These parameters can also be estimated from the outputs of X-cells and Y-cells [46]. In addition, the two parameters can accurately describe the image structure correlation: a difference in the image distribution caused by quality degradation, which depends on the image structural information, results in a different shape of the Weibull distribution and therefore in different values of a and m. In other words, the response of the brain to external image signals is highly correlated with the parameters a and m of the Weibull distribution. Thus, we define the shape-scale parameter feature ζ = (1/m)^a. The parameters that directly determine the form of the Weibull distribution are a and (1/m)^a, as shown in Equation (3), which is a rearrangement of the Weibull distribution:

f_X(x) = a (1/m)^a x^(a−1) exp(−(1/m)^a x^a). (3)

Because brain responses to natural images are correlated with both a and m, considering only the influence of a while ignoring the effect of m on the Weibull distribution would not produce accurate results. Therefore, we chose (1/m)^a as the feature ζ for assessing image quality. The advantage of this feature is that it provides an intuitive expression of the Weibull equation and is a monotonic function of distortion, which can be used to represent the level of distortion in an image.
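A block-level sketch of this feature, using SciPy's `dctn` and `weibull_min.fit` as stand-ins for the paper's DCT and Weibull fitting steps; the image here is random toy data, and a step of 3 gives the stated two-pixel overlap between adjacent 5 × 5 blocks:

```python
import numpy as np
from scipy.fft import dctn
from scipy import stats

def shape_scale_features(img, block=5, step=3):
    """Compute the shape-scale feature zeta = (1/m)^a for every 5x5 block.

    step = 3 gives a two-pixel overlap between adjacent 5x5 blocks.
    """
    feats = []
    h, w = img.shape
    for i in range(0, h - block + 1, step):
        for j in range(0, w - block + 1, step):
            coef = dctn(img[i:i + block, j:j + block], norm="ortho")
            ac = np.abs(coef.ravel()[1:])   # drop the DC term
            ac = ac[ac > 0]                 # Weibull support is x > 0
            if ac.size < 3:
                continue
            a, _, m = stats.weibull_min.fit(ac, floc=0)
            feats.append((1.0 / m) ** a)
    return np.array(feats)

rng = np.random.default_rng(1)
img = rng.random((32, 32))      # toy stand-in for a luminance image
zeta = shape_scale_features(img)
# Pooling: mean of the highest 10% of blocks and mean over all blocks.
pooled = (np.mean(np.sort(zeta)[-max(1, zeta.size // 10):]), np.mean(zeta))
print(pooled)
```

The two pooled values correspond to the local worst-region and global descriptions used in the next paragraph.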
The effectiveness of the features was verified on the LIVE database. We extracted the average value of the highest 10% and of 100% (all block coefficients) of the shape-scale parameter feature ζ over all blocks of the image. These two percentages correspond to the image distortion of the local worst regions and of the global regions, respectively. Focusing only on the distortion of the local worst regions or of the overall regions may be inappropriate [47][48][49]; thus, it is necessary to combine the local distortion with the global distortion. Table 2 shows the Spearman Rank Order Correlation Coefficient (SROCC) values between the DMOS scores and the average values of the highest 10% of ζ, as well as between the DMOS scores and the average values of 100% of ζ, on the LIVE database. The SROCC values are larger than 0.7, which indicates a significant correlation with the subjective scores [50]. Therefore, these features can be effectively used as perceptual features for IQA.

Coefficient of Variation Feature
Natural images are known to be highly correlated [36,37], and this correlation can be affected by distortions in different ways. In the DCT domain, distortions change the distribution of the AC coefficients. For JP2K, JPEG, GB, and FF [38], the distortion increases the differences among the low-, middle- and high-frequency information; the standard deviation under unit mean then becomes larger than that of the natural image. Thus, a large variation of the standard deviation represents a large distortion. In contrast, for WN distortion, the added random noise causes the high-frequency information to increase rapidly, thereby reducing the differences among the different frequency components. Thus, a small variation of the standard deviation corresponds to a large distortion.
Therefore, we define the coefficient of variation feature ξ, which describes the variation of the standard deviation under unit mean, as follows:

ξ = σ_X / μ_X, (4)

where the mean μ_X and variance σ²_X of the Weibull model can be obtained as follows:

μ_X = m Γ(1 + 1/a), σ²_X = m² [Γ(1 + 2/a) − Γ²(1 + 1/a)], (5)

where Γ denotes the gamma function. Substituting Equation (5) into Equation (4), this parameter is given by:

ξ = sqrt(Γ(1 + 2/a) − Γ²(1 + 1/a)) / Γ(1 + 1/a). (6)

We calculated the average value of the highest 10% of ξ and the average value of 100% of ξ over all blocks of the image. Table 3 shows the SROCC values, which were verified on the LIVE database. The correlation is also significant for most distortion types. Thus, these features can be used to assess image quality.
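The feature depends only on the fitted Weibull parameters. A small helper using the standard Weibull moment formulas with SciPy's gamma function:

```python
import numpy as np
from scipy.special import gamma

def weibull_cv(a, m):
    """Coefficient of variation xi = sigma_X / mu_X of a Weibull(a, m) model,
    computed from the standard Weibull moment formulas."""
    mu = m * gamma(1.0 + 1.0 / a)
    var = m ** 2 * (gamma(1.0 + 2.0 / a) - gamma(1.0 + 1.0 / a) ** 2)
    return float(np.sqrt(var) / mu)

print(weibull_cv(1.0, 2.0))   # exponential special case: xi = 1
print(weibull_cv(0.5, 5.0))   # stronger pulse shape (a < 1): xi = sqrt(5)
```

Note that ξ depends only on the shape parameter a: the scale m cancels between the standard deviation and the mean, so ξ isolates the shape of the fitted distribution.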

Frequency Sub-Band Feature
Natural images are highly structured in the frequency domain. Image distortions often modify the local spectral properties of an image so that these properties become dissimilar to those of natural images [29]. For the JP2K, JPEG, GB, and FF distortion types, distortion triggers a rapid increase in the differences among the coefficients of variation of the frequency sub-band coefficients; a large difference represents a large distortion. For the WN distortion type, however, the opposite trend is observed.
To measure this difference, we defined the frequency sub-band feature f. Following the method in [29], we divided each 5 × 5 image block into three different frequency sub-bands, as shown in Table 4. Then, the Weibull model was fitted to each of the sub-regions, and the coefficient of variation ξ_f was calculated using Equation (4) in the three sub-bands. Finally, the variance of ξ_f was taken as the frequency sub-band feature f. Table 4. DCT coefficients of three bands.
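The feature can be sketched as follows. The band partition below, by the diagonal index i + j of each DCT position, is a hypothetical stand-in since the exact layout of Table 4 is not reproduced here; DC (i = j = 0) is excluded, so the 24 AC positions are covered:

```python
import numpy as np
from scipy.fft import dctn
from scipy.special import gamma
from scipy import stats

# Hypothetical partition of a 5x5 DCT block into low-, mid- and high-frequency
# bands by the diagonal index i + j (Table 4's exact layout may differ).
idx = np.add.outer(np.arange(5), np.arange(5))
bands = [(idx >= lo) & (idx <= hi) for lo, hi in [(1, 2), (3, 5), (6, 8)]]

def frequency_feature(block):
    """Variance of the Weibull coefficients of variation across the bands."""
    coef = np.abs(dctn(block, norm="ortho"))
    cvs = []
    for mask in bands:
        a, _, m = stats.weibull_min.fit(coef[mask], floc=0)
        mu = m * gamma(1.0 + 1.0 / a)
        sd = m * np.sqrt(gamma(1.0 + 2.0 / a) - gamma(1.0 + 1.0 / a) ** 2)
        cvs.append(sd / mu)
    return np.var(cvs)

rng = np.random.default_rng(2)
f_feat = frequency_feature(rng.random((5, 5)))   # toy 5x5 luminance block
print(f_feat)
```

With only a handful of coefficients per band, the per-band Weibull fits are necessarily noisy; in practice the feature is pooled over many blocks.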
The feature was pooled by calculating the average value of the highest 10% of f and the average value of 100% of f over all blocks of each image in the LIVE database. In Table 5, we report how well the features correlate with the subjective scores. The SROCC values show a clear correlation with the subjective scores, which means that these features can be used to describe subjective perception.

Directional Sub-Band Feature
The HVS also has different sensitivities to sub-bands in different directions [40]. Image distortion often changes the correlation information of sub-bands in different directions, and the HVS is highly sensitive to this change. For the JP2K, JPEG, GB, and FF distortion types, distortion increases the inconsistencies among the coefficients of variation of sub-bands in different directions; a large inconsistency reflects a large distortion. Note that the trend is reversed for the WN distortion.
Therefore, this inconsistency can be described by the orientation sub-band feature S_o. We divided the sub-bands into three different orientations for each block, as shown in Table 6; this decomposition is similar to the approach in [29]. Then, the Weibull model was fitted to the absolute AC coefficients within each shaded region of the block. The coefficient of variation ξ_o was calculated using Equation (4) in the three directional sub-bands. Finally, the directional sub-band feature S_o was obtained as the variance of ξ_o. Table 6. DCT coefficients collected along three orientations.
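One plausible orientation grouping can be sketched with index masks; since the shaded regions of Table 6 are not reproduced here, the split below (above, below, and on the main diagonal of the 5 × 5 block) is a hypothetical stand-in:

```python
import numpy as np

# Hypothetical orientation grouping of the 24 AC positions in a 5x5 DCT block
# (Table 6's shaded regions may differ): positions above the main diagonal,
# positions below it, and positions on the diagonal itself.
rows, cols = np.indices((5, 5))
ac = (rows + cols) > 0               # exclude the DC coefficient
horizontal = ac & (cols > rows)
vertical = ac & (rows > cols)
diagonal = ac & (rows == cols)

print(int(horizontal.sum()), int(vertical.sum()), int(diagonal.sum()))
```

Given such masks, S_o follows exactly as in the frequency case: fit the Weibull model to the coefficients selected by each mask, compute the three coefficients of variation, and take their variance.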
The average values of the highest 10% and of 100% of S_o over all blocks of each image were collected. We report the SROCC values between the DMOS scores and the features in Table 7, which demonstrate a clear correlation with human perception.

Multi-Scale Feature Extraction
Previous research has demonstrated that incorporating multi-scale information can enhance prediction accuracy [5]. The statistical properties of a natural image are the same at different scales, whereas distortions affect the image structure differently across scales. The perception of image details depends on the image resolution, the distance from the image plane to the observer, and the acuity of the observer's visual system [40]. A multi-scale evaluation accounts for these variable factors. Therefore, we extracted 24 perceptual features across three scales. In addition to the original-scale image, the second-scale image was constructed by low-pass filtering and down-sampling the original image by a factor of two; the third-scale image was obtained in the same way from the second-scale image. As listed in Table 8, each scale contributes eight features. The extraction process is as described in Sections 3.2-3.5.
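The scale construction can be sketched as follows; a Gaussian low-pass filter is assumed here, since the exact filter is not specified:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_scales(img, n_scales=3):
    """Return the image at three scales: the original, then two rounds of
    low-pass filtering and down-sampling by a factor of two.
    A Gaussian low-pass filter is assumed (the exact filter may differ)."""
    scales = [img]
    for _ in range(n_scales - 1):
        smoothed = gaussian_filter(scales[-1], sigma=1.0)
        scales.append(smoothed[::2, ::2])
    return scales

img = np.zeros((64, 48))
shapes = [s.shape for s in build_scales(img)]
print(shapes)   # [(64, 48), (32, 24), (16, 12)]
```

The eight features of Sections 3.2-3.5 are then extracted from each of the three returned images, giving the 24-dimensional feature vector.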

Prediction Model
After extracting the features at three scales, we learn the relationship between the image features and the subjective scores. In the literature, SVR is widely adopted as the mapping function for learning this relationship [51,52]. Given a set of training data {(x_1, y_1), . . . , (x_l, y_l)}, where x_i ∈ R^n is the extracted image feature vector and y_i is the corresponding DMOS, a regression function can be learned to map the features to the quality score, i.e., y_i = SVR(x_i). We used the LIBSVM package [53] to implement the SVR with a Radial Basis Function (RBF) kernel in our metric. Once the regression model is learned, it can be used to estimate the perceptual quality of any input image.
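A minimal sketch of this regression step, using scikit-learn's SVR (which itself wraps LIBSVM) on synthetic data; the 24-dimensional features, the linear target, and the hyperparameters below are illustrative, not the paper's:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)

# Toy stand-in: 200 images, each with a 24-dimensional feature vector, and
# synthetic "DMOS" targets (a noiseless linear function for illustration).
X = rng.random((200, 24))
y = (X @ rng.random(24)) * 10.0

# RBF-kernel SVR, as in the paper; the hyperparameters here are assumptions.
model = SVR(kernel="rbf", C=100.0, gamma="scale")
model.fit(X[:160], y[:160])          # train on 80% of the samples
pred = model.predict(X[160:])        # predict quality for the held-out 20%
print(np.corrcoef(pred, y[160:])[0, 1])
```

In practice C and gamma would be chosen by cross-validation on the training set, as is standard with LIBSVM.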

Experimental Setup
The performance of blind IQA algorithms is validated using subjective image quality databases, where each image is associated with a human score (e.g., a (Difference) Mean Opinion Score (DMOS/MOS)). The performance describes how well the objective metric correlates with human ratings. Several subjective image quality evaluation databases have been established. We employed three widely used databases, namely, the LIVE database [38], the TID2008 database [54] and the CSIQ database [55], in our research. These three databases are summarized as follows: (1) The LIVE database includes 29 reference images and 779 distorted images corrupted by five types of distortions: JP2K, JPEG, WN, GB and FF. Subjective quality scores are provided in the form of DMOS values ranging from 0 to 100; each distorted image is associated with a DMOS representing its subjective quality. (2) The TID2008 database covers 17 distortion types, each of which consists of 100 distorted versions of 25 reference images. Subjective quality scores are provided for each image in the form of MOS values ranging from 0 to 9. Note that the TID2008 database contains one artificial image and its distorted versions; we discarded these images when evaluating the performance of the BWS method because the proposed method is designed to evaluate the quality of natural scenes. In addition, we mainly considered the subsets with the JP2K, JPEG, WN and GB distortion types that also appear in the LIVE database; these four distortion types are the most commonly encountered distortions in practical applications. (3) The CSIQ database contains 30 reference images and 866 distorted images covering six distortion types; subjective quality scores are provided as DMOS values in the range [0, 1].

To evaluate the performance of the BWS algorithm, two correlation coefficients, the Pearson Linear Correlation Coefficient (PLCC) and the SROCC, are used as the criteria. The PLCC measures the prediction accuracy, whereas the SROCC represents the prediction monotonicity. Before calculating the PLCC, the algorithm scores are mapped using a logistic non-linearity, as described in [56].
Both the SROCC and PLCC lie in the range [−1, 1]. SROCC and PLCC values that are closer to "1" or "−1" correspond to better predictions of the algorithm.
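These criteria can be sketched on synthetic scores; the five-parameter logistic below is a common choice for the non-linear mapping, though the exact form used in [56] may be parameterized differently:

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit

def logistic5(x, b1, b2, b3, b4, b5):
    # Five-parameter logistic mapping from objective scores to the MOS scale
    # (a common choice; the exact form in [56] may differ).
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

rng = np.random.default_rng(4)
obj = np.linspace(0.0, 1.0, 100)                      # objective metric scores
mos = 100.0 / (1.0 + np.exp(-8.0 * (obj - 0.5)))      # synthetic subjective scores
mos += rng.normal(0.0, 1.0, size=100)                 # rating noise

srocc = stats.spearmanr(obj, mos)[0]                  # monotonicity (no mapping)
p0 = [float(np.ptp(mos)), 1.0, float(np.median(obj)), 1.0, float(np.mean(mos))]
params, _ = curve_fit(logistic5, obj, mos, p0=p0, maxfev=20000)
plcc = stats.pearsonr(logistic5(obj, *params), mos)[0]  # accuracy after mapping
print(srocc, plcc)
```

The SROCC is invariant to the mapping because it is rank-based, whereas the PLCC is computed only after the logistic fit removes the systematic non-linearity between objective and subjective scales.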

Performance on Individual Databases
First, we evaluated the overall performance of the BWS method and other competing IQA methods, namely, BLIINDS-II [29], DIIVINE [31], BRISQUE [32], BIQI [57], PSNR and SSIM [4], on each database. The first four methods are NR IQA algorithms, and the latter two are FR IQA algorithms.
Because the BWS approach is based on SVR, we randomly divided each image database into a training set and a testing set: the training set is used to train the prediction model, and the testing set is used to evaluate the prediction results. In our experimental setup, 80% of the distorted images in each database are used for training and the remaining 20% for testing, with no content overlap between the two sets. We repeated the training-testing procedure 1000 times, and the median of the obtained SROCC and PLCC values is reported as the final performance of the proposed metric. We adopted the same experimental setup for the comparison algorithms. Although the FR IQA approaches PSNR and SSIM do not require training, for a fair comparison, we also evaluated them on the randomly partitioned testing sets and recorded the median PLCC and SROCC values. Table 9 shows the performance of the different methods on the LIVE, TID2008 and CSIQ databases; for each database, the top two IQA methods are highlighted in bold. For the LIVE database, the overall performance of the proposed BWS method is better than those of the other IQA methods. For the TID2008 and CSIQ databases, the BWS algorithm also outperforms the other NR and FR IQA methods. We conclude that the BWS algorithm outperforms its competitors overall on the different databases. Although other methods may work well on some databases, they fail to deliver good results on others; for example, DIIVINE obtains a good result on the LIVE database but performs poorly on the CSIQ and TID2008 databases. Moreover, we present the weighted-average SROCC and PLCC results of the competing IQA methods on all three databases in Table 10, where the weight assigned to each database depends on the number of distorted images it contains [58,59]. BWS still performs best among the IQA methods.
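The repeated random-split protocol can be sketched as follows; a trivial 1-nearest-neighbour predictor on toy one-dimensional features stands in for the trained SVR model, and 50 trials replace the paper's 1000 for brevity:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def split_trial(features, dmos, train_frac=0.8):
    """One random 80/20 training-testing trial. A 1-nearest-neighbour
    predictor stands in for the trained SVR model here."""
    n = len(dmos)
    perm = rng.permutation(n)
    cut = int(train_frac * n)
    tr, te = perm[:cut], perm[cut:]
    dist = np.abs(features[te][:, None] - features[tr][None, :])
    pred = dmos[tr][dist.argmin(axis=1)]
    return stats.spearmanr(pred, dmos[te])[0]

feats = rng.random(300)                              # toy 1-D "features"
scores = 10.0 * feats + rng.normal(0.0, 0.5, 300)    # synthetic DMOS
sroccs = [split_trial(feats, scores) for _ in range(50)]
print(np.median(sroccs))   # median SROCC over repeated trials, as in the paper
```

Reporting the median over many random splits reduces the sensitivity of the result to any single fortunate or unfortunate partition.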
Hence, we conclude that the objective scores predicted by BWS correlate much more consistently with subjective evaluation than those predicted by the other IQA metrics. To determine whether the superiority of the BWS method over its counterparts is statistically significant, we conducted a statistical analysis to validate the differences in performance. Hypothesis testing based on the t-test [60], which measures the equivalence of the mean values of two independent samples, was performed. Experiments were conducted by randomly splitting each database into a training set and a testing set, and the SROCC values were recorded over 1000 training-testing trials. We then applied the t-test between the SROCCs generated by each pair of algorithms and tabulated the results in Tables 11-13; each table shows the results of the t-test on one database. A value of "1" in the tables indicates that the row algorithm is statistically superior to the column algorithm, whereas a value of "−1" indicates that the row algorithm is statistically inferior to the column algorithm. A value of "0" indicates that the row and column algorithms are statistically equivalent. From the experimental results in Tables 11-13, the BWS method is statistically superior to the FR approaches PSNR and SSIM and to the NR IQA approach BLIINDS-II. Table 11. Statistical significance test on LIVE.

Performance on Individual Distortion Type
In this section, we tested the performance of the proposed BWS method and other competing IQA methods on individual distortion types over the LIVE, TID2008 and CSIQ databases. For NR IQA, we trained on a random 80% of the distorted images with various distortion types and tested on the remaining 20% of the distorted images with a specific distortion type. The SROCC and PLCC comparisons on each database are presented in Tables 14 and 15. We observed the advantages of the BWS algorithm on individual distortion types in each database. For the LIVE database, the proposed metric clearly outperforms the other NR metrics on FF. In particular, compared with the BLIINDS-II method, which operates in the same DCT domain, BWS performs better on the JP2K, WN, GB and FF distortion types. As for the FR metrics, although these methods require complete information about the reference images, our algorithm still outperforms PSNR on all distortion types and outperforms SSIM on the WN and GB distortions. For the TID2008 database, the performance of our method on JP2K distortion is superior to that of the other NR metrics; similarly, its performance on the JP2K, WN and GB distortions is better than that of the BLIINDS-II method. Compared with the FR metrics, the performance of the BWS algorithm on the JP2K distortion type is better than those of PSNR and SSIM. For the CSIQ database, our method outperforms the remaining NR metrics on the WN and GB distortions. Compared with the BLIINDS-II algorithm, the BWS method consistently performs better. Compared with the FR metrics, the performance of BWS on the JPEG and GB types is better than that of PSNR, and its performance on the GB and WN distortions is better than that of SSIM. Therefore, the performance of the BWS algorithm is superior on some specific distortion types in each database.
To facilitate a comparison between the BWS method and the other IQA methods, 13 groups of distorted images were considered across the three databases. The best two SROCC and PLCC results are highlighted in boldface for the NR IQA methods. We counted the number of times each method ranked in the top two in terms of the SROCC and PLCC values for each distortion type. For the 13 groups of distorted images in the three databases, the BWS algorithm ranked in the top two the most often: 10 times for the SROCC and 11 times for the PLCC. We also report the weighted means and standard deviations (STDs) of the competing IQA methods across all distortion groups in Table 16; BWS has a higher weighted mean and a lower STD across all distortion groups. Hence, the BWS method achieves consistently better performance on the most commonly encountered distortion types. To visually show the correlation between the quality scores predicted by the BWS method and the subjective scores, we present scatter plots (for each distortion type and for the entire LIVE, TID2008 and CSIQ databases) in Figures 7-9. These figures show a strong linear relationship between the predicted scores of BWS and the subjective human ratings, which indicates the high prediction accuracy of the BWS algorithm.

Cross-Database Validation
In the previous experiments, the training and test samples were selected from the same database. An IQA model learned on one image quality database should be able to accurately assess the quality of images in other databases. Therefore, to demonstrate the generality and robustness of the proposed BWS method, the following experiments were conducted. We trained on all distorted images from one database to obtain a prediction model and used this model to predict the scores of the distorted images from the other databases. With three databases, six combinations of training and testing database pairs were created. We compared our proposed BWS algorithm with the BLIINDS-II method, which operates in the same DCT domain. The SROCC results of the cross-database validation are tabulated in Table 17. The results indicate that the BWS algorithm performs better than the BLIINDS-II method in most cases. Therefore, the cross-database validation demonstrates the generality and robustness of the proposed BWS method in the DCT domain.

Model Selection
In Section 2, we showed that the Weibull model, rather than the typical GGD model, can well simulate the statistical regularities of distorted natural images in the DCT domain as an NSS model. However, the shape changes of the Weibull distribution are similar to those of the simpler exponential distribution. It is therefore necessary to judge whether the exponential model could replace the Weibull model for IQA. Figure 10 shows the distribution of AC coefficients for the corresponding images in Figure 1 and the exponential fitting curves. We found that the exponential distribution model fails to fit the statistics of natural and distorted images in the DCT domain. Moreover, we used the MSE to compare the fitting errors of the three models for each distortion type, as shown in Table 18. The fitting error of the Weibull model is the smallest. Therefore, it is not appropriate to replace the Weibull model with the exponential model.

Block Size Selection
In the BWS method, we selected 5 × 5 blocks for the DCT. On the one hand, if a smaller block size is selected, the correlation between blocks is very large, making it difficult to distinguish differences in the extracted features, which affects the prediction accuracy. Similarly, if a larger block size is selected, the extracted features lack the correlation information between blocks, which leads to inaccurate evaluation performance. We also verified experimentally that the 5 × 5 block size performs best in our method, as shown in Table 19. On the other hand, the 5 × 5 block is very common in image quality assessment [29,61,62]. Therefore, selecting 5 × 5 blocks is reasonable and improves the prediction performance.

The pooling strategy has recently been studied as an important factor in the accuracy of objective quality metrics. In our method, we calculated the average value of the highest 10% of the features and the average value of 100% of the features. The highest 10% and 100% of the block features describe the image distortion of the worst 10% of regions and of the overall image, respectively. Averaging features over all local regions is one of the most widely used methods in image quality metrics, and it describes the global distortion of the image. However, when only a small region of an image is corrupted by extremely annoying artifacts, human subjects tend to pay more attention to the low-quality region. Thus, we also considered the image distortion of the worst 10% of regions, because the human eye is more annoyed by local distortion and subjects are likely to base their quality opinions on the worst 10% of the whole image [63].
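This joint pooling can be sketched with a small helper that returns both the worst-10% mean and the global mean of a set of block features:

```python
import numpy as np

def pool(feature_values, frac=0.10):
    """Joint pooling: mean of the highest `frac` of block features
    (local worst-case regions) and the mean over all blocks (global)."""
    v = np.sort(np.asarray(feature_values, dtype=float))
    k = max(1, int(round(frac * v.size)))
    return v[-k:].mean(), v.mean()

vals = np.arange(1, 101, dtype=float)   # toy block features 1..100
worst10, overall = pool(vals)
print(worst10, overall)   # 95.5 50.5
```

Both pooled values are kept as separate entries of the feature vector, so the learner can weigh local and global distortion independently.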
We extracted three feature sets: the average values of the highest 10% of features, the average values of 100% of features, and the combination of the first two feature sets at three scales from the LIVE database. The feature extraction method and the experimental setup were the same as those reported in Sections 3 and 4. The median SROCC value of 1000 training-testing trials was used to evaluate the performance. The results of this experiment are shown in Table 20; they indicate that the performance of the combined feature set is better than that obtained when only one factor is considered. This finding demonstrates that the joint pooling strategy can improve the prediction performance of image quality measures and that human perception of images synthesizes local and global information.

A multi-scale segmentation method was also developed and implemented for IQA. We proposed multi-scale features for predicting image quality because the perceptibility of image details depends on the viewing conditions, and the subjective evaluation of a given image varies with these factors. Therefore, a multi-scale method is an effective way of incorporating image details at different resolutions.
We conducted an experiment to determine the impact of scale on IQA. In our method, we selected three scale images for extracting features because no significant gain in performance is obtained beyond the third scale of feature extraction. The methods of scale image segmentation, feature extraction and experiment setup were the same as the previous operation. We report the correlations of different scales between the predicted scores and the subjective scores on the LIVE database in Table 21.
The experimental results show that the BWS method with three scales outperforms the other cases, which proves that the multi-scale approach improves the performance of our metric.

Computational Complexity
In many practical applications, prediction accuracy and algorithm complexity must be considered together. Therefore, we evaluated the computational complexity of all competing methods in Table 22. Although the proposed BWS algorithm does not have the lowest complexity, its assessment accuracy is the best among all competing models, as shown in Tables 10 and 16. For example, BIQI has the lowest complexity, O(N), where N is the total number of image pixels; however, its performance is the worst. DIIVINE is the most complex, but its performance is inferior to that of our algorithm. Therefore, the proposed BWS algorithm offers a reasonable trade-off between assessment accuracy and computational complexity.

Conclusion
Since there is a strong relationship between changes of entropy and the distribution of an image, existing NR IQA models typically use the GGD model to fit the distribution of a natural image in the DCT domain and extract features from this model to learn a quality prediction model. However, the difference between the distribution of a distorted image and that of its natural counterpart, which includes the pulse-shape phenomenon, is neglected. In this paper, we proposed the BWS method, a new NR IQA method based on the Weibull model in the DCT domain. The important pulse-shape phenomenon of the distorted image distribution is considered for the first time. The proposed Weibull model not only overcomes the disadvantages of the GGD model but also reflects HVS perception. Our findings suggest that the Weibull model plays an important role in quality assessment tasks. Extensive experimental results on three public databases demonstrate that the proposed BWS method is highly correlated with human visual perception and is competitive with state-of-the-art NR and FR IQA methods in terms of prediction accuracy and database independence.

Conflicts of Interest:
The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript: FR