Real-Time Haze Removal Using Normalised Pixel-Wise Dark-Channel Prior and Robust Atmospheric-Light Estimation

Abstract: This study proposes real-time haze removal from a single image using a normalised pixel-wise dark-channel prior (DCP). DCP assumes that at least one RGB colour channel within most local patches in a haze-free image has a low-intensity value. Since the spatial resolution of the transmission map depends on the patch size, and detailed structure is lost with large patch sizes, the original work refines the transmission map using an image-matting technique. However, this requires high computational cost and is not adequate for real-time applications. To solve these problems, we use normalised pixel-wise haze estimation that preserves the detailed structure of the transmission map. This study also proposes robust atmospheric-light estimation using a coarse-to-fine search strategy and down-sampled haze estimation for acceleration. Experiments with actual and simulated haze images showed that the proposed method achieves real-time results of visually and quantitatively acceptable quality compared with other conventional haze-removal methods.


Introduction
In recent years, self-driving vehicles, underwater robots and remote sensing have attracted attention; such applications require fast and robust image-recognition techniques. However, images of outdoor or underwater scenes often have poor quality because of haze (Figure 1a), which degrades image recognition. To solve this problem, many haze-removal techniques have been proposed; these can be classified into non-learning-based and learning-based approaches.
Non-learning-based approaches use multiple haze images [1], depth information [2] and prior knowledge from a single haze image [3][4][5]. Methods employing prior knowledge maximise contrast within the local patch [3], assume that surface shading and transmission are locally uncorrelated [4], or statistically observe that at least one RGB colour channel within most local patches in a haze-free image has a low-intensity value [5]. Median and guided-image filters [6,7] have been used to accelerate haze removal; however, these methods could not achieve real-time processing (defined as 20 fps for our calculations herein). Learning-based approaches employ random forests [8], a colour-attenuation prior based on the brightness-saturation relation [9] and deep learning [10,11]. These methods can achieve accurate and fast haze removal compared with conventional non-learning-based approaches. In deep-learning-based methods, large-scale pairs of haze images and corresponding haze-free images must be prepared and their relation must be trained. Pairs of haze and haze-free images of the same scene cannot exist simultaneously in actual situations; therefore, haze images are generated from haze-free images by employing the haze-observation model [10] and depth information from the corresponding haze-free images [11]. The haze-removal accuracy of deep-learning-based methods depends on the prepared training data.

Traditional Dark Channel Prior
This section describes DCP [5], which is the basis of the proposed method. The haze-observation model [1,5] is represented by

I(x) = J(x) t(x) + A (1 − t(x)),    (1)

where I(x) is the observed RGB colour vector of the haze image at coordinate x, J(x) is the ideal haze-free image at coordinate x, A is the atmospheric light and t(x) is the value of the transmission map at coordinate x. To solve the haze-removal problem, some prior knowledge such as DCP must be applied. The transmission-map derivation (Section 2.1), atmospheric-light estimation (Section 2.2) and haze-removal image creation (Section 2.3) are explained as follows.
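As a minimal sketch of Equation (1) (function and variable names are ours, not from the paper), a hazy image can be synthesised from a haze-free image, an atmospheric light and a transmission map:

```python
import numpy as np

def apply_haze_model(J, A, t):
    """Synthesise a hazy image via Equation (1): I(x) = J(x) t(x) + A (1 - t(x)).
    J: (H, W, 3) haze-free image in [0, 1]; A: (3,) atmospheric light;
    t: (H, W) transmission map in [0, 1]."""
    t3 = t[..., np.newaxis]           # broadcast t over the colour channels
    return J * t3 + A * (1.0 - t3)
```

This is also how the simulated haze images in the experiments can be generated once t(x) and A are fixed.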

Estimation of Transmission Map
Medium transmission t(x) [1,5] is expressed by

t(x) = exp(−β d(x)),    (2)

where β is the scattering coefficient of the atmosphere and d(x) is the depth at coordinate x. He et al. [5] used DCP, indicating that at least one RGB colour channel within most local patches has a low-intensity value:

DC(J(x)) = min_{y∈Ω(x)} ( min_{c∈{r,g,b}} J^c(y) ) ≈ 0,    (3)

where J^c(y) is a colour channel of the haze-free image J(y) at coordinate y and DC is the dark-channel operator, which extracts the minimum RGB colour channel in a local patch Ω(x) centred at coordinate x. Applying the dark-channel operator to Equation (1) divided by A, Equation (1) can be rewritten as

DC(I(x)/A) = t̃(x) DC(J(x)/A) + 1 − t̃(x),    (4)

where t̃(x) is the patch-based coarse transmission map and the divisions I(x)/A and J(x)/A are element-wise. Setting DC(J(x)/A) to zero according to the prior of Equation (3), the coarse transmission map is obtained as

t̃(x) = 1 − ω DC(I(x)/A),    (5)

where ω is the haze-removal rate, which accounts for human depth perception (0.95 in He et al. [5]). Since t̃(x) is calculated on large patches to satisfy the DCP, t̃(x) is not smooth in edge regions and spatial resolution is lost. To solve this problem, He et al. [5] refined the transmission map t̃(x) using image-matting processing [14] as post-processing. However, such processing has a high computational cost and requires several tens of seconds to execute the haze-removal method.
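The patch-wise dark-channel operator and the coarse transmission t̃(x) = 1 − ω DC(I(x)/A) can be sketched with scipy's minimum filter (an illustration under our own naming, not the authors' code):

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """DC operator: minimum over the RGB channels, then over a patch x patch window."""
    return minimum_filter(img.min(axis=2), size=patch, mode='nearest')

def coarse_transmission(I, A, patch=15, omega=0.95):
    """Patch-based coarse transmission: 1 - omega * DC(I/A) (He et al.'s prior)."""
    return 1.0 - omega * dark_channel(I / A, patch)
```

The large patch (15 × 15 in [5]) is what makes t̃(x) blocky around edges, motivating the matting-based refinement.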

Estimation of Atmospheric Light
Atmospheric light A corresponds to pixels of the observed image for which t(x) = 0 in Equation (1); there is no direct light, and the distance is infinite in Equation (2). In an outdoor image, this generally represents the intensity of the sky region. To estimate atmospheric light A, the highest luminance value in the haze image I can be used [3]. However, if an image contains a white object, the atmospheric light A is misestimated; therefore, He et al. [5] estimated the optimum atmospheric light A using the dark-channel value. They first determined the top 0.1% brightest pixels in the dark-channel image, and then chose the highest-intensity pixel among the corresponding pixels in the haze image I. Although this approach is useful because it can estimate atmospheric light A while ignoring small white objects, the size of the objects that can be ignored is limited by the patch size.
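He et al.'s selection rule described above (top 0.1% brightest dark-channel pixels, then the brightest corresponding haze-image pixel) can be sketched as follows; the function name and the sum-of-channels luminance are our own simplifications:

```python
import numpy as np

def estimate_atmospheric_light(I, dark, ratio=0.001):
    """Pick A among the brightest `ratio` fraction of dark-channel pixels.
    I: (H, W, 3) haze image; dark: (H, W) dark-channel image."""
    h, w = dark.shape
    n = max(1, int(h * w * ratio))
    top = np.argsort(dark.ravel())[-n:]              # top 0.1% dark-channel pixels
    candidates = I.reshape(-1, 3)[top]
    return candidates[candidates.sum(axis=1).argmax()]  # brightest candidate in I
```

The `argsort` over all pixels is the sorting cost that the coarse-to-fine strategy in Section 3 avoids.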

Estimation of Haze-Removal Image
Haze removal can be calculated by modifying Equation (1) as follows:

J(x) = (I(x) − A) / max(t(x), t_0) + A,    (6)

where t(x) is the transmission map refined from the patch-based transmission map t̃(x), A is the atmospheric light and t_0 is a parameter set to 0.1 to avoid division by a small value.
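Equation (6) maps directly to code (a sketch; I is assumed to be an (H, W, 3) array in [0, 1], A a (3,) vector and t an (H, W) map):

```python
import numpy as np

def remove_haze(I, A, t, t0=0.1):
    """Recover J(x) = (I(x) - A) / max(t(x), t0) + A, clipped to [0, 1].
    t0 guards against division by near-zero transmission."""
    t3 = np.maximum(t, t0)[..., np.newaxis]   # clamp t and broadcast over channels
    return np.clip((I - A) / t3 + A, 0.0, 1.0)
```

Applying this to an image synthesised with Equation (1) and the true t and A recovers the haze-free image.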

Proposed Method
Computer-vision tasks, such as self-driving vehicles, underwater robots and remote sensing, require real-time haze removal to realise fast and robust image recognition. In this section, a real-time and highly accurate haze-removal algorithm is proposed.

Normalized Pixel-Wise Dark Channel Prior
In the DCP, the spatial resolution of the transmission map worsens along object edges because of the spatial minimisation used to compute the dark-channel image. Therefore, He et al. [5] refined the transmission map via image-matting processing [14]. However, image-matting processing requires a high computational cost and is not acceptable for real-time applications, so they proposed the guided-image filter [7] as a fast image-matting technique. Other researchers have proposed a pixel-wise DCP [15][16][17] and a method combining the original patch-wise DCP in flat regions with the pixel-wise DCP around edge regions [18]. Although the pixel-wise DCP can estimate the transmission map t(x) without selecting a minimum value spatially, the result tends to be darker than the haze image (Figure 2b). The histogram of medium transmission t(x) in Figure 3 shows that the pixel-wise DCP without normalisation shifts to the left compared with the histogram of the original patch-wise DCP (He et al. [5]). This is because DC(J(x)/A) in Equation (4) can no longer be assumed to be zero when the patch size is 1 × 1 instead of 15 × 15. Therefore, in the proposed method, DC(J(x)/A) in Equation (4) is assigned a small value DC_J(x), defined by multiplying the normalised dark channel of the haze image I, which ranges from 0 to 1, by the ratio γ in Equation (8):
DC_p(I(x)/A) = min_{c∈{r,g,b}} ( I^c(x)/A^c ),    (7)

DC_J(x) = γ (DC_p(I(x)/A) − min_{y∈Ω} DC_p(I(y)/A)) / (max_{y∈Ω} DC_p(I(y)/A) − min_{y∈Ω} DC_p(I(y)/A)),    (8)

where DC_p is the pixel-wise dark-channel operator and Ω is the entire image. The transmission map t(x) of the normalised pixel-wise DCP can then be calculated from Equation (4), with DC_J(x) substituted for DC(J(x)/A), as

t(x) = 1 − ω (DC_p(I(x)/A) − DC_J(x)) / (1 − DC_J(x)).    (9)

The histogram (Figure 3) of the transmission map t(x) derived by the proposed method shifts towards the right compared with the histogram of the transmission map without normalisation. As a result, the histogram of the proposed method gets close to that of the original patch-wise DCP. If γ is set to 0, Equation (9) corresponds to the pixel-wise DCP without normalisation (Figure 2b). Furthermore, setting γ to a small value (e.g., 0.25) results in a dark image within the yellow dotted rectangle (Figure 2c), whereas setting γ to a large value (e.g., 0.75) diminishes the haze-removal effect within the dashed red rectangle (Figure 2e).
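The normalised pixel-wise estimate can be sketched as follows. This is our own reconstruction, not the authors' code: DC_J is taken as the γ-scaled min-max normalisation of the pixel-wise dark channel over the whole image, and the transmission follows t(x) = 1 − ω (DC_p(I(x)/A) − DC_J(x)) / (1 − DC_J(x)), which reduces to the unnormalised pixel-wise DCP when γ = 0:

```python
import numpy as np

def pixelwise_transmission(I, A, gamma=0.5, omega=0.95, eps=1e-6):
    """Normalised pixel-wise DCP transmission (sketch).
    gamma = 0 reduces to t = 1 - omega * DC_p(I/A)."""
    dcp = (I / A).min(axis=2)                      # pixel-wise dark channel of I/A
    m, M = dcp.min(), dcp.max()
    dcj = gamma * (dcp - m) / max(M - m, eps)      # gamma-scaled min-max normalisation
    t = 1.0 - omega * (dcp - dcj) / (1.0 - dcj + eps)
    return np.clip(t, 0.0, 1.0)
```

Because dcj grows with γ, larger γ yields larger transmission values, reproducing the rightward histogram shift described in the text.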

Acceleration by Down-Sampling
In the haze-removal method, the transmission map t(x) must be calculated for each pixel. Therefore, the calculation time depends on the image size, and larger images incur higher calculation costs. Fortunately, the transmission map t(x) is characterised by relatively low frequency content except at edges between objects, particularly objects at different depths. Therefore, we greatly reduced the computation time by down-sampling the input image: the transmission map t(x) and atmospheric light A are estimated from the down-sampled image, and the haze-removal image J is then computed from the up-sampled transmission map t(x) using Equation (6). Figure 4 shows the haze-removal results with different down-sampling ratios. A down-sampling ratio of 1/4 achieved visually acceptable results, but ratios of 1/8 or 1/16 generated halo effects along edges such as the sides of trees and leaves within the dashed red rectangles (second row of Figure 4e,f). In addition, significant aliasing occurred along the edges of the bench within the yellow dotted rectangle (third row of Figure 4e,f), and uneven colour occurred in the enhancement results in the second row of Figure 4e,f. We therefore set the down-sampling ratio to 1/4 in all further experiments. We used box filtering for down-sampling and bicubic interpolation for up-sampling. Moreover, the down-sampling approach helps with noise suppression through spatial smoothing.
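The resampling steps can be sketched as follows; box down-sampling matches the text, while nearest-neighbour up-sampling stands in for the bicubic interpolation used in the paper to keep the example dependency-free:

```python
import numpy as np

def box_downsample(img, r=4):
    """Box-filter down-sampling by an integer ratio r (the paper uses 1/4):
    average each r x r block."""
    h = img.shape[0] - img.shape[0] % r   # crop to a multiple of r
    w = img.shape[1] - img.shape[1] % r
    img = img[:h, :w]
    return img.reshape(h // r, r, w // r, r, *img.shape[2:]).mean(axis=(1, 3))

def upsample_nearest(img, r=4):
    """Nearest-neighbour up-sampling (a simplification of bicubic)."""
    return img.repeat(r, axis=0).repeat(r, axis=1)
```

Estimating t(x) and A on the small image and up-sampling only the transmission map is what cuts the per-pixel cost by roughly the square of the ratio.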

Robust Atmospheric Light Estimation
He et al. [5] estimated atmospheric light A using the original patch-wise DCP, which is robust because it ignores small white objects through the large patch size (e.g., 15 × 15). In addition, Liu et al.'s method [19] segments the sky region and uses the average value of that region as atmospheric light A. In our proposed method, because the dark-channel image is calculated for each pixel, white regions (represented by the blue '+' mark in Figure 5) are misinterpreted as atmospheric light A. On the basis of Figure 5, we found that our proposed method cannot use He et al.'s [5] approach directly. In addition, their method requires extra computation time for sorting to find the top 0.1% brightest pixels in the dark channel of haze image I. To solve this problem, we propose a method that robustly estimates atmospheric light A by using a coarse-to-fine search strategy. Figure 6 shows the flow of this strategy. The resolution of the dark-channel image is first reduced step by step, and the position of the largest dark-channel value is obtained at the lowest resolution; the position is then recalculated at the second-lowest resolution, and this refinement is repeated until the original image size is reached. In Figure 5, the red '×' mark (coarse-to-fine search strategy) is the correctly estimated atmospheric light A.

Figure 5. Effectiveness of coarse-to-fine search strategy. The blue '+' mark is the result of atmospheric-light estimation from the pixel-wise dark-channel image without the coarse-to-fine strategy; the red '×' mark is the result with the coarse-to-fine strategy.
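A minimal sketch of the coarse-to-fine search (our own reconstruction of the flow in Figure 6, using a two-times box reduction per step):

```python
import numpy as np

def coarse_to_fine_argmax(dark, levels=3):
    """Coarse-to-fine search for the position of the largest dark-channel value.
    Builds a box-filtered pyramid, takes the argmax at the coarsest level,
    then refines the position inside the covered 2x2 block at each finer level."""
    pyr = [dark]
    for _ in range(levels):
        d = pyr[-1]
        h, w = d.shape[0] // 2 * 2, d.shape[1] // 2 * 2
        pyr.append(d[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    y, x = np.unravel_index(np.argmax(pyr[-1]), pyr[-1].shape)  # coarsest argmax
    for lvl in range(levels - 1, -1, -1):
        block = pyr[lvl][2 * y:2 * y + 2, 2 * x:2 * x + 2]
        dy, dx = np.unravel_index(np.argmax(block), block.shape)
        y, x = 2 * y + dy, 2 * x + dx
    return y, x
```

Because each coarse pixel averages a whole block, an isolated bright pixel (a small white object) is diluted at coarse levels, and the search is steered towards large bright regions such as the sky.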

Results and Discussion
In this section, we compare our method with conventional methods (Tarel et al. [6], He et al. [5] and Cai et al. [10]) through qualitative visual evaluation and quantitative evaluation. For Ref. [10], we used the trained network provided by [20]. We used haze and haze-free images downloaded from the Flickr website [21] (all collected images are public domain or Creative Commons Zero licensed) and MATLAB source codes [20,22,23].
Initially, we generated five uniform and nonuniform haze images (Figures 7b and 8c) from haze-free images (Figure 7a) by applying Equation (1). For these simulations, we had to set the transmission map, which we set to uniform and nonuniform medium transmissions. For the uniform case, the medium transmission t was set directly to 0.5. For the nonuniform case, we set the depth by manually segmenting four to five classes for each image (Figure 8b) and fixing the depth for each class; we then determined the medium transmission t(x) by applying Equation (2). In the quantitative evaluation, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) [24] are calculated as

PSNR = 10 log_10 ( MAX^2 / ( (1/(HWK)) Σ_x (G(x) − J(x))^2 ) ),

SSIM = ( (2 µ_G µ_J + C_1)(2 σ_GJ + C_2) ) / ( (µ_G^2 + µ_J^2 + C_1)(σ_G^2 + σ_J^2 + C_2) ),

where H, W and K are the image height, width and number of colour channels, respectively; G and J are the ground-truth image and haze-removal result, respectively; MAX is the maximum possible value of the ground-truth image; µ_G and µ_J are Gaussian-weighted averages of G and J, respectively, within a local patch; σ_G and σ_J are the standard deviations of G and J, respectively, within the local patch; σ_GJ is the covariance of G and J within the local patch; and C_1 (set to 0.01^2) and C_2 (set to 0.03^2) are small constants. Secondly, we compared the proposed method with the conventional methods on actual haze images as a qualitative visual evaluation. Finally, we show a comparison of computation time by method and image size.
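The two metrics can be sketched as follows; note that this SSIM uses a single global window rather than the Gaussian-weighted local patches of [24], so it only approximates the reported metric:

```python
import numpy as np

def psnr(G, J, max_val=1.0):
    """PSNR over all H*W*K values: 10 log10(MAX^2 / MSE)."""
    mse = np.mean((G - J) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(G, J, max_val=1.0):
    """Single-window SSIM (simplification of the local, Gaussian-weighted form)."""
    C1, C2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    mu_g, mu_j = G.mean(), J.mean()
    var_g, var_j = G.var(), J.var()
    cov = ((G - mu_g) * (J - mu_j)).mean()
    return ((2 * mu_g * mu_j + C1) * (2 * cov + C2)) / \
           ((mu_g ** 2 + mu_j ** 2 + C1) * (var_g + var_j + C2))
```

Higher is better for both; SSIM is bounded above by 1, attained when G and J are identical.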
In the qualitative evaluation of results with either the uniform or the nonuniform medium transmission, Figures 7 and 8 show that both our proposed method (Figures 7g and 8h) and He et al.'s method [5] (Figures 7d and 8e) obtain highly accurate haze-removal images that are indistinguishable from the original haze-free images. Cai et al.'s method [10] also obtains highly accurate haze-removal images in outdoor scenes such as cityscape and landscape images (Figures 7e and 8f). However, Cai et al.'s method [10] cannot remove haze in underwater scenes because underwater images are not included in its training data. The results of the pixel-wise DCP without normalisation (Figures 7f and 8g) are darker than the original haze-free images (Figure 7a).
In the quantitative evaluation with the uniform setting in Table 1, our proposed method obtains the highest PSNR and SSIM values among the compared methods if an appropriate value of γ is selected. In the case of uniform medium transmission t, min_{y∈Ω}(min_{c∈{r,g,b}}(I^c(y)/A^c)) is close to 1 − t because min_{y∈Ω}(min_{c∈{r,g,b}}(J^c(y)/A^c)) is close to 0, and max_{y∈Ω}(min_{c∈{r,g,b}}(I^c(y)/A^c)) is close to 1 because max_{y∈Ω}(min_{c∈{r,g,b}}(J^c(y)/A^c)) is close to 1 in Equation (8). As a result, the appropriate value of γ is close to 1 when ω equals 1. From Table 1, the proposed method obtains the best results when γ is set to a large value. On the other hand, in the quantitative evaluation with the nonuniform setting shown in Table 2, some results from He et al.'s method [5] achieved better performance than the proposed method. The main reason is that it is not easy to estimate an appropriate γ in the case of nonuniform medium transmission t(x) because it depends on the haze scene. How to automatically determine an appropriate value from the distribution of haze in the scene is our future work.

Table 1. Quantitative evaluation with PSNR and SSIM [24] for simulated haze images generated using a uniform transmission map (t = 0.5). The first row in each cell is the PSNR value and the second row is the SSIM value.

We used the paired t-test to verify whether performance differences between the proposed method and state-of-the-art methods are statistically significant. The test results are summarised in Table 3, where statistically significant differences (p < 0.05) are indicated by "Yes" and others by "No". As shown in Table 3, the proposed method outperformed Tarel et al.'s method [6] and the pixel-wise DCP (γ = 0) in both the uniform and nonuniform medium-transmission cases.
On the other hand, the proposed method outperformed He et al.'s method [5] and Cai et al.'s method [10] only in the uniform setting; there is no significant difference in the nonuniform setting.

Table 2. Quantitative evaluation with PSNR and SSIM [24] for simulated haze images generated using a nonuniform transmission map. The first row in each cell is the PSNR value and the second row is the SSIM value.

Figure 9 shows that our haze-removal method produced good results on actual haze images. Closer qualitative evaluation confirms that the images processed by our proposed method (Figure 9f) are visually similar to those obtained by He et al.'s method [5] (Figure 9c). The results of the pixel-wise DCP without normalisation (Figure 9e) are again unnaturally darker than those obtained by He et al.'s method [5] (Figure 9c) and our proposed method (Figure 9f). Furthermore, although Tarel et al.'s method [6] obtained clearer haze-removal results for the pumpkin, bridge and townscape images than our proposed method, our evaluation confirmed that the colours of the park, bridge and townscape images changed from those of the original haze images. We also noted halo effects in the train image, and Tarel et al.'s method does not work well on the underwater image. Cai et al.'s method [10] (Figure 9d) can remove haze more naturally than the other methods; in particular, it removes haze uniformly in the sky region with more natural colour. However, it also does not work well on the underwater image. Figure 10 shows the computation time for each image size and method on an Intel Core i7-5557U (3.1 GHz, 2 cores, 4 threads) without GPU acceleration and with 16 GB of main memory. All methods are implemented in MATLAB. The conventional methods take several tens of seconds and cannot achieve real-time calculation.
However, our proposed method achieves real-time calculation for image sizes up to 1024 × 680 pixels.
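The paired t-test used for Table 3 can be reproduced with scipy; the per-image scores below are hypothetical placeholders, not values from the paper:

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical per-image PSNR scores (NOT values from the paper) for the
# proposed method and a baseline, measured on the same five test images.
psnr_proposed = np.array([22.1, 24.3, 21.8, 23.5, 25.0])
psnr_baseline = np.array([20.5, 22.0, 21.9, 21.1, 23.2])

# Paired test: the pairing matters because both methods see the same images.
t_stat, p_value = ttest_rel(psnr_proposed, psnr_baseline)
significant = bool(p_value < 0.05)   # the "Yes"/"No" criterion of Table 3
```

The pairing removes per-image difficulty from the comparison, which is why the paired t-test (rather than an unpaired one) is appropriate here.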

Conclusions
In this paper, we proposed a haze-removal method using a normalised pixel-wise DCP. We also proposed fast transmission-map estimation by down-sampling and robust atmospheric-light estimation using a coarse-to-fine search strategy. Experimental results show that the proposed method achieves haze removal with acceptable accuracy and greater efficiency than conventional methods. The advantage of the proposed method is its fast computation with acceptable visual quality compared with state-of-the-art methods. Its disadvantage is that the user must set an appropriate γ manually for each haze scene; how to systematically determine the appropriate γ value from the distribution of haze in the scene is our future work. In addition, we plan to apply the method to real applications such as automatic driving, underwater robots and remote sensing [25].