Noise Removal in the Developing Process of Digital Negatives

Most modern color digital cameras are equipped with a single image sensor with a color filter array (CFA). One of the most important stages of preprocessing is noise reduction. Most research related to this topic ignores the actual color image acquisition process and assumes that the image is processed in the sRGB space. In this paper, the real process of developing raw images obtained from a CFA sensor is analyzed. As part of the work, a diverse database of test images was prepared, each in the form of a digital negative and its reference version. The main problem posed in the work is the placement of the denoising and demosaicing algorithms in the raw image processing pipeline. For this purpose, all stages of processing the digital negative are reproduced. The process of noise generation in image sensors was also simulated, parameterized by ISO sensitivity for a specific CMOS sensor. We tested commonly used algorithms based on the idea of non-local means, such as NLM and BM3D, in combination with various techniques for interpolating CFA sensor data. Our experiments have shown that applying noise reduction directly to the raw sensor data improves the final result only for highly disturbed images, which corresponds to image acquisition in difficult lighting conditions.


Introduction
The vast majority of today's digital cameras use a single sensor with a color filter array (CFA). It is therefore necessary to interpolate the missing pixels using so-called demosaicing algorithms to obtain a complete RGB image. For the color filter array, the Bayer pattern remains the most popular one [1], and most of the demosaicing techniques are designed to interpolate such data. Figure 1 shows the arrangement of color filters in a Bayer mosaic, the image captured by the CFA sensor, and its RGB visualization.
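The Bayer sampling described above is easy to simulate: given a full RGB image, a single-sensor capture keeps only the one color sample per pixel that an RGGB Bayer sensor would record. The following sketch illustrates this (the function and variable names are ours, chosen for illustration only):

```python
import numpy as np

def bayer_masks(h, w):
    """Boolean masks of the R, G, and B sites of an RGGB Bayer pattern."""
    r = np.zeros((h, w), bool)
    g = np.zeros((h, w), bool)
    b = np.zeros((h, w), bool)
    r[0::2, 0::2] = True      # red on even rows, even columns
    g[0::2, 1::2] = True      # green shares every row with red or blue
    g[1::2, 0::2] = True
    b[1::2, 1::2] = True      # blue on odd rows, odd columns
    return r, g, b

def mosaic(rgb):
    """Simulate a single-sensor CFA capture: keep one color sample per pixel."""
    h, w, _ = rgb.shape
    r, g, b = bayer_masks(h, w)
    cfa = np.zeros((h, w), rgb.dtype)
    cfa[r] = rgb[..., 0][r]
    cfa[g] = rgb[..., 1][g]
    cfa[b] = rgb[..., 2][b]
    return cfa
```

Demosaicing is the inverse problem: recovering the two missing color samples at every pixel of the mosaic.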
The design of efficient demosaicing algorithms is important because almost all modern digital color cameras use such an algorithm to produce color images. Although, in the case of modern image sensors with very high resolution, even the simplest methods can give satisfactory results, researchers around the world continue to develop improved demosaicing algorithms, ranging from the simplest bilinear interpolation to very sophisticated ones, such as the iterative color stencils technique presented by Getreuer [2] and machine-learning techniques presented by Khashabi et al. [3].
Unfortunately, most interpolation algorithms are not immune to image sensor noise. Interpolation errors are particularly evident in the presence of impulsive noise, which causes unpleasant and often difficult to remove artifacts in the final image. Fortunately, impulsive noise in modern imaging sensors is quite easy to remove at an early stage of image processing. The problem of noise in imaging sensors has been dealt with in different ways over the years. The simplest approach is to use standard denoising methods, such as VMF (Vector Median Filter) and its modifications, NLM (Non-Local Means), BM3D (Block-Matching and 3D Filtering), etc. [4][5][6][7][8][9]. Another option is to adapt noise reduction algorithms to work with raw data and remove the noise before interpolating the CFA data [10][11][12][13]. Yet another approach is to combine noise reduction and interpolation into one process: joint denoising and demosaicing [3,14,15].
The standard filtering quality assessment procedure assumes the use of a set of sRGB test images with 8-bit color depth for each channel. These images are distorted by simulated noise, the distribution of which usually does not fully correspond to the processes occurring in the real imaging sensors. When demosaicing algorithms or the entire CFA image processing pipeline are tested, the test sRGB images are converted to a simulated Bayer image.
The real problem is slightly different: the data from the sensor are not in the sRGB space but in the linear space of the sensor, usually with 12 or 14 bit depth. The noise associated with the image acquisition process should therefore be simulated directly for raw sensor data, i.e., where it occurs. When processing digital negatives, this noise is subject to a number of nonlinear transformations, as a result of which the noise in the final image may have a significantly altered distribution. This can adversely affect the performance of various filtering techniques, especially those, such as NLM or BM3D, that were designed for a specific noise model (most often Gaussian). Moreover, many commonly used test images were obtained using CFA sensors and may contain interpolation artifacts, and often contain significant noise. These images also usually have a limited bit depth compared to real raw sensor data. It seems, therefore, that such data should not be used either to test the entire process of developing digital negatives or to test its individual stages. Therefore, a new test image database has been prepared with high-quality reference images, raw images, and disturbed raw images. Given the noise model described above, it can be assumed that the filtering process should also be carried out on raw data and not, as in the traditional approach, on sRGB images at the end of the digital image development process.
In this article, we try to verify this assumption; that is, we want to answer the question about the proper sequence of operations performed during the development of digital negatives, and more specifically about the proper placement of the interpolation and filtering processes. As part of the experiment, tests were carried out for various demosaicing and filtering methods applied in different orders. The effectiveness of the examined solutions was evaluated using objective quality measures, and the statistical significance of the obtained results was assessed. The results of our tests can help in choosing the most effective approach to developing raw images, both directly in digital cameras and in postprocessing software. The paper is organized as follows. Section 2 presents general digital negative processing pipelines and their potential impact on the final results; it also explains problems related to the objective evaluation of the quality of developed raw images. Section 3 describes how the test data were prepared: ground truth images, noisy images, and images for the individual processing steps. Section 4 describes an experiment assessing the impact of placing noise filtering at different stages of the raw image processing pipeline. Section 5 concludes our results.

Raw Image Processing Pipelines
The process of developing digital negatives (DNG) starts with the output of the CFA sensor: a mosaic image registered in grayscale with a depth of 12 or 14 bits. The result of this process is usually a color image in sRGB or another space with a wider gamut, such as Adobe RGB. In the process of raw image development, a number of operations are performed, such as linearization, correction of sensor defects, white balancing, demosaicing, and denoising (Figure 2). These operations may be performed in different sequences, but noise reduction is usually one of the last steps of the raw image processing pipeline. It can be supposed that this approach is not optimal and that sensor noise filtering should be done before the original distribution of the sensor noise is altered by other raw image processing steps (Figure 3). However, to evaluate the real impact of the chosen filtering scheme, it is necessary to assess the quality of the results.
The standard approach is still widely used, but it has serious drawbacks:
• demosaicing algorithms change the noise distribution and introduce additional artifacts, especially in the presence of impulsive noise;
• there are usually more data to process: three highly correlated channels compared to one grayscale mosaic image; and
• raw image processing may cause some data loss before the denoising process.
However, the classical methods are supported by the fact that most of the available noise reduction algorithms are designed to filter RGB images.
The process of evaluating the effectiveness of classical filtering techniques is usually based on a very simple quality evaluation model (Figure 4). This approach usually assumes the use of a standard set of test images and disrupting them with a known (usually very simple) noise model. The ideal solution would be to use real distorted images, e.g., obtained at a high ISO value, together with their equivalents captured at the minimum value. This approach was used by the authors of [16], who applied advanced techniques for matching ground truth images to their noisy counterparts. In this case, however, the noise level of the ground truth images is still quite high.


Generating A Set of Test Images
In our work, we want to analyze the effectiveness of filtering methods used at different stages of the image acquisition pipeline. Therefore, we had to create test images with high bit depth, suitable for evaluating the entire process of digital negative development. The proposed test image generation process uses high-quality real raw images obtained from various digital cameras.
First, the "perfect" images, the so-called ground truth (GT), are created; on their basis, images simulating the actual acquisition process are prepared (Figure 5).

Downsampling Real Raw Images
To obtain high-quality test images, the method presented in [3] was adapted. The method assumes the use of high-quality raw images and subjects them to significant downsampling with maximum-entropy averaging. The color pixel values are determined by averaging the corresponding values from the raw image within a downsampling window; an example of such a window for the Bayer array is shown in Figure 6, where the window size W = 2 means that it spans two Bayer mosaic patterns in each direction. Averaging inside the window and assigning the calculated values to a specific channel is performed using the mask p_c, where c ∈ {r, g, b}. The simplest solution is naive averaging (Figure 7), but this approach causes color shifts in the resulting image. To compensate for the shift inside the color channels, the weights inside the masks must be chosen so that their center of gravity lies in the middle of the window. Therefore, for each color channel c ∈ {r, g, b}, the following maximization problem should be solved:

$$\max_{p_c} \; H(p_c) = -\sum_{i,j} p_c(i,j)\,\ln p_c(i,j), \qquad (1)$$

subject to

$$\sum_{i,j} p_c(i,j) = 1, \qquad (2)$$

$$\sum_{i,j} i\, p_c(i,j) = W + 0.5, \qquad (3)$$

$$\sum_{i,j} j\, p_c(i,j) = W + 0.5, \qquad (4)$$

$$p_c(i,j) \ge 0, \qquad (5)$$

$$p_c(i,j) = 0 \quad \text{for } (i,j) \notin C_c. \qquad (6)$$

Constraint (2) ensures that the weights add up to one. The pair of constraints (3) and (4) ensures that the spatial center of gravity lies in the center of the window, because for the dimensions 2W × 2W the center point lies at (W + 0.5, W + 0.5). All weights in the mask should be non-negative, which gives constraint (5). Constraint (6) ensures that the p_c mask weights correspond to the spatial distribution of the color-c sites in the CFA filter, i.e., the weights of pixels that do not belong to channel C_c are zero. These constraints admit many different weight distributions; however, the most desirable one is the least concentrated around any specific value. The measure of this concentration is entropy, which is the objective function of the maximization problem (1).
For the analyzed case of the 4 × 4 window (W = 2), the masks that solve problems (1)-(6) take values as shown in Figure 8.
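As a baseline for comparison with the entropy-maximizing masks, the naive averaging of Figure 7 can be sketched as follows (uniform weights over the same-color sites of each window; the function name is ours, for illustration):

```python
import numpy as np

def downsample_naive(cfa, W=2):
    """Naively average each Bayer color inside non-overlapping 2W x 2W
    windows of an RGGB mosaic. These uniform weights are the simple baseline
    the entropy-maximizing masks improve on: they do not keep the spatial
    center of gravity of each color in the window center, so color shifts
    can appear in the result."""
    h, w = cfa.shape
    s = 2 * W
    # Boolean masks of the RGGB sites
    r = np.zeros((h, w), bool)
    g = np.zeros((h, w), bool)
    b = np.zeros((h, w), bool)
    r[0::2, 0::2] = True
    g[0::2, 1::2] = True
    g[1::2, 0::2] = True
    b[1::2, 1::2] = True
    out = np.zeros((h // s, w // s, 3))
    for c, m in enumerate((r, g, b)):
        # Sum the same-color samples per window, then divide by their count
        vals = np.where(m, cfa, 0.0).reshape(h // s, s, w // s, s).sum(axis=(1, 3))
        cnt = m.reshape(h // s, s, w // s, s).sum(axis=(1, 3))
        out[..., c] = vals / cnt
    return out
```

Replacing the uniform per-color weights with the masks solving (1)-(6) yields the shift-free downsampling used for the ground truth images.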

Impulsive Noise Problem-Raw Spatial Median Filter
Images obtained in the downsampling process described above are free of demosaicing artifacts; in addition, the dominant sensor noise components (shot noise and read noise) are significantly reduced. However, even high-end imaging sensors may have defective pixels causing impulsive noise, which can interfere with the downsampling process by creating color artifacts in the resulting image. We can prevent this by using appropriate raw image data filtering techniques. The most effective method is to use a dark frame to detect faulty pixels and then replace them using spatial interpolation. However, if dark frame data are not available, we use a simple modification of the Spatial Median Temporal Mean (SMTM) filter proposed in [17], reduced to a single raw image: the Raw Spatial Median filter (RAWSM). During downsampling, for each pixel g_i inside the 2W × 2W window, the indicator η_i is calculated within the pixel's own color channel of the CFA filter (g_i ∈ g_c ∈ C_c, c ∈ {r, g, b}):

$$\eta_i = |g_i - m_c|, \qquad (7)$$

and the pixel g_i is considered defective when

$$\eta_i > T\,\sigma(g_c), \qquad (8)$$

where m_c is the median value used in Equation (7), T is the tuning parameter of the RAWSM method, and σ(g_c) is the standard deviation of the pixel values g_c. The per-channel evaluation ensures that the assessment of pixels of different colors is separable: whether a "blue" pixel is damaged or not is not influenced by the values of the adjacent "red" pixels. A defective pixel is replaced by the median of the pixels belonging to the same color channel. In our work, taking into account the relatively low level of impulsive noise in CMOS sensors, the suboptimal value T = 4 was selected. This filtering method was proposed for use during the downsampling process, but it is universal and can also be used to filter raw data when processing digital negatives.
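A minimal sketch of a RAWSM-style defective-pixel test for the same-color samples of one window, assuming the rule that a pixel is defective when its deviation from the window median m_c exceeds T times the standard deviation of g_c (as described in the text; the exact indicator form is our reconstruction):

```python
import numpy as np

def rawsm_window(gc, T=4.0):
    """Median-based defective-pixel repair for the same-color samples gc of
    one downsampling window. A pixel whose deviation from the window median
    m_c exceeds T * std(gc) is declared defective and replaced by m_c."""
    gc = np.asarray(gc, float)
    m_c = np.median(gc)
    s = np.std(gc)
    if s == 0.0:
        return gc                       # perfectly uniform window: nothing to do
    eta = np.abs(gc - m_c)              # deviation indicator per pixel
    return np.where(eta > T * s, m_c, gc)
```

With T = 4, as in the paper, only strong outliers such as hot pixels are replaced, while ordinary noisy samples pass through untouched.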

The Process of Preparing Ground Truth Images
Two digital cameras, a Canon EOS 500D and an EOS 600D, were used to obtain high-quality raw test images in the proprietary CR2 file format. The acquisition was carried out at the minimum sensitivity setting (ISO 100) and in good lighting conditions. The processing was performed using the Adobe DNG converter and modified Matlab scripts suggested in [18]. The full process of sRGB ground truth image creation is described below:
• reading raw files,
• impulsive noise removal (using a dark frame or the RAWSM filter),
• color space correction,
• brightness and contrast adjustment.
A collection of 48 virtually noise-free sRGB images was obtained: 25 downsampled EOS 600D photos of size 432 × 288 (16 bpp) and 23 downsampled EOS 500D photos of size 396 × 264 (16 bpp). The downsampling performed with the maximum entropy method enabled the complete elimination of artifacts associated with the Bayer filter matrix and almost completely removed the sensor noise. In addition, "perfect" images for the different stages of the raw image processing pipeline were prepared. Finally, four groups of ground truth test images were prepared (the GT test images are available at https://tiny.pl/t6q48).

Synthetic Noise Model
The next step in evaluating the quality of digital negative development is simulating the noise of a real CMOS sensor. The general model of CMOS sensor noise can be presented in the additive form

$$z(x) = y(x) + \sigma(y(x))\,\xi(x), \qquad (9)$$

where σ(y(x)) is the standard deviation of the noise at point x and ξ denotes additive white Gaussian noise (AWGN). Foi et al. [19] assume that the sensor noise consists of two independent components: a signal-dependent Poissonian component η_p and a Gaussian component η_g that is independent of the signal:

$$\sigma(y(x))\,\xi(x) = \eta_p(y(x)) + \eta_g(x). \qquad (10)$$

The variance of the Poissonian component η_p is a linear function of the signal y: var{η_p(y(x))} = a y(x), where a is the Poisson distribution parameter. The variance of the Poisson noise increases with the value of the signal, while the variance of the Gaussian component is constant and equal to b. Thus, the total variance can be written as σ²(y(x)) = a y(x) + b, and the general noise model takes the form

$$z(x) = y(x) + \sqrt{a\,y(x) + b}\;\xi(x). \qquad (11)$$

This model contains two parameters that can be identified from data obtained from real imaging sensors at various ISO values [19]. Using a model fitted to the actual device yields noise corresponding to the disturbances occurring during real image acquisition. This makes it possible to test filtering algorithms in different real-world scenarios, e.g., performance depending on ISO sensitivity.
The use of such a noise model enables effective development and testing of algorithms in a near-target environment with appropriately selected interference levels (they are neither extremely low nor unnaturally high).
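Sampling from the Poissonian-Gaussian model above is straightforward once a and b are known. A minimal sketch (the function name and the example parameter values used in the test are ours, not the fitted Canon EOS 500D values):

```python
import numpy as np

def add_sensor_noise(y, a, b, rng=None):
    """Corrupt a clean raw signal y (normalized to [0, 1]) with noise of
    variance sigma^2(y) = a*y + b: the signal-dependent term a*y models
    shot noise, the constant term b models read noise."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, float)
    sigma = np.sqrt(np.clip(a * y + b, 0.0, None))
    return y + sigma * rng.standard_normal(y.shape)
```

Fitting a and b per ISO setting, as in [19], then reproduces the noise level of a concrete sensor for a chosen acquisition scenario.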
To more accurately reflect the real distortions in digital images, the noise model (11) can be extended by an impulsive noise component i:

$$z(x) = y(x) + \sqrt{a\,y(x) + b}\;\xi(x) + i(x). \qquad (12)$$

In this model, impulsive noise is understood as a faulty, often "hot", pixel, which can take different values, but whose value is approximately constant for a given pixel. Such a noise component can be added based on dark frames obtained from various real image sensors [20]. In our case, we did not analyze impulsive noise, because in modern imaging sensors it is relatively weak and easy to remove effectively from the raw images; nevertheless, impulsive noise data obtained from a real camera were also included in our data set (the noisy images are available at https://tiny.pl/t6xtj). The model described by Equation (11) corresponds to the physical phenomena in imaging sensors that cause the dominant types of noise: shot noise and readout noise. However, realistic values of the model parameters must be determined; in our case, we used the procedure proposed by Foi et al. [19]. In our study, the noise model parameters were determined for the Canon EOS 500D at four ISO values (100, 800, 1600, and 3200); the obtained parameter values are shown in Figure 11. This model can be used to generate data for both monochrome sensors and those using a Bayer filter, but it must be applied to raw data, because the noise/signal variance dependence was determined for such data.

The Assumptions of the Experiment and the Input Data
Over many years of work on image filtering, many noise reduction algorithms have been developed, some of which also have versions adapted to raw data processing. For our research, we chose well-known and effective algorithms based on the idea of non-local means: NLM [21] and BM3D [8].
Both algorithms are often used as a reference in filtering efficiency testing, and BM3D is still one of the most effective algorithms for filtering Gaussian-like noise. There are also many modifications of these algorithms that significantly improve their speed. Although both NLM and BM3D have been adapted to work with CFA data, we also decided to test the solution presented in [22]: the pseudo four-channel filtering technique (P4Ch). This method is based on the decomposition of the CFA image into four smaller sub-images (pseudo-channels) corresponding to the pixel positions in the Bayer pattern. The channels are then decorrelated using PCA, and the resulting channels can be filtered with any technique designed for grayscale images. In the last step, the transformations are reversed to obtain a filtered CFA image. Four different variants of the P4Ch sub-images are possible (RGGB, GRBG, BGGR, and GBRG), so the P4Ch filtering operation is repeated four times and the final CFA result is obtained by averaging the values obtained for all variants.
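The first and last steps of the P4Ch scheme, splitting the Bayer mosaic into four half-resolution sub-images and merging them back, can be sketched as follows (the PCA decorrelation and the grayscale filtering in between are omitted; function names are ours):

```python
import numpy as np

def cfa_to_subimages(cfa):
    """Split a Bayer CFA image into four half-resolution sub-images,
    one per position in the 2x2 pattern (e.g., R, G1, G2, B for RGGB)."""
    return [cfa[0::2, 0::2], cfa[0::2, 1::2],
            cfa[1::2, 0::2], cfa[1::2, 1::2]]

def subimages_to_cfa(subs, shape):
    """Reassemble the four sub-images into a full CFA mosaic."""
    cfa = np.empty(shape, dtype=subs[0].dtype)
    cfa[0::2, 0::2], cfa[0::2, 1::2] = subs[0], subs[1]
    cfa[1::2, 0::2], cfa[1::2, 1::2] = subs[2], subs[3]
    return cfa
```

Shifting the decomposition origin by one pixel horizontally and/or vertically yields the other three variants (GRBG, BGGR, GBRG), whose filtered results are averaged in the full P4Ch scheme.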
Another issue is the choice of demosaicing algorithms; in our case, we selected two: the Adams-Hamilton algorithm (AH) [23] and Self-similarity driven demosaicing (SSDD) [24]. The first is quite popular because of its relative simplicity and high efficiency, while the second was created as a development of the idea behind the NLM algorithm and gives excellent results, which, however, come at the price of high computational complexity.
During our research, we tested various filtering and demosaicing scenarios. We performed tests at different noise levels for all images in our data set. In addition, each image was corrupted independently ten times to reduce the uncertainty caused by the use of a pseudo-random number generator. In the final comparison, we used the following methods:
• classical color image denoising filters:
- Non-Local Means (NLM) [21],
- Color Block-Matching and 3D Filtering (CBM3D) [8],
• direct raw denoising filters [25],
• the pseudo four-channel filtering technique (P4Ch) [22], and
• demosaicing techniques:
- the Adams-Hamilton algorithm (AH) [23],
- Self-similarity driven demosaicing (SSDD) [24].
Noisy raw images are the starting point for our tests, and for their processing, it is necessary to use an appropriate combination of filtering and demosaicing; the appropriate processing sequences are shown in Table 1. Due to the specificity of the tested algorithms, we divided them into three groups: standard filters, pseudo-four channel filters (P4Ch), and direct CFA filters.

Results
Tables 2 and 3 show the mean peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) results, as well as the standard deviations, for the whole image test set. The results in the tables are color-coded and grouped according to the classification used in Table 1; the best results are highlighted in green. The results obtained for the whole data set, i.e., for all images and all ISO sensitivities (noise levels), are visualized as boxplots (Figure 12). From these results, it is difficult to draw general conclusions about the whole set of images and all noise levels. The average results for the best classical methods are similar to those achieved by filters working on raw data from the Bayer matrix. It is therefore not possible to answer the general question posed at the beginning of this work: which of the processing schemes gives the best results. However, analyzing the results for the individual ISO sensitivities, we see that as the noise level increases, the more advanced raw filtering methods gain a noticeable advantage, which is also visible in the boxplots presented in Figure 13.
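The PSNR values reported here follow the standard definition; for reference, a minimal implementation for images normalized to a given peak value:

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in decibels between a reference image
    and a distorted one; higher is better."""
    ref = np.asarray(ref, float)
    img = np.asarray(img, float)
    mse = np.mean((ref - img) ** 2)
    if mse == 0.0:
        return float("inf")             # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

SSIM, the second measure used, is structurally more involved (local means, variances, and covariances); standard implementations are available in common image processing libraries.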
Another method of analysis of the tested results may be a kind of rank assessment-grouping the compared techniques by the number of files for which a given method wins. Table 4 groups the results obtained for different ISO sensitivities using the PSNR indicator.
The results confirm the advantage of methods based on direct raw data filtering over classical methods for higher noise levels. Table 4. Number of "wins" for individual filtering schemes according to the achieved value of PSNR coefficient.

According to this comparison, for high noise levels (ISO 1600 and 3200), the best results are obtained by the most advanced P4Ch BM3D filtering technique combined with SSDD interpolation. Only slightly worse results, obtained much faster and thus more useful in practice, are given by the variant using the Adams-Hamilton demosaicing algorithm. Moreover, when the classical methods as a group are compared against the methods that perform filtering before the interpolation stage, the advantage of the latter becomes even more visible.
We also examined the results for individual files, choosing two images, dog and bird. The first is quite a challenge for filtering and demosaicing algorithms (its results often represent outliers), while the second gives results typical for our data set. The corresponding PSNR box plots obtained for these images, for all ISO sensitivities and for ISO 1600, are shown in Figures 14 and 15 (for better readability, these graphs omit the data for the demosaicing operation alone, without filtering). Examples of visual filtering results for noise corresponding to ISO 1600 are shown in Figures 16 and 17. Analyzing the numerical results and images obtained for the test image "dog", we can assume that in this case the demosaicing process is more difficult than the filtering, which is indicated, among other things, by the large advantage of the methods using the advanced SSDD algorithm. Nevertheless, in this case, for higher noise levels, better results are obtained by placing the filtering process before the interpolation. Note that, in terms of the structural quality coefficient (SSIM), the benefits of using direct raw image filtering methods may be greater than for the PSNR indicator. Despite the relatively small set of test images, we tried to make it highly diverse. The test set included images with a large number of details, textural data, and images containing uniform surfaces. The images differ in color, in terms of both color temperature and saturation. In our simulations, we tried to recreate as faithfully as possible the natural process of noise generation during image acquisition; additionally, each image was disturbed by 10 modeled noise realizations. As can be seen in the boxplots in Figures 14b and 15b, the influence of the randomness of the noise realizations on the variability of the obtained results is relatively small.
Analyzing the obtained results, especially the boxplots (Figures 12 and 13), it can be seen that they have a large range of variability across images and noise levels, and it is difficult to assess whether the differences are statistically significant. To assess the statistical significance of the obtained results, a two-sample t-test was carried out.
For simplicity, we have divided our algorithms into two classes: classical algorithms (demosaic, then denoise) and algorithms dedicated to CFA data filtering (denoise, then demosaic). We have compared the results for the best classical and CFA filtering methods, grouping the results according to ISO values. In all comparisons, we assumed the null hypothesis with equal average quality coefficients (PSNR or SSIM) with a significance level of 5%. The alternative hypothesis, on the other hand, assumed that the average quality index for the winning filter is higher than the second one. Moreover, we assumed that the compared sets may have different variances.
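The comparison described above can be sketched as a Welch-type two-sample test (unequal variances, one-sided alternative). To keep the sketch dependency-free, the p-value below uses a normal approximation to the t distribution, which is accurate for the large per-group sample sizes used in such experiments; function and variable names are ours:

```python
import math
import numpy as np

def welch_one_sided(x, y):
    """Welch two-sample t-test with the one-sided alternative
    mean(x) > mean(y). Returns the t statistic, the Welch-Satterthwaite
    effective degrees of freedom, and an approximate p-value (standard
    normal approximation of the t distribution)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    nx, ny = len(x), len(y)
    vx, vy = x.var(ddof=1) / nx, y.var(ddof=1) / ny
    t = (x.mean() - y.mean()) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    p = 0.5 * math.erfc(t / math.sqrt(2.0))   # P(Z > t), normal approximation
    return t, df, p
```

Rejecting the null hypothesis at the 5% level then corresponds to p < 0.05, i.e., h = 1 in the notation used below.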
The obtained results are presented in the form of box plots for the compared methods; in each case, the better method is placed on the left of the chart. A value of h = 1 means that the obtained result is statistically significant for our data (i.e., we reject the null hypothesis).
First, we compared the results for the whole test set, i.e., all pictures and all ISO sensitivities; the results of the comparison are shown in Figure 18. As could be expected after analyzing the data in Tables 2 and 3, the test confirms that the difference between these results is not statistically significant (h = 0, p-value = 0.21). Subsequent tests were carried out for the individual noise levels (ISO values); the results are shown in Figure 19. As can be seen, in most cases the differences are statistically significant. As previously observed, for small ISO values better results were obtained by the classical filtering methods, while for higher noise levels CFA data filtering seems to be the better solution.

Conclusions
This paper analyzes the influence of the location of filtering algorithms in the digital negative development pipeline on the final image quality. A high-quality image data set was prepared, which, in the authors' opinion, made it possible to carry out tests in conditions close to real life. Our results show that the best approach to filtering CFA images cannot be clearly identified for all cases. In general, the traditional approach gives better results for images with a relatively small amount of noise, whereas as the noise increases, dedicated raw filtering techniques give increasingly better results. It should be noted that the results were obtained for a limited set of images, which may limit confidence in them; however, the diversity of the images and the clear advantage of dedicated raw methods suggest that these results can be generalized. In general, the advantage of traditional methods at small noise levels is small enough to recommend the use of the more advanced methods in all cases. The obtained results can help in choosing the best approach to raw image processing, both directly in digital cameras and in post-processing software. It seems that in low-light conditions it is crucial to carry out noise reduction before the CFA data interpolation.

Funding: The research was funded by a grant from the Silesian University of Technology (subsidy for the maintenance and development of research potential).

Conflicts of Interest:
The authors declare no conflict of interest.