Super-Resolution Reconstruction Algorithm for Infrared Image with Double Regular Items Based on Sub-Pixel Convolution

Abstract: In this paper, an adaptive dual-regularization super-resolution reconstruction algorithm based on sub-pixel convolution (MPSR) is proposed. The algorithm has two novel features. First, the traditional regularization algorithm and the sub-pixel convolution algorithm are combined to enrich image details. Second, a regularization function with two adaptive parameters and two regularization terms is proposed to enhance edges. MPSR first enhances the multi-scale details of the low-resolution image; then, regularization and feature extraction are carried out; finally, sub-pixel convolution fuses the extracted features to generate the high-resolution image. The experimental results show that the algorithm achieves satisfactory results in both subjective evaluation and the objective evaluation indexes (PSNR/SSIM).


Introduction
Infrared imaging equipment offers anti-jamming capability and strong target recognition, and can work at night and in harsh environments. It has been widely applied in industry, the military, medicine, daily life and other fields. However, because the number of pixels in the detector array of an infrared imaging system is limited and the pixel size is constrained, the spatial sampling frequency cannot satisfy the sampling theorem, and infrared diffraction also causes a loss of high-frequency detail in the image signal. In addition, the temperature range of real objects in nature is relatively small, and objects exchange heat with and absorb heat from the surrounding environment. Therefore, compared with visible-light images, infrared images have lower resolution, indistinct gray levels, a more concentrated gray distribution and less obvious texture, which falls far short of what the human eye needs in terms of resolution. To solve these problems, researchers have applied super-resolution reconstruction algorithms to infrared images.
Deep convolutional neural networks have been successfully applied to single image super-resolution reconstruction (SISR) [1]. The goal of SISR is to reconstruct a high-resolution (HR) image from a low-resolution (LR) image. It has a wide range of applications in security, monitoring, satellite imaging, medical imaging and other fields [2], and can be used as a built-in module for other image restoration or recognition tasks [3,4]. With the rapid development of intelligent algorithms, image super-resolution has made great progress. Since the introduction of deep learning, the reconstruction quality has improved markedly and new algorithms are being derived very quickly (see Figure 1).
In 2016, Shi et al. proposed a super-resolution reconstruction method based on sub-pixel convolution [7]. This method takes low-resolution images directly as input, which reduces the complexity while achieving a good reconstruction effect. However, infrared images have low resolution, indistinct gray levels, a concentrated gray distribution and unclear texture, so the algorithm in this paper improves on the sub-pixel convolution super-resolution reconstruction algorithm. The improved algorithm is more suitable for infrared images and achieves better results.
In this paper, an adaptive dual-regularization super-resolution reconstruction algorithm (MPSR) based on sub-pixel convolution is proposed. This algorithm has clear advantages in infrared image processing. MPSR first enhances the multi-scale details of the low-resolution image. An initial reconstruction is then carried out with two regularization terms and two adaptive parameters. Finally, sub-pixel convolution fuses the extracted features to generate the high-resolution image. The experimental results show that the algorithm achieves satisfactory results in both subjective evaluation and the objective evaluation indexes (PSNR/SSIM).
The rest of this paper is organized as follows. Section 2 is the detailed introduction of the algorithm proposed in this paper. Section 3 describes the content of the experiment and the analysis of the results. Section 4 is the conclusion.

Methods
Because infrared images have low resolution, indistinct gray levels, a concentrated gray distribution and unclear texture, a super-resolution reconstruction algorithm based on sub-pixel convolution that takes the low-resolution image directly as input yields an unsatisfactory reconstruction.
Therefore, the adaptive dual-regularization super-resolution reconstruction algorithm based on sub-pixel convolution (MPSR) first enhances the multi-scale details of the infrared image. In the second step, a second regularization term is added to the regularization part of the algorithm to enhance image edges, together with adaptive regularization parameters that balance the fidelity term and the two regularization terms. In the third step, features are extracted and activated. Finally, sub-pixel convolution is used to generate the high-resolution image. The block diagram of MPSR is shown in Figure 2, and the steps are described in detail in the following sections.

Multi-Scale Detail Enhancement
MPSR uses a multi-scale method to improve image details without introducing artifacts. First, the image is convolved with three Gaussian blur kernels $H_1$, $H_2$, $H_3$ of different scales to obtain the blurred images $B_1$, $B_2$, $B_3$ (Formula (1)). The blurred images are then subtracted from the original image to obtain detail information at different levels (Formula (2)). Finally, the overall detail image is generated by merging the three layers (Formula (3)). The fine detail $D_1$ expands the gray-level differences near edges but may saturate the gray levels due to excessive overshoot; to overcome this problem, the sgn function is introduced.
The values of $a_1$, $a_2$, $a_3$ and the PSNR of the reconstructed image were obtained by simulation experiments, and Figure 3 is drawn from these data. It shows that the reconstruction is best when $a_1$, $a_2$, $a_3$ are set to 0.1, 0.2 and 0.6, respectively.
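The enhancement step described above can be sketched in a few lines of NumPy. The formulas themselves are not reproduced in this text, so the sketch assumes the commonly used multi-scale detail boosting form: $D_1 = I - B_1$, $D_2 = B_1 - B_2$, $D_3 = B_2 - B_3$ and $D = (1 - a_1\,\mathrm{sgn}(D_1))D_1 + a_2 D_2 + a_3 D_3$, with the result $I + D$. The kernel scales in `sigmas` and the helper names are hypothetical; only the values 0.1, 0.2 and 0.6 come from the text.

```python
import numpy as np

def gaussian_kernel1d(sigma):
    """Normalized 1-D Gaussian kernel with radius of about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflective border handling."""
    k = gaussian_kernel1d(sigma)
    r = len(k) // 2
    padded = np.pad(img, r, mode="reflect")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def enhance_details(img, sigmas=(1.0, 2.0, 4.0), a=(0.1, 0.2, 0.6)):
    """Assumed multi-scale detail boosting; sigmas are illustrative values."""
    img = img.astype(np.float64)
    b1, b2, b3 = (gaussian_blur(img, s) for s in sigmas)
    d1, d2, d3 = img - b1, b1 - b2, b2 - b3          # fine, medium, coarse detail
    # The sgn term damps positive overshoot of the fine detail layer.
    detail = (1.0 - a[0] * np.sign(d1)) * d1 + a[1] * d2 + a[2] * d3
    return img + detail
```

A flat image has no detail and passes through unchanged, while a step edge is sharpened by overshoot on its bright side.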

Related Algorithm
The algorithm in this paper is proposed on the basis of the maximum a posteriori estimation, which is described in detail below.
The super-resolution reconstruction algorithm based on maximum a posteriori (MAP) estimation works as follows: given the sequence of low-resolution images $y$, the high-resolution image $x$ is estimated by maximizing its posterior probability (Formula (4)). Applying the Bayesian formula gives Formula (5), and taking the logarithm of the right side of Formula (5) gives Formula (6), where $\log \Pr\{y \mid x\}$ is the log-likelihood function and $\log \Pr\{x\}$ is the logarithm of the prior probability of $x$. If the image noise is assumed to be Gaussian with zero mean and variance $\sigma_k^2$, the probability function for estimating $y_k$ from $x$ is given by Formula (7). In Formula (7), $\hat{y}_k$ is the simulated low-resolution image, given by $\hat{y}_k = A_k x$, where $A_k$ is the blur matrix. Assuming that the images are independent of each other, the probability density function of the low-resolution image sequence can be expressed as Formula (8).
Appl. Sci. 2020, 10, 1109
A common choice of prior is a smoothness prior, which penalizes adjacent pixels that differ strongly; its probability density function can be assumed to take the form of Formula (9), where $\lambda$ is a control parameter that governs the peak of the probability density distribution. Substituting Formula (9) into Formula (6) gives Formula (10). In Formula (10), the first term in the brackets on the right side of the equals sign is a constant and can be dropped; the negative signs of the other two terms then become positive, and, assuming that the noise of every image has the same variance $\sigma^2$, the maximization problem turns into the minimization problem of Formula (11). Setting $\alpha_1 = 2\sigma^2/\lambda$, Formula (11) can be written as Formula (12), where $\alpha_1$ is the regularization parameter. The necessary condition for minimizing Formula (12) is that the partial derivative of $\|y - Ax\|^2 + \alpha_1 \|Qx\|^2$ with respect to $x$ is zero, as in Formula (13).
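The formulas referenced in this derivation are not reproduced in the text. One way to write the chain out is sketched below, numbered to match the references above. It assumes a Gaussian likelihood with normalizing constants $C_k$, a smoothness prior of the form $C_0 \exp(-\|Qx\|^2/\lambda)$ and $p$ low-resolution images; the constants, the symbol $p$ and the exact form of the prior are assumptions, chosen so that $\alpha_1 = 2\sigma^2/\lambda$ follows.

```latex
\begin{align}
\hat{x} &= \arg\max_{x} \Pr\{x \mid y_1, \dots, y_p\} \tag{4} \\
\hat{x} &= \arg\max_{x} \frac{\Pr\{y \mid x\}\,\Pr\{x\}}{\Pr\{y\}} \tag{5} \\
\hat{x} &= \arg\max_{x} \left[ \log \Pr\{y \mid x\} + \log \Pr\{x\} \right] \tag{6} \\
\Pr\{y_k \mid x\} &= C_k \exp\!\left( -\frac{\|y_k - \hat{y}_k\|^2}{2\sigma_k^2} \right),
  \qquad \hat{y}_k = A_k x \tag{7} \\
\Pr\{y \mid x\} &= \prod_{k=1}^{p} \Pr\{y_k \mid x\} \tag{8} \\
\Pr\{x\} &= C_0 \exp\!\left( -\frac{\|Qx\|^2}{\lambda} \right) \tag{9} \\
\hat{x} &= \arg\max_{x} \left[ \log\Big( C_0 \prod_{k=1}^{p} C_k \Big)
  - \sum_{k=1}^{p} \frac{\|y_k - A_k x\|^2}{2\sigma_k^2}
  - \frac{\|Qx\|^2}{\lambda} \right] \tag{10} \\
\hat{x} &= \arg\min_{x} \left[ \frac{\|y - Ax\|^2}{2\sigma^2}
  + \frac{\|Qx\|^2}{\lambda} \right] \tag{11} \\
\hat{x} &= \arg\min_{x} \left[ \|y - Ax\|^2 + \alpha_1 \|Qx\|^2 \right],
  \qquad \alpha_1 = \frac{2\sigma^2}{\lambda} \tag{12} \\
0 &= -2A^{T}(y - Ax) + 2\alpha_1 Q^{T} Q x \tag{13}
\end{align}
```

In (6) the term $\Pr\{y\}$ is dropped because it does not depend on $x$; in (11) the $y_k$ and $A_k$ are stacked into $y$ and $A$. Condition (13) is equivalent to the normal equations $(A^{T}A + \alpha_1 Q^{T}Q)\,x = A^{T}y$.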
According to Formula (12), the regularization parameter $\alpha_1$ controls the relative contribution of $\|y - Ax\|^2$ and $\|Qx\|^2$ in the solution process, and thus constrains the degree of distortion and smoothness of the reconstructed image. If $\alpha_1$ is too small, noise is not suppressed well and the reconstructed image remains distorted. If $\alpha_1$ is too large, the reconstructed image becomes too smooth and image details are lost. An adaptive solution method addresses both problems. It makes full use of the reconstruction results during the iteration: $\alpha_1$ is updated continuously and the reconstructed image is recalculated. The new reconstructed image is then used to compute $\alpha_1$ in the next iteration, so that the optimal reconstructed image is obtained by cyclic iteration.
By the principle of regularization, a negative regularization parameter is meaningless, so the first condition it must satisfy is to be greater than zero. A larger $\|y - Ax\|^2$ indicates more noise in the model; to eliminate the influence of this noise on the result, a larger $\alpha_1$ is needed, so $\alpha_1$ should be proportional to $\|y - Ax\|^2$. Because $Q$ is a high-pass filter operator, a larger $\|Qx\|^2$ indicates an image rich in edges and texture details, for which a smaller regularization parameter is needed to preserve the details, so $\alpha_1$ should be inversely proportional to $\|Qx\|^2$. Based on these three properties, the regularization parameter is computed by Formula (14). In Formula (14), $\alpha_{1(k+1)}$ is the regularization parameter of iteration $k+1$; $x_k$ is the reconstructed image of iteration $k$; $r$ is a very small number that ensures the denominator is not zero; $\lambda$ is the convergence correction factor, for which a fixed constant can be selected. During the iteration, once the regularization parameter is determined, Formula (15) is applied to update the reconstructed image.
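The adaptive iteration can be illustrated on a small 1-D problem. Formulas (14) and (15) are not reproduced in this text, so the sketch below assumes the simplest forms that satisfy the three stated properties: $\alpha_{1(k+1)} = \lambda \|y - Ax_k\|^2 / (\|Qx_k\|^2 + r)$ and a gradient step on $\|y - Ax\|^2 + \alpha_1\|Qx\|^2$. The operators `downsample_blur` and `second_difference`, the step-size rule and the default `lam` are hypothetical choices for illustration, not the paper's settings.

```python
import numpy as np

def second_difference(n):
    """1-D discrete Laplacian, a high-pass operator standing in for Q."""
    Q = np.zeros((n - 2, n))
    for i in range(n - 2):
        Q[i, i:i + 3] = (1.0, -2.0, 1.0)
    return Q

def downsample_blur(n, factor=2):
    """Degradation matrix A: average `factor` neighbours, then decimate."""
    A = np.zeros((n // factor, n))
    for i in range(n // factor):
        A[i, i * factor:(i + 1) * factor] = 1.0 / factor
    return A

def objective(x, y, A, Q, alpha):
    """Fidelity term plus weighted smoothness term."""
    return np.sum((y - A @ x) ** 2) + alpha * np.sum((Q @ x) ** 2)

def adaptive_alpha(x, y, A, Q, lam=0.05, r=1e-8):
    """Assumed Formula (14): grows with the residual, shrinks with ||Qx||^2."""
    return lam * np.sum((y - A @ x) ** 2) / (np.sum((Q @ x) ** 2) + r)

def update_step(x, y, A, Q, alpha):
    """Assumed Formula (15): one gradient step with a conservative step size."""
    beta = 1.0 / (np.linalg.norm(A, 2) ** 2 + alpha * np.linalg.norm(Q, 2) ** 2)
    return x + beta * (A.T @ (y - A @ x) - alpha * (Q.T @ (Q @ x)))

def reconstruct(y, A, Q, x0, max_iter=50):
    """Alternate between updating alpha and updating the image estimate."""
    x = x0.copy()
    for _ in range(max_iter):
        alpha = adaptive_alpha(x, y, A, Q)
        x = update_step(x, y, A, Q, alpha)
    return x
```

With this step size, each update cannot increase the objective for the current value of `alpha`, which keeps the iteration stable even though `alpha` changes from one iteration to the next.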

Improvement of Regularization
On the basis of the two-dimensional Laplacian operator, a Prewitt operator is added to enhance the edge of the image. This operator is represented by W.
The maximization problem is then transformed into the minimization problem of Formula (17), where $\alpha_1$ and $\alpha_2$ are regularization parameters. The necessary condition for minimizing Formula (17) is that the partial derivative of $\|y - Ax\|^2 + \alpha_1 \|Qx\|^2 + \alpha_2 \|Wx\|^2$ with respect to $x$ is zero, as in Formula (18). During the iteration, once the regularization parameters are determined, Formulas (19) and (20) are used to update the reconstructed image, where $\alpha_{1(k+1)}$ and $\alpha_{2(k+1)}$ are the regularization parameters of iteration $k+1$; $x_k$ is the reconstructed image of iteration $k$; $r$ is a very small number that ensures the denominator is not zero; $\lambda$ is the convergence correction factor, for which a fixed constant can be selected.
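The two regularization terms can be made concrete with 3 × 3 kernels. The sketch below is illustrative only: it takes $Q$ as the standard 4-neighbour Laplacian kernel and $W$ as the pair of horizontal and vertical Prewitt kernels (the text does not say which Prewitt direction is used), models the degradation $A$ as hypothetical 2 × 2 average pooling, and assumes that both adaptive parameters share the form $\lambda \|y - Ax\|^2 / (\|\cdot\|^2 + r)$.

```python
import numpy as np

LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float64)
PREWITT_X = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], dtype=np.float64)
PREWITT_Y = PREWITT_X.T

def filter3x3(img, k):
    """Apply a 3x3 kernel (correlation) with replicated borders."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + h, j:j + w]
    return out

def reg_energies(x):
    """Return (||Qx||^2, ||Wx||^2) for the Laplacian and Prewitt operators."""
    q = np.sum(filter3x3(x, LAPLACIAN) ** 2)
    w = np.sum(filter3x3(x, PREWITT_X) ** 2 + filter3x3(x, PREWITT_Y) ** 2)
    return q, w

def avg_pool2(x):
    """Hypothetical degradation A: 2x2 average pooling."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def dual_objective(x, y, alpha1, alpha2):
    """Fidelity term plus the two weighted regularization terms."""
    q, w = reg_energies(x)
    return np.sum((y - avg_pool2(x)) ** 2) + alpha1 * q + alpha2 * w

def adaptive_params(x, y, lam=0.05, r=1e-8):
    """Assumed adaptive rule applied to both regularization terms."""
    res = np.sum((y - avg_pool2(x)) ** 2)
    q, w = reg_energies(x)
    return lam * res / (q + r), lam * res / (w + r)
```

On a flat image both energies vanish, while a step edge produces non-zero Laplacian and Prewitt responses, so the second term adds a separate, first-derivative penalty on edges.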

Basic Theory of Sub-Pixel Convolution
Many super-resolution reconstruction algorithms first preprocess the low-resolution image: it is enlarged to the required size by bicubic interpolation [16] and then reconstructed. This greatly increases the complexity of reconstruction. The super-resolution reconstruction algorithm based on sub-pixel convolution instead first extracts features from the low-resolution image and only at the end fuses the extracted features into a high-resolution image by sub-pixel convolution. The process is shown in Figure 4.
As shown in Figure 4, the network needs no interpolation preprocessing and takes the original LR image directly as input. After the convolution layers, feature maps with $r^2$ channels are obtained, where $r$ is the magnification factor. In the sub-pixel convolution layer, the $r^2$ channels of each pixel correspond to a sub-pixel block of size $r \times r$ in the HR image, so the feature maps of size $H \times W \times r^2$ are rearranged into an HR image of size $rH \times rW \times 1$. Therefore, the number of output feature maps of the last convolution layer should be set to $r^2$, so that the total number of pixels matches that of the HR image.
Compared with enlarging the image by an interpolation function in the first layer of the network, sub-pixel convolution lets the upscaling be learned implicitly in the preceding convolution layers, so a better and more complex mapping from LR to HR can be learned. Since the convolutions operate on LR images, smaller convolution kernels can extract the same information, which further reduces the computational complexity and significantly improves efficiency.
For a network composed of $L$ layers, the first $L-1$ layers can be described by Formula (21) and Formula (22),
where $W_l$, $b_l$, $l \in (1, L-1)$ are the weights and biases of the network, $W_l$ is a 2D convolution tensor of size $n_{l-1} \times n_l \times k_l \times k_l$, $n_l$ is the number of feature maps of layer $l$, $k_l$ is the filter size of layer $l$, the bias $b_l$ is a vector of length $n_l$, $\phi$ is a non-linear function (the activation function), and the last layer $f^L$ synthesizes the HR image from the LR feature maps.
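The rearrangement performed by the sub-pixel convolution layer, from $H \times W \times r^2$ feature maps to an $rH \times rW \times 1$ image, can be sketched with NumPy reshapes. The channel ordering below (channel $i \cdot r + j$ goes to row offset $i$ and column offset $j$ inside each $r \times r$ block) is one common convention and is an assumption here; the function name `pixel_shuffle` is illustrative.

```python
import numpy as np

def pixel_shuffle(features, r):
    """Rearrange an (H, W, r*r) feature tensor into an (r*H, r*W, 1) image."""
    h, w, c = features.shape
    if c != r * r:
        raise ValueError("expected r*r channels, got %d" % c)
    # Split the channel axis into the (row offset, column offset) of each block.
    blocks = features.reshape(h, w, r, r)
    # Interleave the offsets with the spatial axes: (H, r, W, r) -> (rH, rW).
    return blocks.transpose(0, 2, 1, 3).reshape(h * r, w * r, 1)

# One LR pixel with four channels becomes one 2 x 2 HR block.
lr_features = np.arange(4, dtype=np.float64).reshape(1, 1, 4)
hr_image = pixel_shuffle(lr_features, 2)
print(hr_image[:, :, 0])  # [[0. 1.] [2. 3.]]
```

No arithmetic is performed in this layer; all learned computation happens in the convolutions on the LR grid, which is where the complexity saving comes from.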

Combining Regularization with Sub-Pixel Convolution
The regularization algorithm extracts features insufficiently in single-image super-resolution reconstruction, and using an interpolated image as the reconstruction input is computationally expensive. To solve these problems, this paper proposes an adaptive dual-regularization super-resolution reconstruction algorithm based on sub-pixel convolution (MPSR), which combines regularization with sub-pixel convolution so that the original low-resolution image is used directly as the input. First, the image is regularized to enhance details and edges; second, features are extracted and activated; finally, the feature maps are rearranged by the sub-pixel convolution layer into a high-resolution image. This not only improves image quality over the original method but also speeds up training. See Algorithm 1 for details.

Complexity Analysis
When calculating the complexity of the model, the influence of the activation function is neglected because it is small. The complexity is calculated by Formula (23), where $D$ is the network depth, $K_l$ is the convolution kernel size of layer $l$, $C$ is the number of channels and $S_I$ is the size of the input image. Formula (23) shows that the complexity of the network model is proportional to the size of the input image, which for interpolation-based networks such as SRCNN is the size of the HR image, and that the middle layers influence the complexity more than the first and last layers. In SRCNN, the convolution kernel of the first layer is set to 9 × 9; since our algorithm takes the original low-resolution image as input, smaller convolution kernels can extract the same features. In MPSR, the kernel size is 5 × 5 in the first layer and 3 × 3 in the later layers. Table 1 compares the complexity of the 9-5-5 SRCNN algorithm with that of the MPSR algorithm. According to Table 1, the complexity of MPSR is 3.9 times lower than that of SRCNN.
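Formula (23) is not reproduced in this text. A minimal sketch, assuming the usual per-layer count $\sum_{l=1}^{D} K_l^2 \, C_{l-1} \, C_l \, S_I$, shows how kernel size and input size enter the complexity. The SRCNN filter counts (64 and 32) are the commonly cited setting and are assumed here, and the layer widths of the sub-pixel network are hypothetical, so the numbers are not expected to reproduce Table 1.

```python
def conv_complexity(kernel_sizes, channels, input_pixels):
    """Multiply-accumulate count of a plain convolutional network.

    kernel_sizes: K_l for each of the D layers.
    channels: channel counts [C_0, C_1, ..., C_D], input channels first.
    input_pixels: S_I, the number of pixels the convolutions run over.
    """
    if len(channels) != len(kernel_sizes) + 1:
        raise ValueError("need one more channel count than layers")
    per_pixel = sum(
        k * k * c_in * c_out
        for k, c_in, c_out in zip(kernel_sizes, channels[:-1], channels[1:])
    )
    return per_pixel * input_pixels

# SRCNN 9-5-5 operating on an interpolated 320 x 240 (HR-size) input.
srcnn = conv_complexity([9, 5, 5], [1, 64, 32, 1], 320 * 240)

# Hypothetical 5-3-3 sub-pixel network for r = 2 on the 160 x 120 LR input.
r = 2
subpixel = conv_complexity([5, 3, 3], [1, 64, 32, r * r], (320 // r) * (240 // r))
```

Two effects reduce the cost in this sketch: the smaller kernels lower the per-pixel term, and running on the LR grid divides $S_I$ by $r^2$.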

Experimental Environment and Parameters Setting
The infrared image dataset is self-made; the images were taken with infrared imaging equipment. It contains 400 training images and 100 test images. To make the experimental results as reliable as possible, the training set should cover as many scenes as possible. To synthesize the low-resolution images, a Gaussian filter is used to blur the high-resolution images. Because the dataset contains few samples, the original high-resolution images and the blurred low-resolution images are cropped to generate more sample pairs. The original images are 320 × 240 pixels, the cropped patches are 32 × 24 pixels, and $4 \times 10^4$ pairs of training samples are generated in total. During training, the learning rate is set to $10^{-4}$ and the activation function is ReLU.
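The sample count above is consistent with cutting each image into non-overlapping tiles: a 320 × 240 image holds 10 × 10 patches of 32 × 24 pixels, and 400 training images give 400 × 100 = 4 × 10^4 patches. The sketch below assumes such non-overlapping tiling, which the text does not state explicitly; `crop_patches` is an illustrative helper.

```python
import numpy as np

def crop_patches(image, patch_h=24, patch_w=32):
    """Cut an image into non-overlapping patch_h x patch_w tiles."""
    h, w = image.shape[:2]
    return [
        image[i:i + patch_h, j:j + patch_w]
        for i in range(0, h - patch_h + 1, patch_h)
        for j in range(0, w - patch_w + 1, patch_w)
    ]

# A 320 x 240 image is stored as an array with 240 rows and 320 columns.
image = np.zeros((240, 320))
patches = crop_patches(image)
total_pairs = 400 * len(patches)  # 400 training images
```

The same crop positions would be applied to each high-resolution image and its blurred counterpart so that the patches form aligned pairs.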
Among the parameters, MAX is the maximum number of iterations of the regularization part; it affects both the reconstruction quality and the running time, as shown in Figure 5.
As can be seen from Figure 5a, the greater the number of iterations, the higher the PSNR and the better the reconstruction. Figure 5b shows that the greater the number of iterations, the longer it takes to reconstruct the image. Weighing both factors, the number of iterations in this experiment is set to 50. In practical work, it can be chosen according to need.
The experimental environment is listed in Table 2.

Image Quality Metric Parameters
Both subjective and objective evaluation are used to assess the test results in this paper. The subjective evaluation is the result of human observation [17]. The objective evaluation indexes are the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM). PSNR is defined by Formula (24), where MSE is the mean square error. PSNR is widely used and accepted by researchers as an objective evaluation index in image processing. The larger the PSNR value, the closer the reconstructed image is to the original high-resolution image and the better the reconstruction. SSIM is another commonly used evaluation index, which judges image quality by the similarity of image structure before and after reconstruction. It is defined by Formula (25),
where $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$, respectively, and $\sigma_{xy}$ is the covariance of $x$ and $y$. The constants are $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$, where $L$ is the range of pixel values, $k_1 = 0.01$ and $k_2 = 0.03$. The SSIM value lies in the range (−1, 1). The closer the SSIM value is to 1, the better the image reconstruction. When the two images are identical, the value of SSIM is 1.
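Both metrics are short to compute. The sketch below uses the standard definitions, $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2/\mathrm{MSE})$ and the SSIM expression built from the quantities defined above. It assumes 8-bit images (peak value and $L$ equal to 255) and, as a simplification, evaluates SSIM once over the whole image instead of averaging over local windows, so its values will differ somewhat from windowed implementations.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(x, y, L=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole image (a simplification)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

An image compared with itself gives an infinite PSNR and an SSIM of 1, in line with the properties stated above.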

Experiment on the Effectiveness of Algorithm Improvements
MPSR has two improvements. First, because the texture of infrared images is not obvious, the image details are enhanced before super-resolution reconstruction, and the image is then reconstructed with the improved dual-regularization objective function. Second, the regularization objective function is combined with sub-pixel convolution: instead of an enlarged high-resolution image, the low-resolution image is taken as the input, and sub-pixel convolution finally enlarges the image by the required factor, which both reduces the complexity and improves the reconstruction. The effectiveness of these two parts is verified by experiments, as shown in Figure 6.
Figure 6a compares the objective functions with a single regularization term and with two regularization terms. It shows that reconstruction with the dual-regularization objective function is better than with a single regularization term, with the PSNR increased by about 1.5 dB. Figure 6b compares the PSNR values before and after the algorithm improvement; the algorithm before improvement is the traditional regularization algorithm, and the improved algorithm is MPSR. It shows that the PSNR of the improved algorithm has increased by 2.5 dB. Both improvements are therefore effective.

Comparison of Different Algorithms
For comparison with MPSR, the spline interpolation algorithm (SPLINE), the adaptive regularization algorithm (MAP), SRCNN and the sub-pixel convolution-based reconstruction algorithm (ESPCN) are also applied in the experiment. Eight images from the test set are selected to present the data; they are shown in Figure 7. The PSNR and SSIM values obtained from the images reconstructed by the different algorithms are listed in Tables 3 and 4. To evaluate the effect more intuitively, the data in Tables 3 and 4 are averaged and plotted in Figure 8.
The abscissa of Figure 8a is the algorithm name, and the ordinate is the PSNR value of the images reconstructed by each algorithm. Figure 8a shows that the SPLINE algorithm performs worst, with a mean PSNR after reconstruction of only 30.5 dB, and that the traditional regularization algorithm MAP is also not ideal. The currently popular deep-learning-based algorithms reconstruct better than the traditional algorithms, but the MPSR algorithm proposed in this paper performs best. According to Figure 8a, the PSNR of MPSR is about 4 dB higher than that of the SPLINE interpolation algorithm, about 2 dB higher than that of SRCNN and about 1 dB higher than that of ESPCN.
The abscissa of Figure 8b is the algorithm name, and the ordinate is the SSIM value of the images reconstructed by each algorithm. Consistent with the PSNR trend, the SPLINE algorithm performs worst, with a mean SSIM after reconstruction below 0.88. The mean SSIM of MPSR is nearly 0.94, which is nearly 0.03 higher than that of SRCNN and 0.02 higher than that of ESPCN.
The reconstructed images are shown in Figure 9, which presents the results of the different algorithms. Among the five methods, the image reconstructed by SPLINE interpolation is the blurriest and preserves edges worst. SRCNN and ESPCN reconstruct well and recover rich detail, but their edge reconstruction is still not ideal. The MPSR algorithm proposed in this paper has clear advantages in detail enhancement and edge preservation, effectively reduces noise and improves the overall brightness of the image, and gives a better subjective visual effect than the other algorithms.

Conclusions
Common super-resolution reconstruction algorithms usually preprocess the low-resolution image: it is enlarged to the required size by interpolation, and super-resolution reconstruction is then carried out. This greatly increases the complexity of reconstruction. The image super-resolution reconstruction algorithm based on sub-pixel convolution takes the low-resolution image directly as input, which reduces the complexity and improves the speed. However, because infrared images have low resolution, indistinct gray levels, a concentrated gray distribution and unclear texture, its reconstruction effect on them is not ideal. In this paper, an adaptive dual-regularization super-resolution reconstruction algorithm based on sub-pixel convolution is proposed. The algorithm regularizes the image before convolution. Compared with the traditional regularization algorithm, it first enhances the multi-scale details of the image, adds a second regularization term to the regularization part to enhance image edges, and adds adaptive regularization parameters that balance the fidelity term and the two regularization terms. Compared with the other algorithms, the PSNR and SSIM values of the proposed algorithm are clearly improved, and image detail enhancement and edge preservation are clearly better. Moreover, the proposed algorithm effectively reduces noise and enhances the overall brightness of the image, and its subjective visual effect is better than that of the other algorithms.