Current Optics and Photonics
Curr. Opt. Photon. 2023; 7(6): 655-664
Published online December 25, 2023 https://doi.org/10.3807/COPP.2023.7.6.655
Copyright © Optical Society of Korea.
Chenzhe Jiang1, Banglian Xu1, Leihong Zhang2, Dawei Zhang2
1College of Communication and Art Design, University of Shanghai for Science and Technology, Shanghai 200093, China
2School of Optical-electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Corresponding author: *lhzhang@usst.edu.cn, ORCID 0000-0003-1071-3621
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Ghost imaging (GI) technology is developing rapidly, but it inevitably faces limitations such as the influence of atmospheric turbulence. In this paper, we study a ghost imaging system in atmospheric turbulence and use a gamma-gamma (GG) model to simulate the medium-to-strong range of the turbulence distribution. With a compressed sensing (CS) algorithm and a generative adversarial network (GAN), the image can be restored well. We analyze the performance of correlation imaging, the influence of atmospheric turbulence, and the restoration algorithm's effects. The restored image's peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) increased to 21.9 dB and 0.67, respectively. This demonstrates that deep learning (DL) methods can restore a distorted image well, which is of particular significance for computational imaging in noisy and blurred environments.
Keywords: Atmospheric turbulence, Deep learning, Generative adversarial network, Ghost imaging
OCIS codes: (010.0010) Atmospheric and oceanic optics; (100.3020) Image reconstruction-restoration; (110.1758) Computational imaging
Ghost imaging (GI), a novel non-traditional imaging method, uses the light field's higher-order correlations to obtain the object's spatial or phase distribution information [1]. Computational ghost imaging (CGI) technology has great application potential and particular value in areas such as 3D imaging, remote sensing, medicine, optical encryption, and national defense [2–4]. The theory of compressed sensing (CS) [5] broke through the limitation of the Nyquist sampling theorem and enabled compressed sensing ghost imaging (CSGI) [6]; it has also been used to deal with the issue of random scattering [7]. In processes such as remote sensing and astronomical imaging, however, environmental influences cannot be ignored [8].
At present, research on GI in atmospheric turbulence is extensive. In 2012, Zhang and Wang [9] studied the influence of atmospheric turbulence on GI. In 2013, Cheng and Lin [10] studied the effects of atmospheric turbulence intensity, light source coherence, light source size, and arm-detector size on GI. In 2016, Yang et al. [11] studied a situation in which the light path of a GI system contained a strong scattering medium. In 2018, Tang et al. [12] studied the position of the measured object in GI in turbulence. When GI is performed in fog or other complex weather, or over long ranges, light beams are absorbed or scattered in the randomly changing atmospheric turbulence, and the light signal is seriously affected, resulting in noise, distortion, blurring, and other problems in the final image [13].
Computational imaging (CI) breaks through the limitations of traditional optical imaging, which suffers from inherent problems such as the diffraction limit, optical scattering, and the constraints of optical systems [14]. CI combines an imaging system with information processing and can recover complete object information from incomplete and hidden measurements, so it has great practical significance. The technology has been applied in a variety of imaging systems, solving typical inverse problems such as quantum imaging [15, 16], three-dimensional imaging [17], and multimode fiber imaging [18]. In recent years, deep learning (DL) has made unprecedented progress and has been used extensively in face recognition, machine translation, natural language processing, and other fields [19, 20]. Some researchers have combined DL with CI and have solved many problems in the imaging field.
This paper first introduces the development and current state of GI and the principle of the imaging process. Then we use the gamma-gamma (GG) model to simulate atmospheric turbulence within the medium-to-strong turbulent fluctuation range. We examine the performance of GI and several influencing factors, including turbulence intensity, transmission distance, and sampling rate. Finally, we compare various restoration methods and analyze the imaging results after restoration. The results show that distorted images can be well restored by the proposed method. In this paper, we deal with the distortion of GI using advanced DL methods, and the results demonstrate the advantages of DL in processing GI problems.
In CGI, we load a random speckle image onto a digital micromirror device (DMD) instead of using the traditional rotating ground glass [21, 22]. The imaging process is shown in Fig. 1. This greatly reduces the complexity of the experimental procedure and makes the scheme more practical.
The phase of the laser source is modulated by the DMD. Optical path information is collected from the target object and recorded by the bucket detector. The total intensity is calculated as in Eq. (1), and the correlation function is shown in Eq. (2):
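In the standard CGI formulation, consistent with the definitions that follow, these take the form

$$ I_i = \iint \delta_i(x, y)\, T(x, y)\, \mathrm{d}x\, \mathrm{d}y \qquad (1) $$

$$ G(x, y) = \frac{1}{M} \sum_{i=1}^{M} \left( I_i - \langle I \rangle \right) \delta_i(x, y) \qquad (2) $$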
where δi(x, y) is the phase information loaded onto the DMD matrix in advance [23, 24], T(x, y) is the target object information, Ii is the total light intensity measured by the bucket detector, M is the number of samples, and ⟨I⟩ denotes the ensemble average of the system. CS is widely used in signal processing to accurately reconstruct high-dimensional signals from low-dimensional measurements [25]. If X ∈ RN is sparse on a set of bases, it can be expressed as Eq. (3):
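In the usual CS notation this reads

$$ X = \varphi s \qquad (3) $$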
where φ denotes a sparse transformation matrix of size N × N and s is the signal after sparse transformation, which is also the sampled signal.
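As a concrete illustration of the correlation reconstruction of Eq. (2), a minimal NumPy sketch (the function name and array shapes are ours, not from the source):

```python
import numpy as np

def cgi_reconstruct(patterns: np.ndarray, bucket: np.ndarray) -> np.ndarray:
    """Second-order correlation GI reconstruction.

    patterns: (M, H, W) speckle patterns delta_i(x, y) loaded onto the DMD.
    bucket:   (M,) bucket-detector intensities I_i.
    """
    fluctuations = bucket - bucket.mean()  # I_i - <I>
    # G(x, y) = (1/M) * sum_i (I_i - <I>) * delta_i(x, y)
    return np.tensordot(fluctuations, patterns, axes=1) / len(bucket)
```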
The turbulence model in this paper is the GG model [26]. This model is suitable for moderate and strong turbulence and has a wide range of applications and good practicality [27, 28]. In this model, the large-scale (outer-scale) effect Ix and the small-scale (inner-scale) effect Iy of atmospheric turbulence each follow a gamma distribution. The probability densities of these two effects are expressed as Eq. (4):
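In the standard GG model, these are the unit-mean gamma densities

$$ p(I_x) = \frac{\alpha(\alpha I_x)^{\alpha-1}}{\Gamma(\alpha)} \exp(-\alpha I_x), \qquad p(I_y) = \frac{\beta(\beta I_y)^{\beta-1}}{\Gamma(\beta)} \exp(-\beta I_y) \qquad (4) $$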
Then we can get the whole probability density expressed as Eq. (5):
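In its standard form, the GG density of the normalized irradiance I = IxIy is

$$ p(I) = \frac{2(\alpha\beta)^{(\alpha+\beta)/2}}{\Gamma(\alpha)\Gamma(\beta)}\, I^{\frac{\alpha+\beta}{2}-1}\, K_{\alpha-\beta}\!\left( 2\sqrt{\alpha\beta I} \right), \quad I > 0 \qquad (5) $$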
where α represents the large-scale vortices in the scattering process, β the small-scale vortices, Γ(·) is the gamma function, Kn(·) is the modified Bessel function of the second kind of order n, and I is the normalized received irradiance. In Eq. (6), σ1² = 1.23 Cn² k^(7/6) d^(11/6) is the Rytov variance, where Cn² is the refractive-index structure constant, k is the optical wavenumber, and d is the transmission distance.
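In the GG literature, the plane-wave forms of Eq. (6) relating α and β to the Rytov variance are usually written as

$$ \alpha = \left[ \exp\!\left( \frac{0.49\,\sigma_1^2}{\left(1 + 1.11\,\sigma_1^{12/5}\right)^{7/6}} \right) - 1 \right]^{-1}, \qquad \beta = \left[ \exp\!\left( \frac{0.51\,\sigma_1^2}{\left(1 + 0.69\,\sigma_1^{12/5}\right)^{5/6}} \right) - 1 \right]^{-1} \qquad (6) $$

A minimal NumPy sketch of sampling GG-distributed irradiance under these assumptions (a GG variate is the product of two independent unit-mean gamma variates):

```python
import numpy as np

def gg_alpha_beta(rytov_var: float) -> tuple:
    """Plane-wave alpha/beta of the gamma-gamma model from the Rytov variance sigma_1^2."""
    alpha = 1.0 / (np.exp(0.49 * rytov_var / (1 + 1.11 * rytov_var ** 1.2) ** (7 / 6)) - 1)
    beta = 1.0 / (np.exp(0.51 * rytov_var / (1 + 0.69 * rytov_var ** 1.2) ** (5 / 6)) - 1)
    return alpha, beta

def gg_irradiance(alpha: float, beta: float, n: int) -> np.ndarray:
    """Sample n normalized irradiance values I = Ix * Iy."""
    ix = np.random.gamma(shape=alpha, scale=1.0 / alpha, size=n)  # large-scale effect
    iy = np.random.gamma(shape=beta, scale=1.0 / beta, size=n)    # small-scale effect
    return ix * iy
```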
It is known that the turbulence behind the target object has little effect on imaging. Therefore, atmospheric turbulence is added to the receiving optical path of the target object. The simulation combines the principles and formulas above, and the situation of atmospheric turbulence in the optical path is simulated as shown in Fig. 2.
In this paper, we propose a method based on a generative adversarial network (GAN) and name it TGAN. GANs can preserve fine texture details and create patterns close to real images, so they are a mainstay of image super-resolution and image restoration [30, 31]. The GAN was first proposed by Goodfellow et al. [32]. Most DL algorithms before the GAN needed many complex operations at a very high time cost. The most basic GAN structure is composed of one generator and one discriminator, where Z is the factor that causes image distortion, such as noise or turbulence. A schematic diagram is shown in Fig. 3.
The main purpose of this network is to simulate the distribution of the actual data. Assume the actual data distribution is PT(x), from which a series of samples {x1, x2, ..., xm} is drawn; the generator model distribution is PG(x;θ), where θ is the parameter that controls the generator distribution. To simulate the actual data more accurately, the whole process is transformed into a maximum likelihood estimation (MLE) problem, with likelihood L = ∏(i=1..m) PG(xi; θ); maximizing L is equivalent to minimizing the divergence between PT and PG.
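In the standard GAN formulation, this adversarial game is written as the minimax objective, which we take to be the paper's Eq. (7):

$$ \min_G \max_D V(G, D) = \mathbb{E}_{x \sim P_T}\left[ \log D(x) \right] + \mathbb{E}_{x \sim P_G}\left[ \log\left( 1 - D(x) \right) \right] \qquad (7) $$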
In this formula, the maximum value of V(G, D) measures the difference between data from the actual distribution and data from the generator: the first term corresponds to samples drawn from PT, the second to samples from PG, and maximizing V(G, D) optimizes the discriminator.
The proposed network structure is shown in Fig. 4. The input is a 256 × 256 image, which first passes through one 2D convolution with 64 kernels of size 7 × 7 and stride 1 to extract features and reduce the number of training parameters. Batch normalization (BN) speeds up the computation, and a rectified linear unit (ReLU) [33] serves as the activation function, which is efficient and simplifies the calculation. Next come two down-sampling blocks with the same structure. As shown in Fig. 4, each down-sampling block contains a 3 × 3 convolutional layer and a 2 × 2 max pooling layer, which reduce the size of the feature map, lower the computational complexity, and extract the main features of the image; the last two layers of each block are a BN layer and a ReLU layer. After the down-sampling blocks there are nine ResNet blocks; the residual connections make training faster. In each ResNet block, the input data first passes through a convolution layer with 256 kernels of size 3 × 3 and stride 1, followed by batch normalization and ReLU activation. The data then passes through a dropout layer with a rate of 0.5, i.e. 50% of neurons are randomly disconnected to prevent possible overfitting during training. After the skip-connection fusion, two transposed-convolution blocks restore the reduced feature map to the original size. Finally, the output passes through a 2D convolution layer with three kernels of size 7 × 7 and stride 1, followed by a Tanh activation layer.
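A minimal Keras-style sketch of this generator (the paper specifies the layer types, the 64/256/3 kernel counts, and the kernel sizes; the filter counts of the down- and up-sampling blocks and the input channel count are our assumptions, and the TF 2.x API differs from the TF 1.x framework used in the paper):

```python
import tensorflow as tf
from tensorflow.keras import layers

def res_block(x):
    # As described: 3x3 conv (256 filters, stride 1) -> BN -> ReLU ->
    # 50% dropout, then fused with the block input by a skip connection.
    y = layers.Conv2D(256, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Dropout(0.5)(y)
    return layers.Add()([x, y])

def build_generator():
    inp = layers.Input((256, 256, 3))  # input channel count assumed
    # Initial feature extraction: 7x7 conv, 64 filters, stride 1.
    x = layers.Conv2D(64, 7, padding="same")(inp)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    # Two identical down-sampling blocks: 3x3 conv + 2x2 max pooling + BN + ReLU.
    for filters in (128, 256):  # filter counts assumed
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.MaxPooling2D(2)(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    # Nine residual blocks at 256 channels.
    for _ in range(9):
        x = res_block(x)
    # Two transposed-convolution blocks restore the original 256x256 size.
    for filters in (128, 64):  # filter counts assumed
        x = layers.Conv2DTranspose(filters, 3, strides=2, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    # Output projection: 7x7 conv to 3 channels, then tanh.
    out = layers.Conv2D(3, 7, padding="same", activation="tanh")(x)
    return tf.keras.Model(inp, out)
```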
The generator network Gθ is trained to generate images. Our purpose is to restore the distorted image It to the real image IT, and training amounts to optimizing the loss function.
The loss function estimates the degree of inconsistency between the model's predicted value f(x) and the true value Y. It is a non-negative function, usually written L(Y, f(x)); the smaller the loss, the better the model's robustness. Our loss function is a combination of adversarial loss and content loss, expressed as Eq. (8):
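A common form, which we assume here, weights the content term by a factor λ:

$$ \mathcal{L} = \mathcal{L}_{GAN} + \lambda\, \mathcal{L}_C \qquad (8) $$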
In Eq. (8) LGAN is the adversarial loss and LC is the content loss.
Training the original (vanilla) GAN easily runs into problems such as vanishing gradients and mode collapse, which make it very difficult to train. The Wasserstein GAN (WGAN) proposed later uses the Wasserstein-1 distance, making training less difficult. On this basis, a gradient penalty term was added to further improve training stability: WGAN-GP achieves stable training across multiple GAN structures with almost no hyperparameter tuning. The expression is shown in Eq. (9):
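The usual WGAN-GP critic objective, which we assume for Eq. (9), is

$$ \mathcal{L}_{GAN} = \mathbb{E}_{\tilde{x} \sim P_G}\left[ D(\tilde{x}) \right] - \mathbb{E}_{x \sim P_T}\left[ D(x) \right] + \lambda_{gp}\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\!\left[ \left( \left\lVert \nabla_{\hat{x}} D(\hat{x}) \right\rVert_2 - 1 \right)^2 \right] \qquad (9) $$

where λgp is the gradient-penalty weight and x̂ is sampled uniformly along straight lines between pairs of real and generated samples.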
The content loss measures the gap between the generated image and the ground truth. Two common choices are the L1 mean absolute error (MAE) loss and the L2 mean squared error (MSE) loss. More recently, the perceptual loss was proposed, which is essentially an L2 loss computed between the feature maps of the generated image and those of the ground truth. The expression is shown in Eq. (10):
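In its usual form, consistent with the definitions below, the perceptual loss is

$$ \mathcal{L}_C = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \varphi_{i,j}\!\left( I^T \right)_{x,y} - \varphi_{i,j}\!\left( G_\theta(I^t) \right)_{x,y} \right)^2 \qquad (10) $$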
where φi,j denotes the feature map obtained by the j-th convolution (after activation) before the i-th max pooling layer when the image is fed into VGG19 [34] (pre-trained on ImageNet), and Wi,j and Hi,j are the dimensions of the feature map.
The TGAN algorithm in this paper uses the TensorFlow 1.1.5 [35] framework. During training, the batch size is 16. We use the Adam optimizer [36], with a learning rate of 0.001, beta1 of 0.9, beta2 of 0.999, and epsilon of 1e-08. After training, the time to restore one 256 × 256 image is about 0.4 s.
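A minimal sketch of this training configuration in modern TensorFlow/Keras syntax (the paper used a TF 1.x framework, so the exact API differs):

```python
import tensorflow as tf

BATCH_SIZE = 16

# Adam optimizer with the hyperparameters reported above.
optimizer = tf.keras.optimizers.Adam(
    learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
```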
The whole system includes an imaging part and an algorithm part. The imaging system performs compressed sensing GI in atmospheric turbulence (TCSGI). The proposed restoration method, called TGAN, uses the GAN to restore the previously distorted images. The overall process is shown in Fig. 5.
To demonstrate the behavior of GI and the feasibility of the proposed method, we conducted simulations and analyzed the results. The basic data set used in the study is the face image collection IMDB-Wiki [37], plus some images of numbers, letters, and characters. The TCSGI programs run in MATLAB 2016b and the TGAN algorithm runs in Python 3.7, both on a CX50-G30 with 2× Intel(R) Xeon(R) Gold 6132 CPUs @ 2.60 GHz.
The experimental device is shown in Fig. 6. A laser diode is used as the light source, and the retractable lens is a Nikon AF-S DX 55–200 mm f/4–5.6G ED (68 mm × 79 mm; Nikon, Tokyo, Japan). A BFLY-PGE-50H5M camera (29 × 29 × 30 mm; Teledyne FLIR, OR, USA) is used, and the acquisition card is an M2i.2030-exp (Spectrum Instrumentation, Grosshansdorf, Germany). The projection matrix is loaded onto the DMD chip, the image to be identified is placed in front of the lens, and the signal trigger is then activated to start the experiment.
In this paper, the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) are used to evaluate image quality. PSNR is an engineering measure of the ratio between the maximum possible signal power and the power of the corrupting noise; the larger the value, the lower the distortion. It is defined as in Eq. (11):
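In standard form,

$$ \mathrm{PSNR} = 10 \log_{10}\!\left( \frac{\mathrm{MAX}_I^2}{\mathrm{MSE}} \right) \qquad (11) $$

and SSIM is commonly defined as

$$ \mathrm{SSIM}(x, y) = \frac{\left( 2\mu_x \mu_y + c_1 \right)\left( 2\sigma_{xy} + c_2 \right)}{\left( \mu_x^2 + \mu_y^2 + c_1 \right)\left( \sigma_x^2 + \sigma_y^2 + c_2 \right)} \qquad (12) $$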
where MAXI is the maximum possible pixel value of the image, MSE is the mean squared error between the restored image and the ground truth, μx, μy, σx², σy², and σxy are the local means, variances, and covariance of the two images, and c1 and c2 are small stabilizing constants. The closer the SSIM is to 1, the more similar the two images.
Next, different turbulence intensities and suitable transmission distances are set to study the influence of atmospheric turbulence, mainly the transmission distance d and the atmospheric turbulence intensity Cn².
In Fig. 7(a), the measured PSNR values of the image are 15.08 dB, 9.72 dB, 9.42 dB, 8.97 dB, and 7.88 dB. In Fig. 7(b), the measured PSNR values are 14.89 dB, 12.78 dB, 8.40 dB, 8.22 dB, and 7.88 dB. These results show that, at a fixed turbulence intensity or transmission distance, the imaging quality at first changes considerably and then stabilizes. At a fixed turbulence intensity, increasing the transmission distance reduces the final imaging quality; similarly, at a fixed transmission distance, the stronger the turbulence intensity, the worse the image quality.
To some extent, the sampling rate determines the complexity and quality of the imaging process. The impact of the sampling rate on imaging performance was compared when the sampling rate was 12.5%, 25%, 50%, 75%, and 100%, as shown in Fig. 8.
The results show that at low sampling rates such as 12.5% and 25%, the patterns are almost unrecognizable. At 50%, the shapes of the images can be seen, and the images become clearer as the sampling rate increases further. In TCSGI, increasing the sampling rate improves the imaging quality to a certain extent. Therefore, we choose images with a sampling rate of 50% as the test samples in the image restoration process.
In this section, we try to use traditional filters to restore the image. We process distorted grayscale TCSGI images (N = 50%, d = 300 m, fixed turbulence intensity Cn²) with Wiener, median, and Gaussian filters.
It can be seen that the filters have a certain effect on image restoration, but the effect is limited. Restoration with filters can show the general outline of the image. Both PSNR and SSIM have been improved to some extent, and the test averages are shown in Table 1.
TABLE 1. Average values of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM)
Image Quality Assessment Method | TCSGI | Wiener Filter | Median Filter | Gaussian Filter |
---|---|---|---|---|
PSNR (dB) | 8.036 | 13.281 | 14.335 | 12.506 |
SSIM | 0.086 | 0.195 | 0.252 | 0.169 |
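For reference, a minimal SciPy sketch of the three classical filters compared in Table 1 (the window sizes and σ are our assumptions; the paper does not report them):

```python
import numpy as np
from scipy.signal import wiener
from scipy.ndimage import median_filter, gaussian_filter

def classical_restorations(img: np.ndarray) -> dict:
    """Apply the three classical filters to a 2-D grayscale image."""
    return {
        "wiener": wiener(img, mysize=5),              # 5x5 adaptive Wiener window (assumed)
        "median": median_filter(img, size=3),         # 3x3 median window (assumed)
        "gaussian": gaussian_filter(img, sigma=1.0),  # Gaussian blur, sigma assumed
    }
```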
In this section, we analyze the image quality of different imaging methods using quantitative indicators. Here, grayscale TCSGI images (N = 50%, d = 300 m, fixed turbulence intensity Cn²) are used as test samples.
In related past research [38], deep neural networks (DNNs), which can complete such fitting tasks, have been widely used in image processing. We compared the imaging clarity of the four restoration methods to analyze their performance. In the test process, three groups of pictures were selected for display, as shown in Fig. 10.
We observe the imaging results of the median filter, DNN, and TGAN. Because the images differ in complexity, some variation in PSNR and SSIM across images is inevitable. The general edges of the face are visible because the foreground carries most of the image information.
Combined with Table 2, Fig. 10 shows the TCSGI, median filter, DNN, and TGAN results. The PSNR of the image processed by TCSGI is between 8 and 9 dB, and the image is too blurry. The median filter can only slightly recover the blurred image, so its effect is limited. For the DNN, the recovered PSNR is higher than 16 dB and the SSIM is around 0.5; the results resemble the original image, but local areas show obvious changes. The PSNR of the TGAN method exceeds 23 dB and the SSIM exceeds 0.7, so the similarity between its output and the original real image is higher.
TABLE 2. Average values of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM)
Image Quality Assessment Method | TCSGI | Median Filter | DNN | TGAN |
---|---|---|---|---|
PSNR (dB) | 8.036 | 14.335 | 18.729 | 21.902 |
SSIM | 0.086 | 0.252 | 0.537 | 0.741 |
Figure 11 shows the imaging results of six groups. Finally, the entropy of the images restored by TGAN is compared with that of the ground truth. Image information entropy evaluates the overall information content of an image, as shown in Eq. (13):
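For an 8-bit grayscale image, the standard definition is

$$ H = -\sum_{i=0}^{255} p(i) \log_2 p(i) \qquad (13) $$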
where p(i) is the probability that a pixel's gray value equals i. The entropy difference between the two cases is found to be very small, which means the method restores the image information well. We take the six groups above as examples, as shown in Fig. 12.
Observation of many test results shows that the overall PSNR and SSIM values of TCSGI imaging are extremely low, and the imaging quality is very poor. The DNN can roughly restore the image, but its overall imaging quality is not as good as that of the TGAN method proposed in this paper. TGAN is about 17% higher than the DNN in PSNR and increases the SSIM by about 0.2. TGAN has a greater improvement effect on blurred and distorted images and restores the image information well, although the texture and details are somewhat unnatural, a known problem of GANs.
In summary, we propose a deep learning-based restoration method for CSGI in atmospheric turbulence. The gamma-gamma model is used to simulate the medium-to-strong range of turbulent fluctuations, giving the TCSGI model. We study the influence of transmission distance and turbulence intensity on imaging, and also investigate the influence of the sampling rate on TCSGI clarity, finding that at a 50% sampling rate the outlines of the images become visible. We first discuss the effect of filters on restoration, which is limited. Compared with other restoration algorithms, deep learning methods show excellent performance for image distortion restoration, especially the proposed TGAN. The restored images achieve good results in clarity and information entropy. This shows that deep learning is well suited to imaging restoration and can solve practical problems in computational imaging.
National Natural Science Foundation of China (No. 62275153, 62005165); Shanghai Industrial Collaborative Innovation Project (HCXBCY-2022-006); projects sponsored by the development fund for Shanghai talents (No: 2021005).
There are no conflicts of interest in the submitted manuscript, and all authors approved the manuscript for publication. The corresponding author declares on behalf of the co-authors that the work described is original research that has not been published previously and is not under consideration for publication elsewhere, in whole or in part. All the authors listed have approved the enclosed manuscript.
Data underlying the results presented in this paper are not publicly available at the time of publication but may be obtained from the authors upon reasonable request.