Current Optics and Photonics
Curr. Opt. Photon. 2024; 8(3): 270-281
Published online June 25, 2024 https://doi.org/10.3807/COPP.2024.8.3.270
Copyright © Optical Society of Korea.
Leihong Zhang1 , Yiqiang Zhang1, Runchu Xu1, Yangjun Li1, Dawei Zhang1,2
Corresponding author: *lhzhang@usst.edu.cn, ORCID 0000-0002-1787-2978
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Information-hiding technology is introduced into an optical ghost imaging encryption scheme, which can greatly improve the security of the encryption scheme. However, in the current mainstream research on camouflage ghost imaging encryption, information hiding techniques such as digital watermarking can only hide 1/4 resolution information of a cover image, and most secret images are simple binary images. In this paper, we propose an equal-resolution image-hiding encryption scheme based on deep learning and computational ghost imaging. With the equal-resolution image steganography network based on deep learning (ERIS-Net), we can realize the hiding and extraction of equal-resolution natural images and increase the amount of encrypted information from 25% to 100% when transmitting the same size of secret data. To the best of our knowledge, this paper combines image steganography based on deep learning with optical ghost imaging encryption method for the first time. With deep learning experiments and simulation, the feasibility, security, robustness, and high encryption capacity of this scheme are verified, and a new idea for optical ghost imaging encryption is proposed.
Keywords: Computational ghost imaging, Deep learning, Equal-resolution image steganography, Optical image encryption
OCIS codes: (060.4785) Optical security and encryption; (100.3020) Image reconstruction-restoration
Optical information security technology exploits the high dimensionality, parallel processing capability, and speed of optics [1–3], which gives it great advantages in protecting information. Computational ghost imaging (CGI) is a commonly used optical image encryption method [4–7] that can enhance the security of the encryption system. Information security technology based on CGI is also constantly improving [8–13]. In 2019, Wang et al. [14] proposed an optical image watermarking method based on singular value decomposition ghost imaging (SVDGI) and a blind watermarking algorithm. In 2020, Wu et al. [15] proposed an optical image watermarking algorithm based on SVDGI and multiple transforms. In 2022, Zhou et al. [16] proposed an optical image watermarking method based on CGI and multiple logistic mappings. In 2023, Xu et al. [17] proposed an image encryption method based on CGI and key modes: a set of key modes from a key-mode database is used to hide data in the illumination modes, CGI then encodes the target image into a series of bucket values, and a chaotic system scrambles and diffuses these values to achieve secure image transmission.
With the rapid development of deep learning technology, its application in the image-processing field is increasing [18]. In particular, ghost imaging encryption schemes based on deep learning show growing application prospects [19, 20]. In 2018, Shimobaba et al. [21] applied a deep neural network (DNN) to CGI; the trained network can predict low-noise reconstructed images from high-noise reconstructed images. In 2021, Liu et al. [22] implemented a CGI reconstruction method based on untrained neural networks and applied compressed sensing technology to greatly improve image reconstruction quality. In 2022, Lin et al. [23] proposed a steganographic optical image encryption method based on single-pixel imaging and untrained neural networks. In the same year, Zhu et al. [24] proposed an optical color ghost cryptography and steganography method based on a multi-discriminator generative adversarial network. Deep learning algorithms can learn the correlation between ciphertext data and the initial images, and thus carry out image decryption. However, most current deep learning algorithms are not applied to the hiding stage of encryption but focus on reconstructing images in CGI [23, 24]. In correlation imaging encryption schemes, hiding encryption is a common approach whose basic principle is to embed confidential information in high-dimensional representations of input images so that images carrying embedded information look like ordinary natural images, thereby achieving covert encryption of image information.
To the best of our knowledge, image steganography based on deep learning has seldom been investigated in CGI encryption systems. This paper proposes an optical image-hiding encryption method based on CGI and an equal-resolution image steganography model (ERIS-Net) with an encoder-decoder structure [25]. In our encryption system, the secret image is hidden in a carrier image of the same size to realize the hiding and encryption of the secret image information. The plaintext image information is then encrypted into a sequence of light-intensity values by the computational ghost imaging encryption system, which improves the security of the system. First, the sender hides a 64 × 64 secret image in an equal-resolution carrier image using the hiding network (Hide-Net) in ERIS-Net to achieve large-capacity image steganography. Second, CGI samples the generated steganography image containing the secret image information 4,096 times (full sampling), encrypting it into a sequence of light-intensity values that is transmitted to the receiver; this completes the encryption stage of the system. Then, the receiver restores the steganography image containing the secret information features from the received 4,096 light-intensity values and the illumination speckle patterns with a second-order correlation reconstruction algorithm. Finally, the secret image is extracted from the steganography image with the extraction network (Extract-Net). This paper verifies the feasibility, security, and information-hiding capacity of this method with simulation experiments. The experimental results show that the proposed encryption system not only has high security and encryption capability, but also achieves equal-resolution image-hiding encryption, greatly improving information transmission capacity during optical communication.
The principle of CGI is shown in Fig. 1. The light beam emitted by the laser is expanded and collimated to illuminate the transmission object image O(x, y). The light beam carrying the amplitude information of the object passes through the object and then illuminates a digital micromirror device (DMD) loaded with a series of Hadamard phase modulation matrices ϕi(x, y). After the phase information of the light field is modulated by the DMD, the reflected or transmitted light intensity information is collected by a bucket detector and recorded as Di. The DMD loads N phase modulation matrices in sequence, and N bucket detector values
After the receiver receives the ciphertext and key from the public channel and secure channel, it can calculate the corresponding intensity distribution Ii(x, y) according to the Fresnel diffraction theorem and the random phase modulation matrix in the key
In Eq. (1), hd(x, y) represents the Fresnel diffraction function factor when the light beam propagates a distance of d, ⊗ is the two-dimensional convolution operation symbol, and Ein(x, y) is the complex amplitude of the laser light source. The receiver can obtain the decrypted image information by performing a second-order correlation operation on the obtained light field information Ii(x, y) and ciphertext information Di. The calculation of bucket detector values and the formula for second-order correlation reconstruction are shown in Eqs. (2) and (3):
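The measurement and second-order correlation steps described above can be illustrated with a minimal numerical sketch. It is a simplification under stated assumptions, not the authors' simulation: the Fresnel propagation of Eq. (1) is omitted, the illumination patterns are taken directly as ±1 Sylvester Hadamard rows, the bucket value is assumed to be D_i = Σ I_i(x, y) O(x, y), and the reconstruction is assumed to be ⟨D_i I_i⟩ − ⟨D_i⟩⟨I_i⟩. The function names `hadamard`, `cgi_encrypt`, and `soc_reconstruct` are hypothetical.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

def cgi_encrypt(obj, patterns):
    """Assumed bucket model: D_i = sum over pixels of I_i(x, y) * O(x, y)."""
    return patterns @ obj.ravel()

def soc_reconstruct(bucket, patterns, shape):
    """Second-order correlation: <D_i I_i> - <D_i><I_i>."""
    n = len(bucket)
    corr = patterns.T @ bucket / n
    recon = corr - bucket.mean() * patterns.mean(axis=0)
    return recon.reshape(shape)

rng = np.random.default_rng(0)
obj = rng.random((8, 8))            # stand-in for the object image O(x, y)
patterns = hadamard(obj.size)       # one pattern per row, 64 patterns (full sampling)
bucket = cgi_encrypt(obj, patterns) # ciphertext: 64 bucket values
recon = soc_reconstruct(bucket, patterns, obj.shape)
```

With a complete, orthogonal ±1 Hadamard set, this reconstruction reproduces every pixel except the first one, which is zeroed by the mean-subtraction term because the averaged pattern is nonzero only at that pixel.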
Traditional second-order correlation reconstruction algorithms suffer from low decryption efficiency and poor imaging quality. To improve imaging quality, a compressed sensing [26] optimization algorithm can be used for decryption to achieve high-quality reconstruction of the secret information. Compressed sensing can recover a signal from fewer samples than the Nyquist sampling theorem requires: by exploiting the sparsity of natural images when compressing and sampling the image signal, the important information of the signal is retained, and the original signal can be recovered from the compressed data by solving a minimum L1 norm optimization problem. Its image reconstruction formula is shown in Eq. (4):
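As a rough illustration of L1-norm recovery, the sketch below uses the iterative shrinkage-thresholding algorithm (ISTA) on the unconstrained LASSO form, min ½‖Ax − y‖² + λ‖x‖₁. This is a stand-in technique chosen for brevity, not necessarily the solver or the exact formulation of Eq. (4); the signal is assumed to be sparse in the pixel basis, and the sizes, `lam`, and iteration count are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, y, x, lam=0.01):
    """LASSO objective: 0.5 * ||Ax - y||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

def ista(A, y, lam=0.01, n_iter=3000):
    """ISTA with fixed step 1/L, where L is the squared spectral norm of A."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(1)
n, m, k = 64, 40, 3                        # signal length, measurements, nonzeros
x_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_true[support] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement matrix
y = A @ x_true                                 # undersampled measurements (m < n)
x_hat = ista(A, y)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

Here only 40 measurements are taken of a length-64 signal, which is the undersampled regime that sparsity-based recovery is meant to handle.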
Image steganography based on deep learning can hide secret image information amounting to 100% of the carrier image size. The ERIS-Net proposed in this paper extracts the feature information of the secret image and hides it within the feature information of the corresponding carrier image using multiple convolution layers, up-sampling, down-sampling, and similar operations. Over the training iterations, it reduces the mean square error between the image containing the secret information and the original image, thus improving the invisibility of the secret information in the steganographic image. The sub-network Extract-Net is continuously optimized during training to reduce the errors between the steganography image and the cover image and between the extracted image and the secret image, so that it finally generates a decrypted image similar to the initial secret image. It is worth noting that the image steganography model has strong generalization ability: only some animal images (cats and dogs) are used for training, yet the images in Set12 and Fashion-MNIST can also be embedded and extracted as secret images.
The ERIS-Net based on an encoder-decoder structure is shown in Fig. 2. In this model, the Hide-Net can hide a secret image in an equal-resolution carrier image, and the corresponding Extract-Net can directly extract the hidden secret image from the steganography image generated by the hiding network. Hide-Net includes a preprocessing convolution layer, down-sampling convolution layer, and transposed convolution up-sampling layer; Extract-Net consists of six convolution layers.
The preprocessing block adjusts the size of the secret image to the same resolution as the carrier image and extracts high-dimensional features of the original image. The convolution down-sampling block and the transposed-convolution up-sampling block fuse the high-dimensional feature maps of the two images after concatenation, and a steganography image containing the secret image information is finally generated with the Tanh activation function.
The mean square error (MSE) loss function is set as the loss function of the network. The sum of the mean square error loss between the carrier image c and the steganography image c′ and β times the mean square error loss between the secret image s and the extracted secret image s′ is used as the total loss function of the equal-resolution image steganography network for training iteration (β is the weight of reconstruction error). The model loss function is shown in Eq. (5):
$$L(c, c', s, s') = \|c - c'\| + \beta \|s - s'\| \quad (5)$$
Here ||c − c′|| denotes the mean square error between c and c′, and ||s − s′|| denotes the mean square error between s and s′.
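The total loss described above, the cover MSE plus β times the secret MSE, can be written as a short sketch. The value of β is not given in the text, so `beta` below is a hypothetical parameter, and the function names `mse` and `eris_loss` are illustrative only.

```python
import numpy as np

def mse(a, b):
    """Mean square error between two arrays of the same shape."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))

def eris_loss(c, c_stego, s, s_extracted, beta=1.0):
    """Total loss: MSE(c, c') + beta * MSE(s, s'). beta is an assumed value."""
    return mse(c, c_stego) + beta * mse(s, s_extracted)

# Toy 64 x 64 images with constant offsets so the result is easy to check by hand.
c = np.zeros((64, 64))
c_stego = c + 0.1            # cover error: MSE = 0.01
s = np.ones((64, 64))
s_extracted = s - 0.2        # secret error: MSE = 0.04
loss = eris_loss(c, c_stego, s, s_extracted, beta=0.5)
```

A larger β weights extraction fidelity more heavily, at the cost of a steganography image that departs further from the cover image.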
This network uses the Adam optimizer to accelerate training. The initial learning rate lr is set to 0.001, and the network is trained for 200 epochs. After the 80th and 150th epochs, the learning rate is reduced to 1/10 of its value. The batch size is set to 8, and the image size is 64 × 64 × 1. The training dataset consists of 7,300 Oxford-IIIT Pet images converted to grayscale with MATLAB (2018b). The test set includes several Set12, Fashion MNIST, and binary images, all scaled to 64 × 64. The hiding and extraction results during training, the loss curves, and some prediction results on the test set are shown in Figs. 3–5. In Figs. 3 and 5, the first row shows the carrier image into which the secret image information is to be written, the second row shows the secret image, the third row shows the steganography image produced by Hide-Net after the information is written, and the fourth row shows the secret image extracted directly from the steganography image with Extract-Net.
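The learning-rate schedule can be sketched as a step function. The text leaves open whether the second reduction is relative to the initial rate or to the already reduced rate; the sketch assumes a multiplicative drop by 1/10 at each milestone (0.001, then 0.0001, then 0.00001). The function name `learning_rate` and the 0-based epoch convention are assumptions.

```python
def learning_rate(epoch, base_lr=1e-3, milestones=(80, 150), gamma=0.1):
    """Learning rate after `epoch` completed epochs (0-based).

    Assumes the rate is multiplied by `gamma` once per milestone reached.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops

# Rates over the 200-epoch run described in the text.
schedule = [learning_rate(e) for e in range(200)]
```

In this reading, epochs 0 to 79 use 0.001, epochs 80 to 149 use 0.0001, and the remaining epochs use 0.00001.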
The hardware and software configuration used in the experiments is as follows: an NVIDIA GeForce RTX 2050 GPU, PyCharm (the development environment), Python 3.9 (the programming language), and CUDA version 10.6.
The image encryption scheme proposed in this paper, based on the equal-resolution image-hiding model and CGI, combines image information hiding and image information encryption to improve invisibility and security during information transmission. First, a deep learning model hides a 64 × 64 grayscale secret image in an equal-resolution grayscale carrier image. Second, correlation imaging encrypts the generated steganography image into a sequence of 4,096 light-intensity values for transmission, which improves system security. Then, using the key sequence required for encryption and the second-order correlation reconstruction algorithm, the receiver reconstructs the steganography image with low feature loss. Finally, Extract-Net, which is trained jointly with Hide-Net, extracts the secret image from the reconstructed steganography image.
Step 1: The sender takes a 64 × 64-pixel secret image S(x, y) and carrier image C(x, y) as the input of Hide-Net, loads the pre-trained model parameters to hide the secret image S(x, y), and generates a steganography image C′(x, y) containing the secret image features.
Step 2: The DMD is used to load a series of Hadamard phase modulation matrices ϕi(x, y) to modulate the light field of the light beam and illuminate the generated steganography image C′(x, y), as shown in Fig. 6.
Step 3: The total light-intensity value reflected by the modulated steganography image C′(x, y) is collected by the bucket detector and recorded as a series of bucket detector values
The receiver receives the ciphertext
Step 1: The receiver reconstructs the steganography image
Step 2: The receiver takes the reconstructed steganography image
To verify the feasibility, security, and image-hiding capacity of the encryption scheme proposed in this paper, MATLAB 2018b was used to perform numerical analysis on experimental images. The experimental objects are six grayscale images (64 × 64 in size) from the Set12 dataset and two images from Fashion MNIST. Four images were selected from the Set12 dataset as carrier images, and the remaining four images were used as secret images. The simulation results and the evaluation parameters are presented in the next section.
This section verifies the feasibility of the proposed ERISN-CGI encryption scheme. First, we verified whether the steganography image carrying the secret image information could be encrypted into a sequence of light-intensity values with ghost imaging. Then, we extracted the secret image information from the image reconstructed by the compressed sensing (CS) algorithm.
To evaluate the quality of the steganography image generated by Hide-Net, we used two image evaluation parameters: the correlation coefficient (CC) and the peak signal-to-noise ratio (PSNR). The steganography image was decrypted by a compressed sensing algorithm, and the secret image was finally extracted by the Extract-Net model. For both parameters, a larger value indicates higher image quality. The mathematical formulas of the correlation coefficient and the peak signal-to-noise ratio are shown in Eqs. (7) and (8).
where Oi,j,
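For reference, the two evaluation parameters can be computed as in the following sketch. It uses the standard definitions of the Pearson correlation coefficient and PSNR, which are assumed (not confirmed) to match Eqs. (7) and (8); the peak value of 255 assumes 8-bit grayscale images, and the function names are hypothetical.

```python
import numpy as np

def correlation_coefficient(a, b):
    """Pearson correlation coefficient between two images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    da, db = a - a.mean(), b - b.mean()
    return float(np.sum(da * db) / np.sqrt(np.sum(da ** 2) * np.sum(db ** 2)))

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit images."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    err = np.mean((reference - test) ** 2)
    if err == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / err))

original = np.arange(16, dtype=float).reshape(4, 4)
rescaled = 2.0 * original + 3.0   # linear change of brightness and contrast
shifted = original + 1.0          # every pixel off by one gray level
cc_value = correlation_coefficient(original, rescaled)
psnr_value = psnr(original, shifted)
```

The example highlights a difference between the two measures: CC is insensitive to a linear change of brightness or contrast, whereas PSNR penalizes any pixel-wise deviation.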
As shown in Figs. 8 and 9 and Table 1, the proposed image-hiding encryption scheme based on deep learning achieved good image reconstruction results both visually and in terms of the evaluation parameters. The carrier image C and the secret image S are passed through Hide-Net to generate a steganography image C′ containing the secret image feature information. The peak signal-to-noise ratio of the two groups of Fashion MNIST images, which have high sparsity, is generally lower than that of the two groups of Set12 images. The number of samples N in the correlation imaging encryption stage is set to 4,096, that is, full sampling, to ensure the quality of the reconstructed image and improve the extraction performance of Extract-Net. Compared with the initial secret image, the secret image S′ extracted with Extract-Net from the steganography image reconstructed by the compressed sensing algorithm has an average peak signal-to-noise ratio (PSNR) of 24.974 dB over the four groups of experiments, which is better than the image-hiding encryption schemes proposed in [24] and [27].
TABLE 1 Comparison of encryption quality evaluation indexes of different encryption schemes (PSNR/dB)
Methods | Steganography image | Extracted secret image |
---|---|---|
[24] | 18.909 | 20.050 |
[27] | 16.700 | 16.680 |
Ours (SOC) | 23.299 | 24.974 |
Ours (CS) | 28.150 | 23.481 |
In addition, comparing the decryption results of the two ghost imaging reconstruction algorithms (CS and SOC) shows that the average PSNR of the steganographic image decrypted by the CS algorithm is about 1 dB higher than that obtained with the SOC algorithm. However, the sparsifying operation in the compressed sensing algorithm reduces the feature gradient information of the steganographic image, so the PSNR of the secret image extracted after CS reconstruction is lower than that of the image extracted after SOC reconstruction. The CC value of the
As shown in Fig. 10, when part of the data is lost during the transmission of the ciphertext, the proposed encryption scheme can reconstruct the steganography image and extract the feature information of the secret image from it by the Extract-Net.
In Fig. 10, the CC value of the steganography image decreases slightly as the sampling rate decreases, reaching 0.9679 when the sampling rate is 50%. The CC value of the secret image decoded by the deep learning model decreases considerably as the sampling rate decreases. However, at a 70% sampling rate the main feature information of the image can still be distinguished, and the CC value reaches 0.8254.
Security is one of the important indicators for evaluating the quality of an encryption scheme. While image information is being transmitted and received, attacks by unauthorized parties are inevitable, and part or even all of the key and ciphertext information may be stolen. The ciphertext-only attack (COA) is one of the three mainstream attack methods in image encryption research: attackers attempt to obtain plaintext images by analyzing the statistical characteristics of intercepted ciphertexts. If the ciphertext of an encryption scheme is intercepted and the attacker successfully obtains the encrypted plaintext information, the security of the encryption system cannot be guaranteed.
This paper used MATLAB 2018b software to analyze the security of the proposed encryption scheme with a pixel distribution histogram and key security analysis.
The pixel distribution histogram intuitively displays the distribution of pixel values in an image. Comparing the carrier image before the secret image information is embedded with the steganographic image after embedding demonstrates the proposed scheme's information-hiding ability, invisibility, and security.
As can be seen in Fig. 11, the pixel distributions of the cover image [Fig. 11(a)] and the secret image [Fig. 11(b)] are completely different, and the steganography image [Fig. 11(c)] is obtained after the secret image feature information extracted by Hide-Net is embedded in the cover image. The region of low pixel values in the cover image is reduced, while the other regions of the pixel distribution do not change significantly. Figure 11(d) shows the ciphertext information, that is, the bucket detector values. Figure 11(e) shows the steganography image reconstructed by the second-order correlation algorithm. Because of normalization, the brightness of the image (the overall range of the pixel distribution) has changed, but the shape of the pixel distribution changes very little, so the feature information of the plaintext image is largely retained. Figure 11(f) shows the secret image extracted through Extract-Net from Fig. 11(e); its pixel distribution is similar to that of the initial secret image [Fig. 11(b)].
Attackers cannot obtain effective information about plaintext from ciphertext distribution, ensuring the security of plaintext. Therefore, the pixel distribution histogram analysis in this section is sufficient to prove that the correlation imaging encryption scheme has good defense capabilities against attack methods based on statistical pixel distribution.
The correlation of adjacent pixels refers to the degree of correlation between two pixels of adjacent positions in an image. It generally includes analysis of three directions: horizontal, vertical, and diagonal. In this part, the CC value is selected as the evaluation index.
As shown in Table 2, adjacent pixels in all images except the ciphertext are highly correlated, with CC values above 0.7. In contrast, the correlation between adjacent data of the ciphertext is very low, with an absolute value below 0.03 in the horizontal, vertical, and diagonal directions. This indicates that the encryption system proposed in this paper has good security.
TABLE 2 CC values of different directions
Directions | Cover | Secret | Plain Text | Cipher Text | GI | Extract |
---|---|---|---|---|---|---|
Horizontal | 0.8623 | 0.8536 | 0.8809 | −0.0263 | 0.8795 | 0.8868 |
Vertical | 0.8915 | 0.8318 | 0.9135 | −0.0221 | 0.9096 | 0.8831 |
Diagonal | 0.7716 | 0.7257 | 0.8058 | 0.0154 | 0.7982 | 0.7844 |
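The adjacent-pixel analysis can be illustrated with the sketch below, which pairs each pixel with its right, lower, or lower-right neighbor and computes the Pearson correlation of the pairs. Using every neighboring pair (rather than a random subset of pairs) and the function name `adjacent_cc` are assumptions; the two test arrays are synthetic stand-ins, not the paper's images.

```python
import numpy as np

def adjacent_cc(img, direction):
    """Correlation coefficient between each pixel and its neighbor in `direction`."""
    img = np.asarray(img, dtype=float)
    if direction == "horizontal":
        x, y = img[:, :-1], img[:, 1:]
    elif direction == "vertical":
        x, y = img[:-1, :], img[1:, :]
    elif direction == "diagonal":
        x, y = img[:-1, :-1], img[1:, 1:]
    else:
        raise ValueError("direction must be horizontal, vertical, or diagonal")
    return float(np.corrcoef(x.ravel(), y.ravel())[0, 1])

rng = np.random.default_rng(2)
smooth = np.add.outer(np.arange(64.0), np.arange(64.0))  # smooth ramp, like a natural image
noise = rng.random((64, 64))                             # unstructured, like a ciphertext
smooth_cc = {d: adjacent_cc(smooth, d) for d in ("horizontal", "vertical", "diagonal")}
noise_cc = {d: adjacent_cc(noise, d) for d in ("horizontal", "vertical", "diagonal")}
```

The smooth array gives values close to 1 in all three directions and the random array gives values close to 0, which mirrors the contrast between the image columns and the Cipher Text column in Table 2.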
As shown in Fig. 12(a), the correct steganography image can be decrypted only with the correct ciphertext and key. Wrong keys 1 are ordinary Hadamard patterns, wrong keys 2 are random phase modulation matrices, and wrong ciphertext 1 is a scrambled ciphertext. To further improve the security of the ciphertext, a random array can be introduced into the ghost imaging encryption process for random sampling to generate a random ciphertext.
If an attacker intercepts all ciphertexts and keys and reconstructs encrypted images using second-order correlation reconstruction or compressed sensing algorithms, they can only decrypt information about carrier images and cannot obtain information about secret images.
Because of factors such as image sensors, lighting, and electronic pulses, the collected data usually contain Gaussian noise, Poisson noise, salt-and-pepper noise, multiplicative noise, and other kinds of noise.
In this section, Gaussian noise is added to the ciphertext sequence of the second group of test images to verify the noise robustness of the encryption scheme. The Gaussian noise formula is shown in Eq. (9), where C and C′ are the ciphertext values without noise and after noise interference, respectively; n represents the noise intensity; and randn(4096, 1) represents a 4,096 × 1 vector of random values drawn from a normal distribution with a mean of 0 and a variance of 1.
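A noise attack of this kind can be sketched as follows. The additive form C′ = C + n · randn(4096, 1) is an assumption based on the symbols defined above; Eq. (9) may scale the noise differently. The function name `add_gaussian_noise` and the stand-in ciphertext are hypothetical.

```python
import numpy as np

def add_gaussian_noise(cipher, n, rng):
    """Assumed additive model: C' = C + n * randn(size of C)."""
    cipher = np.asarray(cipher, dtype=float)
    return cipher + n * rng.standard_normal(cipher.shape)

rng = np.random.default_rng(3)
cipher = rng.random((4096, 1))           # stand-in for 4,096 bucket values
noisy = add_gaussian_noise(cipher, n=0.5, rng=rng)
noise_std = float(np.std(noisy - cipher))
```

Under this model the standard deviation of the added noise equals n, so sweeping n gives a direct handle on how strongly the ciphertext is perturbed.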
The reconstruction results of steganography image
Unlike common cover-image hiding approaches such as least significant bit (LSB) hiding, information-encoding hiding, and frequency-domain hiding (DWT, DFT, DCT, etc.), the proposed steganography method compresses and recovers secret images on all available pixel bits of cover images. Traditional embedding-based hiding techniques and non-embedding hiding techniques currently have relatively low capacity. Since our method is a new embedding-based hiding method, for a more intuitive comparison this paper compares it with several mainstream embedding-based and non-embedding hiding methods; the non-embedding methods include cover selection-based and cover synthesis-based methods. The comparison results are shown in Table 3, where the second column is the absolute hiding capacity (hiding capacity per image), the third column is the size of the target image, and the last column is the relative hiding capacity (hiding capacity per pixel), which can be calculated by Eq. (10):
TABLE 3 Comparison of the hiding capacity of different schemes
Schemes | Absolute Capacity (bytes/image) | Image Size | Relative Capacity (bytes/pixel) |
---|---|---|---|
[28] | 64 × 64 (8 bits per pixel) | 64 × 64 (1 bit per pixel) | 0.125 |
[29] | 32 × 32 | 64 × 64 | 0.250 |
[30] | 32 × 32 × 1 | 32 × 32 × 3 | 0.333 |
Ours | 64 × 64 | 64 × 64 | 1.000 |
The hiding capacity of current mainstream image-hiding methods is shown in Table 3. Ref. [28] uses a method based on cover synthesis. Although its relative capacity is significantly higher than that of methods based on cover selection (which hide text information), it is still lower than the steganographic capacity of our method. Ref. [29] uses DWT digital watermarking; this embedding method can only hide a secret image 1/4 the size of the carrier image. Ref. [30] hides a grayscale image in a color image of the same width and height, which raises the steganographic capacity to 1/3 of the carrier image compared with digital watermarking. Since our method compresses and recovers the secret image pixel information on all bits of the cover image pixels, the relative capacity calculated by Eq. (10) is 1 byte/pixel (8 bits per pixel).
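The relative capacities in Table 3 can be reproduced with a short calculation, assuming Eq. (10) is the ratio of absolute capacity in bytes to the size of the target image. The byte counts below are one reading of the table: the entry for [28] is interpreted as 1 bit hidden per 8-bit cover pixel, and the entry for [30] counts each color channel value of the cover as one unit. Both readings are assumptions.

```python
def relative_capacity(hidden_bytes, cover_size):
    """Assumed form of Eq. (10): hidden bytes divided by the cover image size."""
    return hidden_bytes / cover_size

# (hidden bytes, cover size) per scheme, as interpreted from Table 3.
schemes = {
    "[28]": (64 * 64 / 8, 64 * 64),       # 1 bit per cover pixel = 512 bytes
    "[29]": (32 * 32, 64 * 64),           # quarter-size grayscale secret image
    "[30]": (32 * 32 * 1, 32 * 32 * 3),   # grayscale secret in a 3-channel cover
    "Ours": (64 * 64, 64 * 64),           # equal-resolution grayscale secret
}
capacities = {name: round(relative_capacity(h, c), 3) for name, (h, c) in schemes.items()}
```

Under these assumptions the computed values are 0.125, 0.25, 0.333, and 1.0, which agree with the last column of Table 3.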
This paper proposed an image-hiding encryption scheme based on ERIS-Net and CGI. First, two grayscale images with a resolution of 64 × 64 are taken as the input of the encoder module (Hide-Net) of ERIS-Net to generate a steganography image that is similar to the carrier image but contains the secret image feature information. Second, the generated steganography image is encrypted into a sequence of 4,096 light-intensity values (full sampling) with computational correlation imaging, and this ciphertext is transmitted to the receiver through a public channel. The 4,096 corresponding Hadamard phase modulation matrices used for encryption are transmitted to the receiver through a secure channel to complete the encryption process. Then, after receiving the ciphertext and key, the receiver uses the compressed sensing or second-order correlation algorithm to decrypt the steganography image. Finally, the decrypted steganography image is used as the input of the decoder (Extract-Net) for secondary decryption to recover the original secret image information. In numerical simulation, the average PSNR and CC values of the secret image in the proposed algorithm reach 25.432 dB and 0.967, respectively, so the proposed encryption scheme achieves good visual quality and numerical evaluation quality. In addition, the security of the scheme has been verified with pixel histogram and key security analysis. Finally, comparison with traditional image-hiding encryption schemes such as digital watermark embedding has verified the high information-hiding capacity of the scheme. Compared with traditional encryption schemes based on information hiding and optical ghost imaging, the information-hiding capacity of the proposed scheme can be increased to 8 bits per pixel with the same amount of ciphertext transmission.
The authors thank the National Natural Science Foundation of China, Shanghai Industrial Collaborative Innovation Project and the development fund for Shanghai talents for help identifying collaborators for this work.
National Natural Science Foundation of China (Grant no. 62275153, 62005165); Shanghai Industrial Collaborative Innovation Project (Grant no. HCXBCY-2022-006); the development fund for Shanghai talents (Grant no. 2021005); the Key Basic Research Projects of the Basic Strengthening Plan (Grant no. 2021-JCJQ-ZD040-02-03); the Key Laboratory of Space Active Opto-electronics Technology of Chinese Academy of Sciences (Grant no. 2021ZDKF4); the Shanghai Science and Technology Innovation Action Plan (Grant no. 22dz1201300).
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.
1School of Optical-electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
2Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai 200092, China
Correspondence to:*lhzhang@usst.edu.cn, ORCID 0000-0002-1787-2978
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Information-hiding technology is introduced into an optical ghost imaging encryption scheme, which can greatly improve the security of the encryption scheme. However, in the current mainstream research on camouflage ghost imaging encryption, information hiding techniques such as digital watermarking can only hide 1/4 resolution information of a cover image, and most secret images are simple binary images. In this paper, we propose an equal-resolution image-hiding encryption scheme based on deep learning and computational ghost imaging. With the equal-resolution image steganography network based on deep learning (ERIS-Net), we can realize the hiding and extraction of equal-resolution natural images and increase the amount of encrypted information from 25% to 100% when transmitting the same size of secret data. To the best of our knowledge, this paper combines image steganography based on deep learning with optical ghost imaging encryption method for the first time. With deep learning experiments and simulation, the feasibility, security, robustness, and high encryption capacity of this scheme are verified, and a new idea for optical ghost imaging encryption is proposed.
Keywords: Computational ghost imaging, Deep learning, Equal-resolution image steganography, Optical image encryption
Optical information security technology can take advantage of the high-dimensional, parallel processing capability and fast characteristics of optics [1–3], and has great advantages in protecting information. Computational ghost imaging (CGI) is a commonly used optical image encryption method [4–7] that can enhance the security of the encryption system. Information security technology based on CGI is also constantly improving [8–13]. In 2019, Wang et al. [14] proposed an optical image watermarking method based on singular value decomposition ghost imaging (SVDGI) and a blind watermarking algorithm. In 2020, Wu et al. [15] proposed an optical image watermarking algorithm based on SVDGI and multiple transforms. In 2022, Zhou et al. [16] proposed an optical image watermarking method based on CGI and multiple logistic mappings. In 2023, Xu et al. [17] proposed an image encryption method based on CGI and key mode. Using a set of key modes in the key mode database to hide data in the illumination mode, then using CGI technology to encode the target image into a series of bucket values, and using a chaotic system to scramble and diffuse it to achieve secure transmission of images.
With the rapid development of deep learning technology, its application in the image-processing field is increasing [18]. In particular, ghost imaging encryption schemes based on deep learning show growing application prospects [19, 20]. In 2018, Shimobaba et al. [21] applied a deep neural network (DNN) to CGI; the trained network can predict low-noise reconstructed images from high-noise reconstructed images. In 2021, Liu et al. [22] implemented a CGI reconstruction method based on untrained neural networks and applied compressed sensing technology to greatly improve image reconstruction quality. In 2022, Lin et al. [23] proposed a steganographic optical image encryption method based on single-pixel imaging and untrained neural networks. In the same year, Zhu et al. [24] proposed an optical color ghost cryptography and steganography method based on a multi-discriminator generative adversarial network. Deep learning algorithms can learn the correlation between ciphertext data and the initial images, thus realizing image decryption. However, most current deep learning algorithms are not applied to the hiding encryption process but focus on reconstructing images in CGI [23, 24]. In correlation imaging encryption schemes, hiding encryption is a common approach whose basic principle is to embed confidential information in high-dimensional representations of input images so that the images carrying the information look like ordinary natural images, thereby achieving covert encryption of image information.
To the best of our knowledge, image steganography based on deep learning has seldom been investigated in CGI encryption systems so far. This paper proposes an optical image-hiding encryption method based on CGI and an equal-resolution image steganography model (ERIS-Net) with an encoder-decoder structure [25]. In our encryption system, the secret image is hidden in a carrier image of the same size to realize the hiding and encryption of the secret image information. The plaintext image information is then encrypted into a sequence of light-intensity values by the computational ghost imaging encryption system, which improves the security of the system. First, the sender hides a 64 × 64 secret image in an equal-resolution carrier image using the hiding network (Hide-Net) in ERIS-Net to achieve large-capacity image steganography. Second, the generated steganography image containing the secret image information is sampled 4,096 times (full sampling) with CGI, which encrypts it into a sequence of light-intensity values that is transmitted to the receiver; this completes the encryption stage of the system. Third, the receiver restores the steganography image containing the secret information features from the 4,096 received light-intensity values and the illumination speckle patterns with a second-order correlation reconstruction algorithm. Finally, the secret image is extracted from the steganography image with the extraction network (Extract-Net). This paper verifies the feasibility, security, and information-hiding capacity of this method with simulation experiments. The experimental results show that the proposed encryption system not only has high security and encryption capability but also achieves equal-resolution image-hiding encryption, greatly improving information transmission capacity in optical communication.
The principle of CGI is shown in Fig. 1. The light beam emitted by the laser is expanded and collimated to illuminate the transmissive object image O(x, y). The light beam carrying the amplitude information of the object passes through the object and then illuminates a digital micromirror device (DMD) loaded with a series of Hadamard phase modulation matrices ϕi(x, y). After the phase information of the light field is modulated by the DMD, the reflected or transmitted light intensity is collected by a bucket detector and recorded as Di. The DMD loads N phase modulation matrices in sequence, and N bucket detector values Di (i = 1, 2, …, N) are recorded.
After the receiver receives the ciphertext and key from the public channel and secure channel, it can calculate the corresponding intensity distribution Ii(x, y) according to the Fresnel diffraction theorem and the random phase modulation matrix in the key
In Eq. (1), hd(x, y) represents the Fresnel diffraction function factor when the light beam propagates a distance of d, ⊗ is the two-dimensional convolution operation symbol, and Ein(x, y) is the complex amplitude of the laser light source. The receiver can obtain the decrypted image information by performing a second-order correlation operation on the obtained light field information Ii(x, y) and ciphertext information Di. The calculation of bucket detector values and the formula for second-order correlation reconstruction are shown in Eqs. (2) and (3):
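The encrypt-then-correlate idea described above can be illustrated with a small numerical sketch. It is a simplified model under stated assumptions, not the scheme's actual optical simulation: the Fresnel propagation step is omitted, the illumination patterns are taken to be ±1 Hadamard rows (as a differential measurement would provide), and the names `hadamard`, `encrypt`, and `reconstruct` are hypothetical. For orthogonal ±1 patterns the plain correlation average (1/N) Σ Di·patterni already returns the object exactly, so the mean-subtraction term of the general second-order correlation is left out here.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def encrypt(image, patterns):
    """Bucket values: D_i = sum over pixels of pattern_i * image (both flattened)."""
    return [sum(p * o for p, o in zip(pattern, image)) for pattern in patterns]


def reconstruct(buckets, patterns):
    """Correlation average (1/N) * sum_i D_i * pattern_i, evaluated per pixel."""
    n = len(patterns)
    num_pixels = len(patterns[0])
    return [
        sum(d * pattern[k] for d, pattern in zip(buckets, patterns)) / n
        for k in range(num_pixels)
    ]


# Toy 4 x 4 object, flattened row by row: 16 pixels, 16 patterns (full sampling).
obj = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150]
patterns = hadamard(len(obj))        # key: one flattened pattern per measurement
ciphertext = encrypt(obj, patterns)  # ciphertext: 16 bucket values
recovered = reconstruct(ciphertext, patterns)

# Using the right patterns in the wrong order (a wrong key) no longer recovers the object.
wrong_key_result = reconstruct(ciphertext, patterns[::-1])
```

With the correct key, `recovered` equals `obj`; with the reversed pattern order the result differs, which mirrors the role of the modulation matrices as the key.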
Traditional second-order correlation reconstruction algorithms suffer from low decryption efficiency and poor imaging quality. To improve imaging quality, a compressed sensing [26] optimization algorithm can be used for decryption to achieve high-quality reconstruction of the secret information. Compressed sensing allows a signal to be recovered from fewer samples than the Nyquist sampling theorem requires: by exploiting the sparsity of natural images to compress and sample the image signal, the important information in the signal is retained, and the original signal can be recovered from the compressed data by solving a minimum L1-norm optimization problem. Its image reconstruction formula is shown in Eq. (4):
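To make the L1-norm recovery idea concrete, the sketch below uses the iterative shrinkage-thresholding algorithm (ISTA) on a tiny problem with two measurements and three unknowns. This is an illustrative stand-in, not the solver used in the paper: ISTA minimizes the penalized form 0.5·||Ax − y||² + λ||x||₁ rather than a constrained minimum-L1 problem, and the matrix `A`, the weight `lam`, and all function names are assumptions chosen for the example.

```python
import math


def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def ista(a, y, lam, step, iterations):
    """Minimize 0.5*||A x - y||^2 + lam*||x||_1; step must not exceed 1/||A||_2^2."""
    x = [0.0] * len(a[0])
    for _ in range(iterations):
        residual = [ri - yi for ri, yi in zip(matvec(a, x), y)]
        # Gradient of the quadratic term: A^T (A x - y).
        grad = [sum(a[i][j] * residual[i] for i in range(len(a))) for j in range(len(x))]
        x = [soft_threshold(xj - step * gj, step * lam) for xj, gj in zip(x, grad)]
    return x


s = 1.0 / math.sqrt(2.0)
A = [[1.0, 0.0, s],
     [0.0, 1.0, s]]              # 2 measurements, 3 unknowns
x_true = [0.0, 0.0, 2.0]         # sparse signal: a single nonzero entry
y = matvec(A, x_true)            # compressed measurements

# The largest eigenvalue of A A^T is 2, so step = 0.5 is admissible.
x_hat = ista(A, y, lam=0.01, step=0.5, iterations=3000)
```

Although the system is underdetermined, the L1 penalty selects the sparse solution: `x_hat` is close to `[0, 0, 1.99]`, i.e. the true signal shrunk by `lam`.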
Image steganography based on deep learning can hide secret image information amounting to 100% of the carrier image size. The ERIS-Net proposed in this paper extracts the feature information of the secret image and hides it in the features of the corresponding carrier image using multiple convolution layers, up- and down-sampling, and related operations. Over the training iterations, it reduces the mean square error between the image containing the secret information and the original image, thus improving the invisibility of the secret information in the steganographic image. The sub-network Extract-Net is continuously optimized during training to reduce the errors between the steganography image and the cover image and between the extracted image and the secret image, and finally generates a decrypted image similar to the initial secret image. It is worth noting that the image steganography model has strong generalization ability: although only some animal images (cats and dogs) are used for training, images from Set12 and Fashion MNIST can also be embedded and extracted as secret images.
The ERIS-Net based on an encoder-decoder structure is shown in Fig. 2. In this model, the Hide-Net can hide a secret image in an equal-resolution carrier image, and the corresponding Extract-Net can directly extract the hidden secret image from the steganography image generated by the hiding network. Hide-Net includes a preprocessing convolution layer, down-sampling convolution layer, and transposed convolution up-sampling layer; Extract-Net consists of six convolution layers.
The preprocessing block adjusts the size of the secret image to the same resolution as the carrier image and extracts high-dimensional features of the original image. The convolution down-sampling block and transposed convolution up-sampling block fuse the high-dimensional feature maps of the two images after concatenation, and the steganography image containing the secret image information is finally generated with the Tanh activation function.
The mean square error (MSE) loss function is set as the loss function of the network. The sum of the mean square error loss between the carrier image c and the steganography image c′ and β times the mean square error loss between the secret image s and the extracted secret image s′ is used as the total loss function of the equal-resolution image steganography network for training iteration (β is the weight of reconstruction error). The model loss function is shown in Eq. (5):
Here, ||c − c′|| denotes the mean square error between c and c′, and ||s − s′|| denotes the mean square error between s and s′.
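As described above, the total loss is the carrier reconstruction error plus β times the secret reconstruction error, that is, ||c − c′|| + β||s − s′||. A minimal sketch of that computation on flattened images follows; the function names are hypothetical, and the value of β used in the example is an arbitrary illustration because the text does not state the value used in training.

```python
def mse(a, b):
    """Mean square error between two equally sized, flattened images."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def eris_loss(c, c_stego, s, s_extracted, beta):
    """Total loss: MSE(carrier, stego) + beta * MSE(secret, extracted secret)."""
    return mse(c, c_stego) + beta * mse(s, s_extracted)


# Two-pixel toy images; beta = 0.75 is an arbitrary illustrative weight.
loss = eris_loss(c=[0.0, 1.0], c_stego=[0.0, 0.5],
                 s=[1.0, 1.0], s_extracted=[0.0, 1.0], beta=0.75)
```

A larger β pushes training toward faithful extraction of the secret image at the cost of a less faithful steganography image, and a smaller β does the opposite.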
This network uses the Adam optimizer to accelerate training. The initial learning rate lr is set to 0.001, and training runs for 200 epochs. After the 80th and 150th epochs, the learning rate is reduced by a factor of 10. The batch size is set to 8, the image size is 64 × 64 × 1, and the dataset consists of 7,300 Oxford-IIIT_Pets images converted to grayscale with MATLAB (2018b). The test set includes several Set12, Fashion MNIST, and binary images, all scaled to 64 × 64. The hiding and extraction results during training, the loss curves, and some prediction results on the test set are shown in Figs. 3–5. In Figs. 3 and 5, the first row shows the carrier image into which the secret image information is to be written, the second row shows the secret image, the third row shows the steganography image generated by Hide-Net, and the fourth row shows the secret image directly extracted from the steganography image with Extract-Net.
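The learning-rate schedule can be written as a small step function. The sketch assumes that epochs are indexed from 0 and that each milestone multiplies the current rate by 0.1, so the rate is 0.001 for the first 80 epochs, 0.0001 up to epoch 150, and 0.00001 afterwards; the function name and this cumulative reading of the schedule are assumptions.

```python
def learning_rate(epoch, base_lr=0.001, milestones=(80, 150), gamma=0.1):
    """Learning rate for a 0-based epoch index, scaled by gamma at each milestone passed."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


# Learning rate for each of the 200 training epochs.
schedule = [learning_rate(epoch) for epoch in range(200)]
```

In a deep learning framework the same behavior is usually obtained from a built-in multi-step scheduler rather than computed by hand.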
The hardware and software configuration used in the experiments is as follows: an NVIDIA GeForce RTX 2050 GPU, PyCharm (the development environment), Python 3.9 (the programming environment), and CUDA version 10.6.
The image encryption scheme proposed in this paper, based on the equal-resolution image-hiding model and CGI, combines image information hiding and image information encryption to improve invisibility and security during information transmission. First, a deep learning model hides a 64 × 64 grayscale secret image in an equal-resolution grayscale carrier image. Second, correlation imaging encrypts the generated steganography image into a sequence of 4,096 light-intensity values for transmission, which improves system security. Third, the steganography image is reconstructed with low feature loss from the key sequence used for encryption and the second-order correlation reconstruction algorithm. Finally, Extract-Net, which is trained together with Hide-Net, extracts the secret image from the reconstructed steganography image.
Step 1: The sender takes a 64 × 64-pixel secret image S(x, y) and carrier image C(x, y) as the input of Hide-Net and calls the pre-trained model parameters to hide the secret image S(x, y), generating a steganography image C′(x, y) containing secret image features.
Step 2: The DMD is used to load a series of Hadamard phase modulation matrices ϕi(x, y) to modulate the light field of the light beam and illuminate the generated steganography image C′(x, y), as shown in Fig. 6.
Step 3: The total light-intensity value reflected by the modulated steganography image C′(x, y) is collected by the bucket detector and recorded as a series of bucket detector values
The receiver receives the ciphertext
Step 1: The receiver reconstructs the steganography image
Step 2: The receiver takes the reconstructed steganography image
To verify the feasibility, security, and image-hiding capacity of the encryption scheme proposed in this paper, MATLAB 2018b simulation software was used to perform numerical analysis on experimental images. The experimental objects are six grayscale images (64 × 64 in size) in Set12 dataset and two images in Fashion MNIST. Four images were selected from Set12 dataset as carrier images, and the remaining four images were used as secret images. The selection of simulation results and evaluation standard parameters is shown in the next section.
This section verifies the feasibility of the proposed ERISN-CGI encryption scheme. First, we verified whether the steganography image containing the secret image information could be encrypted into a sequence of light-intensity values with ghost imaging. Then we extracted the secret image information from the image reconstructed by the compressed sensing (CS) algorithm.
To evaluate the image quality of the steganography image generated by Hide-Net, we used two image evaluation parameters: the correlation coefficient (CC) and the peak signal-to-noise ratio (PSNR). The steganography image was decrypted by a compressed sensing algorithm, and the secret image was finally extracted by the Extract-Net model. For both parameters, the larger the value, the higher the image quality. The mathematical formulas of the correlation coefficient and peak signal-to-noise ratio are shown in Eqs. (7) and (8).
where Oi,j,
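For reference, the two metrics can be computed as in the sketch below, which uses the standard definitions of the Pearson correlation coefficient and of PSNR on flattened images. The peak value of 255 (8-bit grayscale) and the function names are assumptions, since the exact normalization used in the experiments is not stated here.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two flattened images."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def correlation_coefficient(a, b):
    """Pearson correlation coefficient between two flattened images."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)


# Tiny example: an image and a noisy copy of it.
original = [10, 50, 90, 130, 170, 210]
noisy = [12, 47, 93, 128, 171, 208]
quality_db = psnr(original, noisy)
similarity = correlation_coefficient(original, noisy)
```

Both values grow as the two images become more alike: CC approaches 1 and PSNR grows without bound as the mean square error goes to zero.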
As shown in Figs. 8 and 9 and Table 1, the proposed image-hiding encryption scheme based on deep learning achieved good image reconstruction results both visually and in terms of the evaluation metrics. Hide-Net generates a steganography image C′ containing secret image feature information from the carrier image C and the secret image S. The PSNR of the two groups of Fashion MNIST images, which have high sparsity, is generally lower than that of the two groups of Set12 images. The number of samples N in the correlation imaging encryption part is set to 4,096, that is, full sampling, to ensure the quality of the reconstructed image and improve the extraction performance of Extract-Net. Compared with the initial secret image, the secret image S′ extracted with Extract-Net from the steganography image reconstructed by the compressed sensing algorithm has an average PSNR of 24.974 dB over the four groups of experiments, which is better than the image-hiding encryption schemes proposed in [24] and [27].
TABLE 1. Comparison of encryption quality evaluation indexes of different encryption schemes (PSNR/dB).
Methods | Steganography Image | Secret Image |
---|---|---|
[24] | 18.909 | 20.050 |
[27] | 16.700 | 16.680 |
Ours (SOC) | 23.299 | 24.974 |
Ours (CS) | 28.150 | 23.481 |
In addition, comparing the decryption results of the two ghost imaging reconstruction algorithms (CS and second-order correlation, SOC) shows that the average PSNR of the steganographic image decrypted by the CS algorithm is about 1 dB higher than that obtained with the SOC algorithm. However, because of the sparsity operation in the compressed sensing algorithm, the feature gradient information of the steganographic image is reduced, so the PSNR of the secret image extracted after CS reconstruction is lower than that of the image extracted after SOC reconstruction. The CC value of the
As shown in Fig. 10, when part of the data is lost during the transmission of the ciphertext, the proposed encryption scheme can reconstruct the steganography image and extract the feature information of the secret image from it by the Extract-Net.
In Fig. 10, the CC value of the steganography image decreases slightly as the sampling rate decreases, reaching 0.9679 at a sampling rate of 50%. The CC value of the secret image decoded by the deep learning model decreases markedly as the sampling rate decreases. However, at a 70% sampling rate the main feature information of the image can still be distinguished, and the CC value reaches 0.8254.
Security is one of the important indicators for evaluating the quality of an encryption scheme. While image information is being transmitted and received, intrusion by attackers is hard to rule out, and part or even all of the key and ciphertext information may be stolen. The ciphertext-only attack (COA) is one of the three mainstream attack methods in image encryption research: attackers try to obtain plaintext images by analyzing the statistical characteristics of intercepted ciphertexts. If the ciphertext of an encryption scheme is intercepted and the attacker successfully obtains the encrypted plaintext information, the security of the encryption system cannot be guaranteed.
This paper used MATLAB 2018b software to analyze the security of the proposed encryption scheme with a pixel distribution histogram and key security analysis.
The pixel distribution histogram intuitively displays the distribution of pixel values in an image. Comparing the carrier image before the secret image information is embedded with the steganographic image after embedding shows the proposed scheme's information-hiding ability, invisibility, and security.
As can be seen in Fig. 11, the pixel distributions of the cover image [Fig. 11(a)] and the secret image [Fig. 11(b)] are completely different, and the steganography image [Fig. 11(c)] is obtained after the secret image feature information extracted by Hide-Net is embedded in the cover image. The area with low pixel values in the cover image is reduced, and the other pixel distribution areas do not change significantly. Figure 11(d) shows the ciphertext information, that is, the bucket detector values. Figure 11(e) shows the steganography image reconstructed by the second-order correlation algorithm. Because of normalization, the brightness of the image (the overall distribution range of pixels) has changed, but the shape of the pixel distribution changes very little, and the feature information of the plaintext image is largely retained. Figure 11(f) shows the secret image extracted from Fig. 11(e) with Extract-Net; its pixel distribution is similar to that of the initial secret image [Fig. 11(b)].
Attackers cannot obtain effective information about the plaintext from the ciphertext distribution, which protects the plaintext. The pixel distribution histogram analysis in this section therefore indicates that the correlation imaging encryption scheme has good defense capabilities against attacks based on statistical pixel distribution.
The correlation of adjacent pixels refers to the degree of correlation between two pixels of adjacent positions in an image. It generally includes analysis of three directions: horizontal, vertical, and diagonal. In this part, the CC value is selected as the evaluation index.
As shown in Table 2, the adjacent pixels of every image except the ciphertext are highly correlated, with CC values above 0.7. In contrast, the correlation between adjacent data of the ciphertext is very low, with a magnitude below 0.03 in the horizontal, vertical, and diagonal directions. This indicates that the encryption system proposed in this paper has good security.
TABLE 2. CC values of different directions.
Directions | Cover | Secret | Plain Text | Cipher Text | GI | Extract |
---|---|---|---|---|---|---|
Horizontal | 0.8623 | 0.8536 | 0.8809 | −0.0263 | 0.8795 | 0.8868 |
Vertical | 0.8915 | 0.8318 | 0.9135 | −0.0221 | 0.9096 | 0.8831 |
Diagonal | 0.7716 | 0.7257 | 0.8058 | 0.0154 | 0.7982 | 0.7844 |
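An adjacent-pixel correlation of the kind reported in Table 2 can be computed as sketched below. The sketch pairs every pixel with its right, lower, or lower-right neighbor and takes the Pearson correlation of the pairs; using all pairs rather than a random sample of them, and the function names, are assumptions about the procedure.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def adjacent_correlation(image, direction):
    """CC between each pixel and its neighbor in the given direction."""
    dy, dx = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}[direction]
    rows, cols = len(image), len(image[0])
    first, second = [], []
    for y in range(rows - dy):
        for x in range(cols - dx):
            first.append(image[y][x])
            second.append(image[y + dy][x + dx])
    return pearson(first, second)


# A smooth ramp (strongly correlated neighbors) and a checkerboard
# (horizontally anti-correlated, diagonally correlated neighbors).
ramp = [[10 * y + x for x in range(4)] for y in range(4)]
checkerboard = [[(x + y) % 2 for x in range(4)] for y in range(4)]
ramp_cc = {d: adjacent_correlation(ramp, d) for d in ("horizontal", "vertical", "diagonal")}
board_cc = {d: adjacent_correlation(checkerboard, d) for d in ("horizontal", "vertical", "diagonal")}
```

Natural images behave like the ramp, with values near 1, whereas a well-scrambled ciphertext should give values near 0 in every direction.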
As shown in Fig. 12(a), the correct steganography image can only be decrypted with the correct ciphertext and key. Here, wrong keys 1 are ordinary Hadamard patterns, wrong keys 2 are random phase modulation matrices, and wrong ciphertext 1 is the scrambled ciphertext. To further improve the security of the ciphertext, a random array can be introduced into the ghost imaging encryption process for random sampling to generate random ciphertext.
If an attacker intercepts all ciphertexts and keys and reconstructs encrypted images using second-order correlation reconstruction or compressed sensing algorithms, they can only decrypt information about carrier images and cannot obtain information about secret images.
Because of factors such as image sensors, lighting, and electronic pulses, the collected data usually contain Gaussian noise, Poisson noise, salt-and-pepper noise, multiplicative noise, and other types of noise.
This section adds Gaussian noise to the ciphertext sequence of the second group of test images to verify the noise robustness of the encryption scheme. The Gaussian noise formula is shown in Eq. (9), where C and C′ are the ciphertext values without noise and after noise interference, respectively; n represents the noise intensity; and randn(4096, 1) represents a 4,096 × 1 random vector whose entries follow a normal distribution with a mean of 0 and a variance of 1.
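A noise attack of this kind can be simulated as in the sketch below. It assumes the simplest additive form, in which each ciphertext value is perturbed by n times a standard normal sample; the exact form of Eq. (9) may differ (for example, the noise could be scaled by the ciphertext), and the function name and seed parameter are hypothetical.

```python
import random


def add_gaussian_noise(ciphertext, n, seed=None):
    """Return ciphertext + n * g, with g drawn independently from N(0, 1)."""
    rng = random.Random(seed)
    return [c + n * rng.gauss(0.0, 1.0) for c in ciphertext]


# A stand-in ciphertext of 4,096 bucket values (arbitrary numbers for illustration).
clean = [float(i % 97) for i in range(4096)]
noisy = add_gaussian_noise(clean, n=0.5, seed=1)
unchanged = add_gaussian_noise(clean, n=0.0, seed=1)
```

The noisy sequence would then be passed through the same reconstruction and extraction steps as the clean ciphertext to measure how CC and PSNR degrade with increasing n.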
The reconstruction results of steganography image
Unlike conventional cover-image methods such as least significant bit (LSB) hiding, information-encoding hiding, and frequency-domain hiding (DWT, DFT, DCT, etc.), the proposed steganography method compresses and recovers secret images on all available pixel bits of the cover image. Traditional embedding-based hiding techniques and non-embedding hiding techniques currently have relatively low hiding capacities. Since our method is a new embedding-based hiding method, this paper compares it with some other mainstream embedding and non-embedding hiding methods to make the comparison more intuitive. The non-embedding hiding methods include cover selection-based and cover synthesis-based methods. The comparison results are shown in Table 3, where the second column is the absolute hiding capacity (hiding capacity per image), the third column is the size of the target image, and the last column is the relative hiding capacity (hiding capacity per pixel), which can be calculated by Eq. (10):
TABLE 3. Comparison of the hiding capacity of different schemes.
Schemes | Absolute Capacity (bytes/image) | Image Size | Relative Capacity (bytes/pixel) |
---|---|---|---|
[28] | 64 × 64 (8 bits per pixel) | 64 × 64 (1 bit per pixel) | 0.125 |
[29] | 32 × 32 | 64 × 64 | 0.250 |
[30] | 32 × 32 × 1 | 32 × 32 × 3 | 0.333 |
Ours | 64 × 64 | 64 × 64 | 1.000 |
The hiding capacity of current mainstream image-hiding methods is shown in Table 3. The entry for [28] gives the steganographic capacity of a method based on cover synthesis. Although its relative capacity is significantly higher than that of methods based on cover selection (which hide text information), it is still lower than the steganographic capacity of our method. The entry for [29] gives the steganographic capacity of DWT-based digital watermarking; this embedding method can only hide secret images 1/4 the size of the carrier image. The method in [30] hides a grayscale image in a color image of the same width and height, which raises the steganographic capacity to 1/3 of the carrier image, compared with 1/4 for digital watermarking. Since our method compresses and recovers the secret image pixel information on all bits of the cover image pixels, the relative capacity calculated by Eq. (10) is 1 byte/pixel (8 bits per pixel).
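The relative capacities in Table 3 follow from dividing the absolute hiding capacity by the size of the cover image. The short sketch below reproduces three of the table rows under that reading; the function name is hypothetical, and the row for [28] is left out because its bit-depth annotations would need an additional interpretation.

```python
def relative_capacity(absolute_capacity, image_size):
    """Relative hiding capacity: hidden data per unit of cover image size."""
    if image_size <= 0:
        raise ValueError("image_size must be positive")
    return absolute_capacity / image_size


capacities = {
    "[29]": relative_capacity(32 * 32, 64 * 64),          # quarter-size secret image
    "[30]": relative_capacity(32 * 32 * 1, 32 * 32 * 3),  # grayscale hidden in color
    "Ours": relative_capacity(64 * 64, 64 * 64),          # equal-resolution hiding
}
```

Rounded to three decimals, the results are 0.250, 0.333, and 1.000, matching the last column of the table.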
This paper proposed an image-hiding encryption scheme based on ERIS-Net and CGI. First, two grayscale images with a resolution of 64 × 64 are taken as the input of the encoder module (Hide-Net) of ERIS-Net to generate a steganography image that is similar to the carrier image but contains secret image feature information. Second, the generated steganography image is encrypted into a sequence of 4,096 light-intensity values (full sampling) with computational correlation imaging and transmitted to the receiver as ciphertext through a public channel, while the 4,096 corresponding Hadamard phase modulation matrices used for encryption are transmitted to the receiver through a secure channel; this completes the encryption process. Third, after receiving the ciphertext and key, the receiver uses the compressed sensing or second-order correlation algorithm to decrypt the steganography image. Finally, the decrypted steganography image is fed to the decoder (Extract-Net) for secondary decryption to recover the original secret image information. In numerical simulation, the average PSNR and CC values of the secret image in the proposed algorithm reach 25.432 dB and 0.967, respectively, so the proposed encryption scheme achieves good visual quality and numerical evaluation quality. In addition, the security of the scheme has been verified with pixel histogram and key security analysis. Finally, comparison with traditional image-hiding encryption schemes such as digital watermark embedding verifies the high information-hiding capacity of this scheme: compared with traditional encryption schemes based on information hiding and optical ghost imaging, the information-hiding capacity of the proposed scheme can be increased to 8 bits per pixel with the same amount of ciphertext transmission.
The authors thank the National Natural Science Foundation of China, Shanghai Industrial Collaborative Innovation Project and the development fund for Shanghai talents for help identifying collaborators for this work.
National Natural Science Foundation of China (Grant no. 62275153, 62005165); Shanghai Industrial Collaborative Innovation Project (Grant no. HCXBCY-2022-006); the development fund for Shanghai talents (Grant no. 2021005); the Key Basic Research Projects of the Basic Strengthening Plan (Grant no. 2021-JCJQ-ZD040-02-03); the Key Laboratory of Space Active Opto-electronics Technology of Chinese Academy of Sciences (Grant no. 2021ZDKF4); the Shanghai Science and Technology Innovation Action Plan (Grant no. 22dz1201300).
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.