Current Optics and Photonics
Curr. Opt. Photon. 2021; 5(5): 514-523
Published online October 25, 2021 https://doi.org/10.3807/COPP.2021.5.5.514
Copyright © Optical Society of Korea.
Beomjun Kim, Daerak Heo, Woonchan Moon, Joonku Hahn
School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Korea
Corresponding author: jhahn@knu.ac.kr, ORCID 0000-0002-5038-7253
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Methods for absolute depth estimation have received considerable interest, and most algorithms focus on minimizing the difference between an input defocused image and an estimated defocused image. These approaches can increase algorithmic complexity, since the defocused image must be calculated from an estimate of the focused image. In this paper, we present a new method to recover the depth of a scene based on a sharpness-assessment algorithm. The proposed algorithm estimates the depth of the scene by calculating the sharpness of images deconvolved with depth-specific point-spread functions (PSFs). While most depth-estimation studies evaluate the depth of the scene only behind the focal plane, the proposed method evaluates a broad depth range both nearer and farther than the focal plane. This is accomplished using an asymmetric aperture, so that the PSF at a position nearer than the focal plane differs from that at a position farther than the focal plane. From an image taken with the focal plane at 160 cm, the depth of an object over the broad range from 60 to 350 cm is estimated at 10-cm resolution. With this asymmetric aperture, we demonstrate the feasibility of the sharpness-assessment algorithm for recovering the absolute depth of a scene from a single defocused image.
Keywords: Coded aperture, Depth estimation, Image reconstruction
OCIS codes: (100.2000) Digital image processing; (100.3008) Image recognition, algorithms and filters; (100.3020) Image reconstruction-restoration; (120.3940) Metrology; (170.1630) Coded aperture imaging
Depth estimation is one of the most important areas of study in three-dimensional (3D) metrology. It occupies a crucial position in a variety of industries, as it has been actively used for applications such as self-driving cars, defect inspection, and 3D recognition. Time-of-flight and structured-light illumination are regarded as representative depth-estimation technologies, in which a specific light pattern is emitted onto a target and the reflected signal is collected by a detector. These technologies provide depth maps with high resolution and accuracy, but they usually require expensive and complicated optical instruments.
On the other hand, depth estimation from defocused images has the great advantage that it requires only one or two images from a single conventional camera, and such a system has a small form factor compared to the technologies above. This method originates from the relation between the level of defocus (LOD) and the depth profile: When an object is placed outside the depth of field, the captured image is defocused, and the LOD increases in proportion to the distance from the focal plane. Based on this fact, many studies of depth recovery have concentrated on obtaining the LOD from defocused images [1–3]. Evaluation of the LOD of a scene therefore plays a significant role in various applications, such as image deblurring, image segmentation, and depth estimation.
In general, methods for depth recovery from defocused images are classified into relative depth estimation (RDE) and absolute depth estimation (ADE). In RDE studies, the LOD is evaluated by computing a standard deviation inferred from edge detection. The standard deviation is related to the size of the point-spread function (PSF) and can be obtained from the high-frequency content using a derivative operator [4, 5]. Pentland [6] proposed the framework of RDE to obtain a focal-disparity map by exploiting the difference in high-frequency content between two defocused images. But this approach has an inconvenience: It needs several captures of the same scene with different camera parameters. Bae and Durand [7] recovered the depth map from a single defocused image by employing the methods of Elder and Zucker [1] and of Levin et al.
Most ADE methods are based on deconvolution to recover depth from a defocused image. The features of the PSF are important for estimating depth precisely, and it is necessary to specially design an optical stop with a coded pattern; Levin et al., for example, designed a coded aperture whose PSFs at different depths are easy to discriminate.
In this paper, we propose a new ADE method that recovers the depth of a scene using an asymmetric aperture and a sharpness-assessment algorithm. With the asymmetric aperture, we estimate the depth of the scene over a broad range from 60 to 350 cm. The proposed algorithm recovers depth by calculating the sharpness of the deconvolved images. In our experiments, we present the estimated depths and their differences from the ground truth. We also show the absolute depth map textured with an all-focus image for several objects placed at different distances. Thus we demonstrate that our algorithm provides a feasible solution for recovering the absolute depth of a scene from a single defocused image.
This paper is organized as follows. In Section 2, the camera with the asymmetric aperture is modeled. In Section 3, the entire procedure of the sharpness-assessment algorithm is described. In Section 4, we show experimental results that demonstrate the proposed algorithm, and in Section 5 conclusions are given.
When we take a picture, the scene away from the focal plane is defocused, and this blurring becomes dominant as the scene gets farther from the focal plane. Figure 1 shows a simple camera model representing a circle of confusion (CoC) and its radius when the aperture half-width is 7 mm. The radius is related to the position of a point source relative to the focal plane: As the distance from the focal plane to the point source increases, the size of the CoC also increases. In this thin-lens model, the radius of the blur circle is given by

$$ r = A\, v_f \left| \frac{1}{v_f} - \frac{1}{f} + \frac{1}{s} \right|, $$

where $A$ is the aperture radius, $f$ is the focal length of the lens, $s$ is the distance from the lens to the point source, and $v_f$ is the lens-to-sensor distance, so that the focal plane lies at the distance $s_f$ satisfying $1/s_f = 1/f - 1/v_f$.
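To make the relation concrete, the following sketch evaluates this blur-circle radius over the 60–350 cm range of Fig. 1(b). The 7-mm aperture radius and the 160-cm focal plane are taken from the text; the 50-mm focal length is an assumption borrowed from the lens used later in the experiments.

```python
# Circle-of-confusion (CoC) radius versus object distance for a thin lens.
import numpy as np

def coc_radius(s, s_f=1.6, f=0.05, A=0.007):
    """Blur-circle radius (m) for a point source at distance s (m).

    s_f : focal-plane distance, f : focal length, A : aperture radius.
    """
    v_f = 1.0 / (1.0 / f - 1.0 / s_f)   # lens-to-sensor distance
    v = 1.0 / (1.0 / f - 1.0 / s)       # image distance of the point source
    return A * abs(v - v_f) / v         # similar triangles through the lens

distances = np.arange(0.6, 3.51, 0.1)   # 60 cm to 350 cm
for s in distances:
    print(f"{s:4.1f} m -> CoC radius {coc_radius(s) * 1e6:7.1f} um")
```

Running this reproduces the behavior described below: the radius changes rapidly for objects closer than the focal plane and only slowly for objects farther away.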
Figure 1(b) shows the radius of the CoC for distances from 60 to 350 cm, with the focal plane of the camera at 160 cm. The radius of the CoC changes rapidly when the object is closer than the focal plane, so depth is better distinguished there. On the other hand, when the object is farther than the focal plane, the variation of the CoC is very small, making it difficult to determine the depth of the object. We use these facts to determine the appropriate depth range.
In this study, the concept of ADE comes from the fact that the PSF differs according to the capture distance. The CoC captured using a circular aperture has a simple symmetric shape, so it is impossible to distinguish whether a point source of light is located closer or farther than the focal plane. To solve this problem, we use a camera with an asymmetric aperture shaped like the numeral 7, as shown in Fig. 2(a). With this asymmetric aperture, we obtain the PSF by photographing a point light source. To move the point light source easily, it is displayed on a panel, and the PSFs are obtained by shifting the display panel from 60 to 350 cm in 10-cm increments. Figure 2(b) shows several examples of captured PSFs when the focal plane is set to 160 cm.
The image of a simple planar scene is computed as the convolution of the scene with the PSF at the corresponding depth, plus sensor noise:

$$ i = s \otimes h_d + n, $$

where $\otimes$ is the convolution operation, $s$ is the focused scene, $h_d$ is the PSF at depth $d$, and $n$ is the sensor noise.
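A minimal sketch of this forward model, assuming a focused scene `scene` and a measured PSF `psf` given as 2-D float arrays, with additive Gaussian sensor noise:

```python
# Simulate the defocus model i = s (*) h_d + n.
import numpy as np
from scipy.signal import fftconvolve

def defocus(scene, psf, sigma=0.001, rng=None):
    rng = np.random.default_rng(0) if rng is None else rng
    psf = psf / psf.sum()                          # energy-preserving kernel
    blurred = fftconvolve(scene, psf, mode="same") # s convolved with h_d
    return blurred + rng.normal(0.0, sigma, scene.shape)
```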
In this section, we present our new ADE method based on the sharpness-assessment algorithm. The process is shown in Fig. 3. First, the defocused image is deconvolved using the PSF set. Second, its high-frequency contents are obtained using edge-detection operators. Then, the defocused image is segmented with respect to the objects in the scene. To estimate the depths of the objects, the regions are separated using a set of masks $\{M_j\}$, one for each segmented object region. Finally, the sharpness of each deconvolved candidate is scored within each mask, and the depth giving the maximum score is selected.
For deconvolution of a defocused image, we use the Richardson-Lucy method, a well-known nonlinear iterative deconvolution algorithm [13, 14]. It is useful for retrieving a focused image when the PSF of the depth layer is known. In our experiments, the number of iterations is set to 20. The deconvolved image using the PSF at depth $d$ is obtained by the iteration

$$ \hat{s}^{\,(t+1)} = \hat{s}^{\,(t)} \cdot \left( \frac{i}{\hat{s}^{\,(t)} \otimes h_d} \otimes h_d^{\ast} \right), $$

where $\hat{s}^{\,(t)}$ is the estimate of the focused image at the $t$-th iteration, $i$ is the defocused image, and $h_d^{\ast}$ is the PSF flipped about its center.
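A sketch of this step using the standard Richardson-Lucy implementation in scikit-image (recent versions expose the iteration count as `num_iter`); the function and variable names here are illustrative:

```python
# Richardson-Lucy deconvolution with the PSF of one depth layer,
# using the 20 iterations stated in the text.
from skimage.restoration import richardson_lucy

def deconvolve_at_depth(defocused, psf_d, iters=20):
    psf_d = psf_d / psf_d.sum()   # normalize the measured PSF
    return richardson_lucy(defocused, psf_d, num_iter=iters, clip=True)
```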
Figure 4(a) shows the defocused image of the scene where a resolution chart is located 290 cm from the camera. As shown in Fig. 4(b), we compute the reconstructed image by deconvolution with the PSF at the corresponding depth.
There are many gradient operators for edge detection, such as Sobel, Prewitt, and Roberts, but with these methods it is difficult to choose proper parameters for the various features of an image. Therefore, we use four derivative edge-detection operators: the first and second derivatives along the $x$ and $y$ axes.
For expanding the width of the edge, the window summations with respect to $\partial_x i$, $\partial_y i$, $\partial_x^2 i$, and $\partial_y^2 i$ are computed by summing the magnitude of each derivative map over a local window centered at each pixel:

$$ W_k(x, y) = \sum_{(u,v) \in \Omega(x,y)} \left| D_k(u, v) \right|, $$

where $D_k$ ($k = 1, \dots, 4$) denotes one of the four derivative maps and $\Omega(x,y)$ is the window centered at $(x, y)$.
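The following sketch computes the four derivative maps and their window summations; the 3-tap kernels and the 9 × 9 window are illustrative choices, since the kernel and window sizes are not stated above:

```python
# Four derivative edge maps and their window summations.
import numpy as np
from scipy.ndimage import convolve, uniform_filter

D1 = np.array([[-1.0, 0.0, 1.0]])   # 1st derivative along x
D2 = np.array([[1.0, -2.0, 1.0]])   # 2nd derivative along x

def window_summations(img, win=9):
    maps = [convolve(img, D1), convolve(img, D1.T),   # 1st derivatives (x, y)
            convolve(img, D2), convolve(img, D2.T)]   # 2nd derivatives (x, y)
    # mean over the window times win**2 equals the windowed sum of |derivative|
    return [uniform_filter(np.abs(m), size=win) * win**2 for m in maps]
```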
Figure 5 shows the window summations of the 1st and 2nd derivatives along the $x$ and $y$ directions for the deconvolved image.
The purpose of our depth-estimation algorithm is to find the sharpest image among the deconvolved candidates. In general, a sharp edge has a greater energy density than a defocused edge under the same conditions, but unfortunately some defocused edges have more energy than sharp edges. This comes from ringing artifacts, which are caused by deconvolution with an improper convolution kernel of a different size. These ringing artifacts increase the energy of the defocused image, obstructing accurate depth estimation. Therefore, we use total-sum scaling normalization to reduce the effect of these artifacts. The normalized window summations are defined by

$$ \widehat{W}_k(x, y) = \frac{W_k(x, y)}{\sum_{(x,y)} W_k(x, y)}. $$
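A one-line sketch of this normalization; the small epsilon guarding against an all-zero map is our addition:

```python
# Total-sum scaling: divide each window summation by its overall sum,
# so ringing-inflated maps no longer dominate the comparison across depths.
def total_sum_scale(W, eps=1e-12):
    return W / (W.sum() + eps)
```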
In the high-frequency parts of the edge image there is a lot of noise, which interferes with accurately assessing the sharpness. Therefore, we remove the noise contained in the high-frequency content of each image. We identify the features of the noise from the histogram: Usually the noise in edge images is located in the region of small magnitude, and there are many such pixels, as shown in Fig. 6(a). Figure 6(b) shows the cumulative distribution function (CDF), which is useful for determining a threshold point. We divide the CDF graph into 10 sections along the magnitude axis and choose the threshold from the section in which the CDF saturates, since the densely populated low-magnitude pixels below this point are dominated by noise.
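A sketch of this denoising step. The exact rule for choosing the threshold from the 10 CDF sections is not given above, so the saturation rule below is an assumption:

```python
# Histogram/CDF-based noise threshold for a normalized window summation.
import numpy as np

def denoise_by_cdf(W, n_sections=10, saturation=0.95):
    hist, edges = np.histogram(W, bins=256)
    cdf = np.cumsum(hist) / hist.sum()
    # examine the CDF at 10 evenly spaced magnitude sections
    section_edges = np.linspace(edges[0], edges[-1], n_sections + 1)[1:]
    idx = np.searchsorted(edges[1:], section_edges)
    # assumed rule: first section where the CDF has effectively saturated
    k = np.argmax(cdf[np.minimum(idx, 255)] >= saturation)
    thresh = section_edges[k]
    out = W.copy()
    out[out < thresh] = 0.0   # suppress noise-dominated pixels
    return out
```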
Figure 7 shows the denoised normalized window summations for the 1st and 2nd derivatives along the $x$ and $y$ directions.
Commonly, the energy of a sharp edge is greater than that of a defocused edge. Based on this fact, the scores for sharpness are defined by

$$ S_k(d) = \sum_{(x,y) \in M_j} \widetilde{W}_k^{(d)}(x, y), $$

where $\widetilde{W}_k^{(d)}$ is the denoised normalized window summation of the $k$-th derivative for the image deconvolved with the PSF at depth $d$, and $M_j$ is the mask of the region under evaluation.
The scores for sharpness are undesirably affected by the size of the PSF, so we need to compensate for this effect. The compensation factor $\lambda_d$ is determined from the size of the PSF at depth $d$, and each score is multiplied by $\lambda_d$ before scores at different depths are compared.
The sharpness of the deconvolved image is obtained by summing over all the scores. The total score for sharpness is defined by

$$ S(d) = \lambda_d \sum_{k=1}^{4} S_k(d). $$
We estimate the absolute depth of the object as the depth of the PSF for which the total score for sharpness has its maximum value within the depth range. The absolute depth for the $j$-th region is obtained as

$$ d_j^{\ast} = \underset{d}{\arg\max}\; S(d), $$

with $S(d)$ evaluated over the mask $M_j$.
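Putting the pipeline together, the following sketch (reusing the functions from the sketches above) scores every candidate depth for one masked region and returns the argmax. The compensation factor `lam` is a placeholder for $\lambda_d$, since only its dependence on PSF size is stated:

```python
# End-to-end depth selection for one segmented region.
# psf_set: dict mapping depth (cm) -> 2-D PSF array; mask: boolean array.
import numpy as np

def estimate_depth(defocused, psf_set, mask, win=9):
    scores = {}
    for d, psf in psf_set.items():
        recon = deconvolve_at_depth(defocused, psf)            # Richardson-Lucy
        maps = window_summations(recon, win)                   # 4 edge maps
        maps = [denoise_by_cdf(total_sum_scale(W)) for W in maps]
        lam = 1.0 / np.count_nonzero(psf > psf.max() * 0.01)   # placeholder size term
        scores[d] = lam * sum(W[mask].sum() for W in maps)     # total score S(d)
    return max(scores, key=scores.get)                         # argmax over depth
```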
Figure 8 plots the normalized total score for sharpness with respect to depth, where each score is normalized by the highest score. The red circle at the peak represents the score at the target position. As the graph shows, the sharpness score increases as the depth of the PSF approaches the actual depth. The highest value occurs at 290 cm; therefore, the depth of the object is estimated as 290 cm, which matches the ground truth.
Experimentally, we demonstrate the proposed algorithm with two examples. First, the depth of the target is estimated as the target moves from 80 to 350 cm in 30-cm increments. Since the PSFs are captured in 10-cm increments, the depth resolution is 10 cm. The focus of the camera is fixed at 160 cm in this study. We use a Canon EOS 650D DSLR (Canon, Tokyo, Japan) with a Nikon AF-S 50-mm lens.
As shown in Fig. 9(a), images of the resolution chart are taken by moving the target from 80 to 350 cm in 30-cm increments. The depth of the target is estimated using the sharpness-assessment algorithm. Figure 9(b) shows the normalized scores for different target depths, and Fig. 9(c) shows the difference between the estimated depth and the ground truth from 80 to 350 cm. When the target is positioned at 170 cm, the depth difference is −20 cm, meaning that the estimated depth is 150 cm. The depth difference is explained by the features of the PSF: As mentioned, the radius of the PSF is very small around the focal plane, and the change in the PSF is comparatively small behind it. Therefore, the accuracy of the depth estimation near and behind the focal plane is relatively low.
In Fig. 9(b), the values of the two peaks are slightly different: The evaluated total score at 200 cm is only about 0.2% higher than that at 150 cm. Since our algorithm estimates the absolute depth of an object as the position with the highest total score for sharpness, the position of 200 cm is chosen. However, this double-peak ambiguity is a serious problem that can lead to erroneous estimates. We think that it results from the "dead zone," the region near the focal plane [8]. Near the focal plane it is relatively difficult to estimate the absolute depth precisely, due to the small variation in the PSF, so the accuracy of depth estimation there is relatively low. In addition, the small PSF brings about ambiguity in the depth estimation and increases the chance for an additional peak to appear around 160 cm.
The second experiment is conducted on a defocused image containing a painting, a cup, a photo frame, and a postbox positioned at different depths, as shown in Fig. 10(a). They are positioned at 100, 160, 250, and 320 cm respectively, and the focal plane of the camera is again set to 160 cm. Figure 10(b) shows the segmented image obtained using the region-merging algorithm proposed by Nock and Nielsen [16]. The regions of interest are numbered sequentially according to their depths, and the proposed depth-estimation algorithm is applied to each segmented region. Figure 10(c) shows the combination of all four normalized window summations obtained from the images deconvolved at the corresponding real object depths. In this image each object is outlined by bright borders, which represent the segmented areas. When the boundary of an object overlaps that of another object at a different depth, the segmentation process may distort the edge sharpness of the original image; this is regarded as one of the factors that obstruct accurate depth estimation. For that reason, only the inner features of a segmented area are used for depth estimation.
Figure 11(a) shows the normalized total scores for sharpness for the four regions. As shown in Fig. 11(b), the estimated depths are identical to the actual depths except for Region 3, where the depth difference is −10 cm. Figure 11(c) shows the depth map textured with the all-focus image. The all-focus image is reconstructed by deconvolution using the PSF at the estimated depth for each region; the remaining regions, containing the table and the wall, are set to 350 cm. Therefore, the proposed algorithm provides a feasible solution for recovering the absolute depth of a scene from a single defocused image.
In this paper, we have presented a new method to estimate the depth of a scene by using an asymmetric aperture, based on the sharpness-assessment algorithm. The asymmetric aperture is used to distinguish whether the target is located closer or farther than the focal plane. The sharpness-assessment algorithm consists of total-sum scaling normalization, denoising based on the CDF, and scoring of the sharpness of the deconvolved image. In our experiments we used an asymmetric aperture shaped like a "7". With the proposed method, the depth of the scene was estimated over the wide range from 60 to 350 cm, and the depth difference was within 20 cm even near and behind the focal plane. Therefore, we have demonstrated that our algorithm provides a feasible solution for recovering the absolute depth of a scene from a single defocused image. Although optimizing the features of the asymmetric aperture is very important for enhancing depth-estimation performance, here we have focused on demonstrating the application of an asymmetric aperture and verifying the feasibility of our sharpness-assessment algorithm. In the future, we plan to replace the conventional camera lens with a multi-aperture lens to reduce the depth difference near and behind the focal plane, and to optimize the coded aperture pattern of each aperture to enhance depth-discrimination resolution and accuracy.
This research was supported by 'The Cross-Ministry Giga Korea Project' grant funded by the Korea government (MSIT) (No. 1711116979, Development of Telecommunications Terminal with Digital Holographic Table-top Display).