Newly-developed lensless camera uses neural network and transformer to produce sharper images faster

Digital cameras typically require lenses to focus incoming light on an image sensor. While technology has continually improved, allowing for more compact camera systems, they are nonetheless limited by physics. A lens can only be so small, and the distance between the lens and a sensor so short. This is where ‘lensless’ cameras come in. Unburdened by the physical limitations of optical design, lensless cameras can be much smaller. Professor Masahiro Yamaguchi of the Tokyo Institute of Technology, a co-author of a research paper about a new approach to lensless camera design, said, ‘Without the limitations of a lens, the lensless camera could be ultra-miniature, which could allow new applications that are beyond our imagination.’

The idea for a lensless camera itself isn’t new. We’ve seen it before, including a single-pixel lensless camera in 2013 and, more recently, a much smaller lensless camera in 2017. A lensless camera, which comprises an image sensor and a thin mask in front of the sensor that encodes information from a given scene, requires mathematical reconstruction to produce a detailed image. While a traditional camera with an optical lens uses the glass inside its lens to achieve focus and immediately produce a sharp image, a lensless camera instead encodes light and must then reconstruct a blurry, out-of-focus image into something useful.

As its name suggests, a lensless camera omits a traditional optical lens altogether. Instead, it includes only a sensor and a mask. There’s no way for the camera to focus light on the image sensor, so a detailed image must be reconstructed using an encoded pattern and information about how light interacts with the mask and image sensor. Previous approaches have reconstructed an image using an algorithm derived from a physical model. The new method developed by researchers at the Tokyo Institute of Technology instead relies upon a novel deep learning system, producing better results without depending on an accurate physical approximation.

Credit: Xiuxi Pan / Tokyo Institute of Technology

A group of researchers at Tokyo Tech, including Professor Yamaguchi, has created a new reconstruction technique that promises improved image quality and significantly faster processing, two issues that have held back some other lensless cameras.

Earlier lensless cameras, like the one developed by Bell Labs in 2013 and Caltech’s camera in 2017, relied on controlling the light that hits the image sensor and on sophisticated measurements of how light interacts with the specific physical mask and image sensor in order to reconstruct an image. Without a way to focus light, a lensless camera captures a blurry image, which must be reconstructed into a sharper image using an algorithm. By understanding how the light interacts with a thin mask in front of the image sensor, an algorithm can decode the light information and reconstruct a focused scene. However, the decoding process is extremely challenging and resource-intensive. Beyond requiring time, generating good image quality requires a perfect physical model. If an algorithm is based on an inaccurate approximation of how light interacts with the mask and sensor, the camera system will falter.
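To make the model-based idea concrete, here is a minimal one-dimensional sketch in Python. It is not the method used by any of the cameras mentioned above. It assumes a toy forward model in which the sensor records a circular convolution of the scene with a binary mask pattern, and it uses Tikhonov-regularized least squares as a stand-in for a model-based decoder. The names `circulant_forward_matrix`, `tikhonov_reconstruct`, `psf` and the parameter `lam` are all illustrative assumptions. The second reconstruction deliberately uses a slightly wrong mask pattern to show how sensitive this kind of decoding can be to an imperfect model.

```python
import numpy as np

def circulant_forward_matrix(psf):
    """Toy 1D forward model: column k is the mask pattern circularly shifted by k."""
    n = len(psf)
    return np.stack([np.roll(psf, k) for k in range(n)], axis=1)

def tikhonov_reconstruct(A, y, lam=1e-6):
    """Closed-form solution of min ||A x - y||^2 + lam * ||x||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

rng = np.random.default_rng(0)
n = 64
psf = rng.integers(0, 2, size=n).astype(float)  # hypothetical binary mask pattern

# A simple scene: one broad bar and one bright point.
scene = np.zeros(n)
scene[20:28] = 1.0
scene[40] = 2.0

# What the sensor records under the true (assumed-known) model.
A_true = circulant_forward_matrix(psf)
measurement = A_true @ scene

# Decoding with the accurate model.
x_accurate = tikhonov_reconstruct(A_true, measurement)

# Decoding with a slightly wrong model of the same mask.
psf_assumed = psf + 0.1 * rng.standard_normal(n)
A_assumed = circulant_forward_matrix(psf_assumed)
x_mismatched = tikhonov_reconstruct(A_assumed, measurement)

err_accurate = np.linalg.norm(x_accurate - scene)
err_mismatched = np.linalg.norm(x_mismatched - scene)
print("error with accurate model:  ", err_accurate)
print("error with mismatched model:", err_mismatched)
```

In this toy setup the decoder that knows the exact mask pattern recovers the scene closely, while the one working from a perturbed pattern does noticeably worse, which is the weakness of model-based reconstruction described above.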

Instead of using a model-based decoding approach, the Tokyo Tech team developed a reconstruction method that relies upon deep learning. Existing deep learning methods using convolutional neural networks (CNNs) aren’t efficient enough to solve the problem. As outlined by Phys.org, the issue is that a “CNN processes the image based on the relationships of neighboring ‘local’ pixels, whereas lensless optics transform local information in the scene into overlapping ‘global’ information on all the pixels of the image sensor, through a property called ‘multiplexing.’”
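The multiplexing property can be illustrated with a small Python sketch. It assumes, purely for illustration, that the mask behaves like a random binary pattern and that the sensor image is a circular convolution of the scene with that pattern; neither assumption comes from the paper. A single bright point in the scene ends up on hundreds of sensor pixels, far more than the nine pixels a single 3×3 convolution kernel can see at once.

```python
import numpy as np

rng = np.random.default_rng(1)
size = 32

# Hypothetical random binary mask pattern (roughly half the cells open).
mask_psf = (rng.random((size, size)) < 0.5).astype(float)

# A scene containing one bright point.
scene = np.zeros((size, size))
scene[10, 12] = 1.0

# Toy sensor image: circular convolution of scene and mask pattern via the FFT.
sensor = np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(mask_psf)))

# Count how many sensor pixels received light from that single point.
lit_pixels = int(np.sum(sensor > 0.5))
conv_receptive_field = 3 * 3  # pixels seen by one 3x3 convolution kernel

print("sensor pixels lit by one scene point:", lit_pixels)
print("pixels seen by one 3x3 kernel:       ", conv_receptive_field)
```

Because the information about one scene point is scattered across the whole sensor, a network that only compares neighboring pixels has to stack many layers before it can gather that information back together.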

Here we can see the new lensless camera. It includes an image sensor and a mask that is 2.5mm from the sensor. The mask is built using chromium deposition in a synthetic-silica plate. It has an aperture size of 40×40 μm.

Credit: Xiuxi Pan / Tokyo Institute of Technology

The new research relies upon a novel machine learning algorithm. It’s based upon a technique called Vision Transformer (ViT), and it promises improved global reasoning. As Phys.org writes, “The novelty of the algorithm lies in the structure of the multistage transformer blocks with overlapped ‘patchify’ modules. This allows it to efficiently learn image features in a hierarchical representation. Consequently, the proposed method can well address the multiplexing property and avoid the limitations of conventional CNN-based deep learning, allowing better image reconstruction.”
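As a rough illustration of what an overlapped ‘patchify’ step might look like, the Python sketch below cuts an image into patches whose stride is smaller than the patch size, so neighboring patches share pixels. This is only a sketch of the general idea, not the researchers’ implementation: the function name `overlapped_patchify` and the patch size and stride values are assumptions, since the article does not give them.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def overlapped_patchify(image, patch=4, stride=2):
    """Split a 2D image into flattened patches; stride < patch makes them overlap."""
    # All patch-sized windows, then keep every `stride`-th one along each axis.
    windows = sliding_window_view(image, (patch, patch))[::stride, ::stride]
    # One row per patch ("token"), each holding patch * patch pixel values.
    return windows.reshape(-1, patch * patch)

image = np.arange(64, dtype=float).reshape(8, 8)

# Overlapping patches: 3 positions per axis, so 9 tokens of 16 values each.
tokens = overlapped_patchify(image, patch=4, stride=2)

# Non-overlapping patches for comparison: 2 positions per axis, so 4 tokens.
plain_tokens = overlapped_patchify(image, patch=4, stride=4)

# Pixels shared by the first two overlapping patches.
shared = np.intersect1d(tokens[0], tokens[1]).size

print(tokens.shape, plain_tokens.shape, shared)
```

In a transformer, each of these flattened patches would become one token, and the attention layers would then relate every token to every other token, which is where the global reasoning comes from.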

Vision Transformer (ViT) is a leading-edge machine learning technique, which is better at global feature reasoning due to its novel structure of multistage transformer blocks with overlapped ‘patchify’ modules. This allows it to efficiently learn image features in a hierarchical representation, making it able to address the multiplexing property and avoid the limitations of conventional CNN-based deep learning, thereby allowing better image reconstruction.

Caption credit: Phys. Image credit: Xiuxi Pan / Tokyo Institute of Technology

The proposed method, which pairs a fully connected neural network with a transformer, promises improved results: reconstruction errors are reduced and computing times are shorter. The team believes that the method can be used for real-time capture of high-quality images, something that has eluded previous lensless cameras.

The first row shows the ground-truth scenes used to test the proposed lensless camera. In this row, the two leftmost columns are targets displayed on an LCD display, while the two rightmost columns are real objects in three-dimensional space. The second row shows the pattern captured by the lensless camera. The third row is the most informative here, as it depicts results using the proposed reconstruction technique. The fourth row shows results using a model-based approach, which has traditionally been used with lensless cameras. The fifth and final row shows results from convolutional neural networks, which, as mentioned, have limitations with global image reconstruction.

Image credit: Xiuxi Pan / Tokyo Institute of Technology.

The full research paper, ‘Image reconstruction with transformer for mask-based lensless imaging,’ is available to paid users at Optica. The paper’s authors are Xiuxi Pan, Xiao Chen, Saori Takeyama and Masahiro Yamaguchi. You can read the abstract below; the transformer it refers to is the ViT:

A mask-based lensless camera optically encodes the scene with a thin mask and reconstructs the image afterward. The improvement of image reconstruction is one of the most important subjects in lensless imaging. Conventional model-based reconstruction approaches, which leverage knowledge of the physical system, are susceptible to imperfect system modeling. Reconstruction with a pure data-driven deep neural network (DNN) avoids this limitation, thereby having potential to provide a better reconstruction quality. However, existing pure DNN reconstruction approaches for lensless imaging do not provide a better result than model-based approaches. We reveal that the multiplexing property in lensless optics makes global features essential in understanding the optically encoded pattern. Additionally, all existing DNN reconstruction approaches apply fully convolutional networks (FCNs) which are not efficient in global feature reasoning. With this analysis, for the first time to the best of our knowledge, a fully connected neural network with a transformer for image reconstruction is proposed. The proposed architecture is better in global feature reasoning, and hence enhances the reconstruction. The superiority of the proposed architecture is verified by comparing with the model-based and FCN-based approaches in an optical experiment.

This article comes from DP Review and can be read on the original site.
