Enhancing vision sensor capabilities with a 3D image stitching algorithm
SOURCE: EMBEDDED.COM
AUG 14, 2024
Rajesh Mahapatra, Anil Sripadarao, and Swastik Mahapatra
The rising popularity of time of flight (TOF) cameras in industrial applications, particularly in robotics, is attributed to their exceptional depth computing and infrared (IR) imaging capabilities. Despite these advantages, the inherent complexity of the optical system often constrains the field of view, limiting standalone functionality. This article discusses a 3D image stitching algorithm designed to run on a supporting host processor, eliminating the need for cloud computation. The algorithm seamlessly combines IR and depth data from multiple TOF cameras in real time, producing a continuous, high-quality 3D image with a field of view wider than any standalone unit can provide. The stitched 3D data allows state-of-the-art deep-learning networks to be applied, which is particularly valuable in mobile robotics applications, transforming how the 3D environment is visualized and interacted with.
Time of flight (TOF) cameras stand out as exceptional range imaging systems: they determine the distance between the camera and each point in an image by measuring the round-trip time of an artificial light signal emitted by a laser or an LED. TOF cameras offer precise depth information, making them valuable tools for applications where accurate distance measurement and 3D visualization are crucial, such as robotics and industrial applications, including collision detection and human detection over a 270° field of view (FOV) for safety.
The ADTF3175 TOF sensor can achieve a calibrated 75° FOV. However, challenges arise when an application’s FOV exceeds this region, requiring multiple sensors. Integrating data from individual sensors to provide comprehensive analytics for the entire view can pose difficulties. One potential solution involves having sensors execute an algorithm on a partial FOV and transmitting the output to a host for collation. Yet this approach faces issues such as overlap zones, dead zones, and communication latencies, making it a complex problem to address effectively.
An alternate approach involves stitching the captured data from all sensors into a single image and then applying detection algorithms on the stitched image. This process can be offloaded to a separate host processor, relieving the sensor units of the computational load and leaving room for advanced analytics and other processing options. However, traditional image stitching algorithms are inherently complex and can consume a significant portion of the host processor's computational power. Furthermore, sending the data to the cloud for stitching is not possible in many applications for privacy reasons.
Analog Devices’ algorithmic solution can stitch the depth and IR images from the different sensors, using the point cloud projections of the depth data. This involves transforming the captured data using camera extrinsic positions and projecting it back into 2D space, resulting in a single continuous image.
This approach results in minimal computation, which helps achieve real-time operating speeds on the edge and ensures that the compute capacity of the host processor remains available for other advanced analytics.
ADI’s 3D TOF solution operates in four stages (see Figure 1):
Figure 1. A depth stitching algorithm.
A host machine is connected to multiple TOF sensors over a high-speed connection such as USB. It collects depth and IR frames and stores them in a queue.
The depth and IR frames received by the host from each sensor are captured at different instants in time. To avoid temporal mismatch due to the movement of objects, the inputs from all sensors need to be synchronized to the same instant. A time synchronizer module matches incoming frames from the queues based on their timestamps.
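As an illustration of this step, here is a minimal sketch of a timestamp-based synchronizer, assuming each sensor's frames arrive as (timestamp, depth, IR) tuples in a per-sensor FIFO queue; the queue layout, the pop_synchronized helper, and the 10 ms tolerance are illustrative assumptions rather than the actual implementation.

```python
from collections import deque

# One FIFO queue of (timestamp_s, depth_frame, ir_frame) tuples per sensor.
# Capture threads append frames as they arrive over USB.
queues = [deque() for _ in range(4)]

def pop_synchronized(queues, tolerance_s=0.010):
    """Return one frame per sensor whose timestamps agree within
    tolerance_s, or None if no such set is available yet."""
    if any(len(q) == 0 for q in queues):
        return None
    # Candidate set: the oldest frame waiting in each queue.
    heads = [q[0] for q in queues]
    t_min = min(t for t, _, _ in heads)
    t_max = max(t for t, _, _ in heads)
    if t_max - t_min <= tolerance_s:
        return [q.popleft() for q in queues]  # matched set
    # Otherwise drop the oldest (lagging) frame and retry on the next call.
    lagging = min(range(len(queues)), key=lambda i: heads[i][0])
    queues[lagging].popleft()
    return None
```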
The point cloud is generated on the host using the synchronized depth data from each sensor. Each point cloud is then transformed (translated and rotated) based on its camera's position in the real world (see Figure 2). These transformed point clouds are then merged to form one single continuous point cloud covering the combined FOV of the sensors (see Figure 3).
Figure 2. Camera extrinsics.
Figure 3. A merged point cloud.
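The transform-and-merge step amounts to applying each camera's rigid-body extrinsic to its point cloud and concatenating the results. Below is a minimal NumPy sketch, assuming each cloud is an (N, 3) array and each extrinsic is a 4 × 4 camera-to-world matrix; the helper names are hypothetical.

```python
import numpy as np

def transform_points(points_xyz, extrinsic_4x4):
    """Apply a rigid-body camera-to-world transform (rotation + translation)
    to an (N, 3) point cloud."""
    R = extrinsic_4x4[:3, :3]
    t = extrinsic_4x4[:3, 3]
    return points_xyz @ R.T + t

def merge_point_clouds(clouds_xyz, extrinsics):
    """Bring each sensor's point cloud into the common world frame and
    concatenate them into one cloud covering the combined FOV."""
    world_clouds = [transform_points(p, T) for p, T in zip(clouds_xyz, extrinsics)]
    return np.vstack(world_clouds)
```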
The combined point cloud covering the full FOV is projected onto a 2D canvas using a cylindrical projection algorithm, also known as front view projection (see Figure 4). In other words, the algorithm projects each point of the merged point cloud onto a pixel in a 2D plane, which results in a single continuous panoramic image covering the combined field of view of all the sensors. The output is two stitched 2D images: one for IR and one for depth.
Figure 4. A cylindrical projection algorithm.
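A minimal sketch of such a front view (cylindrical) projection is given below, assuming the merged cloud uses a y-up axis convention; the canvas size, vertical span, and function name are illustrative choices, and points that collide on a pixel are simply overwritten here (the overlap handling described later refines this).

```python
import numpy as np

def cylindrical_project(points_xyz, ir_vals, width=2048, height=512,
                        v_span=(-0.6, 0.6)):
    """Project a merged point cloud onto a 2D panoramic canvas.
    Columns come from the azimuth angle around the vertical axis,
    rows from the height projected onto a unit cylinder."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    radial = np.sqrt(x**2 + z**2)                  # horizontal distance from origin
    azimuth = np.arctan2(x, z)                     # angle in (-pi, pi]
    h = y / np.maximum(radial, 1e-6)               # height on the unit cylinder

    u = ((azimuth + np.pi) / (2 * np.pi) * (width - 1)).astype(np.int32)
    v = ((v_span[1] - h) / (v_span[1] - v_span[0]) * (height - 1)).astype(np.int32)
    valid = (v >= 0) & (v < height)

    ir_img = np.zeros((height, width), dtype=ir_vals.dtype)
    depth_img = np.zeros((height, width), dtype=np.float32)
    ir_img[v[valid], u[valid]] = ir_vals[valid]
    depth_img[v[valid], u[valid]] = radial[valid]
    return ir_img, depth_img
```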
Projecting the combined 3D point cloud onto a 2D image does not, by itself, give good quality images. The images have distortions and noise, which hurts visual quality and would also adversely affect any algorithm that runs on the projection. The three key issues (see Figure 5) and their fixes are documented in the following sections.
Figure 5. Issues with 2D projection.
The ADTF3175 reports an invalid depth value of 0 mm for points that are beyond the operational range of the sensor (8000 mm). This results in large void regions in the depth image and produces incomplete point clouds. To fix this, a depth value of 8000 mm (the largest depth supported by the camera) was assigned to all the invalid points in the depth image before generating the point cloud. This ensured that there were no gaps in the point cloud.
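A simple sketch of this fix (the 0 mm invalid marker and the 8000 mm maximum range are from the description above; the function name is illustrative):

```python
import numpy as np

MAX_RANGE_MM = 8000  # largest depth supported by the camera

def fill_invalid_depth(depth_mm):
    """Replace invalid (0 mm) readings with the maximum supported range so
    that every pixel contributes a point to the point cloud."""
    depth = depth_mm.copy()
    depth[depth == 0] = MAX_RANGE_MM
    return depth
```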
While projecting the 3D point cloud onto a 2D plane, some regions of the 2D image remain unmapped: multiple 3D points can map to the same 2D pixel, which leaves other 2D pixels with no point mapped to them. This results in the stretch pattern shown in Figure 6. To fix this, a 3 × 3 filter was used that fills each unmapped pixel with the average of the valid IR/depth values among its eight neighboring pixels. This produced a more complete output image and removed the artifacts (see Figure 6).
Figure 6. Filling unmapped pixels.
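One possible implementation of this hole-filling filter, assuming unmapped pixels are marked with 0 and using SciPy's ndimage.convolve to sum and count the valid 8-neighbors; the exact filter used by the solution may differ:

```python
import numpy as np
from scipy.ndimage import convolve

def fill_unmapped(img, invalid_value=0):
    """Fill unmapped pixels with the mean of their valid 8-neighbors."""
    valid = (img != invalid_value).astype(np.float32)
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]], dtype=np.float32)
    # Sum of valid neighbor values and count of valid neighbors per pixel.
    neighbour_sum = convolve(img.astype(np.float32) * valid, kernel, mode='constant')
    neighbour_cnt = convolve(valid, kernel, mode='constant')
    filled = img.astype(np.float32).copy()
    holes = (valid == 0) & (neighbour_cnt > 0)
    filled[holes] = neighbour_sum[holes] / neighbour_cnt[holes]
    return filled
```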
Because of the cylindrical projection, many points in the overlapping region end up at the same coordinates in the 2D projected output. This creates noise, as background pixels overwrite foreground ones. To fix this, the radial distance of each incoming point is compared with that of the point already mapped to the pixel, and the pixel is overwritten only if the new point is closer to the camera origin. This retains only the foreground points and improves the projection quality (see Figure 7).
Figure 7. Overlapping noise fix.
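This fix is essentially a z-buffer test applied while writing projected points into the canvases. A simple sketch, assuming u and v are the projected pixel coordinates and radial holds each point's distance from the camera origin (as in the projection sketch above), with empty pixels holding 0:

```python
def zbuffer_write(depth_img, ir_img, u, v, radial, ir_vals):
    """Write projected points into the canvases, keeping only the point
    closest to the camera origin when several points land on one pixel."""
    for ui, vi, ri, ii in zip(u, v, radial, ir_vals):
        current = depth_img[vi, ui]
        if current == 0 or ri < current:  # empty pixel or nearer point
            depth_img[vi, ui] = ri
            ir_img[vi, ui] = ii
```

An equivalent vectorized variant sorts the points by decreasing distance and writes them in that order, so the nearest point wins the final write.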
This algorithm can stitch images from different cameras with less than 5° of overlap, compared with the minimum of 20° of overlap needed by traditional key-point matching algorithms. The approach needs very little computation, making it an ideal candidate for edge systems. The integrity of the depth data is retained post-stitching, as there is no image distortion. The solution also supports a modular arrangement of ADTF3175 sensors to obtain the desired FOV with minimal loss.
The FOV expansion is not confined to the horizontal dimension; the same technique can be used to expand the view vertically to obtain true spherical vision. The solution runs on a 6-core Arm®v8 edge CPU at 10 fps for four sensors providing a 275° FOV. The frame rate rises to 30 fps when only two sensors are used.
One of the key benefits of this approach is the computation saved: more than a 3× reduction in basic computation (see Table 1).
Table 1. Computational Complexity Comparison: Traditional Algorithms vs. Proposed Algorithm for a 512 × 512 (QMP) Input

Algorithm | Average Floating-Point Operations
Traditional image stitching | 857 million
Proposed PCL depth stitching | 260 million (3.29× reduction)
Figure 8 and Figure 9 show some results obtained using this solution.
Figure 8. Stitched IR data giving a 210° FOV.
Figure 9. Stitched IR and depth image with 278° FOV.
“Analog Devices 3DToF ADTF31xx.” GitHub, Inc.
“Analog Devices 3DToF Floor Detector.” GitHub, Inc.
“Analog Devices 3DToF Image Stitching.” GitHub, Inc.
“Analog Devices 3DToF Safety Bubble Detector.” GitHub, Inc.
“Analog Devices 3D ToF Software Suite.” GitHub, Inc.
He, Yingshen, Ge Li, Yiting Shao, Jing Wang, Yueru Chen, and Shan Liu. “A Point Cloud Compression Framework via Spherical Projection.” 2020 IEEE International Conference on Visual Communications and Image Processing, 2020.
Industrial Vision Technology. Analog Devices, Inc.
Topiwala, Anirudh. “Spherical Projection for Point Clouds.” Towards Data Science, March 2020.
Note: All images are courtesy of Analog Devices, Inc.
Rajesh Mahapatra has 30+ years of work experience and works in the Software and Security Group of Analog Devices, Bangalore. He is passionate about solving customer problems using algorithms and embedded software running on ADI hardware solutions. He works closely with NGOs to plant trees and to provide training that helps urban, economically challenged people generate a livelihood. He has five patents in the systems, image processing, and computer vision areas.
Anil Sripadarao joined Analog Devices in 2007 and works in the Software and Security Group of ADI, Bangalore. His areas of interest include audio/video codecs, AI/ML, computer vision algorithms, and robotics. He holds six patents in the image processing and computer vision domain.
Swastik Mahapatra is a senior machine learning engineer with the Software and Security Group. He joined Analog Devices in 2018 and has worked on various computer-vision technologies and robotic safety solutions. He has worked extensively on deep-learning edge-inference framework development and robotic applications development, and is proficient in convolutional neural networks. His professional interests include algorithm development for computer vision, 3D vision, machine learning, and robotics.