CFan Academy: 3D sensing beyond the pixel plane

Self-driving cars are attracting more and more attention these days, and most of the map information they rely on is captured by cameras mounted on the car. What a camera captures is a flat 2D image, which must then be converted into 3D, and that conversion can lose information, making it hard, for example, to accurately identify roadblocks around the car. Many driving systems therefore now capture and process 3D information directly, and among these techniques point cloud technology has received particular attention.

3D information capture – not as simple as you think

We all know that 3D information is three-dimensional data, matching what we see with both eyes in the real world. Clearly, if a self-driving system also captures its surroundings as 3D information, then a driving system built on the experience of manual driving can more accurately identify and avoid nearby obstacles such as pedestrians and other cars, making the system safer.

However, capturing 3D information is not a simple matter. Traditional 3D data capture uses the "stereo" (binocular vision) method: two or more cameras capture the same scene from different angles, corresponding pixels are matched across the images, and the position of each point in 3D space is computed from how far the same pixel shifts between images (Figure 1).


Figure 1 Illustration of binocular vision approach
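The geometry behind this is compact: once a point has been matched in both images, its depth follows from the pixel shift (the disparity), the camera focal length, and the distance between the two cameras (the baseline). The sketch below uses made-up numbers purely for illustration:

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point from a rectified stereo pair: Z = f * B / d.

    focal_px     -- camera focal length, in pixels
    baseline_m   -- distance between the two camera centers, in meters
    disparity_px -- horizontal pixel shift of the same point between images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (point seen in both views)")
    return focal_px * baseline_m / disparity_px

# Made-up example: 700 px focal length, 12 cm baseline.
# A point that shifts 35 px between the two images is about 2.4 m away.
print(stereo_depth(700.0, 0.12, 35.0))
```

Note how the depth grows as the disparity shrinks: distant objects barely shift between the two views, which is one reason stereo matching struggles at long range.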

However, this approach has a major drawback. A driving system operates in a fast-moving environment and needs very accurate object recognition, yet stereo matching relies on visual detail to pair up corresponding points between camera images. That computation is data-intensive and error-prone in scenes that lack texture or contain repetitive visual structure. Many manufacturers of autonomous driving systems therefore now use LiDAR (light detection and ranging) to capture 3D data. A LiDAR unit mounted on top of the car emits high-frequency laser pulses into the surrounding area. When a pulse hits an object it is reflected back, and the system measures the actual distance between the object and the car from the pulse's round-trip time. For example, if a traffic light is detected ahead, the actual distance between the vehicle and the light can be computed from the returned beam (Figure 2).


Figure 2 Illustration of LiDAR detection
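The round-trip timing described above is simple to state: the pulse travels to the object and back, so the distance is half the echo delay times the speed of light. A minimal sketch (the pulse delay is a made-up example):

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def lidar_range(round_trip_s: float) -> float:
    """Distance to the reflecting object from a pulse's round-trip time.

    The pulse covers the distance twice (out and back),
    so the one-way distance is d = c * t / 2.
    """
    return SPEED_OF_LIGHT * round_trip_s / 2.0

# A pulse that returns after 400 nanoseconds hit something roughly 60 m
# away, around the working range quoted for these sensors.
print(round(lidar_range(400e-9), 2))
```

The tiny time scales involved (hundreds of nanoseconds) are why LiDAR units need very precise clocks to resolve distances to within centimeters.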

A self-driving car is equipped with multiple 3D LiDAR sensors that rotate rapidly to see in all directions around the vehicle. By sending out millions of laser pulses per second and inferring the exact distance (out to about 60 meters) to surrounding objects from the return times, these sensors let the driving system accurately detect its surroundings (Figure 3).


Figure 3 Self-driving car equipped with multiple 3D LiDAR sensors

3D information recognition and processing – point cloud model and deep learning

With the 3D data captured by these methods, how does the driving system actually recognize 3D objects? Here it uses the "point cloud model" for recognition. From the 3D LiDAR the system obtains, for each sampled point on an object's surface, the 3D coordinates (XYZ), the laser reflection intensity, and possibly color information (RGB). Each sample becomes one data point in the spatial coordinate system; together these points form a set, and the whole set represents the 3D object, allowing the system to recognize it accurately.


Figure 4 Illustration of the point cloud model
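A point cloud is, at bottom, just a list of such samples. The sketch below shows one plausible way to hold XYZ, intensity and RGB per point and to compute a simple derived property (the cloud's bounding box); the field layout is an assumption for illustration, not any real sensor's format:

```python
from dataclasses import dataclass, field

@dataclass
class Point:
    x: float                    # 3D coordinates, in meters
    y: float
    z: float
    intensity: float            # laser reflection intensity (0..1)
    rgb: tuple = (0, 0, 0)      # optional color information

def bounding_box(cloud):
    """Axis-aligned bounding box of a point cloud: (min_xyz, max_xyz)."""
    xs = [p.x for p in cloud]
    ys = [p.y for p in cloud]
    zs = [p.z for p in cloud]
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

# Three made-up samples from one object's surface.
cloud = [Point(1.0, 2.0, 0.1, 0.8),
         Point(1.5, 2.2, 1.4, 0.6),
         Point(0.9, 2.1, 0.7, 0.7)]
print(bounding_box(cloud))  # ((0.9, 2.0, 0.1), (1.5, 2.2, 1.4))
```

Real clouds contain millions of such points per second, so production systems store them in packed arrays rather than per-point objects, but the information per point is the same.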

For example, if the laser detects a roadblock ahead, the returned information forms a 3D map of the surroundings inside the driving system. 3D LiDAR maps not only tell the system exactly where the car is in the world to help it navigate; they also let it identify and track obstacles such as cars and pedestrians. As recognition technology has developed, modern LiDAR can distinguish a cyclist from a walking person and even measure how fast each is moving and changing direction. For an autonomous driving system, this is like having many drivers with excellent vision in the cab, with an omnidirectional view of the car's front and back, up and down, left and right, enabling fast and safe autonomous driving (Figure 5).


Figure 5 Google Car 3D LiDAR mapping of the surrounding area

Of course, in a real driving environment every roadblock differs in size, location and distance, so simply capturing and recognizing 3D objects is not enough for the self-driving system to drive safely through complex surroundings. For this reason, scientists introduced deep learning: they first build a model, then train it on the different shapes the 3D LiDAR captures. Through continuous learning, the autonomous driving system comes to "recognize" the various objects it encounters. The deep neural network also lets the system learn on its own, ultimately producing a complex set of learned rules that allow the driving system to accurately identify surrounding objects in complex environments and drive safely and autonomously (Figure 6).


Figure 6 Deep learning illustration
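At its heart, the training loop is "adjust the model until its predictions match the labels." The toy below is nowhere near a real point-cloud network; it trains a single perceptron on one hand-crafted feature (the height-to-width ratio of an object's bounding box) to separate "pedestrian-like" from "car-like" objects, just to show the learn-from-labeled-examples idea. All numbers are invented:

```python
# Toy stand-in for the deep-learning step: a one-feature perceptron.
# Real systems train deep networks on raw point clouds; this only
# illustrates learning a decision rule from labeled examples.

# (feature, label): height/width ratio, 1 = pedestrian, 0 = car
data = [(2.8, 1), (2.5, 1), (3.1, 1),   # pedestrians: tall and narrow
        (0.4, 0), (0.3, 0), (0.5, 0)]   # cars: low and wide

w, b = 0.0, 0.0                  # weight and bias, learned from the data
for _ in range(20):              # a few passes over the training set
    for x, label in data:
        pred = 1 if w * x + b > 0 else 0
        err = label - pred       # -1, 0, or +1
        w += 0.1 * err * x       # nudge the boundary toward correct answers
        b += 0.1 * err

def classify(ratio):
    return "pedestrian" if w * ratio + b > 0 else "car"

print(classify(2.9), classify(0.35))  # pedestrian car
```

After training, the learned weight and bias encode a decision boundary that none of the code spelled out explicitly; that is the essence of what the far larger networks in driving systems do, across millions of parameters instead of two.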

More applications of 3D information processing

Above we mainly introduced the application of 3D information processing in autonomous driving, a field still somewhat distant from everyday life. But 3D information processing has many applications in other fields as well. In medicine, for example, equipment fitted with 3D sensing can use laser detection to give doctors a fast, accurate and comprehensive view of a lesion without inserting traditional instruments such as a gastroscope into the patient's body, sparing them the discomfort of conventional examinations.

3D information processing also has many uses in the increasingly popular field of VR games. Through similar 3D laser scanning and recognition, future VR games will be able to seamlessly integrate the surrounding physical environment into the game world, making the experience more realistic and exciting. We expect that, as the technology advances, 3D information processing will bring even more convenience to our lives!
