Today, almost all autonomous vehicle companies targeting level-5 autonomy use a setup involving LiDARs calibrated together with cameras working in sync to perceive the world around them. As a result, we’re seeing a huge surge in the demand for data to train the deep learning systems built around these sensors.
To this end, we built out our sensor fusion annotation toolkit. It enables labelling of complex scenes while ensuring the quality that ground truth data should have.
There are two common perception problems that any typical sensor fusion setup looks to solve:
- Object tracking & instance segmentation: Advanced Driver Assistance System applications started with camera data alone, but as we move towards L4 self-driving, models that make predictions from correlated LiDAR & camera data are becoming popular.
- Drivable area segmentation: Traditionally, everyone has relied on semantic segmentation of the camera feed to identify the pixels where a car can legally & safely drive. Recently, there have been moves towards segmenting ground points in the LiDAR data to identify the exact area around the car where it can safely drive.
Creating annotations for both of these use cases requires an initial effort to calibrate the LiDAR with the camera so that any point in 3D space can be projected onto the images produced by the camera. More often than not, this is a tedious task that takes a considerable amount of trial and error to get right.
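To make concrete what that calibration buys you, here is a minimal sketch of projecting one LiDAR point onto a camera image. It assumes a simple pinhole camera with no lens distortion; the function name `project_point` and the parameters `rotation`, `translation`, `fx`, `fy`, `cx`, `cy` are illustrative and not taken from any particular toolkit. The rotation and translation are exactly the values that calibration has to estimate.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]


def project_point(
    point_lidar: Vec3,
    rotation: Mat3,      # assumed 3x3 LiDAR-to-camera rotation (extrinsic)
    translation: Vec3,   # assumed LiDAR-to-camera translation (extrinsic)
    fx: float,           # focal lengths and principal point (intrinsics)
    fy: float,
    cx: float,
    cy: float,
) -> Optional[Tuple[float, float]]:
    """Project a 3D LiDAR point to pixel coordinates, or None if behind the camera."""
    # Move the point from the LiDAR frame into the camera frame: p_cam = R * p + t
    x, y, z = (
        sum(rotation[i][j] * point_lidar[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        # The point is behind the image plane and cannot appear in the image.
        return None
    # Pinhole projection: divide by depth, then scale and shift by the intrinsics.
    return (fx * x / z + cx, fy * y / z + cy)


# Toy example: LiDAR and camera frames coincide (identity extrinsics).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((1.0, 2.0, 4.0), identity, (0.0, 0.0, 0.0),
                      fx=100.0, fy=100.0, cx=50.0, cy=40.0)
```

Any error in the estimated rotation or translation shifts every projected point, which is why getting these values right tends to take so many iterations.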
When I first saw what the Ouster OS-1 could do, it fascinated me. Here was a single compact device that could fulfil the roles of both camera and LiDAR, with no calibration required. There is no need to go for a complex multi-camera, multi-LiDAR rig to capture a 360-degree view.
When it comes to data annotation, the Ouster LiDAR changes a world of things.
- It creates a multi-channel PNG image instead of heavy point cloud files (.pcd, .pcl, .ply etc.). Take the KITTI dataset, for example. A typical point cloud file in it is around 4MB in size. Together with a 4MB PNG from a camera, that takes the total data per frame to 8MB. With Ouster’s data, the data per frame is around 200KB. Going from 8MB to about 0.2MB per frame is a reduction of roughly 97% in the data that has to be transferred, and in the associated transfer costs.
- Since the data is perfectly correlated, annotations made on images can be carried over to the point cloud and vice versa. What this means is that we just need to label objects in our LiDAR tracking tool and we can automatically create instance segmentation masks. This has two benefits:
  - The instance masks are 100% accurate, since the cuboids enclose only the points that make up the object, and those points are in turn used to paint the instance mask.
  - There is no need to annotate the image & LiDAR data separately, which means that we can get 2x the annotations for the same cost in considerably less time.
- In a similar way, drivable area segmentation only requires annotating the images, which are much easier to annotate than point clouds. The segmented point clouds are then created automatically, leading to more accurate & cost-effective annotations.
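The label transfer described above can be sketched in a few lines. This is only an illustration under stated assumptions, not our production pipeline: it assumes each frame is available as a 2D grid in which every pixel stores the (x, y, z) point it was measured from (or `None` where there was no return), and that a cuboid is given by a centre, a size and a yaw angle. The names `point_in_cuboid`, `cuboid_to_mask` and `range_image` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]
Grid = List[List[Optional[Point]]]  # assumed layout: one 3D point (or None) per pixel


def point_in_cuboid(p: Point, center: Point, size: Point, yaw: float) -> bool:
    """Check whether a point lies inside a cuboid rotated by `yaw` about the z axis."""
    dx, dy, dz = p[0] - center[0], p[1] - center[1], p[2] - center[2]
    # Rotate the offset by -yaw so the test can be done in the cuboid's own frame.
    c, s = math.cos(yaw), math.sin(yaw)
    local_x = c * dx + s * dy
    local_y = -s * dx + c * dy
    return (abs(local_x) <= size[0] / 2
            and abs(local_y) <= size[1] / 2
            and abs(dz) <= size[2] / 2)


def cuboid_to_mask(range_image: Grid, center: Point, size: Point,
                   yaw: float, instance_id: int) -> List[List[int]]:
    """Paint `instance_id` on every pixel whose point falls inside the cuboid."""
    return [
        [instance_id if p is not None and point_in_cuboid(p, center, size, yaw) else 0
         for p in row]
        for row in range_image
    ]


# Toy 2x3 frame: three points inside the cuboid, two outside, one missing return.
frame: Grid = [
    [(5.0, 0.0, 0.0), (10.0, 0.0, 0.0), None],
    [(5.5, 0.5, -0.5), (5.0, 2.0, 0.0), (4.2, 0.0, 0.0)],
]
mask = cuboid_to_mask(frame, center=(5.0, 0.0, 0.0), size=(2.0, 2.0, 2.0),
                      yaw=0.0, instance_id=7)
```

Because pixels and points share the same grid, the reverse direction is just a lookup: a drivable-area label painted on pixel (row, col) of the image applies directly to the point stored at the same position.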
The amount of data that needs to be annotated is going to grow exponentially as autonomous vehicles go into production. In our endeavour to produce high-quality training data at such an enormous scale, advancements in technology such as Ouster’s LiDAR play an important part.
Playment tools & workflows readily support data generated from the Ouster LiDAR. We have successfully delivered high-quality annotation data at a relatively lower cost, made possible by the LiDAR’s design. We’d be happy to put our data annotation expertise to work for anyone who adopts Ouster’s LiDAR philosophy to train high-performance models.