Research Brief | Computing+AI Professor Cewu Lu: InstaBoost++: Visual Coherence Principles for Unified 2D/3D Instance Level Data Augmentation

Source: English website of the Shanghai Institute for Advanced Study

Instance-level perception tasks like object detection, instance segmentation, and 3D detection require many training samples to achieve satisfactory performance. The detailed labels these tasks need are usually expensive to obtain, so data augmentation is a natural way to tackle the problem. However, instance-level augmentation has received little attention in previous research.

Recently, Professor Cewu Lu presented an effective, efficient and unified crop-paste mechanism that augments the training set using existing instance-level annotations. The design, InstaBoost, is derived from visual coherence and exploits three inherent principles that widely exist in real-world data: (i) background coherence in the local neighboring area, (ii) appearance coherence for instance placement, and (iii) instance coherence within the same category. The methodology is unified across tasks including object detection, instance segmentation, and 3D detection.

Neighborhood coherence mainly consists of two steps: (i) instance and background preparation and (ii) a random transformation sampled from the neighborhood of the identity transformation.
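The second step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter ranges (`max_shift`, `max_scale_delta`, `max_rot_deg`) and the choice of a 2D similarity transform are assumptions made only for illustration. The idea is to draw translation, scale and rotation uniformly from a small box around the identity, so that the pasted instance stays close to its original position.

```python
import math
import random


def sample_near_identity(rng, max_shift=15.0, max_scale_delta=0.2, max_rot_deg=5.0):
    """Sample a 2x3 affine matrix uniformly from a small box around the identity.

    All ranges are hypothetical defaults, not values taken from the paper.
    """
    tx = rng.uniform(-max_shift, max_shift)          # horizontal shift in pixels
    ty = rng.uniform(-max_shift, max_shift)          # vertical shift in pixels
    scale = rng.uniform(1.0 - max_scale_delta, 1.0 + max_scale_delta)
    theta = math.radians(rng.uniform(-max_rot_deg, max_rot_deg))
    a = scale * math.cos(theta)
    b = scale * math.sin(theta)
    return [[a, -b, tx], [b, a, ty]]


def apply_about_center(matrix, point, center):
    """Apply the affine matrix to a point, rotating and scaling about the instance center."""
    (a, b, tx), (c, d, ty) = matrix
    x = point[0] - center[0]
    y = point[1] - center[1]
    return (a * x + b * y + center[0] + tx, c * x + d * y + center[1] + ty)


rng = random.Random(0)
matrix = sample_near_identity(rng)
# The instance center moves by exactly (tx, ty), so it stays within max_shift pixels.
new_center = apply_about_center(matrix, (120.0, 80.0), (120.0, 80.0))
```

Setting all three ranges to zero recovers the identity transformation, which is why the sampled transforms can be described as lying in its neighborhood.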

Visualization for instance and background preparation. From left to right is the original scene, background and cropped instance. Color is only for visualization in 3D scenes

Examples of contour areas of appearance coherence heatmap in both 2D (a) and 3D (b) scenes. The left part is the original scene and the right part is the effective contour area of a target object. (c) Highlights the pixels in the contour of the target object at its original position and at a new candidate paste position

In neighborhood coherence, the feasible transformations of the position coordinates are restricted to the neighborhood of the object’s original position. InstaBoost’s performance can be further improved with a more elaborate metric, the appearance coherence heatmap, which refines the new position where an instance is pasted.
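As a rough illustration of how candidate paste positions might be scored, the sketch below compares the background pixels in a thin ring around the instance at its original position with the pixels under the same ring shifted to a candidate position. This is a simplified stand-in, not the paper's heatmap definition: the ring width, the negative mean absolute difference used as the score, and all function names are assumptions. The `image` argument is assumed to be the prepared background, that is, the scene with the instance already removed.

```python
def contour_ring(mask, width=1):
    """Return background pixels (x, y) lying within `width` pixels of the instance mask."""
    h, w = len(mask), len(mask[0])
    ring = []
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                continue  # skip pixels that belong to the instance itself
            near_instance = any(
                mask[ny][nx]
                for ny in range(max(0, y - width), min(h, y + width + 1))
                for nx in range(max(0, x - width), min(w, x + width + 1))
            )
            if near_instance:
                ring.append((x, y))
    return ring


def coherence_score(image, ring, dx, dy):
    """Negative mean absolute difference between the ring and its copy shifted by (dx, dy).

    Higher is better; returns None if the shifted ring leaves the image.
    """
    if not ring:
        raise ValueError("empty contour ring")
    h, w = len(image), len(image[0])
    total = 0.0
    for x, y in ring:
        nx, ny = x + dx, y + dy
        if not (0 <= nx < w and 0 <= ny < h):
            return None
        total += abs(image[y][x] - image[ny][nx])
    return -total / len(ring)


def score_shifts(image, mask, max_shift=1):
    """Score every nonzero shift within max_shift and return (best_shift, all_scores)."""
    ring = contour_ring(mask)
    scores = {}
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            if dx == 0 and dy == 0:
                continue
            score = coherence_score(image, ring, dx, dy)
            if score is not None:
                scores[(dx, dy)] = score
    return max(scores, key=scores.get), scores


# Toy grayscale background whose intensity depends only on the row index.
image = [[10 * y for _ in range(7)] for y in range(5)]
mask = [[1 if (x, y) == (3, 2) else 0 for x in range(7)] for y in range(5)]
best, scores = score_shifts(image, mask)
```

In this toy scene the background varies only vertically, so horizontal shifts keep the ring's surroundings unchanged and score highest, while vertical shifts are penalized.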

Examples of appearance coherence InstaBoost. Each example consists of the original scene with an instance, appearance coherence heatmap and processed scene from left to right

The authors proposed a general scheme to combine instances and backgrounds from different scenes, in order to utilize the extensive combinations of training samples to generate abundant visually acceptable augmented scenes with high diversity.

Examples of category coherence InstaBoost. New objects from other scenes are first introduced to replace the original one according to the category coherence function. Then, the scene is further augmented with appearance coherence InstaBoost

Extensive experiments demonstrated that the proposed approaches could successfully boost the performance of diverse frameworks on various datasets, including Pascal VOC with additional mask annotation from VOCSDS and the COCO dataset, across multiple tasks and without modifying the network structure.

Remarkable improvements were obtained: 5.1 mAP for object detection and 3.2 mAP for instance segmentation on the COCO dataset, and 6.9 mAP for 3D detection on the ScanNetV2 dataset. The method can be easily integrated into different frameworks without affecting training and inference efficiency.

Evaluation on interior/boundary segmentation accuracy of Mask R-CNN trained with and without InstaBoost

3D object detection result of original VoteNet (top) versus VoteNet trained with InstaBoost (bottom). InstaBoost still guarantees finer results in 3D scenes

The authors studied unified data augmentation principles that address the shortage of training data in instance-level perception tasks. By uniformly sampling affine transformations from the neighborhood of the identity transformation, the simple but effective neighborhood coherence InstaBoost achieved substantial improvements with recent representative methods on popular 2D and 3D perception benchmarks. They further devised InstaBoost with an appearance coherence heatmap to find more suitable pasting positions, yielding further improvement. To exploit the abundant combinations of instances and backgrounds, the authors applied category coherence InstaBoost, which improved the baseline methods for object detection, instance segmentation and 3D detection. In addition, an online implementation of InstaBoost can be easily embedded into existing instance segmentation frameworks, offering a free-lunch improvement with little CPU overhead.

The work was published as ‘InstaBoost++: Visual Coherence Principles for Unified 2D/3D Instance Level Data Augmentation’ in the International Journal of Computer Vision and can be accessed at https://link.springer.com/article/10.1007/s11263-023-01807-9.

About Professor Lu

Dr Cewu Lu is a Professor at Shanghai Jiao Tong University (SJTU). Before joining SJTU, he was a research fellow at Stanford University working under Prof. Fei-Fei Li and Prof. Leonidas J. Guibas. He was also a Research Assistant Professor at Hong Kong University of Science and Technology, working with Prof. Chi Keung Tang. He received his PhD from The Chinese University of Hong Kong, supervised by Prof. Jiaya Jia.

Dr Lu is mainly engaged in computer vision and robotics research and has achieved several breakthrough results. He has released internationally leading open-source AI frameworks and datasets, such as Alphapose (a real-time human posture estimation system, 5000+ GitHub stars), HAKE (Human Behavior Engine), and GraspNet (High Performance Robot Grasping System).

About SIAS

Shanghai Institute for Advanced Study of Zhejiang University (SIAS) is a new research and development institution jointly launched by the Shanghai Municipal Government and Zhejiang University in June 2020. The platform represents an intersection of technology and economic development, serving as a market-leading trailblazer that cultivates a novel community for innovation among enterprises.

SIAS is seeking top talents working on the frontiers of computational sciences who can envision and realize a research program that will bring new solutions to areas including, but not limited to, Artificial Intelligence, Computational Biology, Computational Engineering and Fintech.