Research Brief | Toward Real-World Category-Level Articulation Pose Estimation


Articulated objects are everywhere in daily life. However, current Category-level Articulation Pose Estimation (CAPE) methods are studied under a single-instance setting, with a fixed kinematic structure assumed for each category.

Considering these limitations, Dr. Lu and his group study the problem of estimating part-level 6D pose for multiple articulated objects with unknown kinematic structures in a single RGB-D image. They reformulate the problem for real-world environments and propose the CAPE-Real (CAPER) task setting, which allows kinematic structures to vary within a semantic category and multiple instances to co-exist in a single real-world observation.

To support this task, they built an articulated model repository, ReArt-48, and presented an efficient dataset generation pipeline comprising Fast Articulated Object Modeling (FAOM) and the Semi-Authentic MixEd Reality Technique (SAMERT).

Alongside the pipeline, they built a large-scale mixed-reality dataset, ReArtMix, and a real-world dataset, ReArtVal. For the CAPER problem, they proposed an effective framework that takes RGB-D input and estimates part-level pose for multiple instances in a single forward pass. The method first applies object detection to the RGB-D input to handle the multi-instance problem, then segments each detected instance into several parts. To address unknown kinematic structures, the team proposed an Articulation Parsing Network that analyzes the structure of each detected instance, and built a Pair Articulation Pose Estimation module that estimates per-part 6D pose and joint properties from connected part pairs. Extensive experiments demonstrate that the proposed method achieves good performance on CAPER, CAPE and instance-level robot arm pose estimation problems, making it a strong baseline for future research on the CAPER task.
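To make the idea of "connected part pairs" concrete, the short Python sketch below shows one way a parsed kinematic structure could be turned into the part pairs that a pairwise pose estimator would consume. It is an illustration only and not the authors' implementation: the `Part` class, the `connected_pairs` function, the score dictionary and the 0.5 threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Part:
    """A segmented part of one detected instance (hypothetical structure)."""
    part_id: int
    label: str


def connected_pairs(parts, scores, threshold=0.5):
    """Return the part pairs whose connection score exceeds the threshold.

    `scores` maps (part_id_a, part_id_b) to an assumed connection
    confidence in [0, 1]; pairs missing from the mapping count as 0.
    """
    pairs = []
    for a, b in combinations(parts, 2):
        if scores.get((a.part_id, b.part_id), 0.0) > threshold:
            pairs.append((a, b))
    return pairs


# Hypothetical cabinet with one drawer and one door attached to a base.
parts = [Part(0, "base"), Part(1, "drawer"), Part(2, "door")]
scores = {(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.1}
pairs = connected_pairs(parts, scores)
```

Here the base is paired with the drawer and with the door, while the drawer and the door are not paired with each other. Because the pairs are derived from per-instance scores rather than a fixed template, two objects in the same category can yield different structures, which is the property the CAPER setting requires.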


The work was published in IEEE Transactions on Image Processing and can be accessed on IEEE Xplore under the title "Toward Real-World Category-Level Articulation Pose Estimation". The code will be made available in due course.


About Professor Lu

Dr. Cewu Lu is a Professor at Shanghai Jiao Tong University (SJTU). Before joining SJTU, he was a research fellow at Stanford University, working under Prof. Fei-Fei Li and Prof. Leonidas J. Guibas. He was also a Research Assistant Professor at the Hong Kong University of Science and Technology, working with Prof. Chi Keung Tang. He received his PhD from The Chinese University of Hong Kong, supervised by Prof. Jiaya Jia.


He was selected as one of the Overseas High-Level Young Introduced Talents in 2016, named one of China's Top 35 Under 35 Science and Technology Elite (MIT TR35) by MIT Technology Review in 2018, awarded the Quyi Outstanding Young Scholar Award in 2019, and honored with the Shanghai Science and Technology Progress Special Award (ranked third) in 2020. He has published more than 100 articles as corresponding or first author in Nature, Nature Machine Intelligence, TPAMI, CVPR and other high-ranking journals and conferences. He was the program chair of CVM 2018 and a division chair of CVPR 2020, ICCV 2021 and IROS 2021.


Dr. Lu works mainly on computer vision and robotics, and has achieved several breakthrough research results. He has released internationally leading open-source AI frameworks and datasets, such as AlphaPose (a real-time human pose estimation system, 5000+ GitHub stars), HAKE (Human Behavior Engine), and GraspNet (High Performance Robot Grasping System).


About SIAS

Shanghai Institute for Advanced Study of Zhejiang University (SIAS) is a new research and development institution jointly launched by the Shanghai Municipal Government and Zhejiang University in June 2020. The platform represents an intersection of technology and economic development, serving as a market-leading trailblazer to cultivate a novel community for innovation among enterprises.

SIAS is seeking top talents working on the frontiers of computational sciences who can envision and actualize a research program that will bring new solutions to areas including, but not limited to, Artificial Intelligence, Computational Biology, Computational Engineering and Fintech.