Project Summary

Title

Mimicking 3D Motion with Human-Object Interaction from 2D

Abstract

Motion mimicking, which learns a control policy that reproduces both the overall trajectory and the manipulation of specific body parts, has a wide range of applications in physics-based environments such as games and VR/AR content. However, most mimicking studies rely on high-quality 3D demonstrations, which are expensive to collect. In this paper, we mimic human-object interaction (HOI) using 2D demonstrations (i.e., videos) instead of 3D demonstrations. Given videos, we first synthesize 3D demonstrations of human motion with an off-the-shelf human pose and shape estimator and learn an initial policy over the human state. We then learn a residual value function and residual actions conditioned on the human-object state to mimic the full HOI. Because no 3D demonstrations exist for the object, we propose a new reward based on 2D projections. To evaluate our two-stage mimicking, we conduct experiments with baselines on the BallPlay dataset.
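
The abstract does not define the 2D-projection reward, so the following is only a minimal sketch of one plausible formulation: the simulated 3D object position is projected through a pinhole camera and scored against a detected 2D object keypoint. The function names, the camera parameters K, R, t, and the scale factor k are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def project_to_image(x_3d, K, R, t):
    """Project a 3D point (world frame) into pixel coordinates
    using a standard pinhole camera model (assumed here)."""
    x_cam = R @ x_3d + t        # world frame -> camera frame
    u, v, z = K @ x_cam         # camera frame -> homogeneous pixels
    return np.array([u / z, v / z])

def projection_reward(obj_pos_3d, obj_kp_2d, K, R, t, k=1e-3):
    """Exponentiated negative squared pixel error between the
    projected simulated object position and its 2D detection.
    The exact reward shape and the scale k are assumptions."""
    err = project_to_image(obj_pos_3d, K, R, t) - obj_kp_2d
    return float(np.exp(-k * err @ err))
```

A reward of this shape stays bounded in (0, 1] and peaks when the projection of the simulated object coincides with the 2D detection, which is why exponentiated tracking errors are a common choice in physics-based mimicking objectives.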