We test object-aware retargeting on the real robot on six representative tasks, covering a wide range of manipulation skills including picking, placing, pouring, pushing, manipulating articulated objects, and bimanual cooperation. Our method enables the humanoid to perform these tasks in scenarios with different visual backgrounds, different object instances, and different object layouts. Here we provide the rollout videos of the six tasks.
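To make the idea concrete, here is a minimal sketch of the trajectory-warping intuition behind object-aware retargeting, assuming the simplest case where demonstrated end-effector waypoints are translated by the object's displacement between the demonstration and the current scene. The function name `warp_trajectory` and the pure-translation simplification are illustrative, not OKAMI's actual implementation.

```python
# Illustrative sketch (not OKAMI's API): warp a reference end-effector
# trajectory so it is expressed relative to the object's current location.
import numpy as np

def warp_trajectory(ref_waypoints: np.ndarray,
                    ref_object_pos: np.ndarray,
                    cur_object_pos: np.ndarray) -> np.ndarray:
    """Translate demonstrated waypoints by the object's displacement.

    ref_waypoints:  (T, 3) end-effector positions from the human video.
    ref_object_pos: (3,) object position observed in the demonstration.
    cur_object_pos: (3,) object position observed at rollout time.
    """
    offset = cur_object_pos - ref_object_pos
    return ref_waypoints + offset  # broadcast the shift over all T waypoints

# Example: the object moved 10 cm along +x between demo and rollout,
# so every reach waypoint is shifted by the same amount.
ref = np.array([[0.30, 0.00, 0.20], [0.40, 0.00, 0.10]])
warped = warp_trajectory(ref,
                         np.array([0.40, 0.00, 0.00]),
                         np.array([0.50, 0.00, 0.00]))
print(warped)  # [[0.40 0.   0.20], [0.50 0.   0.10]]
```

In practice the retargeting must also respect orientations and the robot's kinematic limits, which this toy translation-only sketch omits.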
By randomly initializing the object layouts and running the object-aware retargeting pipeline each time, we can efficiently generate a large volume of successful rollout data with OKAMI, without the need for human teleoperation. The rollout data can then be used to train closed-loop visuomotor policies through behavioral cloning. We evaluate on two tasks, bagging and sprinkle-salt, where the visuomotor policies achieve success rates of 83.3% and 75%, respectively. Here we provide the rollouts of the visuomotor policies.
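The training step itself is standard behavioral cloning. The sketch below shows it under simple assumptions: the collected rollouts are reduced to (observation, action) pairs, and a small MLP stands in for the visuomotor policy. The dimensions, the random stand-in data, and the architecture are ours for illustration, not OKAMI's training setup.

```python
# Minimal behavioral-cloning sketch (illustrative, not OKAMI's code):
# regress actions from observations collected in successful rollouts.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 32, 7            # assumed observation/action sizes

policy = nn.Sequential(             # small MLP stand-in for the policy
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACT_DIM),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Stand-in for data produced by the retargeting pipeline; in practice each
# (obs, act) pair comes from a successful autonomous rollout.
obs = torch.randn(1024, OBS_DIM)
act = torch.randn(1024, ACT_DIM)

for epoch in range(10):             # supervised regression onto actions
    loss = nn.functional.mse_loss(policy(obs), act)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the supervision comes entirely from autonomously generated rollouts, the amount of training data scales with compute and robot time rather than with human teleoperation effort.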
OKAMI's policies may fail to grasp objects due to inaccuracies in the robot controllers, the human reconstruction model, or the vision models, or fail to complete tasks because of unwanted collisions, undesired upper-body rotations, or inaccuracies in solving inverse kinematics. Here we provide typical failure examples.
@inproceedings{okami2024,
  title={OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation},
  author={Jinhan Li and Yifeng Zhu and Yuqi Xie and Zhenyu Jiang and Mingyo Seo and Georgios Pavlakos and Yuke Zhu},
  booktitle={8th Annual Conference on Robot Learning (CoRL)},
  year={2024}
}