Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in Dynamic Environments

Mingyo Seo    Ryan Gupta    Yifeng Zhu    Alexy Skoutnev   
Luis Sentis    Yuke Zhu   

The University of Texas at Austin   

Paper | Code

We tackle the problem of perceptive locomotion in dynamic environments. In this problem, a quadrupedal robot must exhibit robust and agile walking behaviors in response to environmental clutter and moving obstacles. We present a hierarchical learning framework, named PRELUDE, which decomposes the problem of perceptive locomotion into high-level decision-making to predict navigation commands and low-level gait generation to realize the target commands. In this framework, we train the high-level navigation controller with imitation learning on human demonstrations collected on a steerable cart and the low-level gait controller with reinforcement learning (RL). As a result, our method can acquire complex navigation behaviors from human supervision and discover versatile gaits from trial and error. We demonstrate the effectiveness of our approach in simulation and with hardware experiments.


Method Overview

Overview of PRELUDE. We introduce a control hierarchy where the high-level controller, trained with imitation learning, sets navigation commands and the low-level gait controller, trained with reinforcement learning, realizes the target commands through joint-space actuation. This combination enables us to effectively deploy the entire hierarchy on quadrupedal robots in real-world environments.




Hierarchical Perceptive Locomotion Model

The high-level navigation policy generates the target velocity command at 10 Hz from the onboard RGB-D camera observation and the robot heading. The target velocity command, which comprises linear and angular velocities, is fed to the low-level gait controller along with a buffer of recent robot states. The low-level gait policy predicts joint-space actions as desired joint positions at 38 Hz and sends them to the quadruped robot for actuation. More implementation details can be found on this page.
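The two-rate structure described above can be sketched as a single loop that ticks at the low-level rate and refreshes the velocity command whenever a high-level period has elapsed. This is only an illustrative sketch, not the released PRELUDE code: `navigation_policy` and `gait_policy` are hypothetical placeholders for the learned policies, and the history length (`HISTORY_LEN`) and joint count (`NUM_JOINTS`) are assumptions chosen for the example. Only the 10 Hz and 38 Hz rates come from the text.

```python
from collections import deque
from dataclasses import dataclass

HIGH_LEVEL_HZ = 10.0  # navigation policy rate (from the text)
LOW_LEVEL_HZ = 38.0   # gait policy rate (from the text)
HISTORY_LEN = 5       # assumption: length of the recent-state buffer
NUM_JOINTS = 12       # assumption: number of actuated joints


@dataclass
class VelocityCommand:
    linear: float = 0.0   # target linear velocity
    angular: float = 0.0  # target angular velocity


def navigation_policy(rgbd, heading):
    """Hypothetical stand-in for the learned high-level policy."""
    return VelocityCommand(linear=0.5, angular=-0.1 * heading)


def gait_policy(command, state_history):
    """Hypothetical stand-in for the learned low-level policy.

    Returns desired joint positions (all zeros in this placeholder).
    """
    return [0.0] * NUM_JOINTS


def run_hierarchy(duration_s):
    """Simulate the control hierarchy for `duration_s` seconds."""
    dt = 1.0 / LOW_LEVEL_HZ
    high_period = 1.0 / HIGH_LEVEL_HZ
    state_history = deque(maxlen=HISTORY_LEN)  # buffer of recent robot states
    command = VelocityCommand()
    joint_targets = [0.0] * NUM_JOINTS
    next_high_time = 0.0
    high_calls = 0
    low_calls = 0

    for step in range(round(duration_s * LOW_LEVEL_HZ)):
        t = step * dt
        # Refresh the velocity command once per high-level period.
        if t >= next_high_time - 1e-9:
            command = navigation_policy(rgbd=None, heading=0.0)
            next_high_time += high_period
            high_calls += 1
        # Record the latest (placeholder) robot state, then query the gait policy.
        state_history.append({"t": t})
        joint_targets = gait_policy(command, state_history)
        low_calls += 1

    return high_calls, low_calls, joint_targets
```

Because 38 Hz is not an integer multiple of 10 Hz, the sketch schedules the high-level update by elapsed time instead of by a fixed step count, so the gait policy reuses the most recent command for three or four consecutive steps.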



Real Robot Evaluation

We perform real-world trials in which the robot traverses 15 m tracks in different configurations. We compare PRELUDE with a self-baseline, PRELUDE (A1 Default Gait), a variant of our final model that uses the robot’s default model-based controller in place of the learned gait controller. PRELUDE tracks trajectories more robustly (with a 20% increase in success rate) than PRELUDE (A1 Default Gait). We observed that PRELUDE (A1 Default Gait) drifts aggressively after a high-speed turn and collides with the wall, while PRELUDE turns rapidly to bypass the walking crowd and completes the trial successfully.


Deploying in Unseen Environments

We deployed PRELUDE in unseen human-centered environments with static and dynamic obstacles. It exhibits robust locomotion behaviors using onboard visual perception.


Citation


      @inproceedings{seo2022learning,
        title={Learning to Walk by Steering: Perceptive Quadrupedal Locomotion 
          in Dynamic Environments},
        author={Seo, Mingyo and Gupta, Ryan and Zhu, Yifeng and Skoutnev, Alexy
          and Sentis, Luis and Zhu, Yuke},
        booktitle={arXiv preprint arXiv:2209.09233},
        year={2022}
      }