Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains

1Georgia Institute of Technology, 2École Polytechnique Fédérale de Lausanne

Abstract

Humanoid robots promise transformative capabilities for industrial and service applications. While recent advances in Reinforcement Learning (RL) have yielded impressive results in locomotion, manipulation, and navigation, existing methods typically require an enormous number of simulation samples to account for real-world variability.

This work proposes a novel one-stage training framework, Learn to Teach (L2T), which unifies teacher and student policy learning. Our approach recycles simulator samples and synchronizes the teacher's and student's learning trajectories through shared dynamics, significantly reducing sample complexity and training time while achieving state-of-the-art performance.
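To make the one-stage idea concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a one-step problem in which the teacher observes a privileged parameter `k` while the student sees only a noisy proxy of it. Each iteration draws a single batch of simulator samples and uses that same batch to update both policies. The REINFORCE teacher update, the regression-based student update, and all names (`make_sample`, `train`, `evaluate`) are illustrative assumptions standing in for the actual L2T-RL losses.

```python
import random

def make_sample(rng):
    """One toy simulator sample: state x, privileged k, and a noisy proxy of k."""
    x = rng.uniform(-1.0, 1.0)
    k = rng.uniform(-1.0, 1.0)           # privileged: visible to the teacher only
    k_noisy = k + rng.gauss(0.0, 0.1)    # what the student observes instead
    return x, k, k_noisy

def reward(action, x, k):
    """Toy reward: the best action is x + k."""
    return -(action - (x + k)) ** 2

def dot(w, phi):
    return w[0] * phi[0] + w[1] * phi[1]

def evaluate(weights, rng, use_privileged, n=500):
    """Average reward of a deterministic linear policy on fresh samples."""
    total = 0.0
    for _ in range(n):
        x, k, k_noisy = make_sample(rng)
        phi = (x, k) if use_privileged else (x, k_noisy)
        total += reward(dot(weights, phi), x, k)
    return total / n

def train(iterations=200, batch_size=64, sigma=0.3, lr=0.1, seed=0):
    rng = random.Random(seed)
    teacher = [0.0, 0.0]   # linear policy on (x, k)
    student = [0.0, 0.0]   # linear policy on (x, k_noisy)
    for _ in range(iterations):
        # 1) The teacher collects ONE batch of simulator samples.
        batch = []
        for _ in range(batch_size):
            x, k, k_noisy = make_sample(rng)
            mu = dot(teacher, (x, k))
            a = mu + sigma * rng.gauss(0.0, 1.0)   # Gaussian exploration
            batch.append((x, k, k_noisy, mu, a, reward(a, x, k)))
        baseline = sum(s[5] for s in batch) / batch_size

        # 2) Both policies are updated from that same batch (sample recycling).
        t_grad = [0.0, 0.0]
        s_grad = [0.0, 0.0]
        for x, k, k_noisy, mu, a, r in batch:
            # Teacher: REINFORCE gradient with a mean-reward baseline (assumption).
            score = (a - mu) / sigma ** 2
            adv = r - baseline
            t_grad[0] += adv * score * x
            t_grad[1] += adv * score * k
            # Student: regress onto the teacher's mean action (assumption).
            err = dot(student, (x, k_noisy)) - mu
            s_grad[0] += 2.0 * err * x
            s_grad[1] += 2.0 * err * k_noisy
        for i in range(2):
            teacher[i] += lr * t_grad[i] / batch_size   # ascent on expected reward
            student[i] -= lr * s_grad[i] / batch_size   # descent on imitation loss
    return teacher, student
```

Calling `train()` returns the two weight vectors, and `evaluate(student, random.Random(1), use_privileged=False)` scores the student using only its non-privileged observation. The point of the sketch is structural: there is no separate second stage in which a converged teacher is distilled into a student, so no additional simulator samples are spent on the student.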

Furthermore, we validate the RL variant (L2T-RL) through extensive simulations and hardware tests on the Digit robot, demonstrating zero-shot sim-to-real transfer and robust performance over 12+ challenging terrains.

Video

Outdoor Environments

Robust walking in outdoor campus environment

Concrete

Stable walking on a flat concrete surface in strong wind

Indoor Environments

Navigation through indoor corridors and spaces

Challenging Terrains

Traversing rocky terrain with uneven surfaces

Walking on loose sand surface

Stable walking on gravel pavement

Walking on natural grass surface

Dynamic Behaviors

Smooth turning behavior

Forward walking on football field

Turning on football field

External Forces

Recovery from forward push

Recovery from backward push

Recovery from pulling force

Special Conditions

Demonstrating the slippery surface

Walking on slippery surface with L2T policy

Walking on slippery surface with the manufacturer's controller from Agility Robotics

Carrying payload while walking

Walking on elevated surface

BibTeX

      
@misc{wu2025learnteachsampleefficientprivileged,
  title={Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains},
  author={Feiyang Wu and Xavier Nal and Jaehwi Jang and Wei Zhu and Zhaoyuan Gu and Anqi Wu and Ye Zhao},
  year={2025},
  eprint={2402.06783},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2402.06783},
}