Learning to Ball: Composing Policies for Long-Horizon Basketball Moves
1 Stanford University, 2 University of California, Riverside, 3 Roblox, 4 Clemson University
In ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2025)
Abstract
Learning a control policy for a multi-phase, long-horizon task, such as basketball maneuvers, remains challenging for reinforcement learning approaches because it requires seamless policy composition and transitions between skills. A long-horizon task typically consists of distinct subtasks with well-defined goals, separated by transitional subtasks whose goals are unclear but which are critical to the success of the entire task. Existing methods such as mixture of experts and skill chaining struggle when individual policies share few commonly explored states or when the phases lack well-defined initial and terminal states. In this paper, we introduce a novel policy integration framework that enables the composition of drastically different motor skills in multi-phase, long-horizon tasks with ill-defined intermediate states. Building on this framework, we further introduce a high-level soft router that enables seamless and robust transitions between subtasks. We evaluate our framework on a set of fundamental basketball skills and challenging transitions. Policies trained with our approach can effectively control the simulated character to interact with the ball and accomplish long-horizon tasks specified by real-time user commands, without relying on ball trajectory references.
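To make the idea of soft routing concrete, the following is a minimal, generic sketch of blending the outputs of several sub-policies with softmax weights. It is not the framework described in the paper: the function names (`softmax`, `soft_route`), the toy `dribble` and `shoot` policies, and the hand-written `router` are all hypothetical, and a learned router and learned policies would replace them in practice.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Policy = Callable[[Sequence[float]], Vector]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_route(
    state: Sequence[float],
    policies: Sequence[Policy],
    router: Callable[[Sequence[float]], Sequence[float]],
) -> Vector:
    """Blend sub-policy actions with softmax weights from a router.

    Assumption: every sub-policy returns an action of the same dimension,
    and the router returns one logit per sub-policy.
    """
    weights = softmax(router(state))
    actions = [policy(state) for policy in policies]
    dim = len(actions[0])
    return [
        sum(w * action[i] for w, action in zip(weights, actions))
        for i in range(dim)
    ]


# Hypothetical stand-ins for two trained skill policies.
def dribble(state: Sequence[float]) -> Vector:
    return [1.0, 0.0]


def shoot(state: Sequence[float]) -> Vector:
    return [0.0, 1.0]


# Hypothetical router: the first state entry acts as a command signal
# that favors dribbling when positive and shooting when negative.
def router(state: Sequence[float]) -> Vector:
    return [state[0], -state[0]]


neutral = soft_route([0.0], [dribble, shoot], router)
dribbling = soft_route([10.0], [dribble, shoot], router)
```

Because the weights vary continuously with the state, the blended action shifts gradually from one skill to the other rather than switching abruptly, which is the general motivation for a soft rather than a hard selector.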
Bibtex
@article{basketball,
  author    = {Xu, Pei and Wu, Zhen and Wang, Ruocheng and Sarukkai, Vishnu and Fatahalian, Kayvon and Karamouzas, Ioannis and Zordan, Victor and Liu, C. Karen},
  title     = {Learning to Ball: Composing Policies for Long-Horizon Basketball Moves},
  journal   = {ACM Transactions on Graphics},
  publisher = {ACM New York, NY, USA},
  year      = {2025},
  volume    = {44},
  number    = {6},
  doi       = {10.1145/3763367}
}