William T. B. Uther

Tree Based Hierarchical Reinforcement Learning
Degree Type: Ph.D. in Computer Science
Advisor(s): Manuela Veloso
Graduated: August 2002

Abstract:

In this thesis we investigate methods for speeding up automatic control algorithms. Specifically, we provide new abstraction techniques for Reinforcement Learning and Semi-Markov Decision Processes (SMDPs). We introduce the use of policies as temporally abstract actions. This differs from previous definitions of temporally abstract actions in that our policies carry no termination criteria. We provide an approach for processing previously solved problems to extract these policies. We also contribute a method for using supplied or extracted policies to guide and speed up the solving of new problems. We treat extracting policies as a supervised learning task and introduce the Lumberjack algorithm, which extracts repeated sub-structure within a decision tree. We then introduce the TTree algorithm, which combines state and temporal abstraction to speed up problem solving on new problems. TTree solves SMDPs by using both user- and machine-supplied policies as temporally abstract actions while generating its own tree-based abstract state representation. By combining state and temporal abstraction in this way, TTree is the only known SMDP algorithm that is able to ignore irrelevant or harmful subregions within a supplied abstract action while still making use of other parts of that abstract action.
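To make the idea of a termination-free temporally abstract action concrete, below is a minimal Python sketch. Everything in it (the toy GridWorld environment, the go_east_policy, the abstract_state partition, and the stopping rule of leaving the current abstract region) is an illustrative assumption, not the thesis's actual API; it shows roughly how a fixed policy can be executed as an abstract action whose duration is decided by the caller rather than by a termination criterion built into the action itself.

    # A minimal sketch: executing a fixed base-level policy as a
    # temporally abstract action with no termination criterion of its own.
    # All names and the stopping rule here are illustrative assumptions.

    class GridWorld:
        """Toy deterministic grid; states are (x, y), actions move one cell."""
        MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

        def __init__(self, width=8, height=8):
            self.width, self.height = width, height

        def step(self, state, action):
            dx, dy = self.MOVES[action]
            x = min(max(state[0] + dx, 0), self.width - 1)
            y = min(max(state[1] + dy, 0), self.height - 1)
            return (x, y), -1.0  # constant step cost

    def go_east_policy(state):
        """A base-level policy used as a temporally abstract action."""
        return "E"

    def abstract_state(state):
        """Hypothetical tree-induced state abstraction: left vs. right half."""
        return "left" if state[0] < 4 else "right"

    def run_abstract_action(env, state, policy, max_steps=50):
        """Follow `policy` while the abstract state stays the same.
        Because the policy carries no termination criterion, the caller
        supplies the stopping rule (here: leaving the current abstract
        region, or a step cap)."""
        start_region = abstract_state(state)
        total_reward, steps = 0.0, 0
        while abstract_state(state) == start_region and steps < max_steps:
            state, r = env.step(state, policy(state))
            total_reward += r
            steps += 1
        return state, total_reward, steps

    env = GridWorld()
    s, r, n = run_abstract_action(env, (0, 3), go_east_policy)
    print(f"ended in {s} ({abstract_state(s)}) after {n} steps, reward {r}")

Because the stopping rule lives with the caller rather than the action, a learner that builds its own abstract state representation, as TTree does, can cut an abstract action short in regions where it is irrelevant or harmful while still exploiting it elsewhere.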

Thesis Committee:
Manuela Veloso (Chair)
Jaime Carbonell
Andrew Moore
Thomas Dietterich (Oregon State University)

Randy Bryant, Head, Computer Science Department
James Morris, Dean, School of Computer Science

Keywords:
Reinforcement learning, regression trees, Markov Decision Processes, Semi-Markov Decision Processes, constructive induction, hierarchical reinforcement learning

CMU-CS-02-169.pdf (1.09 MB) (163 pages)