Douglas Reece
Selective Perception for Robot Driving
Degree Type: Ph.D. in Computer Science
Advisor(s): Steven Shafer
Graduated: May 1992

Abstract:
Robots performing complex tasks in rich environments need very good perception modules in order to understand their situation and choose the best action. Robot planning systems have typically assumed that perception was so good that it could refresh the entire world model whenever the planning system needed it, or whenever anything in the world changed. Unfortunately, this assumption is completely unrealistic in many real-world domains because perception is far too difficult. Robots in these domains cannot use the traditional planner paradigm, but instead need a new system design that integrates reasoning with perception. In this thesis I describe how reasoning can be integrated with perception, how task knowledge can be used to select perceptual targets, and how this selection dramatically reduces the computational cost of perception.

The domain addressed in this thesis is driving in traffic. I have developed a microscopic traffic simulator called PHAROS that defines the street environment for this research. PHAROS contains detailed representations of streets, markings, signs, signals, and cars. It can simulate perception and implement commands for a vehicle controlled by a separate program. I have also developed a computational model of driving called Ulysses that defines the driving task. The model describes how various traffic objects in the world determine what actions a robot must take. These tools allowed me to implement robot driving programs that request sensing actions in PHAROS, reason about right-of-way and other traffic laws, and then command acceleration and lane-changing actions to control a simulated vehicle.

In the thesis I develop three selective perception techniques and implement them in three robot driving programs of increasing sophistication.
The first, Ulysses-1, uses perceptual routines to control visual search in the scene. These task-specific routines use known objects to guide the search for others--e.g., a routine scans along the right side of the road ahead for a sign. The second program, Ulysses-2, decides which objects are the most critical in the current situation and looks for them. It ignores objects that cannot affect the robot's actions. Ulysses-2 creates an inference tree to determine the effect of uncertain input data on action choices, and searches this tree to decide which data to sense. Finally, Ulysses-3 uses domain knowledge to reason about how dynamic objects will move or change over time. Objects that do not move enough to affect the robot can be ignored by perception. The program uses the inference tree from Ulysses-2 and a time-stamped, persistent world model to decide what to look for. When run in the PHAROS world, the techniques included in Ulysses-3 reduced the computational cost of perception by 9 to 12 orders of magnitude compared with an uncontrolled, general perception system.

Keywords: Vision and scene understanding, active vision, robotics, knowledge representation, reasoning with uncertainty, driving, traffic simulation, tree search strategies, graphics applications

CMU-CS-92-139.pdf (11 MB)
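The core selection criterion behind Ulysses-2 can be illustrated with a minimal sketch: a percept is worth sensing only if different possible values of it would lead to different action choices. This is not code from the thesis; the function names, the driving scenario, and the candidate percepts are all hypothetical, chosen only to show the idea.

```python
# Illustrative sketch (not the thesis implementation): sense only objects
# whose uncertain state could change the chosen action.

def relevant_percepts(candidates, choose_action):
    """Return the percepts worth sensing: those for which different
    possible values would lead to different actions."""
    selected = []
    for name, possible_values in candidates.items():
        # Evaluate the action choice under every possible sensed value.
        actions = {choose_action(name, value) for value in possible_values}
        if len(actions) > 1:       # the action depends on this percept,
            selected.append(name)  # so perception must resolve it
    return selected

# Hypothetical situation: approaching a signalized intersection.
def choose_action(percept, value):
    if percept == "signal_state":
        return "brake" if value == "red" else "cruise"
    return "cruise"  # other objects never affect the action here

candidates = {
    "signal_state": ["red", "green"],   # could force braking
    "billboard_text": ["ad_A", "ad_B"], # irrelevant to driving
}
print(relevant_percepts(candidates, choose_action))  # → ['signal_state']
```

The savings come from everything the function leaves out: a general perception system would process the billboard too, while the selective one spends sensing effort only where the action choice is actually uncertain.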