Steve Rubin

ARGOS Image Understanding System
Degree Type: Ph.D. in Computer Science
Advisor(s): Raj Reddy
Graduated: May 1978

Abstract:

ARGOS is an image understanding system. It builds a three-dimensional model of the task domain and uses hypothesized two-dimensional views of the model to label images. It currently achieves less than 20% error by area when labeling real-world photographs of the city of Pittsburgh with a knowledge base of over fifty objects. In addition, the system can determine the angle of view around the city with approximately 40 degrees of error. The labeling technique used by ARGOS is called Locus search. Locus is a non-backtracking graph search technique in which a beam of near-miss alternatives around the best path is extended in parallel through the graph. After the graph has been searched in breadth-first order, the beam of possibilities is examined in reverse order to extract a near-optimal path.

This path defines a labeling of the image and is sub-optimal only because of the pruning heuristics used in the beam creation. This thesis formulates image understanding as a problem of search, shows how Locus search can be used to label images, describes the many sources of knowledge used in the interpretation, shows how knowledge represented as a network can be used to constrain the search, explores extensions to the use of knowledge, and presents the experimental results of ARGOS. Its main contributions are the demonstration that Locus search can be used for image understanding and the exploration of issues involved in this use.
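
The search procedure described above (a forward pass that extends a pruned beam stage by stage, followed by a reverse pass that reads off one path) can be illustrated with a minimal sketch. This is not the ARGOS implementation: the abstract does not specify the graph, the knowledge sources, or the pruning heuristics, so the layered graph, the cost functions, and the names locus_search, prune, local_cost, transition_cost, and beam_margin are all hypothetical and chosen only for illustration.

```python
def prune(hypotheses, beam_margin):
    """Keep only hypotheses whose cost is within beam_margin of the best one."""
    best = min(cost for cost, _ in hypotheses.values())
    return {label: entry for label, entry in hypotheses.items()
            if entry[0] <= best + beam_margin}


def locus_search(stages, local_cost, transition_cost, beam_margin):
    """Beam search over a layered graph, followed by a backward trace.

    Illustrative assumptions (not taken from the thesis):
      stages            list of lists of candidate labels, one list per stage
      local_cost(i, l)  cost of assigning label l at stage i
      transition_cost   cost of placing one label after another
      beam_margin       width of the beam, measured in cost units
    Returns (path, total_cost).
    """
    # beams[i] maps label -> (accumulated cost, best predecessor label)
    first = {label: (local_cost(0, label), None) for label in stages[0]}
    beams = [prune(first, beam_margin)]

    # Forward pass: extend every surviving hypothesis, never backtrack.
    for i in range(1, len(stages)):
        prev = beams[-1]
        current = {}
        for label in stages[i]:
            best_prev = min(
                prev, key=lambda p: prev[p][0] + transition_cost(p, label))
            cost = (prev[best_prev][0]
                    + transition_cost(best_prev, label)
                    + local_cost(i, label))
            current[label] = (cost, best_prev)
        beams.append(prune(current, beam_margin))

    # Backward pass: start from the cheapest final hypothesis and follow
    # the stored predecessors in reverse order.
    label = min(beams[-1], key=lambda l: beams[-1][l][0])
    total = beams[-1][label][0]
    path = [label]
    for i in range(len(stages) - 1, 0, -1):
        label = beams[i][label][1]
        path.append(label)
    path.reverse()
    return path, total
```

In this sketch, a large beam_margin keeps every hypothesis and the result is the minimum-cost path; a small margin discards alternatives early, so the extracted path can cost more than the optimum. That is the sense in which pruning makes the result only near-optimal.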

Thesis Committee:
Raj Reddy (Chair)

Joseph Traub, Chair, Computer Science Department