Travis Dick

Machine Learning: Social Values, Data Efficiency, and Beyond Prediction
Degree Type: Ph.D. in Computer Science
Advisor(s): Nina Balcan
Graduated: May 2019

Abstract:

In this thesis, we build on the theory and practice of Machine Learning to accommodate several modern requirements of learning systems. In particular, we focus on new requirements stemming from three distinct sources: making the best use of available data, applying learning tools to problems beyond standard prediction, and incorporating social values. Each of these themes provides an exciting research direction for the practice and theory of Machine Learning.

Learning from Mostly Unlabeled Data in Multi-class Settings. Large-scale multi-class learning tasks with an abundance of unlabeled data are ubiquitous in modern Machine Learning. For example, an in-home assistive robot needs to learn to recognize common household objects, familiar faces, facial expressions, gestures, and so on in order to be useful. Such a robot can acquire large amounts of unlabeled training data simply by observing its surroundings, but it would be prohibitively time-consuming to ask its owner to annotate any large portion of this data. More generally, in many modern learning problems we often have easy and cheap access to large quantities of unlabeled training data, but obtaining high-quality labeled examples is relatively expensive. The first chapter of my thesis focuses on theory for large-scale multi-class learning with limited labeled data. We begin by assuming that a given supervised learning algorithm would succeed at the learning task if it had access to labeled data. Then we use the implicit assumptions made by that algorithm to show that different label-efficient algorithms will also succeed.
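To make the label-efficiency theme concrete, here is a minimal sketch in Python of one common pattern in this space: cluster the unlabeled pool, query only a few labels per cluster, and propagate them by majority vote. This is an illustrative toy, not the algorithm analyzed in the thesis; the label oracle, cluster count, and query budget are all hypothetical.

    # Sketch: label-efficient multi-class learning via clustering.
    # Illustrative only; the thesis's algorithms and assumptions differ.
    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_and_query(X, label_oracle, n_clusters=10, queries_per_cluster=3, seed=0):
        """Cluster unlabeled points, query a few labels per cluster,
        and assign each cluster its majority queried label."""
        rng = np.random.default_rng(seed)
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
        y_hat = np.empty(len(X), dtype=int)
        for c in range(n_clusters):
            members = np.flatnonzero(km.labels_ == c)
            queried = rng.choice(members, size=min(queries_per_cluster, len(members)),
                                 replace=False)
            votes = [label_oracle(i) for i in queried]    # expensive human annotation
            y_hat[members] = np.bincount(votes).argmax()  # propagate majority label
        return y_hat

If the supervised learner's implicit assumption is, say, that classes are well-clustered, then a budget of only n_clusters * queries_per_cluster label queries can recover most labels, which is the style of guarantee the chapter formalizes.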

Machine Learning Beyond Standard Prediction Problems. While most machine learning focuses on learning to make predictions, there are important learning problems where the output of the learner is not a prediction rule. We focus on data-driven algorithm configuration, where the goal is to find the best algorithm parameters for a specific application domain. An exciting recent line of work has considered data-driven algorithm configuration in the statistical setting, where each application domain is modeled as a distribution over problem instances. In this work, we extend the theoretical foundations of this field to accommodate two new learning settings: the online setting, where problems are chosen by an adversary and arrive one at a time, and the private setting, where each problem instance contains sensitive information that should not be released. Algorithm configuration problems often reduce to maximizing a collection of piecewise Lipschitz functions. Unfortunately, the discontinuities of these functions lead to worst-case impossibility results in both the online and private settings. We introduce a novel condition called dispersion that, when satisfied, allows for meaningful regret bounds and utility guarantees in these settings. Next, we show that dispersion is satisfied for many data-driven algorithm configuration problems under very mild assumptions. Finally, we uncover additional structure in data-driven algorithm configuration problems that enables semi-bandit feedback, leading to computationally efficient configuration procedures with strong regret bounds.
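As a rough illustration of the key condition (stated here informally; the thesis's formal definition and constants may differ), dispersion limits how tightly the discontinuities of the utility functions can concentrate in the parameter space:

    % Informal (w, k)-dispersion for piecewise Lipschitz utilities
    % u_1, ..., u_T : C -> [0, 1] over a parameter space C:
    \text{for every ball } B(\rho, w) \subseteq C:\quad
    \big|\{\, t \in [T] : u_t \text{ is discontinuous somewhere on } B(\rho, w) \,\}\big| \;\le\; k

Under such a condition, a forecaster that samples a parameter \rho with probability proportional to \exp(\lambda \sum_{s<t} u_s(\rho)) can achieve sublinear regret despite the discontinuities: within a ball of radius w around the optimal parameter, the L-Lipschitz pieces lose at most T L w in total utility and the at most k discontinuous functions lose at most 1 each.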

Social Values for Machine Learning Systems. Machine learning systems are becoming central to the infrastructure of our society. The widespread use of such systems creates exciting possibilities for profoundly positive impacts on our lives through improvements to, for example, medicine, communication, and transportation. Since these systems are often not explicitly designed with social values in mind, there is a risk that their adoption could result in undesired outcomes such as privacy violations or unfair treatment of individuals. My thesis develops principled techniques for incorporating two social values into machine learning algorithms: privacy and fairness. We import the fairness notion of envy-freeness from fair division into machine learning and consider whether it is possible to learn envy-free classifiers. We also develop general tools for differentially private optimization of piecewise Lipschitz functions, a problem that arises naturally in several learning settings, including applications beyond standard prediction problems.
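For the privacy tools, one standard building block is the exponential mechanism of McSherry and Talwar, sketched below over a discretized parameter grid; dispersion is what makes guarantees of this form meaningful for piecewise Lipschitz objectives. The grid, utility, and sensitivity here are illustrative assumptions, not the thesis's exact construction.

    # Sketch: exponential mechanism for differentially private
    # maximization of a utility over a discretized parameter grid.
    # Illustrative only; see the thesis for the dispersion-based analysis.
    import numpy as np

    def exponential_mechanism(grid, utility, epsilon, sensitivity, seed=0):
        """Sample rho from grid with probability proportional to
        exp(epsilon * utility(rho) / (2 * sensitivity))."""
        rng = np.random.default_rng(seed)
        scores = np.array([utility(rho) for rho in grid], dtype=float)
        logits = epsilon * scores / (2.0 * sensitivity)
        logits -= logits.max()            # numerical stabilization
        probs = np.exp(logits)
        probs /= probs.sum()
        return grid[rng.choice(len(grid), p=probs)]

Sampling high-utility parameters exponentially more often yields epsilon-differential privacy while, under dispersion, still landing near the maximizer of the piecewise Lipschitz objective with high probability.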

Thesis Committee:
Maria-Florina Balcan (Chair)
Tom Mitchell
Ariel Procaccia
Yishay Mansour (Tel Aviv University)

Srinivasan Seshan, Head, Computer Science Department
Tom M. Mitchell, Interim Dean, School of Computer Science

Keywords:
Data-efficient Machine Learning, Data-driven Algorithm Configuration, Online Piecewise Lipschitz Optimization, Differential Privacy, Fairness in Machine Learning, Envy-freeness

CMU-CS-19-113.pdf (1.54 MB) (174 pages)