Dravyansh Sharma

Data-Driven Algorithm Design and Principled Hyperparameter Tuning in Machine Learning

Degree Type: Ph.D. in Computer Science
Advisor(s): Maria-Florina Balcan
Graduated: July 2024

Abstract

For any new machine learning technique, a large body of research often fol- lows up in order to tune the technique to work suitably for each of the numerous application areas, requiring significant scientific and engineering efforts. Moreover, this typically involves unprincipled approaches for hyperparameter selection without any guarantee on global optimality. Inspired from the recently proposed paradigm of â€˜data-driven algorithm designâ€™, this thesis develops first principled hyperparame- ter tuning techniques for several core machine learning algorithms, including semi- supervised learning, regularized linear regression, robust nearest neighbors and de- cision trees, with formal near-optimality guarantees in statistical and online learning settings.

Given multiple problem instances of a learning problem from some problem domain, we develop approaches to learn provably well-tuned parameters over the domain and answer questions related to the number of problem samples needed to learn a well-tuned learning algorithm. In addition, we also develop online learning techniques when the problem instances arrive in a potentially adversarial sequence. Our approaches apply to the following diverse scenarios:

selecting graph hyperparameters in semi-supervised learning,
setting regularization coefficients in linear regression,
controlling the robustness vs. abstention trade-off using parameterized nearest-neighbor algorithms,
splitting and pruning nodes for accurate and interpretable decision trees,
meta-learning common parameters for similar tasks, and
learning adaptively in changing environments.

In addition to providing techniques for tuning fundamental learning algorithms, we expand the applicability of data driven algorithm design to algorithm families with multiple parameters by developing better online and more computationally efficient techniques.

Thesis Committee

Maria-Florina Balcan (Chair)
Tom M. Mitchell
R. Ravi
Avrim Blum (Toyota Technological Institute at Chicago)
Tim Roughgarden (Columbia University)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Keywords

Learning Theory, Data-driven Algorithm Design, Hyperparameter Tuning

Thesis Document

CMU-CS-24-120.pdf (2.79 MB) (240 pages)

At a Glance

Academic Offerings

Admissions

Directory Submenu

People

Explore the Field

Dravyansh Sharma

Data-Driven Algorithm Design and Principled Hyperparameter Tuning in Machine Learning

Abstract

Thesis Committee

Keywords

Thesis Document

At a Glance

Academic Offerings

Admissions

Directory Submenu

People

Explore the Field

What can we help you find?

Dravyansh Sharma

Data-Driven Algorithm Design and Principled Hyperparameter Tuning in Machine Learning

Abstract

Thesis Committee

Keywords

Thesis Document