Nathan Fulton

Verifiably Safe Autonomy for Cyber-Physical Systems

Degree Type: Ph.D. in Computer Science
Advisor(s): André Platzer
Graduated: December 2018

Abstract:

This thesis demonstrates that autonomous cyber-physical systems that use machine learning for control are amenable to formal verification. Cyber-physical systems, such as autonomous vehicles and medical devices, are increasingly common and increasingly autonomous. Designing safe cyber-physical systems is difficult because of the interaction between the discrete dynamics of control software and the continuous dynamics of the vehicle's physical movement. Designing safe autonomous cyber-physical systems is even more difficult because of the interaction between classical control software and machine learning components. Formal methods capable of reasoning about these hybrid discrete-continuous dynamics can help engineers obtain strong safety guarantees about safety-critical control systems.

Several recent successes in applying formal methods to hybrid dynamical systems demonstrate that these tools provide a promising foundation for establishing safety properties of planes, trains, and cars. However, existing theory and tooling do not explain how to obtain formal safety guarantees for systems that use reinforcement learning to discover efficient control policies from data. This gap in existing knowledge is important because modern approaches to building cyber-physical systems combine machine learning with classical control engineering to navigate open environments.

This thesis introduces KeYmaera X, a theorem prover for hybrid systems, and uses KeYmaera X to obtain verified safety guarantees for control policies generated by reinforcement learning algorithms. These contributions enable strong safety guarantees for optimized control policies when the underlying environment matches a first-principles model. This thesis also introduces an approach to providing safety guarantees for learned control policies even when reality deviates from modeling assumptions.

The core technical contribution is a new class of algorithms that blend learning and reasoning to update models in response to newly observed dynamics in the environment. When models are updated online, we leverage verification results about the original but incorrect model to ensure that there is a systematic relationship between the optimization objective and the desired safety properties. When models are updated offline, formal verification results are preserved and explainable environmental models are synthesized. Together, these contributions provide verifiable safety guarantees for systems controlled by policies obtained through reinforcement learning, justifying the use of reinforcement learning in safety-critical settings.

Thesis Committee:
André Platzer (Chair)
Jeremy Avigad
Goran Frehse (ENSTA ParisTech)
Zico Kolter
Stefan Mitsch

Srinivasan Seshan, Head, Computer Science Department
Tom M. Mitchell, Interim Dean, School of Computer Science

Keywords: Cyber-Physical Systems, Hybrid Systems, Autonomous Systems, Formal Verification, Differential Dynamic Logic, Automated Theorem Proving, Reinforcement Learning

CMU-CS-18-125.pdf (852.4 KB, 152 pages)
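To make the hybrid-systems verification named in the abstract and keywords concrete, the following is a minimal sketch of the kind of differential dynamic logic (dL) safety specification a prover like KeYmaera X is built to check. The braking scenario and the symbols (position p, velocity v, braking power b, stop point S) are standard textbook material chosen for illustration, not an excerpt from the thesis.

```latex
% Illustrative dL safety specification (not from the thesis):
% a car that repeatedly applies full braking -b and then evolves
% continuously must never pass the stop point S.
v^2 \le 2b\,(S - p) \;\wedge\; b > 0
  \;\rightarrow\;
  \bigl[\bigl(a := -b;\ \{p' = v,\ v' = a \ \&\ v \ge 0\}\bigr)^{*}\bigr]\,(p \le S)
```

The precondition says the car can still stop within the remaining distance; the box modality over the starred loop asserts the postcondition p ≤ S after every run of the discrete brake assignment followed by the continuous dynamics, which is exactly the discrete-continuous interaction the abstract describes.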
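The abstract's claim that verification results can be leveraged to constrain a learned policy can likewise be sketched in code. The Python below is a hypothetical monitor-gated action selector in the spirit of blending learning and reasoning; the function names, the one-cycle stopping-distance check, and the braking fallback are all assumptions made for illustration, not the thesis's actual algorithms or APIs.

```python
# Illustrative sketch (not the thesis's code): a runtime monitor derived
# from a verified braking model gates actions proposed by a learned policy.
# State is (p, v): position and velocity; b is braking power, S the stop
# point, dt the length of one control cycle.

def monitor_ok(state, action, dt, b, S):
    """Hypothetical check: after applying `action` for one control cycle,
    can full braking (-b) still stop the system before S?"""
    p, v = state
    v_next = max(v + action * dt, 0.0)     # velocity after one cycle
    p_next = p + (v + v_next) / 2.0 * dt   # position after one cycle
    return v_next ** 2 <= 2.0 * b * (S - p_next)

def safe_action(state, policy, dt, b, S):
    """Take the learned action when the monitor sanctions it; otherwise
    fall back to the verified safe fallback (full braking)."""
    proposed = policy(state)
    if monitor_ok(state, proposed, dt, b, S):
        return proposed
    return -b  # verified fallback action

if __name__ == "__main__":
    # Stand-in for a learned policy (e.g., a neural network): always accelerate.
    policy = lambda s: 1.0
    print(safe_action((0.0, 5.0), policy, dt=0.1, b=4.0, S=10.0))
```

The design point this sketch illustrates is the one the abstract makes: the learned policy is free to optimize its objective, while the safety argument rests only on the verified model behind the monitor and the fallback.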