Schedule

Time	Section
9:00 AM	Opening Remarks
9:10 AM	Invited Speaker 1: Eric Schkufza A Crash Course in Stochastic Program Optimization (slides)
9:35 AM	Invited Speaker 2: Song Han Efficient deep learning computing: a learning-based approach
10:00 AM	Poster Session 1
11:00 AM	Caroline Lemieux Neural Inference of API Functions from Input–Output Examples (slides)
11:15 AM	Placeto: Efficient Progressive Device Placement Optimization
11:30 AM	Fabian Ruffy Iroko: A Framework to Prototype Reinforcement Learning for Data Center Traffic Control (slides)
11:45 AM	Invited Speaker 3: Partha Ranganathan ML for ML: Machine Learning to drive Moore's Law
12:10 PM	Lunch Break
1:45 PM	Invited Speaker 4: Neeraja J. Yadwadkar Machine Learning for resource management in Distributed Systems
2:10 PM	Learning to Optimize Tensor Programs
2:25 PM	Learning to Design Circuits
2:40 PM	ReLeQ: A Reinforcement Learning Approach for Deep Quantization of Neural Networks
2:55 PM	Poster Session 2
4:00 PM	Invited Speaker 5: Sanjay Krishnan Learning to Optimize SQL Joins With Deep Reinforcement Learning (recording)
4:25 PM	Keynote Speaker: Jeff Dean Machine Learning for Systems (recording)
4:50 PM	Panel Discussion (Jeff Dean, Partha Ranganathan, Song Han, and Ion Stoica)
5:50 PM	Closing Remarks (10 min)

Talk Abstracts

A Crash Course in Stochastic Program Optimization (slides)

Eric Schkufza

Traditional compilers use expert-written rules to prove the correctness of program transformations, and hope for the best in terms of performance. Stochastic program optimizers turn that model on its head. They use machine learning techniques to search for aggressive performance-improving transformations, and state-of-the-art verification techniques to prove correctness after the fact. The results are novel, often inscrutable, and in many cases outperform expertly tuned code. In this talk I'll present an overview of the core technique, describe current work, and discuss directions for future research.

Efficient deep learning computing: a learning-based approach

Song Han

In the post-Moore's Law era, the amount of computation per unit cost and power is no longer increasing at its historic rate. In the post-ImageNet era, researchers are solving more complicated AI problems using larger data sets, which drives the demand for more computation. This mismatch between supply and demand for computation highlights the need for co-designing efficient machine learning algorithms and domain-specific hardware architectures. Such algorithm-hardware co-design opens up a much larger design space, which requires domain experts on both sides (ML+systems), and human heuristics may be sub-optimal for exploring this vast design space. We introduce three of our recent works that use machine learning to optimize machine learning systems: learning the optimal pruning strategy (AMC) and quantization strategy (HAQ) on the target hardware, rather than relying on rule-based strategies; learning the optimal neural network architecture that is specialized for a target hardware architecture, optimizing both accuracy and latency (ProxylessNAS), rather than using a generic neural network architecture across all hardware architectures; and learning to optimize analog circuit parameters, rather than relying on experienced analog engineers to tune those transistors. On the other side of the loop (designing hardware-friendly machine learning algorithms), I'll introduce the temporal shift module (TSM), which offers 8x lower latency and 12x higher throughput than 3D convolution-based methods while ranking first on both the Something-Something V1 and V2 leaderboards. I'll conclude the talk with an outlook on design automation for efficient machine learning systems.

ML for ML: Machine Learning to drive Moore's Law

Partha Ranganathan

Computer architecture is facing an important and exciting challenge. The slowing of Moore's law, at a time when demand continues to grow, has led to new approaches to thinking about future system design, including accelerators and software-defined hardware. In this talk we will discuss how machine learning has the potential to amplify these opportunities. We will discuss some specific case studies and end with some key insights specific to applying machine learning to improve computer architecture.

Machine Learning for resource management in Distributed Systems

Neeraja Yadwadkar

Traditional resource management techniques that rely on simple heuristics often fail to achieve predictable performance in contemporary complex systems that span physical servers, virtual servers, and private and/or public clouds. My research aims to bring the benefits of Machine Learning (ML) models to optimize and manage such complex systems by deriving actionable insights from the performance and utilization data these systems generate. To realize this vision of model-based resource management, we need to deal with the following key challenges that data-driven ML models raise: uncertainty in predictions, cost of training, generalizability from benchmark datasets to real-world systems datasets, and interpretability of the models.

In this talk, I will present our ML formulations to demonstrate how to handle these challenges for two main problem domains in distributed systems: (I) Scheduling in parallel data-intensive computational frameworks for improved tail latencies, and (II) Performance-aware resource allocation in public cloud environments for meeting user-specified performance and cost goals. Along the way, I will also share a list of guidelines for leveraging ML for solving problems in systems, based on my experience.

Learning to Optimize SQL Joins With Deep Reinforcement Learning (recording)

Sanjay Krishnan

To integrate information from more than two tables, a SQL query optimizer must identify the most efficient nesting of two-way table join operations to answer the query. Recent advances in AI may provide an unexpected new perspective on this classical problem that has been studied for over 40 years. Join optimization can be posed as a Markov Decision Process where the state is a graph that represents the join conditions in a query and actions are edge contractions on this graph, thereby allowing us to apply ideas from deep reinforcement learning and imitation learning to facilitate an improved query optimizer that learns from experience, handles uncertainty, and incorporates execution feedback. I describe how our group built a full-featured query optimizer based on this MDP architecture, and we present results across a variety of database designs and query workloads in PostgreSQL and Apache Spark. I conclude by highlighting some of the under-appreciated RL research challenges in exploration, parametrization, and policy evaluation unearthed by this application.
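
The MDP formulation in this abstract can be made concrete with a small sketch. What follows is an illustration under stated assumptions, not the speaker's implementation: the table sizes in `CARD`, the selectivities in `SEL`, the cost model in `rows`, and all function names are hypothetical, and a greedy policy stands in for the learned deep RL policy. It also assumes the join graph is connected. The state is the set of graph nodes (each node is the set of base tables already joined), an action contracts one edge, and the reward is the negative estimated size of the resulting intermediate result.

```python
from itertools import combinations

# Hypothetical statistics, chosen only for illustration.
CARD = {"A": 1000, "B": 100, "C": 10}                    # rows per base table
SEL = {frozenset({"A", "B"}): 0.01,                      # selectivity of each
       frozenset({"B", "C"}): 0.1}                       # join condition


def actions(state):
    """Edges of the current join graph: node pairs linked by a join condition."""
    nodes = sorted(state, key=sorted)                    # deterministic order
    return [(u, v) for u, v in combinations(nodes, 2)
            if any(frozenset({a, b}) in SEL for a in u for b in v)]


def rows(node):
    """Toy cost model: estimated cardinality of joining all tables in `node`."""
    size = 1.0
    for table in node:
        size *= CARD[table]
    for a, b in combinations(sorted(node), 2):
        size *= SEL.get(frozenset({a, b}), 1.0)
    return size


def step(state, action):
    """Contract edge (u, v): merge both nodes; reward is minus the output size."""
    u, v = action
    merged = u | v
    return (state - {u, v}) | {merged}, -rows(merged)


def greedy_rollout(tables):
    """Run one episode with a greedy stand-in for the learned policy."""
    state = frozenset(frozenset({t}) for t in tables)
    plan, total = [], 0.0
    while len(state) > 1:
        candidates = actions(state)
        if not candidates:
            raise ValueError("join graph is disconnected")
        best = max(candidates, key=lambda a: step(state, a)[1])
        state, reward = step(state, best)
        plan.append(best)
        total += reward
    return plan, total


if __name__ == "__main__":
    plan, total = greedy_rollout(["A", "B", "C"])
    for u, v in plan:
        print("join", sorted(u), "with", sorted(v))
    print("total reward:", total)
```

With these toy numbers the greedy rollout contracts the B-C edge first, because its estimated output is smaller than that of A-B. A learned policy would replace the `max` over immediate rewards with a model that scores actions by expected long-term cost.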