ECE PhD Prospectus Defense: Mihailo Isakov

Starts:
12:00 pm on Friday, January 17, 2020
Ends:
2:00 pm on Friday, January 17, 2020
Contact Name:
Christine Ritzowski

Abstract:

Over the next decade, processor design will encounter a number of challenges. For instance, with the breakdown of Dennard scaling, processor designers can no longer rely on integrated circuit (IC) energy efficiency to scale well with transistor density. The net impact is increasing power and heat dissipation. In turn, because heat cannot be dissipated quickly enough, designers have to forgo performance to fit within a given power budget, by underclocking or power-gating their cores. Recently, there has been a push toward accelerator-based architectures for better performance-to-power ratios. A more tightly coupled integration of accelerators has been proposed as a design methodology to improve the energy efficiency and performance of compute-intensive tasks such as graphics, machine learning, and cryptography. However, by their very nature, accelerators tend to perform poorly on common tasks. Therefore, each new acceleration domain requires its own hardware module and leads to an increase in chip area. To complicate matters further, as more compute-intensive tasks are handled in hardware with accelerators, the general-purpose portions of programs become more dominant and provide less instruction, data, or task parallelism to exploit. As a consequence, general-purpose superscalar processors experience further degradation in net performance.

This dichotomy between general-purpose processors and accelerators has long served as a key motivating factor for reconfigurable hardware. In fact, reconfigurable architectures have been proposed as a processor class for achieving higher energy efficiency by adapting the microarchitecture states or organizations to compute tasks, for example, by dynamically turning off cores or modules, changing clock frequency, or rearranging the cache architecture. However, reconfiguration is only rarely triggered, with tens or hundreds of millions of processor cycles between reconfigurations. Generally, reconfiguration decisions are made in software. As such, the overhead of in-software decision making often outweighs the potential benefits of the hardware adaptation. Even though higher-frequency reconfiguration may increase performance-to-power efficiency, it remains relatively unexplored due to the perceived added complexity and hardware overhead associated with in-hardware decision making.

In current designs, the on-chip transistor budget is large enough to set aside a fraction to implement intelligent logical blocks that dynamically manage the operating parameters and reconfiguration decisions for better performance and power efficiency. Therefore, I am proposing a design methodology for a new class of self-aware adaptive architectures that can monitor their own microarchitectural states, make complex decisions based on them, and reconfigure themselves to mirror their compute tasks. The core of this design methodology is an always-active, always-learning "hardware nervous system", pervasive throughout the chip, that can reason about module performance and energy usage in hundreds, not hundreds of millions, of cycles. The thesis aims to make the following contributions: (1) demonstrate that extremely low-latency reconfiguration is a viable avenue for improving energy efficiency, (2) establish a "hardware nervous system": a lightweight, extremely low-latency, online-trained reinforcement learning architecture for making reconfiguration decisions in hardware, (3) develop a design-agnostic methodology for imbuing hardware with self-awareness and reconfigurability that treats reconfiguration as a first-class citizen, and (4) evaluate the proposed methodology on a range.