ECE PhD Prospectus Defense: Prachi Shukla

11:00 am on Thursday, June 3, 2021

Title: Temperature-aware DNN Accelerator Architecture for Monolithic 3D Systems

Presenter: Prachi Shukla

Advisor: Prof. Ayse K. Coskun (ECE)

Chair: Prof. Ajay Joshi (ECE)

Committee: Prof. Martin Herbordt (ECE); Prof. Emre Salman, Stony Brook University

Abstract: With the slowing of 2D technology scaling, improving performance under energy, power, and thermal constraints is increasingly challenging. Among the several 3D integration technologies, monolithic 3D (Mono3D) has emerged as a promising option that can overcome 2D scaling bottlenecks and provide performance and power benefits over 2D circuits. Mono3D can thus benefit applications with high energy consumption. Deep neural networks (DNNs) are one such application of growing significance, as they are becoming increasingly prevalent across fields. DNNs are characterized by heavy computation and data movement, resulting in high energy consumption; they are therefore well positioned to capitalize on the energy efficiency promise of Mono3D. Due to the rapid growth of mobile applications that rely on DNN inference (such as drones or autonomous cars), there is increasing demand for mobile DNN accelerators that achieve low inference latency. However, mobile devices have tight area, power, and thermal constraints (e.g., due to the absence of heat sinks and fans). In addition, the dense integration of DNN accelerators in Mono3D causes high power density and tight inter-tier thermal coupling, further escalating thermal issues and creating hot spots across tiers. Temperature is therefore a critical design concern in Mono3D DNN accelerators for mobile systems. This thesis claims that designing mobile Mono3D DNN accelerator architectures that are aware of the temperature profile is essential to achieving higher energy efficiency than 2D systems while also satisfying thermal and power constraints.
To support this claim, this thesis proposes the following contributions: (i) design an optimization flow to select the optimal Mono3D DNN accelerator architecture for a given optimization goal under performance and temperature constraints; (ii) develop circuit- and architecture-level models to evaluate the power and performance characteristics of different Mono3D partitioning schemes; (iii) build a thermally aware, learning-based optimizer to co-design DNNs and Mono3D accelerator architectures for DNN inference at the edge; (iv) design dataflow-aware runtime and design-time thermal management policies for high energy efficiency; and (v) design a Mono3D accelerator architecture interfaced with emerging memory technologies for energy-efficient DNN acceleration.