• Starts: 10:00 am on Tuesday, June 7, 2022
  • Ends: 2:00 pm on Tuesday, June 7, 2022

Title: Modeling and Optimization of Emerging On-Chip Cooling Technology via Machine Learning

Presenter: Zihao Yuan

Advisor: Professor Ayse K. Coskun, ECE

Chair: Professor Richard C. Brower, ECE

Committee: Professor Sherief Reda, Department of Engineering at Brown University; Professor Ajay Joshi, ECE; Professor Rabia Yazicigil, ECE

Abstract: Over the last few decades, processor performance has continued to grow due to the down-scaling of transistor dimensions. This performance boost has translated into high power densities and localized hot spots, which decrease the lifetime of processors and increase transistor delays and leakage power. Conventional on-chip cooling solutions are often insufficient to mitigate such high-power-density hot spots efficiently. Emerging cooling technologies such as liquid cooling via microchannels, thermoelectric coolers (TECs), two-phase vapor chambers (VCs), and hybrid cooling options (e.g., liquid cooling via microchannels combined with TECs) have the potential to provide better cooling performance than conventional cooling solutions. However, the cooling performance and cooling power of these potential solutions vary significantly with their design and operational parameters (such as liquid flow velocity, evaporator design, and TEC current) and with the chip specifications. In addition, the cooling models of such emerging cooling technologies may require additional CFD simulations (e.g., for two-phase cooling), which are time-consuming and have large memory requirements. Given the vast solution space of possible cooling solutions (including possible hybrids) and cooling subsystem parameters, the search for an optimal solution is also prohibitively time-consuming.

To minimize the cooling power overhead while satisfying chip thermal constraints, there is a need for an optimization flow that enables rapid and accurate thermal simulation and selection of the best cooling solution and the associated cooling parameters for a given chip design and workload profile.

This thesis claims that combining compact thermal modeling methodology with machine learning (ML) models has the potential to rapidly and accurately carry out thermal simulations and predict the optimal cooling solution and its cooling parameters for arbitrary chip designs. The thesis aims to realize this optimization flow on three fronts.

First, it proposes a parallel compact thermal simulator, PACT, that enables fast and accurate thermal analysis of processors from the standard-cell level to the architectural level. PACT is highly extensible and broadly applicable, and can model and evaluate the thermal behavior of emerging integration technologies (e.g., monolithic 3D) and cooling technologies (e.g., two-phase vapor chambers).
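To give a sense of what compact thermal modeling means, the sketch below is a deliberately minimal illustration (not PACT itself, and not its actual formulation): the chip is discretized into nodes coupled by thermal conductances, and steady-state temperatures follow from solving a linear system. All numbers here are made up for illustration.

```python
import numpy as np

def solve_steady_state(power, g=2.0, g_amb=0.5, t_amb=45.0):
    """Toy 1D compact thermal model: a chain of nodes, each coupled to its
    neighbors by conductance g [W/K] and to ambient by g_amb [W/K].
    Steady state satisfies G @ T = P + g_amb * t_amb (energy balance
    per node), so temperatures come from one linear solve."""
    n = len(power)
    G = np.zeros((n, n))
    for i in range(n):
        G[i, i] = g_amb                      # path to ambient
        if i > 0:
            G[i, i] += g
            G[i, i - 1] = -g                 # coupling to left neighbor
        if i < n - 1:
            G[i, i] += g
            G[i, i + 1] = -g                 # coupling to right neighbor
    rhs = np.asarray(power, dtype=float) + g_amb * t_amb
    return np.linalg.solve(G, rhs)

# A localized hot spot (node 2 dissipates 8 W vs. 1 W elsewhere)
# produces the highest temperature at that node.
temps = solve_steady_state([1.0, 1.0, 8.0, 1.0, 1.0])
print(temps)
```

Real simulators operate on far finer 3D grids and support transient analysis, but the core computation, assembling a conductance matrix and solving it, is the same idea, which is also what makes parallelization attractive.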

Second, it proposes an ML-based temperature-dependent simulation framework that enables fast and accurate thermal simulations of two-phase cooling methods. This framework can also be applied to other emerging cooling technologies.
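The underlying surrogate-modeling idea can be sketched as follows. This is a hedged illustration, not the thesis framework: a handful of expensive simulator outputs (the sample values below are invented) are fit with a cheap regression model, which then replaces the simulator inside a fast inner loop.

```python
import numpy as np

# Hypothetical training data: a few precomputed "CFD" samples mapping
# heat flux [W/cm^2] to evaporator temperature [C]. Values are made up.
flux = np.array([50.0, 100.0, 150.0, 200.0, 250.0])
temp = np.array([58.0, 64.5, 72.0, 81.0, 91.5])

# Fit a quadratic surrogate to the samples (least squares).
coeffs = np.polyfit(flux, temp, deg=2)

def predict_temp(q):
    """Surrogate evaluation: microseconds per query instead of a full
    CFD run, at the cost of some approximation error."""
    return np.polyval(coeffs, q)

print(predict_temp(125.0))   # interpolated estimate between samples
```

In practice such surrogates take multiple inputs (flow velocity, evaporator geometry, power map) and use richer model classes, but the trade they make is the same: a few expensive simulations up front buy near-instant evaluation afterward.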

Third, this thesis proposes a systematic way to create novel deep learning (DL) models that predict the optimal cooling methods and cooling parameters for a given chip design at design time. Through experiments based on real-world high-power-density chips and their floorplans, this thesis aims to demonstrate that ML models can substantially reduce the simulation time of emerging cooling technologies and shorten the optimization time of emerging cooling solutions while achieving the same optimization accuracy as brute-force methods.

Location:
PHO 339