High-Performance Computing (HPC) systems have become crucial to accelerating scientific discovery and technological innovation. Because demand for these systems is high, improving application performance and efficiency can have a significant impact on scientific productivity.
Fifth-year PhD candidate Efe Şencan’s paper, “Analyzing GPU Utilization in HPC Workloads: Insights from Large-Scale Systems,” written in collaboration with CISE Director and Professor Ayşe K. Coşkun (ECE, SE) and co-authors from Lawrence Berkeley National Laboratory, investigates how inefficiencies arise in these systems.

Graphics Processing Units (GPUs) are the fundamental building blocks powering AI, scientific computing, and large-scale simulations. However, applications running on GPU-based HPC systems can exhibit imbalances that degrade performance and waste energy and computing resources. The paper, which recently won the Best Paper award at the Practice and Experience in Advanced Research Computing (PEARC) conference, proposes novel metrics that capture spatial (across GPUs) and temporal (over time) imbalances in GPU node resource usage.
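To make the spatial side of this idea concrete, here is a minimal illustrative sketch of how an imbalance score across the GPUs of one job might be computed. The function name and the coefficient-of-variation formulation are assumptions for illustration, not the metric definitions published in the paper:

```python
import statistics

def spatial_imbalance(per_gpu_utilization: list[float]) -> float:
    """Illustrative spatial imbalance score for one job.

    Takes the mean utilization (in percent) of each GPU over a job's
    runtime and returns the coefficient of variation across GPUs:
    0.0 means every GPU did the same amount of work, while larger
    values mean some GPUs sat idle while others were busy.
    (Hypothetical formulation, not the paper's published metric.)
    """
    mean_util = statistics.mean(per_gpu_utilization)
    if mean_util == 0:
        return 0.0
    return statistics.pstdev(per_gpu_utilization) / mean_util

# Example: a 4-GPU job where one GPU does most of the work
# versus a job that spreads work evenly.
print(spatial_imbalance([95.0, 20.0, 15.0, 10.0]))  # high imbalance
print(spatial_imbalance([70.0, 68.0, 71.0, 69.0]))  # well balanced
```

A score like this, computed per job from node telemetry, is one simple way to flag jobs that request many GPUs but keep only a few of them busy.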
Results indicate that although temporal balance is typically maintained across jobs, applications often do not require all of the available GPU memory on a node, pointing to opportunities for more efficient resource requests and improved memory usage in HPC applications. “Efe’s work provides valuable insights into application efficiency in supercomputers, paving the way for new methods to overcome inefficiencies,” Coşkun said. “As data center energy demand rises to unprecedented levels, improving the efficiency of applications will be essential for meeting the world’s growing compute needs.”
To better understand how GPU workloads use system resources, Şencan analyzed applications executed on Perlmutter, a flagship supercomputer at Lawrence Berkeley National Laboratory with more than 1,500 GPU nodes in addition to CPU nodes. “My research focused on how GPU node resources are actually being used in practice,” Şencan explained. “We developed new metrics to quantify how balanced the use of these resources is in user jobs. For example, some applications request dozens of GPU nodes but only keep a portion of them fully active, leaving the rest underutilized. At scale, these imbalances can have a large impact, decreasing the user’s scientific productivity and depleting their allocation.”
The study also examined temporal balance — how evenly resources are used throughout a job’s runtime. In some cases, applications showed bursts of activity separated by long idle phases, resulting in inefficient use of GPU node resources over time. “That’s why it’s important not only to consider average utilization, but also to capture imbalances over time,” Şencan explained. “By quantifying these patterns, we can identify opportunities to request resources more effectively and improve application performance.”
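A temporal view of the same telemetry might be sketched as follows. Again, the function name, the idle threshold, and the two summary statistics are illustrative assumptions rather than the paper's actual definitions:

```python
import statistics

def temporal_imbalance(util_samples: list[float],
                       idle_threshold: float = 5.0) -> tuple[float, float]:
    """Illustrative temporal profile of one GPU's utilization over a job.

    Returns (idle_fraction, burstiness): the share of samples below
    idle_threshold percent utilization, and the coefficient of
    variation of utilization over time. A job with bursts of activity
    separated by long idle phases scores high on both.
    (Hypothetical formulation, not the paper's published metric.)
    """
    idle_fraction = sum(u < idle_threshold for u in util_samples) / len(util_samples)
    mean_util = statistics.mean(util_samples)
    burstiness = statistics.pstdev(util_samples) / mean_util if mean_util else 0.0
    return idle_fraction, burstiness

# Example: bursts of activity separated by idle stretches.
samples = [90.0, 95.0, 0.0, 0.0, 0.0, 0.0, 85.0, 92.0, 0.0, 0.0, 0.0, 88.0]
print(temporal_imbalance(samples))
```

Summaries like these go beyond a single average-utilization number, which is exactly the distinction the quoted remark draws: a job can look moderately utilized on average while actually alternating between full load and long idle phases.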
PEARC is a leading HPC conference that brings together academia, national laboratories, and industry. “Efe’s award-winning paper leverages real telemetry from a production supercomputer to uncover new insights into GPU usage at a time when demand for GPUs is higher than ever,” Coşkun said. The work has already drawn strong interest, with over 800 downloads in its first month.
Efe Şencan received his B.S. in Computer Engineering from Sabancı University in Istanbul, Turkey. He is now a fifth-year PhD candidate at Boston University, co-advised by Professor Ayşe K. Coşkun (ECE, SE), Director of CISE, and Professor Brian Kulis (ECE, SE). His research focuses on applying machine learning to detect and diagnose performance problems in large-scale supercomputers. Şencan has also gained industry and national lab experience as a Software Engineer Intern at Meta and a Research Intern at the National Energy Research Scientific Computing Center (NERSC).