- Starts: 9:00 am on Tuesday, April 21, 2026
- Ends: 11:00 am on Tuesday, April 21, 2026
ECE PhD Thesis Defense: Amin Mosayyebzadeh
Title: A Unified Object Storage Caching Substrate for Data Centers
Presenter: Amin Mosayyebzadeh
Advisor: Professor Orran Krieger
Chair: Professor Chen Yang
Committee: Professor Orran Krieger, Professor Peter Desnoyers, Professor Renato Mancuso, Professor David Starobinski
Google Scholar Link: https://scholar.google.com/citations?user=13jt2nwAAAAJ&hl=en&authuser=1
Abstract: Modern data centers rely heavily on immutable, object-based data lakes that provide low-cost, scalable storage for a wide range of workloads with diverse access patterns. However, application performance often suffers due to mismatches between application access patterns and the design assumptions of data lakes, limited bisection bandwidth, and extensive data sharing across clusters.
Existing caching solutions at the framework, cluster, and storage levels address some of these challenges, but each suffers from fundamental limitations when workloads span multiple clusters or require elasticity, consistency, and efficient resource sharing.
This thesis introduces D5N, a data-center-wide cooperative caching system that extends an immutable object store with distributed, locality-aware caches, providing high performance while enabling efficient data sharing and improved resource utilization across the data center. The system leverages locality-aware placement, rack-level caching, and immutable object semantics to simplify consistency management, reduce coordination overhead, and improve scalability and robustness. Furthermore, D5N enables optimizations such as object transformation and small-object aggregation by intercepting all data lake accesses, bridging the gap between application requirements and backend storage capabilities.
D5N employs a distributed directory and the Globally Weighted Frequency (GWF) policy to make globally informed caching decisions, automatically adapting replication, eviction, and placement to aggregate demand. These mechanisms allow D5N to exploit unused cache capacity across the data center, reduce backend load, and maintain consistency even under read-write sharing. Evaluations show that D5N matches state-of-the-art cluster caches on local workloads while delivering up to 4× performance benefits for cross-cluster data sharing and reducing read-write sharing overheads by up to 20×. D5N also supports data-layout transformations, enabling optimizations such as small-object aggregation that reduce backend disk utilization from 50% to 10%.
Beyond the core system, this thesis presents an ecosystem of three additional works that demonstrate the versatility of D5N as a storage-level caching substrate. Bolted extends D5N to support secure, elastic environments by enabling dynamic resource allocation and strong isolation, motivating the design choice to decouple cache from compute nodes for cluster elasticity. However, decoupling alone is insufficient if storage remains tied to local resources, which motivates LSVD. Positioned between applications and the backend, LSVD exposes a virtual disk abstraction and allows D5N to aggregate small objects into larger units, improving efficiency and aligning with backend storage characteristics. Kariz complements this ecosystem by incorporating analytics framework semantics to generate DAG-aware caching and prefetching strategies, demonstrating the benefits of cross-cluster data sharing and leveraging D5N’s global directory for higher-level caching decisions. Together, these systems show that D5N provides a flexible, extensible foundation for a unified storage ecosystem.
Overall, this work demonstrates that a unified, storage-level cooperative cache is both feasible and essential for future data-center architectures. By combining locality, global coordination, and the ability to transform data before persistence, D5N significantly improves performance, enhances cross-cluster sharing, and enables new caching and storage abstractions. The thesis concludes that while D5N does not replace framework- or cluster-level caches, it provides a powerful foundation that simplifies their design and amplifies their effectiveness, paving the way for more efficient, elastic, and collaborative data-center computing.
- Location: PHO 339
