Caching for Datacenters and Data Lakes

D3N (Datacenter-scale Data Delivery Network)

Overview

The Datacenter-scale Data Delivery Network (D3N) project aims to improve application performance and reduce demand on storage systems and data center networks. Inspired by content delivery networks (CDNs), D3N’s architecture caches data on the access side of storage and network bottlenecks for throughput-bound storage workloads.

Figure: Example D3N architecture. Cache servers (orange) serve requests from nearby clients (blue) and cooperate in pools; data blocks are distributed across the servers in a pool via consistent hashing. Lookup servers (black) direct clients to L1 servers, and direct cache servers that are forwarding requests to next-layer servers.
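The block placement described in the caption can be illustrated with a small consistent-hash ring. This is a minimal sketch, not the D3N implementation: the class name HashRing, the server names, the block identifier format, the use of SHA-256, and the number of virtual nodes are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer (SHA-256 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, servers, vnodes=64):
        self._points = []   # sorted positions on the ring
        self._owner = {}    # ring position -> server name
        self._vnodes = vnodes
        for server in servers:
            self.add(server)

    def add(self, server):
        # Each server is placed at several positions to even out the load.
        for i in range(self._vnodes):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        # Drop only this server's positions; all other positions stay put.
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, block_id):
        # A block belongs to the first server position clockwise from its hash.
        point = _hash(block_id)
        index = bisect.bisect_right(self._points, point) % len(self._points)
        return self._owner[self._points[index]]


if __name__ == "__main__":
    ring = HashRing(["cache-a", "cache-b", "cache-c", "cache-d"])
    print(ring.lookup("bucket/object:0"))
```

The property that makes this attractive for a cache pool is that removing a server only remaps the blocks that server held; blocks owned by the remaining servers keep their placement, so their cached copies stay useful.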

Motivation

Data is increasingly important to the success of organizations today. Consequently, the way data is stored and accessed is as important as the data itself. Data lakes, which are low-cost object-storage repositories, are often part of an organization’s private datacenter. In large distributed organizations, these data lakes are constantly accessed by many compute clusters operated by different parts of the organization. Even with a well-designed datacenter network, cluster-to-data lake bandwidth is typically much lower than the bandwidth to storage within the compute clusters. Because of this gap, many users must manually copy repeatedly accessed datasets to their local storage. Managing data placement and replication by hand adds complexity and performance overhead.

The D3N project leverages insights drawn from CDNs by caching data on the access side of bandwidth bottlenecks (e.g., rack-to-rack and cluster-to-data lake), with CDN techniques used to direct I/O requests to the correct cache. D3N is designed to accelerate big data analytics workloads with strong locality and limited network connectivity between compute clusters and data storage.

Collaboration with Red Hat

There are many benefits to developing in a collaborative environment and working with the open source community. The students working on the D3N project have been able to leverage the expertise and resources that Red Hat offers, allowing their project to advance more rapidly.

The D3N project has been implemented as a modification to Red Hat Ceph Storage’s RADOS Gateway (RGW) in the Massachusetts Open Cloud (MOC) datacenter. RGW is a Ceph component that supports an S3/Swift-compatible object interface. The prototype on the MOC implements a two-layer version of D3N. Level one acts independently and is local to the rack server, while level two is a logically distributed cache formed by pooling the contents of all cache servers in the cluster. This maximizes the data held on the cluster side of any cluster-to-storage pool bottleneck. Evaluation of this implementation shows significant performance improvements: the cache serves data substantially faster than per-compute-node hard drives.
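The two-layer read path described above can be sketched as follows. This is an illustrative model only, not the RGW code: the class names (LRUCache, CacheServer, Cluster), the LRU eviction policy, the capacities, and the use of simple modulo hashing to pick a block's level-two home are all assumptions, and a plain dictionary stands in for the data lake.

```python
from collections import OrderedDict
import hashlib


class LRUCache:
    """Small LRU cache (the eviction policy is an assumption, not from D3N)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict least recently used


class CacheServer:
    """Hypothetical cache server holding a local L1 and its share of the L2 pool."""

    def __init__(self, name, l1_capacity, l2_capacity):
        self.name = name
        self.l1 = LRUCache(l1_capacity)
        self.l2 = LRUCache(l2_capacity)


class Cluster:
    def __init__(self, servers, backend):
        self.servers = servers
        self.backend = backend        # dict standing in for the data lake
        self.backend_reads = 0

    def _l2_home(self, block_id):
        # Simplification: modulo hashing instead of a consistent-hash ring.
        digest = hashlib.sha256(block_id.encode("utf-8")).digest()
        return self.servers[int.from_bytes(digest[:8], "big") % len(self.servers)]

    def read(self, via, block_id):
        """Read a block through server `via`; returns (data, where it was found)."""
        data = via.l1.get(block_id)
        if data is not None:
            return data, "l1"
        home = self._l2_home(block_id)
        data = home.l2.get(block_id)
        source = "l2"
        if data is None:
            data = self.backend[block_id]     # miss in both layers
            self.backend_reads += 1
            home.l2.put(block_id, data)
            source = "backend"
        via.l1.put(block_id, data)
        return data, source


if __name__ == "__main__":
    lake = {f"blk{i}": f"data{i}".encode() for i in range(10)}
    servers = [CacheServer(f"s{i}", l1_capacity=4, l2_capacity=8) for i in range(3)]
    cluster = Cluster(servers, lake)
    print(cluster.read(servers[0], "blk1")[1])   # first read goes to the backend
    print(cluster.read(servers[1], "blk1")[1])   # another server finds it in L2
```

In this model, a block fetched once from the data lake is afterwards served from within the cluster, whichever server receives the request, which is the effect the pooled second level is meant to achieve.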

For more information, please visit the D3N page at the Mass Open Cloud website. D3N is part of a group of projects focused on Big Data Enablement.