Collaborative Research: CNS Core: Small: A New Architecture for Petabyte-scale File Transfer Evaluated in FABRIC

Sponsor: National Science Foundation (NSF)

Award Number: 2215672

PI: Abraham Matta


File transfer is a fundamental operation of the Internet. Important scientific instruments, such as the James Webb Space Telescope and the Large Hadron Collider, generate massive files daily. As files grow larger, errors are more likely to be introduced during transfer, including errors that the checksum process fails to catch. Reliability is especially important when files store data that capture a rare event, such as a gravitational wave, where data loss or corruption that goes undetected could significantly affect scientific conclusions. This collaborative project brings together investigators from Arizona State University and Boston University to develop a multi-layer error detection (MLED) architecture. MLED will not only eliminate detectable errors but will also provably reduce the probability of undetected errors.

Each layer in the MLED architecture is parameterized by a policy that includes the scope of the error detection method among other requirements. When aggregated, the layers improve error detection capabilities and significantly reduce the probability of undetected errors. This project has three research directions: (1) to develop the theory of MLED to study trade-offs among policy parameters to optimize the probability of undetected errors and file transfer delay; (2) to design a software-defined implementation of MLED exploiting features in FABRIC, a new testbed of networked computing systems; (3) to evaluate MLED to investigate and understand the design trade-offs that must be navigated to harness FABRIC, to validate the theory, and to compare to other tools across the spectrum of file transfer solutions.
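To make the idea of layered detection concrete, the following is a minimal sketch, not the project's implementation. It assumes two hypothetical layers: a CRC-32 computed over fixed-size chunks and a SHA-256 computed over the whole file. The names LayerPolicy, layer_digests, and failed_layers, and the 1024-byte scope, are illustrative assumptions only.

```python
import hashlib
import zlib
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class LayerPolicy:
    """Hypothetical per-layer policy: a name, a scope, and a digest function."""
    name: str
    scope: int  # bytes covered by each digest; 0 means the whole file
    digest: Callable[[bytes], bytes]


def crc32_digest(block: bytes) -> bytes:
    # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
    return zlib.crc32(block).to_bytes(4, "big")


def sha256_digest(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()


def layer_digests(data: bytes, policy: LayerPolicy) -> List[bytes]:
    """Compute one digest per scope-sized block of the data."""
    size = max(len(data), 1)
    step = policy.scope if policy.scope > 0 else size
    return [policy.digest(data[i:i + step]) for i in range(0, size, step)]


def failed_layers(sent: bytes, received: bytes,
                  policies: List[LayerPolicy]) -> List[str]:
    """Return the names of the layers whose digests disagree."""
    return [
        p.name
        for p in policies
        if layer_digests(sent, p) != layer_digests(received, p)
    ]


# Assumed example configuration: a narrow-scope layer and a wide-scope layer.
POLICIES = [
    LayerPolicy("chunk-crc32", 1024, crc32_digest),
    LayerPolicy("file-sha256", 0, sha256_digest),
]

if __name__ == "__main__":
    original = bytes(range(256)) * 16      # 4096 bytes of sample data
    corrupted = bytearray(original)
    corrupted[2000] ^= 0x01                # flip a single bit in transit
    print(failed_layers(original, bytes(corrupted), POLICIES))
```

In this sketch an error goes undetected only if every layer misses it. Under the simplifying assumption that layers miss errors independently, the overall miss probability would be the product of the per-layer miss probabilities; whether that assumption holds for real transfers is one of the questions a theory of MLED would need to address.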

This project impacts the science of understanding networked computing systems through the design and implementation of the MLED architecture using features of FABRIC that are unavailable in other network testbeds. Reliable file transfer is especially important when the conclusions drawn from the data critically depend on their correctness, with potential consequences for scientific knowledge. In addition, MLED advances data science by improving the understanding of undetected errors, their propagation through file transfer, and the management of massive data sets. FABRIC Across Borders offers the opportunity to use MLED for petabyte-scale file transfer across the oceans. Outreach activities will share experience using FABRIC with the testbed community and promote the adoption of MLED for file transfer. By its end, the project will have contributed to NSF and national research priorities on networked computing systems and helped educate graduate and undergraduate students, while broadening participation of underrepresented minorities.

Project website: Jupyter Notebooks will be used throughout the project to prototype, explore, and document the experimental process. To enable reproducibility, the notebooks and code repositories will be shared on the project website, along with the resulting publications. This website will be maintained for at least three years following the project end date.
