ECE PhD Dissertation Defense: Sahil Nikhil Tikale

  • Starts: 10:00 am on Thursday, March 23, 2023
  • Ends: 12:00 pm on Thursday, March 23, 2023

Title: Market driven Elastic Secure Infrastructure

Presenter: Sahil Nikhil Tikale

Advisor: Professor Orran Krieger

Chair: TBA

Committee: Professor Orran Krieger, Professor David Starobinski, Professor Martin Herbordt, Professor Peter Desnoyers, Professor Larry Rudolph

Abstract: Static allocation of clusters in data centers leads to inefficient use of resources. Data centers are shared by multiple tenants, each managing clusters that support different applications (HPC, cloud, big data, etc.). The tools used to manage these clusters, such as OpenStack Ironic, MaaS, and Foreman, take full control of the hardware, which turns the clusters into silos. Consequently, each cluster is stood up with enough capacity to handle its peak demand. This leads to under-utilization during off-peak periods, and when demand exceeds capacity the clusters suffer from degraded quality of service (QoS) or may violate service level objectives (SLOs). Because the clusters are unlikely to all experience peak demand at the same time, there is an opportunity to improve the efficiency of all clusters by sharing resources between them.

We propose the Marketplace-driven Elastic Secure Infrastructure (MESI) both as an alternative to the public cloud and as an architecture for the lowest layer of the public cloud to improve its efficiency. MESI is based on the idea of enabling tenants to share hardware they own with tenants they may not trust, and between clusters with different security requirements. The architecture gives tenants control and the freedom to choose whether to deploy and manage these services themselves or to use them from a trusted third party. MESI services fit into three layers that build on each other to provide: 1) Elastic Infrastructure, 2) Elastic Secure Infrastructure, and 3) Market-driven Elastic Secure Infrastructure.

1) Hardware Isolation Layer (HIL): the bottommost layer of MESI is designed for moving nodes between the multiple tools and schedulers used for managing the clusters. HIL controls the layer 2 switches and bare-metal servers so that tenants can elastically adjust the size of their clusters in response to the changing demand of the workload. It enables movement of nodes between clusters with minimal to no modification of the tools and workflows used for managing these clusters.

2) Elastic Secure Infrastructure (ESI): builds on HIL to enable sharing of servers between clusters with different security requirements and between mutually non-trusting tenants of the data center. ESI enables the borrowing tenant to minimize its trust in the node provider and to take control of the trade-offs between cost, performance, and security. This enables sharing of nodes between tenants that are not only part of the same organization but can also be separate organizations that are tenants of a co-located data center.

3) The Bare-metal Marketplace: an incentive-based system that uses the economic principles of a marketplace to encourage tenants to share their servers with others, not just when they do not need them but also when others need them more. It gives tenants the ability to define their own cluster objectives and sharing constraints, and the freedom to decide how many nodes they wish to share with others.

MESI is evaluated using prototype implementations at each layer of the architecture. (i) The HIL prototype, implemented in only 3,000 lines of code, supports many provisioning tools and schedulers with little to no modification, adds no overhead to the performance of the clusters, and is in active production use at MOC, managing over 150 servers and 11 switches. (ii) The ESI prototype builds on the HIL prototype and adds an attestation service, a provisioning service, and deterministically built open-source firmware. Results demonstrate that it is possible to build a cluster that is secure, elastic, and fairly quick to set up, while the tenant requires only minimal trust in the provider for the availability of the node. (iii) The MESI prototype demonstrates the feasibility of a one-of-a-kind multi-provider marketplace for trading bare-metal servers in which providers are also users of the nodes. It uses agents that trade bare-metal servers in the marketplace to meet the requirements of their clusters. The evaluation of the MESI prototype shows that all the clusters benefit from participating in the marketplace. Results show that, compared with operating as silos, individual clusters see a 50% improvement in the total work done, up to a 200% improvement (reduction) in waiting queues, and a 60% improvement in the aggregate utilization of the test bed.

This dissertation makes the following contributions: (i) It defines the architecture of MESI, which allows mutually non-trusting tenants of the data center to share resources between clusters with different security requirements. (ii) It demonstrates that it is possible to design a service that breaks the silos created by static allocation of clusters yet has a small trusted computing base (TCB) and adds no overhead to the performance of the clusters. (iii) It provides a unique architecture that puts the tenant in control of its own security and minimizes the trust needed in the provider for sharing nodes. (iv) Finally, it shows that it is possible to encourage even mutually non-trusting tenants to share their nodes with each other without any central authority making allocation decisions.

Location: PHO 339