Design of the Management Component in a Scavenged Storage Environment

Date

2005-07-31

Abstract

High-end mass storage systems are becoming increasingly popular in supercomputing facilities for their large storage capacities and superior data delivery rates. End users, on the other hand, have difficulty processing this data on their local workstations because of limited disk bandwidth and memory. The FreeLoader project is based on the premise that, in a LAN environment, a large number of such workstations collectively represent significant storage space and aggregate I/O bandwidth if harnessed when idle. Aggregating these resources is made viable by the high-speed interconnect between nodes in a LAN. FreeLoader aggregates free storage space and I/O bandwidth contributed by commodity desktops to provide a shared cache/scratch space for large, immutable data sets. Data is initially striped across multiple workstations, which enables subsequent retrieval as parallel streams from those workstations. In this thesis, we present the management component of the FreeLoader project. We discuss its functionality in terms of data placement and the maintenance of information about the workstations that donate storage. We show how striping maximizes retrieval rates and helps balance load. We present the design choices for the management component and how to minimize its overheads. Because space contributions are dynamic and the amount of donated space is limited, we also model the entire FreeLoader cloud as a cache space with an eviction policy. We discuss how the management component evicts data sets in a manner that exploits temporal locality, based on a history of accesses. Finally, we discuss experimental results that show the impact of different striping parameters on data access rates and the viability of FreeLoader compared with traditional data retrieval from high-end storage systems.

Keywords

storage scavenging, distributed systems, parallel I/O, resource aggregation

Degree

MS

Discipline

Computer Science
