
BINGO: Serviceable, Scalable Distributed Cold Storage

Other Titles
BINGO: 서비스 가용성과 확장성을 갖춘 저전력 분산 스토리지 (BINGO: Low-Power Distributed Storage with Service Availability and Scalability)
Lee, Jaemyoun
The massive storage infrastructure required by cloud computing systems has significant implications for power bills, carbon emissions, and the logistics of data centers. Various proprietary ‘cold storage’ services, based on spun-down disks or tapes, offer reduced tariffs but incur extended first-access times. One way of improving the access latency of cold storage is to build it on a file system that accounts for the input/output (I/O) patterns of storage devices. After analyzing a trace of I/O requests from a messenger service, I found that the requests followed a strongly skewed Zipfian distribution and that most of the stored data were cold. I have developed a cold storage testbed for mobile messenger services that considers the power consumption of each hard disk in the system. Current cloud benchmark tools cannot reproduce this I/O pattern. Therefore, I have developed a framework for benchmarking cold storage systems that emulates this type of long-tailed distribution and reduces the power consumption of mobile messenger services. This dissertation describes a prototype distributed file system (Bluemoon) that provides energy-efficient and scalable cold storage for data-intensive applications. The proposed system uses a data management architecture that concentrates I/O operations so that storage devices can be spun down as soon as possible. Although the power savings are remarkable, the access latency inherited from the spin-down technology is 1,000 times higher than that of regular storage. To reduce the latency, I developed an enhanced file system architecture (BINGO) that is serviceable, scalable, and manageable. BINGO can be used to construct tiered storage systems in small-scale data centers, thus reducing operational and raised-floor costs. This is made possible by a Room–Bucket approach that provides excellent performance and scalability by isolating storage devices and separating their responsibilities.
This approach concentrates write operations on a few collections of storage devices and distributes consistent data replicas across the specific storage devices that belong to these collections. The findings are as follows. First, the history map, which records the storage devices accessed when data objects are written, is biased toward a few devices. Second, the power consumption of storage devices can be reduced by 58.17%. Third, write throughput and scalability are determined by the number of Rooms and Buckets. Finally, read throughput is maintained at a level similar to that of traditional storage systems that do not use spin-down technology. These contributions facilitate substantial reductions in electrical power and the use of inexpensive equipment for storage. Consequently, it should be possible to build a data center that handles zettabyte-scale services using only renewable energy.
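The abstract notes that the messenger I/O trace follows a strongly skewed Zipfian distribution and that existing benchmark tools cannot reproduce it. As a rough illustration only, and not the dissertation's benchmarking framework, the following Python sketch generates a synthetic request trace from a Zipf-like popularity law and measures how much of the traffic lands on the most popular objects. The function names (`zipf_weights`, `generate_trace`, `hot_fraction`), the skew value, and the object and request counts are all assumptions chosen for the example.

```python
import random
from itertools import accumulate


def zipf_weights(num_objects, skew):
    """Unnormalized Zipf weights: the object of rank r gets weight 1 / r**skew."""
    return [1.0 / (rank ** skew) for rank in range(1, num_objects + 1)]


def generate_trace(num_objects, num_requests, skew=1.0, seed=0):
    """Return a list of object IDs (0 = most popular) drawn from a Zipf-like law.

    All parameters are illustrative assumptions, not values from the dissertation.
    """
    rng = random.Random(seed)
    cum_weights = list(accumulate(zipf_weights(num_objects, skew)))
    return rng.choices(range(num_objects), cum_weights=cum_weights, k=num_requests)


def hot_fraction(trace, num_objects, top_share=0.1):
    """Fraction of requests that hit the most popular `top_share` of objects."""
    hot_ids = max(1, int(num_objects * top_share))
    hits = sum(1 for obj in trace if obj < hot_ids)
    return hits / len(trace)


if __name__ == "__main__":
    n_objects = 10_000
    trace = generate_trace(num_objects=n_objects, num_requests=100_000, skew=1.0)
    print(f"requests hitting the top 10% of objects: {hot_fraction(trace, n_objects):.2%}")
    print(f"objects never accessed (cold): {n_objects - len(set(trace))}")
```

With a skew near 1, most requests concentrate on a small set of objects while a long tail is touched rarely or never, which is the kind of access pattern that makes spinning down the devices holding cold data attractive.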