Fan Z.,University of Minnesota | Du D.H.C.,University of Minnesota | Voigt D.,HP Storage
IEEE Symposium on Mass Storage Systems and Technologies | Year: 2014

With the rapid development of new types of nonvolatile memory (NVM), one of these technologies may replace DRAM as the main memory in the near future. Some drawbacks of DRAM, such as data loss due to power failure or a system crash, can be remedied by NVM's non-volatile nature. In the meantime, solid state drives (SSDs) are becoming widely deployed as storage devices because of their faster random access speed compared with traditional hard disk drives (HDDs). For applications demanding higher reliability and better performance, using NVM as the main memory and SSDs as storage devices becomes a promising architecture. Although SSDs have better performance than HDDs, SSDs cannot support in-place updates (i.e., an erase operation has to be performed before a page can be updated) and suffer from low endurance: each unit wears out after a certain number of erase operations. In an NVM-based main memory, updated pages, called dirty pages, can be kept longer without the urgent need to be flushed to SSDs. This difference opens an opportunity to design new cache policies that help extend the lifespan of SSDs by wisely choosing cache eviction victims to decrease storage write traffic. However, it is very challenging to design a policy that can also increase the cache hit ratio for better system performance. Most existing DRAM-based cache policies have mainly concentrated on the recency or frequency status of a page. On the other hand, most existing NVM-based cache policies have mainly focused on the dirty or clean status of a page. In this paper, by extending the concept of the Adaptive Replacement Cache (ARC), we propose a Hierarchical Adaptive Replacement Cache (H-ARC) policy that considers all four factors of a page's status: dirty, clean, recency, and frequency. Specifically, at the higher level, H-ARC adaptively splits the whole cache space into a dirty-page cache and a clean-page cache. At the lower level, H-ARC splits each of the dirty-page cache and the clean-page cache into a recency-page cache and a frequency-page cache. During the page eviction process, all parts of the cache are balanced toward their desired sizes. © 2014 IEEE.
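The write-traffic argument in this abstract can be illustrated with a small simulation. The sketch below is not H-ARC: it has no adaptive split and no frequency tracking. It only compares plain LRU against a hypothetical clean-first eviction rule in a write-back cache, to show how the choice of eviction victim changes the number of storage writes. The function name `simulate`, the trace format, and the `prefer_clean` flag are assumptions made for this example.

```python
from collections import OrderedDict

def simulate(trace, capacity, prefer_clean):
    """Replay (page, is_write) accesses through a write-back cache.

    Returns (hits, storage_writes). A storage write is counted only when
    a dirty page is evicted. Simplified illustration, not H-ARC.
    """
    cache = OrderedDict()  # page -> dirty flag, ordered from LRU to MRU
    hits = writes = 0
    for page, is_write in trace:
        if page in cache:
            hits += 1
            # A page stays dirty once written, until it is evicted.
            dirty = cache.pop(page) or is_write
        else:
            if len(cache) >= capacity:
                victim = None
                if prefer_clean:
                    # Hypothetical rule: evict the least recent clean page.
                    victim = next((p for p, d in cache.items() if not d), None)
                if victim is None:
                    victim = next(iter(cache))  # fall back to plain LRU
                if cache.pop(victim):
                    writes += 1  # dirty victim must be flushed to storage
            dirty = is_write
        cache[page] = dirty  # (re)insert at the MRU end
    return hits, writes

trace = [(1, True), (2, False), (3, False), (1, True)]
lru_result = simulate(trace, capacity=2, prefer_clean=False)
clean_first_result = simulate(trace, capacity=2, prefer_clean=True)
```

On this toy trace, plain LRU evicts the dirty page 1 and pays one storage write, while the clean-first rule keeps it cached, avoiding the write and gaining a hit. A fixed clean-first rule can hurt the hit ratio on other traces, which is the tension the abstract says an adaptive policy has to manage.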

Lillibridge M.,Hewlett-Packard | Eshghi K.,Hewlett-Packard | Bhagwat D.,HP Storage
HP Laboratories Technical Report | Year: 2013

Slow restoration due to chunk fragmentation is a serious problem facing inline chunk-based data deduplication systems: restore speeds for the most recent backup can drop by orders of magnitude over the lifetime of a system. We study three techniques for alleviating this problem: increasing cache size, container capping, and using a forward assembly area. Container capping is an ingest-time operation that reduces chunk fragmentation at the cost of forfeiting some deduplication, while using a forward assembly area is a new restore-time caching and prefetching technique that exploits the perfect knowledge of future chunk accesses available when restoring a backup to reduce the amount of RAM required for a given level of caching at restore time. We show that using a larger cache per stream (we see continuing benefits even up to 8 GB) can produce up to a 5-16X improvement, that giving up as little as 8% deduplication with capping can yield a 2-6X improvement, and that using a forward assembly area is strictly superior to LRU, able to yield a 2-4X improvement while holding the RAM budget constant. © Copyright 2013 Hewlett-Packard Development Company.
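The forward assembly area idea can be sketched roughly as follows. Because the backup recipe lists every chunk in output order, the restorer can look ahead over a window, read each needed container once, and copy all of that container's chunks for the window into place. This is a simplified sketch under assumptions: the window is measured in chunks rather than bytes, and the names `restore`, `read_container`, and the recipe layout are hypothetical, not taken from the report.

```python
def restore(recipe, read_container, window):
    """Restore a backup using a simplified forward assembly area.

    recipe: list of (container_id, chunk_id) pairs in output order.
    read_container(cid): returns a dict mapping chunk_id to bytes.
    window: number of recipe entries assembled at a time (assumption:
            counted in chunks here, not bytes).
    Returns (restored_bytes, container_reads).
    """
    out = []
    reads = 0
    for start in range(0, len(recipe), window):
        span = recipe[start:start + window]
        slots = [None] * len(span)  # the assembly area for this span
        for i, (cid, _) in enumerate(span):
            if slots[i] is not None:
                continue  # filled by an earlier container read
            chunks = read_container(cid)  # one read per container per span
            reads += 1
            # Copy every chunk this container contributes to the span.
            for j in range(i, len(span)):
                if span[j][0] == cid:
                    slots[j] = chunks[span[j][1]]
        out.extend(slots)
    return b"".join(out), reads

containers = {"A": {"a1": b"he", "a2": b"lo"}, "B": {"b1": b"l"}}
recipe = [("A", "a1"), ("B", "b1"), ("A", "a2")]
wide = restore(recipe, lambda cid: containers[cid], window=3)
narrow = restore(recipe, lambda cid: containers[cid], window=1)
```

With a window of three entries, container A is read once even though its chunks are interleaved with one from B. With a window of one entry, A is read twice. Both calls restore the same bytes.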

Meixner B.,University of Passau | Ettengruber M.,HP Storage | Kosch H.,University of Passau
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | Year: 2012

Preserving access to multimedia data over time may prove to be the most challenging task in all things concerning multimedia. Preserving access to data from previous technical generations has always been a rather difficult endeavor, but multimedia data, with its almost endless succession of encoding and compression algorithms, raises the stakes even higher, especially when the data must be migrated not just from the previous generation to a current technology but from decades ago. The time to start thinking about and developing techniques and methodologies to keep data accessible over time is right now, because the first challenges are becoming visible on the horizon: How can the ever-growing (and exponentially growing) amounts of data be archived without major manual intervention as soon as a storage medium runs out of free space? Is there such a thing as "endless storage capacity"? Would an "endless storage capacity" really help? Or do we need totally new ways of thinking in regard to archiving digital data for the future? © 2012 Springer-Verlag.

Fan Z.,University of Minnesota | Haghdoost A.,University of Minnesota | Du D.H.C.,University of Minnesota | Voigt D.,HP Storage
Proceedings - IEEE Computer Society's Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, MASCOTS | Year: 2015

Most computer systems currently consist of DRAM as main memory and hard disk drives (HDDs) as storage devices. Due to the volatile nature of DRAM, the main memory may suffer from data loss in the event of power failures or system crashes. With the rapid development of new types of non-volatile memory (NVRAM), such as PCM, Memristor, and STT-RAM, it becomes likely that one of these technologies will replace DRAM as main memory in the not-too-distant future. In an NVRAM-based buffer cache, updated pages can be kept longer without the urgent need to be flushed to HDDs. This opens opportunities for designing new buffer cache policies that can achieve better storage performance. However, it is challenging to design a policy that can also increase the cache hit ratio. In this paper, we propose a buffer cache policy, named I/O-Cache, that regroups and synchronizes long sets of consecutive dirty pages to take advantage of HDDs' fast sequential access speed and the non-volatile property of NVRAM. In addition, our new policy can dynamically separate the whole cache into a dirty cache and a clean cache, according to the characteristics of the workload, to decrease storage writes. We evaluate our scheme with various traces. The experimental results show that I/O-Cache shortens I/O completion time, decreases the number of I/O requests, and improves the cache hit ratio compared with existing cache policies. © 2015 IEEE.
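The regrouping step mentioned in this abstract can be illustrated in a few lines. The sketch below only shows how a set of dirty page numbers could be coalesced into runs of consecutive pages, so that each run can be written to an HDD as one sequential request. It is not the I/O-Cache policy itself: how I/O-Cache selects and synchronizes runs is not described here. The function name `dirty_runs` and the `min_len` threshold are assumptions made for illustration.

```python
def dirty_runs(dirty_pages, min_len=1):
    """Group dirty page numbers into runs of consecutive pages.

    Returns a list of (first_page, run_length) tuples, sorted by page
    number. Runs shorter than min_len are left out (hypothetical
    threshold: short runs could stay cached instead of being flushed).
    """
    runs = []
    run = []
    for page in sorted(set(dirty_pages)):
        if run and page == run[-1] + 1:
            run.append(page)  # extends the current consecutive run
        else:
            if run and len(run) >= min_len:
                runs.append((run[0], len(run)))
            run = [page]  # start a new run
    if run and len(run) >= min_len:
        runs.append((run[0], len(run)))
    return runs

all_runs = dirty_runs([7, 3, 4, 5, 9, 10])
long_runs = dirty_runs([7, 3, 4, 5, 9, 10], min_len=2)
```

Here six dirty pages collapse into three write requests, or two if single-page runs are held back in the cache, which is one way fewer and more sequential I/O requests could arise.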

Rabinovici-Cohen S.,IBM | Cummings R.,Antesignanus | Fineberg S.,HP Storage
CEUR Workshop Proceedings | Year: 2014

Long-term preservation of digital information, including large machine-generated data sets, is a growing necessity in many domains. A key challenge in meeting this need is the creation of vendor-neutral storage containers that can be interpreted over time. We describe SIRF, the Self-contained Information Retention Format, which is being developed by the Storage Networking Industry Association (SNIA) to address this challenge. We define the SIRF components, its metadata, categories and elements, along with some security guidelines. SIRF metadata includes the semantic information as well as schema and ontological information needed to preserve the physical integrity and logical meaning of preservation objects. We also describe how the SIRF logical format is serialized for storage containers in the cloud and for tape-based containers. Aspects of SIRF serialization for the cloud are being tested with OpenStack Swift object storage in the ForgetIT EU project.
