Smart redundancy for big-data systems: Theory and Practice
Large-scale distributed storage and caching systems form the foundation of big-data systems. A key scalability challenge in distributed storage systems is achieving fault tolerance in a resource-efficient manner. To address this challenge, erasure codes provide a storage-efficient alternative to the traditional approach of data replication. However, classical erasure codes come with critical drawbacks: while optimal in their use of storage space, they significantly increase the usage of other important cluster resources such as network and I/O. In the first part of the talk, I present new erasure codes along with theoretical optimality guarantees. The proposed codes reduce network and I/O usage by 35-70% for typical parameters while retaining the storage efficiency of classical codes. I then present an erasure-coded storage system that employs the proposed codes, and demonstrate significant benefits over the state of the art in evaluations in a production setting at Facebook. Our codes have been integrated into Apache Hadoop 3.0. The second part of the talk focuses on achieving high performance in distributed caching systems. These systems routinely face skew in data popularity, background traffic imbalance, and server failures, which result in load imbalance across servers and degraded read latencies. I present EC-Cache, a cluster cache that employs erasure coding to achieve a 3-5x improvement over the state of the art.
Rashmi K. Vinayak received her PhD from the EECS department at UC Berkeley in 2016, where she is now a postdoctoral researcher at AMPLab/RISELab and BLISS, working with Ion Stoica and Kannan Ramchandran. Her dissertation received the Eli Jury Award 2016 from the EECS department at UC Berkeley for outstanding achievement in the area of systems, communications, control, or signal processing. Rashmi is the recipient of the IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011/2012. She is also a recipient of the Facebook Fellowship 2012-13, the Microsoft Research PhD Fellowship 2013-15, and the Google Anita Borg Memorial Scholarship 2015-16. Her research interests lie in the theoretical and systems challenges that arise in the storage and analysis of big data.