Targeting Skewed Workloads with the Smelter Distributed DB

Presented at the OSDI 2023 Poster Session.


Distributed databases shard data across many machines, allowing them to support large-scale applications whose data cannot fit in a single-machine database. This scalability comes at a cost: under skewed workloads, the throughput of distributed databases is much lower than that of modern single-machine databases.

We present Smelter, the first distributed database that supports large-scale applications while approaching the throughput of networked, replicated single-machine databases under skewed workloads. Smelter’s dual-database concurrency control protocol combines a set of servers running a distributed protocol, which provides scalability, with a single hotshard that provides throughput comparable to that of a single-machine database. A specialized replication protocol then provides fault tolerance for the hotshard while preserving most of its throughput. Our evaluation shows that Smelter achieves over an order of magnitude better performance under skewed workloads than a state-of-the-art distributed database.
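
The abstract does not say how Smelter decides which data lives on the hotshard or how transactions are routed. As a rough illustration of the general idea only, the sketch below assumes a fixed, known set of hot keys that is served by one hotshard, with all other keys hash-partitioned across ordinary shards. The names `Router`, `shard_for`, `plan`, and the `cold-N` labels are hypothetical and are not taken from Smelter.

```python
import hashlib


class Router:
    """Hypothetical router (not Smelter's actual design): hot keys go to a
    single hotshard, every other key is hash-partitioned across cold shards."""

    def __init__(self, num_cold_shards, hot_keys):
        if num_cold_shards < 1:
            raise ValueError("need at least one cold shard")
        self.num_cold_shards = num_cold_shards
        # Assumption: the hot key set is known up front and static.
        self.hot_keys = set(hot_keys)

    def shard_for(self, key):
        """Return the label of the shard that owns `key`."""
        if key in self.hot_keys:
            return "hotshard"
        # Use a stable hash so that routing is deterministic across processes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % self.num_cold_shards
        return "cold-%d" % index

    def plan(self, keys):
        """Group the keys touched by one transaction by destination shard."""
        groups = {}
        for key in keys:
            groups.setdefault(self.shard_for(key), []).append(key)
        return groups


router = Router(num_cold_shards=4, hot_keys=["counter:global"])
plan = router.plan(["counter:global", "user:1", "user:2"])
```

In this sketch, a transaction whose plan contains only the `hotshard` entry could run entirely under single-machine concurrency control, while a plan that spans several shards would need a distributed protocol. How Smelter coordinates the two cases is the subject of its dual-database protocol and is not modeled here.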