OpenStack Swift is a versatile platform for building highly scalable and highly available storage clouds. However, deploying a Swift-based storage cloud that delivers high performance while keeping both upfront and ongoing costs low is a challenging task. The first task for cloud builders is to identify the characteristics of the workload they are optimizing for, i.e. the distribution of object sizes and the ratios between read, write and delete operations. Next, for their specific workload characteristics, cloud builders need to consider several important questions with overlapping implications:
(1) How should the hardware resources (e.g. CPU, memory, I/O devices) of the storage servers and proxy servers be provisioned in a cost-effective way? (2) What is the best ratio between the number of storage servers and the number of proxy servers in a Swift storage cloud? (3) Should more expensive but faster I/O devices be used for certain Swift services (e.g. the container service)? (4) Given a particular hardware provisioning, which software-level tunings and optimizations (e.g. Swift configuration files, filesystem, and OS settings) are recommended for optimal performance?
Beyond these questions, cloud builders also want to know how their Swift storage cloud performs in various degraded modes, i.e. when failures occur (e.g. one of the storage servers is down). Their SLA requirements may mandate a minimum level of performance even in the face of some failures.
Based on our hands-on experience with several Swift implementations and hundreds of benchmark runs in our labs, we will share our methods for provisioning a Swift storage cloud, on both the hardware and software sides, to reach the expected performance while keeping the upfront cost low. In addition, we will discuss how to precisely benchmark a Swift storage cloud by simulating different workloads and failure scenarios. We will share both quantitative results and the best practices derived from them.
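
To make the notion of a workload concrete, the sketch below describes one as an object-size distribution plus read/write/delete ratios and draws a reproducible sequence of operations from it. It is only an illustration: the names `Workload` and `sample_operations` are hypothetical, the sizes and ratios are made-up placeholders rather than measured values, and nothing here is tied to a particular benchmarking tool.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description (not part of Swift or any tool)."""
    # (size_in_bytes, relative_weight) pairs: the object-size distribution
    object_sizes: tuple
    # operation name -> relative weight: the read/write/delete mix
    op_ratios: dict


def sample_operations(workload, count, seed=0):
    """Draw `count` (operation, size) pairs; size is set only for writes."""
    rng = random.Random(seed)  # fixed seed keeps benchmark plans repeatable
    sizes, size_weights = zip(*workload.object_sizes)
    ops, op_weights = zip(*workload.op_ratios.items())
    plan = []
    for _ in range(count):
        op = rng.choices(ops, weights=op_weights)[0]
        size = rng.choices(sizes, weights=size_weights)[0] if op == "write" else None
        plan.append((op, size))
    return plan


# Illustrative values only: a read-heavy mix dominated by small objects.
EXAMPLE = Workload(
    object_sizes=((4 * 1024, 0.70), (1024 * 1024, 0.25), (64 * 1024 * 1024, 0.05)),
    op_ratios={"read": 0.80, "write": 0.15, "delete": 0.05},
)

plan = sample_operations(EXAMPLE, 1000)
```

A plan generated this way could be replayed against the cluster in normal operation and again with a storage server taken offline, so that the two runs see the same request sequence.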