As you may know, running Redis with persistent storage on AWS has been a challenge for users. Local storage lacks data durability, and standard EBS is perceived to be slow and unpredictable. A recent discussion in the Redis community suggested changing the default Redis configuration to AOF with “fsync every second” for better data durability than today’s default snapshotting policy provides. Because many Redis users run on AWS, one of the arguments against changing the default is the perceived slowness of EBS, combined with the fact that AOF is much more demanding than snapshotting with respect to storage access. As a provider of the Redis Cloud, we decided to find out whether EBS really slows down Redis when used over various AWS platforms.

Redis Data Persistence

Redis’ AOF and snapshotting persistence mechanisms are very efficient, as they only use sequential writes with no seek time. Because of this, Redis can run pretty well with data persistence even on non-SSD hardware. That being said, slow storage devices can still be very painful for Redis. We see this when the AOF “fsync every second” policy blocks an entire Redis operation due to slow disk I/O. In addition, background saving for point-in-time snapshots or AOF rewrites takes longer, which results in more memory being used for copy-on-write and less memory being available for your app.

Testing Redis with and without Persistent Storage

We conducted our latest benchmark to understand how each typical Redis configuration on AWS performs with and without persistent storage. We tested AOF with the “fsync every second” policy, as we believe this is the most common data-persistence configuration and one that nicely balances performance and data durability.
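To make the “fsync every second” idea concrete, here is a minimal Python sketch of an append-only log that writes sequentially and forces data to disk at most once per second. It is a toy model, not Redis’ implementation: the class name `AppendOnlyLog`, its parameters, and the plain-text command format are all assumptions made for illustration, and Redis itself hands the fsync to a background thread rather than calling it inline.

```python
import os
import time


class AppendOnlyLog:
    """Toy append-only log (hypothetical) that fsyncs at most once per interval."""

    def __init__(self, path, fsync_interval=1.0, clock=time.monotonic):
        self._file = open(path, "ab")          # append mode: sequential writes only
        self._fsync_interval = fsync_interval  # seconds between forced syncs
        self._clock = clock                    # injectable clock, handy for testing
        self._last_fsync = clock()
        self.fsync_count = 0

    def append(self, command: bytes) -> None:
        # Write to the end of the file; no seeks are involved.
        self._file.write(command + b"\n")
        now = self._clock()
        # Only pay for an fsync if the interval has elapsed since the last one.
        if now - self._last_fsync >= self._fsync_interval:
            self._sync(now)

    def _sync(self, now) -> None:
        self._file.flush()             # Python buffer -> OS page cache
        os.fsync(self._file.fileno())  # OS page cache -> storage device
        self._last_fsync = now
        self.fsync_count += 1

    def close(self) -> None:
        # Final sync so nothing is left only in memory.
        self._sync(self._clock())
        self._file.close()
```

In this sketch, a slow device shows up as time spent inside `os.fsync`, and anything appended since the last sync (up to about one second of writes) can be lost on a crash. That is the trade-off the “fsync every second” policy accepts.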
Over 90% of our Redis users who enable data persistence use AOF with “fsync every second.” Although it is a common Redis configuration, we didn’t test a setup in which a master node serves the entire dataset from RAM, leaving the slave node to deal with data persistence. This configuration may result in issues like replication bottlenecks between master and slave, which are not directly related to the storage bottlenecks we were testing for. We’re also leaving the effects of Redis fork, AOF rewrite and snapshot operations for our next posts, so stay tuned!

Benchmark test results

Here’s what we found when comparing how each platform runs with and without data persistence.

Throughput
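All platforms ran slightly faster without data persistence (no DP), but the difference was small: 15% for the m1.xlarge instance, 8% or less for the other instances, and about 1% for the Redis Cloud platform. Note: By default, Redis Cloud uses RAIDed EBS for each of its cluster nodes, but we used a non-RAIDed configuration on all the other platforms in this benchmark. It is therefore safe to assume that a more optimized EBS configuration can reduce the performance degradation.

Latency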
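On the m1.xlarge instance, the average response time with data persistence was 13% higher, while on all other instances it was less than 8% higher. Again, thanks to Redis Cloud’s optimized EBS configuration, its average response time with data persistence was only 2% higher. On m1.small and m1.large instances, the 99th-percentile response time showed a significant latency difference, mainly because these instances are more likely to be shared on a single physical server than larger instances such as m2.2xlarge and m2.4xlarge, as Adrian Cockcroft has nicely explained. On the other hand, only very slightly higher latency was observed on m1.xlarge and m2.2xlarge instances, and equal response times were seen with Redis Cloud. Note: Our response-time measurements take into account the network round-trip time, the Redis/Memcached processing time, and the time it takes our memtier_benchmark tool to parse the results.

Should AOF be the default Redis configuration?

We think so. This benchmark clearly shows that, under normal conditions, running Redis with AOF over a standard, non-RAIDed EBS configuration does not significantly affect Redis’ performance on various AWS platforms. If we take into account that Redis professionals typically tune their redis.conf files carefully before using any data-persistence method, and that newbies usually don’t generate loads as large as the ones we used in this benchmark, it is safe to assume that this performance difference can be almost neglected in real-life scenarios.

Why does Redis Cloud perform so much better?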
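Behind the scenes, we monitor every Redis command and constantly compare its response time to what the optimal value should be. This architecture lets our customers run on the strongest instances and enjoy the highest performance, regardless of their dataset size (including small to medium ones). With Garantia Data, they can do this while paying only for the GBs they actually use each hour, so they get a double advantage.

Benchmark test setup

For those who want to know more details about our benchmark, here are the resources we used: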
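For each tested platform, we ran two tests:
We ran each test three times on each configuration and calculated the average results using the following parameters:
Our dataset sizes included: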
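
The percentages quoted in the results come down to averaging repeated runs and comparing the two configurations. As a rough sketch of that arithmetic in Python (the function name `overhead_pct` and the sample numbers are hypothetical, made up purely for illustration, and are not measurements from this benchmark):

```python
from statistics import mean


def overhead_pct(no_dp_runs, dp_runs):
    """Percent drop in average throughput when data persistence is enabled."""
    baseline = mean(no_dp_runs)  # average of the runs without persistence
    with_dp = mean(dp_runs)      # average of the runs with AOF enabled
    return 100.0 * (baseline - with_dp) / baseline


# Hypothetical ops/sec figures for three runs each; NOT real benchmark data.
no_dp = [100_000, 102_000, 98_000]
dp = [92_000, 93_000, 91_000]
print(f"throughput overhead: {overhead_pct(no_dp, dp):.1f}%")
```

The same calculation applies to latency if the sign is flipped, since a higher response time with persistence is the overhead in that case.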