Setting readahead (RA from now on) appropriately is a contentious subject. There are a lot of variables involved, but in my particular case I am setting out to minimize those variables, get a baseline, and have a reasonable idea of what to expect out of this configuration:
- Environment: Amazon EC2
- Instance Size: m3.xlarge (4 vCPU, 15GiB RAM)
- Disk Config: Single EBS Volume, 1000 PIOPS
The testing I am going to be doing is pretty detailed, and intended for use in a future whitepaper, so I wanted to get some prep done and figure out exactly what I was dealing with here before I moved forward. The initial testing (which is somewhat unusual for MongoDB) involves a lot of sequential IO. Normally, I am tweaking an instance for random IO and optimizing for memory utilization efficiency – a very different beast which generally means low RA settings. For this testing, I figured I would start with my usual config (and the one I was using on a beefy local server) and do some tweaking to see what the impact was.
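For anyone following along, the tweaking itself is just a `blockdev` call per setting. A minimal sketch (the `/dev/xvdf` device name is an assumption here; substitute whatever your EBS volume is attached as):

```shell
DEV=/dev/xvdf                 # assumption: your EBS volume's device

# Only touch the device if it actually exists on this box.
if [ -b "$DEV" ]; then
    blockdev --getra "$DEV"   # show the current readahead
    blockdev --setra 128 "$DEV"   # set readahead to 128 sectors (needs root)
fi

# RA values are counted in 512-byte sectors, so:
echo $((128 * 512))           # 65536 -> an RA of 128 is 64KiB of readahead
```

Note that `--setra` takes effect immediately but does not persist across reboots, so it needs to be reapplied (or scripted) per test run.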
I was surprised to find a huge cliff in terms of operations per second hitting the volume when I dropped RA to 16. I expected the larger readahead settings to help up to a certain point because of the sequential IO (probably up to the point that I saturate the bandwidth to the EBS volume or similar). But I did not expect the “cliff” between RA settings of 32 and 16.
To elaborate: one of the things I was keeping an eye on was the page faulting rate within MongoDB. MongoDB only reports “hard” page faults, where the data is actually fetched off the disk. Since I was wiping out the system caches between runs, all of the data I was reading had to come from the disk, so the fault rate should be pretty predictable, and IO was going to be my limiting factor.
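Wiping the caches between runs is the standard sync-then-drop dance. A quick sketch (the actual drop needs root, hence the guard):

```shell
# Start each run cold: flush dirty pages, then drop the page cache,
# dentries, and inodes so every read has to come off the disk.
sync                                      # flush dirty pages first
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 3 > /proc/sys/vm/drop_caches     # 3 = pagecache + dentries/inodes
else
    echo "re-run as root to actually drop the caches"
fi
```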
With RA at 32, my tests took longer than at 64; 64 took longer than 128, and so on, until the results for 256 and 512 were close enough to make no difference and RA was no longer really a factor. At 32, the faulting rate was relatively normal – somewhere around 20 faults/sec at peak, and well within the capacity of the PIOPS volume to satisfy. This was a little higher than the 64 RA fault rate, which ran at ~15 faults/sec. I was basically just keeping an eye on it; it did not seem to be playing too big a part.
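Watching the fault rate alongside the device IO is easy enough with `mongostat` (its `faults` column is the hard page faults/sec figure above) and `iostat`. A sketch, assuming both tools are installed and the volume is `xvdf`:

```shell
# Sample MongoDB's hard fault rate: the "faults" column, 5 one-second rows.
if command -v mongostat >/dev/null 2>&1; then
    mongostat --rowcount 5 1
fi

# Sample what's actually hitting the volume: r/s is read ops/sec.
if command -v iostat >/dev/null 2>&1; then
    iostat -x xvdf 1 5
fi
```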
With an RA of 16, though, things slowed down dramatically. The faults spiked to over 1000 faults/sec and stayed there. That’s a ~50x increase over the 32 RA setting, and it basically pegs the max PIOPS I have on that volume. Needless to say, the test takes a **lot** longer to run with the IO pegged. To show this graphically, here are the run completion times with the differing RA settings:
TL;DR I will be using an RA setting of 128 for this testing, and will be very careful before dropping RA below 32 on EBS volumes in EC2 in the future.
Update: A bit of digging revealed that the max/default size of an IO request on Provisioned IOPS volumes is 16K. Since RA is specified in 512-byte sectors, an RA of 32 (16KiB) matches that IO size exactly, whereas dropping 50% below it to 16 (8KiB) is essentially a bad mismatch. Not sure it justifies the 1000+ IOPS that suddenly appear, but at least it’s a partial explanation.
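The arithmetic behind that match, spelled out (RA values are in 512-byte sectors):

```shell
# An RA of 32 sectors lines up exactly with a 16K IO request...
echo $((32 * 512))   # 16384 bytes = 16KiB
# ...while an RA of 16 sectors is only half of one.
echo $((16 * 512))   # 8192 bytes = 8KiB
```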