Posted by: mk408 in Disk Storage on June 6, 2009
Recently, I had a discussion with a colleague about storage performance. He kept talking about IOPS, whereas I have always measured it in the perhaps more traditional bytes per second. Since IOPS is effectively the reciprocal of latency, I have tended to ignore it for disk storage, as I have yet to see any use case that is synchronous, let alone sensitive to sub-centisecond latencies.
The alleged use case is a random-write-heavy Oracle instance. I confirmed with a DBA I know that Oracle's block sizes range from 4 to 32 KiB. That suggests that the worst-case random I/O can't occur, as the payload for each operation will be between 8 and 64 sectors (assuming 512-byte sectors). Still, I no longer have the data from the benchmarks I ran, so I can't quantify how much of a difference this might make.
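To make the sector arithmetic explicit, here is a minimal sketch. It assumes traditional 512-byte sectors, which the 8-to-64 range implies; the function name `sectors_per_block` is made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors


def sectors_per_block(block_kib: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Return how many sectors a database block of block_kib KiB spans."""
    return block_kib * 1024 // sector_bytes


# Block sizes within the 4 to 32 KiB range mentioned above.
for kib in (4, 8, 16, 32):
    print(f"{kib:>2} KiB block = {sectors_per_block(kib):>2} sectors")
```

On a drive with 4 KiB sectors the counts would shrink by a factor of eight, so the range only holds under the 512-byte assumption.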
I can, however, quantify what I've observed in terms of throughput numbers. A commodity 500GB 7200RPM SATA drive can do around 100MiB/s for sustained, sequential I/O. It drops to around 10MiB/s for sustained, contentious (though not rigorously, statistically random) I/O. If it can do around 100 IOPS, the payloads must be much larger than even 64 sectors: roughly 200 sectors, or three to four times that number. Perhaps the Linux I/O scheduler queue combined with NCQ gets enough adjacency to account for that increase.
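The implied payload per operation is just throughput divided by IOPS. A rough sketch of that division, using the approximate figures above (10MiB/s, 100 IOPS) and again assuming 512-byte sectors; `payload_per_op` is a hypothetical helper name, not anything from a real tool.

```python
MIB = 1024 * 1024
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors


def payload_per_op(throughput_mib_s: float, iops: float,
                   sector_bytes: int = SECTOR_BYTES) -> tuple[float, float]:
    """Return the average payload per operation as (KiB, sectors)."""
    bytes_per_op = throughput_mib_s * MIB / iops
    return bytes_per_op / 1024, bytes_per_op / sector_bytes


# Approximate observed figures: ~10 MiB/s of contentious I/O at ~100 IOPS.
kib, sectors = payload_per_op(10, 100)
print(f"{kib:.1f} KiB per operation, about {sectors:.0f} sectors")
print(f"ratio to a 64-sector block: {sectors / 64:.1f}x")
```

With these inputs the average works out to a little over 100 KiB per operation, somewhere between triple and quadruple the largest 64-sector block. Both inputs are ballpark numbers, so the ratio is only indicative.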
Back to Oracle, or perhaps any database: does it really perform I/O in a synchronous fashion, not even dispatching an operation until the previous one has succeeded? This strikes me as unlikely, especially in a high-concurrency environment, which is what I would assume anything with many random writes would be. Surely part of the whole point of something like intent logging is the ability to do (otherwise reckless) asynchronous writes, and logging is patently sequential.
Regardless, both theory and empirical observation lead me to the conclusion that real-world loads and capacities are more meaningfully measured in bytes, not operations, per unit time. Am I missing something?