Dave Chinner wrote:
On Fri, Oct 24, 2008 at 09:29:42AM +1000, Mark Goodwin wrote:
We're about to deploy a system+jbod dedicated for performance
regression tracking. The idea is to build the XFS dev branch
nightly, run a bunch of self contained benchmarks, and generate
a progressive daily report - date on the X-axis, with (perhaps)
wallclock runtime on the y-axis.
wallclock runtime is not indicative of relative performance
for many benchmarks. e.g. dbench runs for a fixed time and
then gives a throughput number as its output. It's the throughput
you want to compare.....
either, or. Both are differential. I want to keep this really simple,
just provide high level tracking on *when* a performance regression
may have been introduced but only with broad indicators. I don't
think anyone is regularly tracking this for XFS and we should be.
The aim is to track relative XFS performance on a daily basis
for various workloads on identical h/w. If each workload runs for
approx the same duration, the reports can all share the same
generic y-axis. The long term trend should have a positive gradient.
If you are measuring walltime, then you should see a negative
gradient as an indication of improvement....
yes :) what I meant, but was thinking "positively"
Regressions can be date correlated with commits.
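As a rough illustration of the "broad indicator" idea discussed above, here is a minimal Python sketch that flags the dates on which a daily result is worse than the trailing median by more than some threshold. The function name, the window of 5 days and the 10% threshold are all assumptions made for the example, not anything proposed in the thread. The `higher_is_better` flag covers both cases raised here: throughput (higher is better) and wallclock time (lower is better).

```python
from statistics import median


def find_regressions(series, higher_is_better, window=5, threshold=0.10):
    """Flag dates whose value is worse than the trailing median.

    series is a chronological list of (date_string, value) pairs.
    Returns a list of (date_string, relative_change) where the change
    is negative and exceeds the threshold in magnitude.
    """
    flagged = []
    for i in range(window, len(series)):
        date, value = series[i]
        # Baseline is the median of the previous `window` results.
        baseline = median(v for _, v in series[i - window:i])
        if baseline == 0:
            continue
        change = (value - baseline) / baseline
        if not higher_is_better:
            # For walltime, an increase is a regression, so flip the sign.
            change = -change
        if change < -threshold:
            flagged.append((date, change))
    return flagged
```

A flagged date only says when to start looking; the commits that landed since the previous good run still have to be examined by hand.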
For the benchmarks to be useful as regression tests, then the
harness really needs to be profiling and gathering statistics at the
same time so that we might be able to determine what caused the regression.
I would regard that as follow-up once an issue has been identified.
My proposal is too simple to be useful for diagnosis, but it should
be enough to provide heads-up. That's the aim to start with. The same
h/w can also be set up for more sophisticated measurements in the
Comments, benchmark suggestions?
The usual set - bonnie++, postmark, ffsb, fio, sio, etc.
Then some artificial tests that stress scalability like speed of
creating 1m small files with long names in a directory, the speed of
a cold cache read of the directory, the speed of a hot-cache read of
the directory, time to stat all the files (cold and hot cache),
time to remove all the files, etc. And then how well it scales
as you do this with more threads and directories in parallel...
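The directory tests described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names, the default file count and the name length are made up for the example, and it measures the hot-cache case only. A cold-cache run would need the page and dentry caches dropped between phases (on Linux, typically by writing to /proc/sys/vm/drop_caches as root, or by remounting the filesystem), which is left out here.

```python
import os
import tempfile
import time


def timed(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_dir_benchmark(root, nfiles=1000, name_len=64):
    """Time create, readdir, stat and unlink of nfiles small files in root."""
    # Long names: a fixed prefix of name_len characters plus a counter.
    prefix = "f" * name_len
    paths = [os.path.join(root, f"{prefix}_{i:08d}") for i in range(nfiles)]

    def create():
        for path in paths:
            with open(path, "wb") as f:
                f.write(b"x")

    def stat_all():
        for path in paths:
            os.stat(path)

    def remove():
        for path in paths:
            os.unlink(path)

    timings = {}
    _, timings["create"] = timed(create)
    listed, timings["readdir"] = timed(lambda: os.listdir(root))
    _, timings["stat"] = timed(stat_all)
    _, timings["remove"] = timed(remove)
    return len(listed), timings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        count, timings = run_dir_benchmark(scratch, nfiles=10000)
        for phase, seconds in timings.items():
            print(f"{phase:8s} {seconds:.3f}s for {count} files")
```

To look at scaling, the same function could be run from several threads or processes at once, each with its own directory on the filesystem under test.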
yeah OK, bits and pieces of the above, enough to provide broad
indicators.
Anyone already running this?
Know of a test harness and/or report generator?
Perhaps you might want to look more closely at FFSB - it has a
fairly interesting automated test harness. e.g. it was used to
And you can probably set up custom workloads to cover all the things
that the standard benchmarks do.....
I'll poke around on those pages for some ideas.
Thanks for the reply.