On Thu, 27 Jan 2011, Stan Hoeppner wrote:
david@xxxxxxx put forth on 1/27/2011 2:11 PM:
how do I understand how to set up things on multi-disk systems? the documentation
I've found online is not that helpful, and in some ways contradictory.
Visit http://xfs.org. There you will find:
File system structure:
thanks for the pointers.
If there really are good rules for how to do this, it would be very helpful if
you could just give mkfs.xfs the information about your system (this partition
is on a 16 drive raid6 array) and have it do the right thing.
If your disk array is built upon Linux mdraid, recent versions of mkfs.xfs will
read the parameters and automatically make the filesystem accordingly, properly.
mkfs.xfs will not do this for PCIe/x hardware RAID arrays or external FC/iSCSI
based SAN arrays as there is no standard place to acquire the RAID configuration
information for such systems. For these you will need to configure mkfs.xfs
manually.
At minimum you will want to specify stripe width (sw) which needs to match the
hardware stripe width. For RAID0 sw=[#of_disks]. For RAID 10, sw=[#disks/2].
For RAID5 sw=[#disks-1]. For RAID6 sw=[#disks-2].
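
As an illustration only, the sw rules above can be written down as a small
Python helper. The function name stripe_width and the way the RAID level is
spelled are assumptions made for this sketch; they are not part of mkfs.xfs.

```python
def stripe_width(raid_level: str, num_disks: int) -> int:
    """Return sw (number of data-bearing disks) for a striped array,
    following the per-level rules quoted in the message above."""
    level = raid_level.lower().replace(" ", "")
    if level == "raid0":
        sw = num_disks            # every disk holds data
    elif level == "raid10":
        if num_disks % 2:
            raise ValueError("RAID10 needs an even number of disks")
        sw = num_disks // 2       # half the disks are mirrors
    elif level == "raid5":
        sw = num_disks - 1        # one disk's worth of parity
    elif level == "raid6":
        sw = num_disks - 2        # two disks' worth of parity
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    if sw < 1:
        raise ValueError("too few disks for this RAID level")
    return sw


# The 16 drive raid6 array mentioned earlier in the thread:
print(stripe_width("raid6", 16))
```

For the 16 drive raid6 array this gives sw=14.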
You'll want at minimum agcount=16 for striped hardware arrays. Depending on the
number and spindle speed of the disks, the total size of the array, the
characteristics of the RAID controller (big or small cache), you may want to
increase agcount. Experimentation may be required to find the optimum
parameters for a given hardware RAID array. Typically all other parameters may
be left at defaults.
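
A minimal sketch of how those two settings might be put on a mkfs.xfs command
line, assuming a hardware array whose geometry you already know. The device
path /dev/sdX and the 64k stripe unit (su) are placeholders chosen for the
example, not recommendations; su must match whatever chunk size the
controller actually uses. The helper name mkfs_xfs_command is hypothetical.
The script only prints the command and does not run it.

```python
import shlex


def mkfs_xfs_command(device: str, sw: int, su: str = "64k", agcount: int = 16) -> list:
    """Build (but do not execute) a mkfs.xfs argument list.

    su and device are assumed values for illustration; sw and agcount
    are the two settings discussed in the message above.
    """
    data_opts = f"su={su},sw={sw},agcount={agcount}"
    return ["mkfs.xfs", "-d", data_opts, device]


# 16 drive raid6 array: sw = 16 - 2 = 14
cmd = mkfs_xfs_command("/dev/sdX", sw=14)
print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to try; the resulting
line can be reviewed and adjusted before it is ever run against a real device.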
does this value change depending on the number of disks in the array?
Picking the perfect mkfs.xfs parameters for a hardware RAID array can be
somewhat of a black art, mainly because no two vendor arrays act or perform
identically.
if mkfs.xfs can figure out how to do the 'right thing' for md raid arrays,
can there be a mode where it asks the users for the same information that
it gets from the kernel?
Systems of a caliber requiring XFS should be thoroughly tested before going into
production. Testing _with your workload_ of multiple parameters should be
performed to identify those yielding best performance.
the problem with this is that for large arrays, formatting the array and
loading it with data can take a day or more, even before you start running
the test. This is made even worse if you are scaling up an existing system
a couple of orders of magnitude, because you may not have the full workload
available to you. Saying that you should test out every option before
going into production is a cop-out. The better you can test it, the better
off you are, but without knowing what the knobs do, just doing a test and
twiddling the knobs to do another test isn't very useful. If there is a
way to set the knobs in the general ballpark, then you can test and see if
the performance seems adequate; if not, you can try tweaking one of the
knobs a little bit and see if it helps or hurts. But if the knobs aren't
even in the ballpark when you start, this doesn't help much.