SMP may be part of the problem here, but I think I have a workaround
for the underlying problem. I think the issue is that XFS log writes are
anywhere between 512 bytes and 32K in size, varying in 512-byte
multiples. These variable-sized log writes trigger some code in raid5
which makes you wait for other I/O to complete.
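
To put rough numbers on the sizes involved, here is a small Python sketch. It
assumes the geometry from the raidtab quoted below (3 disks, 64K chunks) and
the 512-byte to 32K range guessed at above. The names are made up for
illustration, and it only does the arithmetic; it does not model what the
raid5 code actually does.

```python
SECTOR = 512                 # log writes vary in 512-byte multiples
MAX_LOG_WRITE = 32 * 1024    # assumed upper bound on a single log write
CHUNK = 64 * 1024            # "chunk-size 64" in the raidtab
NR_DISKS = 3                 # "nr-raid-disks 3" in the raidtab

# In raid5 one chunk per stripe holds parity, so the data in a full stripe is:
FULL_STRIPE = CHUNK * (NR_DISKS - 1)

# Every size a log write could take under the assumption above.
log_write_sizes = list(range(SECTOR, MAX_LOG_WRITE + SECTOR, SECTOR))

def is_full_stripe_write(size):
    """True if a write of this size could cover whole stripes exactly."""
    return size > 0 and size % FULL_STRIPE == 0

partial = [s for s in log_write_sizes if not is_full_stripe_write(s)]

print(len(log_write_sizes), "possible log write sizes")
print(len(partial), "of them do not fill a stripe of", FULL_STRIPE, "bytes")
```

Under these assumptions every possible log write is smaller than a single
chunk, so none of them can cover a whole stripe. Whether that is exactly what
trips up raid5 is still a guess at this point.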
If you put the log on another (non-raid5) partition, using an external
log instead of an internal one, the issue should go away. We are
going to test this here. I suppose a mirrored log would be the best
solution, as you could still survive the loss of the drive with the
log on it.
In the meantime I am going to look at how we can stripe-align the log writes,
which would be a good thing anyway.
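
As a rough idea of what stripe alignment could mean here, the Python sketch
below pads each log write up to the next multiple of a stripe unit, so the
sizes stop varying. The helper name is hypothetical and the 64K stripe unit is
an assumption taken from the chunk size in the raidtab quoted below; this is
not the XFS code.

```python
STRIPE_UNIT = 64 * 1024  # assumed: same as the raid5 chunk size

def round_up(size, unit):
    """Round size up to the next multiple of unit (unit must be > 0)."""
    return ((size + unit - 1) // unit) * unit

# A few log write sizes from the 512-byte to 32K range.
for size in (512, 1536, 32 * 1024):
    print(size, "->", round_up(size, STRIPE_UNIT))
```

The cost is padding: a 512-byte write would occupy a full stripe unit in the
log, so there is a trade-off between wasted log space and uniform write sizes.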
> It looks like a kernel SMP issue more than a 3ware driver issue.
> - I have tried the following SMP kernels:
> 2.4.3-xfs-1.0.1 and 2.4.7pre8 cvs version which include
> the same 3ware 3w-xxxx driver. (I have not yet flashed the firmware).
> - I can see it on a plain IDE raid5 partition.
> I will now flash the 3ware card firmware, but I don't
> know what that can change...
> Next step is rebooting with noapic.
> /dev/md1 is now a raid5-xfs on the system HD
> # raid5 on hda
> raiddev /dev/md1
> raid-level 5
> nr-raid-disks 3
> persistent-superblock 1
> chunk-size 64
> parity-algorithm left-symmetric
> device /dev/hda6
> raid-disk 0
> device /dev/hda7
> raid-disk 1
> device /dev/hda8
> raid-disk 2
> I can still see the nfs freeze (server nfs.cluster not responding)
> on both raid5 devices under 2.4.3-xfs. On the client side,
> syslog reports "nfs_statfs: statfs error = 116".
> I can still see it under 2.4.7pre8 on /dev/md0 or /dev/md1,
> but only after a somewhat longer time...
> Off topic: One strange thing is the checksumming function:
> kernel: raid5: measuring checksumming speed
> kernel: 8regs : 1169.200 MB/sec
> kernel: 32regs : 788.000 MB/sec
> kernel: pIII_sse : 1727.600 MB/sec
> kernel: pII_mmx : 1924.000 MB/sec
> kernel: p5_mmx : 2045.200 MB/sec
> kernel: raid5: using function: pIII_sse (1727.600 MB/sec)
> Why not use the faster p5_mmx?
> > > >4) try SMP+raid5 on local IDE HD to rule out 3ware issue
> ruled out. :(
> Dr Tru Huynh | Bioinformatique Structurale
> mailto:tru@xxxxxxxxxx | tel/fax +33 1 45 68 87 37/19
> Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris CEDEX 15 France