We have had a "processes stuck in D state" problem here for a long time.
It occurs with different kernel/XFS versions (SMP always enabled):
Linux Kernel 2.4.21-xfs 1.3.0-pre4 (and megaraid2-patch)
And also with these obsolete ones:
Linux Kernel 2.4.19-xfs 1.2 (and megaraid-patch)
Linux Kernel 2.4.20-xfs 1.2 (and megaraid-patch)
But not with this one:
Linux kernel 2.4.21-rc1 with xfs as obtained from XFS CVS on 2003-05-02
(megaraid2-patch applied) - We are using this kernel currently, as we
have no other choice.
Back to Linux Kernel 2.4.21-xfs 1.3.0-pre4 (and megaraid2-patch):
Many processes accessing the same XFS filesystem (the one for Samba in
the example below) are stuck in D state:
toschi 3827 0.1 0.1 6376 3304 ? D 16:10 0:00
/usr/local/samba/bin/smbd -D -p 445
As time goes by (depending on user activity), the load increases to very
high values (90 or so) as more and more processes accessing the
unresponsive filesystem get stuck in D state.
Finally the system stops responding altogether.
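
For anyone wanting to watch the pile-up as it develops, here is a minimal
sketch that lists the processes currently in D state by reading
/proc/<pid>/stat. It assumes the usual Linux procfs layout (pid, command
name in parentheses, then the state letter); the function names are only
illustrative and not part of any existing tool.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    pid = int(stat_text[:lparen].strip())
    comm = stat_text[lparen + 1:rparen]
    state = stat_text[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """Return a list of (pid, comm) for processes in uninterruptible sleep."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Run periodically (from cron, say), the output gives a rough timeline of
which processes block first; it does not by itself say which filesystem
they are waiting on.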
This does not depend on Samba. It also happens with any other
application (e.g. Apache) accessing a mounted XFS filesystem. It
seems to be filesystem specific, i.e. only processes on one XFS
filesystem are "running" in D state, not those on another mounted XFS
filesystem.
It is a production system, and it takes about a week on average after a
reboot until this happens, but it isn't exactly reproducible.
The machines we have here are dual-Xeon Dell PowerEdge 2600s. The
XFS-related behaviour does not depend on single or dual CPU usage, nor on
whether hyperthreading is switched on or off.
Stefan Röhrich stefan@xxxxxxxxx, sr@xxxxxxxx