I've been looking at this further today.
If Steve Lord is correct in his assessment that my Oops was due to an 'out of
memory' condition (I'm sure he is, it looks sensible) and I've correctly
interpreted memory usage (via 'vmstat'), then it would appear that the
kernel is running out of available memory, with resultant problems for
XFS, because all of the memory is being used, most of it to cache filesystem
data. Currently 'top' is showing:
  7:03pm  up  7:19,  2 users,  load average: 0.00, 0.00, 0.00
108 processes: 107 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  0.1% user,  0.5% system,  0.0% nice, 99.2% idle
Mem:   898848K av,  895792K used,    3056K free,       0K shrd,   13392K buff
Swap: 1028120K av,    4780K used, 1023340K free                  820396K cached
This confirms that the majority of the memory is being used for cache and that
there is currently just under 3 Mbytes of RAM free, so I can see how a sudden
kernel demand for memory may result in a failure before 'kswapd' has a chance
to free some memory up.
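To make the arithmetic behind that reading explicit, here is a rough sketch that pulls the figures out of the two memory lines of the 'top' output above. The function names are my own invention for illustration, and the parsing assumes the older 'top' layout shown here, where 'cached' is printed on the Swap line.

```python
import re

# The two memory lines from the 'top' output quoted above.
TOP_LINES = """\
Mem: 898848K av, 895792K used, 3056K free, 0K shrd, 13392K buff
Swap: 1028120K av, 4780K used, 1023340K free 820396K cached
"""

def parse_top_memory(text):
    """Return {(section, label): kilobytes} for every '<n>K <label>' pair."""
    values = {}
    for line in text.splitlines():
        section, _, rest = line.partition(":")
        for amount, label in re.findall(r"(\d+)K\s+(\w+)", rest):
            values[(section.strip(), label)] = int(amount)
    return values

def summarize(values):
    """Break total RAM down into cache, free, and everything else."""
    total = values[("Mem", "av")]
    free = values[("Mem", "free")]
    buffers = values[("Mem", "buff")]
    # Assumption: this 'top' layout prints the page cache on the Swap line.
    cached = values[("Swap", "cached")]
    return {
        "cached_pct": 100.0 * cached / total,
        "free_mib": free / 1024.0,
        "other_kib": total - free - buffers - cached,
    }

summary = summarize(parse_top_memory(TOP_LINES))
print("cached: %.1f%% of RAM" % summary["cached_pct"])
print("free:   %.2f Mbytes" % summary["free_mib"])
print("other:  %d K (processes, kernel, etc.)" % summary["other_kib"])
```

With these numbers roughly 91% of RAM is page cache and only about 62 Mbytes is accounted for by anything other than cache, buffers and free memory.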
Isn't this a common problem? Wouldn't this be the normal state for
a fileserver with more 'active' data than memory: filesystem cache will
(under Linux) grow to use as much memory as possible, hence 'free' memory
will be at a minimum? From this reasoning, adding more memory is unlikely
to help, as the system will just use more cache until there is the same
minimum amount of free memory.
It would appear that XFS has the potential to make heavier demands
on kernel memory and/or does not check that memory was allocated (I
assume that its memory allocation calls are such that it expects memory
to be allocated from immediately available physical memory?), hence
this problem (I believe a number of people have reported problems that
have been attributed to memory allocation errors?).
Have I missed something?
I then looked to try to find out how to decrease the maximum amount of
memory that the kernel would use as FS cache (or increase the minimum
amount of memory that the kernel would try to keep free). Thanks to
harri.haataja@xxxxxxxxxxxxxx for pointing me in the direction of
'Documentation/sysctl/vm.txt' in the kernel source, though as was noted this
document is out of date and refers to the 2.2 series kernels. It does
however point to '/proc/sys/vm/freepages' as giving the number of pages
at which 'kswapd' will start to free pages. This seems to exist in the
2.4 series kernels up to 2.4.9 (it may be 10 or 11?); after that the VM
changed and this parameter went away (though the documentation hasn't
changed, so I'm completely lost with the later kernels!). Anyway,
back to 2.4.9:
# cat /proc/sys/vm/freepages
383 766 1149
Which means that the kernel will try to maintain 1149 pages (~4.5Mbytes)
free memory, will try even harder to free memory at 766 pages and will
stop allocating memory other than to root if free memory falls below
383 pages (~1.5Mbytes). This would seem to agree with 'vmstat'/'top'
which tend to show ~3Mbytes free memory. I then tried to increase
these values, however these appear to be read-only values (tried by
writing directly to the file and using 'sysctl'). Indeed a comment
in the kernel file 'mm/page_alloc.c' confirms that the 'freepages'
array is not writable due to potential conflicts with different memory
zones. So I'm stuck again. I've not been able to find any obvious
place in the kernel source to change these values at kernel compilation.
Any ideas? (Either for 2.4.9 or ideally for later/current
kernels. I have reproduced what looked like a similar failure with
2.4.16 and 2.4.17 but unfortunately did not get the Oops details to
confirm that it was the same problem. I'll try to set up a test to
get this info; is there any more info that would help?) FYI: the test
environment is a server and a number (~6) of client machines running
a mixture of 'bonnie' runs and back-to-back tars copying the local
/usr to the shared filesystem from the server (last time I looked
at this it lasted ~24 hours).
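For reference, the page-to-Kbyte conversion behind the 'freepages' figures above can be sketched as follows. This assumes 4 Kbyte pages (as on i386) and uses the min/low/high names from the 2.2-era vm.txt; the helper functions are hypothetical, just for illustration.

```python
PAGE_SIZE_KIB = 4  # assumption: 4 Kbyte pages, as on i386

# Contents of /proc/sys/vm/freepages as quoted above (2.4.9).
FREEPAGES = "383 766 1149"

def thresholds_kib(text, page_kib=PAGE_SIZE_KIB):
    """Convert the three page counts (min, low, high) to Kbytes."""
    pages_min, pages_low, pages_high = (int(field) for field in text.split())
    return {
        "min": pages_min * page_kib,
        "low": pages_low * page_kib,
        "high": pages_high * page_kib,
    }

def classify(free_kib, limits):
    """Say where a given amount of free memory sits relative to the thresholds."""
    if free_kib < limits["min"]:
        return "below min"
    if free_kib < limits["low"]:
        return "between min and low"
    if free_kib < limits["high"]:
        return "between low and high"
    return "above high"

limits = thresholds_kib(FREEPAGES)
for name in ("min", "low", "high"):
    print("%-4s %5d K" % (name, limits[name]))

# Free memory as reported by 'top' above (3056K) and by the first
# 'vmstat' sample in the earlier message (3784K).
for free_kib in (3056, 3784):
    print("%d K free: %s" % (free_kib, classify(free_kib, limits)))
```

On these assumptions the thresholds come out at 1532K, 3064K and 4596K, so the 3056K that 'top' reports sits just under the 'low' mark.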
Many thanks for your time.
"Ian D. Hardy" wrote:
> > On Tue, 2002-01-15 at 13:33, Ian D. Hardy wrote:
> > > Hi,
> > >
> > > For some time we've been having problem with a server, which is acting
> > > as a master/control node and NFS server for a computational cluster
> > > (~180 client nodes). The server will crash after anywhere between
> > > a few hours and 10 days operation. We've tried various kernels and
> > > XFS patch versions from 2.4.9 kernel with XFS patch-2.4.9-xfs-2001-08-17
> > > up to and including 2.4.16 kernel with the xfs-2.4.16-all-i386 patch,
> > > if anything the 2.4.9 kernel has proved the most reliable (it normally
> > > lasts between 4 and 10 days! - 2.4.16 lasted less than 24hrs).
> .... more details deleted
> > >
> > Almost certainly this is an out of memory condition, just from looking
> > at the code in the function you oopsed in. Would you say your system is
> > stressed when it comes to memory?
> > Steve
> > Steve Lord voice: +1-651-683-3511
> > Principal Engineer, Filesystem Software email: lord@xxxxxxx
> I'd not regard the server as short of memory as its using ~660Mbytes as
> file system cache, though interestingly it does appear to be using some
> swap space. Is it possible that XFS is having problems when there is not
> memory immediately available, I've included some 'vmstat' output:
> vmstat 10 10
>    procs                      memory    swap          io     system
>  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy
>  0  1 28  11116   3784  42300 674656   0   0    37    35   19     3   0   2
>  0  1 28  11116   3480  42300 674960   0   0     0     0  167   194   0   0
>  0  1 28  11116   3176  42300 675264   0   0     0     0  139   107   0   0
>  0  1 28  11116   3056  42300 675380   0   0     0     2  152   142   0   0
> In Irix I'd tune the kernel parameters 'min_free_pages'... to ensure that
> there was always physical memory available, is there any equivalent in
> Linux (sorry if this is a silly/obvious question).
> Many thanks.
> Ian Hardy Tel: 023 80593577
> Research Services Fax: 023 80593131
> Computing Services email: idh@xxxxxxxxxxx
> Southampton University i.d.hardy@xxxxxxxxxxx
> Southampton S017 1BJ, UK.
/////////////Technical Coordination, Research Services////////////////////
Ian Hardy Tel: 023 80 593577
Computing Services Mobile: 0709 2127503
Southampton University email: idh@xxxxxxxxxxx
Southampton S017 1BJ, UK. i.d.hardy@xxxxxxxxxxx
\\'BUGS: The notion of errors is ill-defined' (IRIX man page for netstat)\