
Re: 2.4.13 Mem Related Hangs

To: Jason Allen <jallen@xxxxxxxx>
Subject: Re: 2.4.13 Mem Related Hangs
From: Steve Lord <lord@xxxxxxx>
Date: 13 Dec 2001 09:32:15 -0600
Cc: Jim Eshleman <jce0@xxxxxxxxxx>, linux-xfs@xxxxxxxxxxx, sandeen@xxxxxxx
In-reply-to: <1008257446.4677.0.camel@xxxxxxxxxxxxxx>
References: <3BE6C909.6070308@xxxxxxxxxx> <3BF14253.1060008@xxxxxxxxxx> <1008256442.14210.0.camel@xxxxxxxxxxxxxxxxxxxx> <1008257446.4677.0.camel@xxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
On Thu, 2001-12-13 at 09:30, Jason Allen wrote:
> We upgraded our 8CPU/8GB Dell to 2.4.16 w/XFS last week and it works
> *much* better than previous 2.4 w/XFS kernels.  No hangs, crashes or VM
> weirdness, even under heavy loads.

You might want to sync up with the latest kernel when you get a chance;
a couple of nasty bugs in the I/O path got squashed this week. I will
ask Eric to update the 2.4.16 patches so that CVS is not the only way
to get these fixes.

Steve

> 
> Thanks,
> 
> Jason Allen
> Fermilab
> 
> 
> 
> On Thu, 2001-12-13 at 09:14, Steve Lord wrote:
> > On Tue, 2001-11-13 at 09:54, Jim Eshleman wrote:
> > > >   FWIW me too, on an 8-way 8.5GB (64GB HIGHMEM enabled) IBM Netfinity 
> > > > x370 (8500R) which functions as a production mail server.  I currently 
> > > > run 2.4.9 with XFS and it stays up for about a week under heavy load. 
> > > > 2.4.13 lasted about 4 hours under light load until all memory was 
> > > > consumed by cache then it became unresponsive.
> > > > 
> > > >   2.4.13 on a 2-way 1GB (64GB HIGHMEM enabled) Netfinity x350 test box 
> > > > with the same kernel config and XFS works fine even under stress, so 
> > > > perhaps our problem is similar to the discussion on l-k "Google's mm 
> > > > problems"...
> > > 
> > > 
> > >    Update: I'm unable to make 2.4.14 fail on the test box (running 
> > > Cerberus, bonnie++ against two XFS volumes, and LTP simultaneously) but 
> > > it melts down just as 2.4.13 does on the big production box.  A short 
> > > time after all memory is eaten by file cache, and under light load, the 
> > > machine becomes unresponsive.  It took about five minutes to log in at 
> > > the console.  No error messages on the console or in syslog.  Here's 
> > > some info; it's obvious in the vmstat output where the meltdown occurs:
> > > 
> > >    kernel config: http://www.lehigh.edu/~jce0/2.4.14-config
> > >    bootup messages: http://www.lehigh.edu/~jce0/2.4.14-messages
> > >    vmstat 60 output: http://www.lehigh.edu/~jce0/2.4.14-vmstat
> > >    ver_linux output: http://www.lehigh.edu/~jce0/ver_linux.out
> > > 
> > >    This is Linus 2.4.14 patched with linux-2.4.14-xfs-2001-11-06.patch 
> > > and LVM 0.9.1_beta6, compiled with egcs-2.91.66.  It's a RH 7.1 system.
> > > 
> > >    I know Andrea and Marcelo? were testing and fixing some HIGHMEM 
> > > things.  Were there any patches and did they make it into the Linus tree?
> > > 
> > >    Any assistance greatly appreciated.
> > > 
> > > Jim
> > > 
> > 
> > Going through my old email - I think I just fixed this - there was a bug
> > in the delayed allocation handling in XFS which caused a memory leak
> > due to a buffer_head reference count leak. The latest cvs tree (2.4.16
> > based) has the fix in it.
> > 
> > This bug was introduced around the time the new VM showed up in 2.4.10.
> > 
> > Steve
> > 
> > -- 
> > 
> > Steve Lord                                      voice: +1-651-683-3511
> > Principal Engineer, Filesystem Software         email: lord@xxxxxxx
> 
-- 

Steve Lord                                      voice: +1-651-683-3511
Principal Engineer, Filesystem Software         email: lord@xxxxxxx
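
The failure mode described in the quoted message (a reference count leak
that keeps cached buffers from ever being freed) can be illustrated with a
minimal userspace sketch in C. This is not XFS or kernel code, and it is
not the actual fix: struct toy_buffer and every toy_* function are
hypothetical names invented for illustration, and the "leak" flag merely
stands in for a code path that forgets to drop a reference it took.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cached buffer; NOT the kernel's struct buffer_head. */
struct toy_buffer {
    int  refcount;  /* number of outstanding references */
    bool in_use;    /* false once the buffer has been reclaimed */
};

/* Take a reference: the buffer must not be reclaimed while it is held. */
static void toy_get(struct toy_buffer *b)
{
    b->refcount++;
}

/* Drop a reference previously taken with toy_get(). */
static void toy_put(struct toy_buffer *b)
{
    assert(b->refcount > 0);
    b->refcount--;
}

/* Reclaim under memory pressure: only unreferenced buffers can go. */
static bool toy_try_reclaim(struct toy_buffer *b)
{
    if (!b->in_use || b->refcount > 0)
        return false;
    b->in_use = false;
    return true;
}

/* Hypothetical I/O path; with leak set, one exit skips the matching put. */
static void toy_io_path(struct toy_buffer *b, bool leak)
{
    toy_get(b);
    /* ... work on the buffer would happen here ... */
    if (leak)
        return;     /* bug: the reference is never dropped */
    toy_put(b);
}

/* Run the I/O path over n buffers and count how many can be reclaimed. */
static size_t toy_count_reclaimed(size_t n, bool leak)
{
    size_t reclaimed = 0;
    for (size_t i = 0; i < n; i++) {
        struct toy_buffer b = { .refcount = 0, .in_use = true };
        toy_io_path(&b, leak);
        if (toy_try_reclaim(&b))
            reclaimed++;
    }
    return reclaimed;
}
```

Calling toy_count_reclaimed(n, false) reclaims every buffer, while
toy_count_reclaimed(n, true) reclaims none: each leaked reference pins its
buffer, so under this toy model the cache can only grow, which is
consistent with the "all memory eaten by cache" symptom reported earlier
in the thread.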

