
Re: TAKE - fix panic caused by mix of local and remote access to xfs

To: marchuk@xxxxxxxxxxxxxxxxx
Subject: Re: TAKE - fix panic caused by mix of local and remote access to xfs
From: Steve Lord <lord@xxxxxxx>
Date: Wed, 30 May 2001 12:59:18 -0500
Cc: Galen Arnold <arnoldg@xxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: Message from <marchuk@xxxxxxxxxxxxxxxxx> of "Wed, 30 May 2001 10:43:02 PDT." <Pine.LNX.4.21.0105301042500.712-100000@xxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx

Hi,

I am running iozone on a 2 Gbyte box on 2.4.5-xfs right now; it is still in
the first pass. I only have a striped md filesystem, not hardware RAID.
I have not been running long enough to say whether the problem is fixed.

There continue to be general highmem problems in 2.4 Linux right now. There
are known deadlocks in the generic code, which have nothing to do with XFS
itself, but XFS may be better at exposing them. Things do appear to be
improving: the 2.4.2 kernel did not last 10 seconds on this test for Galen,
while the 2.4.4 kernel lasted several hours. There are more changes in 2.4.5,
but Linus rushed them in right before he put out the 2.4.5 release and left
for Japan, and there was continuing discussion that they would not fix the
deadlocks.

Since then there has been a patch to change how I/O is done for highmem
boxes, which a) improves performance and b) should, if I understand it
correctly, get rid of the deadlocks.

I would recommend trying the 2.4.5 kernel. Should that still have
problems, I can try the patch from Jens Axboe that changes the way the
I/O is done and pass it on to you.

Thanks for continuing to pound on xfs though.

Steve

> linux-2.4.4-xfs
> 
> *****************************
> Walter Marchuk
> Senior Computer Specialist
> University of Washington
> Electrical Engineering
> Room: 307g
> 206-221-5421
> marchuk@xxxxxxxxxxxxxxxxx
> *****************************
> 
> On Wed, 30 May 2001, Galen Arnold wrote:
> 
> > Walter,
> > 
> > Welcome to the club!  I can reproduce that behavior with 2.4.4-xfs and
> > this iozone test (my host has 2G mem, so I test with big files):
> > 
> >     iozone -s 4000m -r 64k
> > 
> > My box also hangs (doesn't crash or panic).  Was that 2.4.5-xfs you
> > tested?  If so, you saved me the trouble and I'll delay my next cvs
> > checkout/rebuild.
> > 
> > -Galen
> > 
> > +
> > Galen Arnold, system engineer--systems group       arnoldg@xxxxxxxxxxxxx
> > National Center for Supercomputing Applications           (217) 244-3473
> > 152 Computer Applications Bldg., 605 E. Spfld. Ave., Champaign, IL 61820
> > 
> > On Wed, 30 May 2001 marchuk@xxxxxxxxxxxxxxxxx wrote:
> > 
> > > My fileserver has the latest XFS CVS kernel.  Yesterday the machine
> > > crashed/froze.  It stopped at 8PM right after this allocation failed
> > > error.  Notice the time when the machine came back, 1AM (it was
> > > physically rebooted).
> > > 
> > > I did a search for this error and saw a correlation between xfs and this
> > > error.  Do any of you know if the error and the crash were due to an xfs bug?
> > > 
> > > May 29 20:01:09 gauss kernel: __alloc_pages: 0-order allocation failed.
> > > May 30 01:34:19 gauss syslogd 1.3-3: restart.
> > > *****************************
> > > Walter Marchuk
> > > Senior Computer Specialist
> > > University of Washington
> > > Electrical Engineering
> > > Room: 307g
> > > 206-221-5421
> > > marchuk@xxxxxxxxxxxxxxxxx
> > > *****************************
> > > 
> > > On Tue, 29 May 2001, Steve Lord wrote:
> > > 
> > > > > That means it's been tagged?
> > > > 
> > > > > I am not sure what you mean by tagged, I checked in a fix for the problem
> > > > with the implementation I checked in on Friday, and in cleaning my mail
> > > > box I saw that I had said I would do a followup under the same heading.
> > > > 
> > > > Steve
> > > > 
> > > > > 
> > > > > -- 
> > > > > Austin Gonyou
> > > > > Systems Architect, CCNA
> > > > > Coremetrics, Inc.
> > > > > Phone: 512-796-9023
> > > > > email: austin@xxxxxxxxxxxxxxx
> > > > > 
> > > > > On Tue, 29 May 2001, Steve Lord wrote:
> > > > > 
> > > > > > >
> > > > > > > > Sod's law says you find the hole right after you ship the code. There is a
> > > > > > > > bug in this code, I would not do a cvs update until I get a fix in, I know
> > > > > > > > roughly how to fix it, but it will take me a while to code and test it.
> > > > > > > >
> > > > > > > > I would hold off on the cvs tree for a while until you see another message
> > > > > > > > from me on this thread - probably not today.
> > > > > > >
> > > > > > > Steve
> > > > > > >
> > > > > >
> > > > > > This was fixed over the weekend by the way.
> > > > > >
> > > > > > Steve
> > > > > >
> > > > 
> > > > 
> > > 
> > 


