
Re: kernel panic with nfsd and xfs filesystem

To: Vladimir Vukicevic <vladimir@xxxxxxxxxx>
Subject: Re: kernel panic with nfsd and xfs filesystem
From: Steve Lord <lord@xxxxxxx>
Date: Fri, 09 Mar 2001 11:30:12 -0600
Cc: Steve Lord <lord@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: Message from Vladimir Vukicevic <vladimir@xxxxxxxxxx> of "09 Mar 2001 12:18:34 EST." <E14bQXC-0007hT-00@xxxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
Sorry for the silence on this end - got dragged into a conference call on
a totally different topic, sat and listened for over an hour.

Anyway, it looks like you are willing to place the blame elsewhere.
I have not seen anything similar, but my NFS usage between linux boxes is
not huge. You mentioned switching to the user space nfsd - the kernel
version should be OK from the XFS point of view now that we have the speed
problem worked out. I still plan on going back and cleaning up what I did
there. It sounds like there may be unmount problems later.

Not sure I can help with the NFS problem itself though.

Steve

> On 09 Mar 2001 11:45:51 -0500, Vladimir Vukicevic wrote:
> > 
> > 
> > Hmm.  I now have a repeatable case of this I/O error, but as the other
> > end is running the kernel nfsd server, I'm not exactly sure how to debug
> > it or turn on debugging on the other end... A 'cat foo.ogg > /dev/null' on
> > the mounted partition repeatably gives an I/O error.
> > 
> > Any thoughts on how to diagnose this?  I'll keep poking..
> 
> 
> Doh. Forgot about tcpdump. :-)  So, this is what I'm seeing:
> 
> 11:57:46.159530 rain.ximian.priv.4040915765 > ogg.nfs: 140 lookup fh
> Unknown/1 "01-letters_from_the_wasteland.ogg" (DF)
> 11:57:46.163988 ogg.nfs > rain.ximian.priv.4040915765: reply ok 128
> lookup fh Unknown/1 (DF)
> 
> So, the lookup goes okay. Then the weirdness starts.
> 
> 11:57:46.173585 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:46.217986 ogg > rain.ximian.priv: (frag 20288:1480@1480+)
> 11:57:46.220115 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 20288:1480@0+)
> 
> Looking at this more in ethereal, this call/reply sequence has XID
> 0xf0db7b35. It appears to succeed (although I'm confused why it's
> reading @ offset 2408448). However, the next 3 calls have the exact same
> XID (marked as dups in ethereal), and each reads the same size/offset.
> 
> 11:57:46.865276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:46.907584 ogg > rain.ximian.priv: (frag 20544:1480@1480+)
> 11:57:46.909674 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 20544:1480@0+)
> 
> 11:57:48.265276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:48.319183 ogg > rain.ximian.priv: (frag 20800:1480@1480+)
> 11:57:48.321456 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 20800:1480@0+)
> 
> 11:57:51.065342 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:51.114002 ogg > rain.ximian.priv: (frag 21056:1480@1480+)
> 11:57:51.115317 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
> read (frag 21056:1480@0+)
> 
> Then, it switches to a new XID, and the same set repeats itself.
> 
> 11:57:56.666020 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:56.720428 ogg > rain.ximian.priv: (frag 21312:1480@1480+)
> 11:57:56.722696 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 21312:1480@0+)
> 
> 11:57:57.365349 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:57.411418 ogg > rain.ximian.priv: (frag 21568:1480@1480+)
> 11:57:57.413422 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 21568:1480@0+)
> 
> 11:57:58.765279 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:57:58.814856 ogg > rain.ximian.priv: (frag 21824:1480@1480+)
> 11:57:58.817140 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 21824:1480@0+)
> 
> 11:58:01.565281 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
> Unknown/1 4096 bytes @ 2408448 (DF)
> 11:58:01.617302 ogg > rain.ximian.priv: (frag 22080:1480@1480+)
> 11:58:01.619220 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
> read (frag 22080:1480@0+)
> 
> 
> ... and then cp dies with an I/O error.
> 
> So, from looking at this, I'm going to blame the client side NFS stuff
> here -- especially since the file is perfectly fine on the server
> itself.  Alan Cox said that he wasn't aware of any nfs client-side
> patches that have gone in since 2.4.2 came out. 
> 
> Note that this is actually a similar error to what I was seeing while
> running my iPAQ with an NFS'd root filesystem -- certain files just give
> I/O errors.  This is one of them; I recreated it by copying the file to an
> ext2 filesystem and getting the same I/O error.
> 
> So, this isn't XFS related (whew!), but nfs on linux does indeed suck.
> :-P
> 
> 
>     - Vlad
> 
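
For anyone who wants to check the retransmission pattern in the quoted
trace without ethereal, here is a minimal Python sketch. It assumes the
number after the client hostname in each tcpdump line is the RPC
transaction ID (XID) printed in decimal, which is how tcpdump formats NFS
requests. The helper names (group_reads_by_xid, retransmit_gaps) are made
up for illustration, and the wrapped request lines from the trace have
been rejoined by hand.

```python
import re
from collections import defaultdict

# Read requests copied from the quoted trace (wrapped lines rejoined).
TRACE = """\
11:57:46.173585 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:46.865276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:48.265276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:51.065342 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:56.666020 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:57.365349 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:58.765279 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:58:01.565281 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
"""

# Assumption: the trailing number on the source host is the XID in decimal.
REQUEST = re.compile(
    r"^(\d+):(\d+):(\d+\.\d+) \S+\.(\d+) > \S+\.nfs: "
    r"\d+ read fh \S+ (\d+) bytes @ (\d+)"
)


def group_reads_by_xid(trace):
    """Map each XID to a list of (seconds since midnight, size, offset)."""
    calls = defaultdict(list)
    for line in trace.splitlines():
        match = REQUEST.match(line)
        if match:
            h, m, s, xid, size, offset = match.groups()
            when = int(h) * 3600 + int(m) * 60 + float(s)
            calls[int(xid)].append((when, int(size), int(offset)))
    return dict(calls)


def retransmit_gaps(times):
    """Seconds between consecutive sends of the same request."""
    return [round(later - earlier, 2) for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for xid, sends in group_reads_by_xid(TRACE).items():
        times = [when for when, _, _ in sends]
        distinct = {(size, offset) for _, size, offset in sends}
        print(f"xid {xid:#010x}: {len(sends)} sends, "
              f"{len(distinct)} distinct size/offset, "
              f"gaps {retransmit_gaps(times)}")
```

Run against the eight read requests above, this should report two XIDs
with four identical reads each, and the gap between resends roughly
doubling each time, which is what a client resending a request it
considers unanswered would look like.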


