
Re: kernel panic with nfsd and xfs filesystem

To: Vladimir Vukicevic <vladimir@xxxxxxxxxx>
Subject: Re: kernel panic with nfsd and xfs filesystem
From: Vladimir Vukicevic <vladimir@xxxxxxxxxx>
Date: 09 Mar 2001 12:18:34 -0500
Cc: Steve Lord <lord@xxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <E14bQ1Y-0007cM-00@xxxxxxxxxxxxxxx>
Sender: owner-linux-xfs@xxxxxxxxxxx
On 09 Mar 2001 11:45:51 -0500, Vladimir Vukicevic wrote:
> 
> 
> Hmm.  I now have a repeatable case of this I/O error, but as the other
> end is running the kernel nfsd server, I'm not exactly sure how to debug
> it or turn on debugging on the other end... A 'cat foo.ogg /dev/null' on
> the mounted partition repeatably gives an I/O error.
> 
> Any thoughts on how to diagnose this?  I'll keep poking..


Doh. Forgot about tcpdump. :-)  So, this is what I'm seeing:

11:57:46.159530 rain.ximian.priv.4040915765 > ogg.nfs: 140 lookup fh
Unknown/1 "01-letters_from_the_wasteland.ogg" (DF)
11:57:46.163988 ogg.nfs > rain.ximian.priv.4040915765: reply ok 128
lookup fh Unknown/1 (DF)

So, the lookup goes okay. Then the weirdness starts.

11:57:46.173585 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:57:46.217986 ogg > rain.ximian.priv: (frag 20288:1480@1480+)
11:57:46.220115 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
read (frag 20288:1480@0+)

Looking at this more in ethereal, this call/reply sequence has XID
0xf0db7b35. It appears to succeed (although I'm confused why it's
reading at offset 2408448). However, the next 3 calls have the exact
same XID (marked as dups in ethereal), and each one reads the same size
and offset.
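
The duplicate calls can also be counted straight from the tcpdump text.
The sketch below is only an illustration: it relies on the tcpdump
convention that the number following the source host in an NFS request
is the transaction id rather than a port, and it takes its input from
the call lines of this capture with the wrapped lines rejoined. The
names CAPTURE, CALL_RE and count_calls are made up for this example.

```python
import re
from collections import Counter

# Call lines copied from the capture in this message, wrapped lines rejoined.
CAPTURE = """\
11:57:46.159530 rain.ximian.priv.4040915765 > ogg.nfs: 140 lookup fh Unknown/1 "01-letters_from_the_wasteland.ogg" (DF)
11:57:46.173585 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:46.865276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:48.265276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
11:57:51.065342 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh Unknown/1 4096 bytes @ 2408448 (DF)
"""

# Assumption: "<client>.<number> > <server>.nfs:" carries the transaction id
# in <number>, followed by the request length and the NFS operation name.
CALL_RE = re.compile(r"^\S+ \S+\.(\d+) > \S+\.nfs: \d+ (\w+) ")


def count_calls(capture):
    """Count how often each (xid, operation) pair was sent."""
    counts = Counter()
    for line in capture.splitlines():
        match = CALL_RE.match(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


counts = count_calls(CAPTURE)
for (xid, op), n in counts.items():
    # Print the id in decimal and hex so it can be compared with ethereal.
    print(f"{op:6s} xid={xid} (0x{xid:08x}) sent {n} time(s)")
```

Printing the id in hex makes it easier to line the tcpdump output up
with the XID column in ethereal; if the two tools disagree on a value,
that is worth double-checking before drawing conclusions.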

11:57:46.865276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:57:46.907584 ogg > rain.ximian.priv: (frag 20544:1480@1480+)
11:57:46.909674 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
read (frag 20544:1480@0+)

11:57:48.265276 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:57:48.319183 ogg > rain.ximian.priv: (frag 20800:1480@1480+)
11:57:48.321456 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
read (frag 20800:1480@0+)

11:57:51.065342 rain.ximian.priv.4057692981 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:57:51.114002 ogg > rain.ximian.priv: (frag 21056:1480@1480+)
11:57:51.115317 ogg.nfs > rain.ximian.priv.4057692981: reply ok 1472
read (frag 21056:1480@0+)

Then, it switches to a new XID, and the same set repeats itself.

11:57:56.666020 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:57:56.720428 ogg > rain.ximian.priv: (frag 21312:1480@1480+)
11:57:56.722696 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
read (frag 21312:1480@0+)

11:57:57.365349 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:57:57.411418 ogg > rain.ximian.priv: (frag 21568:1480@1480+)
11:57:57.413422 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
read (frag 21568:1480@0+)

11:57:58.765279 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:57:58.814856 ogg > rain.ximian.priv: (frag 21824:1480@1480+)
11:57:58.817140 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
read (frag 21824:1480@0+)

11:58:01.565281 rain.ximian.priv.4074470197 > ogg.nfs: 112 read fh
Unknown/1 4096 bytes @ 2408448 (DF)
11:58:01.617302 ogg > rain.ximian.priv: (frag 22080:1480@1480+)
11:58:01.619220 ogg.nfs > rain.ximian.priv.4074470197: reply ok 1472
read (frag 22080:1480@0+)
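
The spacing between the repeated calls is worth a look too. The sketch
below just subtracts the timestamps of the eight read calls shown above;
CALL_TIMES and gaps_seconds are names invented for this example. The
mount options are not given anywhere in this thread, so any link to a
particular timeo/retrans setting is a guess.

```python
from datetime import datetime

# Timestamps of the eight identical read calls in the capture above.
CALL_TIMES = [
    "11:57:46.173585",
    "11:57:46.865276",
    "11:57:48.265276",
    "11:57:51.065342",
    "11:57:56.666020",
    "11:57:57.365349",
    "11:57:58.765279",
    "11:58:01.565281",
]


def gaps_seconds(stamps):
    """Return the delay in seconds between each call and the previous one."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


gaps = gaps_seconds(CALL_TIMES)
for stamp, gap in zip(CALL_TIMES[1:], gaps):
    print(f"{stamp}  +{gap:.3f}s")
```

The gaps come out at roughly 0.7, 1.4, 2.8 and 5.6 seconds, then 0.7,
1.4 and 2.8 again after the XID changes. That doubling looks like a
retransmit timer backing off and then starting over, which suggests the
client never accepted any of the replies it was sent.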


... and then cp dies with an I/O error.
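
One more thing that can be checked from the text alone is the fragment
bookkeeping. In tcpdump's notation, (frag id:size@offset+) describes one
IP fragment, and the trailing + means the more-fragments flag is set.
The sketch below parses the two fragment lines of the first read reply
and reports how many contiguous bytes they cover and whether any of them
is a final fragment. REPLY_LINES, parse_fragments and summarize are
names made up for this example, and the capture pasted here may simply
be trimmed, so treat the result as something to verify rather than a
finding.

```python
import re

# The two reply lines for the first read, wrapped lines rejoined.
REPLY_LINES = [
    "11:57:46.217986 ogg > rain.ximian.priv: (frag 20288:1480@1480+)",
    "11:57:46.220115 ogg.nfs > rain.ximian.priv.4057692981: "
    "reply ok 1472 read (frag 20288:1480@0+)",
]

# (frag <id>:<size>@<offset>) with an optional "+" for more-fragments.
FRAG_RE = re.compile(r"\(frag (\d+):(\d+)@(\d+)(\+?)\)")


def parse_fragments(lines):
    """Extract id, size, offset and the more-fragments flag from each line."""
    frags = []
    for line in lines:
        match = FRAG_RE.search(line)
        if match:
            frags.append({
                "id": int(match.group(1)),
                "size": int(match.group(2)),
                "offset": int(match.group(3)),
                "more": match.group(4) == "+",
            })
    return frags


def summarize(frags):
    """Return (contiguous bytes covered from offset 0, final fragment seen?)."""
    covered = 0
    for frag in sorted(frags, key=lambda f: f["offset"]):
        if frag["offset"] != covered:
            break  # gap in the sequence, stop counting
        covered += frag["size"]
    has_last = any(not frag["more"] for frag in frags)
    return covered, has_last


covered, has_last = summarize(parse_fragments(REPLY_LINES))
print(f"bytes covered: {covered}, final fragment seen: {has_last}")
```

A reply carrying 4096 bytes of file data needs more than the 2960 bytes
these two fragments cover, so a third fragment would be expected
somewhere. Whether it was sent, dropped on the way, or just left out of
the paste is not something the excerpt can answer.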

So, from looking at this, I'm going to blame the client-side NFS code
here -- especially since the file is perfectly fine on the server
itself.  Alan Cox said that he wasn't aware of any NFS client-side
patches that have gone in since 2.4.2 came out.

Note that this is actually a similar error to what I was seeing while
running my iPAQ with an NFS-mounted root filesystem -- certain files
just give I/O errors.  This is one of them; I reproduced it by copying
the file to an ext2 filesystem and got the same I/O error.
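
To poke at the failing spot without copying the whole file, a single
read at the offset from the capture can be issued from the client. This
is only a sketch: probe_read is a name made up here, and the offset and
length are the ones tcpdump showed (2408448 is 588 * 4096, so the read
starts on a 4096-byte boundary). It uses os.pread, which is available on
Linux and other Unix systems.

```python
import os

OFFSET = 2408448  # offset seen in the capture, equal to 588 * 4096
LENGTH = 4096     # read size seen in the capture


def probe_read(path, offset=OFFSET, length=LENGTH):
    """Read `length` bytes at `offset`.

    Returns (data, None) on success or (None, errno) if the read fails,
    for example with errno.EIO for an I/O error.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset), None
    except OSError as exc:
        return None, exc.errno
    finally:
        os.close(fd)
```

Calling probe_read() with the path of the file on the NFS mount and
again with the path of a local copy should show whether the error is
tied to that one offset or to the file as a whole.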

So, this isn't XFS-related (whew!), but NFS on Linux does indeed suck.
:-P


    - Vlad


