
Re: XFS file corruption bug ?

To: Nathan Scott <nathans@xxxxxxx>
Subject: Re: XFS file corruption bug ?
From: James Foris <jforis@xxxxxxxxx>
Date: Fri, 18 Mar 2005 20:33:32 -0600
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <20050317095634.C3120542@xxxxxxxxxxxxxxxxxxxxxxxx>
References: <4237C29B.2020001@xxxxxxxxx> <20050317095634.C3120542@xxxxxxxxxxxxxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041116
Nathan Scott wrote:

> Hi James,

> On Tue, Mar 15, 2005 at 11:22:35PM -0600, James Foris wrote:
>> 1-36 GB system disk, 2-74 GB data disks configured as a single RAID0
>> partition with 256K chunk size. This "md0" partition is formatted as
>> XFS with an external journal on the system disk:
>>
>>    /sbin/mkfs.xfs -f -l logdev=/dev/sda5,sunit=8 /dev/md0

> Just trying to isolate the problem further - you say (below) the mkfs
> options don't matter, so I'm assuming you dropped the "sunit=8" bit
> above to test that?  (and hence tested a version 1 log).
>
> Does it happen with no data device stripe alignment though?
> (i.e. use -d su=0,sw=0, IIRC).

First test (with the 256K chunk size) shows it still happens, but at a
much reduced frequency: failure rates previously ranged between 100 and
600 files on a 140 GByte partition, while the first test run with that
argument added to the format command showed only 4 file failures.

We are re-running the test to confirm this low number. We will also try
a 128K chunk size to see the impact there (a 128K chunk size previously
increased the number of failures). I will follow up with results Monday.
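For reference, the no-alignment format Nathan suggests would look
roughly like this (a sketch built from the original mkfs line; same
device names, exact option placement assumed):

```shell
# Sketch only: same mkfs as the original report, but with data-device
# stripe alignment explicitly disabled (su=0,sw=0) so the filesystem
# records no stripe geometry for the kernel to align against.
/sbin/mkfs.xfs -f -l logdev=/dev/sda5,sunit=8 -d su=0,sw=0 /dev/md0
```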

>> using tools from "xfsprogs-2.6.25-1".
>>
>> First the partition was zeroed ("dd if=/dev/zero of=/dev/md0 ....."),
>> then a known pattern was written in 516K files (4K + 2 x 256K). The
>> partition (~140 GBytes) was filled to 98%, then the partition was
>> unmounted and remounted.
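The fill procedure described above can be sketched as a shell loop (an
illustration, not the exact test script; the mount point, the pattern
file name, and the 98% stop condition are assumptions taken from the
description):

```shell
# Illustration of the described test: zero the array, format, mount,
# then write 516K pattern files (4K + 2 x 256K) until ~98% full.
dd if=/dev/zero of=/dev/md0 bs=1M
/sbin/mkfs.xfs -f -l logdev=/dev/sda5,sunit=8 /dev/md0
mount -o logdev=/dev/sda5 /dev/md0 /data

i=0
# df -P column 5 is "Use%"; stop once usage reaches 98%.
while [ "$(df -P /data | awk 'NR==2 { sub(/%/,"",$5); print $5 }')" -lt 98 ]
do
    dd if=pattern.bin of=/data/file.$i bs=1k count=516 2>/dev/null
    i=$((i + 1))
done

umount /data
mount -o logdev=/dev/sda5 /dev/md0 /data   # remount before verifying files
```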

> Were these files written via buffered or direct IO?  Any mount
> options being used there?

No mount options. We can do this with a "cp" command from a shell, so
buffered I/O does it.

>> 3. File system creation options do not matter; using the default
>>    mkfs.xfs settings shows corruption, too.

> The default mkfs options will still pick up the MD stripe geometry
> information, and the kernel code will make use of that (hence my
> earlier suggestion to try overriding it).

>> What is NOT known yet:
>> 5. We have only tested software RAID0. The test needs to be repeated
>>    on the other RAID modes.

> Also just a regular disk, but using the same sunit/swidth would
> be good (could you send xfs_info output for a failing case?) to
> rule out MD.

xfs_info /dev/md0
meta-data=/data              isize=256    agcount=16, agsize=2241472 blks
         =                   sectsz=512
data     =                   bsize=4096   blocks=35862976, imaxpct=25
         =                   sunit=64     swidth=128 blks, unwritten=1
naming   =version 2          bsize=4096
log      =external           bsize=4096   blocks=18065, version=2
         =                   sectsz=512   sunit=1 blks
realtime =none               extsz=524288 blocks=0, rtextents=0
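As a sanity check on those numbers (my arithmetic, not from the thread):
sunit and swidth are reported here in filesystem blocks, and converting
them with bsize=4096 gives back the MD geometry:

```shell
# sunit/swidth from xfs_info are in 4096-byte filesystem blocks.
bsize=4096
sunit_blks=64
swidth_blks=128
sunit_bytes=$((sunit_blks * bsize))     # 262144 bytes = 256K MD chunk
swidth_bytes=$((swidth_blks * bsize))   # 524288 bytes = chunk x 2 data disks
echo "sunit=${sunit_bytes} swidth=${swidth_bytes}"
```

So the filesystem did pick up the 256K chunk size and the two-disk
stripe width from MD, consistent with Nathan's point about the default
mkfs options.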

>> Can anyone duplicate this? And if not, where should I be looking to

> We will certainly try to.  Thanks for reporting it.

