George Barnett wrote:
> On 04/02/2009, at 12:46 PM, Eric Sandeen wrote:
>>> bad version number 0x0 on inode 18046
>>> bad magic number 0x0 on inode 18047
>>> bad version number 0x0 on inode 18047
>>> bad directory block magic # 0 in block 0 for directory inode 18000
>> Interesting that all the bad magic numbers were 0... not sure what to
>> make of that, offhand, I'm afraid...
> Oh dear.
> I'm going to try moving the filesystem to ext3 to see if this
> continues. If it does, it would suggest a bug in the underlying
> raid10 implementation or a problem with the disks, although they're
> not reporting any errors.
one thing to note is that xfs is very good at detecting on-disk
corruption; I'm not sure ext3 will be as good. So ext3 may seem to run
fine for longer, even if there is an underlying problem.
> Is there any further debugging I can do before I start fresh?
well, it'd be great to have an isolated testcase, if you can reproduce it.
Also I don't know what exact kernel ubuntu uses or what patches are in
it; you might try a stock upstream kernel w/ the same config,
2.6.27.$LATEST, and see if you continue to have problems.
> 1. The hardware ecc recovered smartctl metric is /very/ high,
> although I'm told this may be normal for samsung drives. I can't think
> of any way to confirm a disk problem without a CRC checking fs though.
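On checking the disks without a checksumming fs: one rough approach is to
do the checksumming in userspace. Write a set of files with known
contents, record their hashes, and re-verify them later on whatever
filesystem you end up with. The following is only a minimal sketch, not a
tested tool. The function names, the blob file naming, and the
manifest.json layout are all made up for illustration. Two caveats: reads
may be served from the page cache, so verify after a remount or reboot,
and the manifest lives on the same filesystem here, so you may prefer to
keep a copy elsewhere.

```python
import hashlib
import json
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_test_files(directory, count=16, size=1 << 20):
    """Write `count` files of random data and record their hashes.

    The hash is computed from the in-memory buffer before writing, so it
    reflects what we intended to store, not what the disk returned.
    File names and manifest format are hypothetical.
    """
    os.makedirs(directory, exist_ok=True)
    manifest = {}
    for i in range(count):
        name = "blob%04d.bin" % i
        data = os.urandom(size)
        manifest[name] = hashlib.sha256(data).hexdigest()
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data out to the device
    with open(os.path.join(directory, "manifest.json"), "w") as f:
        json.dump(manifest, f)
    return manifest


def verify_test_files(directory):
    """Return a sorted list of file names whose contents no longer match."""
    with open(os.path.join(directory, "manifest.json")) as f:
        manifest = json.load(f)
    bad = []
    for name, expected in sorted(manifest.items()):
        try:
            actual = sha256_of(os.path.join(directory, name))
        except OSError:
            actual = None  # a missing or unreadable file counts as bad
        if actual != expected:
            bad.append(name)
    return bad
```

Run write_test_files() once on the suspect array, then call
verify_test_files() periodically. Any names it returns point at data that
changed underneath the filesystem, whether the cause is the disks, the
raid10 layer, or something else.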