
Re: corruption issue

To: Stephen Lord <lord@xxxxxxx>
Subject: Re: corruption issue
From: Christian Rice <xian@xxxxxxxxxxx>
Date: Thu, 15 Aug 2002 18:53:39 -0700
Cc: linux-xfs@xxxxxxxxxxx
Organization: Tippett Studio
References: <3D5C2E2B.1D4C1356@xxxxxxxxxxx> <1029456794.1256.18.camel@snafu>
Sender: owner-linux-xfs@xxxxxxxxxxx
Stephen Lord wrote:
> 
> On Thu, 2002-08-15 at 17:41, Christian Rice wrote:
> > I'm wondering if this is a recoverable situation:
> 
> Ugh, so you actually ran repair on the filesystem here by
> the look of it. Those values in the super block are very
> strange, 18446744073709551615 is -1 as a 64 bit value.
> Hopefully the dirty log prevented repair from actually
> doing something to the disk. Have you attempted to mount
> the filesystem without running repair first? In general
> after some form of crash just mounting the fs is the
> best thing to do.
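
For reference, the point about the all-ones superblock values can be checked from a shell. This is a minimal sketch, assuming a shell whose printf handles 64-bit integers (bash on a typical Linux system); the helper name as_unsigned64 is made up for illustration and is not part of xfsprogs.

```shell
# Hypothetical helper: print a signed integer reinterpreted as an
# unsigned 64-bit value (assumes printf uses 64-bit integers).
as_unsigned64() {
    printf '%u\n' "$1"
}

# -1 has every bit set, which is the pattern reported for the root,
# realtime bitmap and realtime summary inode fields.
as_unsigned64 -1
```

Seeing the same all-ones value in several unrelated fields suggests they were never filled in or were overwritten wholesale, rather than each being damaged separately.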

Yeah, I ran xfs_repair on it after trying a mount and getting the usual:

[root@ozu tmp]# mount -t xfs /dev/hdb3 /hdb3
mount: wrong fs type, bad option, bad superblock on /dev/hdb3,
       or too many mounted file systems

xfs_repair complained as I reported earlier, but I did not use the '-L'
option.  The disk is not critical--just typical of what is happening to
us relatively regularly on Linux (never on IRIX!).

One more detail worth including--I have found that rewriting the
partition table sometimes makes the disk mountable again.  I did that
after the failed mount attempt (fdisk /dev/hdb; write).  Did I screw up?
It's not highly scientific, to be sure.

I have attached the dd output (xxx.gz) and, though I could not mount the
filesystem, I ran xfs_check on it anyway.  Well, it produced a little
over 8GB of output, but most of it is repetitious after the first 1700
or so lines.  This is with xfsprogs-2.2.1-0 installed now.  The output
is also attached.  I hope it doesn't bog anyone's mbox down.
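
A quick sanity check on a dump like the attached one is to look at its first four bytes, since an XFS superblock begins with the magic string "XFSB". The following is only a sketch; the function name has_xfs_magic is hypothetical, and it says nothing about the rest of the superblock.

```shell
# Hypothetical helper: succeed if the first four bytes of a file are
# "XFSB", the magic number at the start of an XFS superblock.
has_xfs_magic() {
    # Read exactly 4 bytes with dd; strip NUL bytes so the shell can
    # hold the result in a command substitution.
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | tr -d '\0')
    [ "$magic" = "XFSB" ]
}
```

With the dump saved as xxx, running "has_xfs_magic xxx && echo present" would show whether the start of the partition still looks like XFS at all.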

I sure wouldn't mind knowing how to recover data from this type of
situation--is there a paper/manual/brain transplant that imparts such
knowledge?  The owner of the data is not going to freak if this disk is
lost, but I often find myself in this position, and this exact problem
eats a lot of my systems guys' time.
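
The order of operations suggested in this thread and in xfs_repair's own error message can be summarized in a small dry-run script. This is a sketch, not an official procedure: the function name recovery_plan is hypothetical, and it only prints the commands so nothing touches a disk.

```shell
# Hypothetical helper: print (not run) the recovery steps in the order
# discussed in this thread. $1 is the device, $2 the mount point.
recovery_plan() {
    dev=$1
    mnt=$2
    echo "mount -t xfs $dev $mnt   # 1. try a mount first; it replays the log"
    echo "umount $mnt              # 2. unmount again before checking"
    echo "xfs_check $dev           # 3. read-only consistency check"
    echo "xfs_repair -n $dev       # 4. no-modify mode: report only"
    echo "xfs_repair -L $dev       # 5. last resort if mount fails: destroys the log"
}

recovery_plan /dev/hdb3 /hdb3
```

Step 5 is deliberately last: as the xfs_repair message warns, destroying the log may itself cause corruption.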

The workstation holding the corrupted disk hung while running Maya, and
the user hit the reset button.

I greatly appreciate the help.

> 
> Also please update your xfsprogs to the latest version from
> oss, there are almost certainly bug fixes since the copy you
> have.
> 
> So try and mount the fs, report what that does, then if it
> mounted, unmount it and try running xfs_check on the filesystem
> and send us the output. It may also be useful to do dd off the
> first 4K of the filesystem and send that as well:
> 
>  dd if=/dev/hdb3 of=xxx bs=4k count=1
> 
> Do you know what happened to the system, was this a loss of power
> or a software crash? If there was valuable information on the disk
> then we can try and help get some of it back. It sounds like it
> was a system disk though.
> 
> Steve
> 
> >
> > [root@ozu root]# xfs_repair /dev/hdb3
> > Phase 1 - find and verify superblock...
> > sb root inode value 18446744073709551615 inconsistent with calculated
> > value 13835051801809780864
> > resetting superblock root inode pointer to 18446744069414584448
> > sb realtime bitmap inode 18446744073709551615 inconsistent with
> > calculated value 13835051801809780865
> > resetting superblock realtime bitmap ino pointer to 18446744069414584449
> > sb realtime summary inode 18446744073709551615 inconsistent with
> > calculated value 13835051801809780866
> > resetting superblock realtime summary ino pointer to
> > 18446744069414584450
> > Phase 2 - using internal log
> >         - zero log...
> > ERROR: The filesystem has valuable metadata changes in a log which needs
> > to be replayed.  Mount the filesystem to replay the log, and unmount it
> > before re-running xfs_repair.  If you are unable to mount the
> > filesystem, then use the -L option to destroy the log and attempt a
> > repair.
> > Note that destroying the log may cause corruption -- please attempt a
> > mount of the filesystem before doing this.
> >
> >
> > In the past, sometimes the entire contents of the disk end up in
> > lost+found after xfs_repair -L.  I ran xfs_repair -n, and it seemed to
> > want to unlink quite a few inodes (thousands, including system files
> > that could not possibly have been in active use during operation).  Yes,
> > the system crashed, and I'm not absolutely positive I had write caching
> > turned off on this system (I've been using hdparm -W 0).
> >
> > It was running 2.4.18 with xfs 1.1, not the latest CVS stuff.  Also, I
> > ran the checks on a system with xfsprogs-2.0.3-0.rpm installed.
> >
> > Can anybody offer any hope, or do I mkfs now?  If there's a recovery
> > from this, that would save me and my sysadmins countless hours of work
> > in the future, as we have perhaps 100 machines running xfs on linux.
> >
> > Thanks.
> >
> > --
> >
> > christian rice  director of technology
> > tippett studio  510.649.9711 l--xr-----
> --
> 
> Steve Lord                                      voice: +1-651-683-3511
> Principal Engineer, Filesystem Software         email: lord@xxxxxxx

-- 

christian rice  director of technology
tippett studio  510.649.9711 l--xr-----

Attachment: xxx.gz
Description: GNU Zip compressed data

Attachment: xfs_check.out.gz
Description: GNU Zip compressed data
