Re: xfsrestore fails to exit properly under 2.6.1

To: oar.gfdl.linux-workstations@xxxxxxxx, cattelan@xxxxxxx, linux-xfs@xxxxxxxxxxx
Subject: Re: xfsrestore fails to exit properly under 2.6.1
From: Phil Macias <Philip.Macias@xxxxxxxx>
Date: Wed, 12 May 2004 11:01:42 -0400
In-reply-to: <3174-Wed04Feb2004130248-0500-Philip.Macias@NOAA.gov>
References: <4229-Wed04Feb2004062354-0500-Philip.Macias@NOAA.gov> <1075916534.96681.5.camel@lupo.thebarn.com> <3174-Wed04Feb2004130248-0500-Philip.Macias@NOAA.gov>
Sender: linux-xfs-bounce@xxxxxxxxxxx
All,

Finally back at this and have found the fix.

The problem was the "xfsrestorehousekeepingdir" files. 

 root: ls -la /clone/xfsrestorehousekeepingdir
 total 304k
 drwx------    2 root     root         4.0k May 12 02:19 ./
 drwxr-xr-x    3 root     root         4.0k May 12 02:19 ../
 -rw-------    1 root     root          24k May 12 02:07 .nfs000067c200000002
 -rw-------    1 root     root          39k May 12 02:07 .nfs000067c300000003
 -rw-------    1 root     root         190k May 12 02:07 .nfs000067c400000004
 -rw-------    1 root     root          46M May 12 02:06 .nfs000067c500000005
 -rw-------    1 root     root          36k May 12 02:07 .nfs000067c600000001
 -rw-------    1 root     root            0 May 12 02:06 .nfs0000685d00000006

On our 2.4.x systems that directory could live on the NFS-mounted
destination drives with no problem. Evidently it causes a problem
under 2.6.x: the files remained locked by some process even after
the hanging xfsrestore process was killed. Adding a

 rm -rf /tmp/xfsrestorehousekeepingdir/

and 

 "-a /tmp" parameter to the cloning script fixed everything.

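A minimal sketch of how the two changes might fit into a cloning
script, assuming /tmp is on a local (non-NFS) filesystem. The variable
and function names are illustrative and not taken from the actual
script; the device and mount point are the ones from the original
report. It only prints the pipeline unless RUN_CLONE=1 is set.

```shell
#!/bin/sh
# Illustrative sketch only; names below are assumptions, not the real script.

HOUSEKEEPING_BASE=${HOUSEKEEPING_BASE:-/tmp}   # local dir handed to xfsrestore -a
SOURCE_DEV=${SOURCE_DEV:-/dev/hda7}            # source device from the report
TARGET_DIR=${TARGET_DIR:-/clone-var}           # NFS-mounted destination

# Print the dump/restore pipeline without running it.
clone_pipeline() {
    printf 'xfsdump -v5 -l 0 - %s | xfsrestore -v5 -a %s - %s\n' \
        "$SOURCE_DEV" "$HOUSEKEEPING_BASE" "$TARGET_DIR"
}

# Clear any stale housekeeping dir, then dump and restore with the
# housekeeping files kept off the NFS mount.
run_clone() {
    rm -rf "$HOUSEKEEPING_BASE/xfsrestorehousekeepingdir/"
    xfsdump -v5 -l 0 - "$SOURCE_DEV" |
        xfsrestore -v5 -a "$HOUSEKEEPING_BASE" - "$TARGET_DIR"
}

# Dry run by default; set RUN_CLONE=1 to actually execute.
if [ "${RUN_CLONE:-0}" = 1 ]; then
    run_clone
else
    clone_pipeline
fi
```

With -a, xfsrestore creates xfsrestorehousekeepingdir under the given
directory instead of under the restore destination, so the housekeeping
files never touch the NFS mount.
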
 - Phil


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Phil Macias" <Philip.Macias@xxxxxxxx>, 

  RSIS, Inc. *  NOAA/GFDL * Princeton, NJ

      ___,___
       |_|_,       609 987 5059 office         
         |         609 203 5874 cell           
          >__,


 | Date: 4 Feb 2004 13:02:49 -0500
 | Bcc: root@xxxxxxxxxxxxxxxxxxxx
 | Date: Wed,  4 Feb 2004 13:02:48 -0500
 | CC: Philip.Macias@xxxxxxxx, linux-xfs@xxxxxxxxxxx
 | From: Phil Macias <Philip.Macias@xxxxxxxx>
 | 
 | Here you go:
 | 
 |   root: ~# gdb --pid=7896
 |   GNU gdb Red Hat Linux (5.2-2)
 |   Copyright 2002 Free Software Foundation, Inc.
 |   GDB is free software, covered by the GNU General Public License, and you are
 |   welcome to change it and/or distribute copies of it under certain conditions.
 |   Type "show copying" to see the conditions.
 |   There is absolutely no warranty for GDB.  Type "show warranty" for details.
 |   This GDB was configured as "i386-redhat-linux".
 |   Attaching to process 7896
 |   Reading symbols from /usr/sbin/xfsrestore...done.
 |   Reading symbols from /usr/lib/libhandle.so.1...done.
 |   Loaded symbols for /usr/lib/libhandle.so.1
 |   Reading symbols from /usr/lib/libattr.so.1...done.
 |   Loaded symbols for /usr/lib/libattr.so.1
 |   Reading symbols from /lib/libc.so.6...done.
 |   Loaded symbols for /lib/libc.so.6
 |   Reading symbols from /usr/lib/libgcc_s.so.1...done.
 |   Loaded symbols for /usr/lib/libgcc_s.so.1
 |   Reading symbols from /lib/ld-linux.so.2...done.
 |   Loaded symbols for /lib/ld-linux.so.2
 |   0x4010c566 in open64 () from /lib/libc.so.6
 |   
 |   (gdb) backtrace
 |   #0  0x4010c566 in open64 () from /lib/libc.so.6
 |   #1  0x400e4dd9 in opendir () from /lib/libc.so.6
 |   #2  0x08073a0b in wipepersstate () at content.c:3600
 |   #3  0x08072701 in content_complete () at content.c:2587
 |   #4  0x08062de7 in main (argc=4, argv=0xbffff514) at main.c:636
 |   #5  0x4004ed06 in __libc_start_main () from /lib/libc.so.6
 | 
 | 
 | 
 |  | From: Russell Cattelan <cattelan@xxxxxxx>
 |  | Cc: linux-xfs@xxxxxxxxxxx
 |  | Date: Wed, 04 Feb 2004 11:42:14 -0600
 |  | 
 |  | Are you able to attach gdb to the hung process and
 |  | get a backtrace?
 |  | 
 |  | On Wed, 2004-02-04 at 05:23, Phil Macias wrote:
 |  | > Hello,
 |  | >
 |  | > We have been using XFS on RedHat linux for over two years (since I
 |  | > have been here). I have been using a script to clone workstations from
 |  | > one-another for over a year whereby the host-to-be-cloned exports its
 |  | > XFS partitions over NFS and they are mounted by the donor host and
 |  | > contents copied with:
 |  | >
 |  | >       xfsdump -v5 -l 0 - /dev/hda7 | xfsrestore -v5 - /clone-var
 |  | >
 |  | > This process worked reliably for over a year on the 2.4.x kernels and
 |  | > these versions of xfs tools:
 |  | >
 |  | >   acl-2.1.1-gfdl-1-1
 |  | >   attr-2.1.1-gfdl-1-1
 |  | >   dmapi-2.0.5-gfdl-1-1
 |  | >   kernel-2.4.18-XFS-NFS-base-gfdl-2-1
 |  | >   xfsdump-2.2.4-gfdl-1-1
 |  | >   xfsprogs-2.3.6-gfdl-1-1
 |  | >
 |  | > PROBLEM: I am testing the 2.6.1 kernel and the following relevant
 |  | > packages:
 |  | >
 |  | >   libelf-0.8.2-2-gfdl-1-1
 |  | >   elfutils-libelf-0.89-2-gfdl-1-1
 |  | >   popt-1.8.1-0.31-gfdl-1-1
 |  | >   gcc-3.3-gfdl-1-1
 |  | >   kernel-2.6.0-complete-gfdl-1-1
 |  | >   glibc-2.3.2-gfdl-1-1
 |  | >   beecrypt-3.0.1-gfdl-1-1
 |  | >
 |  | >   acl-2.2.21-gfdl.tgz
 |  | >   attr-2.4.12-gfdl.tgz
 |  | >   binutils-2.14-gfdl.tgz
 |  | >   dmapi-2.1.0-gfdl.tgz
 |  | >   xfsprogs-2.6.0-gfdl.tgz
 |  | >
 |  | > Everything on the system works well except the remote
 |  | > xfsdump/xfsrestore. Running:
 |  | >
 |  | >       xfsdump -v5 -l 0 - /dev/hda7 | xfsrestore -v5 - /clone-var
 |  | >
 |  | > on the 2.6.1 system does dump/restore properly, but xfsrestore never
 |  | > exits. I have to issue these commands to release xfsrestore:
 |  | >
 |  | >   kill %
 |  | >   fuser -k /clone-var/
 |  | >
 |  | >         ...where /clone-var/ is the target partition. Please note that
 |  | >         the target host is still running the old (2.4.x) kernel.
 |  | >
 |  | > Here are the last lines to the "-v5" output:
 |  | >
 |  | >   ...
 |  | >   xfsrestore: read file hdr off 0 flags 0x0 ino 14680333 mode 0x0000a1ff
 |  | >   xfsrestore: preemptchk( )
 |  | >   xfsrestore: restoring lib/scrollkeeper/pt_BR (14680333 0)
 |  | >   xfsrestore: restoring symbolic link ino 14680333 lib/scrollkeeper/pt_BR
 |  | >   xfsrestore: drive_simple read( want 32 )
 |  | >   xfsrestore: drive_simple return_read_buf( returning 32 )
 |  | >   xfsrestore: xlate_extenthdr
 |  | >   xfsrestore: read extent hdr size 32 offset 0 type 4 flags 00000000
 |  | >   xfsrestore: drive_simple read( want 32 )
 |  | >   xfsrestore: drive_simple return_read_buf( returning 32 )
 |  | >   xfsrestore: drive_simple get_mark( )
 |  | >   xfsrestore: drive_simple read( want 256 )
 |  | >   xfsrestore: drive_simple return_read_buf( returning 256 )
 |  | >   xfsrestore: xlate_bstat
 |  | >   xfsrestore: xlate_bstat: pre-xlate
 |  | >           bs_ino 0
 |  | >           bs_mode  0
 |  | >   xfsrestore: xlate_bstat: post-xlate
 |  | >           bs_ino 0
 |  | >           bs_mode  0
 |  | >   xfsrestore: xlate_filehdr: pre-xlate
 |  | >           fh_offset 0
 |  | >           fh_flags 83886080
 |  | >           fh_checksum 13835040720794157312
 |  | >   xfsrestore: xlate_filehdr: post-xlate
 |  | >           fh_offset 0
 |  | >           fh_flags 5
 |  | >           fh_checksum 13835040720794157312
 |  | >   xfsrestore: read file hdr off 0 flags 0x5 ino 0 mode 0x00000000
 |  | >   xfsrestore: preemptchk( )
 |  | >   xfsrestore: Media_end: pos==3
 |  | >   xfsrestore: drive_simple end_read( )
 |  | >   xfsrestore: getting next media file for non-dir restore
 |  | >   xfsrestore: Media_mfile_next: purp==2 pos==0
 |  | >   xfsrestore: tree finalize
 |  | >   xfsrestore: restore complete: 139 seconds elapsed
 |  | >   -------------------------------------------------
 |  | >
 |  | > Syslog shows no errors or any info about xfsrestore.
 |  | >
 |  | > Any idea why xfsrestore fails to exit properly?
 |  | >
 |  | > Thanx,
 |  | >
 |  | >
 |  | >
 |  | --
 |  | Russell Cattelan <cattelan@xxxxxxxxxxx>
 |  | 

