
RE: xfs_growfs failure....

To: "david@xxxxxxxxxxxxx" <david@xxxxxxxxxxxxx>
Subject: RE: xfs_growfs failure....
From: Jason Vagalatos <Jason.Vagalatos@xxxxxxxxxxxxxxxx>
Date: Wed, 24 Feb 2010 10:08:26 -0800
Accept-language: en-US
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, Joe Allen <Joe.Allen@xxxxxxxxxx>
In-reply-to: <84A4B16519BD4C43ABD91AFB3CB84B6F097ED42EE9@sbapexch05>
References: <20100224115420.GK16175@xxxxxxxxxxxxxxxx> <5A88945F-EAA8-4678-8ADA-7700E3FF607B@xxxxxxxxxxxxxxxx> <84A4B16519BD4C43ABD91AFB3CB84B6F097ED42EE9@sbapexch05>
Thread-index: Acq1dQcNWxvN0I7rQVKCkqd4cDXmogAAt4kwAADhvUA=
Thread-topic: xfs_growfs failure....

David,

This might provide some useful insight too: I just remembered that the xfs_growfs command was run twice.  The first time it errored out because I omitted the -d option.  I reran it with -d and it completed successfully.

 

Is it possible that running xfs_growfs twice grew the filesystem to twice the intended size?  I would have expected the second run to fail, because the first run should already have used up all of the remaining space on the underlying device.
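
One way to sanity-check that, I think, is to compare the size XFS has recorded in the superblock with the size of the block device itself.  A rough sketch, using the volume path from the lvdisplay output below (xfs_db can read the superblock even though the filesystem won't mount, and blockdev reports the device size in 512-byte sectors):

# Filesystem size according to XFS: dblocks is in filesystem blocks (blocksize bytes each)
xfs_db -r -c 'sb 0' -c 'p dblocks' -c 'p blocksize' /dev/logfs-sessions/sessions

# Size of the underlying device, in 512-byte sectors
blockdev --getsz /dev/logfs-sessions/sessions

# With a 4 KiB blocksize, the filesystem only fits if dblocks * 8 <= the sector count above
# (8 x 512-byte sectors per 4 KiB filesystem block)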

 

Thanks,

 

Jason Vagalatos

Storage Administrator
Citrix|Online

7408 Hollister Avenue

Goleta California 93117
T:  805.690.2943 | M:  805.403.9433
jason.vagalatos@xxxxxxxxxxxxxxxx
http://www.citrixonline.com

 

From: Jason Vagalatos
Sent: Wednesday, February 24, 2010 9:57 AM
To: 'david@xxxxxxxxxxxxx'
Cc: 'xfs@xxxxxxxxxxx'; Joe Allen
Subject: RE: xfs_growfs failure....

 

 

Hi David,

I’m picking this up from Joe.  I’ll attempt to answer your questions.

 

The underlying device was grown from 89TB to 100TB.  The XFS filesystem uses an external logdev.  After the underlying device was grown by approx 11TB, we ran xfs_growfs -d <filesystem_mount_point>.  This command completed without errors, but the filesystem immediately went into a bad state.
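
For completeness, this was essentially the usual two-step grow, roughly as sketched below (assuming the device was grown with lvextend; the size shown is an approximation from memory, not the literal command history):

# Grow the underlying LV first, while the filesystem stays mounted
lvextend -L +11T /dev/logfs-sessions/sessions

# Then grow the data section of the mounted filesystem to fill the device
xfs_growfs -d <filesystem_mount_point>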

 

We are running xfsprogs-2.8.20-1.el5.centos on a RHEL 5 kernel: Linux 2.6.18-53.1.19.el5 #1 SMP Tue Apr 22 03:01:10 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

 

We killed the xfs_repair before it was able to find a secondary superblock and make things worse.

 

Currently the underlying block device is:

 

--- Logical volume ---

  LV Name                /dev/logfs-sessions/sessions

  VG Name                logfs-sessions

  LV UUID                32TRbe-OIDw-u4aH-fUmD-FLmU-5jRv-MaVYDg

  LV Write Access        read/write

  LV Status              available

  # open                 0

  LV Size                100.56 TB

  Current LE             26360253

  Segments               18

  Allocation             inherit

  Read ahead sectors     0

  Block device           253:61
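
As a cross-check, assuming the default 4 MiB extent size (vgdisplay logfs-sessions would confirm it), the LE count above converts exactly to the "limit=" value in the kernel message quoted further down:

# 26360253 extents x 8192 sectors/extent (4 MiB) = LV size in 512-byte sectors
echo $(( 26360253 * 8192 ))    # -> 215943192576 sectors, i.e. ~100.5 TiB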

 

At this point what are our options to recover this filesystem?

 

Thank you for any help you may be able to provide.

 

Jason Vagalatos

 

From: Joe Allen
Sent: Wednesday, February 24, 2010 9:15 AM
To: Jason Vagalatos
Subject: Fwd: xfs_growfs failure....

 


Weird. He seems to be implying we were at 110TB and stuff is written there. I guess we need to be 100% sure of the space allocated. Were the other LUNs ever attached?

 

 

 


Begin forwarded message:

From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: February 24, 2010 3:54:20 AM PST
To: Joe Allen <Joe.Allen@xxxxxxxxxx>
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>
Subject: Re: xfs_growfs failure....

On Wed, Feb 24, 2010 at 02:44:37AM -0800, Joe Allen wrote:

I am in some difficulty here over a 100TB filesystem that is now unusable after an xfs_growfs command.

 

Is there someone that might help assist?

 

#mount: /dev/logfs-sessions/sessions: can't read superblock

 

Filesystem "dm-61": Disabling barriers, not supported with external log device

attempt to access beyond end of device

dm-61: rw=0, want=238995038208, limit=215943192576


You've grown the filesystem to 238995038208 sectors (111.3TiB),
but the underlying device is only 215943192576 sectors (100.5TiB)
in size.
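
Spelling out the arithmetic behind those figures (both numbers from the kernel message are 512-byte sectors):

echo "scale=2; 238995038208 * 512 / 1024^4" | bc   # what XFS wants: ~111.3 TiB
echo "scale=2; 215943192576 * 512 / 1024^4" | bc   # what dm-61 has: ~100.5 TiB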

I'm assuming that you're trying to mount the filesystem after a
reboot? I make this assumption as growfs is an online operation and
won't grow if the underlying block device has not already been
grown. For a subsequent mount to fail with the underlying device
being too small, something about the underlying block
device had to change....

What kernel version and xfsprogs version are you using?

xfs_repair -n <device> basically looks for superblocks (phase 1, I guess) for a long time. I'm letting it run, but not much hope.


Don't repair the filesystem - there is nothing wrong with it
unless you start modifying stuff. What you need to do is fix the
underlying device to bring it back to the size it was supposed to
be at when the grow operation was run.

What does /proc/partitions tell you about the size of dm-61? Does
that report the correct size, and if it does, what is it?
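
Something along these lines should show it (/proc/partitions reports sizes in 1 KiB blocks, blockdev in 512-byte sectors):

grep -w dm-61 /proc/partitions                   # columns: major minor #blocks(1 KiB) name
blockdev --getsz /dev/logfs-sessions/sessions    # the 111.3TiB filesystem only fits if this is >= 238995038208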

I'm hesitant to run xfs_repair -L, or xfs_repair without the -n flag, for fear of making it worse.


Good - don't run anything like that until you sort out whether the
underlying device is correctly sized or not.

-bash-3.1# xfs_db -r -c 'sb 0' -c p /dev/logfs-sessions/sessions

magicnum = 0x58465342

blocksize = 4096

dblocks = 29874379776


XFS definitely thinks it is 111.3TiB in size.
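
That follows directly from the two fields above, and it matches the kernel's "want=" value exactly:

echo $(( 29874379776 * 4096 / 1024**4 ))   # dblocks x 4 KiB -> 111 (TiB, truncated)
echo $(( 29874379776 * 8 ))                # -> 238995038208 sectors, the want= value in the error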

rblocks = 0

rextents = 0

uuid = fc8bdf76-d962-43c1-ae60-b85f378978a6

logstart = 0

rootino = 2048

rbmino = 2049

rsumino = 2050

rextsize = 384

agblocks = 268435328

agcount = 112


112 AGs of 1TiB each - that confirms the grow succeeded and it was
able to write metadata to disk between 100 and 111 TiB without
errors being reported. That implies the block device must have been
that big at some point...
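
The arithmetic, for completeness: agblocks x 4 KiB is a hair under 1 TiB, and dblocks works out to 111 full AGs plus one partial AG, hence agcount = 112:

echo $(( 268435328 * 4096 ))          # 1099511103488 bytes, ~1 TiB per AG
echo $(( 29874379776 / 268435328 ))   # -> 111 full AGs (the 112th is partial)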

rbmblocks = 0

logblocks = 32000

versionnum = 0x3184

sectsize = 512

inodesize = 256

inopblock = 16

fname = "\000\000\000\000\000\000\000\000\000\000\000\000"

blocklog = 12

sectlog = 9

inodelog = 8

inopblog = 4

agblklog = 28

rextslog = 0

inprogress = 0

imax_pct = 25

icount = 7291520

ifree = 8514

fdblocks = 6623185597


With 24.5TiB of free space

-bash-3.1# xfs_db -r -c 'sb 2' -c p /dev/logfs-sessions/sessions

magicnum = 0x58465342

blocksize = 4096

dblocks = 24111418368


That's 89.9TiB...

rblocks = 0

rextents = 0

uuid = fc8bdf76-d962-43c1-ae60-b85f378978a6

logstart = 0

rootino = 2048

rbmino = 2049

rsumino = 2050

rextsize = 384

agblocks = 268435328

agcount = 90


And 90 AGs. That tells me the filesystem was created as a 90TiB
filesystem. Can you tell me if you attempted to grow from 90TiB to
100TiB or from 100TiB to 110TiB?  There were bugs at one point in
both the userspace grow code and the kernel code that resulted in
bad grows (hence the need to know the versions this occurred on
and what you were actually attempting to do), but these problems
can usually be fixed up with some xfs_db magic.

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
