On Sun, Aug 31, 2003 at 01:41:09PM +0200, Simon Matter wrote:
> I have upgraded one of my servers some days ago to my own
> kernel-2.4.20-20.7.XFS1.3.0. This is the kernel-2.4.20-19.9.XFS1.3.0 from
> oss, upgraded to the RedHat 2.4.20-20 release and nptl disabled (for
> RH7.x). I'm running RedHat 7.2, upgraded to the latest errata level. All
> xfs tools are the most current.
> Yesterday, my /home filesystem was shut down with the following error:
> xfs_inotobp: xfs_imap() returned an error 22 on md(9,8). Returning error.
> xfs_iunlink_remove: xfs_inotobp() returned an error 22 on md(9,8).
> Returning error.
> xfs_inactive: xfs_ifree() returned an error = 22 on md(9,8)
> xfs_force_shutdown(md(9,8),0x1) called from line 1846 of file
> xfs_vnodeops.c. Return address = 0xd08bd8c6
> Filesystem "md(9,8)": I/O Error Detected. Shutting down filesystem: md(9,8)
> Please umount the filesystem, and rectify the problem(s)
Hmmm - I've seen a couple of these reports now, but unfortunately I
haven't come across it myself yet. Someone else has a bugzilla bug
open; this one seems to be really difficult to trigger.
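For reference, the "error 22" in the messages quoted above is an errno
value, and on Linux 22 is EINVAL ("Invalid argument"). A minimal sketch
for decoding such a number, using only the Python standard library; the
helper name describe_errno is made up for illustration and is not part
of any XFS tool:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as "EINVAL".
    name = errno.errorcode.get(code, "unknown")
    # os.strerror gives the platform's message text for the code.
    return name, os.strerror(code)


if __name__ == "__main__":
    # 22 is the value reported by xfs_inotobp/xfs_iunlink_remove above.
    print(describe_errno(22))
```

This only names the error code; it says nothing about why xfs_imap()
returned it.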
> [root@xxl root]# xfs_check /dev/md8
> xfs_check: warning - cannot get sector size from block device /dev/md8:
> Invalid argument
This is where those bogus md "ictls" messages are coming from, btw. It's
fixed in more recent kernels; you can search the lkml archives for
recent Neil Brown patches and find the fix (it's some missing code
in md, and is harmless here).
> agf_freeblks 22053, counted 30629 in ag 6
> agi_freecount 332, counted 331 in ag 153
> [root@xxl root]# xfs_repair -n -l /dev/md9 /dev/md8
Running xfs_repair without -n might help in this case, by fixing enough
on the first run that subsequent runs don't dump core. :|
The gdb trace from xfs_repair will provide more clues here too.
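One way to capture that trace non-interactively, as a rough sketch: the
helper below only builds and prints a gdb command line, it does not run
anything. It assumes a gdb that supports the --batch, -ex and --args
options (older gdb releases may lack -ex, in which case run gdb
interactively and type "run" then "bt"). The function name
gdb_backtrace_command is hypothetical; the xfs_repair arguments are the
ones from your session above.

```python
def gdb_backtrace_command(program_args):
    """Build a gdb command line that runs a program and prints a
    backtrace once it stops (for example on a segmentation fault)."""
    # --batch: exit after the -ex commands have been processed.
    # -ex run / -ex bt: start the program, then print the backtrace.
    # --args: everything after it is the program and its arguments.
    return ["gdb", "--batch", "-ex", "run", "-ex", "bt", "--args",
            *program_args]


if __name__ == "__main__":
    cmd = gdb_backtrace_command(
        ["xfs_repair", "-n", "-l", "/dev/md9", "/dev/md8"])
    # Print only; paste the result into a root shell to run it.
    print(" ".join(cmd))
```

The backtrace is far more useful if xfs_repair was built with debugging
symbols.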
> xfs_repair: warning - cannot get sector size from block device /dev/md8:
> Invalid argument
> xfs_repair: warning - cannot get sector size from block device /dev/md9:
> Invalid argument
> Phase 1 - find and verify superblock...
> sb root inode value 128 inconsistent with calculated value 512
> would reset superblock root inode pointer to 512
Weird. Is this a 4K blocksize filesystem? Do you have xfs_info
output handy? taa.
> sb realtime bitmap inode 129 inconsistent with calculated value 513
> would reset superblock realtime bitmap ino pointer to 513
> sb realtime summary inode 130 inconsistent with calculated value 514
> would reset superblock realtime summary ino pointer to 514
> Phase 2 - using external log on /dev/md9
> - scan filesystem freespace and inode maps...
> root inode chunk not found
> avl_insert: Warning! duplicate range [512,576]
> add_inode - duplicate inode range
> Phase 3 - for each AG...
> - scan (but don't clear) agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> bad .. entry in directory inode 128, points to self: would clear entry
> data fork in inode 170 claims metadata block 32
> bad data fork in inode 170
> would have cleared inode 170
> data fork in inode 171 claims metadata block 33
> bad data fork in inode 171
> would have cleared inode 171
> - agno = 1
> - agno = 2
> - agno = 3
> - agno = 4
> - agno = 5
> - agno = 156
> - agno = 157
> - agno = 158
> - agno = 159
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> Segmentation fault
> [root@xxl root]# xfs_repair -V
> xfs_repair version 2.5.6
> I never considered the following a problem but wanted to mention it anyways.
> While running xfs_check and xfs_repair, I also get this in the logs:
> md: xfs_db(pid 9985) used obsolete MD ioctl, upgrade your software to use
> new ictls.
This part at least is an md problem fixed in more recent kernels.
The corruption is more concerning though, and unrelated.