Subject: XFS internal error / xfs_force_shutdown
From: Dan Lake
To: linux-xfs@oss.sgi.com
Date: 23 Jan 2004 12:01:28 -0800
Message-Id: <1074888088.16610.1013.camel@zoot.spimageworks.com>

I have been using XFS with some success on Linux NFS fileservers. In the past month, one server has been subjected to an extremely heavy load - lots of IO, many files created/read/deleted, full filesystem. With the increase in load, the system has crashed or hung numerous times. I've included system configuration information and XFS relevant traces from /var/log/messages below. I'd like to stabilize this system. Any suggestions?
=Dan Lake
Imageworks
--
pi equals 3 -- for small values of pi and large values of 3

Attachment: xfs-errors

Hardware is:
  2P Xeon 2.8GHz
  2GB memory
  JBOD with 13 disks
  3 md volumes:
    md0: 7 disks RAID0
    md1: 4 disks RAID5
    md2: 2 disks RAID0
  Adaptec AIC-7899 U160 SCSI controller
  Broadcom BCM-5701 gigabit network interface

XFS setup:

$ xfs_info /dev/md0
meta-data=/mnt/vol102            isize=256    agcount=60, agsize=1048576 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=62229552, imaxpct=25
         =                       sunit=8      swidth=56 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=30392, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=229376 blocks=0, rtextents=0

$ xfs_info /dev/md1
meta-data=/mnt/vol101            isize=256    agcount=103, agsize=1048576 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=107528976, imaxpct=25
         =                       sunit=8      swidth=24 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=98304  blocks=0, rtextents=0

$ xfs_info /dev/md2
meta-data=/mnt/vol55             isize=256    agcount=17, agsize=1048568 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=17779872, imaxpct=25
         =                       sunit=8      swidth=16 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=8688, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

/var/log/messages snippets:

Linux 2.4.22 SMP + xfs-2.4.22 patches
=====================================
Dec 24 12:44:49 foobar kernel: xfs_inotobp: xfs_imap() returned an error 22 on md(9,1). Returning error.
Dec 24 12:44:49 foobar kernel: xfs_iunlink_remove: xfs_inotobp() returned an error 22 on md(9,1). Returning error.
Dec 24 12:44:49 foobar kernel: xfs_inactive:^Ixfs_ifree() returned an error = 22 on md(9,1)
Dec 24 12:44:49 foobar kernel: xfs_force_shutdown(md(9,1),0x1) called from line 1873 of file xfs_vnodeops.c. Return address = 0xf89f5ee6
Dec 24 12:44:49 foobar kernel: Filesystem "md(9,1)": I/O Error Detected. Shutting down filesystem: md(9,1)
Dec 24 12:44:49 foobar kernel: Please umount the filesystem, and rectify the problem(s)
Jan 6 16:00:05 foobar kernel: Filesystem "md(9,0)": xfs_log_write: reservation ran out. Need to up reservation
Jan 6 16:00:05 foobar kernel: xfs_force_shutdown(md(9,0),0x8) called from line 1715 of file xfs_log.c. Return address = 0xf89f5ee6
Jan 6 16:00:05 foobar kernel: Filesystem "md(9,0)": Corruption of in-memory data detected. Shutting down filesystem: md(9,0)
Jan 6 16:00:05 foobar kernel: Please umount the filesystem, and rectify the problem(s)
Jan 6 16:00:05 foobar kernel: xfs_force_shutdown(md(9,0),0x2) called from line 1297 of file xfs_log.c. Return address = 0xf89f5ee6
Jan 9 09:40:27 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89ca7d1
Jan 9 09:40:27 foobar kernel: d4b55b8c f89c47b9 f8a11f75 00000001 00000000 f8a11f68 00000362 f89ca7d1
Jan 9 09:40:27 foobar kernel: 0803ae90 00000000 00000001 00000000 00001000 00000001 00000018 00000000
Jan 9 09:40:27 foobar kernel: 00000005 000000f0 0000002f 00000008 0003ae90 0000a205 00000000 00000000
Jan 9 09:40:27 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 9 09:40:27 foobar kernel: nfsd: non-standard errno: -990
Jan 9 11:57:24 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c.
Caller 0xf89ca7d1
Jan 9 11:57:24 foobar kernel: d4af7b8c f89c47b9 f8a11f75 00000001 00000000 f8a11f68 00000362 f89ca7d1
Jan 9 11:57:24 foobar kernel: 0803ae90 00000000 00000001 00000000 d1466000 00000000 00000018 c0259824
Jan 9 11:57:24 foobar kernel: f6c11de0 c4d12580 0000002f 00000008 0003ae90 00000000 d1466000 00000000
Jan 9 11:57:24 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 9 11:57:24 foobar kernel: nfsd: non-standard errno: -990
Jan 9 14:11:25 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_RETURN at line 295 of file xfs_alloc.c. Caller 0xf899559d
Jan 9 14:11:25 foobar kernel: c44ada00 f89947bd f8a11a09 00000001 00000000 f8a119fd 00000127 f899559d
Jan 9 14:11:25 foobar kernel: 00000000 00000000 00000000 0000000c ca23789c ca237920 00091e18 f899559d
Jan 9 14:11:25 foobar kernel: ca23789c ca237920 00091e18 00000010 00091e1c 0000000c 00000001 0009203c
Jan 9 14:11:25 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [<f89eddbb>] [] [] [] [] [] []
Jan 12 08:08:08 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89ca7d1
Jan 12 08:08:08 foobar kernel: d4b91b8c f89c47b9 f8a11f75 00000001 00000000 f8a11f68 00000362 f89ca7d1
Jan 12 08:08:08 foobar kernel: 21000c16 00000000 00000001 00000000 c025e407 43011dac 00000018 00000000
Jan 12 08:08:08 foobar kernel: 00000018 00000006 00000056 00000021 00000c16 d5122c20 00000296 00000000
Jan 12 08:08:08 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 12 08:08:08 foobar kernel: nfsd: non-standard errno: -990

Added ikeep mount option.

Linux 2.4.24 SMP + xfs-2.4.23 patches
=====================================
Jan 20 08:50:54 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89cb7d1
Jan 20 08:50:54 foobar kernel: f5497b8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 20 08:50:54 foobar kernel: 4d164a94 00000000 00000001 00000000 00000000 00000001 00000018 ffffffff
Jan 20 08:50:54 foobar kernel: 00000000 00000000 00000047 0000004d 00164a94 c2908400 039d3000 00000000
Jan 20 08:50:54 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 20 08:50:54 foobar kernel: nfsd: non-standard errno: -990
Jan 20 09:21:26 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89cb7d1
Jan 20 09:21:26 foobar kernel: f549db8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 20 09:21:26 foobar kernel: 4b0035b5 00000000 00000001 00000000 00000000 f5a29400 00000018 00000000
Jan 20 09:21:26 foobar kernel: 00000000 00000001 00000056 0000004b 000035b5 000001c4 f89c9ad3 00000000
Jan 20 09:21:26 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 20 09:21:26 foobar kernel: nfsd: non-standard errno: -990
Jan 20 16:23:34 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1634 of file xfs_alloc.c.
Caller 0xf8997fe2
Jan 20 16:23:34 foobar kernel: e67e5d28 f8997289 f8a11903 00000001 00000000 f8a118dd 00000662 f8997fe2
Jan 20 16:23:34 foobar kernel: f55fb800 e6cf2a6c e6cf285c 00091e88 000000ab 00000001 00091e18 00000010
Jan 20 16:23:34 foobar kernel: 00000000 00000001 f35c2da0 e6cf8c04 e6cf8c04 e6cf5f0c f8997fe2 e6cf8c04
Jan 20 16:23:34 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 20 16:23:34 foobar kernel: xfs_force_shutdown(md(9,0),0x8) called from line 4046 of file xfs_bmap.c. Return address = 0xf89f6fc6
Jan 20 16:23:34 foobar kernel: Filesystem "md(9,0)": Corruption of in-memory data detected. Shutting down filesystem: md(9,0)
Jan 20 16:23:34 foobar kernel: Please umount the filesystem, and rectify the problem(s)
Jan 20 21:05:01 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89cb7d1
Jan 20 21:05:01 foobar kernel: f5d93b8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 20 21:05:01 foobar kernel: 4b0035b5 00000000 00000001 00000000 c0251659 f76af0c0 00000018 c033f450
Jan 20 21:05:01 foobar kernel: 00000020 00000042 0000003f 0000004b 000035b5 f76af0c0 c607b2d4 00000000
Jan 20 21:05:01 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 20 21:05:01 foobar kernel: nfsd: non-standard errno: -990
Jan 20 21:16:06 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c.
Caller 0xf89cb7d1
Jan 20 21:16:06 foobar kernel: f5d3fb8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 20 21:16:06 foobar kernel: 48be1c1e 00000000 00000001 00000000 00001000 00000001 00000018 00000000
Jan 20 21:16:06 foobar kernel: 00000005 000000f0 00000047 00000048 00be1c1e 0000a205 00000000 00000000
Jan 20 21:16:06 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 20 21:16:06 foobar kernel: nfsd: non-standard errno: -990
Jan 20 21:42:26 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89cb7d1
Jan 20 21:42:26 foobar kernel: f5d8fb8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 20 21:42:26 foobar kernel: 48be1c1e 00000000 00000001 00000000 00000000 f5fab000 00000018 00000000
Jan 20 21:42:26 foobar kernel: 00000000 00000001 00000056 00000048 00be1c1e 000001c4 f89df7f2 00000000
Jan 20 21:42:26 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 20 21:42:26 foobar kernel: nfsd: non-standard errno: -990
Jan 21 01:04:41 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89cb7d1
Jan 21 01:04:41 foobar kernel: f5d59b8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 21 01:04:41 foobar kernel: 4b0035b5 00000000 00000001 00000000 00000000 f5fab000 00000018 00000000
Jan 21 01:04:41 foobar kernel: 00000000 e02b24ec 0000003f 0000004b 000035b5 c02633f4 ca0458a0 00000000
Jan 21 01:04:41 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 21 01:04:41 foobar kernel: nfsd: non-standard errno: -990
Jan 21 01:08:02 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c.
Caller 0xf89cb7d1
Jan 21 01:08:02 foobar kernel: f5d55b8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 21 01:08:02 foobar kernel: 4cd07e81 00000000 00000001 00000000 00000000 f5fab000 00000018 00000000
Jan 21 01:08:02 foobar kernel: 00000000 00000001 00000047 0000004c 00d07e81 000001c4 f89c9ad3 00000000
Jan 21 01:08:02 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 21 01:08:02 foobar kernel: nfsd: non-standard errno: -990
Jan 21 01:23:46 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89cb7d1
Jan 21 01:23:46 foobar kernel: f5877b8c f89c57b9 f8a11e55 00000001 00000000 f8a11e48 00000362 f89cb7d1
Jan 21 01:23:46 foobar kernel: 4cd07e81 00000000 00000001 00000000 00001000 00000001 00000018 00000000
Jan 21 01:23:46 foobar kernel: 00000005 000000f0 00000056 0000004c 00d07e81 0000a205 00000000 00000000
Jan 21 01:23:46 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 21 01:23:46 foobar kernel: nfsd: non-standard errno: -990
Jan 21 06:23:18 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c. Caller 0xf89ca7d1
Jan 21 06:23:18 foobar kernel: f603bb8c f89c47b9 f8a11f75 00000001 00000000 f8a11f68 00000362 f89ca7d1
Jan 21 06:23:18 foobar kernel: 4cd07e81 00000000 00000001 00000000 f5c938c0 f7fcba40 00000018 f7fcba40
Jan 21 06:23:18 foobar kernel: c0264300 f603bca8 0000003f 0000004c 00d07e81 00000040 f5c938c0 00000000
Jan 21 06:23:18 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 21 06:23:18 foobar kernel: nfsd: non-standard errno: -990
Jan 21 06:33:41 foobar kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 866 of file xfs_ialloc.c.
Caller 0xf89ca7d1
Jan 21 06:33:41 foobar kernel: f6015b8c f89c47b9 f8a11f75 00000001 00000000 f8a11f68 00000362 f89ca7d1
Jan 21 06:33:41 foobar kernel: 4cd07e81 00000000 00000001 00000000 c2dcd800 c2dcd800 00000018 c2dcd800
Jan 21 06:33:41 foobar kernel: c0264300 f6015ca8 00000047 0000004c 00d07e81 c47ab308 e9207acc 00000000
Jan 21 06:33:41 foobar kernel: Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] [] []
Jan 21 06:33:41 foobar kernel: nfsd: non-standard errno: -990