Message-Id: <200202271452.PAA03827@post.webmailer.de>
From: Simon Pabst
Reply-To: elros@numenor.de
To: "linux-xfs@oss.sgi.com"
Subject: again: xfs/lvm problem, disappearing files
Date: Wed, 27 Feb 2002 15:51:35 +0100

Hi all,

I hit the problem I reported yesterday again, while writing to a
samba-mounted xfs-lvm volume (/fs3) (this time from vmware-win2k,
yesterday from real win2k).

I was writing files to /fs3/scr/dc, which is now reported empty:

root@judicator:/fs3/scr > ls -la /fs3/scr/dc
total 0

The client process on Windows was happily writing to this folder while
I did the ls.

Also, in the parent directory /fs3/scr/ one of my backup directories,
"chimaera_backups", is missing (I haven't even touched it for days):

root@judicator:/fs3/scr > ls -la /fs3/scr
ls: /fs3/scr/chimaera_backups: No such file or directory
total 16
drwxrwxrwt    7 root     root           88 Feb 25 00:56 .
drwxr-xr-x    5 root     root           46 Feb  8 18:19 ..
drwxrwx---   11 thrawn   thrawn       4096 Feb 27 13:54 dc
drwxr-xr-x    6 root     root           75 Feb 17 20:06 judicator_backups
drwxrwx---    7 thrawn   thrawn       4096 Feb 26 15:50 old_dc
drwxrwsrwt    3 thrawn   admin        4096 Feb 27 13:20 upload

I had this "no such file or directory" problem with reiserfs a year
ago, while writing to nfs shares.

Running xfs_repair -n after umount /fs3 gives me:

root@judicator:/ > xfs_repair -n /dev/vg02/lvol1
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
...........................................................................................................

I aborted this and did a reboot. Now xfs_repair -n shows no errors at
all, and a full run gives:

root@judicator:/ > xfs_repair /dev/vg02/lvol1
xfs_repair: warning - cannot set blocksize on block device /dev/vg02/lvol1: Invalid argument
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - clear lost+found (if it exists) ...
        - clearing existing "lost+found" inode
        - deleting existing "lost+found" entry
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ...
        - traversal finished ...
        - traversing all unattached subtrees ...
        - traversals finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 18376471, moving to lost+found
disconnected dir inode 18800263, moving to lost+found
disconnected dir inode 117385730, moving to lost+found
disconnected dir inode 275899789, moving to lost+found
disconnected dir inode 310212310, moving to lost+found
disconnected dir inode 402676446, moving to lost+found
disconnected dir inode 506154388, moving to lost+found
Phase 7 - verify and correct link counts...
done

I get some empty folders in lost+found, and some files that came from
completely different parts of the directory structure (not /fs3/scr/,
where I was writing). Apart from those files, nothing in /fs3/scr seems
to be missing, but the files that were written last seem to be
corrupted.

I'm afraid it is still the "ancient" RH-2.4.9-13 kernel; I haven't had
time to upgrade (BTW, oss.sgi.com seems to be down for me).

Same as yesterday: Athlon 900, 512 MB.

Mount options:
/dev/vg02/lvol1 /fs3 xfs defaults,noatime,nodiratime,logbufs=4 1 2

vg02/lvol1 is a 2-disk stripe set using 2 Maxtor 80 GB IDE drives.

/var/log/messages did not show any xfs-related information.

Any clues would be highly appreciated.

Thanks,
-simon pabst
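
The "bad magic number" that xfs_repair reported before the reboot can
also be checked by hand. Below is a minimal sketch in Python, assuming
the primary XFS superblock starts at byte 0 of the volume and begins
with the four ASCII bytes "XFSB". The function name has_xfs_magic is
made up for this example and is not part of xfsprogs; it only reads the
device and does not replace xfs_repair.

```python
import sys

# Assumption: an XFS primary superblock begins with these four bytes.
XFS_SB_MAGIC = b"XFSB"


def has_xfs_magic(path):
    """Return True if the first four bytes at `path` are the XFS magic.

    `path` may be a block device such as /dev/vg02/lvol1 or an image file.
    The file is opened read-only, so nothing on the volume is modified.
    """
    with open(path, "rb") as dev:
        return dev.read(4) == XFS_SB_MAGIC


if __name__ == "__main__":
    # Hypothetical usage: python check_magic.py /dev/vg02/lvol1
    for p in sys.argv[1:]:
        print(p, "magic ok" if has_xfs_magic(p) else "bad magic")
```

Running it once while the volume misbehaves and once after a reboot
would show whether the first block as read through LVM really differs
between the two states, which might help narrow down whether the
problem sits below the filesystem.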