From owner-xfs@oss.sgi.com Sat Feb 2 02:12:37 2008
Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Feb 2008 02:12:39 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_45 autolearn=no version=3.3.0-r574664
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m12ACWCG000537 for ; Sat, 2 Feb 2008 02:12:37 -0800
X-ASG-Debug-ID: 1201947159-02c200250000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from hs-out-2122.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 70B7C59201E for ; Sat, 2 Feb 2008 02:12:40 -0800 (PST)
Received: from hs-out-2122.google.com (hs-out-0708.google.com [64.233.178.244]) by cuda.sgi.com with ESMTP id PqEzeTdQhIjVCFRQ for ; Sat, 02 Feb 2008 02:12:40 -0800 (PST)
Received: by hs-out-2122.google.com with SMTP id 4so1258964hsl.4 for ; Sat, 02 Feb 2008 02:12:39 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition;
	bh=KQ+XDSd85LR04eZyGat7UEEw/AclcpVu1p2W5m8ql8s=;
	b=uNa2UJAoVtNrPM3B0M+CfqhwGbNYUjNiu1XV28RZC4+GRyQdkFeltt9Qgqbxsaur/T6Zk6CG4HNZHVS7uJfRYJIwmR1tx4t0tjBPloF7gTyV1ODCUw5/CShqbmjW6H3LNcTx7GnnqNrpcUamj2V9PGKJTa1kxH8JwGhaNI9FFX4=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition;
	b=K3mw7VBZeG6MiBKuzw+tmYDaXcyrVnt5spkLt9H2n7xAsdAcaTZejcO0hK8jp41eK6mvbtE2o5RYs5PKd3b1tJ6DAXQEwH/qYf4kmAzRezMAaV2hAA1u33vyFN+j1s203gQorvqwlgFnqpxDR9C1ROdcv0FaroS02tB6flvBexI=
Received: by 10.150.136.6 with SMTP id
	j6mr1820493ybd.126.1201947158951; Sat, 02 Feb 2008 02:12:38 -0800 (PST)
Received: by 10.150.133.20 with HTTP; Sat, 2 Feb 2008 02:12:38 -0800 (PST)
Message-ID: <8f1895b90802020212h278968efy7a644c55c480134e@mail.gmail.com>
Date: Sat, 2 Feb 2008 11:12:38 +0100
From: "Per Lundberg"
To: xfs@oss.sgi.com
X-ASG-Orig-Subj: XFS: failed to read RT inodes
Subject: XFS: failed to read RT inodes
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-Barracuda-Connect: hs-out-0708.google.com[64.233.178.244]
X-Barracuda-Start-Time: 1201947160
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -2.02
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=
X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41150
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
X-Virus-Scanned: ClamAV 0.91.2/5649/Fri Feb 1 16:54:58 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 14311
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: perlun@gmail.com
Precedence: bulk
X-list: xfs

Hello,

After a hard drive failure (and no backup...) I am trying with all my
strength to recover the file systems... With no real success. I have
dumped over the two file systems in question to another, healthy disk
drive. After dumping back the data to a third hard disk, one of the
file systems can be successfully mounted but with a huge number of
files in the lost+found directory (after recovery using xfs_repair).
Strange since the dump indicated only 11 bad sectors on that file
system.

However, the big (55 gig) file system cannot even be mounted and this
is troublesome since it contains the most valuable data.
This is what I get after doing xfs_repair on the volume:

Jan 10 10:30:59 amos kernel: SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled
Jan 10 10:30:59 amos kernel: SGI XFS Quota Management subsystem
Jan 10 10:31:00 amos kernel: XFS mounting filesystem sda10
Jan 10 10:31:00 amos kernel: XFS: failed to read RT inodes

The repair was done using a Debian-compiled Linux kernel (2.6.17-2-k7)
and xfsprogs 2.8.11-1. Afterwards, I upgraded the kernel to 2.6.22 and
xfsprogs to 2.9.4-2, but it doesn't really make much of a difference.

I've tried to disable the realtime stuff afterwards, by mounting the
filesystem using rtdev=/dev/null, but I only got the following message
out of that:

Jan 10 10:36:40 amos kernel: XFS: Invalid device [/dev/null], error=-15

*Any* ideas on how to proceed with this are greatly appreciated...
Thanks in advance.

-- 
Best regards,
Per Lundberg

From owner-xfs@oss.sgi.com Sat Feb 2 07:59:01 2008
Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Feb 2008 07:59:04 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_45 autolearn=no version=3.3.0-r574664
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m12FwtFE022265 for ; Sat, 2 Feb 2008 07:59:01 -0800
X-ASG-Debug-ID: 1201967657-4d7300d10000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 915CE592AFB for ; Sat, 2 Feb 2008 07:54:17 -0800 (PST)
Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id U03eo3LQigmy7XQG for ; Sat, 02 Feb 2008 07:54:17 -0800 (PST)
Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested)
	by sandeen.net (Postfix) with ESMTP id 6CDCB18DAFE24; Sat, 2 Feb 2008 09:53:43 -0600 (CST)
Message-ID: <47A49206.4020700@sandeen.net>
Date: Sat, 02 Feb 2008 09:53:42 -0600
From: Eric Sandeen
User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031)
MIME-Version: 1.0
To: Per Lundberg
CC: xfs@oss.sgi.com
X-ASG-Orig-Subj: Re: XFS: failed to read RT inodes
Subject: Re: XFS: failed to read RT inodes
References: <8f1895b90802020212h278968efy7a644c55c480134e@mail.gmail.com>
In-Reply-To: <8f1895b90802020212h278968efy7a644c55c480134e@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Barracuda-Connect: sandeen.net[209.173.210.139]
X-Barracuda-Start-Time: 1201967657
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -2.02
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=
X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41174
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
X-Virus-Scanned: ClamAV 0.91.2/5653/Sat Feb 2 05:54:59 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 14312
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: sandeen@sandeen.net
Precedence: bulk
X-list: xfs

Per Lundberg wrote:
> Hello,
>
> After a hard drive failure (and no backup...) I am trying with all my
> strength to recover the file systems... With no real success. I have
> dumped over the two file systems in question to another, healthy disk
> drive. After dumping back the data to a third hard disk, one of the
> file systems can be successfully mounted but with a huge number of
> files in the lost+found directory (after recovery using xfs_repair).
> Strange since the dump indicated only 11 bad sectors on that file
> system.
>
> However, the big (55 gig) file system cannot even be mounted and this
> is troublesome since it contains the most valuable data. This is what
> I get after doing xfs_repair on the volume:
>
> Jan 10 10:30:59 amos kernel: SGI XFS with ACLs, security attributes,
> realtime, large block numbers, no debug enabled
> Jan 10 10:30:59 amos kernel: SGI XFS Quota Management subsystem
> Jan 10 10:31:00 amos kernel: XFS mounting filesystem sda10
> Jan 10 10:31:00 amos kernel: XFS: failed to read RT inodes

maybe try xfs_db on the device; "sb 0" and "p" commands to see what the
rt inodes are.

-Eric

From owner-xfs@oss.sgi.com Sat Feb 2 10:07:21 2008
Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Feb 2008 10:07:25 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m12I7Jia032249 for ; Sat, 2 Feb 2008 10:07:21 -0800
X-ASG-Debug-ID: 1201975661-49d702140000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from py-out-1112.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id AB95CD76D10 for ; Sat, 2 Feb 2008 10:07:41 -0800 (PST)
Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.179]) by cuda.sgi.com with ESMTP id 0syRb1yJmWa9XwTN for ; Sat, 02 Feb 2008 10:07:41 -0800 (PST)
Received: by py-out-1112.google.com with SMTP id j37so1673733pyc.4 for ; Sat, 02 Feb 2008 10:07:40 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:received:from:to:cc:subject:date:message-id:x-mailer;
	bh=Wu0g60u/KYOWC5JBgRvQHpt2mWsgcKEnD9/RT9q3WOw=;
	b=QzHmQOTLiNUslOJyKFmjJ8D7RGtXV23oj1vO+D60Gfch6RNhs2d/tGei+CgWK6I7s7uB9k0q9qMdqqnZu1JCF0ht7gSqpHbDvY98EoAbH+HsGtbW2T8hb+hxTDZsoQ/0S2tM+LBF/IRst+ervKBhXr/swkyqKjQhVALvIfieLos=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=from:to:cc:subject:date:message-id:x-mailer;
	b=L+JUWONM0WCFqZm/Y4/xHILkGaGxGPcomV6w4mKGzteuAZ/PAf7hqVeI7ZQjXgKlLhLUEARDPMxjES6t26AbVKDlBAyL93F7uSVGaFEFTcgBZf7sqZZXb/NsFmmfPQVZIE2FGX5OhiLckuBU6OZ1XzIjBZQqS0mPkRPLoqEWOBY=
Received: by 10.141.185.3 with SMTP id m3mr3394244rvp.236.1201975659781; Sat, 02 Feb 2008 10:07:39 -0800 (PST)
Received: from tux ( [219.134.231.43]) by mx.google.com with ESMTPS id g1sm3524827rvb.0.2008.02.02.10.07.34 (version=TLSv1/SSLv3 cipher=OTHER); Sat, 02 Feb 2008 10:07:39 -0800 (PST)
Received: by tux (sSMTP sendmail emulation); Sun, 3 Feb 2008 02:07:23 +0800
From: Denis Cheng
To: Tim Shimmin , Tim Shimmin
Cc: xfs@oss.sgi.com, linux-kernel@vger.kernel.org
X-ASG-Orig-Subj: [PATCH] xfs: add __init/__exit mark to specific init/cleanup functions
Subject: [PATCH] xfs: add __init/__exit mark to specific init/cleanup functions
Date: Sun, 3 Feb 2008 02:07:23 +0800
Message-Id: <1201975643-8662-1-git-send-email-crquan@gmail.com>
X-Mailer: git-send-email 1.5.3.8
X-Barracuda-Connect: py-out-1112.google.com[64.233.166.179]
X-Barracuda-Start-Time: 1201975661
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -2.02
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=
X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41182
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
X-Virus-Scanned: ClamAV 0.91.2/5656/Sat Feb 2 08:31:27 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 14313
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to:
	xfs-bounce@oss.sgi.com
X-original-sender: crquan@gmail.com
Precedence: bulk
X-list: xfs

Signed-off-by: Denis Cheng
---
 fs/xfs/linux-2.6/xfs_super.c |    2 +-
 fs/xfs/linux-2.6/xfs_vnode.c |    2 +-
 fs/xfs/support/ktrace.c      |    4 ++--
 fs/xfs/support/uuid.c        |    2 +-
 fs/xfs/xfs_vfsops.c          |    4 ++--
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 8cb63c6..abacbdd 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -361,7 +361,7 @@ xfs_fs_inode_init_once(
 	inode_init_once(vn_to_inode((bhv_vnode_t *)vnode));
 }
 
-STATIC int
+STATIC int __init
 xfs_init_zones(void)
 {
 	xfs_vnode_zone = kmem_zone_init_flags(sizeof(bhv_vnode_t), "xfs_vnode",
diff --git a/fs/xfs/linux-2.6/xfs_vnode.c b/fs/xfs/linux-2.6/xfs_vnode.c
index 814169f..56e23cd 100644
--- a/fs/xfs/linux-2.6/xfs_vnode.c
+++ b/fs/xfs/linux-2.6/xfs_vnode.c
@@ -40,7 +40,7 @@
 #define vptosync(v)	(&vsync[((unsigned long)v) % NVSYNC])
 static wait_queue_head_t vsync[NVSYNC];
 
-void
+void __init
 vn_init(void)
 {
 	int i;
diff --git a/fs/xfs/support/ktrace.c b/fs/xfs/support/ktrace.c
index 5cf2e86..17f2b7c 100644
--- a/fs/xfs/support/ktrace.c
+++ b/fs/xfs/support/ktrace.c
@@ -21,7 +21,7 @@ static kmem_zone_t *ktrace_hdr_zone;
 static kmem_zone_t *ktrace_ent_zone;
 static int ktrace_zentries;
 
-void
+void __init
 ktrace_init(int zentries)
 {
 	ktrace_zentries = zentries;
@@ -36,7 +36,7 @@ ktrace_init(int zentries)
 	ASSERT(ktrace_ent_zone);
 }
 
-void
+void __exit
 ktrace_uninit(void)
 {
 	kmem_zone_destroy(ktrace_hdr_zone);
diff --git a/fs/xfs/support/uuid.c b/fs/xfs/support/uuid.c
index e157015..493a6ec 100644
--- a/fs/xfs/support/uuid.c
+++ b/fs/xfs/support/uuid.c
@@ -133,7 +133,7 @@ uuid_table_remove(uuid_t *uuid)
 	mutex_unlock(&uuid_monitor);
 }
 
-void
+void __init
 uuid_init(void)
 {
 	mutex_init(&uuid_monitor);
diff --git a/fs/xfs/xfs_vfsops.c b/fs/xfs/xfs_vfsops.c
index a154459..cde337e 100644
--- a/fs/xfs/xfs_vfsops.c
+++ b/fs/xfs/xfs_vfsops.c
@@ -58,7 +58,7 @@
 #include "xfs_vfsops.h"
 
 
-int
+int __init
 xfs_init(void)
 {
 	extern kmem_zone_t *xfs_bmap_free_item_zone;
@@ -152,7 +152,7 @@ xfs_init(void)
 	return 0;
 }
 
-void
+void __exit
 xfs_cleanup(void)
 {
 	extern kmem_zone_t *xfs_bmap_free_item_zone;
-- 
1.5.3.8

From owner-xfs@oss.sgi.com Sat Feb 2 11:28:35 2008
Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Feb 2008 11:28:39 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,J_CHICKENPOX_45 autolearn=no version=3.3.0-r574664
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m12JSYEC003943 for ; Sat, 2 Feb 2008 11:28:35 -0800
X-ASG-Debug-ID: 1201980535-149e00850000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from wx-out-0506.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3541F100885F for ; Sat, 2 Feb 2008 11:28:55 -0800 (PST)
Received: from wx-out-0506.google.com (wx-out-0506.google.com [66.249.82.228]) by cuda.sgi.com with ESMTP id UHSg5BUAnHqgQVAW for ; Sat, 02 Feb 2008 11:28:55 -0800 (PST)
Received: by wx-out-0506.google.com with SMTP id s9so1547752wxc.32 for ; Sat, 02 Feb 2008 11:28:54 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
	h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;
	bh=RXN4zfrK83KxRefUwD6+FtmL9N8ToMTjhIqvy4rW0pE=;
	b=kFCaPvGoSxPk9jEpKgMtVIEfmBSoCJmYB41NERsfbhfDDQuXubsT1Z1fzSXG699pdnHqi95yLc24yHbAi7tCeEWekVCmQo3kj4ub1HyW87dMwQxIr0+AP89h7sPqRae7kYwLJ9JC/Sk76jo6fVKKOouhKczifgYgGykCvGYnZZg=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma;
	h=message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;
	b=PnhvRS3V5UPgL4jfyAo2exHYIx3NGADQYK1yOikbWQOoh3JXsUhSOKegPiBNCqxmRvUAaip6WjVJiIJaEKHXQBtIoWqWr5KGLg9U4qyByhHGW5IxOJZO4mWxPIb+EKDfZSsQhAspy7znf0n0BVhyxcTrn6hyRmI8YQaa/OT0CNc=
Received: by 10.150.189.9 with SMTP id m9mr2023821ybf.73.1201978893046; Sat, 02 Feb 2008 11:01:33 -0800 (PST)
Received: by 10.150.133.20 with HTTP; Sat, 2 Feb 2008 11:01:32 -0800 (PST)
Message-ID: <8f1895b90802021101k63d29842re986ef75ffcc113@mail.gmail.com>
Date: Sat, 2 Feb 2008 20:01:32 +0100
From: "Per Lundberg"
To: "Eric Sandeen"
X-ASG-Orig-Subj: Re: XFS: failed to read RT inodes
Subject: Re: XFS: failed to read RT inodes
Cc: xfs@oss.sgi.com
In-Reply-To: <47A49206.4020700@sandeen.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <8f1895b90802020212h278968efy7a644c55c480134e@mail.gmail.com> <47A49206.4020700@sandeen.net>
X-Barracuda-Connect: wx-out-0506.google.com[66.249.82.228]
X-Barracuda-Start-Time: 1201980536
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -2.02
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=
X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41186
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
X-Virus-Scanned: ClamAV 0.91.2/5656/Sat Feb 2 08:31:27 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 14314
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: perlun@gmail.com
Precedence: bulk
X-list: xfs

On Feb 2, 2008 4:53 PM, Eric Sandeen wrote:
> > Jan 10 10:30:59 amos kernel: SGI XFS with ACLs, security attributes,
> > realtime, large block numbers, no debug enabled
> > Jan 10 10:30:59 amos kernel: SGI XFS Quota Management subsystem
> > Jan 10 10:31:00 amos kernel: XFS mounting filesystem sda10
> > Jan 10 10:31:00 amos kernel: XFS: failed to read RT inodes
> maybe try xfs_db on the device; "sb 0" and "p" commands to see what the
> rt inodes are.

Thanks for the hint! However, what I did was a full restore of the
volume again and then xfs_repair + xfs_repair -L. Now, I can mount it -
but *everything* (and I really mean literally everything) has been put
into lost+found. Slightly annoying to say the least, but still a whole
lot better than having lost it all.

I think it *could* be because of a bad imaging program (Image for
Windows). The first sector of the partition doesn't seem to come
through properly (after restore, it contains 512 0x00 bytes). Quite
wrong, in other words, so that could be the reason for the breakage.

I guess I could put back the broken drive again and just "dd" over the
data to the working disk, but when I tried to do that yesterday on
another partition, I got 500 or so bad sectors... compared to 11 with
Image for Windows. So either Image for Windows was lying to me or the
disk has been starting to fall apart even more.
:-)

-- 
Best regards,
Per Lundberg

From owner-xfs@oss.sgi.com Sat Feb 2 23:43:11 2008
Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Feb 2008 23:43:16 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level: *
X-Spam-Status: No, score=1.2 required=5.0 tests=AWL,BAYES_20,J_CHICKENPOX_210 autolearn=no version=3.3.0-r574664
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m137h8mq025169 for ; Sat, 2 Feb 2008 23:43:11 -0800
X-ASG-Debug-ID: 1202024610-7b7f00180000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from elasmtp-dupuy.atl.sa.earthlink.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id D1181594CB3 for ; Sat, 2 Feb 2008 23:43:30 -0800 (PST)
Received: from elasmtp-dupuy.atl.sa.earthlink.net (elasmtp-dupuy.atl.sa.earthlink.net [209.86.89.62]) by cuda.sgi.com with ESMTP id iueSzk56KfCJLDFl for ; Sat, 02 Feb 2008 23:43:30 -0800 (PST)
Received: from [64.131.226.230] (helo=[192.168.0.150]) by elasmtp-dupuy.atl.sa.earthlink.net with asmtp (TLSv1:AES256-SHA:256) (Exim 4.34) id 1JLZV6-0002cQ-3v; Sun, 03 Feb 2008 02:42:52 -0500
Message-ID: <47A57056.5070904@mpigani.org>
Date: Sun, 03 Feb 2008 02:42:14 -0500
From: Dragos
User-Agent: Thunderbird 2.0.0.9 (Windows/20071031)
MIME-Version: 1.0
To: David Chinner
CC: Michael Tokarev , David Greaves , linux-raid@vger.kernel.org, xfs@oss.sgi.com
X-ASG-Orig-Subj: Re: assemble vs create an array.......
Subject: Re: assemble vs create an array.......
X-Priority: 1 (Highest)
References: <474F869D.5040503@mpigani.org> <18255.41044.614676.410107@notabene.brown>
	<47501D7E.7000804@dgreaves.com> <475552D2.4000802@mpigani.org>
	<47568DE1.1050108@dgreaves.com> <4758129D.40600@mpigani.org>
	<475825C0.4070605@msgid.tls.msk.ru> <20071206212225.GN115527101@sgi.com>
In-Reply-To: <20071206212225.GN115527101@sgi.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: 382be18c4f3f7bcad780f4a490ca6956abb457f1b4332f52b7337aa60c136f28ffc5a819a1d59646350badd9bab72f9c350badd9bab72f9c350badd9bab72f9c
X-Originating-IP: 64.131.226.230
X-Barracuda-Connect: elasmtp-dupuy.atl.sa.earthlink.net[209.86.89.62]
X-Barracuda-Start-Time: 1202024610
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -1.90
X-Barracuda-Spam-Status: No, SCORE=-1.90 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=X_PRIORITY_HIGH
X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41233
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
	0.12 X_PRIORITY_HIGH        Sent with 'X-Priority' set to high
X-Virus-Scanned: ClamAV 0.91.2/5665/Sat Feb 2 18:35:12 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 14315
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dragos@mpigani.org
Precedence: bulk
X-list: xfs

Hello,

I am not sure if you have received my email from last week with the
results of the different combinations prescribed (it contained html
code).

Anyway, I did a ro mount to check the partition and was happy to see a
lot of files intact. A few seemed destroyed, but I am not sure. I tried
a xfs_check on the partition and it told me:

ERROR: The filesystem have valuable metadata changes in a log which
needs to be replayed.
Mount the filesystem to replay the log, and unmount it before
re-running xfs_check. If you are unable to mount the filesystem, then
use the xfs_repair -L option to destroy the log and attempt a repair.

Since I am unable to mount the partition, should I use the -L option
with xfs_repair, or let it run without it?

Again, please let me know if I should resend my previous email with the
log file of "xfs_repair -n".

Thank you for your time,
Dragos

David Chinner wrote:
> On Thu, Dec 06, 2007 at 07:39:28PM +0300, Michael Tokarev wrote:
>
>> What to do is to give repairfs a try for each permutation,
>> but again without letting it to actually fix anything.
>> Just run it in read-only mode and see which combination
>> of drives gives less errors, or no fatal errors (there
>> may be several similar combinations, with the same order
>> of drives but with different drive "missing").
>>
>
> Ugggh.
>
>
>> It's sad that xfs refuses mount when "structure needs
>> cleaning" - the best way here is to actually mount it
>> and see how it looks like, instead of trying repair
>> tools.
>>
>
> It's self protection - if you try to write to a corrupted filesystem,
> you'll only make the corruption worse. Mounting involves log
> recovery, which writes to the filesystem....
>
>
>> Is there some option to force-mount it still
>> (in readonly mode, knowing it may OOPs kernel etc)?
>>
>
> Sure you can: mount -o ro,norecovery
>
> But if you hit corruption it will still shut down on you. If
> the machine oopses then that is a bug.
>
>
>> thread prompted me to think. If I can't force-mount it
>> (or browse it using other ways) as I can almost always
>> do with (somewhat?) broken ext[23] just to examine things,
>> maybe I'm trying it before it's mature enough? ;)
>>
>
> Hehe ;)
>
> For maximum uber-XFS-guru points, learn to browse your filesystem
> with xfs_db. :P
>
> Cheers,
>
> Dave.
>

From owner-xfs@oss.sgi.com Sun Feb 3 14:05:26 2008
Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 14:05:33 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m13M5N84021713 for ; Sun, 3 Feb 2008 14:05:26 -0800
X-ASG-Debug-ID: 1202076345-4a3900570000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from postoffice.aconex.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6F006D7ADE8 for ; Sun, 3 Feb 2008 14:05:46 -0800 (PST)
Received: from postoffice.aconex.com (prod.aconex.com [203.89.192.138]) by cuda.sgi.com with ESMTP id bttLAv3VbP1RUvbZ for ; Sun, 03 Feb 2008 14:05:46 -0800 (PST)
Received: from edge.scott.net.au (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id E111792D3EE; Mon, 4 Feb 2008 09:05:43 +1100 (EST)
X-ASG-Orig-Subj: Re: NFSD on XFS with RT subvolume
Subject: Re: NFSD on XFS with RT subvolume
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Rabeeh Khoury
Cc: linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
In-Reply-To:
References:
Content-Type: text/plain
Organization: Aconex
Date: Mon, 04 Feb 2008 09:05:43 +1100
Message-Id: <1202076343.9463.465.camel@edge.scott.net.au>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.3
Content-Transfer-Encoding: 7bit
X-Barracuda-Connect: prod.aconex.com[203.89.192.138]
X-Barracuda-Start-Time: 1202076346
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -2.02
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=
X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41290
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
X-Virus-Scanned: ClamAV 0.91.2/5673/Sun Feb 3 13:06:20 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 14316
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: nscott@aconex.com
Precedence: bulk
X-list: xfs

On Wed, 2008-01-30 at 16:37 +0200, Rabeeh Khoury wrote:
> Hi All,
>
> Exporting an XFS volume with kernel NFSD when real-time subvolume is
> enabled hangs the kernel.
>
> I'm using vanilla LK 2.6.22.7; first I create the XFS volume with two
> partitions of 20GB each with extent size of 1MB; then I create a
> subdirectory in the volume and mark it (using xfs_io util) as it belongs
> to the rt subvolume with inheritance flag.
>
> After mounting that volume through NFSv3 / UDP; and trying a 'dd
> if=/dev/zero of=/mnt/rt/test bs=1M count=1000' the machine running NFSD
> hangs infinitely.

Did you manage to get a stack trace, OOC? No reason why it
shouldn't work AFAIK.

cheers.
-- 
Nathan

From owner-xfs@oss.sgi.com Sun Feb 3 14:13:32 2008
Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 14:13:34 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_23 autolearn=no version=3.3.0-r574664
Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m13MDPnu022413 for ; Sun, 3 Feb 2008 14:13:30 -0800
Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA08295; Mon, 4 Feb 2008 09:13:36 +1100
Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m13MDWLF49855544; Mon, 4 Feb 2008 09:13:34 +1100 (AEDT)
Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m13MDQ4449837659; Mon, 4 Feb 2008 09:13:26 +1100 (AEDT)
X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f
Date: Mon, 4 Feb 2008 09:13:26 +1100
From: David Chinner
To: Peter Zijlstra
Cc: Sven Geggus , linux-kernel@vger.kernel.org, David Chinner , xfs@oss.sgi.com, hch@lst.de
Subject: Re: XFS oops in vanilla 2.6.24
Message-ID: <20080203221326.GV155407@sgi.com>
References: <1201778591.28547.291.camel@lappy>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1201778591.28547.291.camel@lappy>
User-Agent: Mutt/1.4.2.1i
X-Virus-Scanned: ClamAV 0.91.2/5673/Sun Feb 3 13:06:20 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 14317
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dgc@sgi.com
Precedence: bulk
X-list: xfs

On Thu, Jan 31, 2008 at 12:23:11PM +0100, Peter Zijlstra wrote:
> Let's CC the XFS maintainer..

Adding the xfs list and hch. It might be a couple of days before
I get to this - I've got a week of backlog to catch up on after LCA....

> On Wed, 2008-01-30 at 20:23 +0000, Sven Geggus wrote:
> > Hi there,
> >
> > I get the following with 2.6.24:
> >
> > Ending clean XFS mount for filesystem: dm-0
> > BUG: unable to handle kernel paging request at virtual address f2134000

How long after mount does this happen? Does it happen when listing a
specific directory? i.e. do you have a reproducible test case for it?

Cheers,

Dave.

> > printing eip: c021a13a *pde = 010b5067 *pte = 32134000
> > Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
> > Modules linked in: radeon drm rfcomm l2cap sym53c8xx scsi_transport_spi snd_via82xx 8139too snd_mpu401_uart snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc via_agp agpgart
> >
> > Pid: 3889, comm: bash Not tainted (2.6.24 #3)
> > EIP: 0060:[] EFLAGS: 00010282 CPU: 0
> > EIP is at xfs_file_readdir+0xfa/0x18c
> > EAX: 00000000 EBX: 000002f5 ECX: 00000020 EDX: 00000000
> > ESI: 00000000 EDI: f2133ff8 EBP: f227ff68 ESP: f227ff10
> > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> > Process bash (pid: 3889, ti=f227e000 task=f7205a80 task.ti=f227e000)
> > Stack: 000002f5 00000000 2c000137 00000000 00000000 c0165358 f227ff94 f221c810
> >        f2d85e48 00000000 00000000 00000000 000002f5 00000000 f2133000 00001000
> >        00000ff8 000002f9 00000000 c0421c80 f221c810 f1cdbe48 f227ff88 c0165543
> > Call Trace:
> > [] show_trace_log_lvl+0x1a/0x2f
> > [] show_stack_log_lvl+0x9b/0xa3
> > [] show_registers+0xa0/0x1e2
> > [] die+0x10f/0x1dd
> > [] do_page_fault+0x43a/0x519
> > [] error_code+0x6a/0x70
> > [] vfs_readdir+0x5d/0x89
> > [] sys_getdents64+0x5e/0xa0
> > [] syscall_call+0x7/0xb
> > =======================
> > Code: 89 74 24 04 81 e3 ff ff ff 7f 89 1c 24 ff 55 bc 85 c0 0f 85 82 00 00 00 8b 4f 10 31 d2 83 c1 1f 83 e1 f8 29 4d d0 19 55 d4 01 cf <8b> 57 08 8b 4f 0c 89 55 d8 89 4d dc 83 7d
d4 00 7f a1 7c 06 83 > > EIP: [] xfs_file_readdir+0xfa/0x18c SS:ESP 0068:f227ff10 > > ---[ end trace e518e1370efb695e ]--- > > > > Sven > > -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Feb 3 14:25:13 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 14:25:16 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m13MP9WE023401 for ; Sun, 3 Feb 2008 14:25:12 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA08613; Mon, 4 Feb 2008 09:25:27 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m13MPQLF49872564; Mon, 4 Feb 2008 09:25:27 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m13MPPPo49847381; Mon, 4 Feb 2008 09:25:25 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Mon, 4 Feb 2008 09:25:25 +1100 From: David Chinner To: Sven Geggus Cc: xfs@oss.sgi.com Subject: Re: XFS oops in vanilla 2.6.24 Message-ID: <20080203222525.GX155407@sgi.com> References: <1201778591.28547.291.camel@lappy> <20080203221326.GV155407@sgi.com> <20080203222034.GA7460@diesel.geggus.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080203222034.GA7460@diesel.geggus.net> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5673/Sun Feb 3 13:06:20 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14318 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Sun, Feb 03, 2008 at 11:20:34PM +0100, Sven Geggus wrote: > David Chinner wrote on Sunday, 3 February at 23:13: > > > How long after mount does this happen? Does it happen when listing a specific > > directory? i.e. do you have a reproducible test case for it? > > Well, it happens very soon after mounting the device. Typically when > doing ls or something trivial like this. Unfortunately this is not > really reproducible on demand, but it does happen reliably on one specific > device here sooner or later when booting into 2.6.24. Ok. Can you use xfs_metadump to grab an image of the filesystem that you are using to reproduce this and put it up somewhere on the web so I can try to reproduce it locally? > I ran xfs_repair just to make sure that the filesystem is not > corrupted. Works fine when booting 2.6.23.8 which I have been using > before. Good - that saves me from having to ask. ;) Cheers, Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Feb 3 14:52:08 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 14:52:11 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m13Mq5SE025022 for ; Sun, 3 Feb 2008 14:52:07 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA09270; Mon, 4 Feb 2008 09:52:22 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m13MqLLF49594618; Mon, 4 Feb 2008 09:52:22 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m13MqI0949831254; Mon, 4 Feb 2008 09:52:18 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Mon, 4 Feb 2008 09:52:18 +1100 From: David Chinner To: Michael Nishimoto Cc: XFS Mailing List Subject: Re: Extent merging past MAXEXTLEN Message-ID: <20080203225218.GY155407@sgi.com> References: <479E4A52.6000804@agami.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <479E4A52.6000804@agami.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5673/Sun Feb 3 13:06:20 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14319 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Mon, Jan 28, 2008 at 01:34:10PM -0800, Michael Nishimoto wrote: > Hi everyone, > > Is there a reason to continue limiting 
extent merges past MAXEXTLEN > on 64-bit systems? I don't think 32/64bit issues enter into this - the max extent length is determined by the filesystem block size (the on disk length is in filesystem blocks) so by default we are already at byte lengths greater than what fits in a 32bit variable (i.e. 2^21*2^12 = 2^33). Basically, MAXEXTLEN defines the number of bits in the length field of the on-disk extent record when it is packed and so without a on-disk format change, we can't sanely increase the size of this field. Perhaps we should look at doing this, but it's going to have to wait until the btree re-factoring code is completed so we can implement the record format change sanely.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Feb 3 20:30:58 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 20:31:06 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m144UtRR022140 for ; Sun, 3 Feb 2008 20:30:58 -0800 X-ASG-Debug-ID: 1202099477-7afd002c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4E19059798C for ; Sun, 3 Feb 2008 20:31:17 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id ECr9N5X038xk7C1Y for ; Sun, 03 Feb 2008 20:31:17 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3FB8418DAFEF1 for ; Sun, 3 Feb 2008 22:30:45 -0600 (CST) Message-ID: <47A694F3.9010307@sandeen.net> Date: Sun, 
03 Feb 2008 22:30:43 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: xfs-oss X-ASG-Orig-Subj: unpushed 4-month-old mods? Subject: unpushed 4-month-old mods? Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202099478 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41314 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14320 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs At least these three mods which I did back in September to get Fedora 8 / 2.6.23 into shape on 4k stacks, and a bugfix, are still not pushed to kernel.org, and are missing in 2.6.24... Is there any reason for the holdup? Makes me wonder what else isn't pushed... ------------- Refactor xfs_mountfs Refactoring xfs_mountfs() to call sub-functions for logical chunks can help save a bit of stack, and can make it easier to read this long function. The mount path is one of the longest common callchains, easily getting to within a few bytes of the end of a 4k stack when over lvm, quotas are enabled, and quotacheck must be done. With this change on top of the other stack-related changes I've sent, I can get xfs to survive a normal xfsqa run on 4k stacks over lvm. Signed-off-by: Eric Sandeen Merge of xfs-linux-melb:xfs-kern:29834a by kenmcd. 
Refactor xfs_mountfs

and:

optimize XFS_IS_REALTIME_INODE w/o realtime config

Use XFS_IS_REALTIME_INODE in more places, and #define it to 0 if CONFIG_XFS_RT is off. This should be safe because mount checks in xfs_rtmount_init:

# define xfs_rtmount_init(m) (((mp)->m_sb.sb_rblocks == 0)? 0 : (ENOSYS))

so if we get mounted w/o CONFIG_XFS_RT, no realtime inodes should be encountered after that.

Defining XFS_IS_REALTIME_INODE to 0 saves a bit of stack space; presumably gcc can optimize around the various "if (0)" type checks:

xfs_alloc_file_space -8
xfs_bmap_adjacent -16
xfs_bmapi -8
xfs_bmap_rtalloc -16
xfs_bunmapi -28
xfs_free_file_space -64
xfs_imap +8 <-- ? hmm.
xfs_iomap_write_direct -12
xfs_qm_dqusage_adjust -4
xfs_qm_vop_chown_reserve -4

Signed-off-by: Eric Sandeen

Merge of xfs-linux-melb:xfs-kern:30014a by kenmcd.

Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fix 32-bit compat ioctls for GETXFLAGS, SETXFLAGS, GETVERSION

XFS_IOC_GETVERSION, XFS_IOC_GETXFLAGS and XFS_IOC_SETXFLAGS all take a "long" which changes size between 32 and 64 bit platforms. So, the ioctl cmds that come in from a 32-bit app aren't as expected, for example on GETXFLAGS,

unknown cmd fd(3) cmd(80046601){t:'f';sz:4}

due to the size mismatch. So, use instead the 32-bit version of the commands for compat ioctls, and other than that it doesn't take any more manipulation. Also, for both native and compat versions, just define them to the values as defined in fs.h

Signed-off-by: Eric Sandeen

Merge of xfs-linux-melb:xfs-kern:29849a by kenmcd.
fix 32-bit compat ioctls for GETXFLAGS, SETXFLAGS, GETVERSION From owner-xfs@oss.sgi.com Sun Feb 3 20:46:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 20:46:07 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m144jv9V023120 for ; Sun, 3 Feb 2008 20:46:03 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA16110; Mon, 4 Feb 2008 15:46:14 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m144kCLF50026827; Mon, 4 Feb 2008 15:46:13 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m144kBDW50060722; Mon, 4 Feb 2008 15:46:11 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Mon, 4 Feb 2008 15:46:11 +1100 From: David Chinner To: Eric Sandeen Cc: xfs-oss Subject: Re: unpushed 4-month-old mods? 
Message-ID: <20080204044611.GF155407@sgi.com> References: <47A694F3.9010307@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47A694F3.9010307@sandeen.net> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14321 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Sun, Feb 03, 2008 at 10:30:43PM -0600, Eric Sandeen wrote: > At least these three mods which I did back in September to get Fedora 8 > / 2.6.23 into shape on 4k stacks, and a bugfix, are still not pushed to > kernel.org, and are missing in 2.6.24... > > Is there any reason for the holdup? Makes me wonder what else isn't > pushed... The holdup is that we drew a line in the sand for the 2.6.24 before 2.6.23 was released. We did this because of the massive amount of invasive change we already had queued up for 2.6.24. All those mods that got held up will be pushed into 2.6.25 release. Given the problems with the readdir changes (which still isn't 100% right given we've had at least one bug report of a crash in the readdir code post 2.6.24 release), we couldn't have handled much more change in that release..... Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Feb 3 20:50:08 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 20:50:11 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m144o4qf023528 for ; Sun, 3 Feb 2008 20:50:08 -0800 X-ASG-Debug-ID: 1202100626-0424013d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4EB42596874 for ; Sun, 3 Feb 2008 20:50:27 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id pfkDJEY8p4t4HI2O for ; Sun, 03 Feb 2008 20:50:27 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C267218DAFE3B; Sun, 3 Feb 2008 22:50:25 -0600 (CST) Message-ID: <47A69990.3030501@sandeen.net> Date: Sun, 03 Feb 2008 22:50:24 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: David Chinner CC: xfs-oss X-ASG-Orig-Subj: Re: unpushed 4-month-old mods? Subject: Re: unpushed 4-month-old mods? 
References: <47A694F3.9010307@sandeen.net> <20080204044611.GF155407@sgi.com> In-Reply-To: <20080204044611.GF155407@sgi.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202100627 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41316 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14322 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs David Chinner wrote: > On Sun, Feb 03, 2008 at 10:30:43PM -0600, Eric Sandeen wrote: >> At least these three mods which I did back in September to get Fedora 8 >> / 2.6.23 into shape on 4k stacks, and a bugfix, are still not pushed to >> kernel.org, and are missing in 2.6.24... >> >> Is there any reason for the holdup? Makes me wonder what else isn't >> pushed... > > The holdup is that we drew a line in the sand for the 2.6.24 before > 2.6.23 was released. We did this because of the massive amount of invasive > change we already had queued up for 2.6.24. All those mods that got > held up will be pushed into 2.6.25 release. > > Given the problems with the readdir changes (which still isn't 100% right > given we've had at least one bug report of a crash in the readdir code > post 2.6.24 release), we couldn't have handled much more change in that > release..... 
Ok, as long as you guys know what's up :) I just spent the evening sorting out why Fedora 9 was blowing up on install, when Fedora 8 was working so well... :( -Eric From owner-xfs@oss.sgi.com Sun Feb 3 22:28:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 22:28:25 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_63,J_CHICKENPOX_64 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m146SDHl029501 for ; Sun, 3 Feb 2008 22:28:16 -0800 X-ASG-Debug-ID: 1202106508-6eaa00310000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mta4.srv.hcvlny.cv.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 799F5D7F368 for ; Sun, 3 Feb 2008 22:28:28 -0800 (PST) Received: from mta4.srv.hcvlny.cv.net (mta4.srv.hcvlny.cv.net [167.206.4.199]) by cuda.sgi.com with ESMTP id f6Qiiru6zLGtB5B4 for ; Sun, 03 Feb 2008 22:28:28 -0800 (PST) Received: from freyr.home (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta4.srv.hcvlny.cv.net (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) with ESMTP id <0JVP00905BBC7T21@mta4.srv.hcvlny.cv.net> for xfs@oss.sgi.com; Mon, 04 Feb 2008 01:28:27 -0500 (EST) Received: by freyr.home (Postfix, from userid 1000) id 2591A800BA3; Mon, 04 Feb 2008 01:28:09 -0500 (EST) Date: Mon, 04 Feb 2008 01:28:08 -0500 From: "Josef 'Jeff' Sipek" X-ASG-Orig-Subj: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head Subject: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head In-reply-to: <20080125070800.GH155407@sgi.com> To: dgc@sgi.com, xfs@oss.sgi.com, hch@infradead.org Cc: "Josef 'Jeff' Sipek" Message-id: <1202106488-31494-1-git-send-email-jeffpc@josefsipek.net> X-Mailer: 
git-send-email 1.5.4.rc2.85.g9de45-dirty Content-transfer-encoding: 7BIT References: <20080125070800.GH155407@sgi.com> X-Barracuda-Connect: mta4.srv.hcvlny.cv.net[167.206.4.199] X-Barracuda-Start-Time: 1202106515 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41321 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14323 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jeffpc@josefsipek.net Precedence: bulk X-list: xfs Signed-off-by: Josef 'Jeff' Sipek --- This patch assumes you have already applied Dave Chinner's patch for xfsidbg_xlogitem and xfsidbg_xaildump.
Changes since V1: - Pass around a pointer to the AIL, not the struct list_head - Make sure things compile & run with CONFIG_XFS_DEBUG --- fs/xfs/xfs_mount.h | 2 +- fs/xfs/xfs_trans.h | 7 +-- fs/xfs/xfs_trans_ail.c | 149 +++++++++++++++++++---------------------------- 3 files changed, 62 insertions(+), 96 deletions(-) diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index f7c620e..435d625 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -220,7 +220,7 @@ extern void xfs_icsb_sync_counters_flags(struct xfs_mount *, int); #endif typedef struct xfs_ail { - xfs_ail_entry_t xa_ail; + struct list_head xa_ail; uint xa_gen; struct task_struct *xa_task; xfs_lsn_t xa_target; diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h index 7f40628..50ce02b 100644 --- a/fs/xfs/xfs_trans.h +++ b/fs/xfs/xfs_trans.h @@ -113,13 +113,8 @@ struct xfs_mount; struct xfs_trans; struct xfs_dquot_acct; -typedef struct xfs_ail_entry { - struct xfs_log_item *ail_forw; /* AIL forw pointer */ - struct xfs_log_item *ail_back; /* AIL back pointer */ -} xfs_ail_entry_t; - typedef struct xfs_log_item { - xfs_ail_entry_t li_ail; /* AIL pointers */ + struct list_head li_ail; /* AIL pointers */ xfs_lsn_t li_lsn; /* last on-disk lsn */ struct xfs_log_item_desc *li_desc; /* ptr to current desc*/ struct xfs_mount *li_mountp; /* ptr to fs mount */ diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 4d6330e..8b3fd60 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -28,13 +28,13 @@ #include "xfs_trans_priv.h" #include "xfs_error.h" -STATIC void xfs_ail_insert(xfs_ail_entry_t *, xfs_log_item_t *); -STATIC xfs_log_item_t * xfs_ail_delete(xfs_ail_entry_t *, xfs_log_item_t *); -STATIC xfs_log_item_t * xfs_ail_min(xfs_ail_entry_t *); -STATIC xfs_log_item_t * xfs_ail_next(xfs_ail_entry_t *, xfs_log_item_t *); +STATIC void xfs_ail_insert(xfs_ail_t *, xfs_log_item_t *); +STATIC xfs_log_item_t * xfs_ail_delete(xfs_ail_t *, xfs_log_item_t *); +STATIC xfs_log_item_t * 
xfs_ail_min(xfs_ail_t *); +STATIC xfs_log_item_t * xfs_ail_next(xfs_ail_t *, xfs_log_item_t *); #ifdef DEBUG -STATIC void xfs_ail_check(xfs_ail_entry_t *, xfs_log_item_t *); +STATIC void xfs_ail_check(xfs_ail_t *, xfs_log_item_t *); #else #define xfs_ail_check(a,l) #endif /* DEBUG */ @@ -57,7 +57,7 @@ xfs_trans_tail_ail( xfs_log_item_t *lip; spin_lock(&mp->m_ail_lock); - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + lip = xfs_ail_min(&(mp->m_ail)); if (lip == NULL) { lsn = (xfs_lsn_t)0; } else { @@ -91,7 +91,7 @@ xfs_trans_push_ail( { xfs_log_item_t *lip; - lip = xfs_ail_min(&mp->m_ail.xa_ail); + lip = xfs_ail_min(&mp->m_ail); if (lip && !XFS_FORCED_SHUTDOWN(mp)) { if (XFS_LSN_CMP(threshold_lsn, mp->m_ail.xa_target) > 0) xfsaild_wakeup(mp, threshold_lsn); @@ -111,15 +111,17 @@ xfs_trans_first_push_ail( { xfs_log_item_t *lip; - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + lip = xfs_ail_min(&(mp->m_ail)); *gen = (int)mp->m_ail.xa_gen; if (lsn == 0) return lip; - while (lip && (XFS_LSN_CMP(lip->li_lsn, lsn) < 0)) - lip = lip->li_ail.ail_forw; + list_for_each_entry(lip, &mp->m_ail.xa_ail, li_ail) { + if (XFS_LSN_CMP(lip->li_lsn, lsn) >= 0) + return lip; + } - return lip; + return NULL; } /* @@ -326,7 +328,7 @@ xfs_trans_unlocked_item( * the call to xfs_log_move_tail() doesn't do anything if there's * not enough free space to wake people up so we're safe calling it. 
*/ - min_lip = xfs_ail_min(&mp->m_ail.xa_ail); + min_lip = xfs_ail_min(&mp->m_ail); if (min_lip == lip) xfs_log_move_tail(mp, 1); @@ -354,15 +356,13 @@ xfs_trans_update_ail( xfs_log_item_t *lip, xfs_lsn_t lsn) __releases(mp->m_ail_lock) { - xfs_ail_entry_t *ailp; xfs_log_item_t *dlip=NULL; xfs_log_item_t *mlip; /* ptr to minimum lip */ - ailp = &(mp->m_ail.xa_ail); - mlip = xfs_ail_min(ailp); + mlip = xfs_ail_min(&mp->m_ail); if (lip->li_flags & XFS_LI_IN_AIL) { - dlip = xfs_ail_delete(ailp, lip); + dlip = xfs_ail_delete(&mp->m_ail, lip); ASSERT(dlip == lip); } else { lip->li_flags |= XFS_LI_IN_AIL; @@ -370,11 +370,11 @@ xfs_trans_update_ail( lip->li_lsn = lsn; - xfs_ail_insert(ailp, lip); + xfs_ail_insert(&mp->m_ail, lip); mp->m_ail.xa_gen++; if (mlip == dlip) { - mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); + mlip = xfs_ail_min(&mp->m_ail); spin_unlock(&mp->m_ail_lock); xfs_log_move_tail(mp, mlip->li_lsn); } else { @@ -404,14 +404,12 @@ xfs_trans_delete_ail( xfs_mount_t *mp, xfs_log_item_t *lip) __releases(mp->m_ail_lock) { - xfs_ail_entry_t *ailp; xfs_log_item_t *dlip; xfs_log_item_t *mlip; if (lip->li_flags & XFS_LI_IN_AIL) { - ailp = &(mp->m_ail.xa_ail); - mlip = xfs_ail_min(ailp); - dlip = xfs_ail_delete(ailp, lip); + mlip = xfs_ail_min(&mp->m_ail); + dlip = xfs_ail_delete(&mp->m_ail, lip); ASSERT(dlip == lip); @@ -420,7 +418,7 @@ xfs_trans_delete_ail( mp->m_ail.xa_gen++; if (mlip == dlip) { - mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); + mlip = xfs_ail_min(&(mp->m_ail)); spin_unlock(&mp->m_ail_lock); xfs_log_move_tail(mp, (mlip ? 
mlip->li_lsn : 0)); } else { @@ -458,7 +456,7 @@ xfs_trans_first_ail( { xfs_log_item_t *lip; - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + lip = xfs_ail_min(&(mp->m_ail)); *gen = (int)mp->m_ail.xa_gen; return lip; @@ -482,9 +480,9 @@ xfs_trans_next_ail( ASSERT(mp && lip && gen); if (mp->m_ail.xa_gen == *gen) { - nlip = xfs_ail_next(&(mp->m_ail.xa_ail), lip); + nlip = xfs_ail_next(&(mp->m_ail), lip); } else { - nlip = xfs_ail_min(&(mp->m_ail).xa_ail); + nlip = xfs_ail_min(&(mp->m_ail)); *gen = (int)mp->m_ail.xa_gen; if (restarts != NULL) { XFS_STATS_INC(xs_push_ail_restarts); @@ -514,8 +512,7 @@ int xfs_trans_ail_init( xfs_mount_t *mp) { - mp->m_ail.xa_ail.ail_forw = (xfs_log_item_t*)&mp->m_ail.xa_ail; - mp->m_ail.xa_ail.ail_back = (xfs_log_item_t*)&mp->m_ail.xa_ail; + INIT_LIST_HEAD(&mp->m_ail.xa_ail); return xfsaild_start(mp); } @@ -534,7 +531,7 @@ xfs_trans_ail_destroy( */ STATIC void xfs_ail_insert( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) /* ARGSUSED */ { @@ -543,27 +540,22 @@ xfs_ail_insert( /* * If the list is empty, just insert the item. 
*/ - if (base->ail_back == (xfs_log_item_t*)base) { - base->ail_forw = lip; - base->ail_back = lip; - lip->li_ail.ail_forw = (xfs_log_item_t*)base; - lip->li_ail.ail_back = (xfs_log_item_t*)base; + if (list_empty(&ailp->xa_ail)) { + list_add(&lip->li_ail, &ailp->xa_ail); return; } - next_lip = base->ail_back; - while ((next_lip != (xfs_log_item_t*)base) && - (XFS_LSN_CMP(next_lip->li_lsn, lip->li_lsn) > 0)) { - next_lip = next_lip->li_ail.ail_back; + list_for_each_entry_reverse(next_lip, &ailp->xa_ail, li_ail) { + if (XFS_LSN_CMP(next_lip->li_lsn, lip->li_lsn) <= 0) + break; } - ASSERT((next_lip == (xfs_log_item_t*)base) || + + ASSERT((&next_lip->li_ail == &ailp->xa_ail) || (XFS_LSN_CMP(next_lip->li_lsn, lip->li_lsn) <= 0)); - lip->li_ail.ail_forw = next_lip->li_ail.ail_forw; - lip->li_ail.ail_back = next_lip; - next_lip->li_ail.ail_forw = lip; - lip->li_ail.ail_forw->li_ail.ail_back = lip; - xfs_ail_check(base, lip); + list_add(&lip->li_ail, &next_lip->li_ail); + + xfs_ail_check(ailp, lip); return; } @@ -573,15 +565,13 @@ xfs_ail_insert( /*ARGSUSED*/ STATIC xfs_log_item_t * xfs_ail_delete( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) /* ARGSUSED */ { - xfs_ail_check(base, lip); - lip->li_ail.ail_forw->li_ail.ail_back = lip->li_ail.ail_back; - lip->li_ail.ail_back->li_ail.ail_forw = lip->li_ail.ail_forw; - lip->li_ail.ail_forw = NULL; - lip->li_ail.ail_back = NULL; + xfs_ail_check(ailp, lip); + + list_del(&lip->li_ail); return lip; } @@ -592,14 +582,13 @@ xfs_ail_delete( */ STATIC xfs_log_item_t * xfs_ail_min( - xfs_ail_entry_t *base) + xfs_ail_t *ailp) /* ARGSUSED */ { - register xfs_log_item_t *forw = base->ail_forw; - if (forw == (xfs_log_item_t*)base) { + if (list_empty(&ailp->xa_ail)) return NULL; - } - return forw; + + return list_first_entry(&ailp->xa_ail, xfs_log_item_t, li_ail); } /* @@ -609,15 +598,14 @@ xfs_ail_min( */ STATIC xfs_log_item_t * xfs_ail_next( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) /* ARGSUSED */ 
{ - if (lip->li_ail.ail_forw == (xfs_log_item_t*)base) { + if (lip->li_ail.next == &ailp->xa_ail) return NULL; - } - return lip->li_ail.ail_forw; + return list_first_entry(&lip->li_ail, xfs_log_item_t, li_ail); } #ifdef DEBUG @@ -626,57 +614,40 @@ xfs_ail_next( */ STATIC void xfs_ail_check( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) { xfs_log_item_t *prev_lip; - prev_lip = base->ail_forw; - if (prev_lip == (xfs_log_item_t*)base) { - /* - * Make sure the pointers are correct when the list - * is empty. - */ - ASSERT(base->ail_back == (xfs_log_item_t*)base); + if (list_empty(&ailp->xa_ail)) return; - } /* * Check the next and previous entries are valid. */ ASSERT((lip->li_flags & XFS_LI_IN_AIL) != 0); - prev_lip = lip->li_ail.ail_back; - if (prev_lip != (xfs_log_item_t*)base) { - ASSERT(prev_lip->li_ail.ail_forw == lip); + prev_lip = list_entry(lip->li_ail.prev, xfs_log_item_t, li_ail); + if (&prev_lip->li_ail != &ailp->xa_ail) ASSERT(XFS_LSN_CMP(prev_lip->li_lsn, lip->li_lsn) <= 0); - } - prev_lip = lip->li_ail.ail_forw; - if (prev_lip != (xfs_log_item_t*)base) { - ASSERT(prev_lip->li_ail.ail_back == lip); + + prev_lip = list_entry(lip->li_ail.next, xfs_log_item_t, li_ail); + if (&prev_lip->li_ail != &ailp->xa_ail) ASSERT(XFS_LSN_CMP(prev_lip->li_lsn, lip->li_lsn) >= 0); - } #ifdef XFS_TRANS_DEBUG /* - * Walk the list checking forward and backward pointers, - * lsn ordering, and that every entry has the XFS_LI_IN_AIL - * flag set. This is really expensive, so only do it when - * specifically debugging the transaction subsystem. + * Walk the list checking lsn ordering, and that every entry has the + * XFS_LI_IN_AIL flag set. This is really expensive, so only do it + * when specifically debugging the transaction subsystem. 
*/ - prev_lip = (xfs_log_item_t*)base; - while (lip != (xfs_log_item_t*)base) { - if (prev_lip != (xfs_log_item_t*)base) { - ASSERT(prev_lip->li_ail.ail_forw == lip); + prev_lip = list_entry(&ailp->xa_ail, xfs_log_item_t, li_ail); + list_for_each_entry(lip, &ailp->xa_ail, li_ail) { + if (&prev_lip->li_ail != &ailp->xa_ail) ASSERT(XFS_LSN_CMP(prev_lip->li_lsn, lip->li_lsn) <= 0); - } - ASSERT(lip->li_ail.ail_back == prev_lip); ASSERT((lip->li_flags & XFS_LI_IN_AIL) != 0); prev_lip = lip; - lip = lip->li_ail.ail_forw; } - ASSERT(lip == (xfs_log_item_t*)base); - ASSERT(base->ail_back == prev_lip); #endif /* XFS_TRANS_DEBUG */ } #endif /* DEBUG */ -- 1.5.4.rc2.85.g9de45-dirty From owner-xfs@oss.sgi.com Sun Feb 3 22:33:17 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Feb 2008 22:33:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.2 required=5.0 tests=BAYES_40 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m146XFoD030023 for ; Sun, 3 Feb 2008 22:33:17 -0800 X-ASG-Debug-ID: 1202106670-6ed8003e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mta5.srv.hcvlny.cv.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 71D22D7F381 for ; Sun, 3 Feb 2008 22:31:10 -0800 (PST) Received: from mta5.srv.hcvlny.cv.net (mta5.srv.hcvlny.cv.net [167.206.4.200]) by cuda.sgi.com with ESMTP id FHIFYBDRX231KuEi for ; Sun, 03 Feb 2008 22:31:10 -0800 (PST) Received: from freyr.home (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta5.srv.hcvlny.cv.net (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) with ESMTP id <0JVP005HQBFVJFW0@mta5.srv.hcvlny.cv.net> for xfs@oss.sgi.com; Mon, 04 Feb 2008 01:31:07 -0500 (EST) Received: by freyr.home (Postfix, from userid 1000) id 3F4E5800BA3; Mon, 04 Feb 2008 01:30:52 -0500 
(EST) Date: Mon, 04 Feb 2008 01:30:52 -0500 (EST) From: jeffpc@home.josefsipek.net (Jeff) X-ASG-Orig-Subj: XFS compile error Subject: XFS compile error To: xfs@oss.sgi.com Cc: jeffpc@josefsipek.net Message-id: <20080204063052.3F4E5800BA3@freyr.home> MIME-version: 1.0 Content-type: TEXT/PLAIN Content-transfer-encoding: 8BIT X-Barracuda-Connect: mta5.srv.hcvlny.cv.net[167.206.4.200] X-Barracuda-Start-Time: 1202106670 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41321 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14324 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jeffpc@home.josefsipek.net Precedence: bulk X-list: xfs Enabling CONFIG_XFS_DEBUG as well as XFS_TRANS_DEBUG (by editing the makefile) results in a compile error: $ make CHK include/linux/version.h CHK include/linux/utsrelease.h CALL scripts/checksyscalls.sh CHK include/linux/compile.h CC [M] fs/xfs/xfs_rtalloc.o CC [M] fs/xfs/xfs_acl.o CC [M] fs/xfs/linux-2.6/xfs_stats.o CC [M] fs/xfs/linux-2.6/xfs_sysctl.o CC [M] fs/xfs/xfs_alloc.o CC [M] fs/xfs/xfs_alloc_btree.o CC [M] fs/xfs/xfs_attr.o CC [M] fs/xfs/xfs_attr_leaf.o CC [M] fs/xfs/xfs_bit.o CC [M] fs/xfs/xfs_bmap.o CC [M] fs/xfs/xfs_bmap_btree.o CC [M] fs/xfs/xfs_btree.o CC [M] fs/xfs/xfs_buf_item.o fs/xfs/xfs_buf_item.c: In function ‘xfs_buf_item_log_debug’: fs/xfs/xfs_buf_item.c:64: error: implicit declaration of function ‘bfset’ fs/xfs/xfs_buf_item.c: In function ‘xfs_buf_item_log_check’: 
fs/xfs/xfs_buf_item.c:127: error: implicit declaration of function ‘btst’ fs/xfs/xfs_buf_item.c:130: warning: format ‘%x’ expects type ‘unsigned int’, but argument 3 has type ‘struct xfs_buf_log_item_t *’ fs/xfs/xfs_buf_item.c:130: warning: format ‘%x’ expects type ‘unsigned int’, but argument 4 has type ‘struct xfs_buf_t *’ fs/xfs/xfs_buf_item.c:130: warning: format ‘%x’ expects type ‘unsigned int’, but argument 5 has type ‘char *’ make[2]: *** [fs/xfs/xfs_buf_item.o] Error 1 make[1]: *** [fs/xfs] Error 2 make: *** [fs] Error 2 From owner-xfs@oss.sgi.com Mon Feb 4 06:14:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 06:14:50 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_47 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14EEhq0030047 for ; Mon, 4 Feb 2008 06:14:47 -0800 X-ASG-Debug-ID: 1202134226-782303810000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from lucidpixels.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9E6735998AA for ; Mon, 4 Feb 2008 06:10:26 -0800 (PST) Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by cuda.sgi.com with ESMTP id yai2tMuDXcE0xJHQ for ; Mon, 04 Feb 2008 06:10:26 -0800 (PST) Received: by lucidpixels.com (Postfix, from userid 1001) id 2D9581C000230; Mon, 4 Feb 2008 09:09:53 -0500 (EST) Date: Mon, 4 Feb 2008 09:09:53 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: Michael Tokarev cc: Moshe Yudkowsky , linux-raid@vger.kernel.org, xfs@oss.sgi.com, sandeen@sandeen.net X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power 
hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) In-Reply-To: <47A7188A.4070005@msgid.tls.msk.ru> Message-ID: References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> User-Agent: Alpine 1.00 (DEB 882 2007-12-20) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Barracuda-Connect: lucidpixels.com[75.144.35.66] X-Barracuda-Start-Time: 1202134226 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41350 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14325 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs On Mon, 4 Feb 2008, Michael Tokarev wrote: > Moshe Yudkowsky wrote: > [] >> If I'm reading the man pages, Wikis, READMEs and mailing lists correctly >> -- not necessarily the case -- the ext3 file system uses the equivalent >> of data=journal as a default. > > ext3 defaults to data=ordered, not data=journal. ext2 doesn't have > journal at all. 
> >> The question then becomes what data scheme to use with reiserfs on the > > I'd say don't use reiserfs in the first place ;) > >> Another way to phrase this: unless you're running data-center grade >> hardware and have absolute confidence in your UPS, you should use >> data=journal for reiserfs and perhaps avoid XFS entirely. > > By the way, even if you do have a good UPS, there should be some > control program for it, to properly shut down your system when > UPS loses the AC power. So far, I've seen no such programs... > > /mjt Why avoid XFS entirely? esandeen, any comments here? Justin. From owner-xfs@oss.sgi.com Mon Feb 4 06:25:54 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 06:26:03 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14EPrWK030777 for ; Mon, 4 Feb 2008 06:25:54 -0800 X-ASG-Debug-ID: 1202135171-4f6b002a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B5DD2D816A6 for ; Mon, 4 Feb 2008 06:26:12 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id to2W1dZ7BVrDaSZk for ; Mon, 04 Feb 2008 06:26:12 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 6DA4718DAFE3B; Mon, 4 Feb 2008 08:25:38 -0600 (CST) Message-ID: <47A72061.3010800@sandeen.net> Date: Mon, 04 Feb 2008 08:25:37 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Justin Piszcz CC: Michael Tokarev , Moshe Yudkowsky , 
linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202135175 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41351 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14326 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Justin Piszcz wrote: > Why avoid XFS entirely? > > esandeen, any comments here? Heh; well, it's the meme. see: http://oss.sgi.com/projects/xfs/faq.html#nulls and note that recent fixes have been made in this area (also noted in the faq) Also - the above all assumes that when a drive says it's written/flushed data, that it truly has. Modern write-caching drives can wreak havoc with any journaling filesystem, so that's one good reason for a UPS. 
If the drive claims to have metadata safe on disk but actually does not, and you lose power, the data claimed safe will evaporate, there's not much the fs can do. IO write barriers address this by forcing the drive to flush order-critical data before continuing; xfs has them on by default, although they are tested at mount time and if you have something in between xfs and the disks which does not support barriers (i.e. lvm...) then they are disabled again, with a notice in the logs. Note also that ext3 has the barrier option as well, but it is not enabled by default due to performance concerns. Barriers also affect xfs performance, but enabling them in the non-battery-backed-write-cache scenario is the right thing to do for filesystem integrity. -Eric > Justin. > From owner-xfs@oss.sgi.com Mon Feb 4 06:42:22 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 06:42:27 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14EgHrB031569 for ; Mon, 4 Feb 2008 06:42:22 -0800 X-ASG-Debug-ID: 1202136159-50b9016a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1BD75D814DD for ; Mon, 4 Feb 2008 06:42:39 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id 0H72o639FJS6AYDh for ; Mon, 04 Feb 2008 06:42:39 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id D302218DAFEF7; Mon, 4 Feb 2008 08:42:38 -0600 (CST) Message-ID: <47A7245E.4070207@sandeen.net> Date: Mon, 04 Feb 
2008 08:42:38 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Justin Piszcz CC: Michael Tokarev , Moshe Yudkowsky , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> In-Reply-To: <47A72061.3010800@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202136160 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41352 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5678/Sun Feb 3 17:15:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14327 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Eric Sandeen wrote: > Justin Piszcz wrote: > >> Why avoid XFS entirely? >> >> esandeen, any comments here? > > Heh; well, it's the meme. 
> > see: > > http://oss.sgi.com/projects/xfs/faq.html#nulls > > and note that recent fixes have been made in this area (also noted in > the faq) Actually, continue reading past that specific entry to the next several, it covers all this quite well. -Eric From owner-xfs@oss.sgi.com Mon Feb 4 07:31:06 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 07:31:13 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14FV2GE001140 for ; Mon, 4 Feb 2008 07:31:06 -0800 X-ASG-Debug-ID: 1202139081-205f02cc0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from smtp02.lnh.mail.rcn.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 45B8E59A228 for ; Mon, 4 Feb 2008 07:31:22 -0800 (PST) Received: from smtp02.lnh.mail.rcn.net (smtp02.lnh.mail.rcn.net [207.172.157.102]) by cuda.sgi.com with ESMTP id 4AlR8oIx7whUBTqT for ; Mon, 04 Feb 2008 07:31:22 -0800 (PST) Received: from mr08.lnh.mail.rcn.net ([207.172.157.28]) by smtp02.lnh.mail.rcn.net with ESMTP; 04 Feb 2008 10:31:16 -0500 Received: from smtp01.lnh.mail.rcn.net (smtp01.lnh.mail.rcn.net [207.172.4.11]) by mr08.lnh.mail.rcn.net (MOS 3.8.6-GA) with ESMTP id JQE20347; Mon, 4 Feb 2008 10:31:14 -0500 (EST) Received: from 207-229-180-107.arm-bsr1.chi-arm.il.cable.rcn.com (HELO [172.28.54.160]) ([207.229.180.107]) by smtp01.lnh.mail.rcn.net with ESMTP; 04 Feb 2008 10:30:09 -0500 Message-ID: <47A72FBC.9090701@pobox.com> Date: Mon, 04 Feb 2008 09:31:08 -0600 From: Moshe Yudkowsky Organization: The Institute User-Agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080110) MIME-Version: 1.0 To: Eric Sandeen CC: Justin Piszcz , Michael Tokarev , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: 
RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> In-Reply-To: <47A72061.3010800@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Junkmail-Status: score=10/50, host=mr08.lnh.mail.rcn.net X-Junkmail-SD-Raw: score=unknown, refid=str=0001.0A010208.47A72FC3.012E,ss=1,fgs=0, ip=207.172.4.11, so=2007-10-30 19:00:17, dmn=5.4.3/2007-11-16 X-Junkmail-IWF: false X-Barracuda-Connect: smtp02.lnh.mail.rcn.net[207.172.157.102] X-Barracuda-Start-Time: 1202139085 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41354 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5679/Mon Feb 4 06:57:26 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14328 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: moshe@pobox.com Precedence: bulk X-list: xfs Eric, Thanks very much for your note. I'm becoming very leery of reiserfs at the moment... I'm about to run another series of crash tests. Eric Sandeen wrote: > Justin Piszcz wrote: > >> Why avoid XFS entirely? >> >> esandeen, any comments here?
> > Heh; well, it's the meme. Well, yeah... > Note also that ext3 has the barrier option as well, but it is not > enabled by default due to performance concerns. Barriers also affect > xfs performance, but enabling them in the non-battery-backed-write-cache > scenario is the right thing to do for filesystem integrity. So if I understand you correctly, you're stating that current the most reliable fs in its default configuration, in terms of protection against power-loss scenarios, is XFS? -- Moshe Yudkowsky * moshe@pobox.com * www.pobox.com/~moshe "There is something fundamentally wrong with a country [USSR] where the citizens want to buy your underwear." -- Paul Thereaux From owner-xfs@oss.sgi.com Mon Feb 4 08:29:55 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 08:30:00 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,J_CHICKENPOX_23 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14GTo2P007893 for ; Mon, 4 Feb 2008 08:29:55 -0800 X-ASG-Debug-ID: 1202142612-591b025e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.emlix.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6A009D82B7D for ; Mon, 4 Feb 2008 08:30:12 -0800 (PST) Received: from mx1.emlix.com (mx1.emlix.com [193.175.82.87]) by cuda.sgi.com with ESMTP id c5McXkNVSUv1M6zp for ; Mon, 04 Feb 2008 08:30:12 -0800 (PST) Received: from gate.emlix.com ([193.175.27.217]:50835 helo=mailer.emlix.com) by mx1.emlix.com with esmtp (Exim 4.63) (envelope-from ) id 1JM4QM-0007MD-Ny for xfs@oss.sgi.com; Mon, 04 Feb 2008 17:44:02 +0100 Received: by mailer.emlix.com id 1JM4Cv-0007hi-OO; Mon, 04 Feb 2008 17:30:09 +0100 Received: by spinat.emlix.com (Postfix, from userid 2047) id 8CDEA7F452; Mon, 4 Feb 2008 17:30:09 
+0100 (CET) Date: Mon, 4 Feb 2008 17:30:09 +0100 From: Tobias Ulmer To: xfs@oss.sgi.com X-ASG-Orig-Subj: Another oops in xfs_file_readdir (vanilla linux 2.6.24+) Subject: Another oops in xfs_file_readdir (vanilla linux 2.6.24+) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.15+20070412 (2007-04-11) Message-Id: Organization: emlix gmbh, Goettingen, Germany X-Barracuda-Connect: mx1.emlix.com[193.175.82.87] X-Barracuda-Start-Time: 1202142613 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41358 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5679/Mon Feb 4 06:57:26 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14329 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tu@emlix.com Precedence: bulk X-list: xfs Hi, i've got a nice oops from a clean pull of linus tree on saturday (head is 24e1c13c93cbdd05e4b7ea921c0050b036555adc) BUG: unable to handle kernel paging request at f8000000 IP: [] xfs_file_readdir+0x157/0x1e0 *pde = 00000000 Oops: 0000 [#1] PREEMPT Modules linked in: Pid: 30823, comm: rm Not tainted (2.6.24toy #18) EIP: 0060:[] EFLAGS: 00010246 CPU: 0 EIP is at xfs_file_readdir+0x157/0x1e0 EAX: 00000000 EBX: 0000046b ECX: 00000028 EDX: 00000000 ESI: 00000000 EDI: f7fffff8 EBP: de28e480 ESP: e04cdf18 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Process rm (pid: 30823, ti=e04cc000 task=ccf96f60 task.ti=e04cc000) Stack: 0000046b 00000000 090485aa 00000000 00000000 c016c0e0 e04cdf94 f6da6280 00000000 00000000 00000000 0000046b 00000000 
f7fff000 00001000 00000ff8 0000046e 00000000 c03974c0 f6da6280 de28d3c0 c016c0e0 c016c321 e04cdf94 Call Trace: [] filldir64+0x0/0xe0 [] filldir64+0x0/0xe0 [] vfs_readdir+0x81/0xa0 [] sys_getdents64+0x73/0xd0 [] sysenter_past_esp+0x5f/0x85 ======================= Code: 81 e3 ff ff ff 7f 89 1c 24 ff 54 24 14 85 c0 75 51 8b 4f 10 31 d2 83 c1 1f 83 e1 f8 29 4c 24 24 19 24 28 00 <8b> 47 08 8b 57 0c 89 44 24 2c 89 54 24 30 7f 9d 0f 8c 23 ff ff EIP: [] xfs_file_readdir+0x157/0x1e0 SS:ESP 0068:e04cdf18 ---[ end trace 52962aefa1b8fed3 ]--- Poking around in the sources using objdump shows that it breaks at 0x207 (xfs_file_readdir begins at 0xb0) 200: 01 cf add %ecx,%edi } size = buf.used; de = (struct hack_dirent *)buf.dirent; curr_offset = de->offset /* & 0x7fffffff */; while (size > 0) { 202: 83 7c 24 28 00 cmpl $0x0,0x28(%esp) reclen = ALIGN(sizeof(struct hack_dirent) + de->namlen, sizeof(u64)); size -= reclen; de = (struct hack_dirent *)((char *)de + reclen); curr_offset = de->offset /* & 0x7fffffff */; 207: 8b 47 08 mov 0x8(%edi),%eax 20a: 8b 57 0c mov 0xc(%edi),%edx 20d: 89 44 24 2c mov %eax,0x2c(%esp) 211: 89 54 24 30 mov %edx,0x30(%esp) } size = buf.used; de = (struct hack_dirent *)buf.dirent; curr_offset = de->offset /* & 0x7fffffff */; Reproduction: No idea, the system was used for compiling various software packages. The process in question did delete a file tree of about ~100MB sources and binaries, nothing special. 
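A minimal userspace sketch of the pattern, under stated assumptions: the dump shows EDI = f7fffff8 and a fault at f8000000, which is EDI + 8, the address read by the mov at 0x207. If the stack words f7fff000 and 00000ff8 are the buffer start and the used length (an assumption), de points exactly at the end of the used data, so loading de->offset for a "next" record that does not exist would run off the end of the buffer. The C below only illustrates a walk that touches a record header while bytes remain. struct demo_dirent, demo_reclen, demo_append and demo_walk are hypothetical names invented for this illustration; it is not the kernel code and not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header. Only "offset at byte 8" is taken from the
 * disassembly; the other fields are assumptions made for this sketch. */
struct demo_dirent {
    uint64_t ino;
    uint64_t offset;
    uint32_t namlen;
    uint32_t d_type;
    /* namlen name bytes follow, padded to an 8-byte boundary */
};

/* Record length: header plus name, rounded up to a multiple of 8. */
static size_t demo_reclen(size_t namlen)
{
    return (sizeof(struct demo_dirent) + namlen + 7u) & ~(size_t)7u;
}

/* Append one record at buf[used]. Returns the new used length, or the
 * old one unchanged if the record does not fit. */
static size_t demo_append(unsigned char *buf, size_t cap, size_t used,
                          uint64_t ino, uint64_t offset, const char *name)
{
    struct demo_dirent d;
    size_t namlen = strlen(name);
    size_t len = demo_reclen(namlen);

    assert(used <= cap);
    if (len > cap - used)
        return used;

    memset(buf + used, 0, len);
    d.ino = ino;
    d.offset = offset;
    d.namlen = (uint32_t)namlen;
    d.d_type = 0;
    memcpy(buf + used, &d, sizeof d);
    memcpy(buf + used + sizeof d, name, namlen);
    return used + len;
}

/* Walk the packed records. A header is read only while unread bytes
 * remain, so nothing past buf[used - 1] is ever touched. Returns the
 * number of records seen; *last_offset gets the offset of the last one. */
static size_t demo_walk(const unsigned char *buf, size_t used,
                        uint64_t *last_offset)
{
    size_t pos = 0, count = 0;

    while (pos < used) {
        struct demo_dirent d;
        size_t len;

        if (used - pos < sizeof d)
            break;                      /* truncated header */
        memcpy(&d, buf + pos, sizeof d);
        len = demo_reclen(d.namlen);
        if (len > used - pos)
            break;                      /* truncated record */
        *last_offset = d.offset;        /* read before advancing */
        count++;
        pos += len;
    }
    return count;
}
```

The point of the sketch is the ordering: the offset is read at the top of each iteration, after the "bytes remain" check, instead of at the bottom after the pointer has been advanced. Whether the same reordering is the right change for xfs_file_readdir is for the list to judge.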
Please keep me in CC, thanks, Tobias From owner-xfs@oss.sgi.com Mon Feb 4 08:38:54 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 08:38:59 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14GcqLa008533 for ; Mon, 4 Feb 2008 08:38:54 -0800 X-ASG-Debug-ID: 1202143154-67da02ca0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from hobbit.corpit.ru (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0836359AA58 for ; Mon, 4 Feb 2008 08:39:14 -0800 (PST) Received: from hobbit.corpit.ru (hobbit.corpit.ru [81.13.94.6]) by cuda.sgi.com with ESMTP id Gbjdztgjb76NNLL7 for ; Mon, 04 Feb 2008 08:39:14 -0800 (PST) Received: from [192.168.1.1] (paltus.tls.msk.ru [192.168.1.1]) by hobbit.corpit.ru (Postfix) with ESMTP id CAE0A3562B; Mon, 4 Feb 2008 19:38:40 +0300 (MSK) (envelope-from mjt@tls.msk.ru) Message-ID: <47A73F90.3020307@msgid.tls.msk.ru> Date: Mon, 04 Feb 2008 19:38:40 +0300 From: Michael Tokarev User-Agent: Icedove 1.5.0.12 (X11/20070607) MIME-Version: 1.0 To: Eric Sandeen CC: Justin Piszcz , Moshe Yudkowsky , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> In-Reply-To: 
<47A72061.3010800@sandeen.net> X-Enigmail-Version: 0.94.2.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: hobbit.corpit.ru[81.13.94.6] X-Barracuda-Start-Time: 1202143155 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41359 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5679/Mon Feb 4 06:57:26 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14330 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mjt@tls.msk.ru Precedence: bulk X-list: xfs Eric Sandeen wrote: [] > http://oss.sgi.com/projects/xfs/faq.html#nulls > > and note that recent fixes have been made in this area (also noted in > the faq) > > Also - the above all assumes that when a drive says it's written/flushed > data, that it truly has. Modern write-caching drives can wreak havoc > with any journaling filesystem, so that's one good reason for a UPS. If Unfortunately an UPS does not *really* help here. Because unless it has control program which properly shuts system down on the loss of input power, and the battery really has the capacity to power the system while it's shutting down (anyone tested this? With new UPS? and after an year of use, when the battery is not new?), -- unless the UPS actually has the capacity to shutdown system, it will cut the power at an unexpected time, while the disk(s) still has dirty caches... > the drive claims to have metadata safe on disk but actually does not, > and you lose power, the data claimed safe will evaporate, there's not > much the fs can do. 
IO write barriers address this by forcing the drive > to flush order-critical data before continuing; xfs has them on by > default, although they are tested at mount time and if you have > something in between xfs and the disks which does not support barriers > (i.e. lvm...) then they are disabled again, with a notice in the logs. Note also that with linux software raid barriers are NOT supported. /mjt From owner-xfs@oss.sgi.com Mon Feb 4 08:45:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 08:45:09 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14Gj1tc009037 for ; Mon, 4 Feb 2008 08:45:04 -0800 X-ASG-Debug-ID: 1202143523-591602bd0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 72791D82E15 for ; Mon, 4 Feb 2008 08:45:24 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com with ESMTP id iHef4ycfGfTWF5Fp for ; Mon, 04 Feb 2008 08:45:24 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m14GjL2b007065; Mon, 4 Feb 2008 11:45:22 -0500 Received: from lacrosse.corp.redhat.com (lacrosse.corp.redhat.com [172.16.52.154]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m14GjK3g018889; Mon, 4 Feb 2008 11:45:20 -0500 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by lacrosse.corp.redhat.com (8.12.11.20060308/8.11.6) with ESMTP id m14GjJLP029105; Mon, 4 Feb 2008 11:45:20 -0500 Message-ID: <47A7411F.2040702@sandeen.net> Date: Mon, 04 Feb 2008 10:45:19 -0600 From: Eric Sandeen User-Agent: Thunderbird 
2.0.0.9 (X11/20071115) MIME-Version: 1.0 To: Moshe Yudkowsky CC: Justin Piszcz , Michael Tokarev , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> <47A72FBC.9090701@pobox.com> In-Reply-To: <47A72FBC.9090701@pobox.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Barracuda-Connect: mx1.redhat.com[66.187.233.31] X-Barracuda-Start-Time: 1202143524 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41360 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5679/Mon Feb 4 06:57:26 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14331 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Moshe Yudkowsky wrote: > So if I understand you correctly, you're stating that current the most > reliable fs in its default configuration, in terms of protection against > power-loss scenarios, is XFS? 
I wouldn't go that far without some real-world poweroff testing, because various fs's are probably more or less tolerant of a write-cache evaporation. I suppose it'd depend on the size of the write cache as well. -Eric From owner-xfs@oss.sgi.com Mon Feb 4 09:22:01 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 09:22:10 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_32 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14HLwDo010914 for ; Mon, 4 Feb 2008 09:22:01 -0800 X-ASG-Debug-ID: 1202145739-4c9f02110000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from hobbit.corpit.ru (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1D4BFD833C3 for ; Mon, 4 Feb 2008 09:22:20 -0800 (PST) Received: from hobbit.corpit.ru (hobbit.corpit.ru [81.13.94.6]) by cuda.sgi.com with ESMTP id ExRGO1IWlEY0GEO6 for ; Mon, 04 Feb 2008 09:22:20 -0800 (PST) Received: from [192.168.1.1] (paltus.tls.msk.ru [192.168.1.1]) by hobbit.corpit.ru (Postfix) with ESMTP id 285773562A; Mon, 4 Feb 2008 20:22:18 +0300 (MSK) (envelope-from mjt@tls.msk.ru) Message-ID: <47A749C9.6010503@msgid.tls.msk.ru> Date: Mon, 04 Feb 2008 20:22:17 +0300 From: Michael Tokarev User-Agent: Icedove 1.5.0.12 (X11/20070607) MIME-Version: 1.0 To: Eric Sandeen CC: Moshe Yudkowsky , Justin Piszcz , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> 
<47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> <47A72FBC.9090701@pobox.com> <47A7411F.2040702@sandeen.net> In-Reply-To: <47A7411F.2040702@sandeen.net> X-Enigmail-Version: 0.94.2.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: hobbit.corpit.ru[81.13.94.6] X-Barracuda-Start-Time: 1202145741 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41362 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5680/Mon Feb 4 08:24:59 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14332 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mjt@tls.msk.ru Precedence: bulk X-list: xfs Eric Sandeen wrote: > Moshe Yudkowsky wrote: >> So if I understand you correctly, you're stating that current the most >> reliable fs in its default configuration, in terms of protection against >> power-loss scenarios, is XFS? > > I wouldn't go that far without some real-world poweroff testing, because > various fs's are probably more or less tolerant of a write-cache > evaporation. I suppose it'd depend on the size of the write cache as well. I know no filesystem which is, as you say, tolerant to a write-cache evaporation. If a drive says the data is written but in fact it's not, it's a Bad Drive (tm) and it should be thrown away immediately. Fortunately, almost all modern disk drives don't lie this way. 
The only thing needed for the filesystem is to tell the drive to flush its cache at the appropriate time, and actually wait for the flush to complete. Barriers (mentioned in this thread) are just another way to do so, somewhat more efficiently, but a normal cache flush will do as well. IFF the write caching is enabled in the first place - note that with some workloads, write caching in the drive actually makes write speed worse, not better - namely, in case of massive writes. Speaking of XFS (and of ext3fs with write barriers enabled) - I'm confused here as well, and answers to my questions didn't help either. As far as I understand, XFS only uses barriers, not regular cache flushes, hence without write barrier support (which is not available for linux software raid, as explained elsewhere) it's unsafe -- probably the same applies to ext3 with barrier support enabled. But I'm not sure I got it all right. /mjt From owner-xfs@oss.sgi.com Mon Feb 4 12:52:13 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 12:52:19 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14KqA5n025655 for ; Mon, 4 Feb 2008 12:52:13 -0800 X-ASG-Debug-ID: 1202158353-3d2602830000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 96DA059C334; Mon, 4 Feb 2008 12:52:33 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id tisqe83LI5nEehor; Mon, 04 Feb 2008 12:52:33 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JM8Io-00041l-Kv; Mon, 04 Feb 2008 20:52:30
+0000 Date: Mon, 4 Feb 2008 15:52:30 -0500 From: Christoph Hellwig To: "Josef 'Jeff' Sipek" Cc: dgc@sgi.com, xfs@oss.sgi.com, hch@infradead.org X-ASG-Orig-Subj: Re: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head Subject: Re: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head Message-ID: <20080204205230.GA14084@lst.de> References: <20080125070800.GH155407@sgi.com> <1202106488-31494-1-git-send-email-jeffpc@josefsipek.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1202106488-31494-1-git-send-email-jeffpc@josefsipek.net> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202158353 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41377 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5683/Mon Feb 4 10:17:58 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14333 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Mon, Feb 04, 2008 at 01:28:08AM -0500, Josef 'Jeff' Sipek wrote: > Signed-off-by: Josef 'Jeff' Sipek > --- > This patch assumes you already have Dave Chinner's patch for > xfsidbg_xlogitem and xfsidbg_xaildump is needed. 
> > Changes since V1: > > - Pass around a pointer to the AIL, not the struct list_head > - Make sure things compile & run with CONFIG_XFS_DEBUG Does it work with XFS_TRANS_DEBUG defined as well? > - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); > + lip = xfs_ail_min(&(mp->m_ail)); Care to remove these useless parentheses in all the places you touch while you're at it? From owner-xfs@oss.sgi.com Mon Feb 4 14:27:31 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 14:27:37 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14MRTbi031449 for ; Mon, 4 Feb 2008 14:27:31 -0800 X-ASG-Debug-ID: 1202164071-319002020000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from lucidpixels.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 080DAD8B483 for ; Mon, 4 Feb 2008 14:27:51 -0800 (PST) Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by cuda.sgi.com with ESMTP id IOi1N7WWqHB3JmwN for ; Mon, 04 Feb 2008 14:27:51 -0800 (PST) Received: by lucidpixels.com (Postfix, from userid 1001) id 2C9E01C000292; Mon, 4 Feb 2008 17:27:50 -0500 (EST) Date: Mon, 4 Feb 2008 17:27:50 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: Michael Tokarev cc: Eric Sandeen , Moshe Yudkowsky , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) In-Reply-To: <47A73F90.3020307@msgid.tls.msk.ru> Message-ID: References:
<47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> <47A73F90.3020307@msgid.tls.msk.ru> User-Agent: Alpine 1.00 (DEB 882 2007-12-20) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Barracuda-Connect: lucidpixels.com[75.144.35.66] X-Barracuda-Start-Time: 1202164072 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41381 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5685/Mon Feb 4 12:29:32 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14334 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs On Mon, 4 Feb 2008, Michael Tokarev wrote: > Eric Sandeen wrote: > [] >> http://oss.sgi.com/projects/xfs/faq.html#nulls >> >> and note that recent fixes have been made in this area (also noted in >> the faq) >> >> Also - the above all assumes that when a drive says it's written/flushed >> data, that it truly has. Modern write-caching drives can wreak havoc >> with any journaling filesystem, so that's one good reason for a UPS. If > > Unfortunately an UPS does not *really* help here. Because unless > it has control program which properly shuts system down on the loss > of input power, and the battery really has the capacity to power the > system while it's shutting down (anyone tested this? With new UPS? 
> and after a year of use, when the battery is not new?), -- unless > the UPS actually has the capacity to shut down the system, it will cut > the power at an unexpected time, while the disk(s) still have dirty > caches... If you use nut and a large enough UPS to handle the load of the system, it shuts the machine down just fine. > >> the drive claims to have metadata safe on disk but actually does not, >> and you lose power, the data claimed safe will evaporate, there's not >> much the fs can do. IO write barriers address this by forcing the drive >> to flush order-critical data before continuing; xfs has them on by >> default, although they are tested at mount time and if you have >> something in between xfs and the disks which does not support barriers >> (i.e. lvm...) then they are disabled again, with a notice in the logs. > > Note also that with linux software raid barriers are NOT supported. > > /mjt > > From owner-xfs@oss.sgi.com Mon Feb 4 15:39:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 15:39:36 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m14NdVgI002711 for ; Mon, 4 Feb 2008 15:39:32 -0800 X-ASG-Debug-ID: 1202168393-122d01a00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from filer.fsl.cs.sunysb.edu (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0A08C7C8F61; Mon, 4 Feb 2008 15:39:53 -0800 (PST) Received: from filer.fsl.cs.sunysb.edu (filer.fsl.cs.sunysb.edu [130.245.126.2]) by cuda.sgi.com with ESMTP id bq7rR1twp4VZgJ11; Mon, 04 Feb 2008 15:39:53 -0800 (PST) Received: from josefsipek.net (baal.fsl.cs.sunysb.edu [130.245.126.78]) by filer.fsl.cs.sunysb.edu (8.12.11.20060308/8.13.1) with ESMTP id
m14NdidM025995; Mon, 4 Feb 2008 18:39:44 -0500 Received: by josefsipek.net (Postfix, from userid 1000) id 245701C00DD2; Mon, 4 Feb 2008 18:39:46 -0500 (EST) Date: Mon, 4 Feb 2008 18:39:46 -0500 From: "Josef 'Jeff' Sipek" To: Christoph Hellwig Cc: dgc@sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head Subject: Re: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head Message-ID: <20080204233946.GA27110@josefsipek.net> References: <20080125070800.GH155407@sgi.com> <1202106488-31494-1-git-send-email-jeffpc@josefsipek.net> <20080204205230.GA14084@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080204205230.GA14084@lst.de> User-Agent: Mutt/1.5.16 (2007-06-11) X-Barracuda-Connect: filer.fsl.cs.sunysb.edu[130.245.126.2] X-Barracuda-Start-Time: 1202168394 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41386 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5687/Mon Feb 4 14:54:45 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14335 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jeffpc@josefsipek.net Precedence: bulk X-list: xfs On Mon, Feb 04, 2008 at 03:52:30PM -0500, Christoph Hellwig wrote: > On Mon, Feb 04, 2008 at 01:28:08AM -0500, Josef 'Jeff' Sipek wrote: > > Signed-off-by: Josef 'Jeff' Sipek > > --- > > This patch assumes you already have Dave Chinner's patch for > > xfsidbg_xlogitem and xfsidbg_xaildump is needed. 
> > > > Changes since V1: > > > > - Pass around a pointer to the AIL, not the struct list_head > > - Make sure things compile & run with CONFIG_XFS_DEBUG > > Does it work with XFS_TRANS_DEBUG defined as well? With XFS_TRANS_DEBUG on, other places in XFS don't compile, but this patch compiles and works (xfsqa ran fine). > > - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); > > + lip = xfs_ail_min(&(mp->m_ail)); > > Care to remove these useless parentheses in all the places you touch while you're at it? Will do. Josef 'Jeff' Sipek. -- The box said "Windows XP or better required". So I installed Linux. From owner-xfs@oss.sgi.com Mon Feb 4 16:31:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 16:31:07 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_23 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m150UvZX009861 for ; Mon, 4 Feb 2008 16:31:01 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA16545; Tue, 5 Feb 2008 11:31:14 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m150VCLF43120074; Tue, 5 Feb 2008 11:31:13 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m150VAjP51208715; Tue, 5 Feb 2008 11:31:10 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 5 Feb 2008 11:31:10 +1100 From: David Chinner To: Andrea Perotti Cc: xfs-oss Subject: Re: Some kind of problems with xfs and 2.6.24 kernel Message-ID: <20080205003109.GI155407@sgi.com> References:
<39932.84.221.232.4.1202164725.squirrel@manage.unbit.it> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <39932.84.221.232.4.1202164725.squirrel@manage.unbit.it> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5687/Mon Feb 4 14:54:45 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14336 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs [please report problems direct to xfs@oss.sgi.com, not me ;)] On Mon, Feb 04, 2008 at 11:38:45PM +0100, Andrea Perotti wrote: > Hi David > I've found you address in a linux kernel mailing list and I guess > you are the guy I was looking for. > > I'm gentoo linux user and I'm running on my laptop the 2.6.24 kernel with > the standard gentoo patchset and I have both / and /home on XFS. > > During normal usage I got a couple of time the kernel error you can find > attached. > > Both of them occurred when I was connected with the imap server (dovecot) > present on that machine, accessing to the same account (imap was reading > into /home/$myuser/.maildir/... ) and I get aware that something was gone > because I cannot access to one of my dir inside my mailbox. > > I hope these information could be usefull to you. If you need any extra > info, like the option I used to create and mount FS or anything else, just > ask me. 
> > Jan 31 15:12:09 stakanov_II BUG: unable to handle kernel paging request at virtual address f8000000 > Jan 31 15:12:09 stakanov_II printing eip: c022dd78 *pde = 00000000 > Jan 31 15:12:09 stakanov_II Oops: 0000 [#1] PREEMPT > Jan 31 15:12:09 stakanov_II Modules linked in: b43 yenta_socket rsrc_nonstatic ssb > Jan 31 15:12:09 stakanov_II > Jan 31 15:12:09 stakanov_II Pid: 19390, comm: imap Not tainted (2.6.24-gentoo-metallica #4) > Jan 31 15:12:09 stakanov_II EIP: 0060:[] EFLAGS: 00010282 CPU: 0 > Jan 31 15:12:09 stakanov_II EIP is at xfs_file_readdir+0xf2/0x1aa > Jan 31 15:12:09 stakanov_II EAX: 00000000 EBX: 000001a4 ECX: 00000058 EDX: 00000000 > Jan 31 15:12:09 stakanov_II ESI: 00000000 EDI: f7fffff8 EBP: f766d780 ESP: e05bbf1c > Jan 31 15:12:09 stakanov_II DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 > Jan 31 15:12:09 stakanov_II Process imap (pid: 19390, ti=e05ba000 task=f7310f60 task.ti=e05ba000) > Jan 31 15:12:09 stakanov_II Stack: 000001a4 00000000 0f49408a 00000000 00000000 c0159838 e05bbf94 e1de1a80 > Jan 31 15:12:09 stakanov_II 00000000 00000000 00000000 000001a4 00000000 f7fff000 00001000 00000ff8 > Jan 31 15:12:09 stakanov_II 000001ad 00000000 c03ae120 f766d780 fffffffe e1de27e8 c0159a00 e05bbf94 > Jan 31 15:12:09 stakanov_II Call Trace: > Jan 31 15:12:09 stakanov_II [] filldir64+0x0/0xc5 > Jan 31 15:12:09 stakanov_II [] vfs_readdir+0x4a/0x74 > Jan 31 15:12:09 stakanov_II [] filldir64+0x0/0xc5 > Jan 31 15:12:09 stakanov_II [] sys_getdents64+0x63/0xa5 > Jan 31 15:12:09 stakanov_II [] sysenter_past_esp+0x5f/0x85 > Jan 31 15:12:09 stakanov_II ======================= > Jan 31 15:12:09 stakanov_II Code: 04 81 e3 ff ff ff 7f 89 1c 24 ff 54 24 14 85 > c0 0f 85 a9 00 00 00 8b 4f 10 31 d2 83 c1 1f 83 e1 f8 29 4c 24 24 19 54 24 28 > 01 cf <8b> 47 08 8b 57 0c 83 7c 24 28 00 89 44 24 2c 89 54 24 30 7f 99 > Jan 31 15:12:09 stakanov_II EIP: [] xfs_file_readdir+0xf2/0x1aa SS:ESP 0068:e05bbf1c > Jan 31 15:12:09 stakanov_II ---[ end trace 96802517b18c4092 ]--- > Feb 
4 10:49:18 stakanov_II BUG: unable to handle kernel paging request at virtual address f8000000 > Feb 4 10:49:18 stakanov_II printing eip: c022dd78 *pde = 00000000 > Feb 4 10:49:18 stakanov_II Oops: 0000 [#1] PREEMPT > Feb 4 10:49:18 stakanov_II Modules linked in: yenta_socket rsrc_nonstatic snd_hda_intel > Feb 4 10:49:18 stakanov_II > Feb 4 10:49:18 stakanov_II Pid: 6057, comm: imap Not tainted (2.6.24-gentoo-metallica #9) > Feb 4 10:49:18 stakanov_II EIP: 0060:[] EFLAGS: 00010282 CPU: 0 > Feb 4 10:49:18 stakanov_II EIP is at xfs_file_readdir+0xf2/0x1aa > Feb 4 10:49:18 stakanov_II EAX: 00000000 EBX: 000001a9 ECX: 00000058 EDX: 00000000 > Feb 4 10:49:18 stakanov_II ESI: 00000000 EDI: f7fffff8 EBP: ef5a2880 ESP: ef40bf1c > Feb 4 10:49:18 stakanov_II DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 > Feb 4 10:49:18 stakanov_II Process imap (pid: 6057, ti=ef40a000 task=ef626520 task.ti=ef40a000) > Feb 4 10:49:18 stakanov_II Stack: 000001a9 00000000 132fea33 00000000 00000000 c0159838 ef40bf94 ef98bc00 > Feb 4 10:49:18 stakanov_II 00000000 00000000 00000000 000001a9 00000000 f7fff000 00001000 00000ff8 > Feb 4 10:49:18 stakanov_II 000001b2 00000000 c03a6120 ef5a2880 fffffffe ef98ae28 c0159a00 ef40bf94 > Feb 4 10:49:18 stakanov_II Call Trace: > Feb 4 10:49:18 stakanov_II [] filldir64+0x0/0xc5 > Feb 4 10:49:18 stakanov_II [] vfs_readdir+0x4a/0x74 > Feb 4 10:49:18 stakanov_II [] filldir64+0x0/0xc5 > Feb 4 10:49:18 stakanov_II [] sys_getdents64+0x63/0xa5 > Feb 4 10:49:18 stakanov_II [] sysenter_past_esp+0x5f/0x85 > Feb 4 10:49:18 stakanov_II [] sta_info_release+0x0/0x5b > Feb 4 10:49:18 stakanov_II ======================= > Feb 4 10:49:18 stakanov_II Code: 04 81 e3 ff ff ff 7f 89 1c 24 ff 54 24 14 85 > c0 0f 85 a9 00 00 00 8b 4f 10 31 d2 83 c1 1f 83 e1 f8 29 4c 24 24 19 54 24 28 > 01 cf <8b> 47 08 8b 57 0c 83 7c 24 28 00 89 44 24 2c 89 54 24 30 7f 99 > Feb 4 10:49:18 stakanov_II EIP: [] xfs_file_readdir+0xf2/0x1aa SS:ESP 0068:ef40bf1c > Feb 4 10:49:18 stakanov_II ---[ end trace 
d899adf5158e5245 ]--- Yeah, this is the third report of a readdir problem on 2.6.24. The others are here: http://oss.sgi.com/archives/xfs/2008-02/msg00019.html http://oss.sgi.com/archives/xfs/2008-02/msg00007.html I'm pulling down a metadump image of one of the problem filesystems right now and I'm going to try to reproduce it locally. When we find the problem, we'll push the fix into the stable branch. more details in a while. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Mon Feb 4 21:24:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 21:24:37 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m155OB2F027535 for ; Mon, 4 Feb 2008 21:24:14 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA23013; Tue, 5 Feb 2008 16:24:24 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m155OMLF51259785; Tue, 5 Feb 2008 16:24:23 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m155OJ8Q51324493; Tue, 5 Feb 2008 16:24:19 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 5 Feb 2008 16:24:18 +1100 From: David Chinner To: Sven Geggus Cc: xfs@oss.sgi.com, Tobias Ulmer , Andrea Perotti Subject: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Message-ID: <20080205052418.GU155259@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i 
X-Virus-Scanned: ClamAV 0.91.2/5691/Mon Feb 4 18:12:57 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14337 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Sven, Tobias, Andrea: Can you try the patch attached below and let me know whether it fixes the xfs_file_readdir() oops you are seeing? It looks like we're dereferencing a pointer beyond the end of a buffer if the buffer is filled exactly. This bug does not crash ia64 (even with memory poisoning enabled), which is why the targeted corner case testing I did a while back did not pick this up when fixing a similar bug a month ago. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
---
Fix yet another corner case oops in xfs_file_readdir().

Signed-off-by: Dave Chinner
---
 fs/xfs/linux-2.6/xfs_file.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_file.c	2008-01-16 16:24:01.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c	2008-02-05 15:13:17.153110696 +1100
@@ -351,8 +351,8 @@ xfs_file_readdir(
 
 		size = buf.used;
 		de = (struct hack_dirent *)buf.dirent;
-		curr_offset = de->offset /* & 0x7fffffff */;
 		while (size > 0) {
+			curr_offset = de->offset /* & 0x7fffffff */;
 			if (filldir(dirent, de->name, de->namlen,
 					curr_offset & 0x7fffffff,
 					de->ino, de->d_type)) {
@@ -363,7 +363,6 @@ xfs_file_readdir(
 					       sizeof(u64));
 			size -= reclen;
 			de = (struct hack_dirent *)((char *)de + reclen);
-			curr_offset = de->offset /* & 0x7fffffff */;
 		}
 	}
 
From owner-xfs@oss.sgi.com Mon Feb 4 23:19:52 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Feb 2008 23:19:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6
required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m157JpcR032531 for ; Mon, 4 Feb 2008 23:19:52 -0800 X-ASG-Debug-ID: 1202196013-1bd0009c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.novgorod.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8BDFFD8F5E5 for ; Mon, 4 Feb 2008 23:20:13 -0800 (PST) Received: from mail.novgorod.net (mail.novgorod.net [81.25.0.34]) by cuda.sgi.com with ESMTP id dKAIbkFi0Qbo6S75 for ; Mon, 04 Feb 2008 23:20:13 -0800 (PST) Received: from localhost (localhost) by mail.novgorod.net (8.13.4/8.13.4) id m157K8ZC089202; Tue, 5 Feb 2008 10:20:08 +0300 (MSK) (envelope-from MAILER-DAEMON) Date: Tue, 5 Feb 2008 10:20:08 +0300 (MSK) From: Mail Delivery Subsystem Message-Id: <200802050720.m157K8ZC089202@mail.novgorod.net> To: MIME-Version: 1.0 Content-Type: multipart/report; report-type=delivery-status; boundary="m157K8ZC089202.1202196008/mail.novgorod.net" X-ASG-Orig-Subj: Returned mail: see transcript for details Subject: Returned mail: see transcript for details Auto-Submitted: auto-generated (failure) X-Barracuda-Connect: mail.novgorod.net[81.25.0.34] X-Barracuda-Start-Time: 1202196014 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41416 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5691/Mon Feb 4 18:12:57 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14338 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: MAILER-DAEMON@mail.novgorod.net Precedence: bulk X-list: xfs This is a MIME-encapsulated message --m157K8ZC089202.1202196008/mail.novgorod.net The original message was received at Tue, 5 Feb 2008 10:20:08 +0300 (MSK) from [81.25.5.230] ----- The following addresses had permanent fatal errors ----- (reason: 550 5.1.1 ... User unknown) ----- Transcript of session follows ----- ... while talking to ns.geol.msu.ru.: >>> DATA <<< 550 5.1.1 ... User unknown 550 5.1.1 ... User unknown <<< 503 5.0.0 Need RCPT (recipient) --m157K8ZC089202.1202196008/mail.novgorod.net Content-Type: message/delivery-status Reporting-MTA: dns; mail.novgorod.net Received-From-MTA: DNS; [81.25.5.230] Arrival-Date: Tue, 5 Feb 2008 10:20:08 +0300 (MSK) Final-Recipient: RFC822; mitya@seismic.geol.msu.ru Action: failed Status: 5.1.1 Remote-MTA: DNS; ns.geol.msu.ru Diagnostic-Code: SMTP; 550 5.1.1 ... User unknown Last-Attempt-Date: Tue, 5 Feb 2008 10:20:08 +0300 (MSK) --m157K8ZC089202.1202196008/mail.novgorod.net Content-Type: text/rfc822-headers Return-Path: Received: from oss.sgi.com ([81.25.5.230]) by mail.novgorod.net (8.13.4/8.13.4) with ESMTP id m157K8ZC089200 for ; Tue, 5 Feb 2008 10:20:08 +0300 (MSK) (envelope-from xfs@oss.sgi.com) Message-Id: <200802050720.m157K8ZC089200@mail.novgorod.net> From: xfs@oss.sgi.com To: mitya@seismic.geol.msu.ru Subject: Mail System Error - Returned Mail Date: Tue, 5 Feb 2008 10:20:00 +0300 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0004_0F90E1EC.F475B130" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 --m157K8ZC089202.1202196008/mail.novgorod.net-- From owner-xfs@oss.sgi.com Tue Feb 5 00:56:14 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 00:56:20 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: 
No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m158uB9M010130 for ; Tue, 5 Feb 2008 00:56:14 -0800 X-ASG-Debug-ID: 1202201792-3c6500120000-w1Z2WR X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.novgorod.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 589F6D9002D for ; Tue, 5 Feb 2008 00:56:33 -0800 (PST) Received: from mail.novgorod.net (mail.novgorod.net [81.25.0.34]) by cuda.sgi.com with ESMTP id SaGUt0ocachKg8Md for ; Tue, 05 Feb 2008 00:56:33 -0800 (PST) Received: from localhost (localhost) by mail.novgorod.net (8.13.4/8.13.4) id m158uTJY074976; Tue, 5 Feb 2008 11:56:29 +0300 (MSK) (envelope-from MAILER-DAEMON) Date: Tue, 5 Feb 2008 11:56:29 +0300 (MSK) From: Mail Delivery Subsystem Message-Id: <200802050856.m158uTJY074976@mail.novgorod.net> To: MIME-Version: 1.0 Content-Type: multipart/report; report-type=delivery-status; boundary="m158uTJY074976.1202201789/mail.novgorod.net" X-ASG-Orig-Subj: Returned mail: see transcript for details Subject: Returned mail: see transcript for details Auto-Submitted: auto-generated (failure) X-Barracuda-Connect: mail.novgorod.net[81.25.0.34] X-Barracuda-Start-Time: 1202201794 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41421 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5691/Mon Feb 4 18:12:57 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14339 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com 
Errors-to: xfs-bounce@oss.sgi.com X-original-sender: MAILER-DAEMON@mail.novgorod.net Precedence: bulk X-list: xfs This is a MIME-encapsulated message --m158uTJY074976.1202201789/mail.novgorod.net The original message was received at Tue, 5 Feb 2008 11:56:27 +0300 (MSK) from [81.25.5.230] ----- The following addresses had permanent fatal errors ----- (reason: 550 Message was not accepted -- invalid mailbox. Local mailbox pvm_48211@mail.ru is unavailable: user not found) ----- Transcript of session follows ----- ... while talking to mxs.mail.ru.: >>> DATA <<< 550 Message was not accepted -- invalid mailbox. Local mailbox pvm_48211@mail.ru is unavailable: user not found 554 5.0.0 Service unavailable --m158uTJY074976.1202201789/mail.novgorod.net Content-Type: message/delivery-status Reporting-MTA: dns; mail.novgorod.net Received-From-MTA: DNS; [81.25.5.230] Arrival-Date: Tue, 5 Feb 2008 11:56:27 +0300 (MSK) Final-Recipient: RFC822; pvm_48211@mail.ru Action: failed Status: 5.2.0 Remote-MTA: DNS; mxs.mail.ru Diagnostic-Code: SMTP; 550 Message was not accepted -- invalid mailbox. 
Local mailbox pvm_48211@mail.ru is unavailable: user not found Last-Attempt-Date: Tue, 5 Feb 2008 11:56:29 +0300 (MSK) --m158uTJY074976.1202201789/mail.novgorod.net Content-Type: text/rfc822-headers Return-Path: Received: from oss.sgi.com ([81.25.5.230]) by mail.novgorod.net (8.13.4/8.13.4) with ESMTP id m158uRJY074946 for ; Tue, 5 Feb 2008 11:56:27 +0300 (MSK) (envelope-from linux-xfs@oss.sgi.com) Message-Id: <200802050856.m158uRJY074946@mail.novgorod.net> From: linux-xfs@oss.sgi.com To: pvm_48211@mail.ru Subject: Delivery reports about your e-mail Date: Tue, 5 Feb 2008 11:55:19 +0300 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0014_C3CF33A4.F130F2FA" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 --m158uTJY074976.1202201789/mail.novgorod.net-- From owner-xfs@oss.sgi.com Tue Feb 5 03:48:19 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 03:48:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: *** X-Spam-Status: No, score=3.0 required=5.0 tests=AWL,BAYES_00,HEADER_ESQ, J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m15BmHoR018967 for ; Tue, 5 Feb 2008 03:48:19 -0800 X-ASG-Debug-ID: 1202212116-481a02290000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from fg-out-1718.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B0BA159FA5F for ; Tue, 5 Feb 2008 03:48:36 -0800 (PST) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.159]) by cuda.sgi.com with ESMTP id Fz8F6sf10Up2zKGu for ; Tue, 05 Feb 2008 03:48:36 -0800 (PST) Received: by fg-out-1718.google.com with SMTP id e12so2261341fga.8 for ; Tue, 05 Feb 2008 03:48:33 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; bh=uYLBWO9isJnrQsCkBpmCwf+ZVZGqYdnFQCHLIzZfJgY=; b=GSpQxO5fyh5uqiRjLR/Rvj36vTujRNYtYkUVRBaShXm7hHNwyStG+eABk5In4IZHabKDZbOvahKVQUK+nOJm3yMj9deBBzRVUQXmI/JRZTfQhgfQGx3SKKUZwdyScIE6na+4X4otW469qsKQroXwA+q3tHJcXYCXmg7EhEm59yY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=VS8gN7dlOGKHoz0MtTSysW8FlPjlP7BCyOhVmz9MK/bG9MRK7U48FdBPRr1df8YGHmkSKSw0Ft5JHt89V6BDMZQj+uuj+SYkxGC7iWOOQnQvBUj5uBRnEQkwovQArOhVDoiHsPNRF70F6CuN+cWALtP3FFJkrogmYNPpyPtEE+Y= Received: by 10.86.65.11 with SMTP id n11mr7720463fga.4.1202211743352; Tue, 05 Feb 2008 03:42:23 -0800 (PST) Received: by 10.86.100.20 with HTTP; Tue, 5 Feb 2008 03:42:23 -0800 (PST) Message-ID: Date: Tue, 5 Feb 2008 11:42:23 +0000 From: "Jamie Tufnell" To: xfs@oss.sgi.com X-ASG-Orig-Subj: Linux mkfs.xfs parameters for large files Subject: Linux mkfs.xfs parameters for large files MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Barracuda-Connect: fg-out-1718.google.com[72.14.220.159] X-Barracuda-Start-Time: 1202212120 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41432 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5692/Mon Feb 4 23:21:55 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14340 X-ecartis-version: Ecartis v1.0.0 
Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: diesql@googlemail.com Precedence: bulk X-list: xfs Hi, I've done some research into how to best-prepare our Linux (2.6.18) XFS filesystem for a read-heavy workload of mostly large files (files between 50MB and 500MB) and I have some questions on block-sizes / extents. My experience with other filesystems leads me to believe I should experiment with large block-sizes for this application. In trying to do so, I found that the Linux XFS implementation forces the block size to not exceed the page size -- 4kB -- which seems a bit small in this case. At first, I thought this was *really* bad news for me but I've since read some posts and the impression I'm getting is it's not that big of a deal? Can someone please explain why (if that's true)? I've read that an extent is one or more "contiguous" blocks, so should I simply tune the extent size in the same way I was expecting to tune the block-size? Are there any other mkfs.xfs parameters I should experiment with? So far I plan to use stripe width/unit, as the filesystem will be on a striped RAID. Lastly, would you recommend XFS for this application? Is there a better filesystem for what I'm trying to do? Any help greatly appreciated! Thanks, J. 
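On the stripe unit/width point: mkfs.xfs takes the RAID geometry as `-d su=...,sw=...`, where `su` is the per-disk chunk size and `sw` is the number of data-bearing disks. The following is only a minimal sketch of that arithmetic, assuming a hypothetical 5-disk RAID5 with 64 KiB chunks (neither figure comes from the post above). The helper names are invented for illustration. The last helper shows how many 4kB blocks the file sizes mentioned above occupy; because an extent covers a run of contiguous blocks, that count is not the number of mappings a file needs.

```python
# Hypothetical helpers; the RAID layout used below is an assumption, not
# something stated in the original message.

PARITY_DISKS = {0: 0, 5: 1, 6: 2}  # parity disks per stripe for common RAID levels


def data_disks(raid_level: int, total_disks: int) -> int:
    """Return how many disks in the array hold data rather than parity."""
    if raid_level not in PARITY_DISKS:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    n = total_disks - PARITY_DISKS[raid_level]
    if n < 1:
        raise ValueError("not enough disks for this RAID level")
    return n


def mkfs_stripe_options(raid_level: int, total_disks: int, chunk_kib: int) -> str:
    """Build the value for mkfs.xfs -d from chunk size (su) and data disks (sw)."""
    return f"su={chunk_kib}k,sw={data_disks(raid_level, total_disks)}"


def blocks_for(file_bytes: int, block_bytes: int = 4096) -> int:
    """Number of filesystem blocks needed for a file, rounding up."""
    return -(-file_bytes // block_bytes)


if __name__ == "__main__":
    print("mkfs.xfs -d", mkfs_stripe_options(5, 5, 64), "/dev/XXX")
    print("500MB file:", blocks_for(500 * 1024 * 1024), "blocks of 4kB")
```

The device path is a placeholder; the actual chunk size and disk count have to be read from the RAID configuration in use.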
From owner-xfs@oss.sgi.com Tue Feb 5 04:31:14 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 04:31:19 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m15CV9Fm025857 for ; Tue, 5 Feb 2008 04:31:14 -0800 X-ASG-Debug-ID: 1202214692-79ed02750000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ishtar.tlinx.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7BFB3D90E0C for ; Tue, 5 Feb 2008 04:31:32 -0800 (PST) Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by cuda.sgi.com with ESMTP id jdFu52qNiXCan4c4 for ; Tue, 05 Feb 2008 04:31:32 -0800 (PST) Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id m15CVTRx008517; Tue, 5 Feb 2008 04:31:29 -0800 Message-ID: <47A85721.2090807@tlinx.org> Date: Tue, 05 Feb 2008 04:31:29 -0800 From: Linda Walsh User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: xfs@oss.sgi.com CC: linux-raid@vger.kernel.org X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> <47A72FBC.9090701@pobox.com> <47A7411F.2040702@sandeen.net> <47A749C9.6010503@msgid.tls.msk.ru> In-Reply-To: 
<47A749C9.6010503@msgid.tls.msk.ru> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: ishtar.tlinx.org[64.81.245.74] X-Barracuda-Start-Time: 1202214692 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41433 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5692/Mon Feb 4 23:21:55 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14341 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@tlinx.org Precedence: bulk X-list: xfs

Michael Tokarev wrote:
> note that with some workloads, write caching in
> the drive actually makes write speed worse, not better - namely,
> in case of massive writes.
----
With write barriers enabled, I did a quick test of a large copy from one backup filesystem to another. I'm not sure what you refer to when you say large, but this disk has 387G used with 975 files, averaging about 406MB/file. I was copying from /hde (ATA100-750G) to /sdb (SATA-300-750G) (both basically the same underlying model). Of course 'your mileage may vary', and these were averages over 12 runs each (with and without write caching):

(write cache on)
                     write     read
dev ave      TPS     MB/s     MB/s
hde ave    64.67    30.94      0.0
sdb ave   249.51     0.24    30.93

(write cache off)
                     write     read
dev ave      TPS     MB/s     MB/s
hde ave    45.63    21.81      0.0
xx: ave   177.76     0.24    21.96

write w/cache = (30.94-21.81)/21.81 => 42% faster
w/o write cache = 100-(100*21.81/30.94) => 30% slower

These disks have barrier support, so I'd guess the differences would have been greater if you didn't worry about losing w-cache contents. 
If barrier support doesn't work and one has to disable write-caching, that is a noticeable performance penalty. All writes with noatime, nodiratime, logbufs=8. FWIW...slightly OT, the rates under Win for their write-through (FAT32-perf) vs. write-back caching (NTFS-perf) were FAT about 60% faster over NTFS or NTFS ~ 40% slower than FAT32 (with ops for no-last-access and no 3.1 filename creation) From owner-xfs@oss.sgi.com Tue Feb 5 06:25:15 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 06:25:27 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m15EPCk9032031 for ; Tue, 5 Feb 2008 06:25:14 -0800 X-ASG-Debug-ID: 1202221532-442f01e90000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from malkusch.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3C4A7D77A94 for ; Tue, 5 Feb 2008 06:25:32 -0800 (PST) Received: from malkusch.de (malkusch.de [217.171.203.96]) by cuda.sgi.com with ESMTP id owNL9jUBcZDGfXmH for ; Tue, 05 Feb 2008 06:25:32 -0800 (PST) Received: from massmann.mhn.de ([129.187.19.220] helo=ferrara.massmann.mhn.de) by malkusch.de with esmtpsa (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.68) (envelope-from ) id 1JMOjr-0006vk-11 for xfs@oss.sgi.com; Tue, 05 Feb 2008 15:25:31 +0100 From: Markus Malkusch To: xfs@oss.sgi.com X-ASG-Orig-Subj: xfs_repair progress Subject: xfs_repair progress Date: Tue, 5 Feb 2008 15:25:24 +0100 User-Agent: KMail/1.9.7 X-Face: L]ysRB@fzNk6qkK4ZAv:`;XNHsDCn=u0yWT&<_W5&QR,BABX[.eEQA}67D_fm(=?utf-8?q?=7EU=3BYgaIm=0A=09ordzS=2E=7E?=(CoDV+\i4n*[@aCEE(bi=HaM}',/)B/w,IeQ_"[(*1@,]_0!ge.2 X-Barracuda-Connect: malkusch.de[217.171.203.96] X-Barracuda-Start-Time: 1202221535 X-Barracuda-Bayes: INNOCENT 
GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41440 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5692/Mon Feb 4 23:21:55 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14342 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: markus@malkusch.de Precedence: bulk X-list: xfs xfs_repair 2.9.4 has been running for 3 days, and for all of that time it has been at Phase 3, agno = 0. Is there a way to see whether xfs_repair is making progress (it is using 107% CPU)? The filesystem is 4.6T in size. I don't know how many inodes there are, and I can't check right now because xfs_repair is running, but there are a lot, as I have several hardlinks to every file. Markus Malkusch PS: Could you CC me, because I'm not on the mailing list. 
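One generic way to see whether a long-running process such as xfs_repair is still doing disk I/O is to sample its per-process I/O counters twice and compare them. This is only a sketch, assuming a kernel that exposes `/proc/<pid>/io` (not every kernel of this era does) and that the caller is allowed to read it. The function names are made up for illustration, and the PID has to be looked up separately.

```python
import time
from pathlib import Path


def parse_proc_io(text: str) -> dict:
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value.strip())
    return counters


def io_delta(before: dict, after: dict) -> dict:
    """Bytes actually read from / written to storage between two samples."""
    return {k: after[k] - before[k] for k in ("read_bytes", "write_bytes")}


def sample_io(pid: int, interval: float = 10.0) -> dict:
    """Sample /proc/<pid>/io twice, `interval` seconds apart."""
    path = Path(f"/proc/{pid}/io")
    before = parse_proc_io(path.read_text())
    time.sleep(interval)
    after = parse_proc_io(path.read_text())
    return io_delta(before, after)
```

Unchanged counters do not prove a hang: a phase that is purely CPU-bound, as the 107% CPU figure suggests here, can run for a long time without touching the disk.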
From owner-xfs@oss.sgi.com Tue Feb 5 06:53:30 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 06:53:33 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m15ErQlw000954 for ; Tue, 5 Feb 2008 06:53:30 -0800 X-ASG-Debug-ID: 1202223225-252e03d30000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5F5F45A0434 for ; Tue, 5 Feb 2008 06:53:46 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id kyCndqo2GARFkWUS for ; Tue, 05 Feb 2008 06:53:46 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 5266918DB2995; Tue, 5 Feb 2008 08:53:44 -0600 (CST) Message-ID: <47A87877.1050106@sandeen.net> Date: Tue, 05 Feb 2008 08:53:43 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Matthias Schniedermeyer CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mkfs.xfs doesn't detect size of storage correctly Subject: Re: mkfs.xfs doesn't detect size of storage correctly References: <20080129093201.GA16203@citd.de> In-Reply-To: <20080129093201.GA16203@citd.de> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202223229 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of 
TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41443 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5692/Mon Feb 4 23:21:55 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14343 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Matthias Schniedermeyer wrote: > There is roughly 1/3 missing. SGI, have you settled on a fix for this? Latest released xfsprogs, 2.9.5 has this problem, and I've put it in rawhide; it's also in the F9 alpha spin. Should I revert to 2.9.4 if you don't have plans to fix it soon? Nothing indicates to the user that their fs was misformatted, and there will be a growing number of filesystems missing a substantial portion of their space out there.... -Eric From owner-xfs@oss.sgi.com Tue Feb 5 06:56:15 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 06:56:17 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.7 required=5.0 tests=AWL,BAYES_00,HEADER_ESQ, J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m15EuCqB001392 for ; Tue, 5 Feb 2008 06:56:15 -0800 X-ASG-Debug-ID: 1202223394-429a015d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3D1905A0487 for ; Tue, 5 Feb 2008 06:56:34 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id zdsWFWoL3N6y2KJx for ; Tue, 05 Feb 2008 06:56:34 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) 
(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C01A918DB2995; Tue, 5 Feb 2008 08:56:01 -0600 (CST) Message-ID: <47A87901.7010508@sandeen.net> Date: Tue, 05 Feb 2008 08:56:01 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Jamie Tufnell CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Linux mkfs.xfs parameters for large files Subject: Re: Linux mkfs.xfs parameters for large files References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202223395 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41443 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5692/Mon Feb 4 23:21:55 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14344 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Jamie Tufnell wrote: > Hi, > > I've done some research into how to best-prepare our Linux (2.6.18) > XFS filesystem for a read-heavy workload of mostly large files (files > between 50MB and 500MB) and I have some questions on block-sizes / > extents. > > My experience with other filesystems leads me to believe I should > experiment with large block-sizes for this application. In trying to > do so, I found that the Linux XFS implementation forces the block size > to not exceed the page size -- 4kB -- which seems a bit small in this > case. 
At first, I thought this was *really* bad news for me but I've > since read some posts and the impression I'm getting is it's not that > big of a deal? In your careful benchmarking & testing, which aspects of the performance have you found do not meet your needs or expectations? That's always a good place to start. -Eric From owner-xfs@oss.sgi.com Tue Feb 5 07:35:21 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 07:35:24 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m15FZHLt003688 for ; Tue, 5 Feb 2008 07:35:21 -0800 X-ASG-Debug-ID: 1202225661-6bfc00400000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from enyo.dsw2k3.info (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 01C7A5A0DE4 for ; Tue, 5 Feb 2008 07:34:21 -0800 (PST) Received: from enyo.dsw2k3.info (enyo.dsw2k3.info [195.71.86.239]) by cuda.sgi.com with ESMTP id PAGFjQeGzFFkIYOj for ; Tue, 05 Feb 2008 07:34:21 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by enyo.dsw2k3.info (Postfix) with ESMTP id E79072BC4F; Tue, 5 Feb 2008 16:33:47 +0100 (CET) X-Virus-Scanned: ClamAV 0.91.2/5692/Mon Feb 4 23:21:55 2008 on oss.sgi.com X-Virus-Scanned: Debian amavisd-new at enyo.dsw2k3.info Received: from enyo.dsw2k3.info ([127.0.0.1]) by localhost (enyo.dsw2k3.info [127.0.0.1]) (amavisd-new, port 10024) with LMTP id kocG20k7WaHz; Tue, 5 Feb 2008 16:33:41 +0100 (CET) Received: from citd.de (p4FC4DAB0.dip.t-dialin.net [79.196.218.176]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by enyo.dsw2k3.info (Postfix) with ESMTP id E4A3D2BC40; Tue, 5 Feb 2008 
16:33:37 +0100 From: Matthias Schniedermeyer To: Eric Sandeen Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mkfs.xfs doesn't detect size of storage correctly Subject: Re: mkfs.xfs doesn't detect size of storage correctly Message-ID: <20080205153337.GA9915@citd.de> References: <20080129093201.GA16203@citd.de> <47A87877.1050106@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47A87877.1050106@sandeen.net> User-Agent: Mutt/1.5.17+20080114 (2008-01-14) X-Barracuda-Connect: enyo.dsw2k3.info[195.71.86.239] X-Barracuda-Start-Time: 1202225662 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-ASG-Whitelist: HEADER (^X-Barracuda-Connect: [^ ]+\.dsw2k3\.info\[) X-Virus-Status: Clean X-archive-position: 14345 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: ms@citd.de Precedence: bulk X-list: xfs On 05.02.2008 08:53, Eric Sandeen wrote: > Matthias Schniedermeyer wrote: > > > There is roughly 1/3 missing. > > SGI, have you settled on a fix for this? Latest released xfsprogs, > 2.9.5 has this problem, and I've put it in rawhide; it's also in the F9 > alpha spin. Should I revert to 2.9.4 if you don't have plans to fix it > soon? > > Nothing indicates to the user that their fs was misformatted, and there > will be a growing number of filesystems missing a substantial portion of > their space out there.... The big question here is whether you can grow the filesystem to fix it. Before I found the workaround with the agcount I tried to grow the fs, but it didn't work. So: I think a correctly(tm) fixed package must include an xfs_growfs that can fix a damaged(tm) fs, otherwise people who use such a filesystem (and can't backup/restore) are stuck with missing a large chunk of space. Bis denn -- Real Programmers consider "what you see is what you get" to be just as bad a concept in Text Editors as it is in women. 
No, the Real Programmer wants a "you asked for it, you got it" text editor -- complicated, cryptic, powerful, unforgiving, dangerous. From owner-xfs@oss.sgi.com Tue Feb 5 13:13:22 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 13:13:25 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m15LDGVJ027751 for ; Tue, 5 Feb 2008 13:13:20 -0800 Received: from [134.15.251.3] (melb-sw-corp-251-3.corp.sgi.com [134.15.251.3]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id IAA17721; Wed, 6 Feb 2008 08:13:28 +1100 Message-ID: <47A8D156.8060405@sgi.com> Date: Wed, 06 Feb 2008 08:12:54 +1100 From: Mark Goodwin Reply-To: markgw@sgi.com Organization: SGI Engineering User-Agent: Thunderbird 1.5.0.14 (Windows/20071210) MIME-Version: 1.0 To: Eric Sandeen CC: Matthias Schniedermeyer , xfs@oss.sgi.com Subject: Re: mkfs.xfs doesn't detect size of storage correctly References: <20080129093201.GA16203@citd.de> <47A87877.1050106@sandeen.net> In-Reply-To: <47A87877.1050106@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5700/Tue Feb 5 11:08:59 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14346 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: markgw@sgi.com Precedence: bulk X-list: xfs Eric Sandeen wrote: > Matthias Schniedermeyer wrote: > >> There is roughly 1/3 missing. > > SGI, have you settled on a fix for this? Yes, Barry is reviewing and testing your proposed fix. The delay was caused by most of the XFS team attending LCA last week. .. sorry. 
We'll get a new package out onto OSS asap. Cheers -- Mark > Latest released xfsprogs, > 2.9.5 has this problem, and I've put it in rawhide; it's also in the F9 > alpha spin. Should I revert to 2.9.4 if you don't have plans to fix it > soon? > > Nothing indicates to the user that their fs was misformatted, and there > will be a growing number of filesystems missing a substantial portion of > their space out there.... > > -Eric > > From owner-xfs@oss.sgi.com Tue Feb 5 14:51:29 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 14:51:36 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m15MpSua031723 for ; Tue, 5 Feb 2008 14:51:29 -0800 X-ASG-Debug-ID: 1202251908-1e9000e80000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from malkusch.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 19D9F5A43BB for ; Tue, 5 Feb 2008 14:51:49 -0800 (PST) Received: from malkusch.de (malkusch.de [217.171.203.96]) by cuda.sgi.com with ESMTP id UdEK31tYV6kbtH2H for ; Tue, 05 Feb 2008 14:51:49 -0800 (PST) Received: from massmann.mhn.de ([129.187.19.220] helo=ferrara.massmann.mhn.de) by malkusch.de with esmtpsa (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.68) (envelope-from ) id 1JMWdm-0002B0-Vt; Tue, 05 Feb 2008 23:51:47 +0100 From: Markus Malkusch To: Chris Wedgwood X-ASG-Orig-Subj: Re: xfs_repair progress Subject: Re: xfs_repair progress Date: Tue, 5 Feb 2008 23:51:40 +0100 User-Agent: KMail/1.9.7 Cc: xfs@oss.sgi.com References: <200802051525.25101.markus@malkusch.de> <20080205172414.GA801@puku.stupidest.org> In-Reply-To: <20080205172414.GA801@puku.stupidest.org> X-Face: 
L]ysRB@fzNk6qkK4ZAv:`;XNHsDCn=u0yWT&<_W5&QR,BABX[.eEQA}67D_fm(=?utf-8?q?=7EU=3BYgaIm=0A=09ordzS=2E=7E?=(CoDV+\i4n*[@aCEE(bi=HaM}',/)B/w,IeQ_"[(*1@,]_0!ge.2 X-Barracuda-Connect: malkusch.de[217.171.203.96] X-Barracuda-Start-Time: 1202251911 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41473 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5700/Tue Feb 5 11:08:59 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14347 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: markus@malkusch.de Precedence: bulk X-list: xfs Chris Wedgwood: > is it swapping? (run vmstat) I can't run vmstat, as I cancelled xfs_repair before your mail arrived. I remember that one CPU was fully loaded (top said 107% CPU for xfs_repair). The system didn't feel like it was swapping, but I also remember that there were about 300M in swap. Anyway, I deleted all those hardlinks, which reduced the number of inodes from about 1.5 million to 325735. Now I can run xfs_repair in two minutes. 
Markus Malkusch From owner-xfs@oss.sgi.com Tue Feb 5 15:41:51 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 15:41:54 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m15NflvM001610 for ; Tue, 5 Feb 2008 15:41:49 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA23246; Wed, 6 Feb 2008 10:42:05 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 1B5C558C4C11; Wed, 6 Feb 2008 10:42:05 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 976923 - Fix oops in xfs_file_readdir() Message-Id: <20080205234205.1B5C558C4C11@chook.melbourne.sgi.com> Date: Wed, 6 Feb 2008 10:42:05 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/5700/Tue Feb 5 11:08:59 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14348 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Fix oops in xfs_file_readdir() When xfs_file_readdir() exactly fills a buffer, it can move its index past the end of the buffer and dereference it even though the result of the dereference is never used. On some platforms this causes an oops. 
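The pattern described here, advancing an index and reading through it before checking that it is still inside the buffer, can be illustrated in a few lines. This is not the xfs_file_readdir() code. It is a hypothetical Python analogue in which the stray read raises IndexError, where C would touch memory past the end of the buffer. Both function names are invented for the illustration.

```python
def drain_unsafe(buf):
    """Advance, then read: on the last pass this reads buf[len(buf)],
    even though the value fetched there would never be used."""
    out = []
    i = 0
    entry = buf[0]
    while i < len(buf):
        out.append(entry)
        i += 1
        entry = buf[i]  # read happens before the loop re-checks the bound
    return out


def drain_safe(buf):
    """Only read through the index once it is known to be in range."""
    out = []
    i = 0
    while i < len(buf):
        out.append(buf[i])
        i += 1
    return out
```

The safe variant mirrors the stated fix only in spirit: the position is updated and used only when it is known to be valid.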
Date: Wed Feb 6 10:41:34 AEDT 2008 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: lachlan@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30458a fs/xfs/linux-2.6/xfs_file.c - 1.163 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_file.c.diff?r1=text&tr1=1.163&r2=text&tr2=1.162&f=h - Only update the current offset in xfs_file_readdir() when it is safe to do so. From owner-xfs@oss.sgi.com Tue Feb 5 16:20:02 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 16:20:05 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m160Jt4r007620 for ; Tue, 5 Feb 2008 16:20:00 -0800 Received: from pc-bnaujok.melbourne.sgi.com (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA24183; Wed, 6 Feb 2008 11:20:17 +1100 Date: Wed, 06 Feb 2008 11:20:45 +1100 To: "xfs@oss.sgi.com" Subject: [PATCH] Fix mkfs.xfs default AG sizing From: "Barry Naujok" Organization: SGI Cc: "Eric Sandeen" Content-Type: multipart/mixed; boundary=----------wK0q2RDgzG3paggCTWdLgh MIME-Version: 1.0 Message-ID: User-Agent: Opera Mail/9.24 (Win32) X-Virus-Scanned: ClamAV 0.91.2/5700/Tue Feb 5 11:08:59 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14349 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@sgi.com Precedence: bulk X-list: xfs ------------wK0q2RDgzG3paggCTWdLgh Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8 Content-Transfer-Encoding: 7bit Based on Eric's patches 
and my investigations, I have generated the attached patch. ------------wK0q2RDgzG3paggCTWdLgh Content-Disposition: attachment; filename=fix_mkfs.patch Content-Type: text/x-patch; name=fix_mkfs.patch Content-Transfer-Encoding: Quoted-Printable --- a/xfsprogs/mkfs/xfs_mkfs.c 2008-02-06 11:17:50.000000000 +1100 +++ b/xfsprogs/mkfs/xfs_mkfs.c 2008-02-06 11:17:10.769672851 +1100 @@ -412,7 +412,9 @@ /* * First handle the extremes - the points at which we will * always use the maximum AG size, the points at which we - * always use the minimum, and a "small-step" for 16-128Mb. + * always use the minimum, and a "small-step" for 16-128MB. + * + * These apply regardless of storage configuration. */ if (dblocks >=3D TERABYTES(32, blocklog)) { blocks =3D XFS_AG_MAX_BLOCKS(blocklog); @@ -427,6 +429,8 @@ } =20 /* + * Sizes between 128MB and 32TB: + * * For the remainder we choose an AG size based on the * number of data blocks available, trying to keep the * number of AGs relatively small (especially compared @@ -435,7 +439,9 @@ * count can be increased by growfs, so prefer to use * smaller counts at mkfs time. * - * This scales us up smoothly between min/max AG sizes. + * For a single underlying storage device less than 4TB + * in size, just use 4 AGs, otherwise (for JBOD/RAIDs) + * scale up smoothly between min/max AG sizes. */ =20 if (!multidisk) { @@ -443,12 +449,8 @@ blocks =3D XFS_AG_MAX_BLOCKS(blocklog); goto done; } - count =3D 4; - - goto done; - } - - if (dblocks > GIGABYTES(512, blocklog)) + shift =3D 2; + } else if (dblocks > GIGABYTES(512, blocklog)) shift =3D 5; else if (dblocks > GIGABYTES(8, blocklog)) shift =3D 4; @@ -456,14 +458,17 @@ shift =3D 3; else ASSERT(0); - blocks =3D dblocks >> shift; + /* + * If dblocks is not evenly divisible by the number of + * desired AGs, add 1 to "blocks" so we don't lose the + * last bit of the filesystem. The same principle applies + * to the AG count, so we don't lose the last AG! 
+ */ + blocks =3D (dblocks >> shift) + ((dblocks & xfs_mask32lo(shift)) !=3D 0); =20 done: - ASSERT (count || blocks); if (!count) count =3D dblocks / blocks + (dblocks % blocks !=3D 0); - if (!blocks) - blocks =3D dblocks / count; =20 *agsize =3D blocks; *agcount =3D count; @@ -1391,7 +1396,7 @@ =20 sectoralign =3D 0; xlv_dsunit =3D xlv_dswidth =3D 0; - if (!nodsflag && !xi.disfile) + if (!xi.disfile) get_subvol_stripe_wrapper(dfile, SVTYPE_DATA, &xlv_dsunit, &xlv_dswidth, &sectoralign); if (sectoralign) { @@ -1797,12 +1802,9 @@ =20 } else if (daflag) /* User-specified AG count */ agsize =3D dblocks / agcount + (dblocks % agcount !=3D 0); - else { - get_subvol_stripe_wrapper(dfile, SVTYPE_DATA, - &xlv_dsunit, &xlv_dswidth, &sectoralign), - calc_default_ag_geometry(blocklog, dblocks, xlv_dsunit | xlv_dswidth, - &agsize, &agcount); - } + else + calc_default_ag_geometry(blocklog, dblocks, + xlv_dsunit | xlv_dswidth, &agsize, &agcount); =20 /* * If the last AG is too small, reduce the filesystem size @@ -1817,24 +1819,28 @@ =20 validate_ag_geometry(blocklog, dblocks, agsize, agcount); =20 - if (!nodsflag && dsunit) { - if (xlv_dsunit && xlv_dsunit !=3D dsunit) { - fprintf(stderr, - _("%s: Specified data stripe unit %d is not " - "the same as the volume stripe unit %d\n"), - progname, dsunit, xlv_dsunit); - } - if (xlv_dswidth && xlv_dswidth !=3D dswidth) { - fprintf(stderr, - _("%s: Specified data stripe width %d is not " - "the same as the volume stripe width %d\n"), - progname, dswidth, xlv_dswidth); + if (!nodsflag) { + if (dsunit) { + if (xlv_dsunit && xlv_dsunit !=3D dsunit) { + fprintf(stderr, + _("%s: Specified data stripe unit %d " + "is not the same as the volume stripe " + "unit %d\n"), + progname, dsunit, xlv_dsunit); + } + if (xlv_dswidth && xlv_dswidth !=3D dswidth) { + fprintf(stderr, + _("%s: Specified data stripe width %d " + "is not the same as the volume stripe " + "width %d\n"), + progname, dswidth, xlv_dswidth); + } + } else { + dsunit =3D xlv_dsunit; + 
dswidth =3D xlv_dswidth; + nodsflag =3D 1; } - } else { - dsunit =3D xlv_dsunit; - dswidth =3D xlv_dswidth; - nodsflag =3D 1; - } + } /* else dsunit & dswidth can't be set if nodsflag is set */ =20 /* * If dsunit is a multiple of fs blocksize, then check that is a ------------wK0q2RDgzG3paggCTWdLgh-- From owner-xfs@oss.sgi.com Tue Feb 5 17:15:19 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 17:15:22 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m161FExN009819 for ; Tue, 5 Feb 2008 17:15:19 -0800 X-ASG-Debug-ID: 1202260537-1cce01dd0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ishtar.tlinx.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C8B86D9F266 for ; Tue, 5 Feb 2008 17:15:37 -0800 (PST) Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by cuda.sgi.com with ESMTP id 9B5AeFacX7ejkKHg for ; Tue, 05 Feb 2008 17:15:37 -0800 (PST) Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id m161FaAJ017037 for ; Tue, 5 Feb 2008 17:15:36 -0800 Message-ID: <47A90A38.3050206@tlinx.org> Date: Tue, 05 Feb 2008 17:15:36 -0800 From: "Linda A. 
Walsh" User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: xfs-oss X-ASG-Orig-Subj: %inodes question Subject: %inodes question Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: ishtar.tlinx.org[64.81.245.74] X-Barracuda-Start-Time: 1202260537 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41475 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5700/Tue Feb 5 11:08:59 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14350 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: law@tlinx.org Precedence: bulk X-list: xfs Does this really matter much anymore? It defaults to 25%, but on a 1T disk, that's 250G "reserved"(?) for inodes? Even 1% still gives you 10G of inodes. 
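The figures in the question can be checked with a few lines of C. This is only an arithmetic sketch, assuming decimal units (1T = 10^12 bytes); max_inode_bytes() is a hypothetical helper for illustration, not part of xfsprogs, and it says nothing about whether that space is actually set aside.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound, in bytes, on the space inodes may occupy in a
 * filesystem of fs_bytes when the inode percentage is pct.
 * Hypothetical helper for illustration; not mkfs.xfs code.
 */
static uint64_t max_inode_bytes(uint64_t fs_bytes, unsigned int pct)
{
	assert(pct <= 100);
	/* Divide first so the multiplication cannot overflow 64 bits. */
	return fs_bytes / 100 * pct;
}
```

With fs_bytes = 1000000000000 this gives 250000000000 for 25% and 10000000000 for 1%, matching the 250G and 10G figures above.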
From owner-xfs@oss.sgi.com Tue Feb 5 17:43:57 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 17:44:04 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.4 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m161hsE1010926 for ; Tue, 5 Feb 2008 17:43:57 -0800 X-ASG-Debug-ID: 1202262251-1cd4023d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ishtar.tlinx.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BFE4FD9F488 for ; Tue, 5 Feb 2008 17:44:11 -0800 (PST) Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by cuda.sgi.com with ESMTP id GrCB02FSll45KvOM for ; Tue, 05 Feb 2008 17:44:11 -0800 (PST) Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id m161ClGB017011; Tue, 5 Feb 2008 17:12:52 -0800 Message-ID: <47A9098F.4020801@tlinx.org> Date: Tue, 05 Feb 2008 17:12:47 -0800 From: Linda Walsh User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: Michael Tokarev CC: Eric Sandeen , Justin Piszcz , Moshe Yudkowsky , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> <47A73F90.3020307@msgid.tls.msk.ru> In-Reply-To: 
<47A73F90.3020307@msgid.tls.msk.ru> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: ishtar.tlinx.org[64.81.245.74] X-Barracuda-Start-Time: 1202262256 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41476 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5702/Tue Feb 5 16:46:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14351 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@tlinx.org Precedence: bulk X-list: xfs Michael Tokarev wrote: > Unfortunately an UPS does not *really* help here. Because unless > it has control program which properly shuts system down on the loss > of input power, and the battery really has the capacity to power the > system while it's shutting down (anyone tested this? ---- Yes. I must say, I am not connected to or paid by APC. > With new UPS? > and after an year of use, when the battery is not new?), -- unless > the UPS actually has the capacity to shutdown system, it will cut > the power at an unexpected time, while the disk(s) still has dirty > caches... -------- If you have a "SmartUPS" by "APC", there is a freeware daemon that monitors its status. The UPS has USB and serial connections. It's included in some distributions (SuSE). The config file is pretty straightforward. I recommend the "1000XL" (1000 peak Volt-Amp load -- usually at startup; note that this is not the same as watts, as some of us were taught in basic electronics class, since the unit isn't a simple resistor like a light bulb) 
over the 1500XL because with the 1000XL, you can buy several "add-on batteries" that plug into the back. One minor (but not fatal) design flaw: the add-on batteries give no indication that they are "live" (I knocked a cord on one, and only got 7 minutes of uptime before things shut down instead of my expected 20). I have 3 cells total (controller & 1 extra pack). So why is my run time so short? I am being lazy in buying more extension packs. The UPS is running 3 computers, the house phone (answering machine and wireless handsets), a digital clock, and 1 LCD (usually off). The real killer is a new workstation with 2x2-Core-II chips and other comparable equipment. The "1500XL" doesn't allow for adding more power packs. The "2200XL" does allow extra packs but comes in a rack-mount format. It's not just a battery backup -- it conditions the power to filter out spikes and emit a pure sine wave. It will kick in during over- or under-voltage conditions (you can set the sensitivity). It has an adjustable alarm when on battery and a selectable output voltage (115, 230, 120, 240). It self-tests at least every 2 weeks, or more often (to your fancy). It also has a network feature (that I haven't gotten to work yet -- they just changed the format) that allows other computers on the same net to also be notified and take action. You specify what scripts to run at what times (power off, power on, getting critically low, etc). Hasn't failed me 'yet' -- except when a charger died and was replaced free of cost (within warranty). I have a separate setup in another room for another computer. The upspowerd runs on linux or windows (under cygwin, I think). You can specify when to shut down -- like "5 minutes of battery life left". The controller unit has 1 battery. But the add-ons have 2 batteries each, so the first add-on adds 3x to the run-time. When my system did shut down "prematurely", it went through the full "halt" sequence, which I'd presume flushes disk caches. 
> >> the drive claims to have metadata safe on disk but actually does not, >> and you lose power, the data claimed safe will evaporate, there's not >> much the fs can do. IO write barriers address this by forcing the drive >> to flush order-critical data before continuing; xfs has them on by >> default, although they are tested at mount time and if you have >> something in between xfs and the disks which does not support barriers >> (i.e. lvm...) then they are disabled again, with a notice in the logs. > Note also that with linux software raid barriers are NOT supported. ------ Are you sure about this? I used to have 3 new IDE drives and one older one. When my system booted, XFS checked each drive for barriers and turned off barriers for the disk that didn't support them. ... or are you referring specifically to linux-raid setups? Would it be possible on boot to have xfs probe the RAID array, physically, to see if barriers are really supported (or not), and disable them if they are not (and optionally disable write caching, though that's a major performance hit in my experience)? 
Linda From owner-xfs@oss.sgi.com Tue Feb 5 18:03:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 18:03:08 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1622xcR011848 for ; Tue, 5 Feb 2008 18:03:02 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA26582 for ; Wed, 6 Feb 2008 13:03:21 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 20C7158C4C11; Wed, 6 Feb 2008 13:03:21 +1100 (EST) To: xfs@oss.sgi.com Subject: TAKE 971186 - add __init/__exit mark to specific init/cleanup functions Message-Id: <20080206020321.20C7158C4C11@chook.melbourne.sgi.com> Date: Wed, 6 Feb 2008 13:03:21 +1100 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/5702/Tue Feb 5 16:46:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14352 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs add __init/__exit mark to specific init/cleanup functions Date: Wed Feb 6 11:36:42 AEDT 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-init Inspected by: Denis Cheng Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30459a fs/xfs/xfs_vfsops.c - 1.551 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.551&r2=text&tr2=1.550&f=h fs/xfs/support/uuid.c - 1.21 - changed 
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/support/uuid.c.diff?r1=text&tr1=1.21&r2=text&tr2=1.20&f=h fs/xfs/support/ktrace.c - 1.28 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/support/ktrace.c.diff?r1=text&tr1=1.28&r2=text&tr2=1.27&f=h fs/xfs/linux-2.6/xfs_vnode.c - 1.156 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_vnode.c.diff?r1=text&tr1=1.156&r2=text&tr2=1.155&f=h fs/xfs/linux-2.6/xfs_super.c - 1.408 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_super.c.diff?r1=text&tr1=1.408&r2=text&tr2=1.407&f=h - add __init/__exit mark to specific init/cleanup functions From owner-xfs@oss.sgi.com Tue Feb 5 18:12:23 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 18:12:26 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_32 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m162CLp4012392 for ; Tue, 5 Feb 2008 18:12:23 -0800 X-ASG-Debug-ID: 1202263962-3b49032c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from hobbit.corpit.ru (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 53DD7D9F77F for ; Tue, 5 Feb 2008 18:12:43 -0800 (PST) Received: from hobbit.corpit.ru (hobbit.corpit.ru [81.13.94.6]) by cuda.sgi.com with ESMTP id hAPFNhEA0rny5gew for ; Tue, 05 Feb 2008 18:12:43 -0800 (PST) Received: from [192.168.1.200] (mjt.ppp.tls.msk.ru [192.168.1.200]) by hobbit.corpit.ru (Postfix) with ESMTP id 92E262B08C; Wed, 6 Feb 2008 05:12:40 +0300 (MSK) (envelope-from mjt@tls.msk.ru) Message-ID: <47A91798.4090703@msgid.tls.msk.ru> Date: Wed, 06 Feb 2008 05:12:40 +0300 From: Michael Tokarev Organization: Telecom Service, JSC User-Agent: Icedove 1.5.0.14pre (X11/20071018) MIME-Version: 1.0 To: Linda Walsh CC: 
Eric Sandeen , Justin Piszcz , Moshe Yudkowsky , linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) Subject: Re: RAID needs more to survive a power hit, different /boot layout for example (was Re: draft howto on making raids for surviving a disk crash) References: <47A612BE.5050707@pobox.com> <47A623EE.4050305@msgid.tls.msk.ru> <47A62A17.70101@pobox.com> <47A6DA81.3030008@msgid.tls.msk.ru> <47A6EFCF.9080906@pobox.com> <47A7188A.4070005@msgid.tls.msk.ru> <47A72061.3010800@sandeen.net> <47A73F90.3020307@msgid.tls.msk.ru> <47A9098F.4020801@tlinx.org> In-Reply-To: <47A9098F.4020801@tlinx.org> X-Enigmail-Version: 0.94.2.0 OpenPGP: id=4F9CF57E Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: hobbit.corpit.ru[81.13.94.6] X-Barracuda-Start-Time: 1202263964 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41477 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5702/Tue Feb 5 16:46:53 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14353 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mjt@tls.msk.ru Precedence: bulk X-list: xfs Linda Walsh wrote: > > Michael Tokarev wrote: >> Unfortunately an UPS does not *really* help here. 
Because unless >> it has control program which properly shuts system down on the loss >> of input power, and the battery really has the capacity to power the >> system while it's shutting down (anyone tested this? > ---- > Yes. I must say, I am not connected or paid by APC. > >> With new UPS? >> and after an year of use, when the battery is not new?), -- unless >> the UPS actually has the capacity to shutdown system, it will cut >> the power at an unexpected time, while the disk(s) still has dirty >> caches... > -------- > If you have a "SmartUPS" by "APC", their is a freeware demon that monitors [...] Good stuff. I knew at least SOME UPSes are good... ;) Too bad I rarely see such stuff in use by regular home users... [] >> Note also that with linux software raid barriers are NOT supported. > ------ > Are you sure about this? When my system boots, I used to have > 3 new IDE's, and one older one. XFS checked each drive for barriers > and turned off barriers for a disk that didn't support it. ... or > are you referring specifically to linux-raid setups? I'm referring specifically to linux-raid setups (software raid). md devices don't support barriers, for a very simple reason: once more than one disk drive is involved, the md layer can't guarantee ordering ACROSS drives. The problem is that in case of power loss during writes, when an array needs recovery/resync (at least the parts which were being written, if bitmaps are in use), the md layer will choose an arbitrary drive as a "master" and will copy data to the other drive (speaking of the simplest case of a 2-drive raid1 array). But the thing is that one drive may have the last two barriers written (I mean the data that was "associated" with the barriers), and the other neither of the two - in two different places. And hence we may see quite.. some inconsistency here. This is regardless of whether the underlying component devices support barriers or not. 
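The raid1 scenario above can be illustrated with a toy model. This is only a sketch under stated assumptions: struct mirror, apply_prefix() and resync() are hypothetical names invented here, the "disks" are small in-memory arrays, and none of this is md's actual code. It shows one reading of the problem: if one mirror completed both ordered writes and the other completed neither, a resync from the wrong "master" discards writes that had already reached a disk.

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS 4

/* Toy stand-in for one component drive of a 2-drive raid1 array. */
struct mirror {
	int blocks[NBLOCKS];
};

/* One write in an ordered sequence (a barrier separates each pair). */
struct write_op {
	int block;
	int value;
};

/* Apply only the first `done` writes: models a drive that lost power
 * after completing that many of the ordered writes. */
static void apply_prefix(struct mirror *m, const struct write_op *ops, int done)
{
	for (int i = 0; i < done; i++) {
		assert(ops[i].block >= 0 && ops[i].block < NBLOCKS);
		m->blocks[ops[i].block] = ops[i].value;
	}
}

/* Resync: copy every block from the chosen master over the other drive. */
static void resync(const struct mirror *master, struct mirror *other)
{
	memcpy(other->blocks, master->blocks, sizeof(other->blocks));
}

/* Returns 1 if both mirrors hold identical data, 0 otherwise. */
static int mirrors_equal(const struct mirror *a, const struct mirror *b)
{
	return memcmp(a->blocks, b->blocks, sizeof(a->blocks)) == 0;
}
```

If mirror A gets apply_prefix(..., 2) and mirror B gets apply_prefix(..., 0), then resync(&B, &A) leaves both mirrors identical but without either write, even though A had both on disk before the resync.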
> Would it be possible on boot to have xfs probe the Raid array, > physically, to see if barriers are really supported (or not), and disable > them if they are not (and optionally disabling write caching, but that's > a major performance hit in my experience. Xfs already probes the devices as you describe, exactly the same way as you've seen with your ide disks, and disables barriers. The question and confusion were about what happens when the barriers are disabled (provided, again, that we don't rely on UPS and other external things). As far as I understand, when barriers are working properly, xfs should be safe wrt power losses (still a bit unsure about this). Now, when barriers are turned off (for whatever reason), is it still as safe? I don't know. Does it use regular cache flushes in place of barriers in that case (which ARE supported by the md layer)? Generally, it has been said numerous times that XFS is not "powercut-friendly", and that it should be used only when everything is stable, including power. Hence I'm afraid to deploy it where I know the power is not stable (we have about 70 such places here, with servers in each, where they don't always replace UPS batteries in time - ext3fs has never crashed so far, while ext2 did). Thanks. 
/mjt From owner-xfs@oss.sgi.com Tue Feb 5 19:20:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 19:20:41 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m163KVLw015292 for ; Tue, 5 Feb 2008 19:20:36 -0800 X-ASG-Debug-ID: 1202268054-1cce03200000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 97433D9FF63 for ; Tue, 5 Feb 2008 19:20:54 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id FyBhIjGBtah3Atfn for ; Tue, 05 Feb 2008 19:20:54 -0800 (PST) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 9D5A618DB3450; Tue, 5 Feb 2008 21:20:51 -0600 (CST) Message-ID: <47A92793.8040006@sandeen.net> Date: Tue, 05 Feb 2008 21:20:51 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: "Linda A. 
Walsh" CC: xfs-oss X-ASG-Orig-Subj: Re: %inodes question Subject: Re: %inodes question References: <47A90A38.3050206@tlinx.org> In-Reply-To: <47A90A38.3050206@tlinx.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202268054 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41477 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5704/Tue Feb 5 18:30:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14354 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Linda A. Walsh wrote: > Does this really matter much anymore? It defaults to 25%, but > on a 1T disk, that's 250G "reserved"(?) for inodes? Even 1% > still gives you 10G of inodes. Not any more. xfsprogs 2.9.5 will put it at 5% for a 1T fs. And it's not reserved per se; it's just the maximum. Inodes are allocated dynamically up to that point. 
-Eric From owner-xfs@oss.sgi.com Tue Feb 5 20:45:08 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 20:45:15 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_63,J_CHICKENPOX_64 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m164j5lL022907 for ; Tue, 5 Feb 2008 20:45:08 -0800 X-ASG-Debug-ID: 1202273123-1cd403cc0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mta5.srv.hcvlny.cv.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1783CDA033D for ; Tue, 5 Feb 2008 20:45:23 -0800 (PST) Received: from mta5.srv.hcvlny.cv.net (mta5.srv.hcvlny.cv.net [167.206.4.200]) by cuda.sgi.com with ESMTP id vR2iAg4lpvffHKta for ; Tue, 05 Feb 2008 20:45:23 -0800 (PST) Received: from freyr.home (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta5.srv.hcvlny.cv.net (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) with ESMTP id <0JVS009FDVVJ3NZ0@mta5.srv.hcvlny.cv.net> for xfs@oss.sgi.com; Tue, 05 Feb 2008 23:45:22 -0500 (EST) Received: by freyr.home (Postfix, from userid 1000) id BC2E2800BA3; Tue, 05 Feb 2008 23:44:51 -0500 (EST) Date: Tue, 05 Feb 2008 23:44:51 -0500 From: "Josef 'Jeff' Sipek" X-ASG-Orig-Subj: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head Subject: [PATCH 1/1] XFS: Replace custom AIL linked-list code with struct list_head In-reply-to: <20080204205230.GA14084@lst.de> To: dgc@sgi.com, xfs@oss.sgi.com, hch@infradead.org Cc: "Josef 'Jeff' Sipek" Message-id: <1202273091-19629-1-git-send-email-jeffpc@josefsipek.net> X-Mailer: git-send-email 1.5.4.rc2.85.g9de45-dirty Content-transfer-encoding: 7BIT References: <20080204205230.GA14084@lst.de> X-Barracuda-Connect: 
mta5.srv.hcvlny.cv.net[167.206.4.200] X-Barracuda-Start-Time: 1202273128 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41480 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5704/Tue Feb 5 18:30:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14355 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jeffpc@josefsipek.net Precedence: bulk X-list: xfs Signed-off-by: Josef 'Jeff' Sipek --- This patch assumes you already have Dave Chinner's patch for xfsidbg_xlogitem and xfsidbg_xaildump is needed. Changes since V2: - remove extra parenthesis Changes since V1: - Pass around a pointer to the AIL, not the struct list_head - Make sure things compile & run with CONFIG_XFS_DEBUG --- fs/xfs/xfs_mount.h | 2 +- fs/xfs/xfs_trans.h | 7 +-- fs/xfs/xfs_trans_ail.c | 149 +++++++++++++++++++---------------------------- 3 files changed, 62 insertions(+), 96 deletions(-) diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index f7c620e..435d625 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -220,7 +220,7 @@ extern void xfs_icsb_sync_counters_flags(struct xfs_mount *, int); #endif typedef struct xfs_ail { - xfs_ail_entry_t xa_ail; + struct list_head xa_ail; uint xa_gen; struct task_struct *xa_task; xfs_lsn_t xa_target; diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h index 7f40628..50ce02b 100644 --- a/fs/xfs/xfs_trans.h +++ b/fs/xfs/xfs_trans.h @@ -113,13 +113,8 @@ struct xfs_mount; struct xfs_trans; struct xfs_dquot_acct; -typedef struct xfs_ail_entry { - struct xfs_log_item *ail_forw; /* AIL forw 
pointer */ - struct xfs_log_item *ail_back; /* AIL back pointer */ -} xfs_ail_entry_t; - typedef struct xfs_log_item { - xfs_ail_entry_t li_ail; /* AIL pointers */ + struct list_head li_ail; /* AIL pointers */ xfs_lsn_t li_lsn; /* last on-disk lsn */ struct xfs_log_item_desc *li_desc; /* ptr to current desc*/ struct xfs_mount *li_mountp; /* ptr to fs mount */ diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 4d6330e..0fe9d59 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -28,13 +28,13 @@ #include "xfs_trans_priv.h" #include "xfs_error.h" -STATIC void xfs_ail_insert(xfs_ail_entry_t *, xfs_log_item_t *); -STATIC xfs_log_item_t * xfs_ail_delete(xfs_ail_entry_t *, xfs_log_item_t *); -STATIC xfs_log_item_t * xfs_ail_min(xfs_ail_entry_t *); -STATIC xfs_log_item_t * xfs_ail_next(xfs_ail_entry_t *, xfs_log_item_t *); +STATIC void xfs_ail_insert(xfs_ail_t *, xfs_log_item_t *); +STATIC xfs_log_item_t * xfs_ail_delete(xfs_ail_t *, xfs_log_item_t *); +STATIC xfs_log_item_t * xfs_ail_min(xfs_ail_t *); +STATIC xfs_log_item_t * xfs_ail_next(xfs_ail_t *, xfs_log_item_t *); #ifdef DEBUG -STATIC void xfs_ail_check(xfs_ail_entry_t *, xfs_log_item_t *); +STATIC void xfs_ail_check(xfs_ail_t *, xfs_log_item_t *); #else #define xfs_ail_check(a,l) #endif /* DEBUG */ @@ -57,7 +57,7 @@ xfs_trans_tail_ail( xfs_log_item_t *lip; spin_lock(&mp->m_ail_lock); - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + lip = xfs_ail_min(&mp->m_ail); if (lip == NULL) { lsn = (xfs_lsn_t)0; } else { @@ -91,7 +91,7 @@ xfs_trans_push_ail( { xfs_log_item_t *lip; - lip = xfs_ail_min(&mp->m_ail.xa_ail); + lip = xfs_ail_min(&mp->m_ail); if (lip && !XFS_FORCED_SHUTDOWN(mp)) { if (XFS_LSN_CMP(threshold_lsn, mp->m_ail.xa_target) > 0) xfsaild_wakeup(mp, threshold_lsn); @@ -111,15 +111,17 @@ xfs_trans_first_push_ail( { xfs_log_item_t *lip; - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + lip = xfs_ail_min(&mp->m_ail); *gen = (int)mp->m_ail.xa_gen; if (lsn == 0) return lip; - while (lip && 
(XFS_LSN_CMP(lip->li_lsn, lsn) < 0)) - lip = lip->li_ail.ail_forw; + list_for_each_entry(lip, &mp->m_ail.xa_ail, li_ail) { + if (XFS_LSN_CMP(lip->li_lsn, lsn) >= 0) + return lip; + } - return lip; + return NULL; } /* @@ -326,7 +328,7 @@ xfs_trans_unlocked_item( * the call to xfs_log_move_tail() doesn't do anything if there's * not enough free space to wake people up so we're safe calling it. */ - min_lip = xfs_ail_min(&mp->m_ail.xa_ail); + min_lip = xfs_ail_min(&mp->m_ail); if (min_lip == lip) xfs_log_move_tail(mp, 1); @@ -354,15 +356,13 @@ xfs_trans_update_ail( xfs_log_item_t *lip, xfs_lsn_t lsn) __releases(mp->m_ail_lock) { - xfs_ail_entry_t *ailp; xfs_log_item_t *dlip=NULL; xfs_log_item_t *mlip; /* ptr to minimum lip */ - ailp = &(mp->m_ail.xa_ail); - mlip = xfs_ail_min(ailp); + mlip = xfs_ail_min(&mp->m_ail); if (lip->li_flags & XFS_LI_IN_AIL) { - dlip = xfs_ail_delete(ailp, lip); + dlip = xfs_ail_delete(&mp->m_ail, lip); ASSERT(dlip == lip); } else { lip->li_flags |= XFS_LI_IN_AIL; @@ -370,11 +370,11 @@ xfs_trans_update_ail( lip->li_lsn = lsn; - xfs_ail_insert(ailp, lip); + xfs_ail_insert(&mp->m_ail, lip); mp->m_ail.xa_gen++; if (mlip == dlip) { - mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); + mlip = xfs_ail_min(&mp->m_ail); spin_unlock(&mp->m_ail_lock); xfs_log_move_tail(mp, mlip->li_lsn); } else { @@ -404,14 +404,12 @@ xfs_trans_delete_ail( xfs_mount_t *mp, xfs_log_item_t *lip) __releases(mp->m_ail_lock) { - xfs_ail_entry_t *ailp; xfs_log_item_t *dlip; xfs_log_item_t *mlip; if (lip->li_flags & XFS_LI_IN_AIL) { - ailp = &(mp->m_ail.xa_ail); - mlip = xfs_ail_min(ailp); - dlip = xfs_ail_delete(ailp, lip); + mlip = xfs_ail_min(&mp->m_ail); + dlip = xfs_ail_delete(&mp->m_ail, lip); ASSERT(dlip == lip); @@ -420,7 +418,7 @@ xfs_trans_delete_ail( mp->m_ail.xa_gen++; if (mlip == dlip) { - mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); + mlip = xfs_ail_min(&mp->m_ail); spin_unlock(&mp->m_ail_lock); xfs_log_move_tail(mp, (mlip ? 
mlip->li_lsn : 0)); } else { @@ -458,7 +456,7 @@ xfs_trans_first_ail( { xfs_log_item_t *lip; - lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + lip = xfs_ail_min(&mp->m_ail); *gen = (int)mp->m_ail.xa_gen; return lip; @@ -482,9 +480,9 @@ xfs_trans_next_ail( ASSERT(mp && lip && gen); if (mp->m_ail.xa_gen == *gen) { - nlip = xfs_ail_next(&(mp->m_ail.xa_ail), lip); + nlip = xfs_ail_next(&mp->m_ail, lip); } else { - nlip = xfs_ail_min(&(mp->m_ail).xa_ail); + nlip = xfs_ail_min(&mp->m_ail); *gen = (int)mp->m_ail.xa_gen; if (restarts != NULL) { XFS_STATS_INC(xs_push_ail_restarts); @@ -514,8 +512,7 @@ int xfs_trans_ail_init( xfs_mount_t *mp) { - mp->m_ail.xa_ail.ail_forw = (xfs_log_item_t*)&mp->m_ail.xa_ail; - mp->m_ail.xa_ail.ail_back = (xfs_log_item_t*)&mp->m_ail.xa_ail; + INIT_LIST_HEAD(&mp->m_ail.xa_ail); return xfsaild_start(mp); } @@ -534,7 +531,7 @@ xfs_trans_ail_destroy( */ STATIC void xfs_ail_insert( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) /* ARGSUSED */ { @@ -543,27 +540,22 @@ xfs_ail_insert( /* * If the list is empty, just insert the item. 
*/ - if (base->ail_back == (xfs_log_item_t*)base) { - base->ail_forw = lip; - base->ail_back = lip; - lip->li_ail.ail_forw = (xfs_log_item_t*)base; - lip->li_ail.ail_back = (xfs_log_item_t*)base; + if (list_empty(&ailp->xa_ail)) { + list_add(&lip->li_ail, &ailp->xa_ail); return; } - next_lip = base->ail_back; - while ((next_lip != (xfs_log_item_t*)base) && - (XFS_LSN_CMP(next_lip->li_lsn, lip->li_lsn) > 0)) { - next_lip = next_lip->li_ail.ail_back; + list_for_each_entry_reverse(next_lip, &ailp->xa_ail, li_ail) { + if (XFS_LSN_CMP(next_lip->li_lsn, lip->li_lsn) <= 0) + break; } - ASSERT((next_lip == (xfs_log_item_t*)base) || + + ASSERT((&next_lip->li_ail == &ailp->xa_ail) || (XFS_LSN_CMP(next_lip->li_lsn, lip->li_lsn) <= 0)); - lip->li_ail.ail_forw = next_lip->li_ail.ail_forw; - lip->li_ail.ail_back = next_lip; - next_lip->li_ail.ail_forw = lip; - lip->li_ail.ail_forw->li_ail.ail_back = lip; - xfs_ail_check(base, lip); + list_add(&lip->li_ail, &next_lip->li_ail); + + xfs_ail_check(ailp, lip); return; } @@ -573,15 +565,13 @@ xfs_ail_insert( /*ARGSUSED*/ STATIC xfs_log_item_t * xfs_ail_delete( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) /* ARGSUSED */ { - xfs_ail_check(base, lip); - lip->li_ail.ail_forw->li_ail.ail_back = lip->li_ail.ail_back; - lip->li_ail.ail_back->li_ail.ail_forw = lip->li_ail.ail_forw; - lip->li_ail.ail_forw = NULL; - lip->li_ail.ail_back = NULL; + xfs_ail_check(ailp, lip); + + list_del(&lip->li_ail); return lip; } @@ -592,14 +582,13 @@ xfs_ail_delete( */ STATIC xfs_log_item_t * xfs_ail_min( - xfs_ail_entry_t *base) + xfs_ail_t *ailp) /* ARGSUSED */ { - register xfs_log_item_t *forw = base->ail_forw; - if (forw == (xfs_log_item_t*)base) { + if (list_empty(&ailp->xa_ail)) return NULL; - } - return forw; + + return list_first_entry(&ailp->xa_ail, xfs_log_item_t, li_ail); } /* @@ -609,15 +598,14 @@ xfs_ail_min( */ STATIC xfs_log_item_t * xfs_ail_next( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) /* ARGSUSED */ 
{ - if (lip->li_ail.ail_forw == (xfs_log_item_t*)base) { + if (lip->li_ail.next == &ailp->xa_ail) return NULL; - } - return lip->li_ail.ail_forw; + return list_first_entry(&lip->li_ail, xfs_log_item_t, li_ail); } #ifdef DEBUG @@ -626,57 +614,40 @@ xfs_ail_next( */ STATIC void xfs_ail_check( - xfs_ail_entry_t *base, + xfs_ail_t *ailp, xfs_log_item_t *lip) { xfs_log_item_t *prev_lip; - prev_lip = base->ail_forw; - if (prev_lip == (xfs_log_item_t*)base) { - /* - * Make sure the pointers are correct when the list - * is empty. - */ - ASSERT(base->ail_back == (xfs_log_item_t*)base); + if (list_empty(&ailp->xa_ail)) return; - } /* * Check the next and previous entries are valid. */ ASSERT((lip->li_flags & XFS_LI_IN_AIL) != 0); - prev_lip = lip->li_ail.ail_back; - if (prev_lip != (xfs_log_item_t*)base) { - ASSERT(prev_lip->li_ail.ail_forw == lip); + prev_lip = list_entry(lip->li_ail.prev, xfs_log_item_t, li_ail); + if (&prev_lip->li_ail != &ailp->xa_ail) ASSERT(XFS_LSN_CMP(prev_lip->li_lsn, lip->li_lsn) <= 0); - } - prev_lip = lip->li_ail.ail_forw; - if (prev_lip != (xfs_log_item_t*)base) { - ASSERT(prev_lip->li_ail.ail_back == lip); + + prev_lip = list_entry(lip->li_ail.next, xfs_log_item_t, li_ail); + if (&prev_lip->li_ail != &ailp->xa_ail) ASSERT(XFS_LSN_CMP(prev_lip->li_lsn, lip->li_lsn) >= 0); - } #ifdef XFS_TRANS_DEBUG /* - * Walk the list checking forward and backward pointers, - * lsn ordering, and that every entry has the XFS_LI_IN_AIL - * flag set. This is really expensive, so only do it when - * specifically debugging the transaction subsystem. + * Walk the list checking lsn ordering, and that every entry has the + * XFS_LI_IN_AIL flag set. This is really expensive, so only do it + * when specifically debugging the transaction subsystem. 
*/ - prev_lip = (xfs_log_item_t*)base; - while (lip != (xfs_log_item_t*)base) { - if (prev_lip != (xfs_log_item_t*)base) { - ASSERT(prev_lip->li_ail.ail_forw == lip); + prev_lip = list_entry(&ailp->xa_ail, xfs_log_item_t, li_ail); + list_for_each_entry(lip, &ailp->xa_ail, li_ail) { + if (&prev_lip->li_ail != &ailp->xa_ail) ASSERT(XFS_LSN_CMP(prev_lip->li_lsn, lip->li_lsn) <= 0); - } - ASSERT(lip->li_ail.ail_back == prev_lip); ASSERT((lip->li_flags & XFS_LI_IN_AIL) != 0); prev_lip = lip; - lip = lip->li_ail.ail_forw; } - ASSERT(lip == (xfs_log_item_t*)base); - ASSERT(base->ail_back == prev_lip); #endif /* XFS_TRANS_DEBUG */ } #endif /* DEBUG */ -- 1.5.4.rc2.85.g9de45-dirty From owner-xfs@oss.sgi.com Tue Feb 5 20:59:08 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 20:59:18 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m164x5RS023720 for ; Tue, 5 Feb 2008 20:59:08 -0800 X-ASG-Debug-ID: 1202273735-510900230000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4CF355A566F; Tue, 5 Feb 2008 20:55:35 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id 6G8K0iz6kUf8KrDh; Tue, 05 Feb 2008 20:55:35 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JMcJo-0004rB-2c; Wed, 06 Feb 2008 04:55:32 +0000 Date: Tue, 5 Feb 2008 23:55:32 -0500 From: Christoph Hellwig To: David Chinner Cc: Sven Geggus , xfs@oss.sgi.com, Tobias Ulmer , Andrea Perotti X-ASG-Orig-Subj: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Subject: Re: [PATCH] Possible 
fix for 2.6.24 xfs_file_readdir crash Message-ID: <20080206045532.GA16437@infradead.org> References: <20080205052418.GU155259@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080205052418.GU155259@sgi.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202273738 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41481 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5704/Tue Feb 5 18:30:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14356 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Looks good to me. Well not really good but like fixing the corner case. We really need to get rid of this stupid loop. I'll put fixing nfsd up higher in my todo list. 
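[Editorial note] The corner case in this thread is a loop that walks variable-length records packed into a buffer and reads the next record's header before checking whether any data is left. When the buffer is filled exactly, that read lands one record past the end. The following is a minimal user-space sketch of the corrected loop shape, not the kernel code: struct rec_hdr, rec_len(), rec_put() and rec_walk() are hypothetical names invented for illustration, and the buffer is assumed to hold only well-formed records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size record header; a stand-in for struct hack_dirent. */
struct rec_hdr {
	uint64_t offset;	/* cookie reported for this entry */
	uint64_t namlen;	/* bytes of name data following the header */
};

/* Record length: header plus name, rounded up to a multiple of 8 bytes. */
static size_t rec_len(uint64_t namlen)
{
	return (sizeof(struct rec_hdr) + (size_t)namlen + 7) & ~(size_t)7;
}

/*
 * Append one record at buf + used. Returns the new fill level, or the
 * old one unchanged if the record does not fit in the remaining space.
 */
static size_t rec_put(char *buf, size_t cap, size_t used,
		      uint64_t offset, const char *name)
{
	struct rec_hdr hdr = { offset, strlen(name) };
	size_t len = rec_len(hdr.namlen);

	if (len > cap - used)
		return used;
	memset(buf + used, 0, len);		/* zero the padding bytes */
	memcpy(buf + used, &hdr, sizeof(hdr));
	memcpy(buf + used + sizeof(hdr), name, (size_t)hdr.namlen);
	return used + len;
}

/*
 * Walk every record in buf[0..used). The header is read only inside the
 * loop, after the bounds check, so nothing at or beyond buf + used is
 * touched even when the buffer is completely full. Returns the number of
 * records and stores the offset of the last one in *last_offset.
 */
static int rec_walk(const char *buf, size_t used, uint64_t *last_offset)
{
	size_t pos = 0;
	int count = 0;

	while (pos < used) {
		struct rec_hdr hdr;

		/* Assumption: records are well formed, so a full header remains. */
		assert(used - pos >= sizeof(hdr));
		memcpy(&hdr, buf + pos, sizeof(hdr));
		*last_offset = hdr.offset;
		count++;
		pos += rec_len(hdr.namlen);
	}
	return count;
}
```

With a 48-byte buffer, two records named "a" and "bcdefgh" take 24 bytes each and fill it exactly; rec_walk() then visits both and stops without looking at anything beyond the buffer. The buggy shape would instead load the next header at the bottom of the loop body, which in this case means reading at buf + 48.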
From owner-xfs@oss.sgi.com Tue Feb 5 21:05:21 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Feb 2008 21:05:27 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1655HIF024224 for ; Tue, 5 Feb 2008 21:05:21 -0800 X-ASG-Debug-ID: 1202274311-64c400fb0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1537DDA07C0 for ; Tue, 5 Feb 2008 21:05:11 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id WGqjiFxanKgJhG7A for ; Tue, 05 Feb 2008 21:05:11 -0800 (PST) Received: from liberator.sandeen.net (sandeen.net [209.173.210.139]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 2ED7B18DB3450; Tue, 5 Feb 2008 23:04:38 -0600 (CST) Message-ID: <47A93FE4.3000808@sandeen.net> Date: Tue, 05 Feb 2008 23:04:36 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Barry Naujok CC: "xfs@oss.sgi.com" X-ASG-Orig-Subj: Re: [PATCH] Fix mkfs.xfs default AG sizing Subject: Re: [PATCH] Fix mkfs.xfs default AG sizing References: In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202274312 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, 
rules version 3.1.41482 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5704/Tue Feb 5 18:30:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14357 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Barry Naujok wrote: > Based on Eric's patches and my investigations, I have generated > the attached patch. > Based on some testing of random fs sizes :), seems to resolve the problem. Thanks, -Eric From owner-xfs@oss.sgi.com Wed Feb 6 01:46:05 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Feb 2008 01:46:12 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m169k1l2009273 for ; Wed, 6 Feb 2008 01:46:05 -0800 X-ASG-Debug-ID: 1202291183-17e401850000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.emlix.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id EAF8095609D; Wed, 6 Feb 2008 01:46:24 -0800 (PST) Received: from mx1.emlix.com (mx1.emlix.com [193.175.82.87]) by cuda.sgi.com with ESMTP id nLjj5YpFEQwawB9M; Wed, 06 Feb 2008 01:46:24 -0800 (PST) Received: from gate.emlix.com ([193.175.27.217]:41136 helo=mailer.emlix.com) by mx1.emlix.com with esmtp (Exim 4.63) (envelope-from ) id 1JMh4l-0003ph-Vh; Wed, 06 Feb 2008 11:00:19 +0100 Received: by mailer.emlix.com id 1JMgrB-0002nu-Lq; Wed, 06 Feb 2008 10:46:17 +0100 Received: by spinat.emlix.com (Postfix, from userid 2047) id 9263F7F452; Wed, 6 Feb 2008 10:46:17 +0100 (CET) Date: Wed, 6 Feb 2008 10:46:17 +0100 From: Tobias Ulmer To: David Chinner 
Cc: Sven Geggus , xfs@oss.sgi.com, Andrea Perotti X-ASG-Orig-Subj: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Subject: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash References: <20080205052418.GU155259@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080205052418.GU155259@sgi.com> User-Agent: Mutt/1.5.15+20070412 (2007-04-11) Message-Id: Organization: emlix gmbh, Goettingen, Germany X-Barracuda-Connect: mx1.emlix.com[193.175.82.87] X-Barracuda-Start-Time: 1202291184 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41500 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5710/Tue Feb 5 22:45:46 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14358 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tu@emlix.com Precedence: bulk X-list: xfs On Tue, Feb 05, 2008 at 04:24:18PM +1100, David Chinner wrote: > Sven, Tomas, Andrea: > > Can you try the patch attached below to see if it fixes the > xfs_file_readdir() oops you are seeing and let me know if it fixes > the problem? Works for me(TM) :) My testbox survived 24h with this patch, no problems. Tobias > > It looks like we're dereferencing a pointer beyond the end of a buffer > if the buffer is filled exactly. This bug does not crash ia64 (even > with memory poisoning enabled), which is why the targeted corner > case testing I did a while back did not pick this up when fixing a > similar bug a month ago. > > Cheers, > > Dave.
> -- > Dave Chinner > Principal Engineer > SGI Australian Software Group > > --- > Fix yet another corner case oops in xfs_file_readdir(). > > Signed-off-by: Dave Chinner > --- > fs/xfs/linux-2.6/xfs_file.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_file.c 2008-01-16 16:24:01.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c 2008-02-05 15:13:17.153110696 +1100 > @@ -351,8 +351,8 @@ xfs_file_readdir( > > size = buf.used; > de = (struct hack_dirent *)buf.dirent; > - curr_offset = de->offset /* & 0x7fffffff */; > while (size > 0) { > + curr_offset = de->offset /* & 0x7fffffff */; > if (filldir(dirent, de->name, de->namlen, > curr_offset & 0x7fffffff, > de->ino, de->d_type)) { > @@ -363,7 +363,6 @@ xfs_file_readdir( > sizeof(u64)); > size -= reclen; > de = (struct hack_dirent *)((char *)de + reclen); > - curr_offset = de->offset /* & 0x7fffffff */; > } > } > From owner-xfs@oss.sgi.com Wed Feb 6 01:59:39 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Feb 2008 01:59:42 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m169xbG2010081 for ; Wed, 6 Feb 2008 01:59:39 -0800 X-ASG-Debug-ID: 1202291999-5bd902550000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from magneto.unbit.it (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 31351DA0DAA for ; Wed, 6 Feb 2008 01:59:59 -0800 (PST) Received: from magneto.unbit.it (80.68.207.61 [80.68.207.61]) by cuda.sgi.com with ESMTP id eFMu77ziXc0ExEm6 for ; Wed, 06 Feb 2008 01:59:59 -0800 (PST) Received: by 
magneto.unbit.it (Postfix, from userid 33) id 365E4A05E189; Wed, 6 Feb 2008 10:59:57 +0100 (CET) Received: from 212.48.3.175 (SquirrelMail authenticated user andreamtp) by manage.unbit.it with HTTP; Wed, 6 Feb 2008 10:59:57 +0100 (CET) Message-ID: <39714.212.48.3.175.1202291997.squirrel@manage.unbit.it> In-Reply-To: <20080205052418.GU155259@sgi.com> References: <20080205052418.GU155259@sgi.com> Date: Wed, 6 Feb 2008 10:59:57 +0100 (CET) X-ASG-Orig-Subj: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Subject: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash From: ".:deadhead:." To: "David Chinner" Cc: "Sven Geggus" , xfs@oss.sgi.com, "Tobias Ulmer" , "Andrea Perotti" User-Agent: SquirrelMail/1.4.6 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal X-Barracuda-Connect: 80.68.207.61[80.68.207.61] X-Barracuda-Start-Time: 1202292000 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41500 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5710/Tue Feb 5 22:45:46 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14359 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: deadhead@goodfellow.it Precedence: bulk X-list: xfs > Sven, Tomas, Andrea: > > Can you try the patch attached below to see if it fixes the > xfs_file_readdir() oops you are seeing and let me know if it fixes > the problem? Works like a charm here :D Thank you for your fast fix ! 
cheers Andrea From owner-xfs@oss.sgi.com Wed Feb 6 06:08:50 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Feb 2008 06:08:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m16E8mJE025113 for ; Wed, 6 Feb 2008 06:08:50 -0800 X-ASG-Debug-ID: 1202306947-3a30002a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from il.marvell.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A1047DA3AE1 for ; Wed, 6 Feb 2008 06:09:08 -0800 (PST) Received: from il.marvell.com (galiil.marvell.com [199.203.130.254]) by cuda.sgi.com with ESMTP id CRHkj7jmpYpgsyy0 for ; Wed, 06 Feb 2008 06:09:08 -0800 (PST) Received: from msil-owa01.marvell.com ([10.4.5.100]) by il.marvell.com (8.13.1/8.13.1) with ESMTP id m16E8xAa025412; Wed, 6 Feb 2008 16:08:59 +0200 (IST) Received: from msilexch01.marvell.com ([10.4.5.104]) by msil-owa01.marvell.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 6 Feb 2008 16:08:59 +0200 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-ASG-Orig-Subj: RE: NFSD on XFS with RT subvolume Subject: RE: NFSD on XFS with RT subvolume Date: Wed, 6 Feb 2008 16:08:58 +0200 Message-ID: In-Reply-To: <1202076343.9463.465.camel@edge.scott.net.au> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: NFSD on XFS with RT subvolume Thread-Index: AchmsO+2WwyWxUtfR8uYigYxTwZXogCF43Gw References: <1202076343.9463.465.camel@edge.scott.net.au> From: "Rabeeh Khoury" To: Cc: , , "Lennert Buijtenhek" X-OriginalArrivalTime: 06 Feb 2008 14:08:59.0784 (UTC) FILETIME=[D1CFF080:01C868C9] X-Barracuda-Connect: galiil.marvell.com[199.203.130.254] 
X-Barracuda-Start-Time: 1202306949 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.47 X-Barracuda-Spam-Status: No, SCORE=-0.47 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=BSF_RULE7568M, BSF_RULE_7582B X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41517 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_RULE7568M BODY: Custom Rule 7568M 1.05 BSF_RULE_7582B BODY: Custom Rule 7582B X-Virus-Scanned: ClamAV 0.91.2/5711/Wed Feb 6 03:22:58 2008 on oss.sgi.com X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id m16E8pJE025116 X-archive-position: 14360 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rabeeh@marvell.com Precedence: bulk X-list: xfs > > > > Exporting an XFS volume with kernel NFSD when real-time subvolume is > > enabled hangs the kernel. > > > > I'm using vanilla LK 2.6.22.7; first I create the XFS volume with two > > partitions of 20GB each with extent size of 1MB; then I create a > > subdirectory in the volume and mark it (using xfs_io util) as it belongs > > to the rt subvolume with inheritance flag. > > > > After mounting that volume through NFSv3 / UDP; and trying a 'dd > > if=/dev/zero of=/mnt/rt/test bs=1M count=1000' the machine running NFSD > > hangs infinitely. > > Did you manage to get a stack trace, OOC? No reason why it shouldn't > work AFAIK. I didn't mention that I'm using an ARM EABI machine for this, but the same scenario also happened on Ubuntu Gutsy 7.10. The serial console stops responding, but with the SysRq showPc function still working I've got some stack traces (look for #stack-trace below). I'm running Fedora-8 on the ARM machine using the xfsprogs-2.9.4-4.f8 RPM.
The output of formatting /dev/sda5 and /dev/sda6 as the rt-subvolume is the following, but this time /dev/sda5 is 2GByte and /dev/sda6 is 20GByte (look for #mkfs.xfs). Another note is that sometimes I'm getting an error message that XFS is trying to access LBA beyond the volume. Maybe you can suggest few tests that I can perform to figure out what's the root cause? --------------- #mkfs.xfs ---------------------- meta-data=/dev/sda5 isize=256 agcount=8, agsize=61246 blks = sectsz=512 attr=0 data = bsize=4096 blocks=489968, imaxpct=25 = sunit=0 swidth=0 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=2560, version=1 = sectsz=512 sunit=0 blks, lazy-count=0 realtime =/dev/sda6 extsz=1048576 blocks=4885760, rtextents=19085 -------------- #stack-trace --------------------- -bash-3.2# SysRq : Show Regs Pid: 1208, comm: nfsd CPU: 0 Not tainted (2.6.22.7 #7) PC is at xfs_iext_get_ext+0x60/0x88 LR is at 0x1ff02 pc : [] lr : [<0001ff02>] psr: 80000013 sp : c5cb3a68 ip : 000000a8 fp : c5cb3a7c r10: c5cb3c60 r9 : c5cb3d0c r8 : c5984060 r7 : 0001ffab r6 : 0001ffac r5 : c5cb3c60 r4 : 000fffff r3 : c5cb3a6c r2 : 0001ffac r1 : 000000aa r0 : c49d1800 Flags: Nzcv IRQs on FIQs on Mode SVC_32 Segment user Control: a005317f Table: 07b38000 DAC: 00000015 [] (show_regs+0x0/0x50) from [] (sysrq_handle_showregs+0x20/0x28) r4:c03e5654 [] (sysrq_handle_showregs+0x0/0x28) from [] (__handle_sysrq+0xa4/0x148) [] (__handle_sysrq+0x0/0x148) from [] (handle_sysrq+0x34/0x40) [] (handle_sysrq+0x0/0x40) from [] (receive_chars+0x17c/0x2c4) [] (receive_chars+0x0/0x2c4) from [] (serial8250_interrupt+0x74/0x13c) [] (serial8250_interrupt+0x0/0x13c) from [] (handle_IRQ_event+0x44/0x84) [] (handle_IRQ_event+0x0/0x84) from [] (handle_level_irq+0xe0/0xfc) r7:0001ffab r6:00000000 r5:00000003 r4:c03d8998 [] (handle_level_irq+0x0/0xfc) from [] (asm_do_IRQ+0x48/0x64) r5:c03d8998 r4:00000003 [] (asm_do_IRQ+0x0/0x64) from [] (__irq_svc+0x30/0x100) Exception stack(0xc5cb3a20 to 
0xc5cb3a68) 3a20: c49d1800 000000aa 0001ffac c5cb3a6c 000fffff c5cb3c60 0001ffac 0001ffab 3a40: c5984060 c5cb3d0c c5cb3c60 c5cb3a7c 000000a8 c5cb3a68 0001ff02 c02349b4 3a60: 80000013 ffffffff r6:00000002 r5:f1020000 r4:ffffffff [] (xfs_iext_get_ext+0x0/0x88) from [] (xfs_bmap_add_extent_hole_delay+0x54/0x608) [] (xfs_bmap_add_extent_hole_delay+0x0/0x608) from [] (xfs_bmap_add_extent+0x1bc/0x4f8) [] (xfs_bmap_add_extent+0x0/0x4f8) from [] (xfs_bunmapi+0x7fc/0xf38) [] (xfs_bunmapi+0x0/0xf38) from [] (xfs_itruncate_finish+0x1e4/0x368) [] (xfs_itruncate_finish+0x0/0x368) from [] (xfs_setattr+0x8b0/0xde4) [] (xfs_setattr+0x0/0xde4) from [] (xfs_vn_setattr+0x16c/0x18c) [] (xfs_vn_setattr+0x0/0x18c) from [] (notify_change+0x124/0x244) r7:c5b26758 r6:00000068 r5:c5985078 r4:c7d9b8f0 [] (notify_change+0x0/0x244) from [] (nfsd_setattr+0x368/0x50c) [] (nfsd_setattr+0x0/0x50c) from [] (nfsd3_proc_setattr+0xa0/0xc4) [] (nfsd3_proc_setattr+0x0/0xc4) from [] (nfsd_dispatch+0xd8/0x1e0) r7:c2b3a000 r6:c03db9a4 r5:00000018 r4:c62ad000 [] (nfsd_dispatch+0x0/0x1e0) from [] (svc_process+0x448/0x7e8) r8:c03db944 r7:00000014 r6:c62ad000 r5:c2b3a000 r4:c03db9a4 [] (svc_process+0x0/0x7e8) from [] (nfsd+0x17c/0x2d4) [] (nfsd+0x0/0x2d4) from [] (do_exit+0x0/0x7c4) SysRq : Show Regs Pid: 1208, comm: nfsd CPU: 0 Not tainted (2.6.22.7 #7) PC is at xfs_trans_dup+0xd0/0xf4 LR is at 0xc60e16b8 pc : [] lr : [] psr: 60000013 sp : c5cb3ca8 ip : 00000002 fp : c5cb3cc4 r10: c5cb3dd4 r9 : c5cb3d24 r8 : 00000000 r7 : c21cf098 r6 : c21cf310 r5 : 00000000 r4 : 00000000 r3 : 60000093 r2 : c6017000 r1 : 60000013 r0 : 00000004 Flags: nZCv IRQs on FIQs on Mode SVC_32 Segment user Control: a005317f Table: 07b38000 DAC: 00000015 [] (show_regs+0x0/0x50) from [] (sysrq_handle_showregs+0x20/0x28) r4:c03e5654 [] (sysrq_handle_showregs+0x0/0x28) from [] (__handle_sysrq+0xa4/0x148) [] (__handle_sysrq+0x0/0x148) from [] (handle_sysrq+0x34/0x40) [] (handle_sysrq+0x0/0x40) from [] (receive_chars+0x17c/0x2c4) [] 
(receive_chars+0x0/0x2c4) from [] (serial8250_interrupt+0x74/0x13c) [] (serial8250_interrupt+0x0/0x13c) from [] (handle_IRQ_event+0x44/0x84) [] (handle_IRQ_event+0x0/0x84) from [] (handle_level_irq+0xe0/0xfc) r7:c21cf098 r6:00000000 r5:00000003 r4:c03d8998 [] (handle_level_irq+0x0/0xfc) from [] (asm_do_IRQ+0x48/0x64) r5:c03d8998 r4:00000003 [] (asm_do_IRQ+0x0/0x64) from [] (__irq_svc+0x30/0x100) Exception stack(0xc5cb3c60 to 0xc5cb3ca8) 3c60: 00000004 60000013 c6017000 60000093 00000000 00000000 c21cf310 c21cf098 3c80: 00000000 c5cb3d24 c5cb3dd4 c5cb3cc4 00000002 c5cb3ca8 c60e16b8 c024b788 3ca0: 60000013 ffffffff r6:00000002 r5:f1020000 r4:ffffffff [] (xfs_trans_dup+0x0/0xf4) from [] (xfs_itruncate_finish+0x28c/0x368) r7:00000000 r6:00000000 r5:c5984060 r4:c21cf310 [] (xfs_itruncate_finish+0x0/0x368) from [] (xfs_setattr+0x8b0/0xde4) [] (xfs_setattr+0x0/0xde4) from [] (xfs_vn_setattr+0x16c/0x18c) [] (xfs_vn_setattr+0x0/0x18c) from [] (notify_change+0x124/0x244) r7:c5b26758 r6:00000068 r5:c5985078 r4:c7d9b8f0 [] (notify_change+0x0/0x244) from [] (nfsd_setattr+0x368/0x50c) [] (nfsd_setattr+0x0/0x50c) from [] (nfsd3_proc_setattr+0xa0/0xc4) [] (nfsd3_proc_setattr+0x0/0xc4) from [] (nfsd_dispatch+0xd8/0x1e0) r7:c2b3a000 r6:c03db9a4 r5:00000018 r4:c62ad000 [] (nfsd_dispatch+0x0/0x1e0) from [] (svc_process+0x448/0x7e8) r8:c03db944 r7:00000014 r6:c62ad000 r5:c2b3a000 r4:c03db9a4 [] (svc_process+0x0/0x7e8) from [] (nfsd+0x17c/0x2d4) [] (nfsd+0x0/0x2d4) from [] (do_exit+0x0/0x7c4) SysRq : Show Regs Pid: 1208, comm: nfsd CPU: 0 Not tainted (2.6.22.7 #7) PC is at xfs_bunmapi+0xbfc/0xf38 LR is at 0x2 pc : [] lr : [<00000002>] psr: 60000013 sp : c5cb3bc8 ip : c5cb3bc8 fp : c5cb3cc4 r10: 00000000 r9 : c59840b0 r8 : 00000000 r7 : 00000bee r6 : 00000000 r5 : 00000000 r4 : 00000002 r3 : 00000000 r2 : 00000000 r1 : 00053566 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 Segment user Control: a005317f Table: 07b38000 DAC: 00000015 [] (show_regs+0x0/0x50) from [] 
(sysrq_handle_showregs+0x20/0x28) r4:c03e5654 [] (sysrq_handle_showregs+0x0/0x28) from [] (__handle_sysrq+0xa4/0x148) [] (__handle_sysrq+0x0/0x148) from [] (handle_sysrq+0x34/0x40) [] (handle_sysrq+0x0/0x40) from [] (receive_chars+0x17c/0x2c4) [] (receive_chars+0x0/0x2c4) from [] (serial8250_interrupt+0x74/0x13c) [] (serial8250_interrupt+0x0/0x13c) from [] (handle_IRQ_event+0x44/0x84) [] (handle_IRQ_event+0x0/0x84) from [] (handle_level_irq+0xe0/0xfc) r7:00000bee r6:00000000 r5:00000003 r4:c03d8998 [] (handle_level_irq+0x0/0xfc) from [] (asm_do_IRQ+0x48/0x64) r5:c03d8998 r4:00000003 [] (asm_do_IRQ+0x0/0x64) from [] (__irq_svc+0x30/0x100) Exception stack(0xc5cb3b80 to 0xc5cb3bc8) 3b80: 00000000 00053566 00000000 00000000 00000002 00000000 00000000 00000bee 3ba0: 00000000 c59840b0 00000000 c5cb3cc4 c5cb3bc8 c5cb3bc8 00000002 c0215ec4 3bc0: 60000013 ffffffff r6:00000002 r5:f1020000 r4:ffffffff [] (xfs_bunmapi+0x0/0xf38) from [] (xfs_itruncate_finish+0x1e4/0x368) [] (xfs_itruncate_finish+0x0/0x368) from [] (xfs_setattr+0x8b0/0xde4) [] (xfs_setattr+0x0/0xde4) from [] (xfs_vn_setattr+0x16c/0x18c) [] (xfs_vn_setattr+0x0/0x18c) from [] (notify_change+0x124/0x244) r7:c5b26758 r6:00000068 r5:c5985078 r4:c7d9b8f0 [] (notify_change+0x0/0x244) from [] (nfsd_setattr+0x368/0x50c) [] (nfsd_setattr+0x0/0x50c) from [] (nfsd3_proc_setattr+0xa0/0xc4) [] (nfsd3_proc_setattr+0x0/0xc4) from [] (nfsd_dispatch+0xd8/0x1e0) r7:c2b3a000 r6:c03db9a4 r5:00000018 r4:c62ad000 [] (nfsd_dispatch+0x0/0x1e0) from [] (svc_process+0x448/0x7e8) r8:c03db944 r7:00000014 r6:c62ad000 r5:c2b3a000 r4:c03db9a4 [] (svc_process+0x0/0x7e8) from [] (nfsd+0x17c/0x2d4) [] (nfsd+0x0/0x2d4) from [] (do_exit+0x0/0x7c4) SysRq : Show Regs Pid: 1208, comm: nfsd CPU: 0 Not tainted (2.6.22.7 #7) PC is at xfs_bmap_add_extent+0x3c4/0x4f8 LR is at xfs_iext_insert+0x34/0x50 pc : [] lr : [] psr: 60000013 sp : c5cb3b40 ip : 000fffff fp : c5cb3bc4 r10: c5cb3c88 r9 : c5cb3d0c r8 : c5984060 r7 : 00064f35 r6 : 00000000 r5 : 00000000 
r4 : 00000000 r3 : 00000001 r2 : c5cb3b90 r1 : c5984165 r0 : 00000000 Flags: nZCv IRQs on FIQs on Mode SVC_32 Segment user Control: a005317f Table: 07b38000 DAC: 00000015 [] (show_regs+0x0/0x50) from [] (sysrq_handle_showregs+0x20/0x28) r4:c03e5654 [] (sysrq_handle_showregs+0x0/0x28) from [] (__handle_sysrq+0xa4/0x148) [] (__handle_sysrq+0x0/0x148) from [] (handle_sysrq+0x34/0x40) [] (handle_sysrq+0x0/0x40) from [] (receive_chars+0x17c/0x2c4) [] (receive_chars+0x0/0x2c4) from [] (serial8250_interrupt+0x74/0x13c) [] (serial8250_interrupt+0x0/0x13c) from [] (handle_IRQ_event+0x44/0x84) [] (handle_IRQ_event+0x0/0x84) from [] (handle_level_irq+0xe0/0xfc) r7:00064f35 r6:00000000 r5:00000003 r4:c03d8998 [] (handle_level_irq+0x0/0xfc) from [] (asm_do_IRQ+0x48/0x64) r5:c03d8998 r4:00000003 [] (asm_do_IRQ+0x0/0x64) from [] (__irq_svc+0x30/0x100) Exception stack(0xc5cb3af8 to 0xc5cb3b40) 3ae0: 00000000 c5984165 3b00: c5cb3b90 00000001 00000000 00000000 00000000 00064f35 c5984060 c5cb3d0c 3b20: c5cb3c88 c5cb3bc4 000fffff c5cb3b40 c0235df4 c0215194 60000013 ffffffff r6:00000002 r5:f1020000 r4:ffffffff [] (xfs_bmap_add_extent+0x0/0x4f8) from [] (xfs_bunmapi+0x7fc/0xf38) [] (xfs_bunmapi+0x0/0xf38) from [] (xfs_itruncate_finish+0x1e4/0x368) [] (xfs_itruncate_finish+0x0/0x368) from [] (xfs_setattr+0x8b0/0xde4) [] (xfs_setattr+0x0/0xde4) from [] (xfs_vn_setattr+0x16c/0x18c) [] (xfs_vn_setattr+0x0/0x18c) from [] (notify_change+0x124/0x244) r7:c5b26758 r6:00000068 r5:c5985078 r4:c7d9b8f0 [] (notify_change+0x0/0x244) from [] (nfsd_setattr+0x368/0x50c) [] (nfsd_setattr+0x0/0x50c) from [] (nfsd3_proc_setattr+0xa0/0xc4) [] (nfsd3_proc_setattr+0x0/0xc4) from [] (nfsd_dispatch+0xd8/0x1e0) r7:c2b3a000 r6:c03db9a4 r5:00000018 r4:c62ad000 [] (nfsd_dispatch+0x0/0x1e0) from [] (svc_process+0x448/0x7e8) r8:c03db944 r7:00000014 r6:c62ad000 r5:c2b3a000 r4:c03db9a4 [] (svc_process+0x0/0x7e8) from [] (nfsd+0x17c/0x2d4) [] (nfsd+0x0/0x2d4) from [] (do_exit+0x0/0x7c4) SysRq : Show Regs Pid: 1208, 
comm: nfsd CPU: 0 Not tainted (2.6.22.7 #7) PC is at xfs_trans_unreserve_and_mod_sb+0x8/0x2a8 LR is at _xfs_trans_commit+0x6c/0x354 pc : [] lr : [] psr: 20000013 sp : c5cb3bb0 ip : c5cb3bc0 fp : c5cb3cc4 r10: c6017000 r9 : c5cb3d24 r8 : 00000000 r7 : 00000000 r6 : 00000010 r5 : c21cf098 r4 : 00000000 r3 : 00000000 r2 : 00000005 r1 : 00000004 r0 : c21cf098 Flags: nzCv IRQs on FIQs on Mode SVC_32 Segment user Control: a005317f Table: 07b38000 DAC: 00000015 [] (show_regs+0x0/0x50) from [] (sysrq_handle_showregs+0x20/0x28) r4:c03e5654 [] (sysrq_handle_showregs+0x0/0x28) from [] (__handle_sysrq+0xa4/0x148) [] (__handle_sysrq+0x0/0x148) from [] (handle_sysrq+0x34/0x40) [] (handle_sysrq+0x0/0x40) from [] (receive_chars+0x17c/0x2c4) [] (receive_chars+0x0/0x2c4) from [] (serial8250_interrupt+0x74/0x13c) [] (serial8250_interrupt+0x0/0x13c) from [] (handle_IRQ_event+0x44/0x84) [] (handle_IRQ_event+0x0/0x84) from [] (handle_level_irq+0xe0/0xfc) r7:00000000 r6:00000000 r5:00000003 r4:c03d8998 [] (handle_level_irq+0x0/0xfc) from [] (asm_do_IRQ+0x48/0x64) r5:c03d8998 r4:00000003 [] (asm_do_IRQ+0x0/0x64) from [] (__irq_svc+0x30/0x100) Exception stack(0xc5cb3b68 to 0xc5cb3bb0) 3b60: c21cf098 00000004 00000005 00000000 00000000 c21cf098 3b80: 00000010 00000000 00000000 c5cb3d24 c6017000 c5cb3cc4 c5cb3bc0 c5cb3bb0 3ba0: c024c5dc c024b1cc 20000013 ffffffff r6:00000002 r5:f1020000 r4:ffffffff [] (_xfs_trans_commit+0x0/0x354) from [] (xfs_itruncate_finish+0x2a4/0x368) [] (xfs_itruncate_finish+0x0/0x368) from [] (xfs_setattr+0x8b0/0xde4) [] (xfs_setattr+0x0/0xde4) from [] (xfs_vn_setattr+0x16c/0x18c) [] (xfs_vn_setattr+0x0/0x18c) from [] (notify_change+0x124/0x244) r7:c5b26758 r6:00000068 r5:c5985078 r4:c7d9b8f0 [] (notify_change+0x0/0x244) from [] (nfsd_setattr+0x368/0x50c) [] (nfsd_setattr+0x0/0x50c) from [] (nfsd3_proc_setattr+0xa0/0xc4) [] (nfsd3_proc_setattr+0x0/0xc4) from [] (nfsd_dispatch+0xd8/0x1e0) r7:c2b3a000 r6:c03db9a4 r5:00000018 r4:c62ad000 [] (nfsd_dispatch+0x0/0x1e0) from 
[] (svc_process+0x448/0x7e8) r8:c03db944 r7:00000014 r6:c62ad000 r5:c2b3a000 r4:c03db9a4 [] (svc_process+0x0/0x7e8) from [] (nfsd+0x17c/0x2d4) [] (nfsd+0x0/0x2d4) from [] (do_exit+0x0/0x7c4) SysRq : Show Regs Pid: 1208, comm: nfsd CPU: 0 Not tainted (2.6.22.7 #7) PC is at kmem_zone_zalloc+0x14/0x40 LR is at kmem_zone_alloc+0x6c/0xc0 pc : [] lr : [] psr: a0000013 sp : c5cb3c90 ip : c09b74a0 fp : c5cb3ca4 r10: c5cb3dd4 r9 : c5cb3d24 r8 : 00000000 r7 : 00000000 r6 : c21cf098 r5 : c5984060 r4 : c7c3bbc0 r3 : 00000004 r2 : 00000001 r1 : c09b74b0 r0 : c21cf310 Flags: NzCv IRQs on FIQs on Mode SVC_32 Segment user Control: a005317f Table: 07b38000 DAC: 00000015 [] (show_regs+0x0/0x50) from [] (sysrq_handle_showregs+0x20/0x28) r4:c03e5654 [] (sysrq_handle_showregs+0x0/0x28) from [] (__handle_sysrq+0xa4/0x148) [] (__handle_sysrq+0x0/0x148) from [] (handle_sysrq+0x34/0x40) [] (handle_sysrq+0x0/0x40) from [] (receive_chars+0x17c/0x2c4) [] (receive_chars+0x0/0x2c4) from [] (serial8250_interrupt+0x74/0x13c) [] (serial8250_interrupt+0x0/0x13c) from [] (handle_IRQ_event+0x44/0x84) [] (handle_IRQ_event+0x0/0x84) from [] (handle_level_irq+0xe0/0xfc) r7:00000000 r6:00000000 r5:00000003 r4:c03d8998 [] (handle_level_irq+0x0/0xfc) from [] (asm_do_IRQ+0x48/0x64) r5:c03d8998 r4:00000003 [] (asm_do_IRQ+0x0/0x64) from [] (__irq_svc+0x30/0x100) Exception stack(0xc5cb3c48 to 0xc5cb3c90) 3c40: c21cf310 c09b74b0 00000001 00000004 c7c3bbc0 c5984060 3c60: c21cf098 00000000 00000000 c5cb3d24 c5cb3dd4 c5cb3ca4 c09b74a0 c5cb3c90 3c80: c025809c c0258104 a0000013 ffffffff r6:00000002 r5:f1020000 r4:ffffffff [] (kmem_zone_zalloc+0x0/0x40) from [] (xfs_trans_dup+0x20/0xf4) r5:c5984060 r4:c21cf098 [] (xfs_trans_dup+0x0/0xf4) from [] (xfs_itruncate_finish+0x28c/0x368) r7:00000000 r6:00000000 r5:c5984060 r4:c21cf098 [] (xfs_itruncate_finish+0x0/0x368) from [] (xfs_setattr+0x8b0/0xde4) [] (xfs_setattr+0x0/0xde4) from [] (xfs_vn_setattr+0x16c/0x18c) [] (xfs_vn_setattr+0x0/0x18c) from [] 
(notify_change+0x124/0x244) r7:c5b26758 r6:00000068 r5:c5985078 r4:c7d9b8f0 [] (notify_change+0x0/0x244) from [] (nfsd_setattr+0x368/0x50c) [] (nfsd_setattr+0x0/0x50c) from [] (nfsd3_proc_setattr+0xa0/0xc4) [] (nfsd3_proc_setattr+0x0/0xc4) from [] (nfsd_dispatch+0xd8/0x1e0) r7:c2b3a000 r6:c03db9a4 r5:00000018 r4:c62ad000 [] (nfsd_dispatch+0x0/0x1e0) from [] (svc_process+0x448/0x7e8) r8:c03db944 r7:00000014 r6:c62ad000 r5:c2b3a000 r4:c03db9a4 [] (svc_process+0x0/0x7e8) from [] (nfsd+0x17c/0x2d4) [] (nfsd+0x0/0x2d4) from [] (do_exit+0x0/0x7c4) SysRq : Show Regs Pid: 1208, comm: nfsd CPU: 0 Not tainted (2.6.22.7 #7) PC is at xfs_bunmapi+0x330/0xf38 LR is at __init_begin+0x3fff8000/0x30 pc : [] lr : [<00000000>] psr: 80000013 sp : c5cb3bc8 ip : c5cb3c50 fp : c5cb3cc4 r10: 00000000 r9 : 00000000 r8 : 00000000 r7 : 00000bff r6 : 000fffff r5 : 00000000 r4 : ffffffff r3 : 000fffff r2 : fffe0c00 r1 : 00000000 r0 : 00000bef Flags: Nzcv IRQs on FIQs on Mode SVC_32 Segment user Control: a005317f Table: 07b38000 DAC: 00000015 [] (show_regs+0x0/0x50) from [] (sysrq_handle_showregs+0x20/0x28) r4:c03e5654 [] (sysrq_handle_showregs+0x0/0x28) from [] (__handle_sysrq+0xa4/0x148) [] (__handle_sysrq+0x0/0x148) from [] (handle_sysrq+0x34/0x40) [] (handle_sysrq+0x0/0x40) from [] (receive_chars+0x17c/0x2c4) [] (receive_chars+0x0/0x2c4) from [] (serial8250_interrupt+0x74/0x13c) [] (serial8250_interrupt+0x0/0x13c) from [] (handle_IRQ_event+0x44/0x84) [] (handle_IRQ_event+0x0/0x84) from [] (handle_level_irq+0xe0/0xfc) r7:00000bff r6:00000000 r5:00000003 r4:c03d8998 [] (handle_level_irq+0x0/0xfc) from [] (asm_do_IRQ+0x48/0x64) r5:c03d8998 r4:00000003 [] (asm_do_IRQ+0x0/0x64) from [] (__irq_svc+0x30/0x100) Exception stack(0xc5cb3b80 to 0xc5cb3bc8) 3b80: 00000bef 00000000 fffe0c00 000fffff ffffffff 00000000 000fffff 00000bff 3ba0: 00000000 00000000 00000000 c5cb3cc4 c5cb3c50 c5cb3bc8 00000000 c02155f8 3bc0: 80000013 ffffffff r6:00000002 r5:f1020000 r4:ffffffff [] (xfs_bunmapi+0x0/0xf38) from [] 
(xfs_itruncate_finish+0x1e4/0x368) [] (xfs_itruncate_finish+0x0/0x368) from [] (xfs_setattr+0x8b0/0xde4) [] (xfs_setattr+0x0/0xde4) from [] (xfs_vn_setattr+0x16c/0x18c) [] (xfs_vn_setattr+0x0/0x18c) from [] (notify_change+0x124/0x244) r7:c5b26758 r6:00000068 r5:c5985078 r4:c7d9b8f0 [] (notify_change+0x0/0x244) from [] (nfsd_setattr+0x368/0x50c) [] (nfsd_setattr+0x0/0x50c) from [] (nfsd3_proc_setattr+0xa0/0xc4) [] (nfsd3_proc_setattr+0x0/0xc4) from [] (nfsd_dispatch+0xd8/0x1e0) r7:c2b3a000 r6:c03db9a4 r5:00000018 r4:c62ad000 [] (nfsd_dispatch+0x0/0x1e0) from [] (svc_process+0x448/0x7e8) r8:c03db944 r7:00000014 r6:c62ad000 r5:c2b3a000 r4:c03db9a4 [] (svc_process+0x0/0x7e8) from [] (nfsd+0x17c/0x2d4) [] (nfsd+0x0/0x2d4) from [] (do_exit+0x0/0x7c4) SysRq : Resetting Reseting !! From owner-xfs@oss.sgi.com Wed Feb 6 19:35:50 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Feb 2008 19:35:53 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m173ZhEd015864 for ; Wed, 6 Feb 2008 19:35:48 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA06555; Thu, 7 Feb 2008 14:36:00 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 1161) id 75DAC58C4C11; Thu, 7 Feb 2008 14:36:00 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 976517 - Fix rounding issue with new mkfs.xfs defaults Message-Id: <20080207033600.75DAC58C4C11@chook.melbourne.sgi.com> Date: Thu, 7 Feb 2008 14:36:00 +1100 (EST) From: bnaujok@sgi.com (Barry Naujok) X-Virus-Scanned: ClamAV 0.91.2/5721/Wed Feb 6 18:18:06 2008 on oss.sgi.com 
X-Virus-Status: Clean X-archive-position: 14361 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@sgi.com Precedence: bulk X-list: xfs Date: Thu Feb 7 14:35:39 AEDT 2008 Workarea: chook.melbourne.sgi.com:/home/bnaujok/isms/repair Inspected by: sandeen@sandeen.net The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:30465a xfsprogs/VERSION - 1.178 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/VERSION.diff?r1=text&tr1=1.178&r2=text&tr2=1.177&f=h xfsprogs/doc/CHANGES - 1.250 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/doc/CHANGES.diff?r1=text&tr1=1.250&r2=text&tr2=1.249&f=h - Version 2.9.6 xfsprogs/mkfs/xfs_mkfs.c - 1.84 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/mkfs/xfs_mkfs.c.diff?r1=text&tr1=1.84&r2=text&tr2=1.83&f=h - Fix rounding issue with new mkfs.xfs defaults and turn on lazy superblock counters by default From owner-xfs@oss.sgi.com Wed Feb 6 20:15:20 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Feb 2008 20:15:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m174FEfU018005 for ; Wed, 6 Feb 2008 20:15:19 -0800 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA07424; Thu, 7 Feb 2008 15:15:32 +1100 Message-ID: <47AA86E5.5030804@sgi.com> Date: Thu, 07 Feb 2008 15:19:49 +1100 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.9 (X11/20071031) MIME-Version: 1.0 To: xfs-dev , xfs-oss Subject: [PATCH V2] make inode 
reclaim synchronise with xfs_iflush_done() Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5721/Wed Feb 6 18:18:06 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14362 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode. If the inode flush lock is not already held and there is an outstanding xfs_iflush_done() then we might free the inode prematurely. By acquiring and releasing the flush lock we will synchronise with xfs_iflush_done(). Lachlan --- fs/xfs/xfs_vnodeops.c_1.727 2008-01-10 16:00:48.000000000 +1100 +++ fs/xfs/xfs_vnodeops.c 2008-02-07 15:15:26.000000000 +1100 @@ -3721,12 +3721,12 @@ xfs_finish_reclaim( * We get the flush lock regardless, though, just to make sure * we don't free it while it is being flushed. */ - if (!XFS_FORCED_SHUTDOWN(ip->i_mount)) { - if (!locked) { - xfs_ilock(ip, XFS_ILOCK_EXCL); - xfs_iflock(ip); - } + if (!locked) { + xfs_ilock(ip, XFS_ILOCK_EXCL); + xfs_iflock(ip); + } + if (!XFS_FORCED_SHUTDOWN(ip->i_mount)) { if (ip->i_update_core || ((ip->i_itemp != NULL) && (ip->i_itemp->ili_format.ilf_fields != 0))) { @@ -3746,17 +3746,11 @@ xfs_finish_reclaim( ASSERT(ip->i_update_core == 0); ASSERT(ip->i_itemp == NULL || ip->i_itemp->ili_format.ilf_fields == 0); - xfs_iunlock(ip, XFS_ILOCK_EXCL); - } else if (locked) { - /* - * We are not interested in doing an iflush if we're - * in the process of shutting down the filesystem forcibly. - * So, just reclaim the inode. 
- */ - xfs_ifunlock(ip); - xfs_iunlock(ip, XFS_ILOCK_EXCL); } + xfs_ifunlock(ip); + xfs_iunlock(ip, XFS_ILOCK_EXCL); + reclaim: xfs_ireclaim(ip); return 0; From owner-xfs@oss.sgi.com Wed Feb 6 22:39:37 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Feb 2008 22:39:45 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m176dYY7029619 for ; Wed, 6 Feb 2008 22:39:37 -0800 X-ASG-Debug-ID: 1202366372-6908009b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BC3E111DD3FA; Wed, 6 Feb 2008 22:39:32 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id 0LtSJBmits5rnFrK; Wed, 06 Feb 2008 22:39:32 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JN0Px-0008Cg-13; Thu, 07 Feb 2008 06:39:29 +0000 Date: Thu, 7 Feb 2008 01:39:29 -0500 From: Christoph Hellwig To: David Chinner Cc: Eric Sandeen , xfs-oss X-ASG-Orig-Subj: Re: unpushed 4-month-old mods? Subject: Re: unpushed 4-month-old mods? 
Message-ID: <20080207063929.GA31513@infradead.org> References: <47A694F3.9010307@sandeen.net> <20080204044611.GF155407@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080204044611.GF155407@sgi.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202366394 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41582 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5721/Wed Feb 6 18:18:06 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14363 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Mon, Feb 04, 2008 at 03:46:11PM +1100, David Chinner wrote: > On Sun, Feb 03, 2008 at 10:30:43PM -0600, Eric Sandeen wrote: > > At least these three mods which I did back in September to get Fedora 8 > > / 2.6.23 into shape on 4k stacks, and a bugfix, are still not pushed to > > kernel.org, and are missing in 2.6.24... > > > > Is there any reason for the holdup? Makes me wonder what else isn't > > pushed... > > The holdup is that we drew a line in the sand for the 2.6.24 before > 2.6.23 was released. We did this because of the massive amount of invasive > change we already had queued up for 2.6.24. All those mods that got > held up will be pushed into 2.6.25 release. For which it's getting about time. 
The merge window started at about the same time as LCA, so by the traditional two-week rule it'll close by the end of the week. From owner-xfs@oss.sgi.com Wed Feb 6 23:20:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Feb 2008 23:20:06 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m177JxkS031584 for ; Wed, 6 Feb 2008 23:20:02 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA11727; Thu, 7 Feb 2008 18:20:18 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m177KHLF53983024; Thu, 7 Feb 2008 18:20:18 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m177KG7v54005035; Thu, 7 Feb 2008 18:20:16 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Thu, 7 Feb 2008 18:20:16 +1100 From: David Chinner To: Lachlan McIlroy Cc: xfs-dev , xfs-oss Subject: Re: [PATCH V2] make inode reclaim synchronise with xfs_iflush_done() Message-ID: <20080207072016.GR155407@sgi.com> References: <47AA86E5.5030804@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47AA86E5.5030804@sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5722/Wed Feb 6 22:43:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14364 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Thu, Feb 07, 2008 at 03:19:49PM +1100, Lachlan 
McIlroy wrote: > On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode. > If the inode flush lock is not already held and there is an outstanding > xfs_iflush_done() then we might free the inode prematurely. By acquiring > and releasing the flush lock we will synchronise with xfs_iflush_done(). Looks fine. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 7 00:27:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 00:27:52 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m178RkF5008353 for ; Thu, 7 Feb 2008 00:27:47 -0800 X-ASG-Debug-ID: 1202372854-74c003be0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id D629811E08CB for ; Thu, 7 Feb 2008 00:27:34 -0800 (PST) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id iPrfVZFgn3nve9d3 for ; Thu, 07 Feb 2008 00:27:34 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m178RPF3014229 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Thu, 7 Feb 2008 09:27:25 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m178ROsa014227 for xfs@oss.sgi.com; Thu, 7 Feb 2008 09:27:24 +0100 Date: Thu, 7 Feb 2008 09:27:24 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] kill t_sema member of struct xfs_trans Subject: [PATCH] kill t_sema member of struct xfs_trans Message-ID: <20080207082724.GA14119@lst.de> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1202372869 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41588 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5722/Wed Feb 6 22:43:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14365 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs It's completely unused, so we might as well kill it. Note that there is another t_sema in struct xlog_ticket, which is used and is actually an sv_t despite the name. That one is left untouched by this patch. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_trans.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_trans.h 2008-02-05 08:48:40.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_trans.h 2008-02-05 08:48:45.000000000 +0100 @@ -341,7 +341,6 @@ typedef struct xfs_trans { unsigned int t_rtx_res; /* # of rt extents resvd */ unsigned int t_rtx_res_used; /* # of resvd rt extents used */ xfs_log_ticket_t t_ticket; /* log mgr ticket */ - sema_t t_sema; /* sema for commit completion */ xfs_lsn_t t_lsn; /* log seq num of start of * transaction. 
*/ xfs_lsn_t t_commit_lsn; /* log seq num of end of From owner-xfs@oss.sgi.com Thu Feb 7 00:36:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 00:36:38 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m178aXbg009026 for ; Thu, 7 Feb 2008 00:36:34 -0800 X-ASG-Debug-ID: 1202373407-473801900000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from pentafluge.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 591C95ACB8D for ; Thu, 7 Feb 2008 00:36:47 -0800 (PST) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by cuda.sgi.com with ESMTP id NS2zC7ahdC2YDfpr for ; Thu, 07 Feb 2008 00:36:47 -0800 (PST) Received: from hch by pentafluge.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JN2Ec-0002u5-AU; Thu, 07 Feb 2008 08:35:54 +0000 Date: Thu, 7 Feb 2008 08:35:54 +0000 From: Christoph Hellwig To: Donald Douwsma Cc: xfs-oss X-ASG-Orig-Subj: Re: [review] Remove the xfs refcache Subject: Re: [review] Remove the xfs refcache Message-ID: <20080207083554.GA11119@infradead.org> References: <4765EC66.5020202@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4765EC66.5020202@sgi.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: pentafluge.infradead.org[213.146.154.40] X-Barracuda-Start-Time: 1202373417 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user 
scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41588 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5722/Wed Feb 6 22:43:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14366 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Mon, Dec 17, 2007 at 02:26:30PM +1100, Donald Douwsma wrote: > Remove the xfs_refcache, it was only needed while we were still building for > 2.4 kernels. Given that we finally agreed that this form of refcache shouldn't come back can you commit the patch? From owner-xfs@oss.sgi.com Thu Feb 7 00:40:01 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 00:40:06 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_64, J_CHICKENPOX_66 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m178e0Ba009477 for ; Thu, 7 Feb 2008 00:40:01 -0800 X-ASG-Debug-ID: 1202373605-78ed000f0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 89BC311E0409 for ; Thu, 7 Feb 2008 00:40:05 -0800 (PST) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id N2TwHGmFk2A53fZY for ; Thu, 07 Feb 2008 00:40:05 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m178WMF3014458 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 7 Feb 2008 09:32:22 +0100 Received: 
(from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m178WMZl014456; Thu, 7 Feb 2008 09:32:22 +0100 Date: Thu, 7 Feb 2008 09:32:22 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: a.gruenbacher@computer.org X-ASG-Orig-Subj: [PATCH, RFC] use generic ACL code Subject: [PATCH, RFC] use generic ACL code Message-ID: <20080207083222.GA14317@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1202373621 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41590 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/5722/Wed Feb 6 22:43:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14367 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs This patch rips out the XFS ACL handling code and uses the generic fs/posix_acl.c code instead. The ondisk format is of course left unchanged. This also introduces the same ACL caching all other Linux filesystems do by adding pointers to the acl and default acl in struct xfs_inode. It'll probably need some benchmarking to find out whether bloating the inode is worth it. It should be possible to use the generic code without this caching by revamping the code a little, although no other filesystem currently does that. 
This patch is only an RFC because it still introduces a regression in XFSQA test 053, but I really want to get it out now to get more comments, or even to have someone else take a look at it, because I'm running a little short of time at the moment. Note that this patch applies on top of the various vnode cleanups I posted to the XFS list a few weeks ago that haven't been applied yet. Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_acl.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_acl.c 2008-02-07 09:15:35.000000000 +0100 @@ -0,0 +1,453 @@ +/* + * Copyright (C) 2007 Christoph Hellwig. + * Released under GPL v2. + */ +#include "xfs.h" +#include "xfs_acl.h" +#include "xfs_attr.h" +#include "xfs_bmap_btree.h" /* required by xfs_inode.h */ +#include "xfs_inode.h" +#include "xfs_vnodeops.h" + +#include + + +#define XFS_ACL_NOT_CACHED ((void *)-1) + +/* + * Convert from extended attribute to in-memory representation. + */ +static struct posix_acl *xfs_acl_from_disk(struct xfs_acl *aclp) +{ + struct posix_acl_entry *acl_e; + struct posix_acl *acl; + struct xfs_acl_entry *ace; + int count, i; + + count = be32_to_cpu(aclp->acl_cnt); + + acl = posix_acl_alloc(count, GFP_KERNEL); + if (!acl) + return ERR_PTR(-ENOMEM); + + for (i = 0; i < count; i++) { + acl_e = &acl->a_entries[i]; + ace = &aclp->acl_entry[i]; + + /* + * XXX(hch): the tag is 32 bits on disk and 16 bits in core. + * Any special handling required?? 
+ */ + acl_e->e_tag = be32_to_cpu(ace->ae_tag); + acl_e->e_perm = be16_to_cpu(ace->ae_perm); + + switch(acl_e->e_tag) { + case ACL_USER: + case ACL_GROUP: + acl_e->e_id = be32_to_cpu(ace->ae_id); + break; + case ACL_USER_OBJ: + case ACL_GROUP_OBJ: + case ACL_MASK: + case ACL_OTHER: + acl_e->e_id = ACL_UNDEFINED_ID; + break; + default: + goto fail; + } + } + return acl; + +fail: + posix_acl_release(acl); + return ERR_PTR(-EINVAL); +} + +/* + * Convert from in-memory to extended attribute representation. + */ +static void xfs_acl_to_disk(struct xfs_acl *aclp, const struct posix_acl *acl) +{ + const struct posix_acl_entry *acl_e; + struct xfs_acl_entry *ace; + int i; + + for (i = 0; i < acl->a_count; i++) { + ace = &aclp->acl_entry[i]; + acl_e = &acl->a_entries[i]; + + ace->ae_tag = cpu_to_be32(acl_e->e_tag); + ace->ae_id = cpu_to_be32(acl_e->e_id); + ace->ae_perm = cpu_to_be16(acl_e->e_perm); + } +} + +struct posix_acl *xfs_get_acl(struct inode *inode, int type) +{ + struct xfs_inode *ip = XFS_I(inode); + struct posix_acl *acl = NULL, **p_acl; + struct xfs_acl *xfs_acl; + int len = sizeof(struct xfs_acl); + char *ea_name; + int error; + + switch (type) { + case ACL_TYPE_ACCESS: + ea_name = SGI_ACL_FILE; + p_acl = &ip->i_acl; + break; + case ACL_TYPE_DEFAULT: + ea_name = SGI_ACL_DEFAULT; + p_acl = &ip->i_default_acl; + break; + default: + return ERR_PTR(-EINVAL); + } + + if (*p_acl != XFS_ACL_NOT_CACHED) + return posix_acl_dup(*p_acl); + + xfs_acl = kzalloc(sizeof(struct xfs_acl), GFP_KERNEL); + if (!xfs_acl) + return ERR_PTR(-ENOMEM); + + error = -xfs_attr_get(ip, ea_name, (char *)xfs_acl, + &len, ATTR_ROOT, sys_cred); + if (!error) { + acl = xfs_acl_from_disk(xfs_acl); + if (!IS_ERR(acl)) + *p_acl = posix_acl_dup(acl); + } else { + *p_acl = NULL; + } + + kfree(xfs_acl); + return acl; +} + +static int xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl) +{ + struct xfs_inode *ip = XFS_I(inode); + struct posix_acl **p_acl; + char *ea_name; + int error; + 
+ if (S_ISLNK(inode->i_mode)) + return -EOPNOTSUPP; + + switch (type) { + case ACL_TYPE_ACCESS: + ea_name = SGI_ACL_FILE; + p_acl = &ip->i_acl; + break; + case ACL_TYPE_DEFAULT: + ea_name = SGI_ACL_DEFAULT; + p_acl = &ip->i_default_acl; + if (!S_ISDIR(inode->i_mode)) + return acl ? -EACCES : 0; + break; + default: + return -EINVAL; + } + + if (acl) { + struct xfs_acl *xfs_acl; + int len; + + xfs_acl = kzalloc(sizeof(struct xfs_acl), GFP_KERNEL); + if (!xfs_acl) + return -ENOMEM; + + xfs_acl_to_disk(xfs_acl, acl); + len = sizeof(struct xfs_acl) - + (sizeof(struct xfs_acl_entry) * + (XFS_ACL_MAX_ENTRIES - acl->a_count)); + + error = -xfs_attr_set(ip, ea_name, (char *)xfs_acl, + len, ATTR_ROOT); + + kfree(xfs_acl); + } else { + error = -xfs_attr_remove(ip, ea_name, ATTR_ROOT); + /* + * If the attribute didn't exist to start with that's fine. + */ + if (error == -ENOATTR) + error = 0; + } + + if (!error) { + if (*p_acl && *p_acl != XFS_ACL_NOT_CACHED) + posix_acl_release(*p_acl); + *p_acl = posix_acl_dup(acl); + } + return error; +} + +static int xfs_check_acl(struct inode *inode, int mask) +{ + struct xfs_inode *ip = XFS_I(inode); + + xfs_itrace_entry(ip); + + if (!XFS_IFORK_Q(ip)) + return -EAGAIN; + + if (ip->i_acl == XFS_ACL_NOT_CACHED) { + struct posix_acl *acl = xfs_get_acl(inode, ACL_TYPE_ACCESS); + if (IS_ERR(acl)) + return PTR_ERR(acl); + posix_acl_release(acl); + } + + if (ip->i_acl) + return posix_acl_permission(inode, ip->i_acl, mask); + return -EAGAIN; +} + +int xfs_vn_permission(struct inode *inode, int mask, struct nameidata *nd) +{ + return generic_permission(inode, mask, xfs_check_acl); +} + +/* + * Extended attribute handlers + */ +static int xfs_xattr_get_acl(struct inode *inode, int type, + void *buffer, size_t size) +{ + struct posix_acl *acl; + int error; + + acl = xfs_get_acl(inode, type); + if (IS_ERR(acl)) + return PTR_ERR(acl); + if (acl == NULL) + return -ENODATA; + error = posix_acl_to_xattr(acl, buffer, size); + posix_acl_release(acl); + + 
return error; +} + +/* + * Helper to propagate i_mode to the xfs_inode. + */ +static int xfs_set_mode(struct inode *inode, mode_t mode) +{ + int error = 0; + + if (mode != inode->i_mode) { + struct bhv_vattr va = { + .va_mask = XFS_AT_MODE, + .va_mode = mode, + }; + + va.va_mask = XFS_AT_MODE; + va.va_mode = mode; + + error = -xfs_setattr(XFS_I(inode), &va, 0, sys_cred); + inode->i_mode = mode; + } + + return error; +} + +static int xfs_xattr_set_acl(struct inode *inode, int type, + const void *value, size_t size) +{ + struct posix_acl *acl; + int error; + + if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) + return -EPERM; + + if (value) { + acl = posix_acl_from_xattr(value, size); + if (IS_ERR(acl)) + return PTR_ERR(acl); + else if (acl) { + error = posix_acl_valid(acl); + if (error) + goto release_and_out; + if (acl->a_count > XFS_ACL_MAX_ENTRIES) { + error = -EINVAL; + goto release_and_out; + } + + if (type == ACL_TYPE_ACCESS) { + mode_t mode = inode->i_mode; + error = posix_acl_equiv_mode(acl, &mode); + if (error < 0) + goto release_and_out; + if (error == 0) { + posix_acl_release(acl); + acl = NULL; + } + error = xfs_set_mode(inode, mode); + if (error) + goto release_and_out; + } + } + } else + acl = NULL; + + error = xfs_set_acl(inode, type, acl); +release_and_out: + posix_acl_release(acl); + return error; +} + +static int xfs_acl_exists(struct inode *inode, char *name) +{ + int len = sizeof(struct xfs_acl); + + return xfs_attr_get(XFS_I(inode), name, NULL, &len, + ATTR_ROOT|ATTR_KERNOVAL, sys_cred); +} + +static int posix_acl_access_get(struct inode *inode, char *name, void *data, + size_t size, int xflags) +{ + return xfs_xattr_get_acl(inode, ACL_TYPE_ACCESS, data, size); +} + +static int posix_acl_access_set(struct inode *inode, char *name, void *data, + size_t size, int xflags) +{ + return xfs_xattr_set_acl(inode, ACL_TYPE_ACCESS, data, size); +} + +static int posix_acl_access_remove(struct inode *inode, char *name, int xflags) +{ + return 
xfs_xattr_set_acl(inode, ACL_TYPE_ACCESS, NULL, 0); +} + +static int posix_acl_access_exists(struct inode *inode) +{ + return xfs_acl_exists(inode, SGI_ACL_FILE); +} + +static int posix_acl_default_get(struct inode *inode, char *name, void *data, + size_t size, int xflags) +{ + return xfs_xattr_get_acl(inode, ACL_TYPE_DEFAULT, data, size); +} + +static int posix_acl_default_set(struct inode *inode, char *name, void *data, + size_t size, int xflags) +{ + if (!S_ISDIR(inode->i_mode)) + return data ? -EACCES : 0; + return xfs_xattr_set_acl(inode, ACL_TYPE_DEFAULT, data, size); +} + +static int posix_acl_default_remove(struct inode *inode, char *name, int xflags) +{ + return xfs_xattr_set_acl(inode, ACL_TYPE_DEFAULT, NULL, 0); +} + +int posix_acl_default_exists(struct inode *inode) +{ + if (!S_ISDIR(inode->i_mode)) + return 0; + return xfs_acl_exists(inode, SGI_ACL_DEFAULT); +} + +struct attrnames posix_acl_access = { + .attr_name = "posix_acl_access", + .attr_namelen = sizeof("posix_acl_access") - 1, + .attr_get = posix_acl_access_get, + .attr_set = posix_acl_access_set, + .attr_remove = posix_acl_access_remove, + .attr_exists = posix_acl_access_exists, +}; + +struct attrnames posix_acl_default = { + .attr_name = "posix_acl_default", + .attr_namelen = sizeof("posix_acl_default") - 1, + .attr_get = posix_acl_default_get, + .attr_set = posix_acl_default_set, + .attr_remove = posix_acl_default_remove, + .attr_exists = posix_acl_default_exists, +}; + +/* + * Unlike the other functions in this file this returns positive errors. 
+ */ +int xfs_inherit_acl(struct inode *inode, struct posix_acl *default_acl) +{ + struct xfs_inode *ip = XFS_I(inode); + struct posix_acl *clone; + mode_t mode; + int error = 0; + + if (S_ISDIR(inode->i_mode)) { + error = xfs_set_acl(inode, ACL_TYPE_DEFAULT, default_acl); + if (error) + return -error; + } + + clone = posix_acl_clone(default_acl, GFP_KERNEL); + if (!clone) + return ENOMEM; + + mode = inode->i_mode; + error = posix_acl_create_masq(clone, &mode); + if (error < 0) + goto out_release_clone; + + error = xfs_set_mode(inode, mode); + if (!error) + error = xfs_set_acl(inode, ACL_TYPE_ACCESS, clone); + xfs_iflags_set(ip, XFS_IMODIFIED); + + out_release_clone: + posix_acl_release(clone); + return -error; +} + +int xfs_acl_chmod(struct inode *inode) +{ + struct posix_acl *acl, *clone; + int error; + + if (S_ISLNK(inode->i_mode)) + return -EOPNOTSUPP; + + acl = xfs_get_acl(inode, ACL_TYPE_ACCESS); + if (IS_ERR(acl) || !acl) + return PTR_ERR(acl); + + clone = posix_acl_clone(acl, GFP_KERNEL); + posix_acl_release(acl); + if (!clone) + return -ENOMEM; + + error = posix_acl_chmod_masq(clone, inode->i_mode); + if (!error) + error = xfs_set_acl(inode, ACL_TYPE_ACCESS, clone); + + posix_acl_release(clone); + return error; +} + +void xfs_inode_init_acls(struct xfs_inode *ip) +{ + ip->i_acl = XFS_ACL_NOT_CACHED; + ip->i_default_acl = XFS_ACL_NOT_CACHED; +} + +static void xfs_clear_acl(struct posix_acl **aclp) +{ + if (*aclp != XFS_ACL_NOT_CACHED) { + posix_acl_release(*aclp); + *aclp = XFS_ACL_NOT_CACHED; + } +} + +void xfs_inode_clear_acls(struct xfs_inode *ip) +{ + xfs_clear_acl(&ip->i_acl); + xfs_clear_acl(&ip->i_default_acl); +} Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_iops.c 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c 2008-02-07 09:17:11.000000000 +0100 @@ -51,6 +51,7 @@ #include #include #include 
+#include #include #include @@ -272,8 +273,7 @@ xfs_vn_mknod( { struct inode *inode; struct xfs_inode *ip = NULL; - xfs_acl_t *default_acl = NULL; - attrexists_t test_default_acl = _ACL_DEFAULT_EXISTS; + struct posix_acl *default_acl = NULL; int error; /* @@ -283,18 +283,14 @@ xfs_vn_mknod( if (unlikely(!sysv_valid_dev(rdev) || MAJOR(rdev) & ~0x1ff)) return -EINVAL; - if (test_default_acl && test_default_acl(dir)) { - if (!_ACL_ALLOC(default_acl)) { - return -ENOMEM; - } - if (!_ACL_GET_DEFAULT(dir, default_acl)) { - _ACL_FREE(default_acl); - default_acl = NULL; - } - } + if (IS_POSIXACL(dir)) { + default_acl = xfs_get_acl(dir, ACL_TYPE_DEFAULT); + if (IS_ERR(default_acl)) + return PTR_ERR(default_acl); - if (IS_POSIXACL(dir) && !default_acl) - mode &= ~current->fs->umask; + if (!default_acl) + mode &= ~current->fs->umask; + } switch (mode & S_IFMT) { case S_IFCHR: @@ -323,11 +319,11 @@ xfs_vn_mknod( goto out_cleanup_inode; if (default_acl) { - error = _ACL_INHERIT(inode, mode, default_acl); + error = xfs_inherit_acl(inode, default_acl); if (unlikely(error)) goto out_cleanup_inode; xfs_iflags_set(ip, XFS_IMODIFIED); - _ACL_FREE(default_acl); + posix_acl_release(default_acl); } @@ -340,8 +336,7 @@ xfs_vn_mknod( out_cleanup_inode: xfs_cleanup_inode(dir, inode, dentry, mode); out_free_acl: - if (default_acl) - _ACL_FREE(default_acl); + posix_acl_release(default_acl); return -error; } @@ -545,38 +540,6 @@ xfs_vn_put_link( kfree(s); } -#ifdef CONFIG_XFS_POSIX_ACL -STATIC int -xfs_check_acl( - struct inode *inode, - int mask) -{ - struct xfs_inode *ip = XFS_I(inode); - int error; - - xfs_itrace_entry(ip); - - if (XFS_IFORK_Q(ip)) { - error = xfs_acl_iaccess(ip, mask, NULL); - if (error != -1) - return -error; - } - - return -EAGAIN; -} - -STATIC int -xfs_vn_permission( - struct inode *inode, - int mask, - struct nameidata *nd) -{ - return generic_permission(inode, mask, xfs_check_acl); -} -#else -#define xfs_vn_permission NULL -#endif - STATIC int xfs_vn_getattr( struct 
vfsmount *mnt, @@ -689,6 +652,9 @@ xfs_vn_setattr( error = xfs_setattr(XFS_I(inode), &vattr, flags, NULL); if (likely(!error)) vn_revalidate(vn_from_inode(inode)); + + if (!error && (attr->ia_valid & ATTR_MODE)) + error = -xfs_acl_chmod(inode); return -error; } Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_iops.h 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.h 2008-02-07 09:15:35.000000000 +0100 @@ -26,6 +26,12 @@ extern const struct file_operations xfs_ extern const struct file_operations xfs_dir_file_operations; extern const struct file_operations xfs_invis_file_operations; +#ifdef CONFIG_XFS_POSIX_ACL +int xfs_vn_permission(struct inode *inode, int mask, struct nameidata *nd); +#else +#define xfs_vn_permission NULL +#endif + struct xfs_inode; extern void xfs_ichgtime(struct xfs_inode *, int); Index: linux-2.6-xfs/fs/xfs/xfs_acl.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_acl.h 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_acl.h 2008-02-07 09:15:35.000000000 +0100 @@ -18,27 +18,25 @@ #ifndef __XFS_ACL_H__ #define __XFS_ACL_H__ +struct inode; +struct posix_acl; +struct xfs_inode; + + /* * Access Control Lists */ -typedef __uint16_t xfs_acl_perm_t; -typedef __int32_t xfs_acl_type_t; -typedef __int32_t xfs_acl_tag_t; -typedef __int32_t xfs_acl_id_t; - #define XFS_ACL_MAX_ENTRIES 25 #define XFS_ACL_NOT_PRESENT (-1) -typedef struct xfs_acl_entry { - xfs_acl_tag_t ae_tag; - xfs_acl_id_t ae_id; - xfs_acl_perm_t ae_perm; -} xfs_acl_entry_t; - -typedef struct xfs_acl { - __int32_t acl_cnt; - xfs_acl_entry_t acl_entry[XFS_ACL_MAX_ENTRIES]; -} xfs_acl_t; +struct xfs_acl { + __be32 acl_cnt; + struct xfs_acl_entry { + __be32 ae_tag; + __be32 ae_id; + __be16 ae_perm; + } acl_entry[XFS_ACL_MAX_ENTRIES]; +}; /* On-disk XFS extended attribute names */ 
#define SGI_ACL_FILE "SGI_ACL_FILE" @@ -49,51 +47,31 @@ typedef struct xfs_acl { #ifdef CONFIG_XFS_POSIX_ACL -struct vattr; -struct xfs_inode; - -extern struct kmem_zone *xfs_acl_zone; -#define xfs_acl_zone_init(zone, name) \ - (zone) = kmem_zone_init(sizeof(xfs_acl_t), (name)) -#define xfs_acl_zone_destroy(zone) kmem_zone_destroy(zone) - -extern int xfs_acl_inherit(bhv_vnode_t *, mode_t mode, xfs_acl_t *); -extern int xfs_acl_iaccess(struct xfs_inode *, mode_t, cred_t *); -extern int xfs_acl_vtoacl(bhv_vnode_t *, xfs_acl_t *, xfs_acl_t *); -extern int xfs_acl_vhasacl_access(bhv_vnode_t *); -extern int xfs_acl_vhasacl_default(bhv_vnode_t *); -extern int xfs_acl_vset(bhv_vnode_t *, void *, size_t, int); -extern int xfs_acl_vget(bhv_vnode_t *, void *, size_t, int); -extern int xfs_acl_vremove(bhv_vnode_t *, int); - -#define _ACL_TYPE_ACCESS 1 -#define _ACL_TYPE_DEFAULT 2 -#define _ACL_PERM_INVALID(perm) ((perm) & ~(ACL_READ|ACL_WRITE|ACL_EXECUTE)) - -#define _ACL_INHERIT(c,m,d) (xfs_acl_inherit(c,m,d)) -#define _ACL_GET_ACCESS(pv,pa) (xfs_acl_vtoacl(pv,pa,NULL) == 0) -#define _ACL_GET_DEFAULT(pv,pd) (xfs_acl_vtoacl(pv,NULL,pd) == 0) -#define _ACL_ACCESS_EXISTS xfs_acl_vhasacl_access -#define _ACL_DEFAULT_EXISTS xfs_acl_vhasacl_default - -#define _ACL_ALLOC(a) ((a) = kmem_zone_alloc(xfs_acl_zone, KM_SLEEP)) -#define _ACL_FREE(a) ((a)? 
kmem_zone_free(xfs_acl_zone, (a)):(void)0) +struct posix_acl *xfs_get_acl(struct inode *inode, int type); +int xfs_inherit_acl(struct inode *inode, struct posix_acl *default_acl); +int xfs_acl_chmod(struct inode *inode); +void xfs_inode_init_acls(struct xfs_inode *ip); +void xfs_inode_clear_acls(struct xfs_inode *ip); #else -#define xfs_acl_zone_init(zone,name) -#define xfs_acl_zone_destroy(zone) -#define xfs_acl_vset(v,p,sz,t) (-EOPNOTSUPP) -#define xfs_acl_vget(v,p,sz,t) (-EOPNOTSUPP) -#define xfs_acl_vremove(v,t) (-EOPNOTSUPP) -#define xfs_acl_vhasacl_access(v) (0) -#define xfs_acl_vhasacl_default(v) (0) -#define _ACL_ALLOC(a) (1) /* successfully allocate nothing */ -#define _ACL_FREE(a) ((void)0) -#define _ACL_INHERIT(c,m,d) (0) -#define _ACL_GET_ACCESS(pv,pa) (0) -#define _ACL_GET_DEFAULT(pv,pd) (0) -#define _ACL_ACCESS_EXISTS (NULL) -#define _ACL_DEFAULT_EXISTS (NULL) -#endif +static inline struct posix_acl *xfs_get_acl(struct inode *inode, int type) +{ + BUG(); +} +static inline int xfs_inherit_acl(struct inode *inode, + struct posix_acl *default_acl) +{ + BUG(); +} + +static inline void xfs_inode_init_acls(struct xfs_inode *ip) +{ +} + +static inline void xfs_inode_clear_acls(struct xfs_inode *ip) +{ +} + +#endif /* CONFIG_XFS_POSIX_ACL */ #endif /* __XFS_ACL_H__ */ Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-07 09:15:35.000000000 +0100 @@ -52,7 +52,6 @@ #include "xfs_dir2_block.h" #include "xfs_dir2_node.h" #include "xfs_dir2_trace.h" -#include "xfs_acl.h" #include "xfs_attr.h" #include "xfs_attr_leaf.h" #include "xfs_inode_item.h" @@ -183,10 +182,6 @@ EXPORT_SYMBOL(uuid_table_remove); EXPORT_SYMBOL(vn_hold); EXPORT_SYMBOL(vn_revalidate); -#if defined(CONFIG_XFS_POSIX_ACL) -EXPORT_SYMBOL(xfs_acl_vtoacl); -EXPORT_SYMBOL(xfs_acl_inherit); 
-#endif EXPORT_SYMBOL(xfs_alloc_buftarg); EXPORT_SYMBOL(xfs_flush_buftarg); EXPORT_SYMBOL(xfs_free_buftarg); Index: linux-2.6-xfs/fs/xfs/xfs_attr.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_attr.c 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_attr.c 2008-02-07 09:15:35.000000000 +0100 @@ -58,8 +58,6 @@ */ #define ATTR_SYSCOUNT 2 -static struct attrnames posix_acl_access; -static struct attrnames posix_acl_default; static struct attrnames *attr_system_names[ATTR_SYSCOUNT]; /*======================================================================== @@ -2427,80 +2425,6 @@ xfs_attr_trace_enter(int type, char *whe * System (pseudo) namespace attribute interface routines. *========================================================================*/ -STATIC int -posix_acl_access_set( - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) -{ - return xfs_acl_vset(vp, data, size, _ACL_TYPE_ACCESS); -} - -STATIC int -posix_acl_access_remove( - bhv_vnode_t *vp, char *name, int xflags) -{ - return xfs_acl_vremove(vp, _ACL_TYPE_ACCESS); -} - -STATIC int -posix_acl_access_get( - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) -{ - return xfs_acl_vget(vp, data, size, _ACL_TYPE_ACCESS); -} - -STATIC int -posix_acl_access_exists( - bhv_vnode_t *vp) -{ - return xfs_acl_vhasacl_access(vp); -} - -STATIC int -posix_acl_default_set( - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) -{ - return xfs_acl_vset(vp, data, size, _ACL_TYPE_DEFAULT); -} - -STATIC int -posix_acl_default_get( - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) -{ - return xfs_acl_vget(vp, data, size, _ACL_TYPE_DEFAULT); -} - -STATIC int -posix_acl_default_remove( - bhv_vnode_t *vp, char *name, int xflags) -{ - return xfs_acl_vremove(vp, _ACL_TYPE_DEFAULT); -} - -STATIC int -posix_acl_default_exists( - bhv_vnode_t *vp) -{ - return xfs_acl_vhasacl_default(vp); -} - -static 
struct attrnames posix_acl_access = { - .attr_name = "posix_acl_access", - .attr_namelen = sizeof("posix_acl_access") - 1, - .attr_get = posix_acl_access_get, - .attr_set = posix_acl_access_set, - .attr_remove = posix_acl_access_remove, - .attr_exists = posix_acl_access_exists, -}; - -static struct attrnames posix_acl_default = { - .attr_name = "posix_acl_default", - .attr_namelen = sizeof("posix_acl_default") - 1, - .attr_get = posix_acl_default_get, - .attr_set = posix_acl_default_set, - .attr_remove = posix_acl_default_remove, - .attr_exists = posix_acl_default_exists, -}; - static struct attrnames *attr_system_names[] = { &posix_acl_access, &posix_acl_default }; Index: linux-2.6-xfs/fs/xfs/xfs_attr.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_attr.h 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_attr.h 2008-02-07 09:15:35.000000000 +0100 @@ -61,6 +61,8 @@ extern struct attrnames attr_secure; extern struct attrnames attr_system; extern struct attrnames attr_trusted; extern struct attrnames *attr_namespaces[ATTR_NAMECOUNT]; +extern struct attrnames posix_acl_access; +extern struct attrnames posix_acl_default; extern attrnames_t *attr_lookup_namespace(char *, attrnames_t **, int); extern int attr_generic_list(bhv_vnode_t *, void *, size_t, int, ssize_t *); Index: linux-2.6-xfs/fs/xfs/xfs_vfsops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vfsops.c 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_vfsops.c 2008-02-07 09:15:35.000000000 +0100 @@ -78,7 +78,6 @@ xfs_init(void) kmem_zone_init(sizeof(xfs_da_state_t), "xfs_da_state"); xfs_dabuf_zone = kmem_zone_init(sizeof(xfs_dabuf_t), "xfs_dabuf"); xfs_ifork_zone = kmem_zone_init(sizeof(xfs_ifork_t), "xfs_ifork"); - xfs_acl_zone_init(xfs_acl_zone, "xfs_acl"); xfs_mru_cache_init(); xfs_filestream_init(); @@ -160,7 +159,6 @@ xfs_cleanup(void) xfs_refcache_destroy(); 
xfs_filestream_uninit(); xfs_mru_cache_uninit(); - xfs_acl_zone_destroy(xfs_acl_zone); #ifdef XFS_DIR2_TRACE ktrace_free(xfs_dir2_trace_buf); Index: linux-2.6-xfs/fs/xfs/xfs_acl.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_acl.c 2008-02-05 08:43:31.000000000 +0100 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,903 +0,0 @@ -/* - * Copyright (c) 2001-2002,2005 Silicon Graphics, Inc. - * All Rights Reserved. - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation. - * - * This program is distributed in the hope that it would be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write the Free Software Foundation, - * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - */ -#include "xfs.h" -#include "xfs_fs.h" -#include "xfs_types.h" -#include "xfs_bit.h" -#include "xfs_inum.h" -#include "xfs_ag.h" -#include "xfs_dir2.h" -#include "xfs_bmap_btree.h" -#include "xfs_alloc_btree.h" -#include "xfs_ialloc_btree.h" -#include "xfs_dir2_sf.h" -#include "xfs_attr_sf.h" -#include "xfs_dinode.h" -#include "xfs_inode.h" -#include "xfs_btree.h" -#include "xfs_acl.h" -#include "xfs_attr.h" -#include "xfs_vnodeops.h" - -#include -#include - -STATIC int xfs_acl_setmode(bhv_vnode_t *, xfs_acl_t *, int *); -STATIC void xfs_acl_filter_mode(mode_t, xfs_acl_t *); -STATIC void xfs_acl_get_endian(xfs_acl_t *); -STATIC int xfs_acl_access(uid_t, gid_t, xfs_acl_t *, mode_t, cred_t *); -STATIC int xfs_acl_invalid(xfs_acl_t *); -STATIC void xfs_acl_sync_mode(mode_t, xfs_acl_t *); -STATIC void xfs_acl_get_attr(bhv_vnode_t *, xfs_acl_t *, int, int, int 
*); -STATIC void xfs_acl_set_attr(bhv_vnode_t *, xfs_acl_t *, int, int *); -STATIC int xfs_acl_allow_set(bhv_vnode_t *, int); - -kmem_zone_t *xfs_acl_zone; - - -/* - * Test for existence of access ACL attribute as efficiently as possible. - */ -int -xfs_acl_vhasacl_access( - bhv_vnode_t *vp) -{ - int error; - - xfs_acl_get_attr(vp, NULL, _ACL_TYPE_ACCESS, ATTR_KERNOVAL, &error); - return (error == 0); -} - -/* - * Test for existence of default ACL attribute as efficiently as possible. - */ -int -xfs_acl_vhasacl_default( - bhv_vnode_t *vp) -{ - int error; - - if (!VN_ISDIR(vp)) - return 0; - xfs_acl_get_attr(vp, NULL, _ACL_TYPE_DEFAULT, ATTR_KERNOVAL, &error); - return (error == 0); -} - -/* - * Convert from extended attribute representation to in-memory for XFS. - */ -STATIC int -posix_acl_xattr_to_xfs( - posix_acl_xattr_header *src, - size_t size, - xfs_acl_t *dest) -{ - posix_acl_xattr_entry *src_entry; - xfs_acl_entry_t *dest_entry; - int n; - - if (!src || !dest) - return EINVAL; - - if (size < sizeof(posix_acl_xattr_header)) - return EINVAL; - - if (src->a_version != cpu_to_le32(POSIX_ACL_XATTR_VERSION)) - return EOPNOTSUPP; - - memset(dest, 0, sizeof(xfs_acl_t)); - dest->acl_cnt = posix_acl_xattr_count(size); - if (dest->acl_cnt < 0 || dest->acl_cnt > XFS_ACL_MAX_ENTRIES) - return EINVAL; - - /* - * acl_set_file(3) may request that we set default ACLs with - * zero length -- defend (gracefully) against that here. 
- */ - if (!dest->acl_cnt) - return 0; - - src_entry = (posix_acl_xattr_entry *)((char *)src + sizeof(*src)); - dest_entry = &dest->acl_entry[0]; - - for (n = 0; n < dest->acl_cnt; n++, src_entry++, dest_entry++) { - dest_entry->ae_perm = le16_to_cpu(src_entry->e_perm); - if (_ACL_PERM_INVALID(dest_entry->ae_perm)) - return EINVAL; - dest_entry->ae_tag = le16_to_cpu(src_entry->e_tag); - switch(dest_entry->ae_tag) { - case ACL_USER: - case ACL_GROUP: - dest_entry->ae_id = le32_to_cpu(src_entry->e_id); - break; - case ACL_USER_OBJ: - case ACL_GROUP_OBJ: - case ACL_MASK: - case ACL_OTHER: - dest_entry->ae_id = ACL_UNDEFINED_ID; - break; - default: - return EINVAL; - } - } - if (xfs_acl_invalid(dest)) - return EINVAL; - - return 0; -} - -/* - * Comparison function called from xfs_sort(). - * Primary key is ae_tag, secondary key is ae_id. - */ -STATIC int -xfs_acl_entry_compare( - const void *va, - const void *vb) -{ - xfs_acl_entry_t *a = (xfs_acl_entry_t *)va, - *b = (xfs_acl_entry_t *)vb; - - if (a->ae_tag == b->ae_tag) - return (a->ae_id - b->ae_id); - return (a->ae_tag - b->ae_tag); -} - -/* - * Convert from in-memory XFS to extended attribute representation. 
- */ -STATIC int -posix_acl_xfs_to_xattr( - xfs_acl_t *src, - posix_acl_xattr_header *dest, - size_t size) -{ - int n; - size_t new_size = posix_acl_xattr_size(src->acl_cnt); - posix_acl_xattr_entry *dest_entry; - xfs_acl_entry_t *src_entry; - - if (size < new_size) - return -ERANGE; - - /* Need to sort src XFS ACL by */ - xfs_sort(src->acl_entry, src->acl_cnt, sizeof(src->acl_entry[0]), - xfs_acl_entry_compare); - - dest->a_version = cpu_to_le32(POSIX_ACL_XATTR_VERSION); - dest_entry = &dest->a_entries[0]; - src_entry = &src->acl_entry[0]; - for (n = 0; n < src->acl_cnt; n++, dest_entry++, src_entry++) { - dest_entry->e_perm = cpu_to_le16(src_entry->ae_perm); - if (_ACL_PERM_INVALID(src_entry->ae_perm)) - return -EINVAL; - dest_entry->e_tag = cpu_to_le16(src_entry->ae_tag); - switch (src_entry->ae_tag) { - case ACL_USER: - case ACL_GROUP: - dest_entry->e_id = cpu_to_le32(src_entry->ae_id); - break; - case ACL_USER_OBJ: - case ACL_GROUP_OBJ: - case ACL_MASK: - case ACL_OTHER: - dest_entry->e_id = cpu_to_le32(ACL_UNDEFINED_ID); - break; - default: - return -EINVAL; - } - } - return new_size; -} - -int -xfs_acl_vget( - bhv_vnode_t *vp, - void *acl, - size_t size, - int kind) -{ - int error; - xfs_acl_t *xfs_acl = NULL; - posix_acl_xattr_header *ext_acl = acl; - int flags = 0; - - VN_HOLD(vp); - if(size) { - if (!(_ACL_ALLOC(xfs_acl))) { - error = ENOMEM; - goto out; - } - memset(xfs_acl, 0, sizeof(xfs_acl_t)); - } else - flags = ATTR_KERNOVAL; - - xfs_acl_get_attr(vp, xfs_acl, kind, flags, &error); - if (error) - goto out; - - if (!size) { - error = -posix_acl_xattr_size(XFS_ACL_MAX_ENTRIES); - } else { - if (xfs_acl_invalid(xfs_acl)) { - error = EINVAL; - goto out; - } - if (kind == _ACL_TYPE_ACCESS) { - bhv_vattr_t va; - - va.va_mask = XFS_AT_MODE; - error = xfs_getattr(xfs_vtoi(vp), &va, 0); - if (error) - goto out; - xfs_acl_sync_mode(va.va_mode, xfs_acl); - } - error = -posix_acl_xfs_to_xattr(xfs_acl, ext_acl, size); - } -out: - VN_RELE(vp); - if(xfs_acl) - 
_ACL_FREE(xfs_acl); - return -error; -} - -int -xfs_acl_vremove( - bhv_vnode_t *vp, - int kind) -{ - int error; - - VN_HOLD(vp); - error = xfs_acl_allow_set(vp, kind); - if (!error) { - error = xfs_attr_remove(xfs_vtoi(vp), - kind == _ACL_TYPE_DEFAULT? - SGI_ACL_DEFAULT: SGI_ACL_FILE, - ATTR_ROOT); - if (error == ENOATTR) - error = 0; /* 'scool */ - } - VN_RELE(vp); - return -error; -} - -int -xfs_acl_vset( - bhv_vnode_t *vp, - void *acl, - size_t size, - int kind) -{ - posix_acl_xattr_header *ext_acl = acl; - xfs_acl_t *xfs_acl; - int error; - int basicperms = 0; /* more than std unix perms? */ - - if (!acl) - return -EINVAL; - - if (!(_ACL_ALLOC(xfs_acl))) - return -ENOMEM; - - error = posix_acl_xattr_to_xfs(ext_acl, size, xfs_acl); - if (error) { - _ACL_FREE(xfs_acl); - return -error; - } - if (!xfs_acl->acl_cnt) { - _ACL_FREE(xfs_acl); - return 0; - } - - VN_HOLD(vp); - error = xfs_acl_allow_set(vp, kind); - if (error) - goto out; - - /* Incoming ACL exists, set file mode based on its value */ - if (kind == _ACL_TYPE_ACCESS) - xfs_acl_setmode(vp, xfs_acl, &basicperms); - - /* - * If we have more than std unix permissions, set up the actual attr. - * Otherwise, delete any existing attr. This prevents us from - * having actual attrs for permissions that can be stored in the - * standard permission bits. - */ - if (!basicperms) { - xfs_acl_set_attr(vp, xfs_acl, kind, &error); - } else { - xfs_acl_vremove(vp, _ACL_TYPE_ACCESS); - } - -out: - VN_RELE(vp); - _ACL_FREE(xfs_acl); - return -error; -} - -int -xfs_acl_iaccess( - xfs_inode_t *ip, - mode_t mode, - cred_t *cr) -{ - xfs_acl_t *acl; - int rval; - - if (!(_ACL_ALLOC(acl))) - return -1; - - /* If the file has no ACL return -1. */ - rval = sizeof(xfs_acl_t); - if (xfs_attr_fetch(ip, SGI_ACL_FILE, SGI_ACL_FILE_SIZE, - (char *)acl, &rval, ATTR_ROOT | ATTR_KERNACCESS, cr)) { - _ACL_FREE(acl); - return -1; - } - xfs_acl_get_endian(acl); - - /* If the file has an empty ACL return -1. 
*/ - if (acl->acl_cnt == XFS_ACL_NOT_PRESENT) { - _ACL_FREE(acl); - return -1; - } - - /* Synchronize ACL with mode bits */ - xfs_acl_sync_mode(ip->i_d.di_mode, acl); - - rval = xfs_acl_access(ip->i_d.di_uid, ip->i_d.di_gid, acl, mode, cr); - _ACL_FREE(acl); - return rval; -} - -STATIC int -xfs_acl_allow_set( - bhv_vnode_t *vp, - int kind) -{ - xfs_inode_t *ip = xfs_vtoi(vp); - bhv_vattr_t va; - int error; - - if (vp->i_flags & (S_IMMUTABLE|S_APPEND)) - return EPERM; - if (kind == _ACL_TYPE_DEFAULT && !VN_ISDIR(vp)) - return ENOTDIR; - if (vp->i_sb->s_flags & MS_RDONLY) - return EROFS; - va.va_mask = XFS_AT_UID; - error = xfs_getattr(ip, &va, 0); - if (error) - return error; - if (va.va_uid != current->fsuid && !capable(CAP_FOWNER)) - return EPERM; - return error; -} - -/* - * Note: cr is only used here for the capability check if the ACL test fails. - * It is not used to find out the credentials uid or groups etc, as was - * done in IRIX. It is assumed that the uid and groups for the current - * thread are taken from "current" instead of the cr parameter. 
- */ -STATIC int -xfs_acl_access( - uid_t fuid, - gid_t fgid, - xfs_acl_t *fap, - mode_t md, - cred_t *cr) -{ - xfs_acl_entry_t matched; - int i, allows; - int maskallows = -1; /* true, but not 1, either */ - int seen_userobj = 0; - - matched.ae_tag = 0; /* Invalid type */ - matched.ae_perm = 0; - - for (i = 0; i < fap->acl_cnt; i++) { - /* - * Break out if we've got a user_obj entry or - * a user entry and the mask (and have processed USER_OBJ) - */ - if (matched.ae_tag == ACL_USER_OBJ) - break; - if (matched.ae_tag == ACL_USER) { - if (maskallows != -1 && seen_userobj) - break; - if (fap->acl_entry[i].ae_tag != ACL_MASK && - fap->acl_entry[i].ae_tag != ACL_USER_OBJ) - continue; - } - /* True if this entry allows the requested access */ - allows = ((fap->acl_entry[i].ae_perm & md) == md); - - switch (fap->acl_entry[i].ae_tag) { - case ACL_USER_OBJ: - seen_userobj = 1; - if (fuid != current->fsuid) - continue; - matched.ae_tag = ACL_USER_OBJ; - matched.ae_perm = allows; - break; - case ACL_USER: - if (fap->acl_entry[i].ae_id != current->fsuid) - continue; - matched.ae_tag = ACL_USER; - matched.ae_perm = allows; - break; - case ACL_GROUP_OBJ: - if ((matched.ae_tag == ACL_GROUP_OBJ || - matched.ae_tag == ACL_GROUP) && !allows) - continue; - if (!in_group_p(fgid)) - continue; - matched.ae_tag = ACL_GROUP_OBJ; - matched.ae_perm = allows; - break; - case ACL_GROUP: - if ((matched.ae_tag == ACL_GROUP_OBJ || - matched.ae_tag == ACL_GROUP) && !allows) - continue; - if (!in_group_p(fap->acl_entry[i].ae_id)) - continue; - matched.ae_tag = ACL_GROUP; - matched.ae_perm = allows; - break; - case ACL_MASK: - maskallows = allows; - break; - case ACL_OTHER: - if (matched.ae_tag != 0) - continue; - matched.ae_tag = ACL_OTHER; - matched.ae_perm = allows; - break; - } - } - /* - * First possibility is that no matched entry allows access. - * The capability to override DAC may exist, so check for it. 
- */ - switch (matched.ae_tag) { - case ACL_OTHER: - case ACL_USER_OBJ: - if (matched.ae_perm) - return 0; - break; - case ACL_USER: - case ACL_GROUP_OBJ: - case ACL_GROUP: - if (maskallows && matched.ae_perm) - return 0; - break; - case 0: - break; - } - - /* EACCES tells generic_permission to check for capability overrides */ - return EACCES; -} -EXPORT_SYMBOL(xfs_acl_access); - -/* - * ACL validity checker. - * This acl validation routine checks each ACL entry read in makes sense. - */ -STATIC int -xfs_acl_invalid( - xfs_acl_t *aclp) -{ - xfs_acl_entry_t *entry, *e; - int user = 0, group = 0, other = 0, mask = 0; - int mask_required = 0; - int i, j; - - if (!aclp) - goto acl_invalid; - - if (aclp->acl_cnt > XFS_ACL_MAX_ENTRIES) - goto acl_invalid; - - for (i = 0; i < aclp->acl_cnt; i++) { - entry = &aclp->acl_entry[i]; - switch (entry->ae_tag) { - case ACL_USER_OBJ: - if (user++) - goto acl_invalid; - break; - case ACL_GROUP_OBJ: - if (group++) - goto acl_invalid; - break; - case ACL_OTHER: - if (other++) - goto acl_invalid; - break; - case ACL_USER: - case ACL_GROUP: - for (j = i + 1; j < aclp->acl_cnt; j++) { - e = &aclp->acl_entry[j]; - if (e->ae_id == entry->ae_id && - e->ae_tag == entry->ae_tag) - goto acl_invalid; - } - mask_required++; - break; - case ACL_MASK: - if (mask++) - goto acl_invalid; - break; - default: - goto acl_invalid; - } - } - if (!user || !group || !other || (mask_required && !mask)) - goto acl_invalid; - else - return 0; -acl_invalid: - return EINVAL; -} - -/* - * Do ACL endian conversion. 
- */ -STATIC void -xfs_acl_get_endian( - xfs_acl_t *aclp) -{ - xfs_acl_entry_t *ace, *end; - - INT_SET(aclp->acl_cnt, ARCH_CONVERT, aclp->acl_cnt); - end = &aclp->acl_entry[0]+aclp->acl_cnt; - for (ace = &aclp->acl_entry[0]; ace < end; ace++) { - INT_SET(ace->ae_tag, ARCH_CONVERT, ace->ae_tag); - INT_SET(ace->ae_id, ARCH_CONVERT, ace->ae_id); - INT_SET(ace->ae_perm, ARCH_CONVERT, ace->ae_perm); - } -} - -/* - * Get the ACL from the EA and do endian conversion. - */ -STATIC void -xfs_acl_get_attr( - bhv_vnode_t *vp, - xfs_acl_t *aclp, - int kind, - int flags, - int *error) -{ - int len = sizeof(xfs_acl_t); - - ASSERT((flags & ATTR_KERNOVAL) ? (aclp == NULL) : 1); - flags |= ATTR_ROOT; - *error = xfs_attr_get(xfs_vtoi(vp), - kind == _ACL_TYPE_ACCESS ? - SGI_ACL_FILE : SGI_ACL_DEFAULT, - (char *)aclp, &len, flags, sys_cred); - if (*error || (flags & ATTR_KERNOVAL)) - return; - xfs_acl_get_endian(aclp); -} - -/* - * Set the EA with the ACL and do endian conversion. - */ -STATIC void -xfs_acl_set_attr( - bhv_vnode_t *vp, - xfs_acl_t *aclp, - int kind, - int *error) -{ - xfs_acl_entry_t *ace, *newace, *end; - xfs_acl_t *newacl; - int len; - - if (!(_ACL_ALLOC(newacl))) { - *error = ENOMEM; - return; - } - - len = sizeof(xfs_acl_t) - - (sizeof(xfs_acl_entry_t) * (XFS_ACL_MAX_ENTRIES - aclp->acl_cnt)); - end = &aclp->acl_entry[0]+aclp->acl_cnt; - for (ace = &aclp->acl_entry[0], newace = &newacl->acl_entry[0]; - ace < end; - ace++, newace++) { - INT_SET(newace->ae_tag, ARCH_CONVERT, ace->ae_tag); - INT_SET(newace->ae_id, ARCH_CONVERT, ace->ae_id); - INT_SET(newace->ae_perm, ARCH_CONVERT, ace->ae_perm); - } - INT_SET(newacl->acl_cnt, ARCH_CONVERT, aclp->acl_cnt); - *error = xfs_attr_set(xfs_vtoi(vp), - kind == _ACL_TYPE_ACCESS ? 
- SGI_ACL_FILE: SGI_ACL_DEFAULT, - (char *)newacl, len, ATTR_ROOT); - _ACL_FREE(newacl); -} - -int -xfs_acl_vtoacl( - bhv_vnode_t *vp, - xfs_acl_t *access_acl, - xfs_acl_t *default_acl) -{ - bhv_vattr_t va; - int error = 0; - - if (access_acl) { - /* - * Get the Access ACL and the mode. If either cannot - * be obtained for some reason, invalidate the access ACL. - */ - xfs_acl_get_attr(vp, access_acl, _ACL_TYPE_ACCESS, 0, &error); - if (!error) { - /* Got the ACL, need the mode... */ - va.va_mask = XFS_AT_MODE; - error = xfs_getattr(xfs_vtoi(vp), &va, 0); - } - - if (error) - access_acl->acl_cnt = XFS_ACL_NOT_PRESENT; - else /* We have a good ACL and the file mode, synchronize. */ - xfs_acl_sync_mode(va.va_mode, access_acl); - } - - if (default_acl) { - xfs_acl_get_attr(vp, default_acl, _ACL_TYPE_DEFAULT, 0, &error); - if (error) - default_acl->acl_cnt = XFS_ACL_NOT_PRESENT; - } - return error; -} - -/* - * This function retrieves the parent directory's acl, processes it - * and lets the child inherit the acl(s) that it should. - */ -int -xfs_acl_inherit( - bhv_vnode_t *vp, - mode_t mode, - xfs_acl_t *pdaclp) -{ - xfs_acl_t *cacl; - int error = 0; - int basicperms = 0; - - /* - * If the parent does not have a default ACL, or it's an - * invalid ACL, we're done. - */ - if (!vp) - return 0; - if (!pdaclp || xfs_acl_invalid(pdaclp)) - return 0; - - /* - * Copy the default ACL of the containing directory to - * the access ACL of the new file and use the mode that - * was passed in to set up the correct initial values for - * the u::,g::[m::], and o:: entries. This is what makes - * umask() "work" with ACL's. - */ - - if (!(_ACL_ALLOC(cacl))) - return ENOMEM; - - memcpy(cacl, pdaclp, sizeof(xfs_acl_t)); - xfs_acl_filter_mode(mode, cacl); - xfs_acl_setmode(vp, cacl, &basicperms); - - /* - * Set the Default and Access ACL on the file. The mode is already - * set on the file, so we don't need to worry about that. 
- * - * If the new file is a directory, its default ACL is a copy of - * the containing directory's default ACL. - */ - if (VN_ISDIR(vp)) - xfs_acl_set_attr(vp, pdaclp, _ACL_TYPE_DEFAULT, &error); - if (!error && !basicperms) - xfs_acl_set_attr(vp, cacl, _ACL_TYPE_ACCESS, &error); - _ACL_FREE(cacl); - return error; -} - -/* - * Set up the correct mode on the file based on the supplied ACL. This - * makes sure that the mode on the file reflects the state of the - * u::,g::[m::], and o:: entries in the ACL. Since the mode is where - * the ACL is going to get the permissions for these entries, we must - * synchronize the mode whenever we set the ACL on a file. - */ -STATIC int -xfs_acl_setmode( - bhv_vnode_t *vp, - xfs_acl_t *acl, - int *basicperms) -{ - bhv_vattr_t va; - xfs_acl_entry_t *ap; - xfs_acl_entry_t *gap = NULL; - int i, error, nomask = 1; - - *basicperms = 1; - - if (acl->acl_cnt == XFS_ACL_NOT_PRESENT) - return 0; - - /* - * Copy the u::, g::, o::, and m:: bits from the ACL into the - * mode. The m:: bits take precedence over the g:: bits. 
- */ - va.va_mask = XFS_AT_MODE; - error = xfs_getattr(xfs_vtoi(vp), &va, 0); - if (error) - return error; - - va.va_mask = XFS_AT_MODE; - va.va_mode &= ~(S_IRWXU|S_IRWXG|S_IRWXO); - ap = acl->acl_entry; - for (i = 0; i < acl->acl_cnt; ++i) { - switch (ap->ae_tag) { - case ACL_USER_OBJ: - va.va_mode |= ap->ae_perm << 6; - break; - case ACL_GROUP_OBJ: - gap = ap; - break; - case ACL_MASK: /* more than just standard modes */ - nomask = 0; - va.va_mode |= ap->ae_perm << 3; - *basicperms = 0; - break; - case ACL_OTHER: - va.va_mode |= ap->ae_perm; - break; - default: /* more than just standard modes */ - *basicperms = 0; - break; - } - ap++; - } - - /* Set the group bits from ACL_GROUP_OBJ if there's no ACL_MASK */ - if (gap && nomask) - va.va_mode |= gap->ae_perm << 3; - - return xfs_setattr(xfs_vtoi(vp), &va, 0, sys_cred); -} - -/* - * The permissions for the special ACL entries (u::, g::[m::], o::) are - * actually stored in the file mode (if there is both a group and a mask, - * the group is stored in the ACL entry and the mask is stored on the file). - * This allows the mode to remain automatically in sync with the ACL without - * the need for a call-back to the ACL system at every point where the mode - * could change. This function takes the permissions from the specified mode - * and places it in the supplied ACL. - * - * This implementation draws its validity from the fact that, when the ACL - * was assigned, the mode was copied from the ACL. - * If the mode did not change, therefore, the mode remains exactly what was - * taken from the special ACL entries at assignment. - * If a subsequent chmod() was done, the POSIX spec says that the change in - * mode must cause an update to the ACL seen at user level and used for - * access checks. Before and after a mode change, therefore, the file mode - * most accurately reflects what the special ACL entries should permit/deny. 
- * - * CAVEAT: If someone sets the SGI_ACL_FILE attribute directly, - * the existing mode bits will override whatever is in the - * ACL. Similarly, if there is a pre-existing ACL that was - * never in sync with its mode (owing to a bug in 6.5 and - * before), it will now magically (or mystically) be - * synchronized. This could cause slight astonishment, but - * it is better than inconsistent permissions. - * - * The supplied ACL is a template that may contain any combination - * of special entries. These are treated as place holders when we fill - * out the ACL. This routine does not add or remove special entries, it - * simply unites each special entry with its associated set of permissions. - */ -STATIC void -xfs_acl_sync_mode( - mode_t mode, - xfs_acl_t *acl) -{ - int i, nomask = 1; - xfs_acl_entry_t *ap; - xfs_acl_entry_t *gap = NULL; - - /* - * Set ACL entries. POSIX1003.1eD16 requires that the MASK - * be set instead of the GROUP entry, if there is a MASK. - */ - for (ap = acl->acl_entry, i = 0; i < acl->acl_cnt; ap++, i++) { - switch (ap->ae_tag) { - case ACL_USER_OBJ: - ap->ae_perm = (mode >> 6) & 0x7; - break; - case ACL_GROUP_OBJ: - gap = ap; - break; - case ACL_MASK: - nomask = 0; - ap->ae_perm = (mode >> 3) & 0x7; - break; - case ACL_OTHER: - ap->ae_perm = mode & 0x7; - break; - default: - break; - } - } - /* Set the ACL_GROUP_OBJ if there's no ACL_MASK */ - if (gap && nomask) - gap->ae_perm = (mode >> 3) & 0x7; -} - -/* - * When inheriting an Access ACL from a directory Default ACL, - * the ACL bits are set to the intersection of the ACL default - * permission bits and the file permission bits in mode. If there - * are no permission bits on the file then we must not give them - * the ACL. This is what what makes umask() work with ACLs. - */ -STATIC void -xfs_acl_filter_mode( - mode_t mode, - xfs_acl_t *acl) -{ - int i, nomask = 1; - xfs_acl_entry_t *ap; - xfs_acl_entry_t *gap = NULL; - - /* - * Set ACL entries. 
POSIX1003.1eD16 requires that the MASK - * be merged with GROUP entry, if there is a MASK. - */ - for (ap = acl->acl_entry, i = 0; i < acl->acl_cnt; ap++, i++) { - switch (ap->ae_tag) { - case ACL_USER_OBJ: - ap->ae_perm &= (mode >> 6) & 0x7; - break; - case ACL_GROUP_OBJ: - gap = ap; - break; - case ACL_MASK: - nomask = 0; - ap->ae_perm &= (mode >> 3) & 0x7; - break; - case ACL_OTHER: - ap->ae_perm &= mode & 0x7; - break; - default: - break; - } - } - /* Set the ACL_GROUP_OBJ if there's no ACL_MASK */ - if (gap && nomask) - gap->ae_perm &= (mode >> 3) & 0x7; -} Index: linux-2.6-xfs/fs/xfs/Makefile =================================================================== --- linux-2.6-xfs.orig/fs/xfs/Makefile 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/Makefile 2008-02-07 09:15:35.000000000 +0100 @@ -29,7 +29,7 @@ obj-$(CONFIG_XFS_QUOTA) += quota/ obj-$(CONFIG_XFS_DMAPI) += dmapi/ xfs-$(CONFIG_XFS_RT) += xfs_rtalloc.o -xfs-$(CONFIG_XFS_POSIX_ACL) += xfs_acl.o +xfs-$(CONFIG_XFS_POSIX_ACL) += $(XFS_LINUX)/xfs_acl.o xfs-$(CONFIG_PROC_FS) += $(XFS_LINUX)/xfs_stats.o xfs-$(CONFIG_SYSCTL) += $(XFS_LINUX)/xfs_sysctl.o xfs-$(CONFIG_COMPAT) += $(XFS_LINUX)/xfs_ioctl32.o Index: linux-2.6-xfs/fs/xfs/xfs_inode.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.c 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_inode.c 2008-02-07 09:15:35.000000000 +0100 @@ -52,6 +52,7 @@ #include "xfs_acl.h" #include "xfs_filestream.h" #include "xfs_vnodeops.h" +#include "xfs_acl.h" kmem_zone_t *xfs_ifork_zone; kmem_zone_t *xfs_inode_zone; @@ -870,6 +871,7 @@ xfs_iread( ip->i_mount = mp; atomic_set(&ip->i_iocount, 0); spin_lock_init(&ip->i_flags_lock); + xfs_inode_init_acls(ip); /* * Get pointer's to the on-disk inode and the buffer containing it. 
@@ -2793,6 +2795,8 @@ xfs_idestroy( } xfs_inode_item_destroy(ip); } + + xfs_inode_clear_acls(ip); kmem_zone_free(xfs_inode_zone, ip); } Index: linux-2.6-xfs/fs/xfs/xfs_inode.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.h 2008-02-05 08:43:31.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_inode.h 2008-02-07 09:15:35.000000000 +0100 @@ -18,6 +18,7 @@ #ifndef __XFS_INODE_H__ #define __XFS_INODE_H__ +struct posix_acl; struct xfs_dinode; struct xfs_dinode_core; @@ -258,6 +259,11 @@ typedef struct xfs_inode { xfs_fsize_t i_size; /* in-memory size */ xfs_fsize_t i_new_size; /* size when write completes */ atomic_t i_iocount; /* outstanding I/O count */ + +#ifdef CONFIG_XFS_POSIX_ACL + struct posix_acl *i_acl; + struct posix_acl *i_default_acl; +#endif /* Trace buffers per inode. */ #ifdef XFS_INODE_TRACE struct ktrace *i_trace; /* general inode trace */ Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.c 2008-02-07 09:15:55.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.c 2008-02-07 09:16:07.000000000 +0100 @@ -77,132 +77,6 @@ xfs_open( } /* - * xfs_getattr - */ -int -xfs_getattr( - xfs_inode_t *ip, - bhv_vattr_t *vap, - int flags) -{ - bhv_vnode_t *vp = XFS_ITOV(ip); - xfs_mount_t *mp = ip->i_mount; - - xfs_itrace_entry(ip); - - if (XFS_FORCED_SHUTDOWN(mp)) - return XFS_ERROR(EIO); - - if (!(flags & ATTR_LAZY)) - xfs_ilock(ip, XFS_ILOCK_SHARED); - - vap->va_size = XFS_ISIZE(ip); - if (vap->va_mask == XFS_AT_SIZE) - goto all_done; - - vap->va_nblocks = - XFS_FSB_TO_BB(mp, ip->i_d.di_nblocks + ip->i_delayed_blks); - vap->va_nodeid = ip->i_ino; -#if XFS_BIG_INUMS - vap->va_nodeid += mp->m_inoadd; -#endif - vap->va_nlink = ip->i_d.di_nlink; - - /* - * Quick exit for non-stat callers - */ - if ((vap->va_mask & - ~(XFS_AT_SIZE|XFS_AT_FSID|XFS_AT_NODEID| - XFS_AT_NLINK|XFS_AT_BLKSIZE)) == 0) - goto all_done; 
- - /* - * Copy from in-core inode. - */ - vap->va_mode = ip->i_d.di_mode; - vap->va_uid = ip->i_d.di_uid; - vap->va_gid = ip->i_d.di_gid; - vap->va_projid = ip->i_d.di_projid; - - /* - * Check vnode type block/char vs. everything else. - */ - switch (ip->i_d.di_mode & S_IFMT) { - case S_IFBLK: - case S_IFCHR: - vap->va_rdev = ip->i_df.if_u2.if_rdev; - vap->va_blocksize = BLKDEV_IOSIZE; - break; - default: - vap->va_rdev = 0; - - if (!(XFS_IS_REALTIME_INODE(ip))) { - vap->va_blocksize = xfs_preferred_iosize(mp); - } else { - - /* - * If the file blocks are being allocated from a - * realtime partition, then return the inode's - * realtime extent size or the realtime volume's - * extent size. - */ - vap->va_blocksize = - xfs_get_extsz_hint(ip) << mp->m_sb.sb_blocklog; - } - break; - } - - vn_atime_to_timespec(vp, &vap->va_atime); - vap->va_mtime.tv_sec = ip->i_d.di_mtime.t_sec; - vap->va_mtime.tv_nsec = ip->i_d.di_mtime.t_nsec; - vap->va_ctime.tv_sec = ip->i_d.di_ctime.t_sec; - vap->va_ctime.tv_nsec = ip->i_d.di_ctime.t_nsec; - - /* - * Exit for stat callers. See if any of the rest of the fields - * to be filled in are needed. - */ - if ((vap->va_mask & - (XFS_AT_XFLAGS|XFS_AT_EXTSIZE|XFS_AT_NEXTENTS|XFS_AT_ANEXTENTS| - XFS_AT_GENCOUNT|XFS_AT_VCODE)) == 0) - goto all_done; - - /* - * Convert di_flags to xflags. - */ - vap->va_xflags = xfs_ip2xflags(ip); - - /* - * Exit for inode revalidate. See if any of the rest of - * the fields to be filled in are needed. - */ - if ((vap->va_mask & - (XFS_AT_EXTSIZE|XFS_AT_NEXTENTS|XFS_AT_ANEXTENTS| - XFS_AT_GENCOUNT|XFS_AT_VCODE)) == 0) - goto all_done; - - vap->va_extsize = ip->i_d.di_extsize << mp->m_sb.sb_blocklog; - vap->va_nextents = - (ip->i_df.if_flags & XFS_IFEXTENTS) ? - ip->i_df.if_bytes / sizeof(xfs_bmbt_rec_t) : - ip->i_d.di_nextents; - if (ip->i_afp) - vap->va_anextents = - (ip->i_afp->if_flags & XFS_IFEXTENTS) ? 
- ip->i_afp->if_bytes / sizeof(xfs_bmbt_rec_t) : - ip->i_d.di_anextents; - else - vap->va_anextents = 0; - vap->va_gen = ip->i_d.di_gen; - - all_done: - if (!(flags & ATTR_LAZY)) - xfs_iunlock(ip, XFS_ILOCK_SHARED); - return 0; -} - - -/* * xfs_setattr */ int Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.h 2008-02-07 09:15:48.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.h 2008-02-07 09:15:53.000000000 +0100 @@ -15,7 +15,6 @@ struct xfs_iomap; int xfs_open(struct xfs_inode *ip); -int xfs_getattr(struct xfs_inode *ip, struct bhv_vattr *vap, int flags); int xfs_setattr(struct xfs_inode *ip, struct bhv_vattr *vap, int flags, struct cred *credp); int xfs_readlink(struct xfs_inode *ip, char *link); From owner-xfs@oss.sgi.com Thu Feb 7 18:27:00 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 18:27:06 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_00,WEIRD_PORT autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m182Qox8022396 for ; Thu, 7 Feb 2008 18:26:54 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA17067; Fri, 8 Feb 2008 13:27:05 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 0DB1058C4C11; Fri, 8 Feb 2008 13:27:04 +1100 (EST) Date: Fri, 08 Feb 2008 13:27:04 +1100 To: torvalds@linux-foundation.org Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com, akpm@linux-foundation.org Subject: [GIT PULL] XFS update for 2.6.25 User-Agent: nail 11.25 7/29/05 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit 
Message-Id: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/5732/Thu Feb 7 14:45:29 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14368 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Please pull from the for-linus branch: git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus This will update the following files: fs/xfs/Makefile-linux-2.6 | 1 - fs/xfs/linux-2.6/spin.h | 45 --- fs/xfs/linux-2.6/xfs_aops.c | 43 ++- fs/xfs/linux-2.6/xfs_buf.c | 57 +--- fs/xfs/linux-2.6/xfs_buf.h | 1 - fs/xfs/linux-2.6/xfs_export.c | 25 +- fs/xfs/linux-2.6/xfs_file.c | 3 +- fs/xfs/linux-2.6/xfs_globals.c | 3 +- fs/xfs/linux-2.6/xfs_ioctl.c | 86 ++--- fs/xfs/linux-2.6/xfs_ioctl32.c | 9 +- fs/xfs/linux-2.6/xfs_iops.c | 170 +++++++-- fs/xfs/linux-2.6/xfs_linux.h | 34 +-- fs/xfs/linux-2.6/xfs_lrw.c | 122 +++---- fs/xfs/linux-2.6/xfs_lrw.h | 16 +- fs/xfs/linux-2.6/xfs_super.c | 572 +++++++++++++++++++++++++++-- fs/xfs/linux-2.6/xfs_vnode.c | 117 +++---- fs/xfs/linux-2.6/xfs_vnode.h | 62 ++-- fs/xfs/quota/xfs_dquot.c | 12 +- fs/xfs/quota/xfs_dquot.h | 5 - fs/xfs/quota/xfs_dquot_item.c | 27 +- fs/xfs/quota/xfs_qm.c | 14 +- fs/xfs/quota/xfs_qm.h | 6 +- fs/xfs/quota/xfs_qm_syscalls.c | 19 +- fs/xfs/support/debug.c | 7 +- fs/xfs/support/ktrace.c | 8 +- fs/xfs/support/ktrace.h | 3 - fs/xfs/support/uuid.c | 2 +- fs/xfs/xfs.h | 2 +- fs/xfs/xfs_acl.c | 30 +-- fs/xfs/xfs_acl.h | 2 - fs/xfs/xfs_ag.h | 2 +- fs/xfs/xfs_alloc.c | 19 +- fs/xfs/xfs_attr.c | 2 +- fs/xfs/xfs_attr_leaf.c | 8 +- fs/xfs/xfs_bit.c | 103 ------ fs/xfs/xfs_bit.h | 27 ++- fs/xfs/xfs_bmap.c | 22 +- fs/xfs/xfs_bmap.h | 2 + fs/xfs/xfs_bmap_btree.c | 3 +- fs/xfs/xfs_btree.h | 2 + fs/xfs/xfs_buf_item.c | 10 +- fs/xfs/xfs_buf_item.h | 2 + fs/xfs/xfs_da_btree.c | 13 +- fs/xfs/xfs_da_btree.h | 1 + fs/xfs/xfs_dfrag.c | 84 ++--- 
fs/xfs/xfs_dinode.h | 82 ++--- fs/xfs/xfs_dir2.c | 3 +- fs/xfs/xfs_error.c | 31 -- fs/xfs/xfs_error.h | 2 + fs/xfs/xfs_extfree_item.c | 21 +- fs/xfs/xfs_filestream.c | 2 +- fs/xfs/xfs_fs.h | 10 +- fs/xfs/xfs_fsops.c | 13 +- fs/xfs/xfs_ialloc_btree.h | 2 - fs/xfs/xfs_iget.c | 185 ++++------ fs/xfs/xfs_inode.c | 225 +++--------- fs/xfs/xfs_inode.h | 98 ++--- fs/xfs/xfs_inode_item.c | 26 +- fs/xfs/xfs_iocore.c | 119 ------ fs/xfs/xfs_iomap.c | 212 ++++++------ fs/xfs/xfs_iomap.h | 5 +- fs/xfs/xfs_itable.c | 12 +- fs/xfs/xfs_log.c | 416 ++++++++++----------- fs/xfs/xfs_log.h | 3 +- fs/xfs/xfs_log_priv.h | 96 ++--- fs/xfs/xfs_log_recover.c | 197 +++++------ fs/xfs/xfs_mount.c | 344 ++++++++++-------- fs/xfs/xfs_mount.h | 127 +------ fs/xfs/xfs_mru_cache.c | 54 ++-- fs/xfs/xfs_qmops.c | 7 +- fs/xfs/xfs_rename.c | 9 +- fs/xfs/xfs_rtalloc.c | 19 +- fs/xfs/xfs_rtalloc.h | 2 - fs/xfs/xfs_rw.h | 12 +- fs/xfs/xfs_trans.c | 7 +- fs/xfs/xfs_trans.h | 7 +- fs/xfs/xfs_trans_ail.c | 340 +++++++++++------ fs/xfs/xfs_trans_item.c | 1 + fs/xfs/xfs_trans_priv.h | 13 +- fs/xfs/xfs_utils.c | 11 +- fs/xfs/xfs_utils.h | 2 - fs/xfs/xfs_vfsops.c | 793 ++-------------------------------------- fs/xfs/xfs_vfsops.h | 9 +- fs/xfs/xfs_vnodeops.c | 165 +++------ fs/xfs/xfs_vnodeops.h | 2 - 85 files changed, 2303 insertions(+), 3184 deletions(-) delete mode 100644 fs/xfs/linux-2.6/spin.h delete mode 100644 fs/xfs/xfs_iocore.c through these commits: commit de2eeea609b55e8c3994133a565b39edeaaaaf69 Author: Lachlan McIlroy Date: Wed Feb 6 13:37:56 2008 +1100 [XFS] add __init/__exit mark to specific init/cleanup functions SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30459a Signed-off-by: Lachlan McIlroy Signed-off-by: Denis Cheng commit 450790a2c51e6d9d47ed30dbdcf486656b8e186f Author: David Chinner Date: Wed Feb 6 13:37:40 2008 +1100 [XFS] Fix oops in xfs_file_readdir() When xfs_file_readdir() exactly fills a buffer, it can move its index past the end of the buffer and dereference it even though
the result of the dereference is never used. On some platforms this causes an oops. SGI-PV: 976923 SGI-Modid: xfs-linux-melb:xfs-kern:30458a Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit cbc89dcfd24fd161f7a8e262266177db160a58fb Author: Christoph Hellwig Date: Tue Feb 5 12:14:01 2008 +1100 [XFS] kill xfs_root The only caller (xfs_fs_fill_super) can simply call igrab on the root inode. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30393a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 4188c78d951d8a44630f4c33bc0f5b63374572a4 Author: Christoph Hellwig Date: Tue Feb 5 12:13:53 2008 +1100 [XFS] keep i_nlink updated and use proper accessors To get the read-only bind mounts in -mm to work correctly with XFS we need to call the drop_nlink and inc_nlink helpers to monitor the link count. Add calls to these to xfs_bumplink and xfs_droplink and stop copying over di_nlink to i_nlink in xfs_validate_fields and vn_revalidate. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30392a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 222096ae7f7616caa9e4150948096160cc8a8141 Author: Christoph Hellwig Date: Tue Feb 5 12:13:46 2008 +1100 [XFS] stop updating inode->i_blocks The VFS doesn't use i_blocks; it's only used by generic_fillattr and the generic quota code, which XFS doesn't use. In XFS there is one use to check whether we have an inline or out-of-line symlink, but we can replace that with a check of the XFS_IFINLINE inode flag.
SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30391a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit de08dbc1977419efa47eb71f10d96a98eb5bb111 Author: David Chinner Date: Tue Feb 5 12:13:38 2008 +1100 [XFS] Make xfs_ail_check check less by default Checking the entire AIL on every insert and remove is prohibitively expensive - the sustained sequential create rate on a single disk drops from about 1800/s to 60/s because of this checking, resulting in xfslogd becoming CPU bound. By default on debug builds, only check the next and previous entries in the list to ensure they are ordered correctly. If you really want, define XFS_TRANS_DEBUG to use the old behaviour. SGI-PV: 972759 SGI-Modid: xfs-linux-melb:xfs-kern:30372a Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit 249a8c1124653fa90f3a3afff869095a31bc229f Author: David Chinner Date: Tue Feb 5 12:13:32 2008 +1100 [XFS] Move AIL pushing into it's own thread When many hundreds to thousands of threads all try to do simultaneous transactions and the log is in a tail-pushing situation (i.e. full), we can get multiple threads walking the AIL list and contending on the AIL lock. The AIL push is, in effect, a simple I/O dispatch algorithm complicated by the ordering constraints placed on it by the transaction subsystem. It really does not need multiple threads to push on it - even when only a single CPU is pushing the AIL, it can push the I/O out far faster than pretty much any disk subsystem can handle. So, to avoid contention problems stemming from multiple list walkers, move the list walk off into another thread and simply provide a "target" to push to. When a thread requires a push, it sets the target and wakes the push thread, then goes to sleep waiting for the required amount of space to become available in the log.
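The "set a target, wake one pusher" pattern described above can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: the names push_ctx, request_push and pusher_step are hypothetical and do not come from the XFS sources, the sketch is single-threaded, and it leaves out the sleeping of waiters and all locking that the real code needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the push state; not the real XFS structure. */
struct push_ctx {
	uint64_t target;  /* furthest point any waiter has asked to be pushed */
	uint64_t pushed;  /* how far the single pusher has got so far */
	bool     wake;    /* true while the pusher has outstanding work */
};

/* A waiter needing log space only raises the target and wakes the pusher. */
static void request_push(struct push_ctx *c, uint64_t lsn)
{
	if (lsn > c->target) {
		c->target = lsn;
		c->wake = true;
	}
}

/*
 * One pass of the pusher: advance by at most `batch`, resuming from where
 * the previous pass stopped. Returns true if work remains.
 */
static bool pusher_step(struct push_ctx *c, uint64_t batch)
{
	uint64_t left;

	assert(c->pushed <= c->target);   /* the target only ever grows */
	left = c->target - c->pushed;
	c->pushed += (left < batch) ? left : batch;
	if (c->pushed == c->target)
		c->wake = false;
	return c->pushed < c->target;
}
```

Because only one context walks the list, many requesters cost no more than an update of the target, and the pusher can keep its position between passes instead of restarting from the head.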
This mechanism should also be a lot fairer under heavy load as the waiters will queue in arrival order, rather than queuing in "who completed a push first" order. Also, by moving the pushing to a separate thread we can do more effective overload detection and prevention as we can keep context from loop iteration to loop iteration. That is, we can push only part of the list each loop and not have to loop back to the start of the list every time we run. This should also help by reducing the number of items we try to lock and/or push items that we cannot move. Note that this patch is not intended to solve the inefficiencies in the AIL structure and the associated issues with extremely large list contents. That needs to be addressed separately; parallel access would cause problems to any new structure as well, so I'm only aiming to isolate the structure from unbounded parallelism here. SGI-PV: 972759 SGI-Modid: xfs-linux-melb:xfs-kern:30371a Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit 4576758db5817a91b8974c696247d459dc653db2 Author: Christoph Hellwig Date: Tue Feb 5 12:13:24 2008 +1100 [XFS] use generic_permission Now that all direct callers of xfs_iaccess are gone we can kill xfs_iaccess and xfs_access and just use generic_permission with a check_acl callback. This is required for the per-mount read-only patchset in -mm to work properly with XFS. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30370a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit f6aa7f2184330262e1cb5f7802536e5346bd46a3 Author: Christoph Hellwig Date: Tue Feb 5 12:13:15 2008 +1100 [XFS] stop re-checking permissions in xfs_swapext xfs_swapext should simply check if we have a writeable file descriptor instead of re-checking the permissions using xfs_iaccess. Add an additional check to refuse O_APPEND file descriptors because swapext is not an append-only write operation.
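As a rough userspace analogue of the descriptor check described above (opened for writing, and not O_APPEND), the following sketch tests open(2) flags such as those returned by fcntl(fd, F_GETFL). The helper name fd_flags_allow_swapext is hypothetical; the kernel code works on struct file fields rather than on these flags, so treat this as an illustration of the rule, not of the patch itself.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper: given open(2) status flags, decide whether the
 * descriptor is acceptable for an extent swap under the rule stated above.
 */
static bool fd_flags_allow_swapext(int flags)
{
	int acc = flags & O_ACCMODE;

	/* Must have been opened for writing. */
	if (acc != O_WRONLY && acc != O_RDWR)
		return false;

	/* Swapping extents rewrites existing data, so append-only is refused. */
	if (flags & O_APPEND)
		return false;

	return true;
}
```

Checking the descriptor rather than re-running a permission check also covers descriptors that were handed over by another process.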
SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30369a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 35fec8df65217546f6d9d508b203c1d135a67fbc Author: Christoph Hellwig Date: Tue Feb 5 12:13:07 2008 +1100 [XFS] clean up xfs_swapext - stop using vnodes - use proper multiple label goto unwinding - give the struct file * variables saner names SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30366a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 199037c598daf5f3602dace68c331665a4f4f0c1 Author: Christoph Hellwig Date: Tue Feb 5 12:12:58 2008 +1100 [XFS] remove permission check from xfs_change_file_space Both callers of xfs_change_file_space already do the file->f_mode & FMODE_WRITE check to ensure we have a file descriptor that has been opened for write mode, so there is no need to re-check that with xfs_iaccess, especially as the latter might wrongly deny it for corner cases like file descriptor passing through Unix domain sockets. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30365a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 9742bb93da27737fe490eab2af9fba1efa243dcb Author: Lachlan McIlroy Date: Thu Jan 10 16:43:36 2008 +1100 [XFS] prevent panic during log recovery due to bogus op_hdr length A problem was reported where a system panicked in log recovery due to a corrupt log record. The cause of the corruption is not known but this change will at least prevent a crash for this specific scenario. Log recovery definitely needs some more work in this area. SGI-PV: 974151 SGI-Modid: xfs-linux-melb:xfs-kern:30318a Signed-off-by: Lachlan McIlroy Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig commit f71354bc3a96c657a70e36dcf980cbad6c9fc63f Author: Christoph Hellwig Date: Tue Dec 18 16:26:55 2007 +1100 [XFS] Cleanup various fid related bits: - merge xfs_fid2 into its only caller xfs_dm_inode_to_fh.
- remove xfs_vget and opencode it in the two callers, simplifying both of them by avoiding the awkward calling convention. - assign directly to the dm_fid_t members in various places in the dmapi code instead of casting them to xfs_fid_t first (which is identical to dm_fid_t) SGI-PV: 974747 SGI-Modid: xfs-linux-melb:xfs-kern:30258a Signed-off-by: Christoph Hellwig Signed-off-by: Vlad Apostolov Signed-off-by: Lachlan McIlroy commit edd319dc527733e61eec5bdc9ce20c94634b6482 Author: David Chinner Date: Fri Dec 7 14:08:48 2007 +1100 [XFS] Fix xfs_lowbit64 xfs_lowbit64 was broken on 32 bit platforms in a recent cleanup of the xfs bitops. Fix it back up again. SGI-PV: 974005 SGI-Modid: xfs-linux-melb:xfs-kern:30202a Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit 45ba598e56fa9f77801e06432b50580d97994fa4 Author: Christoph Hellwig Date: Fri Dec 7 14:07:20 2007 +1100 [XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros. Currently XFS_IFORK_* and XFS_DFORK* are implemented by means of XFS_CFORK* macros. But given that XFS_IFORK_* operates on an xfs_inode that embeds an xfs_icdinode_core and XFS_DFORK_* operates on an xfs_dinode that embeds an xfs_dinode_core, one will have to do endian swapping while the other doesn't. Instead of having the current mess with the CFORK macros that have byteswapping and non-byteswapping versions (which are inconsistently named while we're at it) just define each family of the macros to stand by itself and simplify the whole matter. A few direct references to the CFORK variants were cleaned up to use IFORK or DFORK to make this possible.
SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30163a Signed-off-by: Christoph Hellwig Signed-off-by: Tim Shimmin Signed-off-by: Lachlan McIlroy commit a9759f2de38a3443d5107bddde03b4f3f550060e Author: Christoph Hellwig Date: Fri Dec 7 14:07:08 2007 +1100 [XFS] kill superflous buffer locking (2nd attempt) There is no need to lock any page in xfs_buf.c because we operate on our own address_space and all locking is covered by the buffer semaphore. If we ever switch back to the main blockdevice address_space, as suggested e.g. for fsblock with a similar scheme, the locking will have to be totally revised anyway because the current scheme is neither correct nor coherent with itself. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30156a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 40ebd81d1a7635cf92a59c387a599fce4863206b Author: Robert P. J. Day Date: Fri Nov 23 16:30:51 2007 +1100 [XFS] Use kernel-supplied "roundup_pow_of_two" for simplicity SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30098a Signed-off-by: Robert P. J. Day Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit e6a4b37f38dca6e86b2648d172946700ee921e12 Author: Tim Shimmin Date: Fri Nov 23 16:30:42 2007 +1100 [XFS] Remove the BPCSHIFT and NB* based macros from XFS. The BPCSHIFT based macros, btoc*, ctob*, offtoc* and ctooff are either not used or don't need to be used. The NDPP, NDPP, NBBY macros don't need to be used but instead are replaced directly by PAGE_SIZE and PAGE_CACHE_SIZE where appropriate. Initial patch and motivation from Nicolas Kaiser. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30096a Signed-off-by: Tim Shimmin Signed-off-by: Lachlan McIlroy commit f7b7c3673e6e225de337abe00e14dc048e44782b Author: Niv Sardi Date: Tue Nov 27 17:01:13 2007 +1100 [XFS] Remove bogus assert This assert is bogus. We can have a forced shutdown occur between the check for the XLOG_FORCED_SHUTDOWN and the ASSERT.
Also, the logging system shouldn't care about the state of XFS_FORCED_SHUTDOWN; it should only check XLOG_FORCED_SHUTDOWN. The logging system has its own forced shutdown flag so, for the case of a forced shutdown that's not due to a logging error, we can flush the log. SGI-PV: 972985 SGI-Modid: xfs-linux-melb:xfs-kern:30029a Signed-off-by: Niv Sardi Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit 71ddabb94a623d1e16e7e66898bf439ff78ecc41 Author: Eric Sandeen Date: Fri Nov 23 16:29:42 2007 +1100 [XFS] optimize XFS_IS_REALTIME_INODE w/o realtime config Use XFS_IS_REALTIME_INODE in more places, and #define it to 0 if CONFIG_XFS_RT is off. This should be safe because mount checks in xfs_rtmount_init: so if we get mounted w/o CONFIG_XFS_RT, no realtime inodes should be encountered after that. Defining XFS_IS_REALTIME_INODE to 0 saves a bit of stack space; presumably gcc can optimize around the various "if (0)" type checks: xfs_alloc_file_space -8 xfs_bmap_adjacent -16 xfs_bmapi -8 xfs_bmap_rtalloc -16 xfs_bunmapi -28 xfs_free_file_space -64 xfs_imap +8 <-- ? hmm. xfs_iomap_write_direct -12 xfs_qm_dqusage_adjust -4 xfs_qm_vop_chown_reserve -4 SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30014a Signed-off-by: Eric Sandeen Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit a67d7c5f5d25d0b13a4dfb182697135b014fa478 Author: David Chinner Date: Fri Nov 23 16:29:32 2007 +1100 [XFS] Move platform specific mount option parse out of core XFS code Mount option parsing is platform specific. Move it out of core code into the platform specific superblock operation file. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30012a Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 3ed6526441053d79b85d206b14d75125e6f51cc2 Author: David Chinner Date: Fri Nov 23 16:29:25 2007 +1100 [XFS] Implement fallocate. Implement the new generic callout for file preallocation.
Atomically change the file size if requested. SGI-PV: 972756 SGI-Modid: xfs-linux-melb:xfs-kern:30009a Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 5d51eff4538bdfeb9b7a2ec030ee3b0980b067d2 Author: David Chinner Date: Fri Nov 23 16:29:18 2007 +1100 [XFS] Fix inode allocation latency The log force added in xfs_iget_core() has been a performance issue since it was introduced for tight loops that allocate then unlink a single file. Under heavy writeback, this can introduce unnecessary latency due to the log I/O getting stuck behind bulk data writes. Fix this latency problem by avoiding the need for the log force by moving the place we mark the Linux inode dirty to the transaction commit rather than on transaction completion. This also closes a potential hole in the sync code where a Linux inode is not dirty between the time it is modified and the time the log buffer has been written to disk. SGI-PV: 972753 SGI-Modid: xfs-linux-melb:xfs-kern:30007a Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit e4143a1cf5973e3443c0650fc4c35292d3b7baa8 Author: David Chinner Date: Fri Nov 23 16:29:11 2007 +1100 [XFS] Fix transaction overrun during writeback. Prevent transaction overrun in xfs_iomap_write_allocate() if we race with a truncate that overlaps the delalloc range we were planning to allocate. If we race, we may allocate into a hole and that requires block allocation. At this point in time we don't have a reservation for block allocation (apart from metadata blocks) and so allocating into a hole rather than a delalloc region results in overflowing the transaction block reservation. Fix it by only allowing a single extent to be allocated at a time.
SGI-PV: 972757 SGI-Modid: xfs-linux-melb:xfs-kern:30005a Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit 786f486f8154b94b36182d2b53df3bf2b40d85e7 Author: David Chinner Date: Fri Nov 23 16:28:24 2007 +1100 [XFS] Show all mount args in /proc/mounts There are several mount options that don't show up in /proc/mounts. Add them in and clean up the showargs code at the same time. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30004a Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit 8ae2c0f64a81a93d2c394eacee29d6ced53b54f9 Author: David Chinner Date: Fri Nov 23 16:28:17 2007 +1100 [XFS] Fix sparse warning in xlog_recover_do_efd_trans. Sparse trips over the locking order in xlog_recover_do_efd_trans() when xfs_trans_delete_ail() drops the ail lock. Because the unlock is conditional, we need to either annotate with a "fake unlock" or change the structure of the code so sparse thinks the function always unlocks. Reordering the code makes it simpler, so do that. SGI-PV: 972755 SGI-Modid: xfs-linux-melb:xfs-kern:30003a Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit a8272ce0c1d49aa3bec57682678f0bdfe28ed4ca Author: David Chinner Date: Fri Nov 23 16:28:09 2007 +1100 [XFS] Fix up sparse warnings. These are mostly locking annotations, marking things static, casts where needed and declaring stuff in header files. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30002a Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit a69b176df246d59626e6a9c640b44c0921fa4566 Author: David Chinner Date: Fri Nov 23 16:27:59 2007 +1100 [XFS] Use the generic bitops rather than implementing them ourselves. Patch inspired by Andi Kleen. 
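To illustrate the kind of helper that can be layered on a generic find-first-set primitive, and why a 64-bit variant needs care on 32-bit platforms (compare the xfs_lowbit64 fix listed further up), here is a small portable sketch. It is not the kernel's implementation: lowbit32_sketch and lowbit64_sketch are hypothetical names, and the loop merely stands in for whatever generic bitop the platform supplies.

```c
#include <stdint.h>

/*
 * 0-based index of the lowest set bit in a 32-bit word, or -1 if none.
 * Hypothetical stand-in for a generic find-first-set primitive.
 */
static int lowbit32_sketch(uint32_t w)
{
	int n = 0;

	if (w == 0)
		return -1;
	while ((w & 1u) == 0) {
		w >>= 1;
		n++;
	}
	return n;
}

/*
 * 64-bit variant built from two 32-bit halves: look at the low word
 * first and only fall back to the high word (offset by 32) if it is empty.
 */
static int lowbit64_sketch(uint64_t v)
{
	uint32_t lo = (uint32_t)v;
	uint32_t hi = (uint32_t)(v >> 32);

	if (lo)
		return lowbit32_sketch(lo);
	if (hi)
		return 32 + lowbit32_sketch(hi);
	return -1;
}
```

Splitting the value explicitly keeps the result independent of the native word size, which is the property a 32-bit build depends on.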
SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:30000a Signed-off-by: David Chinner Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy commit c319b58b13bb22f9a2478825b06c641c825f51ec Author: Vlad Apostolov Date: Fri Nov 23 16:27:51 2007 +1100 [XFS] Make xfs_bulkstat() to report unlinked but referenced inodes We need xfs_bulkstat() to report inode stat for inodes with link count zero but reference count non zero. The fix here: http://oss.sgi.com/archives/xfs/2007-09/msg00266.html changed this behavior and made xfs_bulkstat() to filter all unlinked inodes including those that are not destroyed yet but held by reference. The attached patch returns back to the original behavior by marking the on-disk inode buffer "dirty" when di_mode is cleared (at that time both inode link and reference counter are zero). SGI-PV: 972004 SGI-Modid: xfs-linux-melb:xfs-kern:29914a Signed-off-by: Vlad Apostolov Signed-off-by: David Chinner Signed-off-by: Lachlan McIlroy commit 98ce2b5b1bd6db9f8d510b4333757fa6b1efe131 Author: Lachlan McIlroy Date: Fri Nov 23 16:27:32 2007 +1100 [XFS] 971186 Undo mod xfs-linux-melb:xfs-kern:29845a due to a regression SGI-PV: 971596 SGI-Modid: xfs-linux-melb:xfs-kern:29902a Signed-off-by: Lachlan McIlroy commit bc58f9bb6be02a80b5f1f757b656c9affc07154f Author: Eric Sandeen Date: Fri Oct 12 11:13:22 2007 +1000 [XFS] fix 32-bit compat ioctls for GETXFLAGS, SETXFLAGS, GETVERSION XFS_IOC_GETVERSION, XFS_IOC_GETXFLAGS and XFS_IOC_SETXFLAGS all take a "long" which changes size between 32 and 64 bit platforms. So, the ioctl cmds that come in from a 32-bit app aren't as expected, for example on GETXFLAGS, unknown cmd fd(3) cmd(80046601){t:'f';sz:4} due to the size mismatch. So, use instead the 32-bit version of the commands for compat ioctls, and other than that it doesn't take any more manipulation. 
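The size mismatch is visible in the command number itself. The sketch below decodes and rebuilds the value 80046601 quoted above, assuming the common generic ioctl layout (8-bit number, 8-bit type, 14-bit size, 2-bit direction); some architectures use a different split, and the sk_* helpers are hypothetical names, not kernel macros.

```c
#include <stdint.h>

/* Assumed generic layout: nr[0..7] type[8..15] size[16..29] dir[30..31]. */
#define SK_DIR_READ 2u

/* Build a command number from its four fields. */
static uint32_t sk_ioc(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
	return (dir << 30) | (size << 16) | (type << 8) | nr;
}

/* Extract the argument size encoded in a command number. */
static uint32_t sk_ioc_size(uint32_t cmd)
{
	return (cmd >> 16) & 0x3fffu;
}

/* Extract the type ("magic") character. */
static uint32_t sk_ioc_type(uint32_t cmd)
{
	return (cmd >> 8) & 0xffu;
}

/* Extract the command sequence number. */
static uint32_t sk_ioc_nr(uint32_t cmd)
{
	return cmd & 0xffu;
}
```

With a 4-byte argument sk_ioc(SK_DIR_READ, 'f', 1, 4) gives 0x80046601, while an 8-byte argument gives 0x80086601, so a handler that only knows the 64-bit value never sees a match for the 32-bit caller.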
Also, for both native and compat versions, just define them to the values as defined in fs.h SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29849a Signed-off-by: Eric Sandeen Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit d4f3cc016fd6b392d483adc586b6dfaabad081af Author: Eric Sandeen Date: Fri Oct 12 11:13:08 2007 +1000 [XFS] lose xfs_hex_dump in favor of print_hex_dump No need for xfs to have its own hex dumping routine now that the kernel has one. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29847a Signed-off-by: Eric Sandeen Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 91906a882a4c9541317bc4f4c7fa5d8b784ba198 Author: Christoph Hellwig Date: Fri Oct 12 11:12:54 2007 +1000 [XFS] kill XFS_INOBT_IS_FREE_DISK This macro is unused and all other macros in this family operate on native types, so we most likely won't grow a user either. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29846a Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit c40ea74101ab75a8f320d057e7cf4b772b090110 Author: Christoph Hellwig Date: Fri Oct 12 11:12:39 2007 +1000 [XFS] kill superflous buffer locking There is no need to lock any page in xfs_buf.c because we operate on our own address_space and all locking is covered by the buffer semaphore. If we ever switch back to the main block device address_space as suggested e.g. for fsblock with a similar scheme the locking will have to be totally revised anyway because the current scheme is neither correct nor coherent with itself. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29845a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 0771fb4515229821b7d74865b87a430de9fc1113 Author: Eric Sandeen Date: Fri Oct 12 11:03:40 2007 +1000 [XFS] Refactor xfs_mountfs Refactoring xfs_mountfs() to call sub-functions for logical chunks can help save a bit of stack, and can make it easier to read this long function.
The mount path is one of the longest common callchains, easily getting to within a few bytes of the end of a 4k stack when over lvm, quotas are enabled, and quotacheck must be done. With this change on top of the other stack-related changes I've sent, I can get xfs to survive a normal xfsqa run on 4k stacks over lvm. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29834a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit b53e675dc868c4844ecbcce9149cf68e4299231d Author: Christoph Hellwig Date: Fri Oct 12 10:59:34 2007 +1000 [XFS] xlog_rec_header/xlog_rec_ext_header endianess annotations Mostly trivial conversion with one exception: h_num_logops was kept in native endian previously and only converted to big endian in xlog_sync, but we always keep it big endian now. With today's CPUs' fast byteswap instructions that's not an issue but the new variant keeps the code clean and maintainable. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29821a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 67fcb7bfb69eb1072c7e2dd6b46fa34db11dd587 Author: Christoph Hellwig Date: Fri Oct 12 10:58:59 2007 +1000 [XFS] clean up some xfs_log_priv.h macros - the various assign lsn macros are replaced by a single inline, xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except for a more sane calling convention. ASSIGN_LSN_DISK is replaced by xlog_assign_lsn and a manual byteswap, and ASSIGN_LSN by the same, except we pass the cycle and block arguments explicitly instead of a log parameter. The latter two variants only had two users and one user, respectively, anyway. - the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the same calling conventions. - GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away the unused arch argument. Instead of conditional definitions depending on host endianness we now do an unconditional swap and shift, which generates equal code.
- the unused XLOG_SET macro is removed. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29820a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 03bea6fe6c38c502c815432999eacfa2eccb0a12 Author: Christoph Hellwig Date: Fri Oct 12 10:58:05 2007 +1000 [XFS] clean up some xfs_log_priv.h macros - the various assign lsn macros are replaced by a single inline, xlog_assign_lsn, which is equivalent to ASSIGN_ANY_LSN_HOST except for a more sane calling convention. ASSIGN_LSN_DISK is replaced by xlog_assign_lsn and a manual byteswap, and ASSIGN_LSN by the same, except we pass the cycle and block arguments explicitly instead of a log parameter. The latter two variants only had two users and one user, respectively, anyway. - the GET_CYCLE is replaced by a xlog_get_cycle inline with exactly the same calling conventions. - GET_CLIENT_ID is replaced by xlog_get_client_id which leaves away the unused arch argument. Instead of conditional definitions depending on host endianness we now do an unconditional swap and shift, which generates equal code. - the unused XLOG_SET macro is removed. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29819a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 9909c4aa1a3e5b1f23cbc1bc2f0db025a7f75f85 Author: Christoph Hellwig Date: Thu Oct 11 18:11:14 2007 +1000 [XFS] kill xfs_freeze. No need to have a wrapper just to call two more functions. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29816a Signed-off-by: Christoph Hellwig Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 10090be25c159c02208b7abf89ae90f8105a2423 Author: Christoph Hellwig Date: Thu Oct 11 18:11:03 2007 +1000 [XFS] cleanup vnode useage in xfs_iget.c Get rid of vnode usage in xfs_iget.c and pass Linux inode / xfs_inode where appropriate. And kill some useless helpers while we're at it.
SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29808a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 6e7f75eafbc9b0eb575097f52ba6ed27154cea1b Author: Christoph Hellwig Date: Thu Oct 11 18:09:50 2007 +1000 [XFS] cleanup vnode useage in xfs_ioctl.c xfs_ioctl.c passes around vnode pointers quite a lot, but all places already have the Linux inode which is identical to the vnode these days. Clean the code up to always use the Linux inode. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29807a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 4ca488eb45692520f745f96abc00ea4e268a87d4 Author: Christoph Hellwig Date: Thu Oct 11 18:09:40 2007 +1000 [XFS] Kill off xfs_statvfs. We were already filling the Linux struct statfs anyway, and doing this trivial task directly in xfs_fs_statfs makes the code quite a bit cleaner. While I was at it I also moved copying attributes that don't change over the lifetime of the filesystem outside the superblock lock. xfs_fs_fill_super used to get the magic number and blocksize through xfs_statvfs, but assigning them directly is a lot cleaner and will save some stack space during mount. SGI-PV: 971186 SGI-Modid: xfs-linux-melb:xfs-kern:29802a Signed-off-by: Christoph Hellwig Signed-off-by: Tim Shimmin commit c43f408795c3210c9f5c925e4a49dbb93d41bb57 Author: Christoph Hellwig Date: Thu Oct 11 17:46:39 2007 +1000 [XFS] simplify xfs_vn_getattr Just fill in struct kstat directly from the xfs_inode instead of doing a detour through a bhv_vattr_t and xfs_getattr. SGI-PV: 970980 SGI-Modid: xfs-linux-melb:xfs-kern:29770a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 613d70436c1aeda6843ca8b70c7fab6d0484a591 Author: Christoph Hellwig Date: Thu Oct 11 17:44:08 2007 +1000 [XFS] kill xfs_iocore_t xfs_iocore_t is a structure embedded in xfs_inode. 
Except for one field it just duplicates fields already in xfs_inode, and there is nothing this abstraction buys us on XFS/Linux. This patch removes it and shrinks source and binary size of xfs as well as shrinking the size of xfs_inode by 60/44 bytes in debug/non-debug builds. SGI-PV: 970852 SGI-Modid: xfs-linux-melb:xfs-kern:29754a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 007c61c68640ea17c036785b698d05da67b4365e Author: Eric Sandeen Date: Thu Oct 11 17:43:56 2007 +1000 [XFS] Remove spin.h remove spinlock init abstraction macro in spin.h, remove the callers, and remove the file. Move no-op spinlock_destroy to xfs_linux.h Cleanup spinlock locals in xfs_mount.c SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29751a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 36e41eebdafc8b5fabdf66f59d0d43b0b60f0fdb Author: Eric Sandeen Date: Thu Oct 11 17:43:43 2007 +1000 [XFS] Cleanup lock goop. Switch last couple lock_t's to spinlock_t's. Remove now-unused spinlock-related macros & types. SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29748a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 3a0e487034107c0859b8a0d71d14b5c8988d356b Author: Eric Sandeen Date: Thu Oct 11 17:43:32 2007 +1000 [XFS] ktrace kt_lock is unused, remove it. SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29747a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 3685c2a1d773781608c9e281a6ff6b4c8ea8f6f9 Author: Eric Sandeen Date: Thu Oct 11 17:42:32 2007 +1000 [XFS] Unwrap XFS_SB_LOCK. Un-obfuscate XFS_SB_LOCK, remove XFS_SB_LOCK->mutex_lock->spin_lock macros, call spin_lock directly, remove extraneous cookie holdover from old xfs code, and change lock type to spinlock_t.
SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29746a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit ba74d0cba51dcaa99e4dc2e4fb62e6e13abbf703 Author: Eric Sandeen Date: Thu Oct 11 17:42:10 2007 +1000 [XFS] Unwrap mru_lock. Un-obfuscate mru_lock, remove mutex_lock->spin_lock macros, call spin_lock directly, remove extraneous cookie holdover from old xfs code. SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29745a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 703e1f0fd2edc2978bde3b4536e78b577318c090 Author: Eric Sandeen Date: Thu Oct 11 17:41:21 2007 +1000 [XFS] Unwrap xfs_dabuf_global_lock Un-obfuscate dabuf_global_lock, remove mutex_lock->spin_lock macros, call spin_lock directly, remove extraneous cookie holdover from old xfs code, and change lock type to spinlock_t. SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29744a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 64137e56d76a5c05aa4411e2f5d7121593dd9478 Author: Eric Sandeen Date: Thu Oct 11 17:38:28 2007 +1000 [XFS] Unwrap pagb_lock. Un-obfuscate pagb_lock, remove mutex_lock->spin_lock macros, call spin_lock directly, remove extraneous cookie holdover from old xfs code, and change lock type to spinlock_t. SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29743a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 869b906078720b68711569b68de0acca6b73b675 Author: Eric Sandeen Date: Thu Oct 11 17:38:18 2007 +1000 [XFS] Unwrap XFS_DQ_PINUNLOCK. Un-obfuscate DQ_PINLOCK, remove DQ_PINLOCK->mutex_lock->spin_lock macros, call spin_lock directly, remove extraneous cookie holdover from old xfs code, and change lock type to spinlock_t. 
SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29742a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit c8b5ea289fed15a7d7a4d6e911987ff16499aed7 Author: Eric Sandeen Date: Thu Oct 11 17:37:31 2007 +1000 [XFS] Unwrap GRANT_LOCK. Un-obfuscate GRANT_LOCK, remove GRANT_LOCK->mutex_lock->spin_lock macros, call spin_lock directly, remove extraneous cookie holdover from old xfs code, and change lock type to spinlock_t. SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29741a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit b22cd72c95df0414e0502a0999624d460ba66126 Author: Eric Sandeen Date: Thu Oct 11 17:37:10 2007 +1000 [XFS] Unwrap LOG_LOCK. Un-obfuscate LOG_LOCK, remove LOG_LOCK->mutex_lock->spin_lock macros, call spin_lock directly, remove extraneous cookie holdover from old xfs code, and change lock type to spinlock_t. SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29740a Signed-off-by: Eric Sandeen Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 287f3dad14828275d2517c8696ad118c82b9243f Author: Donald Douwsma Date: Thu Oct 11 17:36:05 2007 +1000 [XFS] Unwrap AIL_LOCK SGI-PV: 970382 SGI-Modid: xfs-linux-melb:xfs-kern:29739a Signed-off-by: Donald Douwsma Signed-off-by: Eric Sandeen Signed-off-by: Tim Shimmin commit 541d7d3c4b31e2b0ac846fe6d2eb5cdbe1353095 Author: Lachlan McIlroy Date: Thu Oct 11 17:34:33 2007 +1000 [XFS] kill unnessecary ioops indirection Currently there is an indirection called ioops in the XFS data I/O path. Various functions are called by function pointers, but there is no coherence in what this is for, and of course for XFS itself it's entirely unused. This patch removes it instead and significantly reduces source and binary size of XFS while making maintenance easier.
SGI-PV: 970841 SGI-Modid: xfs-linux-melb:xfs-kern:29737a Signed-off-by: Lachlan McIlroy Signed-off-by: Christoph Hellwig Signed-off-by: Tim Shimmin commit 21a62542b6d7f726d6c1d2cfbfa084f721ba4a26 Author: Christoph Hellwig Date: Wed Sep 19 15:27:49 2007 +1000 [XFS] simplify vn_revalidate No need to allocate a bhv_vattr_t on stack and call xfs_getattr to update a few fields in the Linux inode from the XFS inode, just do it directly. And yes, this function is in dire need of a better name and prototype, I'll do in a separate patch, though. SGI-PV: 970705 SGI-Modid: xfs-linux-melb:xfs-kern:29713a Signed-off-by: Christoph Hellwig Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 15947f2d4f747897f31cfaa36e98a93f80ca3d3f Author: Lachlan McIlroy Date: Mon Sep 17 13:11:58 2007 +1000 [XFS] more vnode/inode tracing fixes SGI-PV: 970335 SGI-Modid: xfs-linux-melb:xfs-kern:29697a Signed-off-by: Lachlan McIlroy Signed-off-by: Eric Sandeen Signed-off-by: Tim Shimmin commit 7642861b7eeaddfc82d762b3342044c809c3f77e Author: Christoph Hellwig Date: Fri Sep 14 15:23:31 2007 +1000 [XFS] kill BMAPI_UNWRITTEN There is no reason to go through xfs_iomap for the BMAPI_UNWRITTEN because it has nothing in common with the other cases. Instead check for the shutdown filesystem in xfs_end_bio_unwritten and perform a direct call to xfs_iomap_write_unwritten (which should be renamed to something more sensible one day) SGI-PV: 970241 SGI-Modid: xfs-linux-melb:xfs-kern:29681a Signed-off-by: Christoph Hellwig Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit 6214ed4461f1ad8aeec41857c73d58afb31be335 Author: Christoph Hellwig Date: Fri Sep 14 15:23:17 2007 +1000 [XFS] kill BMAPI_DEVICE There is no reason to go into the iomap machinery just to get the right block device for an inode. Instead look at the realtime flag in the inode and grab the right device from the mount structure. 
I created a new helper, xfs_find_bdev_for_inode instead of opencoding it because I plan to use it in other places in the future. SGI-PV: 970240 SGI-Modid: xfs-linux-melb:xfs-kern:29680a Signed-off-by: Christoph Hellwig Signed-off-by: Donald Douwsma Signed-off-by: Tim Shimmin commit cf441eeb79c32471379f0a4d97feaef691432a03 Author: Lachlan McIlroy Date: Thu Feb 7 16:42:19 2008 +1100 [XFS] clean up vnode/inode tracing Simplify vnode tracing calls by embedding function name & return addr in the calling macro. Also do a lot of vnode->inode renaming for consistency, while we're at it. SGI-PV: 970335 SGI-Modid: xfs-linux-melb:xfs-kern:29650a Signed-off-by: Eric Sandeen Signed-off-by: Lachlan McIlroy Signed-off-by: Tim Shimmin commit 44866d39282d0782b15fa4cb62aad937bf0a0897 Author: Lachlan McIlroy Date: Fri Sep 14 15:21:08 2007 +1000 [XFS] remove dead SYNC_BDFLUSH case in xfs_sync_inodes A large part of xfs_sync_inodes is conditional on the SYNC_BDFLUSH which is never passed to it. This patch removes it and adds an assert that triggers in case some new code tries to pass SYNC_BDFLUSH to it. 
SGI-PV: 970242 SGI-Modid: xfs-linux-melb:xfs-kern:29630a Signed-off-by: Lachlan McIlroy Signed-off-by: Christoph Hellwig Signed-off-by: Tim Shimmin From owner-xfs@oss.sgi.com Thu Feb 7 19:12:50 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 19:12:53 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from relay.sgi.com (netops-testserver-3.corp.sgi.com [192.26.57.72]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m183CnJd024473 for ; Thu, 7 Feb 2008 19:12:50 -0800 Received: from attica.americas.sgi.com (attica.americas.sgi.com [128.162.236.44]) by netops-testserver-3.corp.sgi.com (Postfix) with ESMTP id D145A9089B; Thu, 7 Feb 2008 19:13:10 -0800 (PST) Received: by attica.americas.sgi.com (Postfix, from userid 2022) id 3773140CB66; Thu, 7 Feb 2008 21:13:10 -0600 (CST) To: sgi.bugs.xfs@sgi.com, xfs@sgi.com Subject: TAKE 938188 - xfsinvutil should prune entries with 0 media files when using -m Message-Id: <20080208031310.3773140CB66@attica.americas.sgi.com> Date: Thu, 7 Feb 2008 21:13:10 -0600 (CST) From: wkendall@sgi.com (Bill Kendall) X-Virus-Scanned: ClamAV 0.91.2/5732/Thu Feb 7 14:45:29 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14369 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: wkendall@sgi.com Precedence: bulk X-list: xfs Change xfsinvutil to prune entries with 0 media files even when -m is specified. Dumps with 0 media files contain no data and are safe to remove from the inventory. 
Date: Thu Feb 7 19:12:26 PST 2008 Workarea: attica.americas.sgi.com:/data/lwork/attica1/dmfgrp/5.0_chroot/bldroot/work/wkendall/xfs-cmds Inspected by: kfr The following file(s) were checked into: bonnie.engr.sgi.com:/isms/xfs-cmds/master Modid: master:xfs-cmds:244164a xfsdump/VERSION - 1.89 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsdump/VERSION.diff?r1=text&tr1=1.89&r2=text&tr2=1.88&f=h xfsdump/doc/CHANGES - 1.103 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsdump/doc/CHANGES.diff?r1=text&tr1=1.103&r2=text&tr2=1.102&f=h xfsdump/man/man8/xfsinvutil.8 - 1.5 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsdump/man/man8/xfsinvutil.8.diff?r1=text&tr1=1.5&r2=text&tr2=1.4&f=h xfsdump/invutil/invutil.c - 1.18 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsdump/invutil/invutil.c.diff?r1=text&tr1=1.18&r2=text&tr2=1.17&f=h From owner-xfs@oss.sgi.com Thu Feb 7 19:25:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 19:25:41 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_64, J_CHICKENPOX_66 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m183PSE9025587 for ; Thu, 7 Feb 2008 19:25:32 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA18544; Fri, 8 Feb 2008 14:25:39 +1100 Message-ID: <47ABCBB3.3010506@sgi.com> Date: Fri, 08 Feb 2008 14:25:39 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com, a.gruenbacher@computer.org Subject: Re: [PATCH, RFC] use generic ACL code References: <20080207083222.GA14317@lst.de> In-Reply-To: 
<20080207083222.GA14317@lst.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5733/Thu Feb 7 17:26:32 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14370 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Hi Christoph, Yeah, I've recently taken your old patch and applied it to TOT dev yesterday (as my old patch of it stopped applying after more recent changes of yours went in). I'm running thru qa (051 passes as you said) and was just going to do 053 and see where the problems are. I did have a few questions from before but I need to relook at them to see if they make sense now. One of the comments I noticed recently was about extending the limit of ACEs with this change instead of having the fixed array. More soon. --Tim Christoph Hellwig wrote: > This patch rips out the XFS ACL handling code and uses the generic > fs/posix_acl.c code instead. The ondisk format is of course left > unchanged. > > This also introduces the same ACL caching all other Linux filesystems do > by adding pointers to the acl and default acl in struct xfs_inode. > It'll probably need some benchmarking to find out whether bloating the > inode is worth it. It should be possible to use the generic code > without this caching by revamping the code a little, although no other > filesystem currently does that. > > This patch is only an RFC because it still introduces a regression in > XFSQA test 053, but I really want to get it out now to get more comments > or even someone having a look at it because I'm running a little out of > time currently. > > Note that this patch applies ontop of the various vnode cleanups I've > posted to the XFS list a few weeks ago that haven't been applied yet. 
> > > Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_acl.c > =================================================================== > --- /dev/null 1970-01-01 00:00:00.000000000 +0000 > +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_acl.c 2008-02-07 09:15:35.000000000 +0100 > @@ -0,0 +1,453 @@ > +/* > + * Copyright (C) 2007 Christoph Hellwig. > + * Released under GPL v2. > + */ > +#include "xfs.h" > +#include "xfs_acl.h" > +#include "xfs_attr.h" > +#include "xfs_bmap_btree.h" /* required by xfs_inode.h */ > +#include "xfs_inode.h" > +#include "xfs_vnodeops.h" > + > +#include > + > + > +#define XFS_ACL_NOT_CACHED ((void *)-1) > + > +/* > + * Convert from extended attribute to in-memory representation. > + */ > +static struct posix_acl *xfs_acl_from_disk(struct xfs_acl *aclp) > +{ > + struct posix_acl_entry *acl_e; > + struct posix_acl *acl; > + struct xfs_acl_entry *ace; > + int count, i; > + > + count = be32_to_cpu(aclp->acl_cnt); > + > + acl = posix_acl_alloc(count, GFP_KERNEL); > + if (!acl) > + return ERR_PTR(-ENOMEM); > + > + for (i = 0; i < count; i++) { > + acl_e = &acl->a_entries[i]; > + ace = &aclp->acl_entry[i]; > + > + /* > + * XXX(hch): the tag is 32 bits on disk and 16 bits in core. > + * Any special handling required?? > + */ > + acl_e->e_tag = be32_to_cpu(ace->ae_tag); > + acl_e->e_perm = be16_to_cpu(ace->ae_perm); > + > + switch(acl_e->e_tag) { > + case ACL_USER: > + case ACL_GROUP: > + acl_e->e_id = be32_to_cpu(ace->ae_id); > + break; > + case ACL_USER_OBJ: > + case ACL_GROUP_OBJ: > + case ACL_MASK: > + case ACL_OTHER: > + acl_e->e_id = ACL_UNDEFINED_ID; > + break; > + default: > + goto fail; > + } > + } > + return acl; > + > +fail: > + posix_acl_release(acl); > + return ERR_PTR(-EINVAL); > +} > + > +/* > + * Convert from in-memory to extended attribute representation. 
> + */ > +static void xfs_acl_to_disk(struct xfs_acl *aclp, const struct posix_acl *acl) > +{ > + const struct posix_acl_entry *acl_e; > + struct xfs_acl_entry *ace; > + int i; > + > + for (i = 0; i < acl->a_count; i++) { > + ace = &aclp->acl_entry[i]; > + acl_e = &acl->a_entries[i]; > + > + ace->ae_tag = cpu_to_be32(acl_e->e_tag); > + ace->ae_id = cpu_to_be32(acl_e->e_id); > + ace->ae_perm = cpu_to_be16(acl_e->e_perm); > + } > +} > + > +struct posix_acl *xfs_get_acl(struct inode *inode, int type) > +{ > + struct xfs_inode *ip = XFS_I(inode); > + struct posix_acl *acl = NULL, **p_acl; > + struct xfs_acl *xfs_acl; > + int len = sizeof(struct xfs_acl); > + char *ea_name; > + int error; > + > + switch (type) { > + case ACL_TYPE_ACCESS: > + ea_name = SGI_ACL_FILE; > + p_acl = &ip->i_acl; > + break; > + case ACL_TYPE_DEFAULT: > + ea_name = SGI_ACL_DEFAULT; > + p_acl = &ip->i_default_acl; > + break; > + default: > + return ERR_PTR(-EINVAL); > + } > + > + if (*p_acl != XFS_ACL_NOT_CACHED) > + return posix_acl_dup(*p_acl); > + > + xfs_acl = kzalloc(sizeof(struct xfs_acl), GFP_KERNEL); > + if (!xfs_acl) > + return ERR_PTR(-ENOMEM); > + > + error = -xfs_attr_get(ip, ea_name, (char *)xfs_acl, > + &len, ATTR_ROOT, sys_cred); > + if (!error) { > + acl = xfs_acl_from_disk(xfs_acl); > + if (!IS_ERR(acl)) > + *p_acl = posix_acl_dup(acl); > + } else { > + *p_acl = NULL; > + } > + > + kfree(xfs_acl); > + return acl; > +} > + > +static int xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl) > +{ > + struct xfs_inode *ip = XFS_I(inode); > + struct posix_acl **p_acl; > + char *ea_name; > + int error; > + > + if (S_ISLNK(inode->i_mode)) > + return -EOPNOTSUPP; > + > + switch (type) { > + case ACL_TYPE_ACCESS: > + ea_name = SGI_ACL_FILE; > + p_acl = &ip->i_acl; > + break; > + case ACL_TYPE_DEFAULT: > + ea_name = SGI_ACL_DEFAULT; > + p_acl = &ip->i_default_acl; > + if (!S_ISDIR(inode->i_mode)) > + return acl ? 
-EACCES : 0; > + break; > + default: > + return -EINVAL; > + } > + > + if (acl) { > + struct xfs_acl *xfs_acl; > + int len; > + > + xfs_acl = kzalloc(sizeof(struct xfs_acl), GFP_KERNEL); > + if (!xfs_acl) > + return -ENOMEM; > + > + xfs_acl_to_disk(xfs_acl, acl); > + len = sizeof(struct xfs_acl) - > + (sizeof(struct xfs_acl_entry) * > + (XFS_ACL_MAX_ENTRIES - acl->a_count)); > + > + error = -xfs_attr_set(ip, ea_name, (char *)xfs_acl, > + len, ATTR_ROOT); > + > + kfree(xfs_acl); > + } else { > + error = -xfs_attr_remove(ip, ea_name, ATTR_ROOT); > + /* > + * If the attribute didn't exist to start with that's fine. > + */ > + if (error == -ENOATTR) > + error = 0; > + } > + > + if (!error) { > + if (*p_acl && *p_acl != XFS_ACL_NOT_CACHED) > + posix_acl_release(*p_acl); > + *p_acl = posix_acl_dup(acl); > + } > + return error; > +} > + > +static int xfs_check_acl(struct inode *inode, int mask) > +{ > + struct xfs_inode *ip = XFS_I(inode); > + > + xfs_itrace_entry(ip); > + > + if (!XFS_IFORK_Q(ip)) > + return -EAGAIN; > + > + if (ip->i_acl == XFS_ACL_NOT_CACHED) { > + struct posix_acl *acl = xfs_get_acl(inode, ACL_TYPE_ACCESS); > + if (IS_ERR(acl)) > + return PTR_ERR(acl); > + posix_acl_release(acl); > + } > + > + if (ip->i_acl) > + return posix_acl_permission(inode, ip->i_acl, mask); > + return -EAGAIN; > +} > + > +int xfs_vn_permission(struct inode *inode, int mask, struct nameidata *nd) > +{ > + return generic_permission(inode, mask, xfs_check_acl); > +} > + > +/* > + * Extended attribute handlers > + */ > +static int xfs_xattr_get_acl(struct inode *inode, int type, > + void *buffer, size_t size) > +{ > + struct posix_acl *acl; > + int error; > + > + acl = xfs_get_acl(inode, type); > + if (IS_ERR(acl)) > + return PTR_ERR(acl); > + if (acl == NULL) > + return -ENODATA; > + error = posix_acl_to_xattr(acl, buffer, size); > + posix_acl_release(acl); > + > + return error; > +} > + > +/* > + * Helper to propagate i_mode the xfs_inode. 
> + */ > +static int xfs_set_mode(struct inode *inode, mode_t mode) > +{ > + int error = 0; > + > + if (mode != inode->i_mode) { > + struct bhv_vattr va = { > + .va_mask = XFS_AT_MODE, > + .va_mode = mode, > + }; > + > + va.va_mask = XFS_AT_MODE; > + va.va_mode = mode; > + > + error = -xfs_setattr(XFS_I(inode), &va, 0, sys_cred); > + inode->i_mode = mode; > + } > + > + return error; > +} > + > +static int xfs_xattr_set_acl(struct inode *inode, int type, > + const void *value, size_t size) > +{ > + struct posix_acl *acl; > + int error; > + > + if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) > + return -EPERM; > + > + if (value) { > + acl = posix_acl_from_xattr(value, size); > + if (IS_ERR(acl)) > + return PTR_ERR(acl); > + else if (acl) { > + error = posix_acl_valid(acl); > + if (error) > + goto release_and_out; > + if (acl->a_count > XFS_ACL_MAX_ENTRIES) { > + error = -EINVAL; > + goto release_and_out; > + } > + > + if (type == ACL_TYPE_ACCESS) { > + mode_t mode = inode->i_mode; > + error = posix_acl_equiv_mode(acl, &mode); > + if (error < 0) > + return error; > + if (error == 0) { > + posix_acl_release(acl); > + acl = NULL; > + } > + error = xfs_set_mode(inode, mode); > + if (error) > + goto release_and_out; > + } > + } > + } else > + acl = NULL; > + > + error = xfs_set_acl(inode, type, acl); > +release_and_out: > + posix_acl_release(acl); > + return error; > +} > + > +static int xfs_acl_exists(struct inode *inode, char *name) > +{ > + int len = sizeof(struct xfs_acl); > + > + return xfs_attr_get(XFS_I(inode), name, NULL, &len, > + ATTR_ROOT|ATTR_KERNOVAL, sys_cred); > +} > + > +static int posix_acl_access_get(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + return xfs_xattr_get_acl(inode, ACL_TYPE_ACCESS, data, size); > +} > + > +static int posix_acl_access_set(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + return xfs_xattr_set_acl(inode, ACL_TYPE_ACCESS, data, size); > +} > + > 
+static int posix_acl_access_remove(struct inode *inode, char *name, int xflags) > +{ > + return xfs_xattr_set_acl(inode, ACL_TYPE_ACCESS, NULL, 0); > +} > + > +static int posix_acl_access_exists(struct inode *inode) > +{ > + return xfs_acl_exists(inode, SGI_ACL_FILE); > +} > + > +static int posix_acl_default_get(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + return xfs_xattr_get_acl(inode, ACL_TYPE_DEFAULT, data, size); > +} > + > +static int posix_acl_default_set(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + if (!S_ISDIR(inode->i_mode)) > + return data ? -EACCES : 0; > + return xfs_xattr_set_acl(inode, ACL_TYPE_DEFAULT, data, size); > +} > + > +static int posix_acl_default_remove(struct inode *inode, char *name, int xflags) > +{ > + return xfs_xattr_set_acl(inode, ACL_TYPE_DEFAULT, NULL, 0); > +} > + > +int posix_acl_default_exists(struct inode *inode) > +{ > + if (!S_ISDIR(inode->i_mode)) > + return 0; > + return xfs_acl_exists(inode, SGI_ACL_DEFAULT); > +} > + > +struct attrnames posix_acl_access = { > + .attr_name = "posix_acl_access", > + .attr_namelen = sizeof("posix_acl_access") - 1, > + .attr_get = posix_acl_access_get, > + .attr_set = posix_acl_access_set, > + .attr_remove = posix_acl_access_remove, > + .attr_exists = posix_acl_access_exists, > +}; > + > +struct attrnames posix_acl_default = { > + .attr_name = "posix_acl_default", > + .attr_namelen = sizeof("posix_acl_default") - 1, > + .attr_get = posix_acl_default_get, > + .attr_set = posix_acl_default_set, > + .attr_remove = posix_acl_default_remove, > + .attr_exists = posix_acl_default_exists, > +}; > + > +/* > + * Unlike the other functions in this file this returns positive errors. 
> + */
> +int xfs_inherit_acl(struct inode *inode, struct posix_acl *default_acl)
> +{
> +	struct xfs_inode *ip = XFS_I(inode);
> +	struct posix_acl *clone;
> +	mode_t mode;
> +	int error = 0;
> +
> +	if (S_ISDIR(inode->i_mode)) {
> +		error = xfs_set_acl(inode, ACL_TYPE_DEFAULT, default_acl);
> +		if (error)
> +			return -error;
> +	}
> +
> +	clone = posix_acl_clone(default_acl, GFP_KERNEL);
> +	if (!clone)
> +		return ENOMEM;
> +
> +	mode = inode->i_mode;
> +	error = posix_acl_create_masq(clone, &mode);
> +	if (error < 0)
> +		goto out_release_clone;
> +
> +	error = xfs_set_mode(inode, mode);
> +	if (error > 0)
> +		error = xfs_set_acl(inode, ACL_TYPE_ACCESS, clone);
> +	xfs_iflags_set(ip, XFS_IMODIFIED);
> +
> + out_release_clone:
> +	posix_acl_release(clone);
> +	return -error;
> +}
> +
> +int xfs_acl_chmod(struct inode *inode)
> +{
> +	struct posix_acl *acl, *clone;
> +	int error;
> +
> +	if (S_ISLNK(inode->i_mode))
> +		return -EOPNOTSUPP;
> +
> +	acl = xfs_get_acl(inode, ACL_TYPE_ACCESS);
> +	if (IS_ERR(acl) || !acl)
> +		return PTR_ERR(acl);
> +
> +	clone = posix_acl_clone(acl, GFP_KERNEL);
> +	posix_acl_release(acl);
> +	if (!clone)
> +		return -ENOMEM;
> +
> +	error = posix_acl_chmod_masq(clone, inode->i_mode);
> +	if (!error)
> +		error = xfs_set_acl(inode, ACL_TYPE_ACCESS, clone);
> +
> +	posix_acl_release(clone);
> +	return error;
> +}
> +
> +void xfs_inode_init_acls(struct xfs_inode *ip)
> +{
> +	ip->i_acl = XFS_ACL_NOT_CACHED;
> +	ip->i_default_acl = XFS_ACL_NOT_CACHED;
> +}
> +
> +static void xfs_clear_acl(struct posix_acl **aclp)
> +{
> +	if (*aclp != XFS_ACL_NOT_CACHED) {
> +		posix_acl_release(*aclp);
> +		*aclp = XFS_ACL_NOT_CACHED;
> +	}
> +}
> +
> +void xfs_inode_clear_acls(struct xfs_inode *ip)
> +{
> +	xfs_clear_acl(&ip->i_acl);
> +	xfs_clear_acl(&ip->i_default_acl);
> +}
> Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_iops.c 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c 2008-02-07 09:17:11.000000000 +0100
> @@ -51,6 +51,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>
> @@ -272,8 +273,7 @@ xfs_vn_mknod(
>  {
>  	struct inode *inode;
>  	struct xfs_inode *ip = NULL;
> -	xfs_acl_t *default_acl = NULL;
> -	attrexists_t test_default_acl = _ACL_DEFAULT_EXISTS;
> +	struct posix_acl *default_acl = NULL;
>  	int error;
>
>  	/*
> @@ -283,18 +283,14 @@ xfs_vn_mknod(
>  	if (unlikely(!sysv_valid_dev(rdev) || MAJOR(rdev) & ~0x1ff))
>  		return -EINVAL;
>
> -	if (test_default_acl && test_default_acl(dir)) {
> -		if (!_ACL_ALLOC(default_acl)) {
> -			return -ENOMEM;
> -		}
> -		if (!_ACL_GET_DEFAULT(dir, default_acl)) {
> -			_ACL_FREE(default_acl);
> -			default_acl = NULL;
> -		}
> -	}
> +	if (IS_POSIXACL(dir)) {
> +		default_acl = xfs_get_acl(dir, ACL_TYPE_DEFAULT);
> +		if (IS_ERR(default_acl))
> +			return -PTR_ERR(default_acl);
>
> -	if (IS_POSIXACL(dir) && !default_acl)
> -		mode &= ~current->fs->umask;
> +		if (!default_acl)
> +			mode &= ~current->fs->umask;
> +	}
>
>  	switch (mode & S_IFMT) {
>  	case S_IFCHR:
> @@ -323,11 +319,11 @@ xfs_vn_mknod(
>  		goto out_cleanup_inode;
>
>  	if (default_acl) {
> -		error = _ACL_INHERIT(inode, mode, default_acl);
> +		error = xfs_inherit_acl(inode, default_acl);
>  		if (unlikely(error))
>  			goto out_cleanup_inode;
>  		xfs_iflags_set(ip, XFS_IMODIFIED);
> -		_ACL_FREE(default_acl);
> +		posix_acl_release(default_acl);
>  	}
>
>
> @@ -340,8 +336,7 @@ xfs_vn_mknod(
>  out_cleanup_inode:
>  	xfs_cleanup_inode(dir, inode, dentry, mode);
>  out_free_acl:
> -	if (default_acl)
> -		_ACL_FREE(default_acl);
> +	posix_acl_release(default_acl);
>  	return -error;
>  }
>
> @@ -545,38 +540,6 @@ xfs_vn_put_link(
>  	kfree(s);
>  }
>
> -#ifdef CONFIG_XFS_POSIX_ACL
> -STATIC int
> -xfs_check_acl(
> -	struct inode *inode,
> -	int mask)
> -{
> -	struct xfs_inode *ip = XFS_I(inode);
> -	int error;
> -
> -	xfs_itrace_entry(ip);
> -
> -	if (XFS_IFORK_Q(ip)) {
> -		error = xfs_acl_iaccess(ip, mask, NULL);
> -		if (error != -1)
> -			return -error;
> -	}
> -
> -	return -EAGAIN;
> -}
> -
> -STATIC int
> -xfs_vn_permission(
> -	struct inode *inode,
> -	int mask,
> -	struct nameidata *nd)
> -{
> -	return generic_permission(inode, mask, xfs_check_acl);
> -}
> -#else
> -#define xfs_vn_permission NULL
> -#endif
> -
>  STATIC int
>  xfs_vn_getattr(
>  	struct vfsmount *mnt,
> @@ -689,6 +652,9 @@ xfs_vn_setattr(
>  	error = xfs_setattr(XFS_I(inode), &vattr, flags, NULL);
>  	if (likely(!error))
>  		vn_revalidate(vn_from_inode(inode));
> +
> +	if (!error && (attr->ia_valid & ATTR_MODE))
> +		error = -xfs_acl_chmod(inode);
>  	return -error;
>  }
>
> Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.h
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_iops.h 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.h 2008-02-07 09:15:35.000000000 +0100
> @@ -26,6 +26,12 @@ extern const struct file_operations xfs_
>  extern const struct file_operations xfs_dir_file_operations;
>  extern const struct file_operations xfs_invis_file_operations;
>
> +#ifdef CONFIG_XFS_POSIX_ACL
> +int xfs_vn_permission(struct inode *inode, int mask, struct nameidata *nd);
> +#else
> +#define xfs_vn_permission NULL
> +#endif
> +
>
>  struct xfs_inode;
>  extern void xfs_ichgtime(struct xfs_inode *, int);
> Index: linux-2.6-xfs/fs/xfs/xfs_acl.h
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/xfs_acl.h 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/xfs_acl.h 2008-02-07 09:15:35.000000000 +0100
> @@ -18,27 +18,25 @@
>  #ifndef __XFS_ACL_H__
>  #define __XFS_ACL_H__
>
> +struct inode;
> +struct posix_acl;
> +struct xfs_inode;
> +
> +
>  /*
>   * Access Control Lists
>   */
> -typedef __uint16_t xfs_acl_perm_t;
> -typedef __int32_t xfs_acl_type_t;
> -typedef __int32_t xfs_acl_tag_t;
> -typedef __int32_t xfs_acl_id_t;
> -
>  #define XFS_ACL_MAX_ENTRIES 25
>  #define XFS_ACL_NOT_PRESENT (-1)
>
> -typedef struct xfs_acl_entry {
> -	xfs_acl_tag_t ae_tag;
> -	xfs_acl_id_t ae_id;
> -	xfs_acl_perm_t ae_perm;
> -} xfs_acl_entry_t;
> -
> -typedef struct xfs_acl {
> -	__int32_t acl_cnt;
> -	xfs_acl_entry_t acl_entry[XFS_ACL_MAX_ENTRIES];
> -} xfs_acl_t;
> +struct xfs_acl {
> +	__be32 acl_cnt;
> +	struct xfs_acl_entry {
> +		__be32 ae_tag;
> +		__be32 ae_id;
> +		__be16 ae_perm;
> +	} acl_entry[XFS_ACL_MAX_ENTRIES];
> +};
>
>  /* On-disk XFS extended attribute names */
>  #define SGI_ACL_FILE "SGI_ACL_FILE"
> @@ -49,51 +47,31 @@ typedef struct xfs_acl {
>
>  #ifdef CONFIG_XFS_POSIX_ACL
>
> -struct vattr;
> -struct xfs_inode;
> -
> -extern struct kmem_zone *xfs_acl_zone;
> -#define xfs_acl_zone_init(zone, name) \
> -	(zone) = kmem_zone_init(sizeof(xfs_acl_t), (name))
> -#define xfs_acl_zone_destroy(zone) kmem_zone_destroy(zone)
> -
> -extern int xfs_acl_inherit(bhv_vnode_t *, mode_t mode, xfs_acl_t *);
> -extern int xfs_acl_iaccess(struct xfs_inode *, mode_t, cred_t *);
> -extern int xfs_acl_vtoacl(bhv_vnode_t *, xfs_acl_t *, xfs_acl_t *);
> -extern int xfs_acl_vhasacl_access(bhv_vnode_t *);
> -extern int xfs_acl_vhasacl_default(bhv_vnode_t *);
> -extern int xfs_acl_vset(bhv_vnode_t *, void *, size_t, int);
> -extern int xfs_acl_vget(bhv_vnode_t *, void *, size_t, int);
> -extern int xfs_acl_vremove(bhv_vnode_t *, int);
> -
> -#define _ACL_TYPE_ACCESS 1
> -#define _ACL_TYPE_DEFAULT 2
> -#define _ACL_PERM_INVALID(perm) ((perm) & ~(ACL_READ|ACL_WRITE|ACL_EXECUTE))
> -
> -#define _ACL_INHERIT(c,m,d) (xfs_acl_inherit(c,m,d))
> -#define _ACL_GET_ACCESS(pv,pa) (xfs_acl_vtoacl(pv,pa,NULL) == 0)
> -#define _ACL_GET_DEFAULT(pv,pd) (xfs_acl_vtoacl(pv,NULL,pd) == 0)
> -#define _ACL_ACCESS_EXISTS xfs_acl_vhasacl_access
> -#define _ACL_DEFAULT_EXISTS xfs_acl_vhasacl_default
> -
> -#define _ACL_ALLOC(a) ((a) = kmem_zone_alloc(xfs_acl_zone, KM_SLEEP))
> -#define _ACL_FREE(a) ((a)? kmem_zone_free(xfs_acl_zone, (a)):(void)0)
> +struct posix_acl *xfs_get_acl(struct inode *inode, int type);
> +int xfs_inherit_acl(struct inode *inode, struct posix_acl *default_acl);
> +int xfs_acl_chmod(struct inode *inode);
> +void xfs_inode_init_acls(struct xfs_inode *ip);
> +void xfs_inode_clear_acls(struct xfs_inode *ip);
>
>  #else
> -#define xfs_acl_zone_init(zone,name)
> -#define xfs_acl_zone_destroy(zone)
> -#define xfs_acl_vset(v,p,sz,t) (-EOPNOTSUPP)
> -#define xfs_acl_vget(v,p,sz,t) (-EOPNOTSUPP)
> -#define xfs_acl_vremove(v,t) (-EOPNOTSUPP)
> -#define xfs_acl_vhasacl_access(v) (0)
> -#define xfs_acl_vhasacl_default(v) (0)
> -#define _ACL_ALLOC(a) (1) /* successfully allocate nothing */
> -#define _ACL_FREE(a) ((void)0)
> -#define _ACL_INHERIT(c,m,d) (0)
> -#define _ACL_GET_ACCESS(pv,pa) (0)
> -#define _ACL_GET_DEFAULT(pv,pd) (0)
> -#define _ACL_ACCESS_EXISTS (NULL)
> -#define _ACL_DEFAULT_EXISTS (NULL)
> -#endif
>
> +static inline struct posix_acl *xfs_get_acl(struct inode *inode, int type)
> +{
> +	BUG();
> +}
> +static inline int xfs_inherit_acl(struct inode *inode,
> +		struct posix_acl *default_acl)
> +{
> +	BUG();
> +}
> +
> +static inline void xfs_inode_init_acls(struct xfs_inode *ip)
> +{
> +}
> +
> +static inline void xfs_inode_clear_acls(struct xfs_inode *ip)
> +{
> +}
> +
> +#endif /* CONFIG_XFS_POSIX_ACL */
>  #endif /* __XFS_ACL_H__ */
> Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-07 09:15:35.000000000 +0100
> @@ -52,7 +52,6 @@
>  #include "xfs_dir2_block.h"
>  #include "xfs_dir2_node.h"
>  #include "xfs_dir2_trace.h"
> -#include "xfs_acl.h"
>  #include "xfs_attr.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_inode_item.h"
> @@ -183,10 +182,6 @@ EXPORT_SYMBOL(uuid_table_remove);
>  EXPORT_SYMBOL(vn_hold);
>  EXPORT_SYMBOL(vn_revalidate);
>
> -#if defined(CONFIG_XFS_POSIX_ACL)
> -EXPORT_SYMBOL(xfs_acl_vtoacl);
> -EXPORT_SYMBOL(xfs_acl_inherit);
> -#endif
>  EXPORT_SYMBOL(xfs_alloc_buftarg);
>  EXPORT_SYMBOL(xfs_flush_buftarg);
>  EXPORT_SYMBOL(xfs_free_buftarg);
> Index: linux-2.6-xfs/fs/xfs/xfs_attr.c
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/xfs_attr.c 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/xfs_attr.c 2008-02-07 09:15:35.000000000 +0100
> @@ -58,8 +58,6 @@
>   */
>
>  #define ATTR_SYSCOUNT 2
> -static struct attrnames posix_acl_access;
> -static struct attrnames posix_acl_default;
>  static struct attrnames *attr_system_names[ATTR_SYSCOUNT];
>
>  /*========================================================================
> @@ -2427,80 +2425,6 @@ xfs_attr_trace_enter(int type, char *whe
>   * System (pseudo) namespace attribute interface routines.
>   *========================================================================*/
>
> -STATIC int
> -posix_acl_access_set(
> -	bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags)
> -{
> -	return xfs_acl_vset(vp, data, size, _ACL_TYPE_ACCESS);
> -}
> -
> -STATIC int
> -posix_acl_access_remove(
> -	bhv_vnode_t *vp, char *name, int xflags)
> -{
> -	return xfs_acl_vremove(vp, _ACL_TYPE_ACCESS);
> -}
> -
> -STATIC int
> -posix_acl_access_get(
> -	bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags)
> -{
> -	return xfs_acl_vget(vp, data, size, _ACL_TYPE_ACCESS);
> -}
> -
> -STATIC int
> -posix_acl_access_exists(
> -	bhv_vnode_t *vp)
> -{
> -	return xfs_acl_vhasacl_access(vp);
> -}
> -
> -STATIC int
> -posix_acl_default_set(
> -	bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags)
> -{
> -	return xfs_acl_vset(vp, data, size, _ACL_TYPE_DEFAULT);
> -}
> -
> -STATIC int
> -posix_acl_default_get(
> -	bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags)
> -{
> -	return xfs_acl_vget(vp, data, size, _ACL_TYPE_DEFAULT);
> -}
> -
> -STATIC int
> -posix_acl_default_remove(
> -	bhv_vnode_t *vp, char *name, int xflags)
> -{
> -	return xfs_acl_vremove(vp, _ACL_TYPE_DEFAULT);
> -}
> -
> -STATIC int
> -posix_acl_default_exists(
> -	bhv_vnode_t *vp)
> -{
> -	return xfs_acl_vhasacl_default(vp);
> -}
> -
> -static struct attrnames posix_acl_access = {
> -	.attr_name = "posix_acl_access",
> -	.attr_namelen = sizeof("posix_acl_access") - 1,
> -	.attr_get = posix_acl_access_get,
> -	.attr_set = posix_acl_access_set,
> -	.attr_remove = posix_acl_access_remove,
> -	.attr_exists = posix_acl_access_exists,
> -};
> -
> -static struct attrnames posix_acl_default = {
> -	.attr_name = "posix_acl_default",
> -	.attr_namelen = sizeof("posix_acl_default") - 1,
> -	.attr_get = posix_acl_default_get,
> -	.attr_set = posix_acl_default_set,
> -	.attr_remove = posix_acl_default_remove,
> -	.attr_exists = posix_acl_default_exists,
> -};
> -
>  static struct attrnames *attr_system_names[] =
>  	{ &posix_acl_access, &posix_acl_default };
>
> Index: linux-2.6-xfs/fs/xfs/xfs_attr.h
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/xfs_attr.h 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/xfs_attr.h 2008-02-07 09:15:35.000000000 +0100
> @@ -61,6 +61,8 @@ extern struct attrnames attr_secure;
>  extern struct attrnames attr_system;
>  extern struct attrnames attr_trusted;
>  extern struct attrnames *attr_namespaces[ATTR_NAMECOUNT];
> +extern struct attrnames posix_acl_access;
> +extern struct attrnames posix_acl_default;
>
>  extern attrnames_t *attr_lookup_namespace(char *, attrnames_t **, int);
>  extern int attr_generic_list(bhv_vnode_t *, void *, size_t, int, ssize_t *);
> Index: linux-2.6-xfs/fs/xfs/xfs_vfsops.c
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/xfs_vfsops.c 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/xfs_vfsops.c 2008-02-07 09:15:35.000000000 +0100
> @@ -78,7 +78,6 @@ xfs_init(void)
>  		kmem_zone_init(sizeof(xfs_da_state_t), "xfs_da_state");
>  	xfs_dabuf_zone = kmem_zone_init(sizeof(xfs_dabuf_t), "xfs_dabuf");
>  	xfs_ifork_zone = kmem_zone_init(sizeof(xfs_ifork_t), "xfs_ifork");
> -	xfs_acl_zone_init(xfs_acl_zone, "xfs_acl");
>  	xfs_mru_cache_init();
>  	xfs_filestream_init();
>
> @@ -160,7 +159,6 @@ xfs_cleanup(void)
>  	xfs_refcache_destroy();
>  	xfs_filestream_uninit();
>  	xfs_mru_cache_uninit();
> -	xfs_acl_zone_destroy(xfs_acl_zone);
>
>  #ifdef XFS_DIR2_TRACE
>  	ktrace_free(xfs_dir2_trace_buf);
> Index: linux-2.6-xfs/fs/xfs/xfs_acl.c
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/xfs_acl.c 2008-02-05 08:43:31.000000000 +0100
> +++ /dev/null 1970-01-01 00:00:00.000000000 +0000
> @@ -1,903 +0,0 @@
> -/*
> - * Copyright (c) 2001-2002,2005 Silicon Graphics, Inc.
> - * All Rights Reserved.
> - *
> - * This program is free software; you can redistribute it and/or
> - * modify it under the terms of the GNU General Public License as
> - * published by the Free Software Foundation.
> - *
> - * This program is distributed in the hope that it would be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, write the Free Software Foundation,
> - * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> - */
> -#include "xfs.h"
> -#include "xfs_fs.h"
> -#include "xfs_types.h"
> -#include "xfs_bit.h"
> -#include "xfs_inum.h"
> -#include "xfs_ag.h"
> -#include "xfs_dir2.h"
> -#include "xfs_bmap_btree.h"
> -#include "xfs_alloc_btree.h"
> -#include "xfs_ialloc_btree.h"
> -#include "xfs_dir2_sf.h"
> -#include "xfs_attr_sf.h"
> -#include "xfs_dinode.h"
> -#include "xfs_inode.h"
> -#include "xfs_btree.h"
> -#include "xfs_acl.h"
> -#include "xfs_attr.h"
> -#include "xfs_vnodeops.h"
> -
> -#include
> -#include
> -
> -STATIC int xfs_acl_setmode(bhv_vnode_t *, xfs_acl_t *, int *);
> -STATIC void xfs_acl_filter_mode(mode_t, xfs_acl_t *);
> -STATIC void xfs_acl_get_endian(xfs_acl_t *);
> -STATIC int xfs_acl_access(uid_t, gid_t, xfs_acl_t *, mode_t, cred_t *);
> -STATIC int xfs_acl_invalid(xfs_acl_t *);
> -STATIC void xfs_acl_sync_mode(mode_t, xfs_acl_t *);
> -STATIC void xfs_acl_get_attr(bhv_vnode_t *, xfs_acl_t *, int, int, int *);
> -STATIC void xfs_acl_set_attr(bhv_vnode_t *, xfs_acl_t *, int, int *);
> -STATIC int xfs_acl_allow_set(bhv_vnode_t *, int);
> -
> -kmem_zone_t *xfs_acl_zone;
> -
> -
> -/*
> - * Test for existence of access ACL attribute as efficiently as possible.
> - */
> -int
> -xfs_acl_vhasacl_access(
> -	bhv_vnode_t *vp)
> -{
> -	int error;
> -
> -	xfs_acl_get_attr(vp, NULL, _ACL_TYPE_ACCESS, ATTR_KERNOVAL, &error);
> -	return (error == 0);
> -}
> -
> -/*
> - * Test for existence of default ACL attribute as efficiently as possible.
> - */
> -int
> -xfs_acl_vhasacl_default(
> -	bhv_vnode_t *vp)
> -{
> -	int error;
> -
> -	if (!VN_ISDIR(vp))
> -		return 0;
> -	xfs_acl_get_attr(vp, NULL, _ACL_TYPE_DEFAULT, ATTR_KERNOVAL, &error);
> -	return (error == 0);
> -}
> -
> -/*
> - * Convert from extended attribute representation to in-memory for XFS.
> - */
> -STATIC int
> -posix_acl_xattr_to_xfs(
> -	posix_acl_xattr_header *src,
> -	size_t size,
> -	xfs_acl_t *dest)
> -{
> -	posix_acl_xattr_entry *src_entry;
> -	xfs_acl_entry_t *dest_entry;
> -	int n;
> -
> -	if (!src || !dest)
> -		return EINVAL;
> -
> -	if (size < sizeof(posix_acl_xattr_header))
> -		return EINVAL;
> -
> -	if (src->a_version != cpu_to_le32(POSIX_ACL_XATTR_VERSION))
> -		return EOPNOTSUPP;
> -
> -	memset(dest, 0, sizeof(xfs_acl_t));
> -	dest->acl_cnt = posix_acl_xattr_count(size);
> -	if (dest->acl_cnt < 0 || dest->acl_cnt > XFS_ACL_MAX_ENTRIES)
> -		return EINVAL;
> -
> -	/*
> -	 * acl_set_file(3) may request that we set default ACLs with
> -	 * zero length -- defend (gracefully) against that here.
> -	 */
> -	if (!dest->acl_cnt)
> -		return 0;
> -
> -	src_entry = (posix_acl_xattr_entry *)((char *)src + sizeof(*src));
> -	dest_entry = &dest->acl_entry[0];
> -
> -	for (n = 0; n < dest->acl_cnt; n++, src_entry++, dest_entry++) {
> -		dest_entry->ae_perm = le16_to_cpu(src_entry->e_perm);
> -		if (_ACL_PERM_INVALID(dest_entry->ae_perm))
> -			return EINVAL;
> -		dest_entry->ae_tag = le16_to_cpu(src_entry->e_tag);
> -		switch(dest_entry->ae_tag) {
> -		case ACL_USER:
> -		case ACL_GROUP:
> -			dest_entry->ae_id = le32_to_cpu(src_entry->e_id);
> -			break;
> -		case ACL_USER_OBJ:
> -		case ACL_GROUP_OBJ:
> -		case ACL_MASK:
> -		case ACL_OTHER:
> -			dest_entry->ae_id = ACL_UNDEFINED_ID;
> -			break;
> -		default:
> -			return EINVAL;
> -		}
> -	}
> -	if (xfs_acl_invalid(dest))
> -		return EINVAL;
> -
> -	return 0;
> -}
> -
> -/*
> - * Comparison function called from xfs_sort().
> - * Primary key is ae_tag, secondary key is ae_id.
> - */
> -STATIC int
> -xfs_acl_entry_compare(
> -	const void *va,
> -	const void *vb)
> -{
> -	xfs_acl_entry_t *a = (xfs_acl_entry_t *)va,
> -			*b = (xfs_acl_entry_t *)vb;
> -
> -	if (a->ae_tag == b->ae_tag)
> -		return (a->ae_id - b->ae_id);
> -	return (a->ae_tag - b->ae_tag);
> -}
> -
> -/*
> - * Convert from in-memory XFS to extended attribute representation.
> - */
> -STATIC int
> -posix_acl_xfs_to_xattr(
> -	xfs_acl_t *src,
> -	posix_acl_xattr_header *dest,
> -	size_t size)
> -{
> -	int n;
> -	size_t new_size = posix_acl_xattr_size(src->acl_cnt);
> -	posix_acl_xattr_entry *dest_entry;
> -	xfs_acl_entry_t *src_entry;
> -
> -	if (size < new_size)
> -		return -ERANGE;
> -
> -	/* Need to sort src XFS ACL by */
> -	xfs_sort(src->acl_entry, src->acl_cnt, sizeof(src->acl_entry[0]),
> -		xfs_acl_entry_compare);
> -
> -	dest->a_version = cpu_to_le32(POSIX_ACL_XATTR_VERSION);
> -	dest_entry = &dest->a_entries[0];
> -	src_entry = &src->acl_entry[0];
> -	for (n = 0; n < src->acl_cnt; n++, dest_entry++, src_entry++) {
> -		dest_entry->e_perm = cpu_to_le16(src_entry->ae_perm);
> -		if (_ACL_PERM_INVALID(src_entry->ae_perm))
> -			return -EINVAL;
> -		dest_entry->e_tag = cpu_to_le16(src_entry->ae_tag);
> -		switch (src_entry->ae_tag) {
> -		case ACL_USER:
> -		case ACL_GROUP:
> -			dest_entry->e_id = cpu_to_le32(src_entry->ae_id);
> -			break;
> -		case ACL_USER_OBJ:
> -		case ACL_GROUP_OBJ:
> -		case ACL_MASK:
> -		case ACL_OTHER:
> -			dest_entry->e_id = cpu_to_le32(ACL_UNDEFINED_ID);
> -			break;
> -		default:
> -			return -EINVAL;
> -		}
> -	}
> -	return new_size;
> -}
> -
> -int
> -xfs_acl_vget(
> -	bhv_vnode_t *vp,
> -	void *acl,
> -	size_t size,
> -	int kind)
> -{
> -	int error;
> -	xfs_acl_t *xfs_acl = NULL;
> -	posix_acl_xattr_header *ext_acl = acl;
> -	int flags = 0;
> -
> -	VN_HOLD(vp);
> -	if(size) {
> -		if (!(_ACL_ALLOC(xfs_acl))) {
> -			error = ENOMEM;
> -			goto out;
> -		}
> -		memset(xfs_acl, 0, sizeof(xfs_acl_t));
> -	} else
> -		flags = ATTR_KERNOVAL;
> -
> -	xfs_acl_get_attr(vp, xfs_acl, kind, flags, &error);
> -	if (error)
> -		goto out;
> -
> -	if (!size) {
> -		error = -posix_acl_xattr_size(XFS_ACL_MAX_ENTRIES);
> -	} else {
> -		if (xfs_acl_invalid(xfs_acl)) {
> -			error = EINVAL;
> -			goto out;
> -		}
> -		if (kind == _ACL_TYPE_ACCESS) {
> -			bhv_vattr_t va;
> -
> -			va.va_mask = XFS_AT_MODE;
> -			error = xfs_getattr(xfs_vtoi(vp), &va, 0);
> -			if (error)
> -				goto out;
> -			xfs_acl_sync_mode(va.va_mode, xfs_acl);
> -		}
> -		error = -posix_acl_xfs_to_xattr(xfs_acl, ext_acl, size);
> -	}
> -out:
> -	VN_RELE(vp);
> -	if(xfs_acl)
> -		_ACL_FREE(xfs_acl);
> -	return -error;
> -}
> -
> -int
> -xfs_acl_vremove(
> -	bhv_vnode_t *vp,
> -	int kind)
> -{
> -	int error;
> -
> -	VN_HOLD(vp);
> -	error = xfs_acl_allow_set(vp, kind);
> -	if (!error) {
> -		error = xfs_attr_remove(xfs_vtoi(vp),
> -				kind == _ACL_TYPE_DEFAULT?
> -				SGI_ACL_DEFAULT: SGI_ACL_FILE,
> -				ATTR_ROOT);
> -		if (error == ENOATTR)
> -			error = 0; /* 'scool */
> -	}
> -	VN_RELE(vp);
> -	return -error;
> -}
> -
> -int
> -xfs_acl_vset(
> -	bhv_vnode_t *vp,
> -	void *acl,
> -	size_t size,
> -	int kind)
> -{
> -	posix_acl_xattr_header *ext_acl = acl;
> -	xfs_acl_t *xfs_acl;
> -	int error;
> -	int basicperms = 0; /* more than std unix perms? */
> -
> -	if (!acl)
> -		return -EINVAL;
> -
> -	if (!(_ACL_ALLOC(xfs_acl)))
> -		return -ENOMEM;
> -
> -	error = posix_acl_xattr_to_xfs(ext_acl, size, xfs_acl);
> -	if (error) {
> -		_ACL_FREE(xfs_acl);
> -		return -error;
> -	}
> -	if (!xfs_acl->acl_cnt) {
> -		_ACL_FREE(xfs_acl);
> -		return 0;
> -	}
> -
> -	VN_HOLD(vp);
> -	error = xfs_acl_allow_set(vp, kind);
> -	if (error)
> -		goto out;
> -
> -	/* Incoming ACL exists, set file mode based on its value */
> -	if (kind == _ACL_TYPE_ACCESS)
> -		xfs_acl_setmode(vp, xfs_acl, &basicperms);
> -
> -	/*
> -	 * If we have more than std unix permissions, set up the actual attr.
> -	 * Otherwise, delete any existing attr. This prevents us from
> -	 * having actual attrs for permissions that can be stored in the
> -	 * standard permission bits.
> -	 */
> -	if (!basicperms) {
> -		xfs_acl_set_attr(vp, xfs_acl, kind, &error);
> -	} else {
> -		xfs_acl_vremove(vp, _ACL_TYPE_ACCESS);
> -	}
> -
> -out:
> -	VN_RELE(vp);
> -	_ACL_FREE(xfs_acl);
> -	return -error;
> -}
> -
> -int
> -xfs_acl_iaccess(
> -	xfs_inode_t *ip,
> -	mode_t mode,
> -	cred_t *cr)
> -{
> -	xfs_acl_t *acl;
> -	int rval;
> -
> -	if (!(_ACL_ALLOC(acl)))
> -		return -1;
> -
> -	/* If the file has no ACL return -1. */
> -	rval = sizeof(xfs_acl_t);
> -	if (xfs_attr_fetch(ip, SGI_ACL_FILE, SGI_ACL_FILE_SIZE,
> -			(char *)acl, &rval, ATTR_ROOT | ATTR_KERNACCESS, cr)) {
> -		_ACL_FREE(acl);
> -		return -1;
> -	}
> -	xfs_acl_get_endian(acl);
> -
> -	/* If the file has an empty ACL return -1. */
> -	if (acl->acl_cnt == XFS_ACL_NOT_PRESENT) {
> -		_ACL_FREE(acl);
> -		return -1;
> -	}
> -
> -	/* Synchronize ACL with mode bits */
> -	xfs_acl_sync_mode(ip->i_d.di_mode, acl);
> -
> -	rval = xfs_acl_access(ip->i_d.di_uid, ip->i_d.di_gid, acl, mode, cr);
> -	_ACL_FREE(acl);
> -	return rval;
> -}
> -
> -STATIC int
> -xfs_acl_allow_set(
> -	bhv_vnode_t *vp,
> -	int kind)
> -{
> -	xfs_inode_t *ip = xfs_vtoi(vp);
> -	bhv_vattr_t va;
> -	int error;
> -
> -	if (vp->i_flags & (S_IMMUTABLE|S_APPEND))
> -		return EPERM;
> -	if (kind == _ACL_TYPE_DEFAULT && !VN_ISDIR(vp))
> -		return ENOTDIR;
> -	if (vp->i_sb->s_flags & MS_RDONLY)
> -		return EROFS;
> -	va.va_mask = XFS_AT_UID;
> -	error = xfs_getattr(ip, &va, 0);
> -	if (error)
> -		return error;
> -	if (va.va_uid != current->fsuid && !capable(CAP_FOWNER))
> -		return EPERM;
> -	return error;
> -}
> -
> -/*
> - * Note: cr is only used here for the capability check if the ACL test fails.
> - * It is not used to find out the credentials uid or groups etc, as was
> - * done in IRIX. It is assumed that the uid and groups for the current
> - * thread are taken from "current" instead of the cr parameter.
> - */
> -STATIC int
> -xfs_acl_access(
> -	uid_t fuid,
> -	gid_t fgid,
> -	xfs_acl_t *fap,
> -	mode_t md,
> -	cred_t *cr)
> -{
> -	xfs_acl_entry_t matched;
> -	int i, allows;
> -	int maskallows = -1; /* true, but not 1, either */
> -	int seen_userobj = 0;
> -
> -	matched.ae_tag = 0; /* Invalid type */
> -	matched.ae_perm = 0;
> -
> -	for (i = 0; i < fap->acl_cnt; i++) {
> -		/*
> -		 * Break out if we've got a user_obj entry or
> -		 * a user entry and the mask (and have processed USER_OBJ)
> -		 */
> -		if (matched.ae_tag == ACL_USER_OBJ)
> -			break;
> -		if (matched.ae_tag == ACL_USER) {
> -			if (maskallows != -1 && seen_userobj)
> -				break;
> -			if (fap->acl_entry[i].ae_tag != ACL_MASK &&
> -			    fap->acl_entry[i].ae_tag != ACL_USER_OBJ)
> -				continue;
> -		}
> -		/* True if this entry allows the requested access */
> -		allows = ((fap->acl_entry[i].ae_perm & md) == md);
> -
> -		switch (fap->acl_entry[i].ae_tag) {
> -		case ACL_USER_OBJ:
> -			seen_userobj = 1;
> -			if (fuid != current->fsuid)
> -				continue;
> -			matched.ae_tag = ACL_USER_OBJ;
> -			matched.ae_perm = allows;
> -			break;
> -		case ACL_USER:
> -			if (fap->acl_entry[i].ae_id != current->fsuid)
> -				continue;
> -			matched.ae_tag = ACL_USER;
> -			matched.ae_perm = allows;
> -			break;
> -		case ACL_GROUP_OBJ:
> -			if ((matched.ae_tag == ACL_GROUP_OBJ ||
> -			    matched.ae_tag == ACL_GROUP) && !allows)
> -				continue;
> -			if (!in_group_p(fgid))
> -				continue;
> -			matched.ae_tag = ACL_GROUP_OBJ;
> -			matched.ae_perm = allows;
> -			break;
> -		case ACL_GROUP:
> -			if ((matched.ae_tag == ACL_GROUP_OBJ ||
> -			    matched.ae_tag == ACL_GROUP) && !allows)
> -				continue;
> -			if (!in_group_p(fap->acl_entry[i].ae_id))
> -				continue;
> -			matched.ae_tag = ACL_GROUP;
> -			matched.ae_perm = allows;
> -			break;
> -		case ACL_MASK:
> -			maskallows = allows;
> -			break;
> -		case ACL_OTHER:
> -			if (matched.ae_tag != 0)
> -				continue;
> -			matched.ae_tag = ACL_OTHER;
> -			matched.ae_perm = allows;
> -			break;
> -		}
> -	}
> -	/*
> -	 * First possibility is that no matched entry allows access.
> -	 * The capability to override DAC may exist, so check for it.
> -	 */
> -	switch (matched.ae_tag) {
> -	case ACL_OTHER:
> -	case ACL_USER_OBJ:
> -		if (matched.ae_perm)
> -			return 0;
> -		break;
> -	case ACL_USER:
> -	case ACL_GROUP_OBJ:
> -	case ACL_GROUP:
> -		if (maskallows && matched.ae_perm)
> -			return 0;
> -		break;
> -	case 0:
> -		break;
> -	}
> -
> -	/* EACCES tells generic_permission to check for capability overrides */
> -	return EACCES;
> -}
> -EXPORT_SYMBOL(xfs_acl_access);
> -
> -/*
> - * ACL validity checker.
> - * This acl validation routine checks each ACL entry read in makes sense.
> - */
> -STATIC int
> -xfs_acl_invalid(
> -	xfs_acl_t *aclp)
> -{
> -	xfs_acl_entry_t *entry, *e;
> -	int user = 0, group = 0, other = 0, mask = 0;
> -	int mask_required = 0;
> -	int i, j;
> -
> -	if (!aclp)
> -		goto acl_invalid;
> -
> -	if (aclp->acl_cnt > XFS_ACL_MAX_ENTRIES)
> -		goto acl_invalid;
> -
> -	for (i = 0; i < aclp->acl_cnt; i++) {
> -		entry = &aclp->acl_entry[i];
> -		switch (entry->ae_tag) {
> -		case ACL_USER_OBJ:
> -			if (user++)
> -				goto acl_invalid;
> -			break;
> -		case ACL_GROUP_OBJ:
> -			if (group++)
> -				goto acl_invalid;
> -			break;
> -		case ACL_OTHER:
> -			if (other++)
> -				goto acl_invalid;
> -			break;
> -		case ACL_USER:
> -		case ACL_GROUP:
> -			for (j = i + 1; j < aclp->acl_cnt; j++) {
> -				e = &aclp->acl_entry[j];
> -				if (e->ae_id == entry->ae_id &&
> -				    e->ae_tag == entry->ae_tag)
> -					goto acl_invalid;
> -			}
> -			mask_required++;
> -			break;
> -		case ACL_MASK:
> -			if (mask++)
> -				goto acl_invalid;
> -			break;
> -		default:
> -			goto acl_invalid;
> -		}
> -	}
> -	if (!user || !group || !other || (mask_required && !mask))
> -		goto acl_invalid;
> -	else
> -		return 0;
> -acl_invalid:
> -	return EINVAL;
> -}
> -
> -/*
> - * Do ACL endian conversion.
> - */
> -STATIC void
> -xfs_acl_get_endian(
> -	xfs_acl_t *aclp)
> -{
> -	xfs_acl_entry_t *ace, *end;
> -
> -	INT_SET(aclp->acl_cnt, ARCH_CONVERT, aclp->acl_cnt);
> -	end = &aclp->acl_entry[0]+aclp->acl_cnt;
> -	for (ace = &aclp->acl_entry[0]; ace < end; ace++) {
> -		INT_SET(ace->ae_tag, ARCH_CONVERT, ace->ae_tag);
> -		INT_SET(ace->ae_id, ARCH_CONVERT, ace->ae_id);
> -		INT_SET(ace->ae_perm, ARCH_CONVERT, ace->ae_perm);
> -	}
> -}
> -
> -/*
> - * Get the ACL from the EA and do endian conversion.
> - */
> -STATIC void
> -xfs_acl_get_attr(
> -	bhv_vnode_t *vp,
> -	xfs_acl_t *aclp,
> -	int kind,
> -	int flags,
> -	int *error)
> -{
> -	int len = sizeof(xfs_acl_t);
> -
> -	ASSERT((flags & ATTR_KERNOVAL) ? (aclp == NULL) : 1);
> -	flags |= ATTR_ROOT;
> -	*error = xfs_attr_get(xfs_vtoi(vp),
> -			kind == _ACL_TYPE_ACCESS ?
> -			SGI_ACL_FILE : SGI_ACL_DEFAULT,
> -			(char *)aclp, &len, flags, sys_cred);
> -	if (*error || (flags & ATTR_KERNOVAL))
> -		return;
> -	xfs_acl_get_endian(aclp);
> -}
> -
> -/*
> - * Set the EA with the ACL and do endian conversion.
> - */
> -STATIC void
> -xfs_acl_set_attr(
> -	bhv_vnode_t *vp,
> -	xfs_acl_t *aclp,
> -	int kind,
> -	int *error)
> -{
> -	xfs_acl_entry_t *ace, *newace, *end;
> -	xfs_acl_t *newacl;
> -	int len;
> -
> -	if (!(_ACL_ALLOC(newacl))) {
> -		*error = ENOMEM;
> -		return;
> -	}
> -
> -	len = sizeof(xfs_acl_t) -
> -		(sizeof(xfs_acl_entry_t) * (XFS_ACL_MAX_ENTRIES - aclp->acl_cnt));
> -	end = &aclp->acl_entry[0]+aclp->acl_cnt;
> -	for (ace = &aclp->acl_entry[0], newace = &newacl->acl_entry[0];
> -	     ace < end;
> -	     ace++, newace++) {
> -		INT_SET(newace->ae_tag, ARCH_CONVERT, ace->ae_tag);
> -		INT_SET(newace->ae_id, ARCH_CONVERT, ace->ae_id);
> -		INT_SET(newace->ae_perm, ARCH_CONVERT, ace->ae_perm);
> -	}
> -	INT_SET(newacl->acl_cnt, ARCH_CONVERT, aclp->acl_cnt);
> -	*error = xfs_attr_set(xfs_vtoi(vp),
> -			kind == _ACL_TYPE_ACCESS ?
> -			SGI_ACL_FILE: SGI_ACL_DEFAULT,
> -			(char *)newacl, len, ATTR_ROOT);
> -	_ACL_FREE(newacl);
> -}
> -
> -int
> -xfs_acl_vtoacl(
> -	bhv_vnode_t *vp,
> -	xfs_acl_t *access_acl,
> -	xfs_acl_t *default_acl)
> -{
> -	bhv_vattr_t va;
> -	int error = 0;
> -
> -	if (access_acl) {
> -		/*
> -		 * Get the Access ACL and the mode. If either cannot
> -		 * be obtained for some reason, invalidate the access ACL.
> -		 */
> -		xfs_acl_get_attr(vp, access_acl, _ACL_TYPE_ACCESS, 0, &error);
> -		if (!error) {
> -			/* Got the ACL, need the mode... */
> -			va.va_mask = XFS_AT_MODE;
> -			error = xfs_getattr(xfs_vtoi(vp), &va, 0);
> -		}
> -
> -		if (error)
> -			access_acl->acl_cnt = XFS_ACL_NOT_PRESENT;
> -		else /* We have a good ACL and the file mode, synchronize. */
> -			xfs_acl_sync_mode(va.va_mode, access_acl);
> -	}
> -
> -	if (default_acl) {
> -		xfs_acl_get_attr(vp, default_acl, _ACL_TYPE_DEFAULT, 0, &error);
> -		if (error)
> -			default_acl->acl_cnt = XFS_ACL_NOT_PRESENT;
> -	}
> -	return error;
> -}
> -
> -/*
> - * This function retrieves the parent directory's acl, processes it
> - * and lets the child inherit the acl(s) that it should.
> - */
> -int
> -xfs_acl_inherit(
> -	bhv_vnode_t *vp,
> -	mode_t mode,
> -	xfs_acl_t *pdaclp)
> -{
> -	xfs_acl_t *cacl;
> -	int error = 0;
> -	int basicperms = 0;
> -
> -	/*
> -	 * If the parent does not have a default ACL, or it's an
> -	 * invalid ACL, we're done.
> -	 */
> -	if (!vp)
> -		return 0;
> -	if (!pdaclp || xfs_acl_invalid(pdaclp))
> -		return 0;
> -
> -	/*
> -	 * Copy the default ACL of the containing directory to
> -	 * the access ACL of the new file and use the mode that
> -	 * was passed in to set up the correct initial values for
> -	 * the u::,g::[m::], and o:: entries. This is what makes
> -	 * umask() "work" with ACL's.
> -	 */
> -
> -	if (!(_ACL_ALLOC(cacl)))
> -		return ENOMEM;
> -
> -	memcpy(cacl, pdaclp, sizeof(xfs_acl_t));
> -	xfs_acl_filter_mode(mode, cacl);
> -	xfs_acl_setmode(vp, cacl, &basicperms);
> -
> -	/*
> -	 * Set the Default and Access ACL on the file. The mode is already
> -	 * set on the file, so we don't need to worry about that.
> -	 *
> -	 * If the new file is a directory, its default ACL is a copy of
> -	 * the containing directory's default ACL.
> -	 */
> -	if (VN_ISDIR(vp))
> -		xfs_acl_set_attr(vp, pdaclp, _ACL_TYPE_DEFAULT, &error);
> -	if (!error && !basicperms)
> -		xfs_acl_set_attr(vp, cacl, _ACL_TYPE_ACCESS, &error);
> -	_ACL_FREE(cacl);
> -	return error;
> -}
> -
> -/*
> - * Set up the correct mode on the file based on the supplied ACL. This
> - * makes sure that the mode on the file reflects the state of the
> - * u::,g::[m::], and o:: entries in the ACL. Since the mode is where
> - * the ACL is going to get the permissions for these entries, we must
> - * synchronize the mode whenever we set the ACL on a file.
> - */
> -STATIC int
> -xfs_acl_setmode(
> -	bhv_vnode_t *vp,
> -	xfs_acl_t *acl,
> -	int *basicperms)
> -{
> -	bhv_vattr_t va;
> -	xfs_acl_entry_t *ap;
> -	xfs_acl_entry_t *gap = NULL;
> -	int i, error, nomask = 1;
> -
> -	*basicperms = 1;
> -
> -	if (acl->acl_cnt == XFS_ACL_NOT_PRESENT)
> -		return 0;
> -
> -	/*
> -	 * Copy the u::, g::, o::, and m:: bits from the ACL into the
> -	 * mode. The m:: bits take precedence over the g:: bits.
> -	 */
> -	va.va_mask = XFS_AT_MODE;
> -	error = xfs_getattr(xfs_vtoi(vp), &va, 0);
> -	if (error)
> -		return error;
> -
> -	va.va_mask = XFS_AT_MODE;
> -	va.va_mode &= ~(S_IRWXU|S_IRWXG|S_IRWXO);
> -	ap = acl->acl_entry;
> -	for (i = 0; i < acl->acl_cnt; ++i) {
> -		switch (ap->ae_tag) {
> -		case ACL_USER_OBJ:
> -			va.va_mode |= ap->ae_perm << 6;
> -			break;
> -		case ACL_GROUP_OBJ:
> -			gap = ap;
> -			break;
> -		case ACL_MASK: /* more than just standard modes */
> -			nomask = 0;
> -			va.va_mode |= ap->ae_perm << 3;
> -			*basicperms = 0;
> -			break;
> -		case ACL_OTHER:
> -			va.va_mode |= ap->ae_perm;
> -			break;
> -		default: /* more than just standard modes */
> -			*basicperms = 0;
> -			break;
> -		}
> -		ap++;
> -	}
> -
> -	/* Set the group bits from ACL_GROUP_OBJ if there's no ACL_MASK */
> -	if (gap && nomask)
> -		va.va_mode |= gap->ae_perm << 3;
> -
> -	return xfs_setattr(xfs_vtoi(vp), &va, 0, sys_cred);
> -}
> -
> -/*
> - * The permissions for the special ACL entries (u::, g::[m::], o::) are
> - * actually stored in the file mode (if there is both a group and a mask,
> - * the group is stored in the ACL entry and the mask is stored on the file).
> - * This allows the mode to remain automatically in sync with the ACL without
> - * the need for a call-back to the ACL system at every point where the mode
> - * could change. This function takes the permissions from the specified mode
> - * and places it in the supplied ACL.
> - *
> - * This implementation draws its validity from the fact that, when the ACL
> - * was assigned, the mode was copied from the ACL.
> - * If the mode did not change, therefore, the mode remains exactly what was
> - * taken from the special ACL entries at assignment.
> - * If a subsequent chmod() was done, the POSIX spec says that the change in
> - * mode must cause an update to the ACL seen at user level and used for
> - * access checks. Before and after a mode change, therefore, the file mode
> - * most accurately reflects what the special ACL entries should permit/deny.
> - *
> - * CAVEAT: If someone sets the SGI_ACL_FILE attribute directly,
> - * the existing mode bits will override whatever is in the
> - * ACL. Similarly, if there is a pre-existing ACL that was
> - * never in sync with its mode (owing to a bug in 6.5 and
> - * before), it will now magically (or mystically) be
> - * synchronized. This could cause slight astonishment, but
> - * it is better than inconsistent permissions.
> - *
> - * The supplied ACL is a template that may contain any combination
> - * of special entries. These are treated as place holders when we fill
> - * out the ACL. This routine does not add or remove special entries, it
> - * simply unites each special entry with its associated set of permissions.
> - */
> -STATIC void
> -xfs_acl_sync_mode(
> -	mode_t mode,
> -	xfs_acl_t *acl)
> -{
> -	int i, nomask = 1;
> -	xfs_acl_entry_t *ap;
> -	xfs_acl_entry_t *gap = NULL;
> -
> -	/*
> -	 * Set ACL entries. POSIX1003.1eD16 requires that the MASK
> -	 * be set instead of the GROUP entry, if there is a MASK.
> -	 */
> -	for (ap = acl->acl_entry, i = 0; i < acl->acl_cnt; ap++, i++) {
> -		switch (ap->ae_tag) {
> -		case ACL_USER_OBJ:
> -			ap->ae_perm = (mode >> 6) & 0x7;
> -			break;
> -		case ACL_GROUP_OBJ:
> -			gap = ap;
> -			break;
> -		case ACL_MASK:
> -			nomask = 0;
> -			ap->ae_perm = (mode >> 3) & 0x7;
> -			break;
> -		case ACL_OTHER:
> -			ap->ae_perm = mode & 0x7;
> -			break;
> -		default:
> -			break;
> -		}
> -	}
> -	/* Set the ACL_GROUP_OBJ if there's no ACL_MASK */
> -	if (gap && nomask)
> -		gap->ae_perm = (mode >> 3) & 0x7;
> -}
> -
> -/*
> - * When inheriting an Access ACL from a directory Default ACL,
> - * the ACL bits are set to the intersection of the ACL default
> - * permission bits and the file permission bits in mode. If there
> - * are no permission bits on the file then we must not give them
> - * the ACL. This is what what makes umask() work with ACLs.
> - */
> -STATIC void
> -xfs_acl_filter_mode(
> -	mode_t mode,
> -	xfs_acl_t *acl)
> -{
> -	int i, nomask = 1;
> -	xfs_acl_entry_t *ap;
> -	xfs_acl_entry_t *gap = NULL;
> -
> -	/*
> -	 * Set ACL entries. POSIX1003.1eD16 requires that the MASK
> -	 * be merged with GROUP entry, if there is a MASK.
> -	 */
> -	for (ap = acl->acl_entry, i = 0; i < acl->acl_cnt; ap++, i++) {
> -		switch (ap->ae_tag) {
> -		case ACL_USER_OBJ:
> -			ap->ae_perm &= (mode >> 6) & 0x7;
> -			break;
> -		case ACL_GROUP_OBJ:
> -			gap = ap;
> -			break;
> -		case ACL_MASK:
> -			nomask = 0;
> -			ap->ae_perm &= (mode >> 3) & 0x7;
> -			break;
> -		case ACL_OTHER:
> -			ap->ae_perm &= mode & 0x7;
> -			break;
> -		default:
> -			break;
> -		}
> -	}
> -	/* Set the ACL_GROUP_OBJ if there's no ACL_MASK */
> -	if (gap && nomask)
> -		gap->ae_perm &= (mode >> 3) & 0x7;
> -}
> Index: linux-2.6-xfs/fs/xfs/Makefile
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/Makefile 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/Makefile 2008-02-07 09:15:35.000000000 +0100
> @@ -29,7 +29,7 @@ obj-$(CONFIG_XFS_QUOTA) += quota/
>  obj-$(CONFIG_XFS_DMAPI) += dmapi/
>
>  xfs-$(CONFIG_XFS_RT) += xfs_rtalloc.o
> -xfs-$(CONFIG_XFS_POSIX_ACL) += xfs_acl.o
> +xfs-$(CONFIG_XFS_POSIX_ACL) += $(XFS_LINUX)/xfs_acl.o
>  xfs-$(CONFIG_PROC_FS) += $(XFS_LINUX)/xfs_stats.o
>  xfs-$(CONFIG_SYSCTL) += $(XFS_LINUX)/xfs_sysctl.o
>  xfs-$(CONFIG_COMPAT) += $(XFS_LINUX)/xfs_ioctl32.o
> Index: linux-2.6-xfs/fs/xfs/xfs_inode.c
> ===================================================================
> --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.c 2008-02-05 08:43:31.000000000 +0100
> +++ linux-2.6-xfs/fs/xfs/xfs_inode.c 2008-02-07 09:15:35.000000000 +0100
> @@ -52,6 +52,7 @@
>  #include "xfs_acl.h"
>  #include "xfs_filestream.h"
>  #include "xfs_vnodeops.h"
> +#include "xfs_acl.h"
>
>  kmem_zone_t *xfs_ifork_zone;
>  kmem_zone_t *xfs_inode_zone;
> @@ -870,6 +871,7 @@
xfs_iread( > ip->i_mount = mp; > atomic_set(&ip->i_iocount, 0); > spin_lock_init(&ip->i_flags_lock); > + xfs_inode_init_acls(ip); > > /* > * Get pointer's to the on-disk inode and the buffer containing it. > @@ -2793,6 +2795,8 @@ xfs_idestroy( > } > xfs_inode_item_destroy(ip); > } > + > + xfs_inode_clear_acls(ip); > kmem_zone_free(xfs_inode_zone, ip); > } > > Index: linux-2.6-xfs/fs/xfs/xfs_inode.h > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.h 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_inode.h 2008-02-07 09:15:35.000000000 +0100 > @@ -18,6 +18,7 @@ > #ifndef __XFS_INODE_H__ > #define __XFS_INODE_H__ > > +struct posix_acl; > struct xfs_dinode; > struct xfs_dinode_core; > > @@ -258,6 +259,11 @@ typedef struct xfs_inode { > xfs_fsize_t i_size; /* in-memory size */ > xfs_fsize_t i_new_size; /* size when write completes */ > atomic_t i_iocount; /* outstanding I/O count */ > + > +#ifdef CONFIG_XFS_POSIX_ACL > + struct posix_acl *i_acl; > + struct posix_acl *i_default_acl; > +#endif > /* Trace buffers per inode. 
*/ > #ifdef XFS_INODE_TRACE > struct ktrace *i_trace; /* general inode trace */ > Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.c 2008-02-07 09:15:55.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.c 2008-02-07 09:16:07.000000000 +0100 > @@ -77,132 +77,6 @@ xfs_open( > } > > /* > - * xfs_getattr > - */ > -int > -xfs_getattr( > - xfs_inode_t *ip, > - bhv_vattr_t *vap, > - int flags) > -{ > - bhv_vnode_t *vp = XFS_ITOV(ip); > - xfs_mount_t *mp = ip->i_mount; > - > - xfs_itrace_entry(ip); > - > - if (XFS_FORCED_SHUTDOWN(mp)) > - return XFS_ERROR(EIO); > - > - if (!(flags & ATTR_LAZY)) > - xfs_ilock(ip, XFS_ILOCK_SHARED); > - > - vap->va_size = XFS_ISIZE(ip); > - if (vap->va_mask == XFS_AT_SIZE) > - goto all_done; > - > - vap->va_nblocks = > - XFS_FSB_TO_BB(mp, ip->i_d.di_nblocks + ip->i_delayed_blks); > - vap->va_nodeid = ip->i_ino; > -#if XFS_BIG_INUMS > - vap->va_nodeid += mp->m_inoadd; > -#endif > - vap->va_nlink = ip->i_d.di_nlink; > - > - /* > - * Quick exit for non-stat callers > - */ > - if ((vap->va_mask & > - ~(XFS_AT_SIZE|XFS_AT_FSID|XFS_AT_NODEID| > - XFS_AT_NLINK|XFS_AT_BLKSIZE)) == 0) > - goto all_done; > - > - /* > - * Copy from in-core inode. > - */ > - vap->va_mode = ip->i_d.di_mode; > - vap->va_uid = ip->i_d.di_uid; > - vap->va_gid = ip->i_d.di_gid; > - vap->va_projid = ip->i_d.di_projid; > - > - /* > - * Check vnode type block/char vs. everything else. 
> - */ > - switch (ip->i_d.di_mode & S_IFMT) { > - case S_IFBLK: > - case S_IFCHR: > - vap->va_rdev = ip->i_df.if_u2.if_rdev; > - vap->va_blocksize = BLKDEV_IOSIZE; > - break; > - default: > - vap->va_rdev = 0; > - > - if (!(XFS_IS_REALTIME_INODE(ip))) { > - vap->va_blocksize = xfs_preferred_iosize(mp); > - } else { > - > - /* > - * If the file blocks are being allocated from a > - * realtime partition, then return the inode's > - * realtime extent size or the realtime volume's > - * extent size. > - */ > - vap->va_blocksize = > - xfs_get_extsz_hint(ip) << mp->m_sb.sb_blocklog; > - } > - break; > - } > - > - vn_atime_to_timespec(vp, &vap->va_atime); > - vap->va_mtime.tv_sec = ip->i_d.di_mtime.t_sec; > - vap->va_mtime.tv_nsec = ip->i_d.di_mtime.t_nsec; > - vap->va_ctime.tv_sec = ip->i_d.di_ctime.t_sec; > - vap->va_ctime.tv_nsec = ip->i_d.di_ctime.t_nsec; > - > - /* > - * Exit for stat callers. See if any of the rest of the fields > - * to be filled in are needed. > - */ > - if ((vap->va_mask & > - (XFS_AT_XFLAGS|XFS_AT_EXTSIZE|XFS_AT_NEXTENTS|XFS_AT_ANEXTENTS| > - XFS_AT_GENCOUNT|XFS_AT_VCODE)) == 0) > - goto all_done; > - > - /* > - * Convert di_flags to xflags. > - */ > - vap->va_xflags = xfs_ip2xflags(ip); > - > - /* > - * Exit for inode revalidate. See if any of the rest of > - * the fields to be filled in are needed. > - */ > - if ((vap->va_mask & > - (XFS_AT_EXTSIZE|XFS_AT_NEXTENTS|XFS_AT_ANEXTENTS| > - XFS_AT_GENCOUNT|XFS_AT_VCODE)) == 0) > - goto all_done; > - > - vap->va_extsize = ip->i_d.di_extsize << mp->m_sb.sb_blocklog; > - vap->va_nextents = > - (ip->i_df.if_flags & XFS_IFEXTENTS) ? > - ip->i_df.if_bytes / sizeof(xfs_bmbt_rec_t) : > - ip->i_d.di_nextents; > - if (ip->i_afp) > - vap->va_anextents = > - (ip->i_afp->if_flags & XFS_IFEXTENTS) ? 
> - ip->i_afp->if_bytes / sizeof(xfs_bmbt_rec_t) : > - ip->i_d.di_anextents; > - else > - vap->va_anextents = 0; > - vap->va_gen = ip->i_d.di_gen; > - > - all_done: > - if (!(flags & ATTR_LAZY)) > - xfs_iunlock(ip, XFS_ILOCK_SHARED); > - return 0; > -} > - > - > -/* > * xfs_setattr > */ > int > Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.h > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.h 2008-02-07 09:15:48.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.h 2008-02-07 09:15:53.000000000 +0100 > @@ -15,7 +15,6 @@ struct xfs_iomap; > > > int xfs_open(struct xfs_inode *ip); > -int xfs_getattr(struct xfs_inode *ip, struct bhv_vattr *vap, int flags); > int xfs_setattr(struct xfs_inode *ip, struct bhv_vattr *vap, int flags, > struct cred *credp); > int xfs_readlink(struct xfs_inode *ip, char *link); > From owner-xfs@oss.sgi.com Thu Feb 7 19:27:24 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 19:27:28 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_47 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m183RLDZ025865 for ; Thu, 7 Feb 2008 19:27:22 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA18610; Fri, 8 Feb 2008 14:27:36 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m183RXLF55156325; Fri, 8 Feb 2008 14:27:34 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m183RU3B55070590; Fri, 8 Feb 2008 14:27:30 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc 
set sender to dgc@sgi.com using -f Date: Fri, 8 Feb 2008 14:27:30 +1100 From: David Chinner To: Rabeeh Khoury Cc: nscott@aconex.com, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com, Lennert Buijtenhek Subject: Re: NFSD on XFS with RT subvolume Message-ID: <20080208032730.GL155407@sgi.com> References: <1202076343.9463.465.camel@edge.scott.net.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5733/Thu Feb 7 17:26:32 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14371 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Wed, Feb 06, 2008 at 04:08:58PM +0200, Rabeeh Khoury wrote: > > > > > > Exporting an XFS volume with kernel NFSD when real-time subvolume is > > > enabled hangs the kernel. > > > > > > I'm using vanilla LK 2.6.22.7; first I create the XFS volume with > two > > > partitions of 20GB each with extent size of 1MB; then I create a > > > subdirectory in the volume and mark it (using xfs_io util) as it > belongs > > > to the rt subvolume with inheritance flag. > > > > > > After mounting that volume through NFSv3 / UDP; and trying a 'dd > > > if=/dev/zero of=/mnt/rt/test bs=1M count=1000' the machine running > NFSD > > > hangs infinitely. > > > > Did you manage to get a stack trace, OOC? No reason why it shouldn't > > work AFAIK. > > I didn't mention that I'm using ARM EABI machine for that; but the same > scenario happened on Ubuntu Gutsy 7.10. > The serial console stops responding, but getting Sysrq with showPc > function working I'v got some stack traces (Look for #stack-trace > below). Nothing indicating a hang in the stack traces, just lots of truncates in progress. If you run the same test on the local machine, does the system hang? Or does it only hang through NFS. 
BTW, having multiple truncates in flight doesn't match up with your supposed test case above. If all you are doing is a dd, then there should only be one truncate occurring (on open). Try running with conv=notrunc and see if that hangs in a similar manner... > I'm running Fedora-8 on the ARM machine using xfsprogs-2.9.4-4.f8 RPM. > The output of formatting /dev/sda5 and /dev/sda6 as the rt-subvolume is > the following, but this time /dev/sda5 is 2GByte and /dev/sda6 is > 20GByte (look for #mkfs.xfs). > > Another note is that sometimes I'm getting an error message that XFS is > trying to access LBA beyond the volume. Does xfs_check or xfs_repair -n indicate any corruption on disk? > Maybe you can suggest few tests that I can perform to figure out what's > the root cause? If you don't use a rt device, does the same test hang? FWIW, if you run the same test on x86 or x86_64, does it hang? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 7 20:43:58 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 20:44:03 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m184htfn001863 for ; Thu, 7 Feb 2008 20:43:58 -0800 X-ASG-Debug-ID: 1202445853-628e01d50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A8A4DDBB6E0 for ; Thu, 7 Feb 2008 20:44:13 -0800 (PST) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id FmWFfNGwsvMCno0G for ; Thu, 07 Feb 2008 20:44:13 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with
ESMTP id m184i6F3015114 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Fri, 8 Feb 2008 05:44:06 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m184i6L9015112 for xfs@oss.sgi.com; Fri, 8 Feb 2008 05:44:06 +0100 Date: Fri, 8 Feb 2008 05:44:06 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH, mainline-only] remove dmapi cruft in xfs_file.c Subject: [PATCH, mainline-only] remove dmapi cruft in xfs_file.c Message-ID: <20080208044405.GB15013@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1202445858 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41669 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5733/Thu Feb 7 17:26:32 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14373 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs The dmapi cruft in xfs_file.c is totally out of date in mainline vs CVS, and at this point just removing this code which can't be used on mainline at all seems to be the best option to keep it maintainable. 
Signed-off-by: Christoph Hellwig Index: linux-2.6/fs/xfs/linux-2.6/xfs_file.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_file.c 2008-02-08 05:30:57.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_file.c 2008-02-08 05:31:26.000000000 +0100 @@ -43,9 +43,6 @@ #include static struct vm_operations_struct xfs_file_vm_ops; -#ifdef CONFIG_XFS_DMAPI -static struct vm_operations_struct xfs_dmapi_file_vm_ops; -#endif STATIC_INLINE ssize_t __xfs_file_read( @@ -202,22 +199,6 @@ xfs_file_fsync( (xfs_off_t)0, (xfs_off_t)-1); } -#ifdef CONFIG_XFS_DMAPI -STATIC int -xfs_vm_fault( - struct vm_area_struct *vma, - struct vm_fault *vmf) -{ - struct inode *inode = vma->vm_file->f_path.dentry->d_inode; - bhv_vnode_t *vp = vn_from_inode(inode); - - ASSERT_ALWAYS(vp->v_vfsp->vfs_flag & VFS_DMI); - if (XFS_SEND_MMAP(XFS_VFSTOM(vp->v_vfsp), vma, 0)) - return VM_FAULT_SIGBUS; - return filemap_fault(vma, vmf); -} -#endif /* CONFIG_XFS_DMAPI */ - /* * Unfortunately we can't just use the clean and simple readdir implementation * below, because nfs might call back into ->lookup from the filldir callback @@ -386,11 +367,6 @@ xfs_file_mmap( vma->vm_ops = &xfs_file_vm_ops; vma->vm_flags |= VM_CAN_NONLINEAR; -#ifdef CONFIG_XFS_DMAPI - if (XFS_M(filp->f_path.dentry->d_inode->i_sb)->m_flags & XFS_MOUNT_DMAPI) - vma->vm_ops = &xfs_dmapi_file_vm_ops; -#endif /* CONFIG_XFS_DMAPI */ - file_accessed(filp); return 0; } @@ -437,52 +413,6 @@ xfs_file_ioctl_invis( return error; } -#ifdef CONFIG_XFS_DMAPI -#ifdef HAVE_VMOP_MPROTECT -STATIC int -xfs_vm_mprotect( - struct vm_area_struct *vma, - unsigned int newflags) -{ - struct inode *inode = vma->vm_file->f_path.dentry->d_inode; - struct xfs_mount *mp = XFS_M(inode->i_sb); - int error = 0; - - if (mp->m_flags & XFS_MOUNT_DMAPI) { - if ((vma->vm_flags & VM_MAYSHARE) && - (newflags & VM_WRITE) && !(vma->vm_flags & VM_WRITE)) - error = XFS_SEND_MMAP(mp, vma, VM_WRITE); - } - return error; -} 
-#endif /* HAVE_VMOP_MPROTECT */ -#endif /* CONFIG_XFS_DMAPI */ - -#ifdef HAVE_FOP_OPEN_EXEC -/* If the user is attempting to execute a file that is offline then - * we have to trigger a DMAPI READ event before the file is marked as busy - * otherwise the invisible I/O will not be able to write to the file to bring - * it back online. - */ -STATIC int -xfs_file_open_exec( - struct inode *inode) -{ - struct xfs_mount *mp = XFS_M(inode->i_sb); - - if (unlikely(mp->m_flags & XFS_MOUNT_DMAPI)) { - if (DM_EVENT_ENABLED(XFS_I(inode), DM_EVENT_READ)) { - bhv_vnode_t *vp = vn_from_inode(inode); - - return -XFS_SEND_DATA(mp, DM_EVENT_READ, - vp, 0, 0, 0, NULL); - } - } - - return 0; -} -#endif /* HAVE_FOP_OPEN_EXEC */ - /* * mmap()d file has taken write protection fault and is being made * writable. We can set the page state up correctly for a writable @@ -551,13 +481,3 @@ static struct vm_operations_struct xfs_f .fault = filemap_fault, .page_mkwrite = xfs_vm_page_mkwrite, }; - -#ifdef CONFIG_XFS_DMAPI -static struct vm_operations_struct xfs_dmapi_file_vm_ops = { - .fault = xfs_vm_fault, - .page_mkwrite = xfs_vm_page_mkwrite, -#ifdef HAVE_VMOP_MPROTECT - .mprotect = xfs_vm_mprotect, -#endif -}; -#endif /* CONFIG_XFS_DMAPI */ From owner-xfs@oss.sgi.com Thu Feb 7 20:42:52 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 20:42:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m184gppD001749 for ; Thu, 7 Feb 2008 20:42:52 -0800 X-ASG-Debug-ID: 1202445782-348701e50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id F0B4E5B3128 for ; Thu, 7 Feb 2008 20:43:03 -0800 (PST) 
Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id KXJfgdLPqBowpzC6 for ; Thu, 07 Feb 2008 20:43:03 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m184gvF3015052 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Fri, 8 Feb 2008 05:42:57 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m184gvZ5015050 for xfs@oss.sgi.com; Fri, 8 Feb 2008 05:42:57 +0100 Date: Fri, 8 Feb 2008 05:42:57 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH, mainline-only] remove sendfile leftovers Subject: [PATCH, mainline-only] remove sendfile leftovers Message-ID: <20080208044256.GA15013@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1202445791 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41668 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5733/Thu Feb 7 17:26:32 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14372 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Remove the last sendfile leftovers in mainline. This code is already gone in CVS. 
Signed-off-by: Christoph Hellwig diff -uNr -Xdontdiff -p linux-2.6/fs/xfs/linux-2.6/xfs_lrw.h linux-2.6-xfs/fs/xfs/linux-2.6/xfs_lrw.h --- linux-2.6/fs/xfs/linux-2.6/xfs_lrw.h 2008-02-08 05:19:59.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_lrw.h 2007-10-31 15:08:27.000000000 +0100 @@ -50,7 +50,6 @@ struct xfs_iomap; #define XFS_INVAL_CACHED 18 #define XFS_DIORD_ENTER 19 #define XFS_DIOWR_ENTER 20 -#define XFS_SENDFILE_ENTER 21 #define XFS_WRITEPAGE_ENTER 22 #define XFS_RELEASEPAGE_ENTER 23 #define XFS_INVALIDPAGE_ENTER 24 diff -uNr -Xdontdiff -p linux-2.6/fs/xfs/xfs_vnodeops.h linux-2.6-xfs/fs/xfs/xfs_vnodeops.h --- linux-2.6/fs/xfs/xfs_vnodeops.h 2008-02-08 05:19:59.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.h 2008-02-08 05:20:52.000000000 +0100 @@ -60,9 +60,6 @@ int xfs_ioctl(struct xfs_inode *ip, stru ssize_t xfs_read(struct xfs_inode *ip, struct kiocb *iocb, const struct iovec *iovp, unsigned int segs, loff_t *offset, int ioflags); -ssize_t xfs_sendfile(struct xfs_inode *ip, struct file *filp, - loff_t *offset, int ioflags, size_t count, - read_actor_t actor, void *target); ssize_t xfs_splice_read(struct xfs_inode *ip, struct file *infilp, loff_t *ppos, struct pipe_inode_info *pipe, size_t count, int flags, int ioflags); From owner-xfs@oss.sgi.com Thu Feb 7 22:13:53 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 22:13:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m186DpKY006351 for ; Thu, 7 Feb 2008 22:13:53 -0800 X-ASG-Debug-ID: 1202451247-2e03006c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id E582E5B343F for ; Thu, 
7 Feb 2008 22:14:07 -0800 (PST) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 9CfZryBiAClFfMMw for ; Thu, 07 Feb 2008 22:14:07 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m186E2F3018968 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Fri, 8 Feb 2008 07:14:02 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m186E2LK018966 for xfs@oss.sgi.com; Fri, 8 Feb 2008 07:14:02 +0100 Date: Fri, 8 Feb 2008 07:14:02 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH, mainline-only] allow enabling CONFIG_XFS_DEBUG Subject: [PATCH, mainline-only] allow enabling CONFIG_XFS_DEBUG Message-ID: <20080208061402.GA18924@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1202451251 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41674 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5733/Thu Feb 7 17:26:32 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14374 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Back when I first submitted XFS for mainline inclusion we made the decision that the debug code is far too extensive to be accidentally enabled by users in mainline.
But then again it's often quite useful to track problems down and hacking the makefile all the time is rather annoying. Given all the debug options with even more overhead like lockdep or DEBUG_PAGE_ALLOC users (or rather developers) should know by now what they're doing. Signed-off-by: Christoph Hellwig Index: linux-2.6/fs/xfs/Kconfig =================================================================== --- linux-2.6.orig/fs/xfs/Kconfig 2008-02-08 07:08:09.000000000 +0100 +++ linux-2.6/fs/xfs/Kconfig 2008-02-08 07:08:38.000000000 +0100 @@ -76,3 +76,16 @@ config XFS_RT See the xfs man page in section 5 for additional information. If unsure, say N. + +config XFS_DEBUG + bool "XFS Debugging support (EXPERIMENTAL)" + depends on XFS_FS && EXPERIMENTAL + help + Say Y here to get an XFS build with many debugging features, + including ASSERT checks, function wrappers around macros, + and extra sanity-checking functions in various code paths. + + Note that the resulting code will be HUGE and SLOW, and probably + not useful unless you are debugging a particular problem. + + Say N unless you are an XFS developer, or you play one on TV. 
From owner-xfs@oss.sgi.com Thu Feb 7 22:42:49 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Feb 2008 22:42:54 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m186ghiH007821 for ; Thu, 7 Feb 2008 22:42:48 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA22975; Fri, 8 Feb 2008 17:43:00 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 5F7C358C4C11; Fri, 8 Feb 2008 17:43:00 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 909874 - make inode reclaim synchronise with xfs_iflush_done() Message-Id: <20080208064300.5F7C358C4C11@chook.melbourne.sgi.com> Date: Fri, 8 Feb 2008 17:43:00 +1100 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/5733/Thu Feb 7 17:26:32 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14375 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs make inode reclaim synchronise with xfs_iflush_done() On a forced shutdown, xfs_finish_reclaim() will skip flushing the inode. If the inode flush lock is not already held and there is an outstanding xfs_iflush_done() then we might free the inode prematurely. By acquiring and releasing the flush lock we will synchronise with xfs_iflush_done(). 
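The lock/unlock idea described above can be illustrated with a minimal user-space sketch. This is not the XFS code: it assumes a POSIX semaphore as a stand-in for the inode flush lock, and every name in it (demo_inode, demo_flush_done, demo_reclaim_sync, reclaim_demo) is hypothetical. It only shows that taking and immediately dropping the lock forces the reclaim side to wait until the outstanding completion has finished touching the structure.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inode; none of these names are XFS APIs. */
struct demo_inode {
	sem_t	flush_lock;	/* plays the role of the inode flush lock */
	int	flush_done;	/* set by the completion before it unlocks */
};

/* Plays the role of the flush completion: last touch, then unlock. */
static void *demo_flush_done(void *arg)
{
	struct demo_inode *ip = arg;

	ip->flush_done = 1;
	sem_post(&ip->flush_lock);
	return NULL;
}

/*
 * Reclaim side: acquire and release the flush lock purely to wait for
 * any outstanding completion.  Returns the flush_done value observed
 * while the lock was held.
 */
static int demo_reclaim_sync(struct demo_inode *ip)
{
	int seen;

	sem_wait(&ip->flush_lock);	/* blocks until the completion posts */
	seen = ip->flush_done;
	sem_post(&ip->flush_lock);
	return seen;
}

/* Runs one flush/reclaim cycle and reports what reclaim observed. */
int reclaim_demo(void)
{
	struct demo_inode *ip = malloc(sizeof(*ip));
	pthread_t io;
	int rc, seen;

	assert(ip != NULL);
	ip->flush_done = 0;
	rc = sem_init(&ip->flush_lock, 0, 1);
	assert(rc == 0);

	sem_wait(&ip->flush_lock);	/* "flush" takes the lock, starts I/O */
	rc = pthread_create(&io, NULL, demo_flush_done, ip);
	assert(rc == 0);

	seen = demo_reclaim_sync(ip);	/* cannot return before the post */

	pthread_join(io, NULL);		/* tidy up the helper thread */
	sem_destroy(&ip->flush_lock);
	free(ip);			/* only now is the free safe */
	return seen;
}
```

Calling reclaim_demo() from a main() built with -pthread exercises one cycle. A semaphore is used rather than a pthread mutex because the lock is released by a different thread than the one that acquired it.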
Date: Fri Feb 8 17:40:49 AEDT 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-free Inspected by: dgc Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30468a fs/xfs/xfs_vnodeops.c - 1.731 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.731&r2=text&tr2=1.730&f=h - make inode reclaim synchronise with xfs_iflush_done() From owner-xfs@oss.sgi.com Fri Feb 8 20:12:44 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 20:12:46 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.4 required=5.0 tests=AWL,BAYES_00,WEIRD_PORT autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m194Cgfu025679 for ; Fri, 8 Feb 2008 20:12:44 -0800 X-ASG-Debug-ID: 1202530385-5d0200bd0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 240C55B9B2A for ; Fri, 8 Feb 2008 20:13:05 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id 9IGZrKOlTZOhW8kh for ; Fri, 08 Feb 2008 20:13:05 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 0819318DE2F53; Fri, 8 Feb 2008 22:13:04 -0600 (CST) Message-ID: <47AD284F.7080603@sandeen.net> Date: Fri, 08 Feb 2008 22:13:03 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Lachlan McIlroy CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [GIT PULL] XFS update for 2.6.25 Subject: Re: [GIT PULL] XFS update for 2.6.25 References: 
<20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> In-Reply-To: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202530386 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41762 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5752/Fri Feb 8 18:57:23 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14376 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Lachlan McIlroy wrote: > Please pull from the for-linus branch: > git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus > > This will update the following files: > > fs/xfs/Makefile-linux-2.6 | 1 - Is there a reason the other various makefile updates still haven't been pushed? They're a lot tidier now, and they facilitate out-of-tree building... Thanks, -Eric Remove Makefile wrappers in XFS Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will really still compile on 2.4; so drop that. This moves Makefile-linux-26 into Makefile and drops Kbuild. Also having wrappers as both Kbuild and Makefile seemed redundant anyways. The patch is relatively large because it renames a file, but no functional changes. Signed-off-by: Andi Kleen Merge of xfs-linux-melb:xfs-kern:29781a by kenmcd. Remove Makefile wrappers in XFS. Fix up xfs out-of-tree builds. (a.k.a. 
external modules) Change -I include directives to find headers in the out-of-tree spot. This allows a directory containing only xfs files to be built as: # make -C /path/to/kernel M=`pwd` Signed-off-by: Eric Sandeen Merge of xfs-linux-melb:xfs-kern:29878a by kenmcd. fix up out-of-tree xfs builds. From owner-xfs@oss.sgi.com Fri Feb 8 20:56:26 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 20:56:29 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_00,WEIRD_PORT autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m194uO4x032134 for ; Fri, 8 Feb 2008 20:56:26 -0800 X-ASG-Debug-ID: 1202533007-5a1201bf0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 152EA5B9BA7; Fri, 8 Feb 2008 20:56:47 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id lb5JBB56wqGdC9ND; Fri, 08 Feb 2008 20:56:47 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JNhld-0001Th-Us; Sat, 09 Feb 2008 04:56:45 +0000 Date: Fri, 8 Feb 2008 23:56:45 -0500 From: Christoph Hellwig To: Eric Sandeen Cc: Lachlan McIlroy , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [GIT PULL] XFS update for 2.6.25 Subject: Re: [GIT PULL] XFS update for 2.6.25 Message-ID: <20080209045645.GB1428@infradead.org> References: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> <47AD284F.7080603@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47AD284F.7080603@sandeen.net> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See 
http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202533008 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41766 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5752/Fri Feb 8 18:57:23 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14377 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Fri, Feb 08, 2008 at 10:13:03PM -0600, Eric Sandeen wrote: > Lachlan McIlroy wrote: > > Please pull from the for-linus branch: > > git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus > > > > This will update the following files: > > > > fs/xfs/Makefile-linux-2.6 | 1 - > > Is there a reason the other various makefile updates still haven't been > pushed? They're a lot tidier now, and they facilitate out-of-tree > building... Well, the makefiles are pretty different for CVS vs mainline to modular quota and dmapi. I'm thinking about doing a proof of concept modular quota patch for mainline and if it doesn't get too ugly that would mean the makefiles are a lot more in sync. 
From owner-xfs@oss.sgi.com Fri Feb 8 20:58:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 20:58:37 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.4 required=5.0 tests=AWL,BAYES_00,WEIRD_PORT autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m194wWv3032440 for ; Fri, 8 Feb 2008 20:58:34 -0800 X-ASG-Debug-ID: 1202533134-5ae101940000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B5CB75B9BB1 for ; Fri, 8 Feb 2008 20:58:55 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id HU75IyuXg2VTT2Z2 for ; Fri, 08 Feb 2008 20:58:55 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3427618DE2F51; Fri, 8 Feb 2008 22:58:54 -0600 (CST) Message-ID: <47AD330D.3010603@sandeen.net> Date: Fri, 08 Feb 2008 22:58:53 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Christoph Hellwig CC: Lachlan McIlroy , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [GIT PULL] XFS update for 2.6.25 Subject: Re: [GIT PULL] XFS update for 2.6.25 References: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> <47AD284F.7080603@sandeen.net> <20080209045645.GB1428@infradead.org> In-Reply-To: <20080209045645.GB1428@infradead.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202533135 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41766 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5752/Fri Feb 8 18:57:23 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14378 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Christoph Hellwig wrote: > On Fri, Feb 08, 2008 at 10:13:03PM -0600, Eric Sandeen wrote: >> Lachlan McIlroy wrote: >>> Please pull from the for-linus branch: >>> git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus >>> >>> This will update the following files: >>> >>> fs/xfs/Makefile-linux-2.6 | 1 - >> Is there a reason the other various makefile updates still haven't been >> pushed? They're a lot tidier now, and they facilitate out-of-tree >> building... > > Well, the makefiles are pretty different for CVS vs mainline to modular > quota and dmapi. I'm thinking about doing a proof of concept modular > quota patch for mainline and if it doesn't get too ugly that would > mean the makefiles are a lot more in sync. Even if they differ, they can still get the same basic treatment. I'll make a patch if desired. 
The current kernel.org makefiles are a mess, IMHO :) -Eric From owner-xfs@oss.sgi.com Fri Feb 8 21:02:52 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 21:03:00 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_00,WEIRD_PORT autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1952o9s000548 for ; Fri, 8 Feb 2008 21:02:52 -0800 X-ASG-Debug-ID: 1202533393-634800ad0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C9096DCF73D; Fri, 8 Feb 2008 21:03:13 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id fjmc7ttZBNDmUBB1; Fri, 08 Feb 2008 21:03:13 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JNhrs-0001bs-EG; Sat, 09 Feb 2008 05:03:12 +0000 Date: Sat, 9 Feb 2008 00:03:12 -0500 From: Christoph Hellwig To: Eric Sandeen Cc: Christoph Hellwig , Lachlan McIlroy , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [GIT PULL] XFS update for 2.6.25 Subject: Re: [GIT PULL] XFS update for 2.6.25 Message-ID: <20080209050312.GC1428@infradead.org> References: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> <47AD284F.7080603@sandeen.net> <20080209045645.GB1428@infradead.org> <47AD330D.3010603@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47AD330D.3010603@sandeen.net> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202533393 X-Barracuda-Bayes: INNOCENT 
GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41765 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5752/Fri Feb 8 18:57:23 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14379 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Fri, Feb 08, 2008 at 10:58:53PM -0600, Eric Sandeen wrote: > Christoph Hellwig wrote: > > On Fri, Feb 08, 2008 at 10:13:03PM -0600, Eric Sandeen wrote: > >> Lachlan McIlroy wrote: > >>> Please pull from the for-linus branch: > >>> git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus > >>> > >>> This will update the following files: > >>> > >>> fs/xfs/Makefile-linux-2.6 | 1 - > >> Is there a reason the other various makefile updates still haven't been > >> pushed? They're a lot tidier now, and they facilitate out-of-tree > >> building... > > > > Well, the makefiles are pretty different for CVS vs mainline to modular > > quota and dmapi. I'm thinking about doing a proof of concept modular > > quota patch for mainline and if it doesn't get too ugly that would > > mean the makefiles are a lot more in sync. > > Even if they differ, they can still get the same basic treatment. I'll > make a patch if desired. The current kernel.org makefiles are a mess, > IMHO :) They could, but I understand the SGI people fully if they try to touch the things that differ as little as possible. 
But yeah, please send a patch; I did quite a few mainline-only patches yesterday as well :) From owner-xfs@oss.sgi.com Fri Feb 8 21:09:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 21:09:36 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1959HGw001073 for ; Fri, 8 Feb 2008 21:09:18 -0800 X-ASG-Debug-ID: 1202533780-7e7302a00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 14550DCEC19 for ; Fri, 8 Feb 2008 21:09:40 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id 5ihJV7a6SVxoL8n0 for ; Fri, 08 Feb 2008 21:09:40 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id A36B918DE2F53; Fri, 8 Feb 2008 23:09:39 -0600 (CST) Message-ID: <47AD3593.1060700@sandeen.net> Date: Fri, 08 Feb 2008 23:09:39 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Lachlan McIlroy CC: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 1/2] - unwrap makefiles (mainline only) Subject: [PATCH 1/2] - unwrap makefiles (mainline only) References: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> <47AD284F.7080603@sandeen.net> In-Reply-To: <47AD284F.7080603@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202533781 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at
sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41767 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5752/Fri Feb 8 18:57:23 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14380 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs kernel.org git tree "port" of: Remove Makefile wrappers in XFS Makefile (and Kbuild) would include Makefile-linux-26 I doubt XFS will really still compile on 2.4; so drop that. This moves Makefile-linux-26 into Makefile and drops Kbuild. Also having wrappers as both Kbuild and Makefile seemed redundant anyways. The patch is relatively large because it renames a file, but no functional changes. Signed-off-by: Andi Kleen > Merge of xfs-linux-melb:xfs-kern:29781a by kenmcd. Remove Makefile wrappers in XFS. Index: linux-2.6/fs/xfs/Kbuild =================================================================== --- linux-2.6.orig/fs/xfs/Kbuild +++ /dev/null @@ -1,6 +0,0 @@ -# -# The xfs people like to share Makefile with 2.6 and 2.4. -# Utilise file named Kbuild file which has precedence over Makefile. -# - -include $(srctree)/$(obj)/Makefile-linux-2.6 Index: linux-2.6/fs/xfs/Makefile =================================================================== --- linux-2.6.orig/fs/xfs/Makefile +++ linux-2.6/fs/xfs/Makefile @@ -1 +1,117 @@ -include $(TOPDIR)/fs/xfs/Makefile-linux-$(VERSION).$(PATCHLEVEL) +# +# Copyright (c) 2000-2005 Silicon Graphics, Inc. +# All Rights Reserved. 
+# +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License as +# published by the Free Software Foundation. +# +# This program is distributed in the hope that it would be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write the Free Software Foundation, +# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA +# + +EXTRA_CFLAGS += -Ifs/xfs -Ifs/xfs/linux-2.6 -funsigned-char + +XFS_LINUX := linux-2.6 + +ifeq ($(CONFIG_XFS_DEBUG),y) + EXTRA_CFLAGS += -g +endif + +obj-$(CONFIG_XFS_FS) += xfs.o + +xfs-$(CONFIG_XFS_QUOTA) += $(addprefix quota/, \ + xfs_dquot.o \ + xfs_dquot_item.o \ + xfs_trans_dquot.o \ + xfs_qm_syscalls.o \ + xfs_qm_bhv.o \ + xfs_qm.o) + +ifeq ($(CONFIG_XFS_QUOTA),y) +xfs-$(CONFIG_PROC_FS) += quota/xfs_qm_stats.o +endif + +xfs-$(CONFIG_XFS_RT) += xfs_rtalloc.o +xfs-$(CONFIG_XFS_POSIX_ACL) += xfs_acl.o +xfs-$(CONFIG_PROC_FS) += $(XFS_LINUX)/xfs_stats.o +xfs-$(CONFIG_SYSCTL) += $(XFS_LINUX)/xfs_sysctl.o +xfs-$(CONFIG_COMPAT) += $(XFS_LINUX)/xfs_ioctl32.o + + +xfs-y += xfs_alloc.o \ + xfs_alloc_btree.o \ + xfs_attr.o \ + xfs_attr_leaf.o \ + xfs_bit.o \ + xfs_bmap.o \ + xfs_bmap_btree.o \ + xfs_btree.o \ + xfs_buf_item.o \ + xfs_da_btree.o \ + xfs_dir2.o \ + xfs_dir2_block.o \ + xfs_dir2_data.o \ + xfs_dir2_leaf.o \ + xfs_dir2_node.o \ + xfs_dir2_sf.o \ + xfs_error.o \ + xfs_extfree_item.o \ + xfs_filestream.o \ + xfs_fsops.o \ + xfs_ialloc.o \ + xfs_ialloc_btree.o \ + xfs_iget.o \ + xfs_inode.o \ + xfs_inode_item.o \ + xfs_iomap.o \ + xfs_itable.o \ + xfs_dfrag.o \ + xfs_log.o \ + xfs_log_recover.o \ + xfs_mount.o \ + xfs_mru_cache.o \ + xfs_rename.o \ + xfs_trans.o \ + xfs_trans_ail.o \ + xfs_trans_buf.o \ + xfs_trans_extfree.o \ + 
xfs_trans_inode.o \ + xfs_trans_item.o \ + xfs_utils.o \ + xfs_vfsops.o \ + xfs_vnodeops.o \ + xfs_rw.o \ + xfs_dmops.o \ + xfs_qmops.o + +xfs-$(CONFIG_XFS_TRACE) += xfs_dir2_trace.o + +# Objects in linux/ +xfs-y += $(addprefix $(XFS_LINUX)/, \ + kmem.o \ + xfs_aops.o \ + xfs_buf.o \ + xfs_export.o \ + xfs_file.o \ + xfs_fs_subr.o \ + xfs_globals.o \ + xfs_ioctl.o \ + xfs_iops.o \ + xfs_lrw.o \ + xfs_super.o \ + xfs_vnode.o) + +# Objects in support/ +xfs-y += $(addprefix support/, \ + debug.o \ + uuid.o) + +xfs-$(CONFIG_XFS_TRACE) += support/ktrace.o + Index: linux-2.6/fs/xfs/Makefile-linux-2.6 =================================================================== --- linux-2.6.orig/fs/xfs/Makefile-linux-2.6 +++ /dev/null @@ -1,117 +0,0 @@ -# -# Copyright (c) 2000-2005 Silicon Graphics, Inc. -# All Rights Reserved. -# -# This program is free software; you can redistribute it and/or -# modify it under the terms of the GNU General Public License as -# published by the Free Software Foundation. -# -# This program is distributed in the hope that it would be useful, -# but WITHOUT ANY WARRANTY; without even the implied warranty of -# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -# GNU General Public License for more details. 
-# -# You should have received a copy of the GNU General Public License -# along with this program; if not, write the Free Software Foundation, -# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA -# - -EXTRA_CFLAGS += -Ifs/xfs -Ifs/xfs/linux-2.6 -funsigned-char - -XFS_LINUX := linux-2.6 - -ifeq ($(CONFIG_XFS_DEBUG),y) - EXTRA_CFLAGS += -g -endif - -obj-$(CONFIG_XFS_FS) += xfs.o - -xfs-$(CONFIG_XFS_QUOTA) += $(addprefix quota/, \ - xfs_dquot.o \ - xfs_dquot_item.o \ - xfs_trans_dquot.o \ - xfs_qm_syscalls.o \ - xfs_qm_bhv.o \ - xfs_qm.o) - -ifeq ($(CONFIG_XFS_QUOTA),y) -xfs-$(CONFIG_PROC_FS) += quota/xfs_qm_stats.o -endif - -xfs-$(CONFIG_XFS_RT) += xfs_rtalloc.o -xfs-$(CONFIG_XFS_POSIX_ACL) += xfs_acl.o -xfs-$(CONFIG_PROC_FS) += $(XFS_LINUX)/xfs_stats.o -xfs-$(CONFIG_SYSCTL) += $(XFS_LINUX)/xfs_sysctl.o -xfs-$(CONFIG_COMPAT) += $(XFS_LINUX)/xfs_ioctl32.o - - -xfs-y += xfs_alloc.o \ - xfs_alloc_btree.o \ - xfs_attr.o \ - xfs_attr_leaf.o \ - xfs_bit.o \ - xfs_bmap.o \ - xfs_bmap_btree.o \ - xfs_btree.o \ - xfs_buf_item.o \ - xfs_da_btree.o \ - xfs_dir2.o \ - xfs_dir2_block.o \ - xfs_dir2_data.o \ - xfs_dir2_leaf.o \ - xfs_dir2_node.o \ - xfs_dir2_sf.o \ - xfs_error.o \ - xfs_extfree_item.o \ - xfs_filestream.o \ - xfs_fsops.o \ - xfs_ialloc.o \ - xfs_ialloc_btree.o \ - xfs_iget.o \ - xfs_inode.o \ - xfs_inode_item.o \ - xfs_iomap.o \ - xfs_itable.o \ - xfs_dfrag.o \ - xfs_log.o \ - xfs_log_recover.o \ - xfs_mount.o \ - xfs_mru_cache.o \ - xfs_rename.o \ - xfs_trans.o \ - xfs_trans_ail.o \ - xfs_trans_buf.o \ - xfs_trans_extfree.o \ - xfs_trans_inode.o \ - xfs_trans_item.o \ - xfs_utils.o \ - xfs_vfsops.o \ - xfs_vnodeops.o \ - xfs_rw.o \ - xfs_dmops.o \ - xfs_qmops.o - -xfs-$(CONFIG_XFS_TRACE) += xfs_dir2_trace.o - -# Objects in linux/ -xfs-y += $(addprefix $(XFS_LINUX)/, \ - kmem.o \ - xfs_aops.o \ - xfs_buf.o \ - xfs_export.o \ - xfs_file.o \ - xfs_fs_subr.o \ - xfs_globals.o \ - xfs_ioctl.o \ - xfs_iops.o \ - xfs_lrw.o \ - xfs_super.o \ - xfs_vnode.o) - 
-# Objects in support/ -xfs-y += $(addprefix support/, \ - debug.o \ - uuid.o) - -xfs-$(CONFIG_XFS_TRACE) += support/ktrace.o - From owner-xfs@oss.sgi.com Fri Feb 8 21:13:42 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 21:14:29 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m195DdfO001507 for ; Fri, 8 Feb 2008 21:13:42 -0800 X-ASG-Debug-ID: 1202534042-3d6e035a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3FDE65B9DE5 for ; Fri, 8 Feb 2008 21:14:02 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id l7RVu0CwNpDMavqs for ; Fri, 08 Feb 2008 21:14:02 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id CD0A418DE2F53; Fri, 8 Feb 2008 23:13:31 -0600 (CST) Message-ID: <47AD367A.8080804@sandeen.net> Date: Fri, 08 Feb 2008 23:13:30 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Lachlan McIlroy CC: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 2/2] fix up out of tree builds (mainline only) Subject: [PATCH 2/2] fix up out of tree builds (mainline only) References: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> <47AD284F.7080603@sandeen.net> In-Reply-To: <47AD284F.7080603@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202534043 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41766 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5752/Fri Feb 8 18:57:23 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14381 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs kernel.org git tree "port" of: Fix up xfs out-of-tree builds. (a.k.a. external modules) Change -I include directives to find headers in the out-of-tree spot. This allows a directory containing only xfs files to be built as: # make -C /path/to/kernel M=`pwd` Signed-off-by: Eric Sandeen > Merge of xfs-linux-melb:xfs-kern:29878a by kenmcd. fix up out-of-tree xfs builds. 
Index: linux-2.6/fs/xfs/Makefile =================================================================== --- linux-2.6.orig/fs/xfs/Makefile +++ linux-2.6/fs/xfs/Makefile @@ -16,7 +16,7 @@ # Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA # -EXTRA_CFLAGS += -Ifs/xfs -Ifs/xfs/linux-2.6 -funsigned-char +EXTRA_CFLAGS += -I$(src) -I$(src)/linux-2.6 -funsigned-char XFS_LINUX := linux-2.6 From owner-xfs@oss.sgi.com Fri Feb 8 21:29:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 21:29:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m195THB1002869 for ; Fri, 8 Feb 2008 21:29:18 -0800 X-ASG-Debug-ID: 1202534979-633a011a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 18CABDCF72B; Fri, 8 Feb 2008 21:29:40 -0800 (PST) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id DBXPZe2CSK9Bd0Hp; Fri, 08 Feb 2008 21:29:40 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m195TVF3026634 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Sat, 9 Feb 2008 06:29:31 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m195TUvH026632; Sat, 9 Feb 2008 06:29:30 +0100 Date: Sat, 9 Feb 2008 06:29:30 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com, xaiki@sgi.com X-ASG-Orig-Subj: mod xfs-linux-melb:xfs-kern:30462a Subject: mod xfs-linux-melb:xfs-kern:30462a Message-ID: <20080209052930.GA26550@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: 
Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1202534981 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41767 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5753/Fri Feb 8 19:34:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14382 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Looks like the take message for this one didn't make it out to the list. Please fix up the indentation for the 'return error' added there; it needs one more level of indentation. Note to all the new xfs hackers: please make sure your take messages get out to xfs@oss.sgi.com. It would also be very nice if review requests would continue to go to this list as well.
From owner-xfs@oss.sgi.com Fri Feb 8 22:09:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 22:09:15 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-3.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1969BC7006141 for ; Fri, 8 Feb 2008 22:09:12 -0800 X-ASG-Debug-ID: 1202537373-5a0e02b60000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C446F5B9E4B for ; Fri, 8 Feb 2008 22:09:34 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com with ESMTP id qAkjqTLZJdcy2oyS for ; Fri, 08 Feb 2008 22:09:34 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m195jsUY028495 for ; Sat, 9 Feb 2008 00:45:54 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m195jstU007078 for ; Sat, 9 Feb 2008 00:45:54 -0500 Received: from Liberator.local (sebastian-int.corp.redhat.com [172.16.52.221]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m195jrh1027192 for ; Sat, 9 Feb 2008 00:45:53 -0500 Message-ID: <47AD3E11.7020608@redhat.com> Date: Fri, 08 Feb 2008 23:45:53 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: xfs-oss X-ASG-Orig-Subj: [PATCH] recover from iclog allocation failures Subject: [PATCH] recover from iclog allocation failures Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Barracuda-Connect: mx1.redhat.com[66.187.233.31] X-Barracuda-Start-Time: 1202537374 
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41770 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5753/Fri Feb 8 19:34:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14383 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@redhat.com Precedence: bulk X-list: xfs A user in #xfs had some strange thing hogging up vmalloc space, and after mounting several xfs filesystems with aggressive log memory usage, started hitting vmalloc failures which led to an oops. I inserted a fake failure at i=3 in the iclog alloc loop, and this patch let me exit with a graceful "ENOMEM" instead of an oops. Also, somehow the use of "uuid_mounted" has gone stale; after the graceful mount failure, I got dup uuid errors on the next mount. This patch fixes that problem as well. Signed-off-by: Eric Sandeen --- Index: linux-2.6.24.noarch/fs/xfs/xfs_log.c =================================================================== --- linux-2.6.24.noarch.orig/fs/xfs/xfs_log.c +++ linux-2.6.24.noarch/fs/xfs/xfs_log.c @@ -513,6 +513,8 @@ xfs_log_mount(xfs_mount_t *mp, } mp->m_log = xlog_alloc_log(mp, log_target, blk_offset, num_bblks); + if (!mp->m_log) + return ENOMEM; /* * skip log recovery on a norecovery mount. 
pretend it all @@ -1219,6 +1221,13 @@ xlog_alloc_log(xfs_mount_t *mp, prev_iclog = iclog; bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp); + if (!iclog || !bp) { + if (iclog) + kmem_free(iclog, sizeof(xlog_in_core_t)); + log->l_iclog_bufs = i; + xlog_dealloc_log(log); + return NULL; + } if (!XFS_BUF_CPSEMA(bp)) ASSERT(0); XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); Index: linux-2.6.24.noarch/fs/xfs/xfs_mount.c =================================================================== --- linux-2.6.24.noarch.orig/fs/xfs/xfs_mount.c +++ linux-2.6.24.noarch/fs/xfs/xfs_mount.c @@ -1007,6 +1007,7 @@ xfs_mountfs( error = XFS_ERROR(EINVAL); goto error1; } + uuid_mounted = 1; } /* From owner-xfs@oss.sgi.com Fri Feb 8 22:14:56 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 22:15:00 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_45, J_CHICKENPOX_63,J_CHICKENPOX_64 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m196ErX8006954 for ; Fri, 8 Feb 2008 22:14:56 -0800 X-ASG-Debug-ID: 1202537714-4e11004c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7AFED5B9E63 for ; Fri, 8 Feb 2008 22:15:14 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id 7adBOhtTebN1FR31 for ; Fri, 08 Feb 2008 22:15:14 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 584DE18004487 for ; Sat, 9 Feb 2008 00:14:43 -0600 (CST) Message-ID: <47AD44D3.4060503@sandeen.net> Date: Sat, 09 Feb 2008 00:14:43 -0600 From: Eric 
Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: xfs-oss X-ASG-Orig-Subj: [PATCH] remove shouting-indirection macros from xfs_sb.h Subject: [PATCH] remove shouting-indirection macros from xfs_sb.h Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202537715 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41770 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5753/Fri Feb 8 19:34:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14384 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Some day I'll get them all... Signed-off-by: Eric Sandeen --- Index: linux-2.6/fs/xfs/linux-2.6/xfs_ioctl.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_ioctl.c +++ linux-2.6/fs/xfs/linux-2.6/xfs_ioctl.c @@ -732,7 +732,7 @@ xfs_ioctl( * Only allow the sys admin to reserve space unless * unwritten extents are enabled. 
*/ - if (!XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) && + if (!xfs_sb_version_hasextflgbit(&mp->m_sb) && !capable(CAP_SYS_ADMIN)) return -EPERM; Index: linux-2.6/fs/xfs/quota/xfs_qm.c =================================================================== --- linux-2.6.orig/fs/xfs/quota/xfs_qm.c +++ linux-2.6/fs/xfs/quota/xfs_qm.c @@ -1405,13 +1405,13 @@ xfs_qm_qino_alloc( #if defined(DEBUG) && defined(XFS_LOUD_RECOVERY) unsigned oldv = mp->m_sb.sb_versionnum; #endif - ASSERT(!XFS_SB_VERSION_HASQUOTA(&mp->m_sb)); + ASSERT(!xfs_sb_version_hasquota(&mp->m_sb)); ASSERT((sbfields & (XFS_SB_VERSIONNUM | XFS_SB_UQUOTINO | XFS_SB_GQUOTINO | XFS_SB_QFLAGS)) == (XFS_SB_VERSIONNUM | XFS_SB_UQUOTINO | XFS_SB_GQUOTINO | XFS_SB_QFLAGS)); - XFS_SB_VERSION_ADDQUOTA(&mp->m_sb); + xfs_sb_version_addquota(&mp->m_sb); mp->m_sb.sb_uquotino = NULLFSINO; mp->m_sb.sb_gquotino = NULLFSINO; @@ -1954,7 +1954,7 @@ xfs_qm_init_quotainos( /* * Get the uquota and gquota inodes */ - if (XFS_SB_VERSION_HASQUOTA(&mp->m_sb)) { + if (xfs_sb_version_hasquota(&mp->m_sb)) { if (XFS_IS_UQUOTA_ON(mp) && mp->m_sb.sb_uquotino != NULLFSINO) { ASSERT(mp->m_sb.sb_uquotino > 0); Index: linux-2.6/fs/xfs/quota/xfs_qm_bhv.c =================================================================== --- linux-2.6.orig/fs/xfs/quota/xfs_qm_bhv.c +++ linux-2.6/fs/xfs/quota/xfs_qm_bhv.c @@ -118,7 +118,7 @@ xfs_qm_newmount( *quotaflags = 0; *needquotamount = B_FALSE; - quotaondisk = XFS_SB_VERSION_HASQUOTA(&mp->m_sb) && + quotaondisk = xfs_sb_version_hasquota(&mp->m_sb) && (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_ACCT); if (quotaondisk) { Index: linux-2.6/fs/xfs/quota/xfs_qm_syscalls.c =================================================================== --- linux-2.6.orig/fs/xfs/quota/xfs_qm_syscalls.c +++ linux-2.6/fs/xfs/quota/xfs_qm_syscalls.c @@ -377,7 +377,7 @@ xfs_qm_scall_trunc_qfiles( if (!capable(CAP_SYS_ADMIN)) return XFS_ERROR(EPERM); error = 0; - if (!XFS_SB_VERSION_HASQUOTA(&mp->m_sb) || flags == 0) { + if 
(!xfs_sb_version_hasquota(&mp->m_sb) || flags == 0) { qdprintk("qtrunc flags=%x m_qflags=%x\n", flags, mp->m_qflags); return XFS_ERROR(EINVAL); } @@ -522,7 +522,7 @@ xfs_qm_scall_getqstat( memset(out, 0, sizeof(fs_quota_stat_t)); out->qs_version = FS_QSTAT_VERSION; - if (! XFS_SB_VERSION_HASQUOTA(&mp->m_sb)) { + if (! xfs_sb_version_hasquota(&mp->m_sb)) { out->qs_uquota.qfs_ino = NULLFSINO; out->qs_gquota.qfs_ino = NULLFSINO; return (0); Index: linux-2.6/fs/xfs/xfs_attr_leaf.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_attr_leaf.c +++ linux-2.6/fs/xfs/xfs_attr_leaf.c @@ -227,10 +227,10 @@ STATIC void xfs_sbversion_add_attr2(xfs_mount_t *mp, xfs_trans_t *tp) { if ((mp->m_flags & XFS_MOUNT_ATTR2) && - !(XFS_SB_VERSION_HASATTR2(&mp->m_sb))) { + !(xfs_sb_version_hasattr2(&mp->m_sb))) { spin_lock(&mp->m_sb_lock); - if (!XFS_SB_VERSION_HASATTR2(&mp->m_sb)) { - XFS_SB_VERSION_ADDATTR2(&mp->m_sb); + if (!xfs_sb_version_hasattr2(&mp->m_sb)) { + xfs_sb_version_addattr2(&mp->m_sb); spin_unlock(&mp->m_sb_lock); xfs_mod_sb(tp, XFS_SB_VERSIONNUM | XFS_SB_FEATURES2); } else Index: linux-2.6/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_bmap.c +++ linux-2.6/fs/xfs/xfs_bmap.c @@ -4047,17 +4047,17 @@ xfs_bmap_add_attrfork( xfs_trans_log_inode(tp, ip, logflags); if (error) goto error2; - if (!XFS_SB_VERSION_HASATTR(&mp->m_sb) || - (!XFS_SB_VERSION_HASATTR2(&mp->m_sb) && version == 2)) { + if (!xfs_sb_version_hasattr(&mp->m_sb) || + (!xfs_sb_version_hasattr2(&mp->m_sb) && version == 2)) { __int64_t sbfields = 0; spin_lock(&mp->m_sb_lock); - if (!XFS_SB_VERSION_HASATTR(&mp->m_sb)) { - XFS_SB_VERSION_ADDATTR(&mp->m_sb); + if (!xfs_sb_version_hasattr(&mp->m_sb)) { + xfs_sb_version_addattr(&mp->m_sb); sbfields |= XFS_SB_VERSIONNUM; } - if (!XFS_SB_VERSION_HASATTR2(&mp->m_sb) && version == 2) { - XFS_SB_VERSION_ADDATTR2(&mp->m_sb); + if 
(!xfs_sb_version_hasattr2(&mp->m_sb) && version == 2) { + xfs_sb_version_addattr2(&mp->m_sb); sbfields |= (XFS_SB_VERSIONNUM | XFS_SB_FEATURES2); } if (sbfields) { @@ -5043,7 +5043,7 @@ xfs_bmapi( * A wasdelay extent has been initialized, so * shouldn't be flagged as unwritten. */ - if (wr && XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + if (wr && xfs_sb_version_hasextflgbit(&mp->m_sb)) { if (!wasdelay && (flags & XFS_BMAPI_PREALLOC)) got.br_state = XFS_EXT_UNWRITTEN; } @@ -5483,7 +5483,7 @@ xfs_bunmapi( * get rid of part of a realtime extent. */ if (del.br_state == XFS_EXT_UNWRITTEN || - !XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + !xfs_sb_version_hasextflgbit(&mp->m_sb)) { /* * This piece is unwritten, or we're not * using unwritten extents. Skip over it. @@ -5535,7 +5535,7 @@ xfs_bunmapi( } else if ((del.br_startoff == start && (del.br_state == XFS_EXT_UNWRITTEN || xfs_trans_get_block_res(tp) == 0)) || - !XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + !xfs_sb_version_hasextflgbit(&mp->m_sb)) { /* * Can't make it unwritten. There isn't * a full extent here so just skip it. Index: linux-2.6/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_bmap_btree.h +++ linux-2.6/fs/xfs/xfs_bmap_btree.h @@ -120,7 +120,7 @@ typedef enum { * Extent state and extent format macros. */ #define XFS_EXTFMT_INODE(x) \ - (XFS_SB_VERSION_HASEXTFLGBIT(&((x)->i_mount->m_sb)) ? \ + (xfs_sb_version_hasextflgbit(&((x)->i_mount->m_sb)) ? 
\ XFS_EXTFMT_HASSTATE : XFS_EXTFMT_NOSTATE) #define ISUNWRITTEN(x) ((x)->br_state == XFS_EXT_UNWRITTEN) Index: linux-2.6/fs/xfs/xfs_dir2.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_dir2.c +++ linux-2.6/fs/xfs/xfs_dir2.c @@ -49,7 +49,7 @@ void xfs_dir_mount( xfs_mount_t *mp) { - ASSERT(XFS_SB_VERSION_HASDIRV2(&mp->m_sb)); + ASSERT(xfs_sb_version_hasdirv2(&mp->m_sb)); ASSERT((1 << (mp->m_sb.sb_blocklog + mp->m_sb.sb_dirblklog)) <= XFS_MAX_BLOCKSIZE); mp->m_dirblksize = 1 << (mp->m_sb.sb_blocklog + mp->m_sb.sb_dirblklog); Index: linux-2.6/fs/xfs/xfs_fsops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_fsops.c +++ linux-2.6/fs/xfs/xfs_fsops.c @@ -77,36 +77,36 @@ xfs_fs_geometry( if (new_version >= 3) { geo->version = XFS_FSOP_GEOM_VERSION; geo->flags = - (XFS_SB_VERSION_HASATTR(&mp->m_sb) ? + (xfs_sb_version_hasattr(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_ATTR : 0) | - (XFS_SB_VERSION_HASNLINK(&mp->m_sb) ? + (xfs_sb_version_hasnlink(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_NLINK : 0) | - (XFS_SB_VERSION_HASQUOTA(&mp->m_sb) ? + (xfs_sb_version_hasquota(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_QUOTA : 0) | - (XFS_SB_VERSION_HASALIGN(&mp->m_sb) ? + (xfs_sb_version_hasalign(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_IALIGN : 0) | - (XFS_SB_VERSION_HASDALIGN(&mp->m_sb) ? + (xfs_sb_version_hasdalign(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_DALIGN : 0) | - (XFS_SB_VERSION_HASSHARED(&mp->m_sb) ? + (xfs_sb_version_hasshared(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_SHARED : 0) | - (XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) ? + (xfs_sb_version_hasextflgbit(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_EXTFLG : 0) | - (XFS_SB_VERSION_HASDIRV2(&mp->m_sb) ? + (xfs_sb_version_hasdirv2(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_DIRV2 : 0) | - (XFS_SB_VERSION_HASSECTOR(&mp->m_sb) ? + (xfs_sb_version_hassector(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_SECTOR : 0) | (xfs_sb_version_haslazysbcount(&mp->m_sb) ? 
XFS_FSOP_GEOM_FLAGS_LAZYSB : 0) | - (XFS_SB_VERSION_HASATTR2(&mp->m_sb) ? + (xfs_sb_version_hasattr2(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_ATTR2 : 0); - geo->logsectsize = XFS_SB_VERSION_HASSECTOR(&mp->m_sb) ? + geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ? mp->m_sb.sb_logsectsize : BBSIZE; geo->rtsectsize = mp->m_sb.sb_blocksize; geo->dirblocksize = mp->m_dirblksize; } if (new_version >= 4) { geo->flags |= - (XFS_SB_VERSION_HASLOGV2(&mp->m_sb) ? + (xfs_sb_version_haslogv2(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_LOGV2 : 0); geo->logsunit = mp->m_sb.sb_logsunit; } Index: linux-2.6/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_ialloc.c +++ linux-2.6/fs/xfs/xfs_ialloc.c @@ -191,7 +191,7 @@ xfs_ialloc_ag_alloc( ASSERT(!(args.mp->m_flags & XFS_MOUNT_NOALIGN)); args.alignment = args.mp->m_dalign; isaligned = 1; - } else if (XFS_SB_VERSION_HASALIGN(&args.mp->m_sb) && + } else if (xfs_sb_version_hasalign(&args.mp->m_sb) && args.mp->m_sb.sb_inoalignmt >= XFS_B_TO_FSBT(args.mp, XFS_INODE_CLUSTER_SIZE(args.mp))) @@ -230,7 +230,7 @@ xfs_ialloc_ag_alloc( args.agbno = be32_to_cpu(agi->agi_root); args.fsbno = XFS_AGB_TO_FSB(args.mp, be32_to_cpu(agi->agi_seqno), args.agbno); - if (XFS_SB_VERSION_HASALIGN(&args.mp->m_sb) && + if (xfs_sb_version_hasalign(&args.mp->m_sb) && args.mp->m_sb.sb_inoalignmt >= XFS_B_TO_FSBT(args.mp, XFS_INODE_CLUSTER_SIZE(args.mp))) args.alignment = args.mp->m_sb.sb_inoalignmt; @@ -271,7 +271,7 @@ xfs_ialloc_ag_alloc( * use the old version so that old kernels will continue to be * able to use the file system. 
*/ - if (XFS_SB_VERSION_HASNLINK(&args.mp->m_sb)) + if (xfs_sb_version_hasnlink(&args.mp->m_sb)) version = XFS_DINODE_VERSION_2; else version = XFS_DINODE_VERSION_1; Index: linux-2.6/fs/xfs/xfs_inode.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_inode.c +++ linux-2.6/fs/xfs/xfs_inode.c @@ -1147,7 +1147,7 @@ xfs_ialloc( * the inode version number now. This way we only do the conversion * here rather than here and in the flush/logging code. */ - if (XFS_SB_VERSION_HASNLINK(&tp->t_mountp->m_sb) && + if (xfs_sb_version_hasnlink(&tp->t_mountp->m_sb) && ip->i_d.di_version == XFS_DINODE_VERSION_1) { ip->i_d.di_version = XFS_DINODE_VERSION_2; /* @@ -3434,9 +3434,9 @@ xfs_iflush_int( * has been updated, then make the conversion permanent. */ ASSERT(ip->i_d.di_version == XFS_DINODE_VERSION_1 || - XFS_SB_VERSION_HASNLINK(&mp->m_sb)); + xfs_sb_version_hasnlink(&mp->m_sb)); if (ip->i_d.di_version == XFS_DINODE_VERSION_1) { - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { /* * Convert it back. */ Index: linux-2.6/fs/xfs/xfs_inode_item.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_inode_item.c +++ linux-2.6/fs/xfs/xfs_inode_item.c @@ -296,9 +296,9 @@ xfs_inode_item_format( */ mp = ip->i_mount; ASSERT(ip->i_d.di_version == XFS_DINODE_VERSION_1 || - XFS_SB_VERSION_HASNLINK(&mp->m_sb)); + xfs_sb_version_hasnlink(&mp->m_sb)); if (ip->i_d.di_version == XFS_DINODE_VERSION_1) { - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { /* * Convert it back. 
*/ Index: linux-2.6/fs/xfs/xfs_itable.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_itable.c +++ linux-2.6/fs/xfs/xfs_itable.c @@ -45,7 +45,7 @@ xfs_internal_inum( xfs_ino_t ino) { return (ino == mp->m_sb.sb_rbmino || ino == mp->m_sb.sb_rsumino || - (XFS_SB_VERSION_HASQUOTA(&mp->m_sb) && + (xfs_sb_version_hasquota(&mp->m_sb) && (ino == mp->m_sb.sb_uquotino || ino == mp->m_sb.sb_gquotino))); } Index: linux-2.6/fs/xfs/xfs_log.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_log.c +++ linux-2.6/fs/xfs/xfs_log.c @@ -1090,7 +1090,7 @@ xlog_get_iclog_buffer_size(xfs_mount_t * size >>= 1; } - if (XFS_SB_VERSION_HASLOGV2(&mp->m_sb)) { + if (xfs_sb_version_haslogv2(&mp->m_sb)) { /* # headers = size / 32K * one header holds cycles from 32K of data */ @@ -1186,13 +1186,13 @@ xlog_alloc_log(xfs_mount_t *mp, log->l_grant_reserve_cycle = 1; log->l_grant_write_cycle = 1; - if (XFS_SB_VERSION_HASSECTOR(&mp->m_sb)) { + if (xfs_sb_version_hassector(&mp->m_sb)) { log->l_sectbb_log = mp->m_sb.sb_logsectlog - BBSHIFT; ASSERT(log->l_sectbb_log <= mp->m_sectbb_log); /* for larger sector sizes, must have v2 or external log */ ASSERT(log->l_sectbb_log == 0 || log->l_logBBstart == 0 || - XFS_SB_VERSION_HASLOGV2(&mp->m_sb)); + xfs_sb_version_haslogv2(&mp->m_sb)); ASSERT(mp->m_sb.sb_logsectlog >= BBSHIFT); } log->l_sectbb_mask = (1 << log->l_sectbb_log) - 1; @@ -1247,7 +1247,7 @@ xlog_alloc_log(xfs_mount_t *mp, memset(head, 0, sizeof(xlog_rec_header_t)); head->h_magicno = cpu_to_be32(XLOG_HEADER_MAGIC_NUM); head->h_version = cpu_to_be32( - XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? 2 : 1); + xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? 
2 : 1); head->h_size = cpu_to_be32(log->l_iclog_size); /* new fields */ head->h_fmt = cpu_to_be32(XLOG_FMT); @@ -1402,7 +1402,7 @@ xlog_sync(xlog_t *log, int roundoff; /* roundoff to BB or stripe */ int split = 0; /* split write into two regions */ int error; - int v2 = XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb); + int v2 = xfs_sb_version_haslogv2(&log->l_mp->m_sb); XFS_STATS_INC(xs_log_writes); ASSERT(iclog->ic_refcnt == 0); @@ -2881,7 +2881,7 @@ xlog_state_switch_iclogs(xlog_t *log, log->l_curr_block += BTOBB(eventual_size)+BTOBB(log->l_iclog_hsize); /* Round up to next log-sunit */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) && + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) && log->l_mp->m_sb.sb_logsunit > 1) { __uint32_t sunit_bb = BTOBB(log->l_mp->m_sb.sb_logsunit); log->l_curr_block = roundup(log->l_curr_block, sunit_bb); @@ -3334,7 +3334,7 @@ xlog_ticket_get(xlog_t *log, unit_bytes += sizeof(xlog_op_header_t) * num_headers; /* for roundoff padding for transaction data and one for commit record */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) && + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) && log->l_mp->m_sb.sb_logsunit > 1) { /* log su roundoff */ unit_bytes += 2*log->l_mp->m_sb.sb_logsunit; Index: linux-2.6/fs/xfs/xfs_log_priv.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_log_priv.h +++ linux-2.6/fs/xfs/xfs_log_priv.h @@ -49,10 +49,10 @@ struct xfs_mount; #define XLOG_HEADER_SIZE 512 #define XLOG_REC_SHIFT(log) \ - BTOBB(1 << (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? \ + BTOBB(1 << (xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? \ XLOG_MAX_RECORD_BSHIFT : XLOG_BIG_RECORD_BSHIFT)) #define XLOG_TOTAL_REC_SHIFT(log) \ - BTOBB(XLOG_MAX_ICLOGS << (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? \ + BTOBB(XLOG_MAX_ICLOGS << (xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? 
\ XLOG_MAX_RECORD_BSHIFT : XLOG_BIG_RECORD_BSHIFT)) Index: linux-2.6/fs/xfs/xfs_log_recover.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_log_recover.c +++ linux-2.6/fs/xfs/xfs_log_recover.c @@ -478,7 +478,7 @@ xlog_find_verify_log_record( * reset last_blk. Only when last_blk points in the middle of a log * record do we update last_blk. */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { uint h_size = be32_to_cpu(head->h_size); xhdrs = h_size / XLOG_HEADER_CYCLE_SIZE; @@ -888,7 +888,7 @@ xlog_find_tail( * unmount record if there is one, so we pass the lsn of the * unmount record rather than the block after it. */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { int h_size = be32_to_cpu(rhead->h_size); int h_version = be32_to_cpu(rhead->h_version); @@ -1101,7 +1101,7 @@ xlog_add_record( recp->h_magicno = cpu_to_be32(XLOG_HEADER_MAGIC_NUM); recp->h_cycle = cpu_to_be32(cycle); recp->h_version = cpu_to_be32( - XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? 2 : 1); + xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? 
2 : 1); recp->h_lsn = cpu_to_be64(xlog_assign_lsn(cycle, block)); recp->h_tail_lsn = cpu_to_be64(xlog_assign_lsn(tail_cycle, tail_block)); recp->h_fmt = cpu_to_be32(XLOG_FMT); @@ -3348,7 +3348,7 @@ xlog_pack_data( dp += BBSIZE; } - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { xhdr = (xlog_in_core_2_t *)&iclog->ic_header; for ( ; i < BTOBB(size); i++) { j = i / (XLOG_HEADER_CYCLE_SIZE / BBSIZE); @@ -3388,7 +3388,7 @@ xlog_unpack_data_checksum( be32_to_cpu(rhead->h_chksum), chksum); cmn_err(CE_DEBUG, "XFS: Disregard message if filesystem was created with non-DEBUG kernel"); - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { cmn_err(CE_DEBUG, "XFS: LogR this is a LogV2 filesystem\n"); } @@ -3415,7 +3415,7 @@ xlog_unpack_data( dp += BBSIZE; } - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { xhdr = (xlog_in_core_2_t *)rhead; for ( ; i < BTOBB(be32_to_cpu(rhead->h_len)); i++) { j = i / (XLOG_HEADER_CYCLE_SIZE / BBSIZE); @@ -3494,7 +3494,7 @@ xlog_do_recovery_pass( * Read the header of the tail block and get the iclog buffer size from * h_size. Use this to tell how many sectors make up the log header. */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { /* * When using variable length iclogs, read first sector of * iclog header and extract the header size from it. 
Get a @@ -3838,7 +3838,7 @@ xlog_do_recover( sbp = &log->l_mp->m_sb; xfs_sb_from_disk(sbp, XFS_BUF_TO_SBP(bp)); ASSERT(sbp->sb_magicnum == XFS_SB_MAGIC); - ASSERT(XFS_SB_GOOD_VERSION(sbp)); + ASSERT(xfs_sb_good_version(sbp)); xfs_buf_relse(bp); /* We've re-read the superblock so re-initialize per-cpu counters */ Index: linux-2.6/fs/xfs/xfs_mount.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_mount.c +++ linux-2.6/fs/xfs/xfs_mount.c @@ -225,7 +225,7 @@ xfs_mount_validate_sb( return XFS_ERROR(EWRONGFS); } - if (!XFS_SB_GOOD_VERSION(sbp)) { + if (!xfs_sb_good_version(sbp)) { xfs_fs_mount_cmn_err(flags, "bad version"); return XFS_ERROR(EWRONGFS); } @@ -300,7 +300,7 @@ xfs_mount_validate_sb( /* * Version 1 directory format has never worked on Linux. */ - if (unlikely(!XFS_SB_VERSION_HASDIRV2(sbp))) { + if (unlikely(!xfs_sb_version_hasdirv2(sbp))) { xfs_fs_mount_cmn_err(flags, "file system using version 1 directory format"); return XFS_ERROR(ENOSYS); @@ -781,7 +781,7 @@ xfs_update_alignment(xfs_mount_t *mp, in * Update superblock with new values * and log changes */ - if (XFS_SB_VERSION_HASDALIGN(sbp)) { + if (xfs_sb_version_hasdalign(sbp)) { if (sbp->sb_unit != mp->m_dalign) { sbp->sb_unit = mp->m_dalign; *update_flags |= XFS_SB_UNIT; @@ -792,7 +792,7 @@ xfs_update_alignment(xfs_mount_t *mp, in } } } else if ((mp->m_flags & XFS_MOUNT_NOALIGN) != XFS_MOUNT_NOALIGN && - XFS_SB_VERSION_HASDALIGN(&mp->m_sb)) { + xfs_sb_version_hasdalign(&mp->m_sb)) { mp->m_dalign = sbp->sb_unit; mp->m_swidth = sbp->sb_width; } @@ -869,7 +869,7 @@ xfs_set_rw_sizes(xfs_mount_t *mp) STATIC void xfs_set_inoalignment(xfs_mount_t *mp) { - if (XFS_SB_VERSION_HASALIGN(&mp->m_sb) && + if (xfs_sb_version_hasalign(&mp->m_sb) && mp->m_sb.sb_inoalignmt >= XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size)) mp->m_inoalign_mask = mp->m_sb.sb_inoalignmt - 1; Index: linux-2.6/fs/xfs/xfs_sb.h =================================================================== --- 
linux-2.6.orig/fs/xfs/xfs_sb.h +++ linux-2.6/fs/xfs/xfs_sb.h @@ -271,7 +271,6 @@ typedef enum { #define XFS_SB_VERSION_NUM(sbp) ((sbp)->sb_versionnum & XFS_SB_VERSION_NUMBITS) -#define XFS_SB_GOOD_VERSION(sbp) xfs_sb_good_version(sbp) #ifdef __KERNEL__ static inline int xfs_sb_good_version(xfs_sb_t *sbp) { @@ -297,7 +296,6 @@ static inline int xfs_sb_good_version(xf } #endif /* __KERNEL__ */ -#define XFS_SB_VERSION_TONEW(v) xfs_sb_version_tonew(v) static inline unsigned xfs_sb_version_tonew(unsigned v) { return ((((v) == XFS_SB_VERSION_1) ? \ @@ -308,7 +306,6 @@ static inline unsigned xfs_sb_version_to XFS_SB_VERSION_4); } -#define XFS_SB_VERSION_TOOLD(v) xfs_sb_version_toold(v) static inline unsigned xfs_sb_version_toold(unsigned v) { return (((v) & (XFS_SB_VERSION_QUOTABIT | XFS_SB_VERSION_ALIGNBIT)) ? \ @@ -320,7 +317,6 @@ static inline unsigned xfs_sb_version_to XFS_SB_VERSION_1))); } -#define XFS_SB_VERSION_HASATTR(sbp) xfs_sb_version_hasattr(sbp) static inline int xfs_sb_version_hasattr(xfs_sb_t *sbp) { return ((sbp)->sb_versionnum == XFS_SB_VERSION_2) || \ @@ -329,7 +325,6 @@ static inline int xfs_sb_version_hasattr ((sbp)->sb_versionnum & XFS_SB_VERSION_ATTRBIT)); } -#define XFS_SB_VERSION_ADDATTR(sbp) xfs_sb_version_addattr(sbp) static inline void xfs_sb_version_addattr(xfs_sb_t *sbp) { (sbp)->sb_versionnum = (((sbp)->sb_versionnum == XFS_SB_VERSION_1) ? \ @@ -339,7 +334,6 @@ static inline void xfs_sb_version_addatt (XFS_SB_VERSION_4 | XFS_SB_VERSION_ATTRBIT))); } -#define XFS_SB_VERSION_HASNLINK(sbp) xfs_sb_version_hasnlink(sbp) static inline int xfs_sb_version_hasnlink(xfs_sb_t *sbp) { return ((sbp)->sb_versionnum == XFS_SB_VERSION_3) || \ @@ -347,7 +341,6 @@ static inline int xfs_sb_version_hasnlin ((sbp)->sb_versionnum & XFS_SB_VERSION_NLINKBIT)); } -#define XFS_SB_VERSION_ADDNLINK(sbp) xfs_sb_version_addnlink(sbp) static inline void xfs_sb_version_addnlink(xfs_sb_t *sbp) { (sbp)->sb_versionnum = ((sbp)->sb_versionnum <= XFS_SB_VERSION_2 ? 
\ @@ -355,115 +348,63 @@ static inline void xfs_sb_version_addnli ((sbp)->sb_versionnum | XFS_SB_VERSION_NLINKBIT)); } -#define XFS_SB_VERSION_HASQUOTA(sbp) xfs_sb_version_hasquota(sbp) static inline int xfs_sb_version_hasquota(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_QUOTABIT); } -#define XFS_SB_VERSION_ADDQUOTA(sbp) xfs_sb_version_addquota(sbp) static inline void xfs_sb_version_addquota(xfs_sb_t *sbp) { (sbp)->sb_versionnum = \ (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 ? \ ((sbp)->sb_versionnum | XFS_SB_VERSION_QUOTABIT) : \ - (XFS_SB_VERSION_TONEW((sbp)->sb_versionnum) | \ + (xfs_sb_version_tonew((sbp)->sb_versionnum) | \ XFS_SB_VERSION_QUOTABIT)); } -#define XFS_SB_VERSION_HASALIGN(sbp) xfs_sb_version_hasalign(sbp) static inline int xfs_sb_version_hasalign(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_ALIGNBIT); } -#define XFS_SB_VERSION_SUBALIGN(sbp) xfs_sb_version_subalign(sbp) -static inline void xfs_sb_version_subalign(xfs_sb_t *sbp) -{ - (sbp)->sb_versionnum = \ - XFS_SB_VERSION_TOOLD((sbp)->sb_versionnum & ~XFS_SB_VERSION_ALIGNBIT); -} - -#define XFS_SB_VERSION_HASDALIGN(sbp) xfs_sb_version_hasdalign(sbp) static inline int xfs_sb_version_hasdalign(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_DALIGNBIT); } -#define XFS_SB_VERSION_ADDDALIGN(sbp) xfs_sb_version_adddalign(sbp) -static inline int xfs_sb_version_adddalign(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_DALIGNBIT); -} - -#define XFS_SB_VERSION_HASSHARED(sbp) xfs_sb_version_hasshared(sbp) static inline int xfs_sb_version_hasshared(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_SHAREDBIT); } -#define XFS_SB_VERSION_ADDSHARED(sbp) xfs_sb_version_addshared(sbp) -static inline int 
xfs_sb_version_addshared(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_SHAREDBIT); -} - -#define XFS_SB_VERSION_SUBSHARED(sbp) xfs_sb_version_subshared(sbp) -static inline int xfs_sb_version_subshared(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum & ~XFS_SB_VERSION_SHAREDBIT); -} - -#define XFS_SB_VERSION_HASDIRV2(sbp) xfs_sb_version_hasdirv2(sbp) static inline int xfs_sb_version_hasdirv2(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_DIRV2BIT); } -#define XFS_SB_VERSION_HASLOGV2(sbp) xfs_sb_version_haslogv2(sbp) static inline int xfs_sb_version_haslogv2(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_LOGV2BIT); } -#define XFS_SB_VERSION_HASEXTFLGBIT(sbp) xfs_sb_version_hasextflgbit(sbp) static inline int xfs_sb_version_hasextflgbit(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_EXTFLGBIT); } -#define XFS_SB_VERSION_ADDEXTFLGBIT(sbp) xfs_sb_version_addextflgbit(sbp) -static inline int xfs_sb_version_addextflgbit(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_EXTFLGBIT); -} - -#define XFS_SB_VERSION_SUBEXTFLGBIT(sbp) xfs_sb_version_subextflgbit(sbp) -static inline int xfs_sb_version_subextflgbit(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum & ~XFS_SB_VERSION_EXTFLGBIT); -} - -#define XFS_SB_VERSION_HASSECTOR(sbp) xfs_sb_version_hassector(sbp) static inline int xfs_sb_version_hassector(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_SECTORBIT); } -#define XFS_SB_VERSION_HASMOREBITS(sbp) xfs_sb_version_hasmorebits(sbp) static inline int xfs_sb_version_hasmorebits(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ @@ -476,24 +417,22 @@ 
static inline int xfs_sb_version_hasmore * For example, for a bit defined as XFS_SB_VERSION2_FUNBIT, has a macro: * * SB_VERSION_HASFUNBIT(xfs_sb_t *sbp) - * ((XFS_SB_VERSION_HASMOREBITS(sbp) && + * ((xfs_sb_version_hasmorebits(sbp) && * ((sbp)->sb_features2 & XFS_SB_VERSION2_FUNBIT) */ static inline int xfs_sb_version_haslazysbcount(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_HASMOREBITS(sbp) && \ + return (xfs_sb_version_hasmorebits(sbp) && \ ((sbp)->sb_features2 & XFS_SB_VERSION2_LAZYSBCOUNTBIT)); } -#define XFS_SB_VERSION_HASATTR2(sbp) xfs_sb_version_hasattr2(sbp) static inline int xfs_sb_version_hasattr2(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_HASMOREBITS(sbp)) && \ + return (xfs_sb_version_hasmorebits(sbp)) && \ ((sbp)->sb_features2 & XFS_SB_VERSION2_ATTR2BIT); } -#define XFS_SB_VERSION_ADDATTR2(sbp) xfs_sb_version_addattr2(sbp) static inline void xfs_sb_version_addattr2(xfs_sb_t *sbp) { ((sbp)->sb_versionnum = \ Index: linux-2.6/fs/xfs/xfs_utils.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_utils.c +++ linux-2.6/fs/xfs/xfs_utils.c @@ -339,10 +339,10 @@ xfs_bump_ino_vers2( ip->i_d.di_onlink = 0; memset(&(ip->i_d.di_pad[0]), 0, sizeof(ip->i_d.di_pad)); mp = tp->t_mountp; - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { spin_lock(&mp->m_sb_lock); - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { - XFS_SB_VERSION_ADDNLINK(&mp->m_sb); + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { + xfs_sb_version_addnlink(&mp->m_sb); spin_unlock(&mp->m_sb_lock); xfs_mod_sb(tp, XFS_SB_VERSIONNUM); } else { Index: linux-2.6/fs/xfs/xfs_vfsops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_vfsops.c +++ linux-2.6/fs/xfs/xfs_vfsops.c @@ -349,7 +349,7 @@ xfs_finish_flags( } } - if (XFS_SB_VERSION_HASATTR2(&mp->m_sb)) { + if (xfs_sb_version_hasattr2(&mp->m_sb)) { mp->m_flags |= XFS_MOUNT_ATTR2; } @@ -366,7 +366,7 @@ xfs_finish_flags( * check for 
shared mount. */ if (ap->flags & XFSMNT_SHARED) { - if (!XFS_SB_VERSION_HASSHARED(&mp->m_sb)) + if (!xfs_sb_version_hasshared(&mp->m_sb)) return XFS_ERROR(EINVAL); /* @@ -512,7 +512,7 @@ xfs_mount( if (!error && logdev && logdev != ddev) { unsigned int log_sector_size = BBSIZE; - if (XFS_SB_VERSION_HASSECTOR(&mp->m_sb)) + if (xfs_sb_version_hassector(&mp->m_sb)) log_sector_size = mp->m_sb.sb_logsectsize; error = xfs_setsize_buftarg(mp->m_logdev_targp, mp->m_sb.sb_blocksize, Index: linux-2.6/fs/xfs/xfs_vnodeops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_vnodeops.c +++ linux-2.6/fs/xfs/xfs_vnodeops.c @@ -4132,7 +4132,7 @@ xfs_free_file_space( * actually need to zero the extent edges. Otherwise xfs_bunmapi * will take care of it for us. */ - if (rt && !XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + if (rt && !xfs_sb_version_hasextflgbit(&mp->m_sb)) { nimap = 1; error = xfs_bmapi(NULL, ip, startoffset_fsb, 1, 0, NULL, 0, &imap, &nimap, NULL, NULL); From owner-xfs@oss.sgi.com Fri Feb 8 22:33:39 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 22:33:42 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m196XbHk008618 for ; Fri, 8 Feb 2008 22:33:39 -0800 X-ASG-Debug-ID: 1202538840-2ff7036d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5A4CA5B9EF4 for ; Fri, 8 Feb 2008 22:34:00 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id hm5lgLJ6l3DjaYWJ for ; Fri, 08 Feb 2008 22:34:00 -0800 (PST) Received: from hch by bombadil.infradead.org 
with local (Exim 4.68 #1 (Red Hat Linux)) id 1JNjHF-0001zB-Et; Sat, 09 Feb 2008 06:33:29 +0000 Date: Sat, 9 Feb 2008 01:33:29 -0500 From: Christoph Hellwig To: Eric Sandeen Cc: xfs-oss X-ASG-Orig-Subj: Re: [PATCH] recover from iclog allocation failures Subject: Re: [PATCH] recover from iclog allocation failures Message-ID: <20080209063329.GA6840@infradead.org> References: <47AD3E11.7020608@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47AD3E11.7020608@redhat.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202538841 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41772 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5753/Fri Feb 8 19:34:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14386 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Fri, Feb 08, 2008 at 11:45:53PM -0600, Eric Sandeen wrote: > mp->m_log = xlog_alloc_log(mp, log_target, blk_offset, num_bblks); > + if (!mp->m_log) > + return ENOMEM; Currently there's no allocations in there that should be able to fail. But actually marking these KM_MAYFAIL would be a good idea. 
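[Editorial sketch, not part of the original message.] The error handling under review here is a loop in which either of two allocations can fail. A common way to write this in kernel-style C is goto-based unwinding with a single cleanup path. The following is a minimal userspace sketch under stated assumptions: every demo_* name, the list layout, and the failure-injection hook are hypothetical and are not the real xlog_alloc_log() code or the XFS kmem API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the in-core log structures. */
struct demo_iclog {
	void *buf;
	struct demo_iclog *next;
};

struct demo_log {
	struct demo_iclog *iclogs;
	int niclogs;
};

/* Test hook (assumption): make the Nth allocation fail; 0 disables it. */
static int demo_fail_at;
static int demo_alloc_count;

static void *demo_alloc(size_t size)
{
	if (demo_fail_at && ++demo_alloc_count == demo_fail_at)
		return NULL;		/* simulated allocation failure */
	return calloc(1, size);		/* zeroed, so list pointers start NULL */
}

/* Free every iclog linked into the log so far. */
static void demo_free_iclogs(struct demo_log *log)
{
	struct demo_iclog *ic = log->iclogs;

	while (ic) {
		struct demo_iclog *next = ic->next;

		free(ic->buf);
		free(ic);
		ic = next;
	}
	log->iclogs = NULL;
	log->niclogs = 0;
}

void demo_free_log(struct demo_log *log)
{
	demo_free_iclogs(log);
	free(log);
}

struct demo_log *demo_alloc_log(int count, size_t bufsize)
{
	struct demo_log *log = demo_alloc(sizeof(*log));
	struct demo_iclog **tail;
	int i;

	if (!log)
		goto out;
	tail = &log->iclogs;

	for (i = 0; i < count; i++) {
		struct demo_iclog *ic = demo_alloc(sizeof(*ic));

		if (!ic)		/* checked before the buffer is allocated */
			goto out_free_log;
		ic->buf = demo_alloc(bufsize);
		if (!ic->buf) {
			free(ic);	/* not linked in yet, so free it here */
			goto out_free_log;
		}
		*tail = ic;
		tail = &ic->next;
		log->niclogs++;
	}
	return log;

out_free_log:
	demo_free_log(log);		/* single unwind path for both failures */
out:
	return NULL;
}
```

A caller then only has to test the result for NULL and return an out-of-memory error, which is the shape of the xfs_log_mount() hunk quoted above.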
> @@ -1219,6 +1221,13 @@ xlog_alloc_log(xfs_mount_t *mp, > prev_iclog = iclog; > > bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp); > + if (!iclog || !bp) { > + if (iclog) > + kmem_free(iclog, sizeof(xlog_in_core_t)); > + log->l_iclog_bufs = i; > + xlog_dealloc_log(log); > + return NULL; > + } Please check for iclog being NULL before trying to allocate the buffer, and switch it to KM_MAYFAIL. Given that there are two failing cases now it would make sense to have goto-unwinding here. Also once you touch the memory allocation feel free to remove the useless casts of their return values. > Index: linux-2.6.24.noarch/fs/xfs/xfs_mount.c > =================================================================== > --- linux-2.6.24.noarch.orig/fs/xfs/xfs_mount.c > +++ linux-2.6.24.noarch/fs/xfs/xfs_mount.c > @@ -1007,6 +1007,7 @@ xfs_mountfs( > error = XFS_ERROR(EINVAL); > goto error1; > } > + uuid_mounted = 1; How is this related to the rest of the patch? From owner-xfs@oss.sgi.com Fri Feb 8 22:33:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 22:33:28 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m196XK96008570 for ; Fri, 8 Feb 2008 22:33:25 -0800 X-ASG-Debug-ID: 1202538824-4e1400980000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8FB635B9E9F for ; Fri, 8 Feb 2008 22:33:44 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id cGPFIfJ7wRXLzVRv for ; Fri, 08 Feb 2008 22:33:44 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat
Linux)) id 1JNjHT-0001zV-Ss; Sat, 09 Feb 2008 06:33:43 +0000 Date: Sat, 9 Feb 2008 01:33:43 -0500 From: Christoph Hellwig To: Eric Sandeen Cc: xfs-oss X-ASG-Orig-Subj: Re: [PATCH] remove shouting-indirection macros from xfs_sb.h Subject: Re: [PATCH] remove shouting-indirection macros from xfs_sb.h Message-ID: <20080209063343.GB6840@infradead.org> References: <47AD44D3.4060503@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47AD44D3.4060503@sandeen.net> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202538824 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41772 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5753/Fri Feb 8 19:34:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14385 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Sat, Feb 09, 2008 at 12:14:43AM -0600, Eric Sandeen wrote: > Some day I'll get them all... Nice. 
From owner-xfs@oss.sgi.com Fri Feb 8 22:44:23 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 08 Feb 2008 22:44:30 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m196iLAf009588 for ; Fri, 8 Feb 2008 22:44:23 -0800 X-ASG-Debug-ID: 1202539484-4dfb00e90000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 260D25B9FA4 for ; Fri, 8 Feb 2008 22:44:44 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id NlCvUFAGFoOkqpGh for ; Fri, 08 Feb 2008 22:44:44 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 8ACFE18004487; Sat, 9 Feb 2008 00:44:42 -0600 (CST) Message-ID: <47AD4BD9.5030605@sandeen.net> Date: Sat, 09 Feb 2008 00:44:41 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs-oss X-ASG-Orig-Subj: Re: [PATCH] recover from iclog allocation failures Subject: Re: [PATCH] recover from iclog allocation failures References: <47AD3E11.7020608@redhat.com> <20080209063329.GA6840@infradead.org> In-Reply-To: <20080209063329.GA6840@infradead.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202539485 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of 
TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41772 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5753/Fri Feb 8 19:34:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14387 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Christoph Hellwig wrote: > On Fri, Feb 08, 2008 at 11:45:53PM -0600, Eric Sandeen wrote: >> mp->m_log = xlog_alloc_log(mp, log_target, blk_offset, num_bblks); >> + if (!mp->m_log) >> + return ENOMEM; > > Currently there are no allocations in there that should be able to fail. Well, actually... > But actually marking these KM_MAYFAIL would be a good idea. > >> @@ -1219,6 +1221,13 @@ xlog_alloc_log(xfs_mount_t *mp, >> prev_iclog = iclog; >> >> bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp); This was failing for him, I think. >> + if (!iclog || !bp) { >> + if (iclog) >> + kmem_free(iclog, sizeof(xlog_in_core_t)); >> + log->l_iclog_bufs = i; >> + xlog_dealloc_log(log); >> + return NULL; >> + } > > Please check for iclog being NULL before trying to allocate the buffer, > and switch it to KM_MAYFAIL. Given that there are two failing cases now > it would make sense to have goto-unwinding here. Hmm, yeah, probably so. > Also once you touch the memory allocation feel free to remove the > useless casts of their return values. > >> Index: linux-2.6.24.noarch/fs/xfs/xfs_mount.c >> =================================================================== >> --- linux-2.6.24.noarch.orig/fs/xfs/xfs_mount.c >> +++ linux-2.6.24.noarch/fs/xfs/xfs_mount.c >> @@ -1007,6 +1007,7 @@ xfs_mountfs( >> error = XFS_ERROR(EINVAL); >> goto error1; >> } >> + uuid_mounted = 1; > > How is this related to the rest of the patch? 
if we errored out, we returned from mount w/o taking the uuid back out of the table. -Eric From owner-xfs@oss.sgi.com Sat Feb 9 04:04:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 09 Feb 2008 04:04:25 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: **** X-Spam-Status: No, score=4.3 required=5.0 tests=BAYES_05,FROM_DOMAIN_NOVOWEL, RCVD_BAD_ID autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m19C4CEg002938 for ; Sat, 9 Feb 2008 04:04:16 -0800 X-ASG-Debug-ID: 1202558673-3dbc00210000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from dolly.gnuher.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id EE79BDD01C3; Sat, 9 Feb 2008 04:04:33 -0800 (PST) Received: from dolly.gnuher.de (dolly.gnuher.de [212.227.64.154]) by cuda.sgi.com with ESMTP id CUWsdfMZvAP2NCgj; Sat, 09 Feb 2008 04:04:33 -0800 (PST) Received: from ultimate100.geggus.net ([2001:8d8:81:672::1]) by dolly.gnuher.de with esmtpsa TLS-1.0:RSA_AES_256_CBC_SHA:32 (Exim 4.66 id 1JNoRb-0001Vv-UM); Sat, 09 Feb 2008 13:04:32 +0100 Received: from diesel.geggus.net ([192.168.3.2] ident=mail) by ultimate100.geggus.net with esmtp (Exim 4.66 id 1JNoRW-0006JW-Ug); Sat, 09 Feb 2008 13:04:27 +0100 Received: from sven by diesel.geggus.net with local (Exim 3.36 id 1JNoRW-0001mT-00); Sat, 09 Feb 2008 13:04:26 +0100 Date: Sat, 9 Feb 2008 13:04:24 +0100 From: Sven Geggus To: David Chinner Cc: xfs@oss.sgi.com, Tobias Ulmer , Andrea Perotti X-ASG-Orig-Subj: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Subject: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Message-ID: <20080209120423.GA6699@diesel.geggus.net> References: <20080205052418.GU155259@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20080205052418.GU155259@sgi.com> X-MimeOLE: Produced By Exchange Microsoft V6.6.6 X-Message-Flag: CAUTION: Usage of another Email-Software is highly recommended see http://mozilla.org/thunderbird/ for details. User-Agent: Mutt/1.5.13 (2006-08-11) X-Barracuda-Connect: dolly.gnuher.de[212.227.64.154] X-Barracuda-Start-Time: 1202558674 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.43 X-Barracuda-Spam-Status: No, SCORE=-0.43 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=FROM_DOMAIN_NOVOWEL X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41793 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 1.59 FROM_DOMAIN_NOVOWEL From: domain has series of non-vowel letters X-Virus-Scanned: ClamAV 0.91.2/5754/Sat Feb 9 00:47:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14388 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lists@fuchsschwanzdomain.de Precedence: bulk X-list: xfs David Chinner wrote on Tuesday, 5 February at 06:24: > Can you try the patch attached below Am I correct in assuming that this did not make it into 2.6.24.1? Can we expect this patch to be included in one of the next minor releases? Sven -- The main thing to note is that when you choose open source you don't get a Windows operating system. 
(from http://www.dell.com/ubuntu) /me is giggls@ircnet, http://sven.gegg.us/ on the Web From owner-xfs@oss.sgi.com Sat Feb 9 11:42:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 09 Feb 2008 11:42:30 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m19JgNne003960 for ; Sat, 9 Feb 2008 11:42:25 -0800 X-ASG-Debug-ID: 1202586165-0384017c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id CC0C55BBAEC for ; Sat, 9 Feb 2008 11:42:45 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id DAd4bEdNc6zvEk1f for ; Sat, 09 Feb 2008 11:42:45 -0800 (PST) Received: from Liberator.local (sandeen.net [209.173.210.139]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 80D6E18004487 for ; Sat, 9 Feb 2008 13:42:43 -0600 (CST) Message-ID: <47AE0232.9000002@sandeen.net> Date: Sat, 09 Feb 2008 13:42:42 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: xfs-oss X-ASG-Orig-Subj: [PATCH] remove forward declarations for ioctl helpers; let "noinline" do the work Subject: [PATCH] remove forward declarations for ioctl helpers; let "noinline" do the work Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202586166 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of 
TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41822 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5761/Sat Feb 9 10:02:33 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14389 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs (if this one is too purely cosmetic I won't be offended) The forward declarations for the xfs_ioctl() helpers and the associated comment about gcc behavior really aren't needed; all of these functions are marked STATIC which includes noinline, and the stack usage won't be a problem. This effectively just removes the forward declarations and moves xfs_ioctl() back to the end of the file. Signed-off-by: Eric Sandeen --- xfs_ioctl.c | 563 ++++++++++++++++---------------------- 1 files changed, 255 insertions(+), 308 deletions(-) Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ioctl.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ioctl.c +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ioctl.c @@ -651,314 +651,6 @@ xfs_attrmulti_by_handle( return -error; } -/* prototypes for a few of the stack-hungry cases that have - * their own functions. 
Functions are defined after their use - * so gcc doesn't get fancy and inline them with -03 */ - -STATIC int -xfs_ioc_space( - struct xfs_inode *ip, - struct inode *inode, - struct file *filp, - int flags, - unsigned int cmd, - void __user *arg); - -STATIC int -xfs_ioc_bulkstat( - xfs_mount_t *mp, - unsigned int cmd, - void __user *arg); - -STATIC int -xfs_ioc_fsgeometry_v1( - xfs_mount_t *mp, - void __user *arg); - -STATIC int -xfs_ioc_fsgeometry( - xfs_mount_t *mp, - void __user *arg); - -STATIC int -xfs_ioc_xattr( - xfs_inode_t *ip, - struct file *filp, - unsigned int cmd, - void __user *arg); - -STATIC int -xfs_ioc_fsgetxattr( - xfs_inode_t *ip, - int attr, - void __user *arg); - -STATIC int -xfs_ioc_getbmap( - struct xfs_inode *ip, - int flags, - unsigned int cmd, - void __user *arg); - -STATIC int -xfs_ioc_getbmapx( - struct xfs_inode *ip, - void __user *arg); - -int -xfs_ioctl( - xfs_inode_t *ip, - struct file *filp, - int ioflags, - unsigned int cmd, - void __user *arg) -{ - struct inode *inode = filp->f_path.dentry->d_inode; - xfs_mount_t *mp = ip->i_mount; - int error; - - xfs_itrace_entry(XFS_I(inode)); - switch (cmd) { - - case XFS_IOC_ALLOCSP: - case XFS_IOC_FREESP: - case XFS_IOC_RESVSP: - case XFS_IOC_UNRESVSP: - case XFS_IOC_ALLOCSP64: - case XFS_IOC_FREESP64: - case XFS_IOC_RESVSP64: - case XFS_IOC_UNRESVSP64: - /* - * Only allow the sys admin to reserve space unless - * unwritten extents are enabled. - */ - if (!XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) && - !capable(CAP_SYS_ADMIN)) - return -EPERM; - - return xfs_ioc_space(ip, inode, filp, ioflags, cmd, arg); - - case XFS_IOC_DIOINFO: { - struct dioattr da; - xfs_buftarg_t *target = - XFS_IS_REALTIME_INODE(ip) ? 
- mp->m_rtdev_targp : mp->m_ddev_targp; - - da.d_mem = da.d_miniosz = 1 << target->bt_sshift; - da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1); - - if (copy_to_user(arg, &da, sizeof(da))) - return -XFS_ERROR(EFAULT); - return 0; - } - - case XFS_IOC_FSBULKSTAT_SINGLE: - case XFS_IOC_FSBULKSTAT: - case XFS_IOC_FSINUMBERS: - return xfs_ioc_bulkstat(mp, cmd, arg); - - case XFS_IOC_FSGEOMETRY_V1: - return xfs_ioc_fsgeometry_v1(mp, arg); - - case XFS_IOC_FSGEOMETRY: - return xfs_ioc_fsgeometry(mp, arg); - - case XFS_IOC_GETVERSION: - return put_user(inode->i_generation, (int __user *)arg); - - case XFS_IOC_FSGETXATTR: - return xfs_ioc_fsgetxattr(ip, 0, arg); - case XFS_IOC_FSGETXATTRA: - return xfs_ioc_fsgetxattr(ip, 1, arg); - case XFS_IOC_GETXFLAGS: - case XFS_IOC_SETXFLAGS: - case XFS_IOC_FSSETXATTR: - return xfs_ioc_xattr(ip, filp, cmd, arg); - - case XFS_IOC_FSSETDM: { - struct fsdmidata dmi; - - if (copy_from_user(&dmi, arg, sizeof(dmi))) - return -XFS_ERROR(EFAULT); - - error = xfs_set_dmattrs(ip, dmi.fsd_dmevmask, - dmi.fsd_dmstate); - return -error; - } - - case XFS_IOC_GETBMAP: - case XFS_IOC_GETBMAPA: - return xfs_ioc_getbmap(ip, ioflags, cmd, arg); - - case XFS_IOC_GETBMAPX: - return xfs_ioc_getbmapx(ip, arg); - - case XFS_IOC_FD_TO_HANDLE: - case XFS_IOC_PATH_TO_HANDLE: - case XFS_IOC_PATH_TO_FSHANDLE: - return xfs_find_handle(cmd, arg); - - case XFS_IOC_OPEN_BY_HANDLE: - return xfs_open_by_handle(mp, arg, filp, inode); - - case XFS_IOC_FSSETDM_BY_HANDLE: - return xfs_fssetdm_by_handle(mp, arg, inode); - - case XFS_IOC_READLINK_BY_HANDLE: - return xfs_readlink_by_handle(mp, arg, inode); - - case XFS_IOC_ATTRLIST_BY_HANDLE: - return xfs_attrlist_by_handle(mp, arg, inode); - - case XFS_IOC_ATTRMULTI_BY_HANDLE: - return xfs_attrmulti_by_handle(mp, arg, inode); - - case XFS_IOC_SWAPEXT: { - error = xfs_swapext((struct xfs_swapext __user *)arg); - return -error; - } - - case XFS_IOC_FSCOUNTS: { - xfs_fsop_counts_t out; - - error = xfs_fs_counts(mp, &out); - if 
(error) - return -error; - - if (copy_to_user(arg, &out, sizeof(out))) - return -XFS_ERROR(EFAULT); - return 0; - } - - case XFS_IOC_SET_RESBLKS: { - xfs_fsop_resblks_t inout; - __uint64_t in; - - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - if (copy_from_user(&inout, arg, sizeof(inout))) - return -XFS_ERROR(EFAULT); - - /* input parameter is passed in resblks field of structure */ - in = inout.resblks; - error = xfs_reserve_blocks(mp, &in, &inout); - if (error) - return -error; - - if (copy_to_user(arg, &inout, sizeof(inout))) - return -XFS_ERROR(EFAULT); - return 0; - } - - case XFS_IOC_GET_RESBLKS: { - xfs_fsop_resblks_t out; - - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - error = xfs_reserve_blocks(mp, NULL, &out); - if (error) - return -error; - - if (copy_to_user(arg, &out, sizeof(out))) - return -XFS_ERROR(EFAULT); - - return 0; - } - - case XFS_IOC_FSGROWFSDATA: { - xfs_growfs_data_t in; - - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - if (copy_from_user(&in, arg, sizeof(in))) - return -XFS_ERROR(EFAULT); - - error = xfs_growfs_data(mp, &in); - return -error; - } - - case XFS_IOC_FSGROWFSLOG: { - xfs_growfs_log_t in; - - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - if (copy_from_user(&in, arg, sizeof(in))) - return -XFS_ERROR(EFAULT); - - error = xfs_growfs_log(mp, &in); - return -error; - } - - case XFS_IOC_FSGROWFSRT: { - xfs_growfs_rt_t in; - - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - if (copy_from_user(&in, arg, sizeof(in))) - return -XFS_ERROR(EFAULT); - - error = xfs_growfs_rt(mp, &in); - return -error; - } - - case XFS_IOC_FREEZE: - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - if (inode->i_sb->s_frozen == SB_UNFROZEN) - freeze_bdev(inode->i_sb->s_bdev); - return 0; - - case XFS_IOC_THAW: - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - if (inode->i_sb->s_frozen != SB_UNFROZEN) - thaw_bdev(inode->i_sb->s_bdev, inode->i_sb); - return 0; - - case XFS_IOC_GOINGDOWN: { - __uint32_t in; - - if (!capable(CAP_SYS_ADMIN)) - 
return -EPERM; - - if (get_user(in, (__uint32_t __user *)arg)) - return -XFS_ERROR(EFAULT); - - error = xfs_fs_goingdown(mp, in); - return -error; - } - - case XFS_IOC_ERROR_INJECTION: { - xfs_error_injection_t in; - - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - if (copy_from_user(&in, arg, sizeof(in))) - return -XFS_ERROR(EFAULT); - - error = xfs_errortag_add(in.errtag, mp); - return -error; - } - - case XFS_IOC_ERROR_CLEARALL: - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - error = xfs_errortag_clearall(mp, 1); - return -error; - - default: - return -ENOTTY; - } -} - STATIC int xfs_ioc_space( struct xfs_inode *ip, @@ -1332,3 +1024,258 @@ xfs_ioc_getbmapx( return 0; } + +int +xfs_ioctl( + xfs_inode_t *ip, + struct file *filp, + int ioflags, + unsigned int cmd, + void __user *arg) +{ + struct inode *inode = filp->f_path.dentry->d_inode; + xfs_mount_t *mp = ip->i_mount; + int error; + + xfs_itrace_entry(XFS_I(inode)); + switch (cmd) { + + case XFS_IOC_ALLOCSP: + case XFS_IOC_FREESP: + case XFS_IOC_RESVSP: + case XFS_IOC_UNRESVSP: + case XFS_IOC_ALLOCSP64: + case XFS_IOC_FREESP64: + case XFS_IOC_RESVSP64: + case XFS_IOC_UNRESVSP64: + /* + * Only allow the sys admin to reserve space unless + * unwritten extents are enabled. + */ + if (!XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) && + !capable(CAP_SYS_ADMIN)) + return -EPERM; + + return xfs_ioc_space(ip, inode, filp, ioflags, cmd, arg); + + case XFS_IOC_DIOINFO: { + struct dioattr da; + xfs_buftarg_t *target = + XFS_IS_REALTIME_INODE(ip) ? 
+ mp->m_rtdev_targp : mp->m_ddev_targp; + + da.d_mem = da.d_miniosz = 1 << target->bt_sshift; + da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1); + + if (copy_to_user(arg, &da, sizeof(da))) + return -XFS_ERROR(EFAULT); + return 0; + } + + case XFS_IOC_FSBULKSTAT_SINGLE: + case XFS_IOC_FSBULKSTAT: + case XFS_IOC_FSINUMBERS: + return xfs_ioc_bulkstat(mp, cmd, arg); + + case XFS_IOC_FSGEOMETRY_V1: + return xfs_ioc_fsgeometry_v1(mp, arg); + + case XFS_IOC_FSGEOMETRY: + return xfs_ioc_fsgeometry(mp, arg); + + case XFS_IOC_GETVERSION: + return put_user(inode->i_generation, (int __user *)arg); + + case XFS_IOC_FSGETXATTR: + return xfs_ioc_fsgetxattr(ip, 0, arg); + case XFS_IOC_FSGETXATTRA: + return xfs_ioc_fsgetxattr(ip, 1, arg); + case XFS_IOC_GETXFLAGS: + case XFS_IOC_SETXFLAGS: + case XFS_IOC_FSSETXATTR: + return xfs_ioc_xattr(ip, filp, cmd, arg); + + case XFS_IOC_FSSETDM: { + struct fsdmidata dmi; + + if (copy_from_user(&dmi, arg, sizeof(dmi))) + return -XFS_ERROR(EFAULT); + + error = xfs_set_dmattrs(ip, dmi.fsd_dmevmask, + dmi.fsd_dmstate); + return -error; + } + + case XFS_IOC_GETBMAP: + case XFS_IOC_GETBMAPA: + return xfs_ioc_getbmap(ip, ioflags, cmd, arg); + + case XFS_IOC_GETBMAPX: + return xfs_ioc_getbmapx(ip, arg); + + case XFS_IOC_FD_TO_HANDLE: + case XFS_IOC_PATH_TO_HANDLE: + case XFS_IOC_PATH_TO_FSHANDLE: + return xfs_find_handle(cmd, arg); + + case XFS_IOC_OPEN_BY_HANDLE: + return xfs_open_by_handle(mp, arg, filp, inode); + + case XFS_IOC_FSSETDM_BY_HANDLE: + return xfs_fssetdm_by_handle(mp, arg, inode); + + case XFS_IOC_READLINK_BY_HANDLE: + return xfs_readlink_by_handle(mp, arg, inode); + + case XFS_IOC_ATTRLIST_BY_HANDLE: + return xfs_attrlist_by_handle(mp, arg, inode); + + case XFS_IOC_ATTRMULTI_BY_HANDLE: + return xfs_attrmulti_by_handle(mp, arg, inode); + + case XFS_IOC_SWAPEXT: { + error = xfs_swapext((struct xfs_swapext __user *)arg); + return -error; + } + + case XFS_IOC_FSCOUNTS: { + xfs_fsop_counts_t out; + + error = xfs_fs_counts(mp, &out); + if 
(error) + return -error; + + if (copy_to_user(arg, &out, sizeof(out))) + return -XFS_ERROR(EFAULT); + return 0; + } + + case XFS_IOC_SET_RESBLKS: { + xfs_fsop_resblks_t inout; + __uint64_t in; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (copy_from_user(&inout, arg, sizeof(inout))) + return -XFS_ERROR(EFAULT); + + /* input parameter is passed in resblks field of structure */ + in = inout.resblks; + error = xfs_reserve_blocks(mp, &in, &inout); + if (error) + return -error; + + if (copy_to_user(arg, &inout, sizeof(inout))) + return -XFS_ERROR(EFAULT); + return 0; + } + + case XFS_IOC_GET_RESBLKS: { + xfs_fsop_resblks_t out; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + error = xfs_reserve_blocks(mp, NULL, &out); + if (error) + return -error; + + if (copy_to_user(arg, &out, sizeof(out))) + return -XFS_ERROR(EFAULT); + + return 0; + } + + case XFS_IOC_FSGROWFSDATA: { + xfs_growfs_data_t in; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (copy_from_user(&in, arg, sizeof(in))) + return -XFS_ERROR(EFAULT); + + error = xfs_growfs_data(mp, &in); + return -error; + } + + case XFS_IOC_FSGROWFSLOG: { + xfs_growfs_log_t in; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (copy_from_user(&in, arg, sizeof(in))) + return -XFS_ERROR(EFAULT); + + error = xfs_growfs_log(mp, &in); + return -error; + } + + case XFS_IOC_FSGROWFSRT: { + xfs_growfs_rt_t in; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (copy_from_user(&in, arg, sizeof(in))) + return -XFS_ERROR(EFAULT); + + error = xfs_growfs_rt(mp, &in); + return -error; + } + + case XFS_IOC_FREEZE: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (inode->i_sb->s_frozen == SB_UNFROZEN) + freeze_bdev(inode->i_sb->s_bdev); + return 0; + + case XFS_IOC_THAW: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + if (inode->i_sb->s_frozen != SB_UNFROZEN) + thaw_bdev(inode->i_sb->s_bdev, inode->i_sb); + return 0; + + case XFS_IOC_GOINGDOWN: { + __uint32_t in; + + if (!capable(CAP_SYS_ADMIN)) + 
return -EPERM; + + if (get_user(in, (__uint32_t __user *)arg)) + return -XFS_ERROR(EFAULT); + + error = xfs_fs_goingdown(mp, in); + return -error; + } + + case XFS_IOC_ERROR_INJECTION: { + xfs_error_injection_t in; + + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + if (copy_from_user(&in, arg, sizeof(in))) + return -XFS_ERROR(EFAULT); + + error = xfs_errortag_add(in.errtag, mp); + return -error; + } + + case XFS_IOC_ERROR_CLEARALL: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + error = xfs_errortag_clearall(mp, 1); + return -error; + + default: + return -ENOTTY; + } +} + From owner-xfs@oss.sgi.com Sat Feb 9 11:50:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 09 Feb 2008 11:50:35 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_45, J_CHICKENPOX_63,J_CHICKENPOX_64 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m19JoUeS004602 for ; Sat, 9 Feb 2008 11:50:32 -0800 X-ASG-Debug-ID: 1202586651-3e0b02d60000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0F87F5BB789 for ; Sat, 9 Feb 2008 11:50:51 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id mIgOffhTMHf4ZH8R for ; Sat, 09 Feb 2008 11:50:51 -0800 (PST) Received: from Liberator.local (sandeen.net [209.173.210.139]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 9E54218008605 for ; Sat, 9 Feb 2008 13:50:50 -0600 (CST) Message-ID: <47AE0419.6050000@sandeen.net> Date: Sat, 09 Feb 2008 13:50:49 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: xfs-oss X-ASG-Orig-Subj: 
[PATCH, Updated] remove shouting-indirection macros from xfs_sb.h Subject: [PATCH, Updated] remove shouting-indirection macros from xfs_sb.h References: <47AD44D3.4060503@sandeen.net> In-Reply-To: <47AD44D3.4060503@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202586653 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41826 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5761/Sat Feb 9 10:02:33 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14390 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs (missed one instance of XFS_SB_VERSION_HASLOGV2) Remove macro-to-small-function indirection from xfs_sb.h, and remove some which are completely unused. Some day I'll get them all... Signed-off-by: Eric Sandeen --- Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ioctl.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ioctl.c +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ioctl.c @@ -1052,7 +1052,7 @@ xfs_ioctl( * Only allow the sys admin to reserve space unless * unwritten extents are enabled. 
*/ - if (!XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) && + if (!xfs_sb_version_hasextflgbit(&mp->m_sb) && !capable(CAP_SYS_ADMIN)) return -EPERM; Index: linux-2.6-xfs/fs/xfs/quota/xfs_qm.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/quota/xfs_qm.c +++ linux-2.6-xfs/fs/xfs/quota/xfs_qm.c @@ -1405,13 +1405,13 @@ xfs_qm_qino_alloc( #if defined(DEBUG) && defined(XFS_LOUD_RECOVERY) unsigned oldv = mp->m_sb.sb_versionnum; #endif - ASSERT(!XFS_SB_VERSION_HASQUOTA(&mp->m_sb)); + ASSERT(!xfs_sb_version_hasquota(&mp->m_sb)); ASSERT((sbfields & (XFS_SB_VERSIONNUM | XFS_SB_UQUOTINO | XFS_SB_GQUOTINO | XFS_SB_QFLAGS)) == (XFS_SB_VERSIONNUM | XFS_SB_UQUOTINO | XFS_SB_GQUOTINO | XFS_SB_QFLAGS)); - XFS_SB_VERSION_ADDQUOTA(&mp->m_sb); + xfs_sb_version_addquota(&mp->m_sb); mp->m_sb.sb_uquotino = NULLFSINO; mp->m_sb.sb_gquotino = NULLFSINO; @@ -1954,7 +1954,7 @@ xfs_qm_init_quotainos( /* * Get the uquota and gquota inodes */ - if (XFS_SB_VERSION_HASQUOTA(&mp->m_sb)) { + if (xfs_sb_version_hasquota(&mp->m_sb)) { if (XFS_IS_UQUOTA_ON(mp) && mp->m_sb.sb_uquotino != NULLFSINO) { ASSERT(mp->m_sb.sb_uquotino > 0); Index: linux-2.6-xfs/fs/xfs/quota/xfs_qm_bhv.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/quota/xfs_qm_bhv.c +++ linux-2.6-xfs/fs/xfs/quota/xfs_qm_bhv.c @@ -118,7 +118,7 @@ xfs_qm_newmount( *quotaflags = 0; *needquotamount = B_FALSE; - quotaondisk = XFS_SB_VERSION_HASQUOTA(&mp->m_sb) && + quotaondisk = xfs_sb_version_hasquota(&mp->m_sb) && (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_ACCT); if (quotaondisk) { Index: linux-2.6-xfs/fs/xfs/quota/xfs_qm_syscalls.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/quota/xfs_qm_syscalls.c +++ linux-2.6-xfs/fs/xfs/quota/xfs_qm_syscalls.c @@ -377,7 +377,7 @@ xfs_qm_scall_trunc_qfiles( if (!capable(CAP_SYS_ADMIN)) return XFS_ERROR(EPERM); error = 0; - if (!XFS_SB_VERSION_HASQUOTA(&mp->m_sb) || 
flags == 0) { + if (!xfs_sb_version_hasquota(&mp->m_sb) || flags == 0) { qdprintk("qtrunc flags=%x m_qflags=%x\n", flags, mp->m_qflags); return XFS_ERROR(EINVAL); } @@ -522,7 +522,7 @@ xfs_qm_scall_getqstat( memset(out, 0, sizeof(fs_quota_stat_t)); out->qs_version = FS_QSTAT_VERSION; - if (! XFS_SB_VERSION_HASQUOTA(&mp->m_sb)) { + if (! xfs_sb_version_hasquota(&mp->m_sb)) { out->qs_uquota.qfs_ino = NULLFSINO; out->qs_gquota.qfs_ino = NULLFSINO; return (0); Index: linux-2.6-xfs/fs/xfs/xfs_attr_leaf.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_attr_leaf.c +++ linux-2.6-xfs/fs/xfs/xfs_attr_leaf.c @@ -227,10 +227,10 @@ STATIC void xfs_sbversion_add_attr2(xfs_mount_t *mp, xfs_trans_t *tp) { if ((mp->m_flags & XFS_MOUNT_ATTR2) && - !(XFS_SB_VERSION_HASATTR2(&mp->m_sb))) { + !(xfs_sb_version_hasattr2(&mp->m_sb))) { spin_lock(&mp->m_sb_lock); - if (!XFS_SB_VERSION_HASATTR2(&mp->m_sb)) { - XFS_SB_VERSION_ADDATTR2(&mp->m_sb); + if (!xfs_sb_version_hasattr2(&mp->m_sb)) { + xfs_sb_version_addattr2(&mp->m_sb); spin_unlock(&mp->m_sb_lock); xfs_mod_sb(tp, XFS_SB_VERSIONNUM | XFS_SB_FEATURES2); } else Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c @@ -4047,17 +4047,17 @@ xfs_bmap_add_attrfork( xfs_trans_log_inode(tp, ip, logflags); if (error) goto error2; - if (!XFS_SB_VERSION_HASATTR(&mp->m_sb) || - (!XFS_SB_VERSION_HASATTR2(&mp->m_sb) && version == 2)) { + if (!xfs_sb_version_hasattr(&mp->m_sb) || + (!xfs_sb_version_hasattr2(&mp->m_sb) && version == 2)) { __int64_t sbfields = 0; spin_lock(&mp->m_sb_lock); - if (!XFS_SB_VERSION_HASATTR(&mp->m_sb)) { - XFS_SB_VERSION_ADDATTR(&mp->m_sb); + if (!xfs_sb_version_hasattr(&mp->m_sb)) { + xfs_sb_version_addattr(&mp->m_sb); sbfields |= XFS_SB_VERSIONNUM; } - if (!XFS_SB_VERSION_HASATTR2(&mp->m_sb) && version == 2) { - 
XFS_SB_VERSION_ADDATTR2(&mp->m_sb); + if (!xfs_sb_version_hasattr2(&mp->m_sb) && version == 2) { + xfs_sb_version_addattr2(&mp->m_sb); sbfields |= (XFS_SB_VERSIONNUM | XFS_SB_FEATURES2); } if (sbfields) { @@ -5043,7 +5043,7 @@ xfs_bmapi( * A wasdelay extent has been initialized, so * shouldn't be flagged as unwritten. */ - if (wr && XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + if (wr && xfs_sb_version_hasextflgbit(&mp->m_sb)) { if (!wasdelay && (flags & XFS_BMAPI_PREALLOC)) got.br_state = XFS_EXT_UNWRITTEN; } @@ -5483,7 +5483,7 @@ xfs_bunmapi( * get rid of part of a realtime extent. */ if (del.br_state == XFS_EXT_UNWRITTEN || - !XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + !xfs_sb_version_hasextflgbit(&mp->m_sb)) { /* * This piece is unwritten, or we're not * using unwritten extents. Skip over it. @@ -5535,7 +5535,7 @@ xfs_bunmapi( } else if ((del.br_startoff == start && (del.br_state == XFS_EXT_UNWRITTEN || xfs_trans_get_block_res(tp) == 0)) || - !XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + !xfs_sb_version_hasextflgbit(&mp->m_sb)) { /* * Can't make it unwritten. There isn't * a full extent here so just skip it. Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h @@ -120,7 +120,7 @@ typedef enum { * Extent state and extent format macros. */ #define XFS_EXTFMT_INODE(x) \ - (XFS_SB_VERSION_HASEXTFLGBIT(&((x)->i_mount->m_sb)) ? \ + (xfs_sb_version_hasextflgbit(&((x)->i_mount->m_sb)) ? 
\ XFS_EXTFMT_HASSTATE : XFS_EXTFMT_NOSTATE) #define ISUNWRITTEN(x) ((x)->br_state == XFS_EXT_UNWRITTEN) Index: linux-2.6-xfs/fs/xfs/xfs_dir2.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_dir2.c +++ linux-2.6-xfs/fs/xfs/xfs_dir2.c @@ -49,7 +49,7 @@ void xfs_dir_mount( xfs_mount_t *mp) { - ASSERT(XFS_SB_VERSION_HASDIRV2(&mp->m_sb)); + ASSERT(xfs_sb_version_hasdirv2(&mp->m_sb)); ASSERT((1 << (mp->m_sb.sb_blocklog + mp->m_sb.sb_dirblklog)) <= XFS_MAX_BLOCKSIZE); mp->m_dirblksize = 1 << (mp->m_sb.sb_blocklog + mp->m_sb.sb_dirblklog); Index: linux-2.6-xfs/fs/xfs/xfs_fsops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_fsops.c +++ linux-2.6-xfs/fs/xfs/xfs_fsops.c @@ -77,36 +77,36 @@ xfs_fs_geometry( if (new_version >= 3) { geo->version = XFS_FSOP_GEOM_VERSION; geo->flags = - (XFS_SB_VERSION_HASATTR(&mp->m_sb) ? + (xfs_sb_version_hasattr(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_ATTR : 0) | - (XFS_SB_VERSION_HASNLINK(&mp->m_sb) ? + (xfs_sb_version_hasnlink(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_NLINK : 0) | - (XFS_SB_VERSION_HASQUOTA(&mp->m_sb) ? + (xfs_sb_version_hasquota(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_QUOTA : 0) | - (XFS_SB_VERSION_HASALIGN(&mp->m_sb) ? + (xfs_sb_version_hasalign(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_IALIGN : 0) | - (XFS_SB_VERSION_HASDALIGN(&mp->m_sb) ? + (xfs_sb_version_hasdalign(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_DALIGN : 0) | - (XFS_SB_VERSION_HASSHARED(&mp->m_sb) ? + (xfs_sb_version_hasshared(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_SHARED : 0) | - (XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) ? + (xfs_sb_version_hasextflgbit(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_EXTFLG : 0) | - (XFS_SB_VERSION_HASDIRV2(&mp->m_sb) ? + (xfs_sb_version_hasdirv2(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_DIRV2 : 0) | - (XFS_SB_VERSION_HASSECTOR(&mp->m_sb) ? + (xfs_sb_version_hassector(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_SECTOR : 0) | (xfs_sb_version_haslazysbcount(&mp->m_sb) ? 
XFS_FSOP_GEOM_FLAGS_LAZYSB : 0) | - (XFS_SB_VERSION_HASATTR2(&mp->m_sb) ? + (xfs_sb_version_hasattr2(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_ATTR2 : 0); - geo->logsectsize = XFS_SB_VERSION_HASSECTOR(&mp->m_sb) ? + geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ? mp->m_sb.sb_logsectsize : BBSIZE; geo->rtsectsize = mp->m_sb.sb_blocksize; geo->dirblocksize = mp->m_dirblksize; } if (new_version >= 4) { geo->flags |= - (XFS_SB_VERSION_HASLOGV2(&mp->m_sb) ? + (xfs_sb_version_haslogv2(&mp->m_sb) ? XFS_FSOP_GEOM_FLAGS_LOGV2 : 0); geo->logsunit = mp->m_sb.sb_logsunit; } Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.c +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.c @@ -191,7 +191,7 @@ xfs_ialloc_ag_alloc( ASSERT(!(args.mp->m_flags & XFS_MOUNT_NOALIGN)); args.alignment = args.mp->m_dalign; isaligned = 1; - } else if (XFS_SB_VERSION_HASALIGN(&args.mp->m_sb) && + } else if (xfs_sb_version_hasalign(&args.mp->m_sb) && args.mp->m_sb.sb_inoalignmt >= XFS_B_TO_FSBT(args.mp, XFS_INODE_CLUSTER_SIZE(args.mp))) @@ -230,7 +230,7 @@ xfs_ialloc_ag_alloc( args.agbno = be32_to_cpu(agi->agi_root); args.fsbno = XFS_AGB_TO_FSB(args.mp, be32_to_cpu(agi->agi_seqno), args.agbno); - if (XFS_SB_VERSION_HASALIGN(&args.mp->m_sb) && + if (xfs_sb_version_hasalign(&args.mp->m_sb) && args.mp->m_sb.sb_inoalignmt >= XFS_B_TO_FSBT(args.mp, XFS_INODE_CLUSTER_SIZE(args.mp))) args.alignment = args.mp->m_sb.sb_inoalignmt; @@ -271,7 +271,7 @@ xfs_ialloc_ag_alloc( * use the old version so that old kernels will continue to be * able to use the file system. 
*/ - if (XFS_SB_VERSION_HASNLINK(&args.mp->m_sb)) + if (xfs_sb_version_hasnlink(&args.mp->m_sb)) version = XFS_DINODE_VERSION_2; else version = XFS_DINODE_VERSION_1; Index: linux-2.6-xfs/fs/xfs/xfs_inode.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.c +++ linux-2.6-xfs/fs/xfs/xfs_inode.c @@ -1147,7 +1147,7 @@ xfs_ialloc( * the inode version number now. This way we only do the conversion * here rather than here and in the flush/logging code. */ - if (XFS_SB_VERSION_HASNLINK(&tp->t_mountp->m_sb) && + if (xfs_sb_version_hasnlink(&tp->t_mountp->m_sb) && ip->i_d.di_version == XFS_DINODE_VERSION_1) { ip->i_d.di_version = XFS_DINODE_VERSION_2; /* @@ -3434,9 +3434,9 @@ xfs_iflush_int( * has been updated, then make the conversion permanent. */ ASSERT(ip->i_d.di_version == XFS_DINODE_VERSION_1 || - XFS_SB_VERSION_HASNLINK(&mp->m_sb)); + xfs_sb_version_hasnlink(&mp->m_sb)); if (ip->i_d.di_version == XFS_DINODE_VERSION_1) { - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { /* * Convert it back. */ Index: linux-2.6-xfs/fs/xfs/xfs_inode_item.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_inode_item.c +++ linux-2.6-xfs/fs/xfs/xfs_inode_item.c @@ -296,9 +296,9 @@ xfs_inode_item_format( */ mp = ip->i_mount; ASSERT(ip->i_d.di_version == XFS_DINODE_VERSION_1 || - XFS_SB_VERSION_HASNLINK(&mp->m_sb)); + xfs_sb_version_hasnlink(&mp->m_sb)); if (ip->i_d.di_version == XFS_DINODE_VERSION_1) { - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { /* * Convert it back. 
*/ Index: linux-2.6-xfs/fs/xfs/xfs_itable.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_itable.c +++ linux-2.6-xfs/fs/xfs/xfs_itable.c @@ -45,7 +45,7 @@ xfs_internal_inum( xfs_ino_t ino) { return (ino == mp->m_sb.sb_rbmino || ino == mp->m_sb.sb_rsumino || - (XFS_SB_VERSION_HASQUOTA(&mp->m_sb) && + (xfs_sb_version_hasquota(&mp->m_sb) && (ino == mp->m_sb.sb_uquotino || ino == mp->m_sb.sb_gquotino))); } Index: linux-2.6-xfs/fs/xfs/xfs_log.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_log.c +++ linux-2.6-xfs/fs/xfs/xfs_log.c @@ -1090,7 +1090,7 @@ xlog_get_iclog_buffer_size(xfs_mount_t * size >>= 1; } - if (XFS_SB_VERSION_HASLOGV2(&mp->m_sb)) { + if (xfs_sb_version_haslogv2(&mp->m_sb)) { /* # headers = size / 32K * one header holds cycles from 32K of data */ @@ -1186,13 +1186,13 @@ xlog_alloc_log(xfs_mount_t *mp, log->l_grant_reserve_cycle = 1; log->l_grant_write_cycle = 1; - if (XFS_SB_VERSION_HASSECTOR(&mp->m_sb)) { + if (xfs_sb_version_hassector(&mp->m_sb)) { log->l_sectbb_log = mp->m_sb.sb_logsectlog - BBSHIFT; ASSERT(log->l_sectbb_log <= mp->m_sectbb_log); /* for larger sector sizes, must have v2 or external log */ ASSERT(log->l_sectbb_log == 0 || log->l_logBBstart == 0 || - XFS_SB_VERSION_HASLOGV2(&mp->m_sb)); + xfs_sb_version_haslogv2(&mp->m_sb)); ASSERT(mp->m_sb.sb_logsectlog >= BBSHIFT); } log->l_sectbb_mask = (1 << log->l_sectbb_log) - 1; @@ -1247,7 +1247,7 @@ xlog_alloc_log(xfs_mount_t *mp, memset(head, 0, sizeof(xlog_rec_header_t)); head->h_magicno = cpu_to_be32(XLOG_HEADER_MAGIC_NUM); head->h_version = cpu_to_be32( - XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? 2 : 1); + xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? 
2 : 1); head->h_size = cpu_to_be32(log->l_iclog_size); /* new fields */ head->h_fmt = cpu_to_be32(XLOG_FMT); @@ -1402,7 +1402,7 @@ xlog_sync(xlog_t *log, int roundoff; /* roundoff to BB or stripe */ int split = 0; /* split write into two regions */ int error; - int v2 = XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb); + int v2 = xfs_sb_version_haslogv2(&log->l_mp->m_sb); XFS_STATS_INC(xs_log_writes); ASSERT(iclog->ic_refcnt == 0); @@ -2881,7 +2881,7 @@ xlog_state_switch_iclogs(xlog_t *log, log->l_curr_block += BTOBB(eventual_size)+BTOBB(log->l_iclog_hsize); /* Round up to next log-sunit */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) && + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) && log->l_mp->m_sb.sb_logsunit > 1) { __uint32_t sunit_bb = BTOBB(log->l_mp->m_sb.sb_logsunit); log->l_curr_block = roundup(log->l_curr_block, sunit_bb); @@ -3334,7 +3334,7 @@ xlog_ticket_get(xlog_t *log, unit_bytes += sizeof(xlog_op_header_t) * num_headers; /* for roundoff padding for transaction data and one for commit record */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) && + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) && log->l_mp->m_sb.sb_logsunit > 1) { /* log su roundoff */ unit_bytes += 2*log->l_mp->m_sb.sb_logsunit; Index: linux-2.6-xfs/fs/xfs/xfs_log_priv.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_log_priv.h +++ linux-2.6-xfs/fs/xfs/xfs_log_priv.h @@ -49,10 +49,10 @@ struct xfs_mount; #define XLOG_HEADER_SIZE 512 #define XLOG_REC_SHIFT(log) \ - BTOBB(1 << (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? \ + BTOBB(1 << (xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? \ XLOG_MAX_RECORD_BSHIFT : XLOG_BIG_RECORD_BSHIFT)) #define XLOG_TOTAL_REC_SHIFT(log) \ - BTOBB(XLOG_MAX_ICLOGS << (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? \ + BTOBB(XLOG_MAX_ICLOGS << (xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? 
\ XLOG_MAX_RECORD_BSHIFT : XLOG_BIG_RECORD_BSHIFT)) Index: linux-2.6-xfs/fs/xfs/xfs_log_recover.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_log_recover.c +++ linux-2.6-xfs/fs/xfs/xfs_log_recover.c @@ -478,7 +478,7 @@ xlog_find_verify_log_record( * reset last_blk. Only when last_blk points in the middle of a log * record do we update last_blk. */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { uint h_size = be32_to_cpu(head->h_size); xhdrs = h_size / XLOG_HEADER_CYCLE_SIZE; @@ -888,7 +888,7 @@ xlog_find_tail( * unmount record if there is one, so we pass the lsn of the * unmount record rather than the block after it. */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { int h_size = be32_to_cpu(rhead->h_size); int h_version = be32_to_cpu(rhead->h_version); @@ -1101,7 +1101,7 @@ xlog_add_record( recp->h_magicno = cpu_to_be32(XLOG_HEADER_MAGIC_NUM); recp->h_cycle = cpu_to_be32(cycle); recp->h_version = cpu_to_be32( - XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb) ? 2 : 1); + xfs_sb_version_haslogv2(&log->l_mp->m_sb) ? 
2 : 1); recp->h_lsn = cpu_to_be64(xlog_assign_lsn(cycle, block)); recp->h_tail_lsn = cpu_to_be64(xlog_assign_lsn(tail_cycle, tail_block)); recp->h_fmt = cpu_to_be32(XLOG_FMT); @@ -3348,7 +3348,7 @@ xlog_pack_data( dp += BBSIZE; } - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { xhdr = (xlog_in_core_2_t *)&iclog->ic_header; for ( ; i < BTOBB(size); i++) { j = i / (XLOG_HEADER_CYCLE_SIZE / BBSIZE); @@ -3388,7 +3388,7 @@ xlog_unpack_data_checksum( be32_to_cpu(rhead->h_chksum), chksum); cmn_err(CE_DEBUG, "XFS: Disregard message if filesystem was created with non-DEBUG kernel"); - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { cmn_err(CE_DEBUG, "XFS: LogR this is a LogV2 filesystem\n"); } @@ -3415,7 +3415,7 @@ xlog_unpack_data( dp += BBSIZE; } - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { xhdr = (xlog_in_core_2_t *)rhead; for ( ; i < BTOBB(be32_to_cpu(rhead->h_len)); i++) { j = i / (XLOG_HEADER_CYCLE_SIZE / BBSIZE); @@ -3494,7 +3494,7 @@ xlog_do_recovery_pass( * Read the header of the tail block and get the iclog buffer size from * h_size. Use this to tell how many sectors make up the log header. */ - if (XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb)) { + if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) { /* * When using variable length iclogs, read first sector of * iclog header and extract the header size from it. 
Get a @@ -3838,7 +3838,7 @@ xlog_do_recover( sbp = &log->l_mp->m_sb; xfs_sb_from_disk(sbp, XFS_BUF_TO_SBP(bp)); ASSERT(sbp->sb_magicnum == XFS_SB_MAGIC); - ASSERT(XFS_SB_GOOD_VERSION(sbp)); + ASSERT(xfs_sb_good_version(sbp)); xfs_buf_relse(bp); /* We've re-read the superblock so re-initialize per-cpu counters */ Index: linux-2.6-xfs/fs/xfs/xfs_mount.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_mount.c +++ linux-2.6-xfs/fs/xfs/xfs_mount.c @@ -225,7 +225,7 @@ xfs_mount_validate_sb( return XFS_ERROR(EWRONGFS); } - if (!XFS_SB_GOOD_VERSION(sbp)) { + if (!xfs_sb_good_version(sbp)) { xfs_fs_mount_cmn_err(flags, "bad version"); return XFS_ERROR(EWRONGFS); } @@ -300,7 +300,7 @@ xfs_mount_validate_sb( /* * Version 1 directory format has never worked on Linux. */ - if (unlikely(!XFS_SB_VERSION_HASDIRV2(sbp))) { + if (unlikely(!xfs_sb_version_hasdirv2(sbp))) { xfs_fs_mount_cmn_err(flags, "file system using version 1 directory format"); return XFS_ERROR(ENOSYS); @@ -781,7 +781,7 @@ xfs_update_alignment(xfs_mount_t *mp, in * Update superblock with new values * and log changes */ - if (XFS_SB_VERSION_HASDALIGN(sbp)) { + if (xfs_sb_version_hasdalign(sbp)) { if (sbp->sb_unit != mp->m_dalign) { sbp->sb_unit = mp->m_dalign; *update_flags |= XFS_SB_UNIT; @@ -792,7 +792,7 @@ xfs_update_alignment(xfs_mount_t *mp, in } } } else if ((mp->m_flags & XFS_MOUNT_NOALIGN) != XFS_MOUNT_NOALIGN && - XFS_SB_VERSION_HASDALIGN(&mp->m_sb)) { + xfs_sb_version_hasdalign(&mp->m_sb)) { mp->m_dalign = sbp->sb_unit; mp->m_swidth = sbp->sb_width; } @@ -869,7 +869,7 @@ xfs_set_rw_sizes(xfs_mount_t *mp) STATIC void xfs_set_inoalignment(xfs_mount_t *mp) { - if (XFS_SB_VERSION_HASALIGN(&mp->m_sb) && + if (xfs_sb_version_hasalign(&mp->m_sb) && mp->m_sb.sb_inoalignmt >= XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size)) mp->m_inoalign_mask = mp->m_sb.sb_inoalignmt - 1; Index: linux-2.6-xfs/fs/xfs/xfs_sb.h 
=================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_sb.h +++ linux-2.6-xfs/fs/xfs/xfs_sb.h @@ -271,7 +271,6 @@ typedef enum { #define XFS_SB_VERSION_NUM(sbp) ((sbp)->sb_versionnum & XFS_SB_VERSION_NUMBITS) -#define XFS_SB_GOOD_VERSION(sbp) xfs_sb_good_version(sbp) #ifdef __KERNEL__ static inline int xfs_sb_good_version(xfs_sb_t *sbp) { @@ -297,7 +296,6 @@ static inline int xfs_sb_good_version(xf } #endif /* __KERNEL__ */ -#define XFS_SB_VERSION_TONEW(v) xfs_sb_version_tonew(v) static inline unsigned xfs_sb_version_tonew(unsigned v) { return ((((v) == XFS_SB_VERSION_1) ? \ @@ -308,7 +306,6 @@ static inline unsigned xfs_sb_version_to XFS_SB_VERSION_4); } -#define XFS_SB_VERSION_TOOLD(v) xfs_sb_version_toold(v) static inline unsigned xfs_sb_version_toold(unsigned v) { return (((v) & (XFS_SB_VERSION_QUOTABIT | XFS_SB_VERSION_ALIGNBIT)) ? \ @@ -320,7 +317,6 @@ static inline unsigned xfs_sb_version_to XFS_SB_VERSION_1))); } -#define XFS_SB_VERSION_HASATTR(sbp) xfs_sb_version_hasattr(sbp) static inline int xfs_sb_version_hasattr(xfs_sb_t *sbp) { return ((sbp)->sb_versionnum == XFS_SB_VERSION_2) || \ @@ -329,7 +325,6 @@ static inline int xfs_sb_version_hasattr ((sbp)->sb_versionnum & XFS_SB_VERSION_ATTRBIT)); } -#define XFS_SB_VERSION_ADDATTR(sbp) xfs_sb_version_addattr(sbp) static inline void xfs_sb_version_addattr(xfs_sb_t *sbp) { (sbp)->sb_versionnum = (((sbp)->sb_versionnum == XFS_SB_VERSION_1) ? 
\ @@ -339,7 +334,6 @@ static inline void xfs_sb_version_addatt (XFS_SB_VERSION_4 | XFS_SB_VERSION_ATTRBIT))); } -#define XFS_SB_VERSION_HASNLINK(sbp) xfs_sb_version_hasnlink(sbp) static inline int xfs_sb_version_hasnlink(xfs_sb_t *sbp) { return ((sbp)->sb_versionnum == XFS_SB_VERSION_3) || \ @@ -347,7 +341,6 @@ static inline int xfs_sb_version_hasnlin ((sbp)->sb_versionnum & XFS_SB_VERSION_NLINKBIT)); } -#define XFS_SB_VERSION_ADDNLINK(sbp) xfs_sb_version_addnlink(sbp) static inline void xfs_sb_version_addnlink(xfs_sb_t *sbp) { (sbp)->sb_versionnum = ((sbp)->sb_versionnum <= XFS_SB_VERSION_2 ? \ @@ -355,115 +348,63 @@ static inline void xfs_sb_version_addnli ((sbp)->sb_versionnum | XFS_SB_VERSION_NLINKBIT)); } -#define XFS_SB_VERSION_HASQUOTA(sbp) xfs_sb_version_hasquota(sbp) static inline int xfs_sb_version_hasquota(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_QUOTABIT); } -#define XFS_SB_VERSION_ADDQUOTA(sbp) xfs_sb_version_addquota(sbp) static inline void xfs_sb_version_addquota(xfs_sb_t *sbp) { (sbp)->sb_versionnum = \ (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 ? 
\ ((sbp)->sb_versionnum | XFS_SB_VERSION_QUOTABIT) : \ - (XFS_SB_VERSION_TONEW((sbp)->sb_versionnum) | \ + (xfs_sb_version_tonew((sbp)->sb_versionnum) | \ XFS_SB_VERSION_QUOTABIT)); } -#define XFS_SB_VERSION_HASALIGN(sbp) xfs_sb_version_hasalign(sbp) static inline int xfs_sb_version_hasalign(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_ALIGNBIT); } -#define XFS_SB_VERSION_SUBALIGN(sbp) xfs_sb_version_subalign(sbp) -static inline void xfs_sb_version_subalign(xfs_sb_t *sbp) -{ - (sbp)->sb_versionnum = \ - XFS_SB_VERSION_TOOLD((sbp)->sb_versionnum & ~XFS_SB_VERSION_ALIGNBIT); -} - -#define XFS_SB_VERSION_HASDALIGN(sbp) xfs_sb_version_hasdalign(sbp) static inline int xfs_sb_version_hasdalign(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_DALIGNBIT); } -#define XFS_SB_VERSION_ADDDALIGN(sbp) xfs_sb_version_adddalign(sbp) -static inline int xfs_sb_version_adddalign(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_DALIGNBIT); -} - -#define XFS_SB_VERSION_HASSHARED(sbp) xfs_sb_version_hasshared(sbp) static inline int xfs_sb_version_hasshared(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_SHAREDBIT); } -#define XFS_SB_VERSION_ADDSHARED(sbp) xfs_sb_version_addshared(sbp) -static inline int xfs_sb_version_addshared(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_SHAREDBIT); -} - -#define XFS_SB_VERSION_SUBSHARED(sbp) xfs_sb_version_subshared(sbp) -static inline int xfs_sb_version_subshared(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum & ~XFS_SB_VERSION_SHAREDBIT); -} - -#define XFS_SB_VERSION_HASDIRV2(sbp) xfs_sb_version_hasdirv2(sbp) static inline int xfs_sb_version_hasdirv2(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ 
((sbp)->sb_versionnum & XFS_SB_VERSION_DIRV2BIT); } -#define XFS_SB_VERSION_HASLOGV2(sbp) xfs_sb_version_haslogv2(sbp) static inline int xfs_sb_version_haslogv2(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_LOGV2BIT); } -#define XFS_SB_VERSION_HASEXTFLGBIT(sbp) xfs_sb_version_hasextflgbit(sbp) static inline int xfs_sb_version_hasextflgbit(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_EXTFLGBIT); } -#define XFS_SB_VERSION_ADDEXTFLGBIT(sbp) xfs_sb_version_addextflgbit(sbp) -static inline int xfs_sb_version_addextflgbit(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_EXTFLGBIT); -} - -#define XFS_SB_VERSION_SUBEXTFLGBIT(sbp) xfs_sb_version_subextflgbit(sbp) -static inline int xfs_sb_version_subextflgbit(xfs_sb_t *sbp) -{ - return (sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum & ~XFS_SB_VERSION_EXTFLGBIT); -} - -#define XFS_SB_VERSION_HASSECTOR(sbp) xfs_sb_version_hassector(sbp) static inline int xfs_sb_version_hassector(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ ((sbp)->sb_versionnum & XFS_SB_VERSION_SECTORBIT); } -#define XFS_SB_VERSION_HASMOREBITS(sbp) xfs_sb_version_hasmorebits(sbp) static inline int xfs_sb_version_hasmorebits(xfs_sb_t *sbp) { return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ @@ -476,24 +417,22 @@ static inline int xfs_sb_version_hasmore * For example, for a bit defined as XFS_SB_VERSION2_FUNBIT, has a macro: * * SB_VERSION_HASFUNBIT(xfs_sb_t *sbp) - * ((XFS_SB_VERSION_HASMOREBITS(sbp) && + * ((xfs_sb_version_hasmorebits(sbp) && * ((sbp)->sb_features2 & XFS_SB_VERSION2_FUNBIT) */ static inline int xfs_sb_version_haslazysbcount(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_HASMOREBITS(sbp) && \ + return (xfs_sb_version_hasmorebits(sbp) && \ ((sbp)->sb_features2 & XFS_SB_VERSION2_LAZYSBCOUNTBIT)); } -#define 
XFS_SB_VERSION_HASATTR2(sbp) xfs_sb_version_hasattr2(sbp) static inline int xfs_sb_version_hasattr2(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_HASMOREBITS(sbp)) && \ + return (xfs_sb_version_hasmorebits(sbp)) && \ ((sbp)->sb_features2 & XFS_SB_VERSION2_ATTR2BIT); } -#define XFS_SB_VERSION_ADDATTR2(sbp) xfs_sb_version_addattr2(sbp) static inline void xfs_sb_version_addattr2(xfs_sb_t *sbp) { ((sbp)->sb_versionnum = \ Index: linux-2.6-xfs/fs/xfs/xfs_utils.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_utils.c +++ linux-2.6-xfs/fs/xfs/xfs_utils.c @@ -339,10 +339,10 @@ xfs_bump_ino_vers2( ip->i_d.di_onlink = 0; memset(&(ip->i_d.di_pad[0]), 0, sizeof(ip->i_d.di_pad)); mp = tp->t_mountp; - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { spin_lock(&mp->m_sb_lock); - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { - XFS_SB_VERSION_ADDNLINK(&mp->m_sb); + if (!xfs_sb_version_hasnlink(&mp->m_sb)) { + xfs_sb_version_addnlink(&mp->m_sb); spin_unlock(&mp->m_sb_lock); xfs_mod_sb(tp, XFS_SB_VERSIONNUM); } else { Index: linux-2.6-xfs/fs/xfs/xfs_vfsops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vfsops.c +++ linux-2.6-xfs/fs/xfs/xfs_vfsops.c @@ -330,7 +330,7 @@ xfs_finish_flags( int ronly = (mp->m_flags & XFS_MOUNT_RDONLY); /* Fail a mount where the logbuf is smaller then the log stripe */ - if (XFS_SB_VERSION_HASLOGV2(&mp->m_sb)) { + if (xfs_sb_version_haslogv2(&mp->m_sb)) { if ((ap->logbufsize <= 0) && (mp->m_sb.sb_logsunit > XLOG_BIG_RECORD_BSIZE)) { mp->m_logbsize = mp->m_sb.sb_logsunit; @@ -349,7 +349,7 @@ xfs_finish_flags( } } - if (XFS_SB_VERSION_HASATTR2(&mp->m_sb)) { + if (xfs_sb_version_hasattr2(&mp->m_sb)) { mp->m_flags |= XFS_MOUNT_ATTR2; } @@ -366,7 +366,7 @@ xfs_finish_flags( * check for shared mount. 
*/ if (ap->flags & XFSMNT_SHARED) { - if (!XFS_SB_VERSION_HASSHARED(&mp->m_sb)) + if (!xfs_sb_version_hasshared(&mp->m_sb)) return XFS_ERROR(EINVAL); /* @@ -512,7 +512,7 @@ xfs_mount( if (!error && logdev && logdev != ddev) { unsigned int log_sector_size = BBSIZE; - if (XFS_SB_VERSION_HASSECTOR(&mp->m_sb)) + if (xfs_sb_version_hassector(&mp->m_sb)) log_sector_size = mp->m_sb.sb_logsectsize; error = xfs_setsize_buftarg(mp->m_logdev_targp, mp->m_sb.sb_blocksize, Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.c +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.c @@ -4126,7 +4126,7 @@ xfs_free_file_space( * actually need to zero the extent edges. Otherwise xfs_bunmapi * will take care of it for us. */ - if (rt && !XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb)) { + if (rt && !xfs_sb_version_hasextflgbit(&mp->m_sb)) { nimap = 1; error = xfs_bmapi(NULL, ip, startoffset_fsb, 1, 0, NULL, 0, &imap, &nimap, NULL, NULL); From owner-xfs@oss.sgi.com Sat Feb 9 12:45:26 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 09 Feb 2008 12:45:33 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m19KjNlJ011414 for ; Sat, 9 Feb 2008 12:45:26 -0800 X-ASG-Debug-ID: 1202589946-038c02060000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0DDD75BBB96 for ; Sat, 9 Feb 2008 12:45:47 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id o5yDvqtXkKw0dsVn for ; Sat, 09 Feb 2008 12:45:47 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using 
TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id AE6EA18008605; Sat, 9 Feb 2008 14:45:46 -0600 (CST) Message-ID: <47AE10FA.6020708@sandeen.net> Date: Sat, 09 Feb 2008 14:45:46 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs-oss X-ASG-Orig-Subj: Re: [PATCH] recover from iclog allocation failures Subject: Re: [PATCH] recover from iclog allocation failures References: <47AD3E11.7020608@redhat.com> <20080209063329.GA6840@infradead.org> <47AD4BD9.5030605@sandeen.net> In-Reply-To: <47AD4BD9.5030605@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202589948 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41828 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5761/Sat Feb 9 10:02:33 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14391 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Eric Sandeen wrote: > Christoph Hellwig wrote: >>> Index: linux-2.6.24.noarch/fs/xfs/xfs_mount.c >>> =================================================================== >>> --- linux-2.6.24.noarch.orig/fs/xfs/xfs_mount.c >>> +++ linux-2.6.24.noarch/fs/xfs/xfs_mount.c >>> @@ -1007,6 +1007,7 @@ xfs_mountfs( >>> error = XFS_ERROR(EINVAL); >>> goto error1; >>> } >>> + uuid_mounted = 1; >> How is this related to the rest 
of the patch? > > if we errored out, we returned from mount w/o taking the uuid back out > of the table. er, I must have messed up the tree, that's already there isn't it... Let me respin this one... :) -Eric From owner-xfs@oss.sgi.com Sat Feb 9 21:04:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 09 Feb 2008 21:04:31 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1A54PDT016995 for ; Sat, 9 Feb 2008 21:04:27 -0800 X-ASG-Debug-ID: 1202619887-1c0601550000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A3640DD338C for ; Sat, 9 Feb 2008 21:04:47 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id zLEzkQL1x9uupLo2 for ; Sat, 09 Feb 2008 21:04:47 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JO4Mw-0005ho-SG; Sun, 10 Feb 2008 05:04:46 +0000 Date: Sun, 10 Feb 2008 00:04:46 -0500 From: Christoph Hellwig To: Eric Sandeen Cc: xfs-oss X-ASG-Orig-Subj: Re: [PATCH] remove forward declarations for ioctl helpers; let "noinline" do the work Subject: Re: [PATCH] remove forward declarations for ioctl helpers; let "noinline" do the work Message-ID: <20080210050446.GA12398@infradead.org> References: <47AE0232.9000002@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47AE0232.9000002@sandeen.net> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: 
bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202619888 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41861 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5761/Sat Feb 9 10:02:33 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14392 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Sat, Feb 09, 2008 at 01:42:42PM -0600, Eric Sandeen wrote: > (if this one is too purely cosmetic I won't be offended) > > The forward declarations for the xfs_ioctl() helpers and > the associated comment about gcc behavior really aren't > needed; all of these functions are marked STATIC which > includes noinline, and the stack usage won't be a problem. > > This effectively just removes the forward declarations and > moves xfs_ioctl() back to the end of the file. Fine in general, but I'm a bit worried about the too cosmetic one. If the gods at sgi decide it's worth it please get it in ASAP (and that includes 2.6.25). > > Signed-off-by: Eric Sandeen > > --- > > xfs_ioctl.c | 563 ++++++++++++++++---------------------- > 1 files changed, 255 insertions(+), 308 deletions(-) > > Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ioctl.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ioctl.c > +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ioctl.c > @@ -651,314 +651,6 @@ xfs_attrmulti_by_handle( > return -error; > } > > -/* prototypes for a few of the stack-hungry cases that have > - * their own functions.
Functions are defined after their use > - * so gcc doesn't get fancy and inline them with -03 */ > - > -STATIC int > -xfs_ioc_space( > - struct xfs_inode *ip, > - struct inode *inode, > - struct file *filp, > - int flags, > - unsigned int cmd, > - void __user *arg); > - > -STATIC int > -xfs_ioc_bulkstat( > - xfs_mount_t *mp, > - unsigned int cmd, > - void __user *arg); > - > -STATIC int > -xfs_ioc_fsgeometry_v1( > - xfs_mount_t *mp, > - void __user *arg); > - > -STATIC int > -xfs_ioc_fsgeometry( > - xfs_mount_t *mp, > - void __user *arg); > - > -STATIC int > -xfs_ioc_xattr( > - xfs_inode_t *ip, > - struct file *filp, > - unsigned int cmd, > - void __user *arg); > - > -STATIC int > -xfs_ioc_fsgetxattr( > - xfs_inode_t *ip, > - int attr, > - void __user *arg); > - > -STATIC int > -xfs_ioc_getbmap( > - struct xfs_inode *ip, > - int flags, > - unsigned int cmd, > - void __user *arg); > - > -STATIC int > -xfs_ioc_getbmapx( > - struct xfs_inode *ip, > - void __user *arg); > - > -int > -xfs_ioctl( > - xfs_inode_t *ip, > - struct file *filp, > - int ioflags, > - unsigned int cmd, > - void __user *arg) > -{ > - struct inode *inode = filp->f_path.dentry->d_inode; > - xfs_mount_t *mp = ip->i_mount; > - int error; > - > - xfs_itrace_entry(XFS_I(inode)); > - switch (cmd) { > - > - case XFS_IOC_ALLOCSP: > - case XFS_IOC_FREESP: > - case XFS_IOC_RESVSP: > - case XFS_IOC_UNRESVSP: > - case XFS_IOC_ALLOCSP64: > - case XFS_IOC_FREESP64: > - case XFS_IOC_RESVSP64: > - case XFS_IOC_UNRESVSP64: > - /* > - * Only allow the sys admin to reserve space unless > - * unwritten extents are enabled. > - */ > - if (!XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) && > - !capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - return xfs_ioc_space(ip, inode, filp, ioflags, cmd, arg); > - > - case XFS_IOC_DIOINFO: { > - struct dioattr da; > - xfs_buftarg_t *target = > - XFS_IS_REALTIME_INODE(ip) ? 
> - mp->m_rtdev_targp : mp->m_ddev_targp; > - > - da.d_mem = da.d_miniosz = 1 << target->bt_sshift; > - da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1); > - > - if (copy_to_user(arg, &da, sizeof(da))) > - return -XFS_ERROR(EFAULT); > - return 0; > - } > - > - case XFS_IOC_FSBULKSTAT_SINGLE: > - case XFS_IOC_FSBULKSTAT: > - case XFS_IOC_FSINUMBERS: > - return xfs_ioc_bulkstat(mp, cmd, arg); > - > - case XFS_IOC_FSGEOMETRY_V1: > - return xfs_ioc_fsgeometry_v1(mp, arg); > - > - case XFS_IOC_FSGEOMETRY: > - return xfs_ioc_fsgeometry(mp, arg); > - > - case XFS_IOC_GETVERSION: > - return put_user(inode->i_generation, (int __user *)arg); > - > - case XFS_IOC_FSGETXATTR: > - return xfs_ioc_fsgetxattr(ip, 0, arg); > - case XFS_IOC_FSGETXATTRA: > - return xfs_ioc_fsgetxattr(ip, 1, arg); > - case XFS_IOC_GETXFLAGS: > - case XFS_IOC_SETXFLAGS: > - case XFS_IOC_FSSETXATTR: > - return xfs_ioc_xattr(ip, filp, cmd, arg); > - > - case XFS_IOC_FSSETDM: { > - struct fsdmidata dmi; > - > - if (copy_from_user(&dmi, arg, sizeof(dmi))) > - return -XFS_ERROR(EFAULT); > - > - error = xfs_set_dmattrs(ip, dmi.fsd_dmevmask, > - dmi.fsd_dmstate); > - return -error; > - } > - > - case XFS_IOC_GETBMAP: > - case XFS_IOC_GETBMAPA: > - return xfs_ioc_getbmap(ip, ioflags, cmd, arg); > - > - case XFS_IOC_GETBMAPX: > - return xfs_ioc_getbmapx(ip, arg); > - > - case XFS_IOC_FD_TO_HANDLE: > - case XFS_IOC_PATH_TO_HANDLE: > - case XFS_IOC_PATH_TO_FSHANDLE: > - return xfs_find_handle(cmd, arg); > - > - case XFS_IOC_OPEN_BY_HANDLE: > - return xfs_open_by_handle(mp, arg, filp, inode); > - > - case XFS_IOC_FSSETDM_BY_HANDLE: > - return xfs_fssetdm_by_handle(mp, arg, inode); > - > - case XFS_IOC_READLINK_BY_HANDLE: > - return xfs_readlink_by_handle(mp, arg, inode); > - > - case XFS_IOC_ATTRLIST_BY_HANDLE: > - return xfs_attrlist_by_handle(mp, arg, inode); > - > - case XFS_IOC_ATTRMULTI_BY_HANDLE: > - return xfs_attrmulti_by_handle(mp, arg, inode); > - > - case XFS_IOC_SWAPEXT: { > - error = xfs_swapext((struct 
xfs_swapext __user *)arg); > - return -error; > - } > - > - case XFS_IOC_FSCOUNTS: { > - xfs_fsop_counts_t out; > - > - error = xfs_fs_counts(mp, &out); > - if (error) > - return -error; > - > - if (copy_to_user(arg, &out, sizeof(out))) > - return -XFS_ERROR(EFAULT); > - return 0; > - } > - > - case XFS_IOC_SET_RESBLKS: { > - xfs_fsop_resblks_t inout; > - __uint64_t in; > - > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - if (copy_from_user(&inout, arg, sizeof(inout))) > - return -XFS_ERROR(EFAULT); > - > - /* input parameter is passed in resblks field of structure */ > - in = inout.resblks; > - error = xfs_reserve_blocks(mp, &in, &inout); > - if (error) > - return -error; > - > - if (copy_to_user(arg, &inout, sizeof(inout))) > - return -XFS_ERROR(EFAULT); > - return 0; > - } > - > - case XFS_IOC_GET_RESBLKS: { > - xfs_fsop_resblks_t out; > - > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - error = xfs_reserve_blocks(mp, NULL, &out); > - if (error) > - return -error; > - > - if (copy_to_user(arg, &out, sizeof(out))) > - return -XFS_ERROR(EFAULT); > - > - return 0; > - } > - > - case XFS_IOC_FSGROWFSDATA: { > - xfs_growfs_data_t in; > - > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - if (copy_from_user(&in, arg, sizeof(in))) > - return -XFS_ERROR(EFAULT); > - > - error = xfs_growfs_data(mp, &in); > - return -error; > - } > - > - case XFS_IOC_FSGROWFSLOG: { > - xfs_growfs_log_t in; > - > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - if (copy_from_user(&in, arg, sizeof(in))) > - return -XFS_ERROR(EFAULT); > - > - error = xfs_growfs_log(mp, &in); > - return -error; > - } > - > - case XFS_IOC_FSGROWFSRT: { > - xfs_growfs_rt_t in; > - > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - if (copy_from_user(&in, arg, sizeof(in))) > - return -XFS_ERROR(EFAULT); > - > - error = xfs_growfs_rt(mp, &in); > - return -error; > - } > - > - case XFS_IOC_FREEZE: > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - if 
(inode->i_sb->s_frozen == SB_UNFROZEN) > - freeze_bdev(inode->i_sb->s_bdev); > - return 0; > - > - case XFS_IOC_THAW: > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - if (inode->i_sb->s_frozen != SB_UNFROZEN) > - thaw_bdev(inode->i_sb->s_bdev, inode->i_sb); > - return 0; > - > - case XFS_IOC_GOINGDOWN: { > - __uint32_t in; > - > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - if (get_user(in, (__uint32_t __user *)arg)) > - return -XFS_ERROR(EFAULT); > - > - error = xfs_fs_goingdown(mp, in); > - return -error; > - } > - > - case XFS_IOC_ERROR_INJECTION: { > - xfs_error_injection_t in; > - > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - if (copy_from_user(&in, arg, sizeof(in))) > - return -XFS_ERROR(EFAULT); > - > - error = xfs_errortag_add(in.errtag, mp); > - return -error; > - } > - > - case XFS_IOC_ERROR_CLEARALL: > - if (!capable(CAP_SYS_ADMIN)) > - return -EPERM; > - > - error = xfs_errortag_clearall(mp, 1); > - return -error; > - > - default: > - return -ENOTTY; > - } > -} > - > STATIC int > xfs_ioc_space( > struct xfs_inode *ip, > @@ -1332,3 +1024,258 @@ xfs_ioc_getbmapx( > > return 0; > } > + > +int > +xfs_ioctl( > + xfs_inode_t *ip, > + struct file *filp, > + int ioflags, > + unsigned int cmd, > + void __user *arg) > +{ > + struct inode *inode = filp->f_path.dentry->d_inode; > + xfs_mount_t *mp = ip->i_mount; > + int error; > + > + xfs_itrace_entry(XFS_I(inode)); > + switch (cmd) { > + > + case XFS_IOC_ALLOCSP: > + case XFS_IOC_FREESP: > + case XFS_IOC_RESVSP: > + case XFS_IOC_UNRESVSP: > + case XFS_IOC_ALLOCSP64: > + case XFS_IOC_FREESP64: > + case XFS_IOC_RESVSP64: > + case XFS_IOC_UNRESVSP64: > + /* > + * Only allow the sys admin to reserve space unless > + * unwritten extents are enabled. 
> + */ > + if (!XFS_SB_VERSION_HASEXTFLGBIT(&mp->m_sb) && > + !capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + return xfs_ioc_space(ip, inode, filp, ioflags, cmd, arg); > + > + case XFS_IOC_DIOINFO: { > + struct dioattr da; > + xfs_buftarg_t *target = > + XFS_IS_REALTIME_INODE(ip) ? > + mp->m_rtdev_targp : mp->m_ddev_targp; > + > + da.d_mem = da.d_miniosz = 1 << target->bt_sshift; > + da.d_maxiosz = INT_MAX & ~(da.d_miniosz - 1); > + > + if (copy_to_user(arg, &da, sizeof(da))) > + return -XFS_ERROR(EFAULT); > + return 0; > + } > + > + case XFS_IOC_FSBULKSTAT_SINGLE: > + case XFS_IOC_FSBULKSTAT: > + case XFS_IOC_FSINUMBERS: > + return xfs_ioc_bulkstat(mp, cmd, arg); > + > + case XFS_IOC_FSGEOMETRY_V1: > + return xfs_ioc_fsgeometry_v1(mp, arg); > + > + case XFS_IOC_FSGEOMETRY: > + return xfs_ioc_fsgeometry(mp, arg); > + > + case XFS_IOC_GETVERSION: > + return put_user(inode->i_generation, (int __user *)arg); > + > + case XFS_IOC_FSGETXATTR: > + return xfs_ioc_fsgetxattr(ip, 0, arg); > + case XFS_IOC_FSGETXATTRA: > + return xfs_ioc_fsgetxattr(ip, 1, arg); > + case XFS_IOC_GETXFLAGS: > + case XFS_IOC_SETXFLAGS: > + case XFS_IOC_FSSETXATTR: > + return xfs_ioc_xattr(ip, filp, cmd, arg); > + > + case XFS_IOC_FSSETDM: { > + struct fsdmidata dmi; > + > + if (copy_from_user(&dmi, arg, sizeof(dmi))) > + return -XFS_ERROR(EFAULT); > + > + error = xfs_set_dmattrs(ip, dmi.fsd_dmevmask, > + dmi.fsd_dmstate); > + return -error; > + } > + > + case XFS_IOC_GETBMAP: > + case XFS_IOC_GETBMAPA: > + return xfs_ioc_getbmap(ip, ioflags, cmd, arg); > + > + case XFS_IOC_GETBMAPX: > + return xfs_ioc_getbmapx(ip, arg); > + > + case XFS_IOC_FD_TO_HANDLE: > + case XFS_IOC_PATH_TO_HANDLE: > + case XFS_IOC_PATH_TO_FSHANDLE: > + return xfs_find_handle(cmd, arg); > + > + case XFS_IOC_OPEN_BY_HANDLE: > + return xfs_open_by_handle(mp, arg, filp, inode); > + > + case XFS_IOC_FSSETDM_BY_HANDLE: > + return xfs_fssetdm_by_handle(mp, arg, inode); > + > + case XFS_IOC_READLINK_BY_HANDLE: > + return 
xfs_readlink_by_handle(mp, arg, inode); > + > + case XFS_IOC_ATTRLIST_BY_HANDLE: > + return xfs_attrlist_by_handle(mp, arg, inode); > + > + case XFS_IOC_ATTRMULTI_BY_HANDLE: > + return xfs_attrmulti_by_handle(mp, arg, inode); > + > + case XFS_IOC_SWAPEXT: { > + error = xfs_swapext((struct xfs_swapext __user *)arg); > + return -error; > + } > + > + case XFS_IOC_FSCOUNTS: { > + xfs_fsop_counts_t out; > + > + error = xfs_fs_counts(mp, &out); > + if (error) > + return -error; > + > + if (copy_to_user(arg, &out, sizeof(out))) > + return -XFS_ERROR(EFAULT); > + return 0; > + } > + > + case XFS_IOC_SET_RESBLKS: { > + xfs_fsop_resblks_t inout; > + __uint64_t in; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (copy_from_user(&inout, arg, sizeof(inout))) > + return -XFS_ERROR(EFAULT); > + > + /* input parameter is passed in resblks field of structure */ > + in = inout.resblks; > + error = xfs_reserve_blocks(mp, &in, &inout); > + if (error) > + return -error; > + > + if (copy_to_user(arg, &inout, sizeof(inout))) > + return -XFS_ERROR(EFAULT); > + return 0; > + } > + > + case XFS_IOC_GET_RESBLKS: { > + xfs_fsop_resblks_t out; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + error = xfs_reserve_blocks(mp, NULL, &out); > + if (error) > + return -error; > + > + if (copy_to_user(arg, &out, sizeof(out))) > + return -XFS_ERROR(EFAULT); > + > + return 0; > + } > + > + case XFS_IOC_FSGROWFSDATA: { > + xfs_growfs_data_t in; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (copy_from_user(&in, arg, sizeof(in))) > + return -XFS_ERROR(EFAULT); > + > + error = xfs_growfs_data(mp, &in); > + return -error; > + } > + > + case XFS_IOC_FSGROWFSLOG: { > + xfs_growfs_log_t in; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (copy_from_user(&in, arg, sizeof(in))) > + return -XFS_ERROR(EFAULT); > + > + error = xfs_growfs_log(mp, &in); > + return -error; > + } > + > + case XFS_IOC_FSGROWFSRT: { > + xfs_growfs_rt_t in; > + > + if 
(!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (copy_from_user(&in, arg, sizeof(in))) > + return -XFS_ERROR(EFAULT); > + > + error = xfs_growfs_rt(mp, &in); > + return -error; > + } > + > + case XFS_IOC_FREEZE: > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (inode->i_sb->s_frozen == SB_UNFROZEN) > + freeze_bdev(inode->i_sb->s_bdev); > + return 0; > + > + case XFS_IOC_THAW: > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + if (inode->i_sb->s_frozen != SB_UNFROZEN) > + thaw_bdev(inode->i_sb->s_bdev, inode->i_sb); > + return 0; > + > + case XFS_IOC_GOINGDOWN: { > + __uint32_t in; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (get_user(in, (__uint32_t __user *)arg)) > + return -XFS_ERROR(EFAULT); > + > + error = xfs_fs_goingdown(mp, in); > + return -error; > + } > + > + case XFS_IOC_ERROR_INJECTION: { > + xfs_error_injection_t in; > + > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + if (copy_from_user(&in, arg, sizeof(in))) > + return -XFS_ERROR(EFAULT); > + > + error = xfs_errortag_add(in.errtag, mp); > + return -error; > + } > + > + case XFS_IOC_ERROR_CLEARALL: > + if (!capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + error = xfs_errortag_clearall(mp, 1); > + return -error; > + > + default: > + return -ENOTTY; > + } > +} > + > > > ---end quoted text---
From owner-xfs@oss.sgi.com Sun Feb 10 16:17:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 10 Feb 2008 16:17:32 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1B0HLfv023561 for ; Sun, 10 Feb 2008 16:17:23 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA01229; Mon, 11 Feb 2008 11:17:35 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1B0HWLF58832737; Mon, 11 Feb 2008 11:17:33 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1B0HSFt58727187; Mon, 11 Feb 2008 11:17:28 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Mon, 11 Feb 2008 11:17:28 +1100 From: David Chinner To: Sven Geggus Cc: David Chinner , xfs@oss.sgi.com, Tobias Ulmer , Andrea Perotti Subject: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Message-ID: <20080211001727.GO155407@sgi.com> References: <20080205052418.GU155259@sgi.com> <20080209120423.GA6699@diesel.geggus.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080209120423.GA6699@diesel.geggus.net> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5767/Sun Feb 10 12:57:54 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14394 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk
X-list: xfs On Sat, Feb 09, 2008 at 01:04:24PM +0100, Sven Geggus wrote: > David Chinner wrote on Tuesday, 5 February at 06:24: > > > Can you try the patch attached below > > Am I correct in the assumption, that this did not make it into > 2.6.24.1? Right - the fix wasn't in Linus' kernel by the time 2.6.24.1 was released. > Can we reckon that this patch will get included in one of the next > minor releases? Already queued for 2.6.24.2. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Feb 10 19:39:06 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 10 Feb 2008 19:39:12 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1B3d2Is004578 for ; Sun, 10 Feb 2008 19:39:05 -0800 Received: from pc-bnaujok.melbourne.sgi.com (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA05429 for ; Mon, 11 Feb 2008 14:39:25 +1100 Date: Mon, 11 Feb 2008 14:40:46 +1100 To: "xfs@oss.sgi.com" Subject: New source tarballs available From: "Barry Naujok" Organization: SGI Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8 MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Message-ID: User-Agent: Opera Mail/9.24 (Win32) X-Virus-Scanned: ClamAV 0.91.2/5767/Sun Feb 10 12:57:54 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14395 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@sgi.com Precedence: bulk X-list: xfs Just an FYI... acl 2.2.47 attr 2.4.41 xfsdump 2.2.48 xfsprogs 2.9.6 (No changes to dmapi.)
From owner-xfs@oss.sgi.com Sun Feb 10 20:52:37 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 10 Feb 2008 20:52:41 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_00,WEIRD_PORT autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1B4qXCd014269 for ; Sun, 10 Feb 2008 20:52:35 -0800 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA06574; Mon, 11 Feb 2008 15:52:50 +1100 Message-ID: <47AFD5B5.7040506@sgi.com> Date: Mon, 11 Feb 2008 15:57:25 +1100 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.9 (X11/20071031) MIME-Version: 1.0 To: Eric Sandeen CC: xfs@oss.sgi.com Subject: Re: [GIT PULL] XFS update for 2.6.25 References: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> <47AD284F.7080603@sandeen.net> In-Reply-To: <47AD284F.7080603@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5767/Sun Feb 10 12:57:54 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14396 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Eric, The other makefile changes caused a merge failure because of changes we keep in our development tree that are not in mainline. I didn't have time to fix them for rc1 so I'll get them out in rc2. I know the changes are good and very useful so sorry for the delay. 
Lachlan Eric Sandeen wrote: > Lachlan McIlroy wrote: >> Please pull from the for-linus branch: >> git pull git://oss.sgi.com:8090/xfs/xfs-2.6.git for-linus >> >> This will update the following files: >> >> fs/xfs/Makefile-linux-2.6 | 1 - > > Is there a reason the other various makefile updates still haven't been > pushed? They're a lot tidier now, and they facilitate out-of-tree > building... > > Thanks, > -Eric > > Remove Makefile wrappers in XFS > > Makefile (and Kbuild) would include Makefile-linux-26 > I doubt XFS will really still compile on 2.4; so drop that. This > moves Makefile-linux-26 into Makefile and drops Kbuild. > Also having wrappers as both Kbuild and Makefile seemed redundant > anyways. > > The patch is relatively large because it renames a file, but > no functional changes. > > Signed-off-by: Andi Kleen > Merge of xfs-linux-melb:xfs-kern:29781a by kenmcd. > > Remove Makefile wrappers in XFS. > > Fix up xfs out-of-tree builds. (a.k.a. external modules) > > Change -I include directives to find headers in the out-of-tree spot. > This allows a directory containing only xfs files to be built as: > > # make -C /path/to/kernel M=`pwd` > > Signed-off-by: Eric Sandeen > Merge of xfs-linux-melb:xfs-kern:29878a by kenmcd. > > fix up out-of-tree xfs builds. 
> > From owner-xfs@oss.sgi.com Sun Feb 10 20:57:13 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 10 Feb 2008 20:57:16 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1B4vAf0015451 for ; Sun, 10 Feb 2008 20:57:13 -0800 X-ASG-Debug-ID: 1202705853-248903bd0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 67BA8DDB1F5 for ; Sun, 10 Feb 2008 20:57:33 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id Va0NZoPHaFFEdzWC for ; Sun, 10 Feb 2008 20:57:33 -0800 (PST) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3CACF18DF47AE; Sun, 10 Feb 2008 22:57:32 -0600 (CST) Message-ID: <47AFD5BB.7090607@sandeen.net> Date: Sun, 10 Feb 2008 22:57:31 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: lachlan@sgi.com CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [GIT PULL] XFS update for 2.6.25 Subject: Re: [GIT PULL] XFS update for 2.6.25 References: <20080208022705.0DB1058C4C11@chook.melbourne.sgi.com> <47AD284F.7080603@sandeen.net> <47AFD5B5.7040506@sgi.com> In-Reply-To: <47AFD5B5.7040506@sgi.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202705854 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 
using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41956 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5767/Sun Feb 10 12:57:54 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14397 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Lachlan McIlroy wrote: > Eric, > > The other makefile changes caused a merge failure because of changes we > keep in our development tree that are not in mainline. I didn't have > time to fix them for rc1 so I'll get them out in rc2. I know the changes > are good and very useful so sorry for the delay. > > Lachlan Thanks Lachlan, no worries. Hopefully the patches I sent will help. -Eric From owner-xfs@oss.sgi.com Sun Feb 10 22:35:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 10 Feb 2008 22:35:37 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1B6ZEJE003813 for ; Sun, 10 Feb 2008 22:35:15 -0800 X-ASG-Debug-ID: 1202711737-758e00d20000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from amanpulo.fs3.ph (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B97AD5BF5CE for ; Sun, 10 Feb 2008 22:35:38 -0800 (PST) Received: from amanpulo.fs3.ph (amanpulo.fs3.ph [202.81.162.101]) by cuda.sgi.com with ESMTP id I3AagPgudgHJ6nc6 for ; Sun, 10 Feb 2008 22:35:38 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by amanpulo.fs3.ph (Postfix) with ESMTP id 2056B2806A350 for ; Mon, 
11 Feb 2008 14:35:36 +0800 (PHT) X-Virus-Scanned: ClamAV 0.91.2/5767/Sun Feb 10 12:57:54 2008 on oss.sgi.com X-Virus-Scanned: Debian amavisd-new at amanpulo.fs3.ph Received: from amanpulo.fs3.ph ([127.0.0.1]) by localhost (amanpulo.fs3.ph [127.0.0.1]) (amavisd-new, port 10024) with LMTP id Ne6o6Ci3saH2 for ; Mon, 11 Feb 2008 14:35:27 +0800 (PHT) Received: from humanitas.leathercollection.ph (gusi.leathercollection.ph [202.163.192.10]) (using SSLv3 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) by amanpulo.fs3.ph (Postfix) with ESMTP id 1247328056706 for ; Mon, 11 Feb 2008 14:35:26 +0800 (PHT) X-ASG-Orig-Subj: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Subject: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash From: Federico Sevilla III To: xfs@oss.sgi.com In-Reply-To: <20080211001727.GO155407@sgi.com> References: <20080205052418.GU155259@sgi.com> <20080209120423.GA6699@diesel.geggus.net> <20080211001727.GO155407@sgi.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-B4Z5fp3dsNBz1w9ttDd3" Organization: F S 3 Consulting Inc. 
Date: Mon, 11 Feb 2008 14:35:25 +0800 Message-Id: <1202711725.4679.7.camel@humanitas.fs3.ph> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-Barracuda-Connect: amanpulo.fs3.ph[202.81.162.101] X-Barracuda-Start-Time: 1202711738 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.41963 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Status: Clean X-archive-position: 14398 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jijo@fs3.ph Precedence: bulk X-list: xfs --=-B4Z5fp3dsNBz1w9ttDd3 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Mon, 2008-02-11 at 11:17 +1100, David Chinner wrote: > > Can we reckon that this patch will get included in one of the next > > minor releases? > > Already queued for 2.6.24.2. 2.6.24.2 has been released to address the vmsplice issue. Unfortunately, no other changes seem to have been included. Hopefully, the xfs_file_readdir patch will make it to 2.6.24.3. -- Federico Sevilla III F S 3 Consulting Inc.
http://www.fs3.ph --=-B4Z5fp3dsNBz1w9ttDd3 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQBHr+yt5rCBSJO3Rr4RAm7bAJ9NzYTg3e+DXaELNycXe0ggWIJ8BgCfQfUR D79B18ayHjZqGh2BbgNE5WU= =QVh1 -----END PGP SIGNATURE----- --=-B4Z5fp3dsNBz1w9ttDd3-- From owner-xfs@oss.sgi.com Sun Feb 10 23:44:33 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 10 Feb 2008 23:44:50 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_73, SUBJ_ALL_CAPS autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1B7iOJt003541 for ; Sun, 10 Feb 2008 23:44:32 -0800 Received: from linuxbuild.melbourne.sgi.com (linuxbuild.melbourne.sgi.com [134.14.54.115]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA10097; Mon, 11 Feb 2008 18:44:42 +1100 From: donaldd@sgi.com Received: by linuxbuild.melbourne.sgi.com (Postfix, from userid 16365) id 604F8334AB1F; Mon, 11 Feb 2008 18:44:42 +1100 (EST) To: xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com Subject: PARTIAL TAKE 971186 - Message-Id: <20080211074442.604F8334AB1F@linuxbuild.melbourne.sgi.com> Date: Mon, 11 Feb 2008 18:44:42 +1100 (EST) X-Virus-Scanned: ClamAV 0.91.2/5768/Sun Feb 10 22:31:03 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14399 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Remove the xfs_refcache, it was only needed while we were still building for 2.4 kernels. 
Date: Mon Feb 11 18:43:07 AEDT 2008 Workarea: linuxbuild.melbourne.sgi.com:/home/donaldd/isms/2.6.x-xfs Inspected by: lachlan,hch The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30472a fs/xfs/xfs_vnodeops.c - 1.732 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.732&r2=text&tr2=1.731&f=h - Remove xfs_refcache. fs/xfs/xfs_vfsops.c - 1.552 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.552&r2=text&tr2=1.551&f=h - Remove xfs_refcache. fs/xfs/xfs_inode.h - 1.241 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.h.diff?r1=text&tr1=1.241&r2=text&tr2=1.240&f=h - Remove xfs_refcache. fs/xfs/xfs_rename.c - 1.78 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rename.c.diff?r1=text&tr1=1.78&r2=text&tr2=1.77&f=h - Remove xfs_refcache. fs/xfs/xfs_refcache.c - 1.8 - deleted http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_refcache.c.diff?r1=text&tr1=1.8&r2=text&tr2=1.7&f=h - Remove xfs_refcache. fs/xfs/xfs_refcache.h - 1.3 - deleted http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_refcache.h.diff?r1=text&tr1=1.3&r2=text&tr2=1.2&f=h - Remove xfs_refcache. fs/xfs/linux-2.6/xfs_linux.h - 1.163 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_linux.h.diff?r1=text&tr1=1.163&r2=text&tr2=1.162&f=h - Remove xfs_refcache. fs/xfs/linux-2.6/xfs_ksyms.c - 1.79 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_ksyms.c.diff?r1=text&tr1=1.79&r2=text&tr2=1.78&f=h - Remove xfs_refcache. 
From owner-xfs@oss.sgi.com Mon Feb 11 05:00:40 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 05:00:47 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from relay.sgi.com (relay2.corp.sgi.com [192.26.58.22]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1BD0eJ9006784 for ; Mon, 11 Feb 2008 05:00:40 -0800 Received: from outhouse.melbourne.sgi.com (outhouse.melbourne.sgi.com [134.14.52.145]) by relay2.corp.sgi.com (Postfix) with ESMTP id 1A810304066; Mon, 11 Feb 2008 05:01:00 -0800 (PST) Received: from [134.15.251.2] (melb-sw-corp-251-2.corp.sgi.com [134.15.251.2]) by outhouse.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1BD0rTG2015593; Tue, 12 Feb 2008 00:00:57 +1100 (AEDT) Message-ID: <47B04706.8010002@sgi.com> Date: Tue, 12 Feb 2008 00:00:54 +1100 From: Donald Douwsma User-Agent: Thunderbird 2.0.0.6 (X11/20071022) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs-oss Subject: Re: [review] Remove the xfs refcache References: <4765EC66.5020202@sgi.com> <20080207083554.GA11119@infradead.org> In-Reply-To: <20080207083554.GA11119@infradead.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5769/Mon Feb 11 02:56:45 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14400 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Christoph Hellwig wrote: > On Mon, Dec 17, 2007 at 02:26:30PM +1100, Donald Douwsma wrote: >> Remove the xfs_refcache, it was only needed while we were still building for >> 2.4 kernels. > > Given that we finally agreed that this form of refcache shouldn't come > back can you commit the patch? Done. 
I'll see if we can get it committed with the Makefile cleanups, that should be the end of the 2.4 crud. Don From owner-xfs@oss.sgi.com Mon Feb 11 08:32:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 08:32:37 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1BGWUuU025078 for ; Mon, 11 Feb 2008 08:32:31 -0800 X-ASG-Debug-ID: 1202747574-243202ce0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 619E1DE1410 for ; Mon, 11 Feb 2008 08:32:54 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id Fnet3VEK3ClAjrOb for ; Mon, 11 Feb 2008 08:32:54 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JObZu-00059A-3v; Mon, 11 Feb 2008 16:32:22 +0000 Date: Mon, 11 Feb 2008 11:32:22 -0500 From: Christoph Hellwig To: Donald Douwsma Cc: xfs-oss X-ASG-Orig-Subj: Re: [review] Remove the xfs refcache Subject: Re: [review] Remove the xfs refcache Message-ID: <20080211163222.GA19745@infradead.org> References: <4765EC66.5020202@sgi.com> <20080207083554.GA11119@infradead.org> <47B04706.8010002@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47B04706.8010002@sgi.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1202747575 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by 
cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42002 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5774/Mon Feb 11 06:30:56 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14401 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Tue, Feb 12, 2008 at 12:00:54AM +1100, Donald Douwsma wrote: > > Given that we finally agreed that this form of refcache shouldn't come > > back can you commit the patch? > > Done. I'll see if we can get it committed with the Makefile cleanups, that > should be the end of the 2.4 crud. Thanks! From owner-xfs@oss.sgi.com Mon Feb 11 08:46:11 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 08:46:18 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_54 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1BGk9Qu026003 for ; Mon, 11 Feb 2008 08:46:11 -0800 X-ASG-Debug-ID: 1202748392-2a1802ea0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from out4.smtp.messagingengine.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C3D67DE165E for ; Mon, 11 Feb 2008 08:46:33 -0800 (PST) Received: from out4.smtp.messagingengine.com (out4.smtp.messagingengine.com [66.111.4.28]) by cuda.sgi.com with ESMTP id FD304JH0CRMSQPvU for ; Mon, 11 Feb 2008 08:46:33 -0800 (PST) Received: from compute1.internal (compute1.internal 
[10.202.2.41]) by out1.messagingengine.com (Postfix) with ESMTP id A4AEB8834C for ; Mon, 11 Feb 2008 11:46:29 -0500 (EST) Received: from web5.messagingengine.com ([10.202.2.214]) by compute1.internal (MEProxy); Mon, 11 Feb 2008 11:46:29 -0500 Received: by web5.messagingengine.com (Postfix, from userid 99) id 713726960D; Mon, 11 Feb 2008 11:46:29 -0500 (EST) Message-Id: <1202748389.28320.1236240801@webmail.messagingengine.com> X-Sasl-Enc: k+Ueo8PE2Lm9ygHD/YSrUzcAyrsX232oFVf6w/wF14ID 1202748389 From: "Felix E. Klee" To: "xfs-oss" Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="ISO-8859-1" MIME-Version: 1.0 X-Mailer: MessagingEngine.com Webmail Interface X-ASG-Orig-Subj: Data safety horror stories? Subject: Data safety horror stories? Date: Mon, 11 Feb 2008 17:46:29 +0100 X-Barracuda-Connect: out4.smtp.messagingengine.com[66.111.4.28] X-Barracuda-Start-Time: 1202748393 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42004 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5774/Mon Feb 11 06:30:56 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14402 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: felix.klee@inka.de Precedence: bulk X-list: xfs I heard that, in case of a power failure, XFS may lose data, even data that was already existing on the disk. For example, I heard horror stories of files being overwritten with zeros. Are those stories true? If so: * Do you recommend not using XFS on devices that may frequently fail due to power failure? 
* Is it possible to find out what files have been damaged? If not,
  will only files be affected that have been changed during the last
  couple of hours?

* Are there options to increase data safety? Should one run a regular
  "sync" in a cron job?

* Is it unsafe to use XFS in a virtual machine which may sometimes be
  terminated without proper shutdown? Suppose the virtual machine is
  running under Windows, and Windows may sometimes be terminated
  without proper shutdown.

I currently am using XFS under Ubuntu 7.10 (Kernel 2.6.22), running in a
virtual machine (VMware) under Windows. The XFS file system is in a
native partition on a second HDD.

--
Felix E. Klee
Jabber/Google Talk: feklee@jabber.org, SIP: 9779619@sipgate.de
ICQ: 158124695, Yahoo!: feklee, AIM: felix.klee@inka.de
Gizmo: felixklee, Skype: felix.klee

From owner-xfs@oss.sgi.com Mon Feb 11 09:11:56 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 09:12:02 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1BHBsAs027326 for ; Mon, 11 Feb 2008 09:11:55 -0800 X-ASG-Debug-ID: 1202749937-776f001a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ug-out-1314.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B50DCDE1979 for ; Mon, 11 Feb 2008 09:12:17 -0800 (PST) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.175]) by cuda.sgi.com with ESMTP id LnZtcQ2qAsCrVSbm for ; Mon, 11 Feb 2008 09:12:17 -0800 (PST) Received: by ug-out-1314.google.com with SMTP id o29so841501ugd.20 for ; Mon, 11 Feb 2008 09:12:16 -0800 (PST) Received: by 10.67.29.20 with SMTP id g20mr9297187ugj.54.1202749936317; Mon, 11 Feb 2008 09:12:16 -0800 (PST) Received: from
teal.hq.k1024.org ( [84.75.117.152]) by mx.google.com with ESMTPS id m38sm2190960ugd.44.2008.02.11.09.12.13 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 11 Feb 2008 09:12:15 -0800 (PST) Received: by teal.hq.k1024.org (Postfix, from userid 4004) id 527CE40A0B7; Mon, 11 Feb 2008 18:12:09 +0100 (CET) Date: Mon, 11 Feb 2008 18:12:09 +0100 From: Iustin Pop To: "Felix E. Klee" Cc: xfs-oss X-ASG-Orig-Subj: Re: Data safety horror stories? Subject: Re: Data safety horror stories? Message-ID: <20080211171209.GA7567@teal.hq.k1024.org> Mail-Followup-To: "Felix E. Klee" , xfs-oss References: <1202748389.28320.1236240801@webmail.messagingengine.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1202748389.28320.1236240801@webmail.messagingengine.com> X-Linux: This message was written on Linux X-Header: /usr/include gives great headers User-Agent: Mutt/1.5.17+20080114 (2008-01-14) X-Barracuda-Connect: ug-out-1314.google.com[66.249.92.175] X-Barracuda-Start-Time: 1202749938 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42006 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5774/Mon Feb 11 06:30:56 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14403 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: iusty@k1024.org Precedence: bulk X-list: xfs On Mon, Feb 11, 2008 at 05:46:29PM +0100, Felix E. Klee wrote: > I heard that, in case of a power failure, XFS may lose data, even data > that was already existing on the disk. 
For example, I heard horror
> stories of files being overwritten with zeros.
>
> Are those stories true?

No, XFS will not lose any data that the application has committed to
the disk. Improperly written applications and/or improperly configured
systems might have issues with recently written files losing data.

FWIW: I have never lost data with XFS, neither on home computers nor in
SAN environments (in the presence of link/path failure).

Be sure to read the FAQ, especially the section about write cache on
desktop/consumer HDDs.

Just my opinion as an XFS user, your mileage might vary.

iustin

From owner-xfs@oss.sgi.com Mon Feb 11 13:23:21 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 13:23:27 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1BLNHBD014435 for ; Mon, 11 Feb 2008 13:23:21 -0800 X-ASG-Debug-ID: 1202765020-128c013d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from out4.smtp.messagingengine.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8ED58DE5184 for ; Mon, 11 Feb 2008 13:23:40 -0800 (PST) Received: from out4.smtp.messagingengine.com (out4.smtp.messagingengine.com [66.111.4.28]) by cuda.sgi.com with ESMTP id 8Mu7Ek7W3NQtGPKs for ; Mon, 11 Feb 2008 13:23:40 -0800 (PST) Received: from compute1.internal (compute1.internal [10.202.2.41]) by out1.messagingengine.com (Postfix) with ESMTP id 0C93F908E3; Mon, 11 Feb 2008 16:23:10 -0500 (EST) Received: from web8.messagingengine.com ([10.202.2.217]) by compute1.internal (MEProxy); Mon, 11 Feb 2008 16:23:10 -0500 Received: by web8.messagingengine.com (Postfix, from userid 99) id DA78647CE9; Mon, 11 Feb 2008 16:23:09 -0500 (EST) Message-Id:
<1202764989.11126.1236296081@webmail.messagingengine.com> X-Sasl-Enc: +RnNCMxm8bIa2I8eodHKJv7fJq5BTkOP92QmtHi3XyDt 1202764989 From: "Felix E. Klee" To: "Iustin Pop" Cc: "xfs-oss" Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="ISO-8859-1" MIME-Version: 1.0 X-Mailer: MessagingEngine.com Webmail Interface References: <1202748389.28320.1236240801@webmail.messagingengine.com> <20080211171209.GA7567@teal.hq.k1024.org> X-ASG-Orig-Subj: Re: Data safety horror stories? Subject: Re: Data safety horror stories? In-Reply-To: <20080211171209.GA7567@teal.hq.k1024.org> Date: Mon, 11 Feb 2008 22:23:09 +0100 X-Barracuda-Connect: out4.smtp.messagingengine.com[66.111.4.28] X-Barracuda-Start-Time: 1202765021 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42021 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5779/Mon Feb 11 11:56:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14404 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: felix.klee@inka.de Precedence: bulk X-list: xfs Hi Justin, thanks for the info! On Mon, 11 Feb 2008 18:12:09 +0100, "Iustin Pop" said: > No, XFS will not lose any data that the application has committed to > the disk. OK, but just to make sure: The following FAQ entry refers only to *newly* created files - right? http://oss.sgi.com/projects/xfs/faq.html#nulls > Improperly written applications and/or improperly configured systems > might have issues with recently written files losing data. 
Again, just to make sure that I understood you correctly: Could you name
an example?

> Be sure to read the FAQ, especially the section about write cache on
> desktop/consumer HDDs.

You are probably referring to the following entry. I have now disabled
the write cache of the 2nd HDD in Windows (remember my configuration). I
wonder though: Wouldn't it be possible to design journaling in such a way
that the write cache never causes problems? Could someone provide an
example which illustrates the write cache problem in simple terms?

http://oss.sgi.com/projects/xfs/faq.html#wcache

> Just my opinion as an XFS user, your mileage might vary.

Hopefully, it's not about opinions ...

- Felix

--
Dipl.-Phys. Felix E. Klee
Naunynstr. 2, 76530 Baden-Baden, Germany
Tel.: +49 7221 396961, Fax: +49 7221 396960, Mobile: +49 174 1386060
http://www.linkedin.com/in/feklee

From owner-xfs@oss.sgi.com Mon Feb 11 14:03:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 14:03:33 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_54 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1BM3Mnc016077 for ; Mon, 11 Feb 2008 14:03:26 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA04870; Tue, 12 Feb 2008 09:03:42 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1BM3fLF60228362; Tue, 12 Feb 2008 09:03:41 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1BM3dfE60221333; Tue, 12 Feb 2008 09:03:39 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to
dgc@sgi.com using -f Date: Tue, 12 Feb 2008 09:03:39 +1100 From: David Chinner To: "Felix E. Klee" Cc: xfs-oss Subject: Re: Data safety horror stories? Message-ID: <20080211220339.GZ155407@sgi.com> References: <1202748389.28320.1236240801@webmail.messagingengine.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1202748389.28320.1236240801@webmail.messagingengine.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5779/Mon Feb 11 11:56:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14405 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Mon, Feb 11, 2008 at 05:46:29PM +0100, Felix E. Klee wrote: > I heard that, in case of a power failure, XFS may lose data, even data > that was already existing on the disk. For example, I heard horror > stories of files being overwritten with zeros. > > Are those stories true? > > If so: > > * Do you recommend not using XFS on devices that may frequently fail due > to power failure? Depends on how much you care about your system and data. I use XFS on write-cache enabled SATA drives without barriers with no UPS (yes, it's unsafe!) and I lose power at least once a week. I haven't had a data loss or corruption in over two years and tens of power failures.... > * Is it possible to find out what files have been damaged? Not easily. > If not, > will only files be affected that have been changed during the last > couple of hours? Last few seconds before the power fail, actually. > * Are there options to increase data safety? Should one run a regular > "sync" in a cron job? If you are truly paranoid - turn off drive caching and mount with the 'wsync' option. > * Is it unsafe to use XFS in a virtual machine which may sometimes be > terminated without proper shutdown? I do that all the time, too. 
Corruption is rare and usually as a result of some bug in the code I'm testing ;) > I currently am using XFS under Ubuntu 7.10 (Kernel 2.6.22), running in a > virtual machine (VMware) under Windows. The XFS file system is in a > native partition on a second HDD. Should be just fine. If you are really concerned - test it. Cheers, Dave. > > -- > Felix E. Klee > Jabber/Google Talk: feklee@jabber.org, SIP: 9779619@sipgate.de > ICQ: 158124695, Yahoo!: feklee, AIM: felix.klee@inka.de > Gizmo: felixklee, Skype: felix.klee > -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Mon Feb 11 14:38:54 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 14:39:01 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1BMcn1j017719 for ; Mon, 11 Feb 2008 14:38:54 -0800 X-ASG-Debug-ID: 1202769552-132202050000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from out4.smtp.messagingengine.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id CC338D60E3B for ; Mon, 11 Feb 2008 14:39:12 -0800 (PST) Received: from out4.smtp.messagingengine.com (out4.smtp.messagingengine.com [66.111.4.28]) by cuda.sgi.com with ESMTP id DjU8Vkwbwieo5hLE for ; Mon, 11 Feb 2008 14:39:12 -0800 (PST) Received: from compute1.internal (compute1.internal [10.202.2.41]) by out1.messagingengine.com (Postfix) with ESMTP id CF8D390A7A for ; Mon, 11 Feb 2008 17:39:11 -0500 (EST) Received: from web6.messagingengine.com ([10.202.2.215]) by compute1.internal (MEProxy); Mon, 11 Feb 2008 17:39:11 -0500 Received: by web6.messagingengine.com (Postfix, from userid 99) id AAE055DA59; Mon, 11 Feb 2008 17:39:11 -0500 (EST) Message-Id: 
<1202769551.16458.1236311973@webmail.messagingengine.com> X-Sasl-Enc: k/fsyQz7LfDLN4SL0M3fQr0OkmNUaPni5htwjv6RQi1s 1202769551 From: "Felix E. Klee" To: xfs@oss.sgi.com Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="ISO-8859-1" MIME-Version: 1.0 X-Mailer: MessagingEngine.com Webmail Interface X-ASG-Orig-Subj: Restoring damaged incremental XFS dump? Subject: Restoring damaged incremental XFS dump? Date: Mon, 11 Feb 2008 23:39:11 +0100 X-Barracuda-Connect: out4.smtp.messagingengine.com[66.111.4.28] X-Barracuda-Start-Time: 1202769553 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42027 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5779/Mon Feb 11 11:56:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14406 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: felix.klee@inka.de Precedence: bulk X-list: xfs I wonder whether I should use xfsdump as a replacement for a more traditional incremental backup solution centered around the TAR archiver STAR. The advantage of STAR seems to be that files can also be recovered even if, for example, the level 0 dump is damaged. After all, one is dealing with TAR, a pretty transparent archive format. The advantage of xfsdump is that it creates true snapshots. So, what happens when the level 0 dump created with xfsdump becomes damaged. Will I still be able to recover some files? What about files from >0 dumps? -- Dipl.-Phys. Felix E. Klee Naunynstr. 
2, 76530 Baden-Baden, Germany Tel.: +49 7221 396961, Fax: +49 7221 396960, Mobile: +49 174 1386060 http://www.linkedin.com/in/feklee From owner-xfs@oss.sgi.com Mon Feb 11 16:03:42 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 16:03:45 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1C03chj021156 for ; Mon, 11 Feb 2008 16:03:40 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA08893; Tue, 12 Feb 2008 11:03:57 +1100 Message-ID: <47B0E26D.7070807@sgi.com> Date: Tue, 12 Feb 2008 11:03:57 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: "Felix E. Klee" CC: xfs@oss.sgi.com Subject: Re: Restoring damaged incremental XFS dump? References: <1202769551.16458.1236311973@webmail.messagingengine.com> In-Reply-To: <1202769551.16458.1236311973@webmail.messagingengine.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5779/Mon Feb 11 11:56:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14407 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Felix E. Klee wrote: > I wonder whether I should use xfsdump as a replacement for a more > traditional incremental backup solution centered around the TAR archiver > STAR. The advantage of STAR seems to be that files can also be > recovered even if, for example, the level 0 dump is damaged. 
After all,
> one is dealing with TAR, a pretty transparent archive format. The
> advantage of xfsdump is that it creates true snapshots.
>

I'm not sure what you mean by "true snapshots". I wouldn't really call
them snapshots in the sense of what you could get if you froze the
filesystem, etc. But I presume you mean how it tries to store all the
XFS-supported information, including extended attributes, extended inode
attributes, holes, etc.

> So, what happens when the level 0 dump created with xfsdump becomes
> damaged. Will I still be able to recover some files? What about files
> from >0 dumps?
>

Yes, you will still be able to restore stuff. However, it is in its own
format, so only xfsrestore will be able to do your restoring.

Dumps are separated into what it calls media files, which are meant to
be self-contained. So damage to one theoretically shouldn't prevent
restoring from another media file. Multiple media files are normally
only used for tape (only one is used for a dump to a file).

An incremental dump should also be able to be restored in isolation,
and there are some QA tests (in xfs-cmds/xfstests) that test this.

--Tim From owner-xfs@oss.sgi.com Mon Feb 11 17:01:43 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 11 Feb 2008 17:01:46 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1C11gXG028121 for ; Mon, 11 Feb 2008 17:01:43 -0800 X-ASG-Debug-ID: 1202778125-028000140000-w1Z2WR X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ishtar.tlinx.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 57974DE6C66 for ; Mon, 11 Feb 2008 17:02:05 -0800 (PST) Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by cuda.sgi.com with ESMTP id Yo1HfpBrqEKgJda7 for ; Mon, 11 Feb 2008 17:02:05 -0800 (PST) Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id m1C1257n016022; Mon, 11 Feb 2008 17:02:05 -0800 Message-ID: <47B0F00D.3060802@tlinx.org> Date: Mon, 11 Feb 2008 17:02:05 -0800 From: Linda Walsh User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: Linux-Xfs CC: Linux-Kernel X-ASG-Orig-Subj: xfs [_fsr] probs in 2.6.24.0 Subject: xfs [_fsr] probs in 2.6.24.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: ishtar.tlinx.org[64.81.245.74] X-Barracuda-Start-Time: 1202778126 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.52 X-Barracuda-Spam-Status: No, SCORE=-1.52 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=BSF_RULE7568M X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42035 Rule breakdown below pts rule name description ---- ---------------------- 
-------------------------------------------------- 0.50 BSF_RULE7568M Custom Rule 7568M X-Virus-Scanned: ClamAV 0.91.2/5779/Mon Feb 11 11:56:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14408 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@tlinx.org Precedence: bulk X-list: xfs

I'm getting similar errors on an x86-32 & x86-64 kernel. The x86-64 system
(2nd log below w/date+times) was unusable this morning: one or more of the
xfs file systems had "gone off line" due to some unknown error (upon reboot,
no errors were indicated; all partitions on the same physical disk).

I keep ending up with random failures on two systems. The 32-bit sys
more often than not just "locks up" -- no messages, no keyboard
response, etc. Nothing to do except reset -- and of course, nothing in
the logs....

I'm turning on all the stats and diagnostics that don't seem to have a
noted performance penalty (or not much of one) to see if that helps.
Perhaps one issue is the "bug" (looks like multiple instances of the same
bug) in xfs. The 32-bit machine does lots of disk activity in the early
morning hours...and coincidentally, it seemed to be crashing (not
consistently) in the morning.

With 2.6.24, the 32-bit machine "seems" a bit more stable -- up for over
3 days now (average before was <48 hours). But the 64-bit machine went
bonkers (not sure of exact time) -- it *had* been stable (non-crashing,
anyway) before 2.6.24.

I vaguely remember there being a similar xfs lock bug a few versions back
and was told just not to worry about it and to turn off the lock checking
to "avoid the problem"....( :^| ). So I did, but I'm still trying to track
down the randomness. So I turned back on what checks I could and ... ding --
still a problem in the xfs area, where "coincidentally" (this has happened
on more than one occasion) an xfs partition developed a run-time error and
shut itself down.

This also happened on the 32-bit machine with the SATA disk (thought it
might be SATA-specific), so I removed its controller and disk and threw in
a same-size PATA. No more file-system errors on 32bit. But this is the
first time I've had a filesystem 'forced offline' on the 64bit (but I just
switched to 2.6.24 recently for obvious reasons).

Sadly -- it also seemed to be the case that the 32bit machine, when it had
the SATA controller+disk installed, was more likely to crash if xfs_fsr
was running at the same time remote backups were being written to the
SATA drive. Couldn't repeat it reliably, though, so not sure what's going
on there.

Both machines are SMP, so that's a potential factor, along with the fact
that both machines also use SCSI (64bit, SAS form). Neither is running a
graphical console; both are being remotely accessed.

Lemme know if I can provide any info in addressing these problems. I don't
know if they are related to the other system crashes, but usage patterns
indicate it could be disk-activity related. So eliminating known bugs &
problems in that area might help (?)...

Suspect log messages follow (1st, w/o Date/time, were from dmesg on 32bit;
the latter was a previously unnoticed one on 64bit until I started looking
through logs for clues on last night's failures).

The first set of errors that I have some diagnostics for, I could only
find in dmesg:

(ish-32bit machine)
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.24-ish #1
-------------------------------------------------------
xfs_fsr/2119 is trying to acquire lock:
 (&mm->mmap_sem){----}, at: [] dio_get_page+0x62/0x160

but task is already holding lock:
 (&(&ip->i_iolock)->mr_lock){----}, at: [] xfs_ilock+0x5b/0xb0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is: -> #1 (&(&ip->i_iolock)->mr_lock){----}: [] __lock_acquire+0xce6/0x1150 [] xfs_ilock+0x97/0xb0 [] trace_hardirqs_on+0xaf/0x160 [] lock_acquire+0x6a/0x80 [] xfs_ilock+0x97/0xb0 [] down_write_nested+0x43/0x80 [] xfs_ilock+0x97/0xb0 [] xfs_ilock+0x97/0xb0 [] xfs_free_eofblocks+0x231/0x2e0 [] xfs_release+0x1a7/0x230 [] xfs_file_release+0xb/0x10 [] __fput+0xac/0x180 [] remove_vma+0x35/0x50 [] do_munmap+0x190/0x200 [] sys_munmap+0x2f/0x50 [] sysenter_past_esp+0x5f/0xa5 [] 0xffffffff -> #0 (&mm->mmap_sem){----}: [] print_circular_bug_entry+0x41/0x50 [] __lock_acquire+0xad9/0x1150 [] mark_held_locks+0x43/0x80 [] lock_acquire+0x6a/0x80 [] dio_get_page+0x62/0x160 [] down_read+0x38/0x80 [] dio_get_page+0x62/0x160 [] dio_get_page+0x62/0x160 [] __spin_lock_init+0x32/0x60 [] __blockdev_direct_IO+0x483/0xc70 [] kmem_cache_alloc+0x9b/0xc0 [] _spin_unlock+0x14/0x20 [] lockdep_init_map+0x3d/0x4c0 [] xfs_alloc_ioend+0x120/0x170 [] xfs_vm_direct_IO+0x15f/0x170 [] xfs_get_blocks_direct+0x0/0x30 [] xfs_end_io_direct+0x0/0x90 [] generic_file_direct_IO+0xd2/0x160 [] generic_file_direct_write+0x5a/0x170 [] up_write+0x14/0x30 [] xfs_write+0x3e8/0x920 [] file_read_actor+0x0/0x120 [] xfs_file_aio_write+0x5e/0x70 [] do_sync_write+0xc7/0x110 [] autoremove_wake_function+0x0/0x40 [] lock_release_holdtime+0x47/0x70 [] vfs_read+0xd0/0x160 [] do_sync_write+0x0/0x110 [] vfs_write+0xa0/0x160 [] sys_write+0x41/0x70 [] sysenter_past_esp+0x5f/0xa5 [] 0xffffffff other info that might help us debug this: 1 lock held by xfs_fsr/2119: #0: (&(&ip->i_iolock)->mr_lock){----}, at: [] xfs_ilock+0x5b/0xb0 stack backtrace: Pid: 2119, comm: xfs_fsr Not tainted 2.6.24-ish #1 [] print_circular_bug_tail+0x7a/0x90 [] __lock_acquire+0xad9/0x1150 [] mark_held_locks+0x43/0x80 [] lock_acquire+0x6a/0x80 [] dio_get_page+0x62/0x160 [] down_read+0x38/0x80 [] dio_get_page+0x62/0x160 [] dio_get_page+0x62/0x160 [] __spin_lock_init+0x32/0x60 [] __blockdev_direct_IO+0x483/0xc70 [] 
kmem_cache_alloc+0x9b/0xc0 [] _spin_unlock+0x14/0x20 [] lockdep_init_map+0x3d/0x4c0 [] xfs_alloc_ioend+0x120/0x170 [] xfs_vm_direct_IO+0x15f/0x170 [] xfs_get_blocks_direct+0x0/0x30 [] xfs_end_io_direct+0x0/0x90 [] generic_file_direct_IO+0xd2/0x160 [] generic_file_direct_write+0x5a/0x170 [] up_write+0x14/0x30 [] xfs_write+0x3e8/0x920 [] file_read_actor+0x0/0x120 [] xfs_file_aio_write+0x5e/0x70 [] do_sync_write+0xc7/0x110 [] autoremove_wake_function+0x0/0x40 [] lock_release_holdtime+0x47/0x70 [] vfs_read+0xd0/0x160 [] do_sync_write+0x0/0x110 [] vfs_write+0xa0/0x160 [] sys_write+0x41/0x70 [] sysenter_past_esp+0x5f/0xa5 ======================= -------------- The other system shows a similar message: in its logs: Feb 7 02:01:50 kern: ======================================================= Feb 7 02:01:50 kern: [ INFO: possible circular locking dependency detected ] Feb 7 02:01:50 kern: 2.6.24-asa64def #3 Feb 7 02:01:50 kern: ------------------------------------------------------- Feb 7 02:01:50 kern: xfs_fsr/6313 is trying to acquire lock: Feb 7 02:01:50 kern: (&(&ip->i_lock)->mr_lock/2){----}, at: [] xfs_ilock+0x82/0xc0 Feb 7 02:01:50 kern: Feb 7 02:01:50 kern: but task is already holding lock: Feb 7 02:01:50 kern: (&(&ip->i_iolock)->mr_lock/3){--..}, at: [] xfs_ilock+0xa5/0xc0 Feb 7 02:01:50 kern: Feb 7 02:01:50 kern: which lock already depends on the new lock. 
Feb 7 02:01:50 kern: Feb 7 02:01:50 kern: Feb 7 02:01:50 kern: the existing dependency chain (in reverse order) is: Feb 7 02:01:50 kern: Feb 7 02:01:50 kern: -> #1 (&(&ip->i_iolock)->mr_lock/3){--..}: Feb 7 02:01:50 kern: [] __lock_acquire+0xc39/0x1090 Feb 7 02:01:50 kern: [] lock_acquire+0x61/0x80 Feb 7 02:01:50 kern: [] xfs_ilock+0xa5/0xc0 Feb 7 02:01:50 kern: [] down_write_nested+0x3a/0x80 Feb 7 02:01:50 kern: [] xfs_ilock+0xa5/0xc0 Feb 7 02:01:50 kern: [] xfs_lock_inodes+0x143/0x190 Feb 7 02:01:50 kern: [] xfs_swap_extents+0xa1/0x630 Feb 7 02:01:50 kern: [] lock_release_holdtime+0x45/0x70 Feb 7 02:01:50 kern: [] xfs_swapext+0x131/0x160 Feb 7 02:01:50 kern: [] xfs_ioctl+0x5ce/0x740 Feb 7 02:01:50 kern: [] lock_release_holdtime+0x45/0x70 Feb 7 02:01:50 kern: [] __mutex_unlock_slowpath+0xca/0x1a0 Feb 7 02:01:50 kern: [] xfs_file_ioctl_invis+0x36/0x80 Feb 7 02:01:50 kern: [] do_ioctl+0x2f/0xa0 Feb 7 02:01:50 kern: [] vfs_ioctl+0x230/0x2d0 Feb 7 02:01:50 kern: [] trace_hardirqs_on+0xc1/0x160 Feb 7 02:01:50 kern: [] sys_ioctl+0x49/0x80 Feb 7 02:01:51 kern: [] system_call+0x7e/0x83 Feb 7 02:01:51 kern: [] 0xffffffffffffffff Feb 7 02:01:51 kern: Feb 7 02:01:51 kern: -> #0 (&(&ip->i_lock)->mr_lock/2){----}: Feb 7 02:01:51 kern: [] print_circular_bug_header+0xe8/0xf0 Feb 7 02:01:51 kern: [] __lock_acquire+0xaa8/0x1090 Feb 7 02:01:51 kern: [] lock_acquire+0x61/0x80 Feb 7 02:01:51 kern: [] xfs_ilock+0x82/0xc0 Feb 7 02:01:51 kern: [] _spin_unlock+0x17/0x20 Feb 7 02:01:51 kern: [] down_write_nested+0x3a/0x80 Feb 7 02:01:51 kern: [] xfs_ilock+0x82/0xc0 Feb 7 02:01:51 kern: [] xfs_lock_inodes+0x143/0x190 Feb 7 02:01:51 kern: [] xfs_swap_extents+0x31e/0x630 Feb 7 02:01:51 kern: [] xfs_swapext+0x131/0x160 Feb 7 02:01:51 kern: [] xfs_ioctl+0x5ce/0x740 Feb 7 02:01:51 kern: [] lock_release_holdtime+0x45/0x70 Feb 7 02:01:51 kern: [] __mutex_unlock_slowpath+0xca/0x1a0 Feb 7 02:01:51 kern: [] xfs_file_ioctl_invis+0x36/0x80 Feb 7 02:01:51 kern: [] do_ioctl+0x2f/0xa0 Feb 7 02:01:51 
kern: [] vfs_ioctl+0x230/0x2d0 Feb 7 02:01:51 kern: [] trace_hardirqs_on+0xc1/0x160 Feb 7 02:01:51 kern: [] sys_ioctl+0x49/0x80 Feb 7 02:01:51 kern: [] system_call+0x7e/0x83 Feb 7 02:01:51 kern: [] 0xffffffffffffffff Feb 7 02:01:51 kern: Feb 7 02:01:51 kern: other info that might help us debug this: Feb 7 02:01:51 kern: Feb 7 02:01:51 kern: 2 locks held by xfs_fsr/6313: Feb 7 02:01:51 kern: #0: (&(&ip->i_iolock)->mr_lock/2){--..}, at: [] xfs_ilock+0xa5/0xc0 Feb 7 02:01:51 kern: #1: (&(&ip->i_iolock)->mr_lock/3){--..}, at: [] xfs_ilock+0xa5/0xc0 Feb 7 02:01:51 kern: Feb 7 02:01:51 kern: stack backtrace: Feb 7 02:01:51 kern: Pid: 6313, comm: xfs_fsr Not tainted 2.6.24-asa64def #3 Feb 7 02:01:51 kern: Feb 7 02:01:51 kern: Call Trace: Feb 7 02:01:51 kern: [] print_circular_bug_tail+0x83/0x90 Feb 7 02:01:51 kern: [] print_circular_bug_header+0xe8/0xf0 Feb 7 02:01:51 kern: [] __lock_acquire+0xaa8/0x1090 Feb 7 02:01:51 kern: [] lock_acquire+0x61/0x80 Feb 7 02:01:51 kern: [] xfs_ilock+0x82/0xc0 Feb 7 02:01:51 kern: [] _spin_unlock+0x17/0x20 Feb 7 02:01:51 kern: [] down_write_nested+0x3a/0x80 Feb 7 02:01:51 kern: [] xfs_ilock+0x82/0xc0 Feb 7 02:01:51 kern: [] xfs_lock_inodes+0x143/0x190 Feb 7 02:01:51 kern: [] xfs_swap_extents+0x31e/0x630 Feb 7 02:01:51 kern: [] xfs_swapext+0x131/0x160 Feb 7 02:01:51 kern: [] xfs_ioctl+0x5ce/0x740 Feb 7 02:01:51 kern: [] lock_release_holdtime+0x45/0x70 Feb 7 02:01:51 kern: [] __mutex_unlock_slowpath+0xca/0x1a0 [] xfs_file_ioctl_invis+0x36/0x80 Feb 7 02:01:51 kern: [] do_ioctl+0x2f/0xa0 Feb 7 02:01:51 kern: [] vfs_ioctl+0x230/0x2d0 Feb 7 02:01:51 kern: [] trace_hardirqs_on+0xc1/0x160 Feb 7 02:01:51 kern: [] sys_ioctl+0x49/0x80 Feb 7 02:01:51 kern: [] system_call+0x7e/0x83 From owner-xfs@oss.sgi.com Tue Feb 12 00:57:57 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 00:58:04 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 
tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1C8voBr026490 for ; Tue, 12 Feb 2008 00:57:55 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id TAA22841; Tue, 12 Feb 2008 19:58:07 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1C8w5LF60502323; Tue, 12 Feb 2008 19:58:06 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1C8w2Wf60514789; Tue, 12 Feb 2008 19:58:02 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 12 Feb 2008 19:58:02 +1100 From: David Chinner To: Linda Walsh Cc: xfs@oss.sgi.com, Linux-Kernel Subject: Re: xfs [_fsr] probs in 2.6.24.0 Message-ID: <20080212085802.GA155407@sgi.com> References: <47B0F00D.3060802@tlinx.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47B0F00D.3060802@tlinx.org> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5779/Mon Feb 11 11:56:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14409 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Mon, Feb 11, 2008 at 05:02:05PM -0800, Linda Walsh wrote: > > I'm getting similar errors on an x86-32 & x86-64 kernel. The x86-64 system > (2nd log below w/date+times) was unusable this morning: one or more of the > xfs file systems had "gone off line" due to some unknown error (upon reboot, > no errors were indicated; all partitions on the same physical disk). > > I keep ending up with random failures on two systems. 
The 32-bit sys more
> often than not just "locks-up" -- no messages, no keyboard response...etc.
> Nothing to do except reset -- and of course, nothing in the logs....

Filesystem bugs rarely hang systems hard like that; a hardware or
driver problem is more likely. And neither of the lockdep reports
below is likely to be responsible for a system-wide, no-response hang.

[cut a bunch of speculation and stuff about hardware problems
causing XFS problems]

> Lemme know if I can provide any info in addressing these problems. I don't
> know if they are related to the other system crashes, but usage patterns
> indicate it could be disk-activity related. So eliminating known bugs &
> problems in that area might help (?)...

If your hardware or drivers are unstable, then XFS cannot be expected
to work reliably. Given that xfs_fsr apparently triggers the hangs,
I'd suggest putting lots of I/O load on your disk subsystem by copying
files around with direct I/O (just like xfs_fsr does) to try to
reproduce the problem. Perhaps by running xfs_fsr manually you could
reproduce the problem while you are sitting in front of the machine...

Looking at the lockdep reports:

> The first set of errors the I have some diagnostics for, I could
> only find in dmesg:
> (ish-32bit machine)
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.24-ish #1
> -------------------------------------------------------
> xfs_fsr/2119 is trying to acquire lock:
> (&mm->mmap_sem){----}, at: [] dio_get_page+0x62/0x160
>
> but task is already holding lock:
> (&(&ip->i_iolock)->mr_lock){----}, at: [] xfs_ilock+0x5b/0xb0

dio_get_page() takes the mmap_sem of the process's vma that has the
pages we do I/O into. That's not new. We're holding the xfs inode
iolock at this point to protect against truncate and simultaneous
buffered I/O races, and this is also unchanged. i.e. this is normal.

> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:

munmap() dropping the last reference to its vm_file and calling
->release(), which causes a truncate of speculatively allocated space
to take place. IOWs, ->release() is called with the mmap_sem held.

Hmmm.... Looking at it in terms of i_mutex, other filesystems hold
i_mutex over dio_get_page() (all those that use DIO_LOCKING), so the
question is whether we are allowed to take the i_mutex in ->release.
I note that both reiserfs and hfsplus take i_mutex in ->release as
well as use DIO_LOCKING, so this problem is not isolated to XFS.

However, it would appear that mmap_sem -> i_mutex is illegal according
to the comment at the head of mm/filemap.c. While we are not using
i_mutex in this case, the inversion would seem to be equivalent in
nature.

There's not going to be a quick fix for this.

And the other one:

> Feb 7 02:01:50 kern:
> -------------------------------------------------------
> Feb 7 02:01:50 kern: xfs_fsr/6313 is trying to acquire lock:
> Feb 7 02:01:50 kern: (&(&ip->i_lock)->mr_lock/2){----}, at:
> [] xfs_ilock+0x82/0xc0
> Feb 7 02:01:50 kern:
> Feb 7 02:01:50 kern: but task is already holding lock:
> Feb 7 02:01:50 kern: (&(&ip->i_iolock)->mr_lock/3){--..}, at:
> [] xfs_ilock+0xa5/0xc0
> Feb 7 02:01:50 kern:
> Feb 7 02:01:50 kern: which lock already depends on the new lock.

Looks like yet another false positive. Basically we do this in
xfs_swap_extents:

	inode A: i_iolock class 2
	inode A: i_ilock class 2
	inode B: i_iolock class 3
	inode B: i_ilock class 3
	.....
	inode A: unlock ilock
	inode B: unlock ilock
	.....
>>>>>	inode A: ilock class 2
	inode B: ilock class 3

And lockdep appears to be complaining about the relocking of inode A
as class 2 because we've got a class 3 iolock still held, hence
violating the order it saw initially. There's no possible deadlock
here, so we'll just have to add more hacks to the annotation code to
make lockdep happy.

Cheers,

Dave.
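
The locking sequence described above for xfs_swap_extents can be replayed through a toy order checker to see why the relock of inode A gets flagged. This is only a rough sketch, not how lockdep is implemented: it assumes a drastically simplified rule (remember every "held -> taken" pair of lock classes, and complain when the reverse pair shows up later), whereas real lockdep walks a full dependency graph. The function name check_lock_order and the event format are made up for illustration.

```sh
#!/bin/sh
# Toy lock-order checker (illustrative assumption, NOT real lockdep).
# stdin: one event per line, "lock <class>" or "unlock <class>".
check_lock_order() {
    awk '
    $1 == "lock" {
        for (h in held)
            if (held[h]) {
                # Was the opposite order ($2 held, then h taken) seen before?
                if (($2, h) in dep)
                    print "inversion: " $2 " taken while holding " h
                dep[h, $2] = 1      # remember: h was held when $2 was taken
            }
        held[$2] = 1
    }
    $1 == "unlock" { held[$2] = 0 }
    '
}

# The sequence from xfs_swap_extents as described in this mail.
check_lock_order <<'EOF'
lock iolock/2
lock ilock/2
lock iolock/3
lock ilock/3
unlock ilock/2
unlock ilock/3
lock ilock/2
lock ilock/3
EOF
# prints: inversion: ilock/2 taken while holding iolock/3
```

The first pass records ilock/2 -> iolock/3; the relock then takes ilock/2 with iolock/3 still held, which is the reverse pair. Nothing else ever takes these locks in a conflicting order on the same two inodes, which is why the report is treated as a false positive here.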
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Feb 12 03:16:57 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 03:17:34 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1CBGsme003056 for ; Tue, 12 Feb 2008 03:16:57 -0800 X-ASG-Debug-ID: 1202815036-1a8500a20000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from out4.smtp.messagingengine.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A30FD5C71A9 for ; Tue, 12 Feb 2008 03:17:16 -0800 (PST) Received: from out4.smtp.messagingengine.com (out4.smtp.messagingengine.com [66.111.4.28]) by cuda.sgi.com with ESMTP id 3x3TQ2jeIhPkrKAo for ; Tue, 12 Feb 2008 03:17:16 -0800 (PST) Received: from compute1.internal (compute1.internal [10.202.2.41]) by out1.messagingengine.com (Postfix) with ESMTP id 2FED7907AB; Tue, 12 Feb 2008 06:17:16 -0500 (EST) Received: from web6.messagingengine.com ([10.202.2.215]) by compute1.internal (MEProxy); Tue, 12 Feb 2008 06:17:16 -0500 Received: by web6.messagingengine.com (Postfix, from userid 99) id 0216E4710F; Tue, 12 Feb 2008 06:17:15 -0500 (EST) Message-Id: <1202815035.5821.1236404891@webmail.messagingengine.com> X-Sasl-Enc: mNleXhcmCJ8m5sjRyf15ycqr6NfBc0K2Lw+Z5iNql234 1202815035 From: "Felix E. Klee" To: "Timothy Shimmin" Cc: xfs@oss.sgi.com Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="ISO-8859-1" MIME-Version: 1.0 X-Mailer: MessagingEngine.com Webmail Interface References: <1202769551.16458.1236311973@webmail.messagingengine.com> <47B0E26D.7070807@sgi.com> X-ASG-Orig-Subj: Re: Restoring damaged incremental XFS dump? 
Subject: Re: Restoring damaged incremental XFS dump? In-Reply-To: <47B0E26D.7070807@sgi.com> Date: Tue, 12 Feb 2008 12:17:15 +0100 X-Barracuda-Connect: out4.smtp.messagingengine.com[66.111.4.28] X-Barracuda-Start-Time: 1202815038 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42047 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5782/Tue Feb 12 00:29:55 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14410 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: felix.klee@inka.de Precedence: bulk X-list: xfs On Tue, 12 Feb 2008 11:03:57 +1100, "Timothy Shimmin" said: > I'm not sure what you mean by "true snapshots". I wouldn't really > call it snapshots as in what you could get if you froze the > filesystem etc.. Oh, it does not freeze the filesystem - what a pity. I recall someone telling me that it does. Seems like that was bad information or my memory is failing on me. > > So, what happens when the level 0 dump created with xfsdump becomes > > damaged. Will I still be able to recover some files? What about > > files from >0 dumps? > > Yes you will still be able to restore stuff. However, it is in its own > format so only xfsrestore will be able to do your restoring. Dumps are > separated into what it calls media files which are meant to be self > containing. So damage to 1 theoretically shouldn't prevent restoring > from another media file. Multiple media files are normally only used > for tape (only 1 used for a dump to a file). 
And an incremental dump > should also be able to be restored in isolation and there are some QA > tests (in xfs-cmds/xfstests) that test this. Thanks! Felix -- Dipl.-Phys. Felix E. Klee Naunynstr. 2, 76530 Baden-Baden, Germany Tel.: +49 7221 396961, Fax: +49 7221 396960, Mobile: +49 174 1386060 http://www.linkedin.com/in/feklee From owner-xfs@oss.sgi.com Tue Feb 12 03:37:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 03:37:26 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_00,J_CHICKENPOX_42, J_CHICKENPOX_43,J_CHICKENPOX_48,J_CHICKENPOX_52,J_CHICKENPOX_54, MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1CBbHCf008033 for ; Tue, 12 Feb 2008 03:37:18 -0800 X-ASG-Debug-ID: 1202816258-16b100eb0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from hu-out-0506.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5FC095C72F7 for ; Tue, 12 Feb 2008 03:37:38 -0800 (PST) Received: from hu-out-0506.google.com (hu-out-0506.google.com [72.14.214.236]) by cuda.sgi.com with ESMTP id BC8bPF7vOc3B41Ld for ; Tue, 12 Feb 2008 03:37:38 -0800 (PST) Received: by hu-out-0506.google.com with SMTP id 16so7418337hue.17 for ; Tue, 12 Feb 2008 03:37:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; bh=zFycwquvyQlRIXedJ/A5Oq8CYklKHuBaULB9F49fom4=; b=rzFdkYVhXFABWFzfU4Xf9pencZzq+0GLiv/K6EVfhE6w4/uMSJGnxUilKpHLBNZtWu9usLp5v8Fku0K7AQi3IPGy5zmF7ogpr+7cnzazKhSMUnUQ1f0eZUhyBzbnVgn9dqL8i4fLAeIOJ5UTlqb6xwHCGuRC7lrimjiccn8DmNc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=Xt+4wVS66nL82kDJew2TIY7rDw+w1hFJv5BplXXB06ZUn5A9wiXNs+WFuV3N5Aufjbp2m368NBATy2WrooO62KN3UTqbI0NSXMs2+HHMLCPNXEv3QrtdWevlaeNjOEl3r1iqvAc5tIowQ/yCiVpTnykkeQiaDtqvA3ubruVLvpw= Received: by 10.150.148.7 with SMTP id v7mr436604ybd.95.1202816256439; Tue, 12 Feb 2008 03:37:36 -0800 (PST) Received: by 10.150.191.13 with HTTP; Tue, 12 Feb 2008 03:37:36 -0800 (PST) Message-ID: <1a4a774c0802120337x55fa2eb6qb7d52511fba3d11c@mail.gmail.com> Date: Tue, 12 Feb 2008 12:37:36 +0100 From: "=?ISO-8859-1?Q?Christian_R=F8snes?=" To: xfs@oss.sgi.com X-ASG-Orig-Subj: inode size benchmarking Subject: inode size benchmarking MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Barracuda-Connect: hu-out-0506.google.com[72.14.214.236] X-Barracuda-Start-Time: 1202816260 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42047 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5785/Tue Feb 12 02:41:10 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14411 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.rosnes@gmail.com Precedence: bulk X-list: xfs I'm trying to figure out how different inode sizes on my system affect the time it takes to: 1) Create directories each with files (using different file sizes) 2) Read all the files from dir1, all the files from dir2, ... 3) Read file1 from dir1, file1 from dir2, ... file2 from dir1, file2 from dir2, ... 
The file sizes in this test range from 1MB to 50MB.

In the test case enclosed I've used inode sizes of 256 and 2048 bytes.
(The script I've used is also attached at the end of this email.)

The test server used:
* Debian 4 (Etch)
* Kernel: Debian 2.6.18-6-amd64 #1 SMP Wed Jan 23 06:27:23 UTC 2008 x86_64 GNU/Linux
* CPU: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
* MEM: 4GB RAM
* DISK: DELL MD1000 7 disks (1TB SATA) in RAID5. PERC6/E controller
* The test partition is 6TB.
* I'm using xfsprogs 2.9.4

The timing results indicate that with an inode size of 2048 bytes, the
tests on my system complete quicker than with the default inode size
of 256 bytes.

I'm curious what might cause this, and whether it can be explained by
merely looking at the inode data? And if yes, what should I be looking
for?

Looking at the extent mapping for each inode from test 1) above, I
find that each file contains only 1 extent (well, in some cases a few
files turn up with 2 extents - see full log below for details), and
thus the extent list should fit inside the inode regardless of the
inode size (256 vs 2048).

So could there be other factors weighing in?
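
To put a number on "quicker", the "real" values printed by time can be converted to seconds and divided. A minimal sketch, assuming POSIX sh and awk; the helper names to_seconds and speedup are made up for illustration, and the sample values are the CREATE_TIME figures for file1mb reported in the results in this mail.

```sh
#!/bin/sh
# Convert a time(1)-style value such as "1m4.766s" into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the ratio first/second, i.e. how many times faster the second run was.
speedup() {
    slow=$(to_seconds "$1")
    fast=$(to_seconds "$2")
    awk -v s="$slow" -v f="$fast" 'BEGIN { printf "%.2f\n", s / f }'
}

# CREATE_TIME for file1mb: isize=256 vs isize=2048
speedup 0m10.781s 0m5.601s    # prints 1.92
```

Running the same helper over the READ1_TIME and READ2_TIME pairs gives the per-test ratios without reading minutes and seconds by eye.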
Thanks Christian # ---------------------- Results - short version ---------------------- # grep -A 2 'CREATE_TIME' run5.log CREATE_TIME:isize=256:fname=file1mb real 0m10.781s -- CREATE_TIME:isize=2048:fname=file1mb real 0m5.601s -- CREATE_TIME:isize=256:fname=file5mb real 1m9.771s -- CREATE_TIME:isize=2048:fname=file5mb real 0m46.441s -- CREATE_TIME:isize=256:fname=file10mb real 2m12.143s -- CREATE_TIME:isize=2048:fname=file10mb real 1m20.204s -- CREATE_TIME:isize=256:fname=file20mb real 3m59.955s -- CREATE_TIME:isize=2048:fname=file20mb real 2m18.546s -- CREATE_TIME:isize=256:fname=file30mb real 5m40.613s -- CREATE_TIME:isize=2048:fname=file30mb real 3m8.358s -- CREATE_TIME:isize=256:fname=file40mb real 7m16.146s -- CREATE_TIME:isize=2048:fname=file40mb real 4m7.343s -- CREATE_TIME:isize=256:fname=file50mb real 9m3.861s -- CREATE_TIME:isize=2048:fname=file50mb real 4m59.963s # grep -A 2 'READ1_TIME' run5.log READ1_TIME:isize=256:fname=file1mb real 1m4.766s -- READ1_TIME:isize=2048:fname=file1mb real 0m23.320s -- READ1_TIME:isize=256:fname=file5mb real 1m23.792s -- READ1_TIME:isize=2048:fname=file5mb real 0m57.450s -- READ1_TIME:isize=256:fname=file10mb real 1m42.586s -- READ1_TIME:isize=2048:fname=file10mb real 1m14.005s -- READ1_TIME:isize=256:fname=file20mb real 2m13.475s -- READ1_TIME:isize=2048:fname=file20mb real 1m53.727s -- READ1_TIME:isize=256:fname=file30mb real 2m53.505s -- READ1_TIME:isize=2048:fname=file30mb real 2m26.111s -- READ1_TIME:isize=256:fname=file40mb real 3m27.172s -- READ1_TIME:isize=2048:fname=file40mb real 3m2.579s -- READ1_TIME:isize=256:fname=file50mb real 4m2.087s -- READ1_TIME:isize=2048:fname=file50mb real 3m35.429s # grep -A 2 'READ2_TIME' run5.log READ2_TIME:isize=256:fname=file1mb real 0m31.013s -- READ2_TIME:isize=2048:fname=file1mb real 0m7.328s -- READ2_TIME:isize=256:fname=file5mb real 0m59.086s -- READ2_TIME:isize=2048:fname=file5mb real 0m21.544s -- READ2_TIME:isize=256:fname=file10mb real 1m18.158s -- 
READ2_TIME:isize=2048:fname=file10mb real 0m39.243s -- READ2_TIME:isize=256:fname=file20mb real 1m51.332s -- READ2_TIME:isize=2048:fname=file20mb real 1m14.573s -- READ2_TIME:isize=256:fname=file30mb real 2m25.312s -- READ2_TIME:isize=2048:fname=file30mb real 1m50.197s -- READ2_TIME:isize=256:fname=file40mb real 3m0.558s -- READ2_TIME:isize=2048:fname=file40mb real 2m24.960s -- READ2_TIME:isize=256:fname=file50mb real 3m36.145s -- READ2_TIME:isize=2048:fname=file50mb real 3m1.140s # ---------------------- Test script ---------------------- #!/bin/sh PARTITION=/content/backup01 DEV=/dev/sdb1 SRCDIR=/root/test TEST01=$SRCDIR/test01.sh TEST02=$SRCDIR/test02.sh FHEAD=file FTAIL=mb MPARAM="nobarrier,noatime,logbufs=8,logbsize=256k" MOUNT="mount -t xfs -o $MPARAM $DEV $PARTITION" UMOUNT="umount $PARTITION" DIRCNT=10 FILECNT=100 FILESIZES_MB="1 5 10 20 30 40 50" INODESIZES="256 2048" if [ ! -d "$SRCDIR" ] ; then echo "ERR: SRCDIR $SRCDIR does not exist" exit 1 fi #create srcfiles used during cp for MBSIZE in $FILESIZES_MB ; do fname=$FHEAD$MBSIZE$FTAIL dd if=/dev/zero of=$SRCDIR/$fname bs=1024k count=$MBSIZE >/dev/null 2>&1 done sync # start test for MBSIZE in $FILESIZES_MB ; do for isize in $INODESIZES ; do fname=$FHEAD$MBSIZE$FTAIL echo "Run: isize=$isize, fname=$fname (`date`)" # RAID5 - 7 disks (6 data + 1 parity) sw=6 mkfs.xfs -f -i size=$isize,attr=2 -l \ version=2,su=64k,size=128m -d su=64k,sw=6 $DEV || exit 1 $MOUNT || exit 1 echo "Mount options used:" mount | grep $DEV # cache srcfile used during cp cp $SRCDIR/$fname /dev/null echo "Creating $DIRCNT directories and cp $FILECNT $fname files into each" echo "CREATE_TIME:isize=$isize:fname=$fname" time { for d in `seq -w 1 $DIRCNT` do D="$PARTITION/test/$d" mkdir -p "$D" || exit 1 for f in `seq -w 1 $FILECNT` do cp "$SRCDIR/$fname" "$D/file.$f" ; done done sync } echo echo "Disk usage:" df -k | grep "$DEV" echo echo "Extents:" for ino in `find $PARTITION/test -type f -printf "%i "` ; do xfs_db -r -c "inode $ino " -c 
"print core.nextents" $DEV done | sort | uniq -c $UMOUNT || exit 1 $MOUNT || exit 1 echo echo "READ1 (cp) each of the $FILECNT $fname files" echo " in each of the $DIRCNT directories" echo "NOTE: we read (cp) the first file from each directory, " echo " then the second file from each directory, and so on" echo "READ1_TIME:isize=$isize:fname=$fname" time { for f in `seq -w 1 $FILECNT` do for d in `seq -w 1 $DIRCNT` do D="$PARTITION/test/$d"; cp "$D/file.$f" /dev/null ; done done sync } $UMOUNT || exit 1 $MOUNT || exit 1 echo echo "READ2 (cp) each of the $FILECNT $fname files" echo " in each of the $DIRCNT directories" echo "NOTE: we read all files in the first directory," echo " then the all the files in the second, and so on" echo "READ2_TIME:isize=$isize:fname=$fname" time { for d in `seq -w 1 $DIRCNT` do for f in `seq -w 1 $FILECNT` do D="$PARTITION/test/$d"; cp "$D/file.$f" /dev/null ; done done sync } $UMOUNT || exit 1 echo "-------------------------------------------------------------" done done echo "Done `date`" # ---------------------- Results - full version ---------------------- ./benchmark_fs.sh > run5.log 2>&1 # cat run5.log Run: isize=256, fname=file1mb (Tue Feb 12 10:31:54 CET 2008) meta-data=/dev/sdb1 isize=256 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file1mb files into each CREATE_TIME:isize=256:fname=file1mb real 0m10.781s user 0m0.388s sys 0m2.496s Disk usage: /dev/sdb1 5857212416 1025368 5856187048 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file1mb files in each of the 10 directories NOTE: we read (cp) the 
first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=256:fname=file1mb real 1m4.766s user 0m0.372s sys 0m1.468s READ2 (cp) each of the 100 file1mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=256:fname=file1mb real 0m31.013s user 0m0.376s sys 0m1.520s ------------------------------------------------------------- Run: isize=2048, fname=file1mb (Tue Feb 12 10:33:43 CET 2008) meta-data=/dev/sdb1 isize=2048 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file1mb files into each CREATE_TIME:isize=2048:fname=file1mb real 0m5.601s user 0m0.348s sys 0m2.456s Disk usage: /dev/sdb1 5857212416 1027728 5856184688 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file1mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=2048:fname=file1mb real 0m23.320s user 0m0.344s sys 0m1.564s READ2 (cp) each of the 100 file1mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=2048:fname=file1mb real 0m7.328s user 0m0.296s sys 0m1.448s ------------------------------------------------------------- Run: isize=256, fname=file5mb (Tue Feb 12 10:34:22 CET 2008) meta-data=/dev/sdb1 isize=256 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, 
unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file5mb files into each CREATE_TIME:isize=256:fname=file5mb real 1m9.771s user 0m0.672s sys 0m10.789s Disk usage: /dev/sdb1 5857212416 5121368 5852091048 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file5mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=256:fname=file5mb real 1m23.792s user 0m0.552s sys 0m4.140s READ2 (cp) each of the 100 file5mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=256:fname=file5mb real 0m59.086s user 0m0.468s sys 0m3.936s ------------------------------------------------------------- Run: isize=2048, fname=file5mb (Tue Feb 12 10:37:59 CET 2008) meta-data=/dev/sdb1 isize=2048 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file5mb files into each CREATE_TIME:isize=2048:fname=file5mb real 0m46.441s user 0m0.688s sys 0m9.833s Disk usage: /dev/sdb1 5857212416 5123728 5852088688 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file5mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from 
each directory, and so on READ1_TIME:isize=2048:fname=file5mb real 0m57.450s user 0m0.552s sys 0m3.996s READ2 (cp) each of the 100 file5mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=2048:fname=file5mb real 0m21.544s user 0m0.552s sys 0m3.784s ------------------------------------------------------------- Run: isize=256, fname=file10mb (Tue Feb 12 10:40:12 CET 2008) meta-data=/dev/sdb1 isize=256 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file10mb files into each CREATE_TIME:isize=256:fname=file10mb real 2m12.143s user 0m0.872s sys 0m20.629s Disk usage: /dev/sdb1 5857212416 10241368 5846971048 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file10mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=256:fname=file10mb real 1m42.586s user 0m0.764s sys 0m7.272s READ2 (cp) each of the 100 file10mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=256:fname=file10mb real 1m18.158s user 0m0.752s sys 0m7.036s ------------------------------------------------------------- Run: isize=2048, fname=file10mb (Tue Feb 12 10:45:29 CET 2008) meta-data=/dev/sdb1 isize=2048 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log 
=internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file10mb files into each CREATE_TIME:isize=2048:fname=file10mb real 1m20.204s user 0m0.960s sys 0m19.189s Disk usage: /dev/sdb1 5857212416 10243728 5846968688 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file10mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=2048:fname=file10mb real 1m14.005s user 0m0.768s sys 0m7.148s READ2 (cp) each of the 100 file10mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=2048:fname=file10mb real 0m39.243s user 0m0.724s sys 0m7.136s ------------------------------------------------------------- Run: isize=256, fname=file20mb (Tue Feb 12 10:48:47 CET 2008) meta-data=/dev/sdb1 isize=256 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file20mb files into each CREATE_TIME:isize=256:fname=file20mb real 3m59.955s user 0m1.376s sys 0m39.874s Disk usage: /dev/sdb1 5857212416 20481368 5836731048 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file20mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on 
READ1_TIME:isize=256:fname=file20mb real 2m13.475s user 0m1.036s sys 0m13.849s READ2 (cp) each of the 100 file20mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=256:fname=file20mb real 1m51.332s user 0m0.972s sys 0m13.313s ------------------------------------------------------------- Run: isize=2048, fname=file20mb (Tue Feb 12 10:56:56 CET 2008) meta-data=/dev/sdb1 isize=2048 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file20mb files into each CREATE_TIME:isize=2048:fname=file20mb real 2m18.546s user 0m1.256s sys 0m39.622s Disk usage: /dev/sdb1 5857212416 20483728 5836728688 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file20mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=2048:fname=file20mb real 1m53.727s user 0m1.156s sys 0m13.789s READ2 (cp) each of the 100 file20mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=2048:fname=file20mb real 1m14.573s user 0m1.016s sys 0m13.229s ------------------------------------------------------------- Run: isize=256, fname=file30mb (Tue Feb 12 11:02:28 CET 2008) meta-data=/dev/sdb1 isize=256 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log 
bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file30mb files into each CREATE_TIME:isize=256:fname=file30mb real 5m40.613s user 0m1.840s sys 0m58.684s Disk usage: /dev/sdb1 5857212416 30721368 5826491048 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file30mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=256:fname=file30mb real 2m53.505s user 0m1.340s sys 0m20.013s READ2 (cp) each of the 100 file30mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=256:fname=file30mb real 2m25.312s user 0m1.408s sys 0m20.073s ------------------------------------------------------------- Run: isize=2048, fname=file30mb (Tue Feb 12 11:13:36 CET 2008) meta-data=/dev/sdb1 isize=2048 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file30mb files into each CREATE_TIME:isize=2048:fname=file30mb real 3m8.358s user 0m1.684s sys 0m58.192s Disk usage: /dev/sdb1 5857212416 30723728 5826488688 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file30mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on 
READ1_TIME:isize=2048:fname=file30mb real 2m26.111s user 0m1.344s sys 0m19.993s READ2 (cp) each of the 100 file30mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=2048:fname=file30mb real 1m50.197s user 0m1.340s sys 0m19.809s ------------------------------------------------------------- Run: isize=256, fname=file40mb (Tue Feb 12 11:21:07 CET 2008) meta-data=/dev/sdb1 isize=256 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file40mb files into each CREATE_TIME:isize=256:fname=file40mb real 7m16.146s user 0m2.020s sys 1m18.325s Disk usage: /dev/sdb1 5857212416 40961368 5816251048 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file40mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=256:fname=file40mb real 3m27.172s user 0m1.744s sys 0m25.962s READ2 (cp) each of the 100 file40mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=256:fname=file40mb real 3m0.558s user 0m1.452s sys 0m25.946s ------------------------------------------------------------- Run: isize=2048, fname=file40mb (Tue Feb 12 11:35:00 CET 2008) meta-data=/dev/sdb1 isize=2048 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log 
bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file40mb files into each CREATE_TIME:isize=2048:fname=file40mb real 4m7.343s user 0m2.144s sys 1m17.077s Disk usage: /dev/sdb1 5857212416 40963728 5816248688 1% /content/backup01 Extents: 1000 core.nextents = 1 READ1 (cp) each of the 100 file40mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=2048:fname=file40mb real 3m2.579s user 0m1.748s sys 0m26.298s READ2 (cp) each of the 100 file40mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=2048:fname=file40mb real 2m24.960s user 0m1.444s sys 0m26.126s ------------------------------------------------------------- Run: isize=256, fname=file50mb (Tue Feb 12 11:44:41 CET 2008) meta-data=/dev/sdb1 isize=256 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file50mb files into each CREATE_TIME:isize=256:fname=file50mb real 9m3.861s user 0m2.512s sys 1m36.162s Disk usage: /dev/sdb1 5857212416 51201368 5806011048 1% /content/backup01 Extents: 997 core.nextents = 1 3 core.nextents = 2 READ1 (cp) each of the 100 file50mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on 
READ1_TIME:isize=256:fname=file50mb real 4m2.087s user 0m2.036s sys 0m32.382s READ2 (cp) each of the 100 file50mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=256:fname=file50mb real 3m36.145s user 0m2.044s sys 0m32.598s ------------------------------------------------------------- Run: isize=2048, fname=file50mb (Tue Feb 12 12:01:28 CET 2008) meta-data=/dev/sdb1 isize=2048 agcount=32, agsize=45760496 blks = sectsz=512 attr=2 data = bsize=4096 blocks=1464335872, imaxpct=25 = sunit=16 swidth=96 blks, unwritten=1 naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=32768, version=2 = sectsz=512 sunit=16 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Mount options used: /dev/sdb1 on /content/backup01 type xfs (rw,noatime,nobarrier,logbufs=8,logbsize=256k) Creating 10 directories and cp 100 file50mb files into each CREATE_TIME:isize=2048:fname=file50mb real 4m59.963s user 0m2.676s sys 1m34.186s Disk usage: /dev/sdb1 5857212416 51203728 5806008688 1% /content/backup01 Extents: 996 core.nextents = 1 4 core.nextents = 2 READ1 (cp) each of the 100 file50mb files in each of the 10 directories NOTE: we read (cp) the first file from each directory, then the second file from each directory, and so on READ1_TIME:isize=2048:fname=file50mb real 3m35.429s user 0m1.776s sys 0m32.742s READ2 (cp) each of the 100 file50mb files in each of the 10 directories NOTE: we read all files in the first directory, then the all the files in the second, and so on READ2_TIME:isize=2048:fname=file50mb real 3m1.140s user 0m2.116s sys 0m32.810s ------------------------------------------------------------- Done Tue Feb 12 12:13:11 CET 2008 From owner-xfs@oss.sgi.com Tue Feb 12 04:15:13 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 04:15:44 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: 
X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_54, MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1CCF5fx011385 for ; Tue, 12 Feb 2008 04:15:10 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id XAA27266; Tue, 12 Feb 2008 23:15:23 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1CCFLLF60688918; Tue, 12 Feb 2008 23:15:22 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1CCFIPa60691677; Tue, 12 Feb 2008 23:15:18 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 12 Feb 2008 23:15:18 +1100 From: David Chinner To: Christian =?iso-8859-1?Q?R=F8snes?= Cc: xfs@oss.sgi.com Subject: Re: inode size benchmarking Message-ID: <20080212121518.GD155407@sgi.com> References: <1a4a774c0802120337x55fa2eb6qb7d52511fba3d11c@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1a4a774c0802120337x55fa2eb6qb7d52511fba3d11c@mail.gmail.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5785/Tue Feb 12 02:41:10 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14412 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Feb 12, 2008 at 12:37:36PM +0100, Christian Røsnes wrote: > I'm trying to figure out how different inode sizes on my system > affect the time it takes to: > > 1) Create directories each with files (using different file sizes) > > 2) Read all the files from 
dir1,
>       all the files from dir2,
>       ...
>
> 3) Read file1 from dir1,
>         file1 from dir2,
>         ...
>         file2 from dir1,
>         file2 from dir2,
>         ...

Ok, and destination file names are "file.XXX" so there's about 12 bytes
per shortform dir entry. That means the 2k inodes hold all the 100 files
in them directly. That's the only on-disk difference that changing the
inode size will make, and it clearly does not explain a 50% difference
in performance between 256 byte and 2k inodes in these tests.

> The test server used:
>
> * Debian 4 (Etch)
> * Kernel: Debian 2.6.18-6-amd64 #1 SMP Wed Jan 23 06:27:23 UTC 2008
> x86_64 GNU/Linux
> * CPU: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
> * MEM: 4GB RAM
> * DISK: DELL MD1000 7 disks (1TB SATA) in RAID5. PERC6/E controller
> * The test partition is 6TB.
                          ^^^

This does, though.

With 256 byte inodes, the allocator changes behaviour at filesystem
sizes > 1TB to keep inode numbers smaller than 32 bits. This change
means that data is no longer close to the inodes, so the disks seek
more as the filesystem moves between writing data and writing inodes.

With 2k inodes, that change doesn't occur until 8TB in size (as that
is the 32bit inode number limit with 2k inodes), so the allocator is
still keeping inode+data locality as close as possible on a fs size
of 6TB.

I suggest running the 256 byte inode numbers again with the "inode64"
mount option (so the allocator behaves the same as for 2k inodes) and
seeing how much difference remains....

Cheers,

Dave.
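[Editorial illustration] The 1TB and 8TB thresholds in the reply above can be checked with a little arithmetic. The sketch below is a simplified model, not XFS source code: it assumes a 32-bit inode number can name at most 2^32 inode-sized slots counted from the start of the filesystem, which matches the two limits quoted in the message. The function names are hypothetical; the block count and block size are taken from the mkfs output in the benchmark report (blocks=1464335872, bsize=4096).

```python
# Simplified model (an assumption for illustration, not XFS source code):
# a 32-bit inode number can name at most 2**32 inode-sized slots.

TIB = 2**40

# Filesystem size from the benchmark's mkfs output: blocks * bsize.
FS_BYTES = 1464335872 * 4096


def inode32_limit_bytes(inode_size: int, inode_bits: int = 32) -> int:
    """Bytes of filesystem that inode numbers of `inode_bits` bits can cover."""
    return (2**inode_bits) * inode_size


def fits_in_32bit_inodes(fs_size_bytes: int, inode_size: int) -> bool:
    """True if the whole filesystem is reachable with 32-bit inode numbers."""
    return fs_size_bytes <= inode32_limit_bytes(inode_size)


for isize in (256, 2048):
    limit_tib = inode32_limit_bytes(isize) // TIB
    ok = fits_in_32bit_inodes(FS_BYTES, isize)
    print(f"isize={isize}: 32-bit limit {limit_tib} TiB, test fs fits: {ok}")
```

Under this model the test filesystem exceeds the limit for 256 byte inodes but stays below it for 2k inodes, which is the asymmetry the reply attributes the timing difference to.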
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Feb 12 04:24:42 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 04:24:52 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1CCOXTF017824 for ; Tue, 12 Feb 2008 04:24:40 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id XAA27472; Tue, 12 Feb 2008 23:24:49 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1CCOlLF60495370; Tue, 12 Feb 2008 23:24:48 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1CCOkVf60615752; Tue, 12 Feb 2008 23:24:46 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 12 Feb 2008 23:24:46 +1100 From: David Chinner To: "Felix E. Klee" Cc: Timothy Shimmin , xfs@oss.sgi.com Subject: Re: Restoring damaged incremental XFS dump? 
Message-ID: <20080212122446.GE155407@sgi.com> References: <1202769551.16458.1236311973@webmail.messagingengine.com> <47B0E26D.7070807@sgi.com> <1202815035.5821.1236404891@webmail.messagingengine.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1202815035.5821.1236404891@webmail.messagingengine.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5785/Tue Feb 12 02:41:10 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14413 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Feb 12, 2008 at 12:17:15PM +0100, Felix E. Klee wrote: > On Tue, 12 Feb 2008 11:03:57 +1100, "Timothy Shimmin" > said: > > I'm not sure what you mean by "true snapshots". I wouldn't really > > call it snapshots as in what you could get if you froze the > > filesystem etc.. > > Oh, it does not freeze the filesystem - what a pity. I recall someone > telling me that it does. Seems like that was bad information or my > memory is failing on me. Use dm-snap to create a snapshot and do the backup from that. Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Feb 12 06:15:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 06:15:28 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=BAYES_00,J_CHICKENPOX_54, MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1CEFH35030257 for ; Tue, 12 Feb 2008 06:15:17 -0800 X-ASG-Debug-ID: 1202825740-1a8103580000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ti-out-0910.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8952E5C7E2C for ; Tue, 12 Feb 2008 06:15:40 -0800 (PST) Received: from ti-out-0910.google.com (ti-out-0910.google.com [209.85.142.191]) by cuda.sgi.com with ESMTP id 4HusHHAPZ0ksoYDw for ; Tue, 12 Feb 2008 06:15:40 -0800 (PST) Received: by ti-out-0910.google.com with SMTP id d10so477678tib.18 for ; Tue, 12 Feb 2008 06:15:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=VmcAsMxv0vW71DXFzQKhF/D/XIp0dSZw88sOLDhsS0k=; b=ctDR25lxW/U6xadMZzn3KKWt+g8qqGpf3qq6CxFyqNeuv/r3OMT7DeXHxSScMOe0GW5qheE6zK8mSHj3B6UkvYIJzeE9f5FJ3A1vWmcSCL6YuJxC8vyjo30MUod8RUVO9SX/6V9+j+517sN/Qe1q6lb7+0Rsr7y2B4lZwo9xK/M= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=UpK1pSl+rIPnOKdDqTWQXhthnXrX4htwsBsWs2cTmf85ov04JRwxkrolqctLnXsbfGMaBzNYD3mIGk8TOirgHWm2n/nDv8GAAMW/CwgNqRdV1DO+NUvyccfxgh7WZy4aQd/w1FE24bKkzyeePGk78m9YT1UP4EKV6BIAjgePFPU= 
Received: by 10.151.82.3 with SMTP id j3mr485047ybl.78.1202824275322; Tue, 12 Feb 2008 05:51:15 -0800 (PST) Received: by 10.150.191.13 with HTTP; Tue, 12 Feb 2008 05:51:15 -0800 (PST) Message-ID: <1a4a774c0802120551w523ecbb5l8fc5c73d22b6424b@mail.gmail.com> Date: Tue, 12 Feb 2008 14:51:15 +0100 From: "=?ISO-8859-1?Q?Christian_R=F8snes?=" To: "David Chinner" X-ASG-Orig-Subj: Re: inode size benchmarking Subject: Re: inode size benchmarking Cc: xfs@oss.sgi.com In-Reply-To: <20080212121518.GD155407@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <1a4a774c0802120337x55fa2eb6qb7d52511fba3d11c@mail.gmail.com> <20080212121518.GD155407@sgi.com> X-Barracuda-Connect: ti-out-0910.google.com[209.85.142.191] X-Barracuda-Start-Time: 1202825741 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42052 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5788/Tue Feb 12 05:09:49 2008 on oss.sgi.com X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id m1CEFI35030259 X-archive-position: 14414 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.rosnes@gmail.com Precedence: bulk X-list: xfs On Feb 12, 2008 1:15 PM, David Chinner wrote: > On Tue, Feb 12, 2008 at 12:37:36PM +0100, Christian Røsnes wrote: > > The test server used: > > > > * Debian 4 (Etch) > > * Kernel: Debian 2.6.18-6-amd64 #1 SMP Wed Jan 23 06:27:23 UTC 2008 > > x86_64 GNU/Linux > > * CPU: Intel(R) Xeon(R) CPU E5405 @ 2.00GHz > > * MEM: 4GB 
RAM > > * DISK: DELL MD1000 7 disks (1TB SATA) in RAID5. PERC6/E controller > > * The test partition is 6TB. > ^^^ > > This does, though. > > With 256 byte inodes, the allocator changes behaviour at filesystem > sizes > 1TB to keep inodes at smaller than 32 bits. This change > means that data is no longer close to the inodes, thereby seeking > the disks more as it moves between writing data and writing inodes. > > With 2k inodes, that change doesn't occur until 8TB in size (as that > is the 32bit inode number limit with 2k inodes), so the allocator is > still keeping inode+data locality as close as possible on a fs size > of 6TB. > > I suggest running the 256 byte inode numbers again with the "inode64" > mount option (so the allocator behaves the same as for 2k inodes) and > seeing how much difference remains.... > Yes, with mount option inode64 the inode=256 tests now run as fast as the inode=2048 tests. Thanks Christian From owner-xfs@oss.sgi.com Tue Feb 12 13:01:45 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 13:01:49 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_45 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1CL1hnP027620 for ; Tue, 12 Feb 2008 13:01:45 -0800 X-ASG-Debug-ID: 1202850126-6e0703710000-w1Z2WR X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ishtar.tlinx.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5E642DF588B; Tue, 12 Feb 2008 13:02:06 -0800 (PST) Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by cuda.sgi.com with ESMTP id ePqrGlFM4vPS1kN4; Tue, 12 Feb 2008 13:02:06 -0800 (PST) Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id 
m1CL25UN029071; Tue, 12 Feb 2008 13:02:05 -0800 Message-ID: <47B2094D.50406@tlinx.org> Date: Tue, 12 Feb 2008 13:02:05 -0800 From: Linda Walsh User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: David Chinner CC: Linux-Xfs , LKML X-ASG-Orig-Subj: Re: xfs [_fsr] probs in 2.6.24.0 Subject: Re: xfs [_fsr] probs in 2.6.24.0 References: <47B0F00D.3060802@tlinx.org> <20080212085802.GA155407@sgi.com> In-Reply-To: <20080212085802.GA155407@sgi.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: ishtar.tlinx.org[64.81.245.74] X-Barracuda-Start-Time: 1202850127 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42075 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5791/Tue Feb 12 10:37:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14415 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@tlinx.org Precedence: bulk X-list: xfs David Chinner wrote: > Filesystem bugs rarely hang systems hard like that - more likely is > a hardware or driver problem. And neither of the lockdep reports > below are likely to be responsible for a system wide, no-response > hang. --- "Ish", the 32-bitter, has been the only hard-hanger. Since upgrading to 2.6.24, it's crashed once, inexplicably, but has since stayed up longer than it has since I started with the whole SATA fiasco (which I intend to inflict upon myself again, as soon as I get back to a "stable" config -- masochistic nature I suppose). 
> If your hardware or drivers are unstable, then XFS cannot be
> expected to reliably work. Given that xfs_fsr apparently triggers
> the hangs, I'd suggest putting lots of I/O load on your disk subsystem
> by copying files around with direct I/O (just like xfs_fsr does) to
> try to reproduce the problem.
---
The hardware drivers in ish are the older PATA drivers -- nothing
new...'cept I did add the tickless option for the system clock. I've only
been running XFS on this system (mostly same hardware, disks upgraded)
for about 6-7 years.

> Perhaps by running xfs_fsr manually you could reproduce the
> problem while you are sitting in front of the machine...
----
Um...yeah, AND with multiple "cp"s of multi-gig files going on at the
same time, both local, by a sister machine via NFS, and a 3rd
machine tapping (not banging) away via CIFS. These were on top of
normal server duties. Whenever I stress it on *purpose* and watch it,
it works fine. GRRRRRRR....I HATE THAT!!!

>> xfs_fsr/2119 is trying to acquire lock:
>> (&mm->mmap_sem){----}, at: [] dio_get_page+0x62/0x160
>>
>> but task is already holding lock:
>> (&(&ip->i_iolock)->mr_lock){----}, at: [] xfs_ilock+0x5b/0xb0
>
> dio_get_page() takes the mmap_sem of the process's
> vma that has the pages we do I/O into. That's not new.
> We're holding the xfs inode iolock at this point to protect
> against truncate and simultaneous buffered I/O races and
> this is also unchanged. i.e. this is normal.
---
Uh huh...please note I'm not trying to point fingers at xfs_fsr,
but the locking diagnostics associated with xfs_fsr have been the only
"hint" of anything "irregular", _at_ _least_, that is, since I removed
the SATA controller+disk on 'ish32'.

The file system(s) going "offline" due to xfs-detected filesystem errors
has only happened *once* on asa, the 64-bit machine.
It's a fairly new machine w/o added hardware -- but this only happened
in 2.24.0 when I added the tickless clock option, which sure seems like
a remote possibility for causing an xfs error, but could be.

A 3rd linux system, hardware poor, "ast-32", was up over 20 days on
2.23.14 (w/tickless) before I took it down for a 2.24.2 kernel install
(its single 20G disk is so old that it doesn't support barriers).

>
>> which lock already depends on the new lock.
>> the existing dependency chain (in reverse order) is:
>
> munmap() dropping the last reference to its vm_file and
> calling ->release() which causes a truncate of speculatively
> allocated space to take place. IOWs, ->release() is called
> with the mmap_sem held. Hmmm....
>
> Looking at it in terms of i_mutex, other filesystems hold
> i_mutex over dio_get_page() (all those that use DIO_LOCKING)
> so the question is whether we are allowed to take the i_mutex
> in ->release. I note that both reiserfs and hfsplus take
> i_mutex in ->release as well as use DIO_LOCKING, so this
> problem is not isolated to XFS.
>
> However, it would appear that mmap_sem -> i_mutex is illegal
> according to the comment at the head of mm/filemap.c. While we are
> not using i_mutex in this case, the inversion would seem to be
> equivalent in nature.
>
> There's not going to be a quick fix for this.
----
What could be the consequences of this locking anomaly?

For example, in NFS, I have enabled "allow direct I/O on NFS files".
The times when the system has been unstable would be around the time
when the local machine might be running xfs_fsr while a remote system
is using NFS to write its backups. The exact timing of things depends
on the dump-level and internet-'book-keeping' work done on the local
system, which adds an element of uncertainty as to whether or not
xfs_fsr might be running at the same time NFS might be doing direct I/O.
It's also possible for a local backup to be writing to a backup disk at the same time xfs_fsr is running, since they trigger off of different cron entries (xfs_fsr off of cron.daily which runs "whenever"), and backups which run at mostly fixed times. The local backup uses xfs_dump (which might use some direct I/O to read?) but the writes go through compression, and are likely using buffered i/o. > > And the other one: > >> Feb 7 02:01:50 kern: >> ------------------------------------------------------- >> Feb 7 02:01:50 kern: xfs_fsr/6313 is trying to acquire lock: >> Feb 7 02:01:50 kern: (&(&ip->i_lock)->mr_lock/2){----}, at: >> [] xfs_ilock+0x82/0xc0 >> Feb 7 02:01:50 kern: >> Feb 7 02:01:50 kern: but task is already holding lock: >> Feb 7 02:01:50 kern: (&(&ip->i_iolock)->mr_lock/3){--..}, at: >> [] xfs_ilock+0xa5/0xc0 >> Feb 7 02:01:50 kern: >> Feb 7 02:01:50 kern: which lock already depends on the new lock. > > Looks like yet another false positive. Basically we do this > in xfs_swap_extents: > > inode A: i_iolock class 2 > inode A: i_ilock class 2 > inode B: i_iolock class 3 > inode B: i_ilock class 3 > ..... > inode A: unlock ilock > inode B: unlock ilock > ..... >>>>>> inode A: ilock class 2 > inode B: ilock class 3 > > And lockdep appears to be complaining about the relocking of inode A > as class 2 because we've got a class 3 iolock still held, hence > violating the order it saw initially. There's no possible deadlock > here so we'll just have to add more hacks to the annotation code to make > lockdep happy. ---- Is there a reason to unlock and relock the same inode while the level 3 lock is held -- i.e. does 'unlocking ilock' allow some increased 'throughput' for some other potential process to access the same inode? I'd expect not, if the 'iolock' is held, but just a question. 
I certainly don't understand the exact effects of the various locks in question, but it seems that the 2nd two groups where the inodes are unlocked and relocked are superfluous if an iolock for those inodes remains held. But again, I don't really know what the locks are doing, so don't know. Sorry for all the bother. Just trying to figure out why a system that was rock-solid (2-3 month uptimes, easily, only planned downs), to going all flako on me when I tried to add SATA and upgraded kernel to include latest SATA code & drivers. Unfortunately part of that was adding udev in place of a static /dev, so that's another unknown that I know is flakey at times (had a SATA sdb disk go off line with a supposed HW-reset error, then have it come back on line as "sdc"!) That's certainly a bit weird from my perspective, but hey, some might consider it a feature, so who am I to argue....:-) From owner-xfs@oss.sgi.com Tue Feb 12 13:24:03 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 13:24:06 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1CLO2Wa029140 for ; Tue, 12 Feb 2008 13:24:03 -0800 X-ASG-Debug-ID: 1202851465-76e2026c0000-w1Z2WR X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C242BDF5B27; Tue, 12 Feb 2008 13:24:26 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com with ESMTP id GKG9H9sBPgtslPVr; Tue, 12 Feb 2008 13:24:26 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m1CLOPCr008813; Tue, 12 Feb 2008 16:24:25 -0500 Received: from 
lacrosse.corp.redhat.com (lacrosse.corp.redhat.com [172.16.52.154]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m1CLOPuk022208; Tue, 12 Feb 2008 16:24:25 -0500 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by lacrosse.corp.redhat.com (8.12.11.20060308/8.11.6) with ESMTP id m1CLOLr3013354; Tue, 12 Feb 2008 16:24:22 -0500 Message-ID: <47B20E84.6020707@sandeen.net> Date: Tue, 12 Feb 2008 15:24:20 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (X11/20071115) MIME-Version: 1.0 To: Linda Walsh CC: David Chinner , Linux-Xfs , LKML X-ASG-Orig-Subj: Re: xfs [_fsr] probs in 2.6.24.0 Subject: Re: xfs [_fsr] probs in 2.6.24.0 References: <47B0F00D.3060802@tlinx.org> <20080212085802.GA155407@sgi.com> <47B2094D.50406@tlinx.org> In-Reply-To: <47B2094D.50406@tlinx.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Barracuda-Connect: mx1.redhat.com[66.187.233.31] X-Barracuda-Start-Time: 1202851466 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42077 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14416 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Linda Walsh wrote: > > David Chinner wrote: >> Filesystem bugs rarely hang systems hard like that - more likely is >> a hardware or driver problem. 
And neither of the lockdep reports >> below are likely to be responsible for a system wide, no-response >> hang. > --- > "Ish", the 32-bitter, has been the only hard-hanger. 4k stacks? -Eric From owner-xfs@oss.sgi.com Tue Feb 12 13:46:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 13:46:53 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1CLkimv030260 for ; Tue, 12 Feb 2008 13:46:47 -0800 X-ASG-Debug-ID: 1202852828-53b700630000-w1Z2WR X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ishtar.tlinx.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 381475CB32B for ; Tue, 12 Feb 2008 13:47:08 -0800 (PST) Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by cuda.sgi.com with ESMTP id 3yo09zc0DqKeOQel for ; Tue, 12 Feb 2008 13:47:08 -0800 (PST) Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id m1CLi73k029621; Tue, 12 Feb 2008 13:44:07 -0800 Message-ID: <47B21327.3080502@tlinx.org> Date: Tue, 12 Feb 2008 13:44:07 -0800 From: Linda Walsh User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: Eric Sandeen CC: David Chinner , Linux-Xfs , LKML X-ASG-Orig-Subj: Re: xfs [_fsr] probs in 2.6.24.0 Subject: Re: xfs [_fsr] probs in 2.6.24.0 References: <47B0F00D.3060802@tlinx.org> <20080212085802.GA155407@sgi.com> <47B2094D.50406@tlinx.org> <47B20E84.6020707@sandeen.net> In-Reply-To: <47B20E84.6020707@sandeen.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: ishtar.tlinx.org[64.81.245.74] X-Barracuda-Start-Time: 1202852829 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 
-2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42080 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14417 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@tlinx.org Precedence: bulk X-list: xfs Eric Sandeen wrote: > Linda Walsh wrote: >> David Chinner wrote: >>> Filesystem bugs rarely hang systems hard like that - more likely is >>> a hardware or driver problem. And neither of the lockdep reports >>> below are likely to be responsible for a system wide, no-response >>> hang. >> --- >> "Ish", the 32-bitter, has been the only hard-hanger. > > 4k stacks? ---- But but but...almost from the day they were introduced. And these are more recent probs. Has stack usage increased for some reason, :-(. I do have the option to detect stack-overflow turned on as well -- guess it doesn't work so well? If they are that problematic, maybe selecting xfs as a config option should force 8k stacks (ugly solution, but might eliminate some lost hair (from pulling it out) for end users....? Guess I should go back to 8k's for now...seems odd that it'd pop up now, but maybe it's the xtra NFS loading? Sigh. 
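A minimal sketch of how the stack-related kernel options raised in this thread could be checked from a kernel .config, written in Python purely for illustration. CONFIG_4KSTACKS and CONFIG_DEBUG_STACK_USAGE are named elsewhere in the thread; CONFIG_DEBUG_STACKOVERFLOW is only an assumption about which "detect stack-overflow" option is meant, and the function name stack_options is hypothetical.

```python
import re

# CONFIG_DEBUG_STACKOVERFLOW is an assumption about which "detect
# stack-overflow" option is meant; the other two are named in the thread.
STACK_OPTIONS = (
    "CONFIG_4KSTACKS",
    "CONFIG_DEBUG_STACKOVERFLOW",
    "CONFIG_DEBUG_STACK_USAGE",
)


def stack_options(config_text):
    """Map each stack-related option to its value in a .config text.

    Options that are absent or commented out ("# CONFIG_FOO is not set")
    map to None.
    """
    found = {name: None for name in STACK_OPTIONS}
    for line in config_text.splitlines():
        match = re.match(r"(CONFIG_\w+)=(.*)$", line.strip())
        if match and match.group(1) in found:
            found[match.group(1)] = match.group(2)
    return found


if __name__ == "__main__":
    sample = "# CONFIG_4KSTACKS is not set\nCONFIG_DEBUG_STACK_USAGE=y\n"
    for name, value in stack_options(sample).items():
        print(name, "=", value)
```

On a 32-bit x86 kernel of this era, CONFIG_4KSTACKS reported as None would normally mean the kernel was built with 8k stacks.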
From owner-xfs@oss.sgi.com Tue Feb 12 13:54:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 13:54:06 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1CLs2Dq030840 for ; Tue, 12 Feb 2008 13:54:04 -0800 X-ASG-Debug-ID: 1202853265-06df016a0000-w1Z2WR X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B7B615CB3FC; Tue, 12 Feb 2008 13:54:25 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com with ESMTP id 4cUlNrkgByVi7EdG; Tue, 12 Feb 2008 13:54:25 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m1CLsOlX017781; Tue, 12 Feb 2008 16:54:24 -0500 Received: from lacrosse.corp.redhat.com (lacrosse.corp.redhat.com [172.16.52.154]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m1CLsO6X011100; Tue, 12 Feb 2008 16:54:24 -0500 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by lacrosse.corp.redhat.com (8.12.11.20060308/8.11.6) with ESMTP id m1CLsNv2019954; Tue, 12 Feb 2008 16:54:24 -0500 Message-ID: <47B2158F.2080305@sandeen.net> Date: Tue, 12 Feb 2008 15:54:23 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (X11/20071115) MIME-Version: 1.0 To: Linda Walsh CC: David Chinner , Linux-Xfs , LKML X-ASG-Orig-Subj: Re: xfs [_fsr] probs in 2.6.24.0 Subject: Re: xfs [_fsr] probs in 2.6.24.0 References: <47B0F00D.3060802@tlinx.org> <20080212085802.GA155407@sgi.com> <47B2094D.50406@tlinx.org> <47B20E84.6020707@sandeen.net> <47B21327.3080502@tlinx.org> In-Reply-To: <47B21327.3080502@tlinx.org> Content-Type: 
text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Barracuda-Connect: mx1.redhat.com[66.187.233.31] X-Barracuda-Start-Time: 1202853266 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42080 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14418 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Linda Walsh wrote: > > Eric Sandeen wrote: >> Linda Walsh wrote: >>> David Chinner wrote: >>>> Filesystem bugs rarely hang systems hard like that - more likely is >>>> a hardware or driver problem. And neither of the lockdep reports >>>> below are likely to be responsible for a system wide, no-response >>>> hang. >>> --- >>> "Ish", the 32-bitter, has been the only hard-hanger. >> 4k stacks? > ---- > But but but...almost from the day they were introduced. And > these are more recent probs. Has stack usage increased for some reason, > :-(. I do have the option to detect stack-overflow turned on as well > -- guess it doesn't work so well? Resource requirements grow over time, film at 11? :) the checker is a random thing, it checks only on interrupts; it won't always hit. you could try CONFIG_DEBUG_STACK_USAGE too, each thread prints max stack used when it exits, to see if you're getting close on normal usage. Or just use 8k. 
-Eric From owner-xfs@oss.sgi.com Tue Feb 12 13:59:01 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 13:59:03 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1CLwsVr031435 for ; Tue, 12 Feb 2008 13:58:59 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id IAA16418; Wed, 13 Feb 2008 08:59:13 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1CLxALF61516213; Wed, 13 Feb 2008 08:59:11 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1CLx85161448451; Wed, 13 Feb 2008 08:59:08 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Wed, 13 Feb 2008 08:59:07 +1100 From: David Chinner To: Linda Walsh Cc: David Chinner , Linux-Xfs , LKML Subject: Re: xfs [_fsr] probs in 2.6.24.0 Message-ID: <20080212215907.GH155407@sgi.com> References: <47B0F00D.3060802@tlinx.org> <20080212085802.GA155407@sgi.com> <47B2094D.50406@tlinx.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47B2094D.50406@tlinx.org> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14419 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Feb 12, 2008 at 01:02:05PM -0800, Linda Walsh wrote: > David Chinner wrote: > >Perhaps by running 
xfs_fsr manually you could reproduce the > >problem while you are sitting in front of the machine... > ---- > Um...yeah, AND with multiple "cp's of multi-gig files > going on at same time, both local, by a sister machine via NFS, > and a 3rd machine tapping (not banging) away via CIFS. > These were on top of normal server duties. Whenever I stress > it on *purpose* and watch it, works fine. GRRRRRRR....I HATE > THAT!!! I feel your pain. > The file system(s) going "offline" due > to xfs-detected filesystem errors has only happened *once* on > asa, the 64-bit machine. It's a fairly new machine w/o added > hardware -- but this only happened in 2.6.24.0 when I added the > tickless clock option, which sure seems like a remote possibility for > causing an xfs error, but could be. Well, tickless is new and shiny and I doubt anyone has done much testing with XFS on tickless kernels. Still, if that's a new config option you set, change it back to what you had for .23 on that hardware and try again. > >Looking at it in terms of i_mutex, other filesystems hold > >i_mutex over dio_get_page() (all those that use DIO_LOCKING) > >so question is whether we are allowed to take the i_mutex > >in ->release. I note that both reiserfs and hfsplus take > >i_mutex in ->release as well as use DIO_LOCKING, so this > >problem is not isolated to XFS. > > > >However, it would appear that mmap_sem -> i_mutex is illegal > >according to the comment at the head of mm/filemap.c. While we are > >not using i_mutex in this case, the inversion would seem to be > >equivalent in nature. > > > >There's not going to be a quick fix for this. > ---- > What could be the consequences of this locking anomaly? If you have a multithreaded application that mixes mmap and direct I/O, and you have a simultaneous munmap() call and read() to the same file, you might be able to deadlock access to that file.
However, you'd have to be certifiably insane to write an application that did this (mix mmap and direct I/O to the same file at the same time), so I think exposure is pretty limited. > I.e., for example, in NFS, I have enabled "allow direct I/O on NFS > files". That's client side direct I/O, which is not what the server does. Client side direct I/O results in synchronous buffered I/O on the server, which will thrash your disks pretty hard. The config option help does warn you about this. ;) > >And the other one: > > > >>Feb 7 02:01:50 kern: > >>------------------------------------------------------- > >>Feb 7 02:01:50 kern: xfs_fsr/6313 is trying to acquire lock: > >>Feb 7 02:01:50 kern: (&(&ip->i_lock)->mr_lock/2){----}, at: > >>[] xfs_ilock+0x82/0xc0 > >>Feb 7 02:01:50 kern: > >>Feb 7 02:01:50 kern: but task is already holding lock: > >>Feb 7 02:01:50 kern: (&(&ip->i_iolock)->mr_lock/3){--..}, at: > >>[] xfs_ilock+0xa5/0xc0 > >>Feb 7 02:01:50 kern: > >>Feb 7 02:01:50 kern: which lock already depends on the new lock. > > > >Looks like yet another false positive. Basically we do this > >in xfs_swap_extents: > > > > inode A: i_iolock class 2 > > inode A: i_ilock class 2 > > inode B: i_iolock class 3 > > inode B: i_ilock class 3 > > ..... > > inode A: unlock ilock > > inode B: unlock ilock > > ..... > >>>>>> inode A: ilock class 2 > > inode B: ilock class 3 > > > >And lockdep appears to be complaining about the relocking of inode A > >as class 2 because we've got a class 3 iolock still held, hence > >violating the order it saw initially. There's no possible deadlock > >here so we'll just have to add more hacks to the annotation code to make > >lockdep happy. > ---- > Is there a reason to unlock and relock the same inode while > the level 3 lock is held -- i.e. does 'unlocking ilock' allow some > increased 'throughput' for some other potential process to access > the same inode? It prevents a single thread deadlock when doing transaction reservation. i.e. 
the process of setting up a transaction can require the ilock to be taken, and hence we have to drop it before and pick it back up after the transaction reservation. We hold on to the iolock to prevent the inode from having new I/O started while we do the transaction reservation, so it's in the same state after the reservation as it was before.... We have to hold both locks to guarantee exclusive access to the inode, so once we have the reservation we need to pick the ilocks back up. The way we do it here does not violate lock ordering at all (iolock before ilock on a single inode, and ascending inode number order for multiple inodes), but lockdep is not smart enough to know that. Hence we need more complex annotations to shut it up. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Feb 12 16:47:11 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 12 Feb 2008 16:47:15 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1D0l7ld010269 for ; Tue, 12 Feb 2008 16:47:11 -0800 X-ASG-Debug-ID: 1202863650-53bc02db0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 277775CC743 for ; Tue, 12 Feb 2008 16:47:31 -0800 (PST) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id mUiOUkD5OBjSHSuC for ; Tue, 12 Feb 2008 16:47:31 -0800 (PST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Aq4HAC7LsUc7p0u0Wmdsb2JhbACBWY5YASCcaw X-IronPort-AV: E=Sophos;i="4.25,342,1199626200"; d="scan'208";a="51209640" Received: from 
ppp59-167-75-180.lns1.mel6.internode.on.net (HELO jdc.jasonjgw.net) ([59.167.75.180]) by ipmail04.adl2.internode.on.net with ESMTP; 13 Feb 2008 11:17:29 +1030 Received: from jdc.jasonjgw.net (ip6-localhost [IPv6:::1]) by jdc.jasonjgw.net (8.14.2/8.14.2/Debian-3) with ESMTP id m1D0keIc004599 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Wed, 13 Feb 2008 11:46:40 +1100 Received: (from jason@localhost) by jdc.jasonjgw.net (8.14.2/8.14.2/Submit) id m1D0ke8v004598 for xfs@oss.sgi.com; Wed, 13 Feb 2008 11:46:40 +1100 Date: Wed, 13 Feb 2008 11:46:39 +1100 From: Jason White To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Subject: Re: [PATCH] Possible fix for 2.6.24 xfs_file_readdir crash Message-ID: <20080213004639.GA4534@jdc.jasonjgw.net> Mail-Followup-To: xfs@oss.sgi.com References: <20080205052418.GU155259@sgi.com> <20080209120423.GA6699@diesel.geggus.net> <20080211001727.GO155407@sgi.com> <1202711725.4679.7.camel@humanitas.fs3.ph> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1202711725.4679.7.camel@humanitas.fs3.ph> User-Agent: Mutt/1.5.17+20080114 (2008-01-14) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1202863652 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42092 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14420 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com 
X-original-sender: jason@jasonjgw.net Precedence: bulk X-list: xfs On Mon, Feb 11, 2008 at 02:35:25PM +0800, Federico Sevilla III wrote: > 2.6.24.2 has been released to address the vmsplice issue. Unfortunately, > no other changes seem to have been included. Hopefully, the > xfs_file_readdir patch will make it to 2.6.24.3. Which platforms are affected? My machines are x86_64, and I've just upgraded to avoid the vmsplice exploit. Of course, I could compile a patched kernel but if this platform isn't affected then I might as well wait for 2.6.24.3. I know that's a really lazy kind of question. From owner-xfs@oss.sgi.com Wed Feb 13 00:39:49 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 00:39:53 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=1.0 required=5.0 tests=ANY_BOUNCE_MESSAGE,AWL, BAYES_50,VBOUNCE_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1D8dlxB007508 for ; Wed, 13 Feb 2008 00:39:49 -0800 X-ASG-Debug-ID: 1202892010-07ea02c00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from omr-m22.mx.aol.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id CD0FEDFCB5F for ; Wed, 13 Feb 2008 00:40:10 -0800 (PST) Received: from omr-m22.mx.aol.com (omr-m22.mx.aol.com [64.12.136.130]) by cuda.sgi.com with ESMTP id cIyMeCHHTV7YGG9A for ; Wed, 13 Feb 2008 00:40:10 -0800 (PST) Received: from rly-da10.mx.aol.com (rly-da10.mx.aol.com [205.188.159.56]) by omr-m22.mx.aol.com (v117.7) with ESMTP id MAILOMRM224-7d9b47b2ab93d0; Wed, 13 Feb 2008 03:34:27 -0500 Received: from localhost (localhost) by rly-da10.mx.aol.com (8.14.1/8.14.1) id m1D8YISu010539; Wed, 13 Feb 2008 03:34:27 -0500 Date: Wed, 13 Feb 2008 03:34:27 -0500 From: Mail Delivery Subsystem Message-Id: 
<200802130834.m1D8YISu010539@rly-da10.mx.aol.com> To: MIME-Version: 1.0 Content-Type: multipart/report; report-type=delivery-status; boundary="m1D8YISu010539.1202891667/rly-da10.mx.aol.com" X-ASG-Orig-Subj: Returned mail: see transcript for details Subject: Returned mail: see transcript for details Auto-Submitted: auto-generated (failure) X-AOL-INRLY: host.85.130.123.121.customers.net-surf.net [85.130.123.121] rly-da10 X-AOL-IP: 205.188.159.56 X-Barracuda-Connect: omr-m22.mx.aol.com[64.12.136.130] X-Barracuda-Start-Time: 1202892011 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=FORGED_AOL_RCVD X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42122 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 FORGED_AOL_RCVD Received forged, contains fake AOL relays X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14421 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: MAILER-DAEMON@aol.com Precedence: bulk X-list: xfs This is a MIME-encapsulated message --m1D8YISu010539.1202891667/rly-da10.mx.aol.com The original message was received at Wed, 13 Feb 2008 03:33:59 -0500 from host.85.130.123.121.customers.net-surf.net [85.130.123.121] *** ATTENTION *** Your e-mail is being returned to you because there was a problem with its delivery. The address which was undeliverable is listed in the section labeled: "----- The following addresses had permanent fatal errors -----". The reason your mail is being returned to you is listed in the section labeled: "----- Transcript of Session Follows -----". 
The line beginning with "<<<" describes the specific reason your e-mail could not be delivered. The next line contains a second error message which is a general translation for other e-mail servers. Please direct further questions regarding this message to your e-mail administrator. --AOL Postmaster ----- The following addresses had permanent fatal errors ----- (reason: 554 TRANSACTION FAILED - Unrepairable Virus Detected. Your mail has not been sent.) ----- Transcript of session follows ----- ... while talking to air-da01.mail.aol.com.: >>> DATA <<< 554 TRANSACTION FAILED - Unrepairable Virus Detected. Your mail has not been sent. 554 5.0.0 Service unavailable --m1D8YISu010539.1202891667/rly-da10.mx.aol.com Content-Type: message/delivery-status Reporting-MTA: dns; rly-da10.mx.aol.com Arrival-Date: Wed, 13 Feb 2008 03:33:59 -0500 Final-Recipient: RFC822; peopleofbulgaria@aol.com Action: failed Status: 5.0.0 Remote-MTA: DNS; air-da01.mail.aol.com Diagnostic-Code: SMTP; 554 TRANSACTION FAILED - Unrepairable Virus Detected. Your mail has not been sent. 
Last-Attempt-Date: Wed, 13 Feb 2008 03:34:27 -0500 --m1D8YISu010539.1202891667/rly-da10.mx.aol.com Content-Type: text/rfc822-headers Received: from oss.sgi.com (host.85.130.123.121.customers.net-surf.net [85.130.123.121]) by rly-da10.mx.aol.com (v121.4) with ESMTP id MAILRELAYINDA104-a9447b2ab73cb; Wed, 13 Feb 2008 03:33:56 -0500 From: xfs@oss.sgi.com To: peopleofbulgaria@aol.com Subject: Returned mail: see transcript for details Date: Wed, 13 Feb 2008 10:35:32 +0200 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0012_8B230F7E.4C0562E4" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2600.0000 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 X-AOL-IP: 85.130.123.121 X-AOL-SCOLL-SCORE:0:2:298053088:9395240 X-AOL-SCOLL-URL_COUNT: X-AOL-SCOLL-AUTHENTICATION: listenair ; SPF_helo : n X-AOL-SCOLL-AUTHENTICATION: listenair ; SPF_822_from : n Message-ID: <200802130333.a9447b2ab73cb@rly-da10.mx.aol.com> --m1D8YISu010539.1202891667/rly-da10.mx.aol.com-- From owner-xfs@oss.sgi.com Wed Feb 13 02:58:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 02:58:51 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=BAYES_00,J_CHICKENPOX_45, MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1DAwiGU016267 for ; Wed, 13 Feb 2008 02:58:47 -0800 X-ASG-Debug-ID: 1202900348-2f0400520000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from rn-out-0910.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 2166F5CEC1F for ; Wed, 13 Feb 2008 02:59:08 -0800 (PST) Received: from rn-out-0910.google.com (rn-out-0910.google.com [64.233.170.189]) by cuda.sgi.com with ESMTP id CJ9bLC7cDArTyKCv for ; Wed, 13 Feb 2008 02:59:08 
-0800 (PST) Received: by rn-out-0910.google.com with SMTP id a43so2838065rne.5 for ; Wed, 13 Feb 2008 02:59:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; bh=VTEvSBNmLNZQik3UDLmxYMvpqgalarf91arfp03XQlM=; b=iRbsK4Ya7FaEa1nVbaAGdMjKs9Hk35l5SRGRnkIpO6Gs09Ll0qZIZdvfUVoC7nbHmfdY/aJMwVn+uVjtQ9ajXG0EZWqbzc9zsigMBUOL3iLTDpGTogCC0ueOhnmawBWX/rHe+zg03Pc31S1iOZxHS5NzIqwbw7F7h/izBaVD5oM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=qTsDhTYEAuh3y0QypgC65ue+XTxM0vGsL9/+0s+wF1t9+wsBFjG1kG/v++f7KGLisasNpok506LJQ50rujBIYjFfD06JYaFEELzHu8IXkoucz47zztTn3+pE8oAGt2ElL4umDoXQdOCg6j0WeHc2Dr+A6lfqm4B0Mn0jUwSilK4= Received: by 10.150.155.1 with SMTP id c1mr926891ybe.15.1202899911154; Wed, 13 Feb 2008 02:51:51 -0800 (PST) Received: by 10.150.191.13 with HTTP; Wed, 13 Feb 2008 02:51:51 -0800 (PST) Message-ID: <1a4a774c0802130251h657a52f7lb97942e7afdf6e3f@mail.gmail.com> Date: Wed, 13 Feb 2008 11:51:51 +0100 From: "=?ISO-8859-1?Q?Christian_R=F8snes?=" To: xfs@oss.sgi.com X-ASG-Orig-Subj: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c Subject: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Barracuda-Connect: rn-out-0910.google.com[64.233.170.189] X-Barracuda-Start-Time: 1202900349 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 
3.1.42130 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14422 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.rosnes@gmail.com Precedence: bulk X-list: xfs Over the past month I've been hit with two cases of "xfs_trans_cancel at line 1150". The two errors occurred on different raid sets. In both cases the error happened during rsync from a remote server to this server, and the local partition which reported the error was 99% full (as reported by df -k, see below for details). System: Dell 2850 Mem: 4GB RAM OS: Debian 3 (32-bit) Kernel: 2.6.17.7 (custom compiled) I've been running this kernel since Aug 2006 without any of these problems, until a month ago. I've not used any of the previous kernels in the 2.6.17 series. /usr/src/linux-2.6.17.7# grep 4K .config # CONFIG_4KSTACKS is not set Are there any known XFS problems with this kernel version and nearly full partitions? I'm thinking about upgrading the kernel to a newer version, to see if it fixes this problem. Are there any known XFS problems with version 2.6.24.2? Thanks Christian -- case logs: case 1: Filesystem "sdb1": XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c.
Caller 0xc0208467 xfs_trans_cancel+0x54/0xe1 xfs_create+0x527/0x563 xfs_create+0x527/0x563 xfs_vn_mknod+0x1a9/0x3bd xfs_dir2_leafn_lookup_int+0x49/0x452 xfs_buf_free+0x7f/0x84 xfs_da_state_free+0x54/0x5a xfs_dir2_node_lookup+0x95/0xa0 xfs_dir2_lookup+0xf5/0x125 mntput_no_expire+0x14/0x71 xfs_vn_permission+0x1b/0x21 xfs_vn_create+0x13/0x17 vfs_create+0xc2/0xf8 open_namei+0x16d/0x5b3 do_filp_open+0x26/0x3c get_unused_fd+0x5a/0xb0 do_sys_open+0x40/0xb6 sys_open+0x13/0x17 syscall_call+0x7/0xb xfs_force_shutdown(sdb1,0x8) called from line 1151 of file fs/xfs/xfs_trans.c. Return address = 0xc0214e7d Filesystem "sdb1": Corruption of in-memory data detected. Shutting down filesystem: sdb1 Please umount the filesystem, and rectify the problem(s) mount options: /dev/sdb1 on /data type xfs (rw,noatime) df -k /dev/sdb1 286380096 283256112 3123984 99% /data sdb1 is an internal raid. Case 1 occurred last night, and I'm now about to run repair on that partition. case2: Filesystem "sdd1": XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c. Caller 0xc0208467 xfs_trans_cancel+0x54/0xe1 xfs_create+0x527/0x563 xfs_create+0x527/0x563 xfs_vn_mknod+0x1a9/0x3bd qdisc_restart+0x13/0x152 in_group_p+0x26/0x2d xfs_iaccess+0xad/0x15b xfs_access+0x2b/0x33 xfs_dir2_lookup+0xa5/0x125 mntput_no_expire+0x14/0x71 xfs_vn_permission+0x1b/0x21 xfs_vn_create+0x13/0x17 vfs_create+0xc2/0xf8 open_namei+0x16d/0x5b3 do_filp_open+0x26/0x3c get_unused_fd+0x5a/0xb0 do_sys_open+0x40/0xb6 sys_open+0x13/0x17 syscall_call+0x7/0xb xfs_force_shutdown(sdd1,0x8) called from line 1151 of file fs/xfs/xfs_trans.c. Return address = 0xc0214e7d Filesystem "sdd1": Corruption of in-memory data detected. Shutting down filesystem: sdd1 Please umount the filesystem, and rectify the problem(s) mount options: /dev/sdd1 on /content/raid03 type xfs (rw,noatime,logbufs=8,nobarrier) df -k: /dev/sdd1 1951266816 1925560144 25706672 99% /content/raid03 sdd1 is an external raid. 
In case 2 I rebooted, then ran xfs_repair from xfsprogs 2.9.4. And then remounted the partition, and the partition was ok. xfs_repair /dev/sdd1 Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - agno = 16 - agno = 17 - agno = 18 - agno = 19 - agno = 20 - agno = 21 - agno = 22 - agno = 23 - agno = 24 - agno = 25 - agno = 26 - agno = 27 - agno = 28 - agno = 29 - agno = 30 - agno = 31 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - agno = 16 - agno = 17 - agno = 18 - agno = 19 - agno = 20 - agno = 21 - agno = 22 - agno = 23 - agno = 24 - agno = 25 - agno = 26 - agno = 27 - agno = 28 - agno = 29 - agno = 30 - agno = 31 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - traversing filesystem ... - traversal finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts... 
done From owner-xfs@oss.sgi.com Wed Feb 13 03:04:56 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 03:04:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_00,MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1DB4rkE017101 for ; Wed, 13 Feb 2008 03:04:56 -0800 X-ASG-Debug-ID: 1202900717-2afe03730000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from lucidpixels.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id ED2B5DFDB15 for ; Wed, 13 Feb 2008 03:05:17 -0800 (PST) Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by cuda.sgi.com with ESMTP id 9k4EBt9irePW4OVg for ; Wed, 13 Feb 2008 03:05:17 -0800 (PST) Received: by lucidpixels.com (Postfix, from userid 1001) id 45ACD1C00B5ED; Wed, 13 Feb 2008 06:04:46 -0500 (EST) Date: Wed, 13 Feb 2008 06:04:46 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: =?ISO-8859-15?Q?Christian_R=F8snes?= cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c Subject: Re: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c In-Reply-To: <1a4a774c0802130251h657a52f7lb97942e7afdf6e3f@mail.gmail.com> Message-ID: References: <1a4a774c0802130251h657a52f7lb97942e7afdf6e3f@mail.gmail.com> User-Agent: Alpine 1.00 (DEB 882 2007-12-20) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-1463747160-915850544-1202900686=:22567" X-Barracuda-Connect: lucidpixels.com[75.144.35.66] X-Barracuda-Start-Time: 1202900717 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user 
scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42131 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5794/Tue Feb 12 12:49:27 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14423 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. ---1463747160-915850544-1202900686=:22567 Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE On Wed, 13 Feb 2008, Christian Røsnes wrote: > Over the past month I've been hit with two cases of "xfs_trans_cancel > at line 1150" > The two errors occurred on different raid sets. In both cases the > error happened during > rsync from a remote server to this server, and the local partition > which reported > the error was 99% full (as reported by df -k, see below for details). > > System: Dell 2850 > Mem: 4GB RAM > OS: Debian 3 (32-bit) > Kernel: 2.6.17.7 (custom compiled) Just curious have you run memtest86 for a few passes and checked to make sure the memory is OK? Justin.
---1463747160-915850544-1202900686=:22567-- From owner-xfs@oss.sgi.com Wed Feb 13 03:44:41 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 03:44:44 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1DBiaPO019135 for ; Wed, 13 Feb 2008 03:44:40 -0800 X-ASG-Debug-ID: 1202903100-353d03100000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from rn-out-0910.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BA4CB5CF0DD for ; Wed, 13 Feb 2008 03:45:00 -0800 (PST) Received: from rn-out-0910.google.com (rn-out-0910.google.com [64.233.170.188]) by cuda.sgi.com with ESMTP id TjnM52b0LqGSg3hh for ; Wed, 13 Feb 2008 03:45:00 -0800 (PST) Received: by rn-out-0910.google.com with SMTP id a43so2852368rne.5 for ; Wed, 13 Feb 2008 03:44:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=P56bpmOltonUKHHvCscA4+MA/WIkbGzz7V8OVKawpoc=; b=Tqor0eMpMCEbt1VdjZ+AFYPZ8ybtGUGmz80s5zWaRWuViqJK/sXxdoMSnXOek8aT8LrSHjwdEyhZBmpj8dVJisfH5fJHC12gxj7Aj8ueOSYNyErboGpR1x0Rql3yKJA+cktHcQtK4inC1muprhKJx+SpPYrP46bEyU7+wigGVOM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=WxRuaFma0tgKLDOe/5FtGH8HhqDTZZdxoj3CBGLdyp9rGNGspws80RZtwXs4XbxiLN0hnQzQnFjsmokehNmiXkXDBEKISExq6B+TpOfO0KTLSXW5L3CVuhORJg9blRl1DIUVSRL2V1fxXqEqJuTCrEWp7xiVlcLg3jH3hnsqyj4= Received: by 10.150.192.7 with SMTP id 
p7mr934177ybf.90.1202903099692; Wed, 13 Feb 2008 03:44:59 -0800 (PST) Received: by 10.150.191.13 with HTTP; Wed, 13 Feb 2008 03:44:59 -0800 (PST) Message-ID: <1a4a774c0802130344n109e54f9uc3cc8a4b2edf4a45@mail.gmail.com> Date: Wed, 13 Feb 2008 12:44:59 +0100 From: "=?ISO-8859-1?Q?Christian_R=F8snes?=" To: "Justin Piszcz" X-ASG-Orig-Subj: Re: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c Subject: Re: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c Cc: xfs@oss.sgi.com In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <1a4a774c0802130251h657a52f7lb97942e7afdf6e3f@mail.gmail.com> X-Barracuda-Connect: rn-out-0910.google.com[64.233.170.188] X-Barracuda-Start-Time: 1202903100 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42132 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5798/Wed Feb 13 02:51:39 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14424 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.rosnes@gmail.com Precedence: bulk X-list: xfs On Feb 13, 2008 12:04 PM, Justin Piszcz wrote: > Just curious have you run memtest86 for a few passes and checked to make sure > the memory is OK? > I haven't run memtest yet, but I'll schedule some downtime for this server to get it done. 
Thanks Christian From owner-xfs@oss.sgi.com Wed Feb 13 13:45:40 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 13:45:42 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1DLjYPZ031539 for ; Wed, 13 Feb 2008 13:45:37 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id IAA10455; Thu, 14 Feb 2008 08:45:54 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1DLjrLF62710304; Thu, 14 Feb 2008 08:45:54 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1DLjpqh62597227; Thu, 14 Feb 2008 08:45:51 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Thu, 14 Feb 2008 08:45:51 +1100 From: David Chinner To: Christian =?iso-8859-1?Q?R=F8snes?= Cc: xfs@oss.sgi.com Subject: Re: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c Message-ID: <20080213214551.GR155407@sgi.com> References: <1a4a774c0802130251h657a52f7lb97942e7afdf6e3f@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1a4a774c0802130251h657a52f7lb97942e7afdf6e3f@mail.gmail.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 
0.91.2/5803/Wed Feb 13 12:25:54 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14426 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Wed, Feb 13, 2008 at 11:51:51AM +0100, Christian Røsnes wrote: > Over the past month I've been hit with two cases of "xfs_trans_cancel > at line 1150" > The two errors occurred on different raid sets. In both cases the > error happened during > rsync from a remote server to this server, and the local partition > which reported > the error was 99% full (as reported by df -k, see below for details). > > System: Dell 2850 > Mem: 4GB RAM > OS: Debian 3 (32-bit) > Kernel: 2.6.17.7 (custom compiled) > > I've been running this kernel since Aug 2006 without any of these > problems, until a month ago. > > I've not used any of the previous kernel in the 2.6.17 series. > > /usr/src/linux-2.6.17.7# grep 4K .config > # CONFIG_4KSTACKS is not set > > > Are there any known XFS problems with this kernel version and nearly > full partitions ? Yes. Deadlocks that weren't properly fixed until 2.6.18 (partially fixed in 2.6.17) and an accounting problem in the transaction code that leads to the shutdown you are seeing. The accounting problem is fixed by this commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=45c34141126a89da07197d5b89c04c6847f1171a which I think went into 2.6.22. Luckily, neither of these problems results in corruption. > I'm thinking about upgrading the kernel to a newer version, to see if > it fixes this problem. > Are there any known XFS problems with version 2.6.24.2 ? Yes - a problem with readdir. The fix is currently in the stable queue (i.e. for 2.6.24.3): http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=commit;h=ee864b866419890b019352412c7bc9634d96f61b So we are just waiting for Greg to release 2.6.24.3 now. Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Feb 13 17:18:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 17:18:39 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1E1IV4K012650 for ; Wed, 13 Feb 2008 17:18:33 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA20711; Thu, 14 Feb 2008 12:18:50 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 8F03858C4C11; Thu, 14 Feb 2008 12:18:50 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: PARTIAL TAKE 971186 - Fix regression due to refcache removal Message-Id: <20080214011850.8F03858C4C11@chook.melbourne.sgi.com> Date: Thu, 14 Feb 2008 12:18:50 +1100 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/5804/Wed Feb 13 15:04:59 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14427 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Fix regression due to refcache removal Date: Thu Feb 14 12:17:28 AEDT 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-test Inspected by: donaldd Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30490a fs/xfs/xfs_vnodeops.c - 1.733 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.733&r2=text&tr2=1.732&f=h - Fix regression due to refcache removal From owner-xfs@oss.sgi.com 
Wed Feb 13 18:59:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 18:59:32 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1E2xNAk016659 for ; Wed, 13 Feb 2008 18:59:25 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA23482; Thu, 14 Feb 2008 13:59:39 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1E2xcLF59360501; Thu, 14 Feb 2008 13:59:39 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1E2xa0p62624286; Thu, 14 Feb 2008 13:59:36 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Thu, 14 Feb 2008 13:59:36 +1100 From: David Chinner To: Lachlan McIlroy Cc: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: Re: PARTIAL TAKE 971186 - Fix regression due to refcache removal Message-ID: <20080214025935.GJ155259@sgi.com> References: <20080214011850.8F03858C4C11@chook.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080214011850.8F03858C4C11@chook.melbourne.sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5805/Wed Feb 13 15:29:12 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14428 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Thu, Feb 14, 2008 at 12:18:50PM +1100, Lachlan McIlroy wrote: > Fix regression due to refcache removal 
Can we please use more descriptive check-in messages? Remember, the description is what ends up in the git commit log and so needs to include a summary of the problem and the fix that was made. That way it's easy to see what changes were made when reading the git log without needing to look at diffs. IOWs, the description should read: "Fix a regression due to the refcache removal. Some code in xfs_rwunlock() was accidentally removed when removing the refcache. This results in inodes not being properly unlocked and causes system hangs. Fixed by reinstating the missing code." Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Feb 13 19:33:48 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 19:33:51 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1E3XmBq018480 for ; Wed, 13 Feb 2008 19:33:48 -0800 X-ASG-Debug-ID: 1202960048-148203da0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id E927FE10876 for ; Wed, 13 Feb 2008 19:34:09 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id Gnm6TcLkZfRBW0k5 for ; Wed, 13 Feb 2008 19:34:09 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id CD9D118E215A7 for ; Wed, 13 Feb 2008 21:34:06 -0600 (CST) Message-ID: <47B3B6AE.4030505@sandeen.net> Date: Wed, 13 Feb 2008 21:34:06 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 
To: xfs-oss X-ASG-Orig-Subj: [PATCH] fix mount option pasing to make inode cluster deletion default (again) Subject: [PATCH] fix mount option pasing to make inode cluster deletion default (again) Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1202960052 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42190 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5805/Wed Feb 13 15:29:12 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14429 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs mod xfs-linux-melb:xfs-kern:29683a / git commit 574342f4ad450b33bc85ec53210b8aa8bfff2fcf broke default options in such a way that empty inode clusters are no longer deleted by default, because if no options are given, we "goto done;" without setting the default XFSMNT_IDELETE flag. All this logic could probably be rearranged to make things clearer, but for now I think this small patch fixes it: Set IDELETE a.k.a. "noikeep" by default, but if dmapi is in use, turn it back off (i.e. "ikeep") *unless* noikeep was specifically requested. 
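The precedence described above can be sketched in isolation as a small standalone C function. This is only an illustration under stated assumptions, not the kernel code: SKETCH_DMAPI, SKETCH_IDELETE, enum ikeep_opt and resolve_idelete() are hypothetical names standing in for XFSMNT_DMAPI, XFSMNT_IDELETE and the option handling in xfs_parseargs(), and the bit values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's XFSMNT_* bits; values are arbitrary. */
#define SKETCH_DMAPI   (1u << 0)
#define SKETCH_IDELETE (1u << 1)

/* Which of the two mutually exclusive mount options, if any, was given. */
enum ikeep_opt { IKEEP_UNSET, IKEEP_IKEEP, IKEEP_NOIKEEP };

/*
 * Resolve the final flags the way the description above reads:
 *   1. inode cluster deletion ("noikeep") is on by default;
 *   2. an explicit "ikeep" or "noikeep" clears or sets it;
 *   3. with dmapi in use it is turned back off, unless "noikeep"
 *      was explicitly requested.
 */
static unsigned int resolve_idelete(bool dmapi, enum ikeep_opt opt)
{
	unsigned int flags = SKETCH_IDELETE;	/* default: delete empty clusters */
	bool noikeep = false;			/* tracks an explicit request only */

	assert(opt == IKEEP_UNSET || opt == IKEEP_IKEEP || opt == IKEEP_NOIKEEP);

	if (dmapi)
		flags |= SKETCH_DMAPI;

	if (opt == IKEEP_IKEEP) {
		flags &= ~SKETCH_IDELETE;
	} else if (opt == IKEEP_NOIKEEP) {
		noikeep = true;
		flags |= SKETCH_IDELETE;
	}

	/* dmapi prefers "ikeep" unless the admin explicitly asked otherwise */
	if ((flags & SKETCH_DMAPI) && !noikeep)
		flags &= ~SKETCH_IDELETE;

	return flags;
}
```

Setting the default before any early exit is the point of the sketch: a mount with no options at all still ends up with the delete flag set, which is the case the regression broke.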
Signed-off-by: Eric Sandeen --- Index: linux-2.6.24/fs/xfs/linux-2.6/xfs_super.c =================================================================== --- linux-2.6.24.orig/fs/xfs/linux-2.6/xfs_super.c +++ linux-2.6.24/fs/xfs/linux-2.6/xfs_super.c @@ -171,9 +171,10 @@ xfs_parseargs( char *this_char, *value, *eov; int dsunit, dswidth, vol_dsunit, vol_dswidth; int iosize; - int ikeep = 0; + int noikeep = 0; /* track _explicit_ requests */ args->flags |= XFSMNT_BARRIER; + args->flags |= XFSMNT_IDELETE; /* i.e. "noikeep" is default */ args->flags2 |= XFSMNT2_COMPAT_IOSIZE; if (!options) @@ -302,9 +303,9 @@ xfs_parseargs( } else if (!strcmp(this_char, MNTOPT_NOBARRIER)) { args->flags &= ~XFSMNT_BARRIER; } else if (!strcmp(this_char, MNTOPT_IKEEP)) { - ikeep = 1; args->flags &= ~XFSMNT_IDELETE; } else if (!strcmp(this_char, MNTOPT_NOIKEEP)) { + noikeep = 1; /* explicitly requested */ args->flags |= XFSMNT_IDELETE; } else if (!strcmp(this_char, MNTOPT_LARGEIO)) { args->flags2 &= ~XFSMNT2_COMPAT_IOSIZE; @@ -410,8 +411,8 @@ xfs_parseargs( * Note that if "ikeep" or "noikeep" mount options are * supplied, then they are honored. 
*/ - if (!(args->flags & XFSMNT_DMAPI) && !ikeep) - args->flags |= XFSMNT_IDELETE; + if ((args->flags & XFSMNT_DMAPI) && !noikeep) + args->flags &= ~XFSMNT_IDELETE; if ((args->flags & XFSMNT_NOALIGN) != XFSMNT_NOALIGN) { if (dsunit) { From owner-xfs@oss.sgi.com Wed Feb 13 20:30:45 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 20:30:54 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_00,SUBJ_ALL_CAPS autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1E4UfHK026035 for ; Wed, 13 Feb 2008 20:30:44 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA26110; Thu, 14 Feb 2008 15:31:01 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 1161) id 011A558C4C11; Thu, 14 Feb 2008 15:31:00 +1100 (EST) To: sgi.bugs.mangrove@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTIAL TAKE 977326 - Message-Id: <20080214043101.011A558C4C11@chook.melbourne.sgi.com> Date: Thu, 14 Feb 2008 15:31:00 +1100 (EST) From: bnaujok@sgi.com (Barry Naujok) X-Virus-Scanned: ClamAV 0.91.2/5805/Wed Feb 13 15:29:12 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14430 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@sgi.com Precedence: bulk X-list: xfs walk_tree.h is not a devel package file Date: Thu Feb 14 15:30:24 AEDT 2008 Workarea: chook.melbourne.sgi.com:/home/bnaujok/isms/xfs-cmds Inspected by: donaldd@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:30491a attr/include/Makefile - 1.14 - changed 
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/attr/include/Makefile.diff?r1=text&tr1=1.14&r2=text&tr2=1.13&f=h - walk_tree.h is not a installable/devel package file From owner-xfs@oss.sgi.com Wed Feb 13 20:43:46 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 13 Feb 2008 20:43:52 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_64, J_CHICKENPOX_66 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1E4ha2U027186 for ; Wed, 13 Feb 2008 20:43:40 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA26373; Thu, 14 Feb 2008 15:43:46 +1100 Message-ID: <47B3C701.6090409@sgi.com> Date: Thu, 14 Feb 2008 15:43:45 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com, a.gruenbacher@computer.org Subject: Re: [PATCH, RFC] use generic ACL code References: <20080207083222.GA14317@lst.de> In-Reply-To: <20080207083222.GA14317@lst.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5805/Wed Feb 13 15:29:12 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14431 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Hi Christoph, Been going thru some v4 acl code but a couple of comments: (1) it looks like you decided that an xfs_iget_acl and xfs_iset_acl (basing on the ext3 code of Andreas) are not worth it and you'd prefer to do the code directly. 
(2) on a quick look at the 053 failure it looks like it is more of a question of the EA not being updated and so the acl is not permanent on disk. I don't think 051 is actually unmounting and mounting again to check that it has made it to disk. Whereas for 053 it wanted to test repair and so it unmounted, repaired and then tested the ACL/EA. I haven't looked to see why yet (but will:). 053 - output mismatch (see 053.out.bad) 22c22 < $SCRATCH_MNT/test.5 [u::---,u:id2:r-x,g::---,m::rwx,o::---] --- > $SCRATCH_MNT/test.5 [u::---,g::rwx,o::---] 24c24 < $SCRATCH_MNT/test.7 [u::---,g::---,g:id2:r-x,m::-w-,o::---] --- > $SCRATCH_MNT/test.7 [u::---,g::-w-,o::---] You'll note that the 053.out.bad has the group perms matching with the mask ACE of the ACL. This is what happens when we sync up the mode bits with the ACL. So I'd say getfacl is just returning you the mode bits here instead of the ACL. Okay, looking at the inode it doesn't have an EA on it, so yeah it looks like we've somehow missed to set the EA. Perhaps we should add some unmounting to test 051 too :) --Tim Christoph Hellwig wrote: > This patch rips out the XFS ACL handling code and uses the generic > fs/posix_acl.c code instead. The ondisk format is of course left > unchanged. > > This also introduces the same ACL caching all other Linux filesystems do > by adding pointers to the acl and default acl in struct xfs_inode. > It'll probably need some benchmarking to find out whether bloating the > inode is worth it. It should be possible to use the generic code > without this caching by revamping the code a little, although no other > filesystem currently does that. > > This patch is only an RFC because it still introduces a regression in > XFSQA test 053, but I really want to get it out now to get more comments > or even someone having a look at it because I'm running a little out of > time currently. 
> > Note that this patch applies ontop of the various vnode cleanups I've > posted to the XFS list a few weeks ago that haven't been applied yet. > > > Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_acl.c > =================================================================== > --- /dev/null 1970-01-01 00:00:00.000000000 +0000 > +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_acl.c 2008-02-07 09:15:35.000000000 +0100 > @@ -0,0 +1,453 @@ > +/* > + * Copyright (C) 2007 Christoph Hellwig. > + * Released under GPL v2. > + */ > +#include "xfs.h" > +#include "xfs_acl.h" > +#include "xfs_attr.h" > +#include "xfs_bmap_btree.h" /* required by xfs_inode.h */ > +#include "xfs_inode.h" > +#include "xfs_vnodeops.h" > + > +#include > + > + > +#define XFS_ACL_NOT_CACHED ((void *)-1) > + > +/* > + * Convert from extended attribute to in-memory representation. > + */ > +static struct posix_acl *xfs_acl_from_disk(struct xfs_acl *aclp) > +{ > + struct posix_acl_entry *acl_e; > + struct posix_acl *acl; > + struct xfs_acl_entry *ace; > + int count, i; > + > + count = be32_to_cpu(aclp->acl_cnt); > + > + acl = posix_acl_alloc(count, GFP_KERNEL); > + if (!acl) > + return ERR_PTR(-ENOMEM); > + > + for (i = 0; i < count; i++) { > + acl_e = &acl->a_entries[i]; > + ace = &aclp->acl_entry[i]; > + > + /* > + * XXX(hch): the tag is 32 bits on disk and 16 bits in core. > + * Any special handling required?? > + */ > + acl_e->e_tag = be32_to_cpu(ace->ae_tag); > + acl_e->e_perm = be16_to_cpu(ace->ae_perm); > + > + switch(acl_e->e_tag) { > + case ACL_USER: > + case ACL_GROUP: > + acl_e->e_id = be32_to_cpu(ace->ae_id); > + break; > + case ACL_USER_OBJ: > + case ACL_GROUP_OBJ: > + case ACL_MASK: > + case ACL_OTHER: > + acl_e->e_id = ACL_UNDEFINED_ID; > + break; > + default: > + goto fail; > + } > + } > + return acl; > + > +fail: > + posix_acl_release(acl); > + return ERR_PTR(-EINVAL); > +} > + > +/* > + * Convert from in-memory to extended attribute representation. 
> + */ > +static void xfs_acl_to_disk(struct xfs_acl *aclp, const struct posix_acl *acl) > +{ > + const struct posix_acl_entry *acl_e; > + struct xfs_acl_entry *ace; > + int i; > + > + for (i = 0; i < acl->a_count; i++) { > + ace = &aclp->acl_entry[i]; > + acl_e = &acl->a_entries[i]; > + > + ace->ae_tag = cpu_to_be32(acl_e->e_tag); > + ace->ae_id = cpu_to_be32(acl_e->e_id); > + ace->ae_perm = cpu_to_be16(acl_e->e_perm); > + } > +} > + > +struct posix_acl *xfs_get_acl(struct inode *inode, int type) > +{ > + struct xfs_inode *ip = XFS_I(inode); > + struct posix_acl *acl = NULL, **p_acl; > + struct xfs_acl *xfs_acl; > + int len = sizeof(struct xfs_acl); > + char *ea_name; > + int error; > + > + switch (type) { > + case ACL_TYPE_ACCESS: > + ea_name = SGI_ACL_FILE; > + p_acl = &ip->i_acl; > + break; > + case ACL_TYPE_DEFAULT: > + ea_name = SGI_ACL_DEFAULT; > + p_acl = &ip->i_default_acl; > + break; > + default: > + return ERR_PTR(-EINVAL); > + } > + > + if (*p_acl != XFS_ACL_NOT_CACHED) > + return posix_acl_dup(*p_acl); > + > + xfs_acl = kzalloc(sizeof(struct xfs_acl), GFP_KERNEL); > + if (!xfs_acl) > + return ERR_PTR(-ENOMEM); > + > + error = -xfs_attr_get(ip, ea_name, (char *)xfs_acl, > + &len, ATTR_ROOT, sys_cred); > + if (!error) { > + acl = xfs_acl_from_disk(xfs_acl); > + if (!IS_ERR(acl)) > + *p_acl = posix_acl_dup(acl); > + } else { > + *p_acl = NULL; > + } > + > + kfree(xfs_acl); > + return acl; > +} > + > +static int xfs_set_acl(struct inode *inode, int type, struct posix_acl *acl) > +{ > + struct xfs_inode *ip = XFS_I(inode); > + struct posix_acl **p_acl; > + char *ea_name; > + int error; > + > + if (S_ISLNK(inode->i_mode)) > + return -EOPNOTSUPP; > + > + switch (type) { > + case ACL_TYPE_ACCESS: > + ea_name = SGI_ACL_FILE; > + p_acl = &ip->i_acl; > + break; > + case ACL_TYPE_DEFAULT: > + ea_name = SGI_ACL_DEFAULT; > + p_acl = &ip->i_default_acl; > + if (!S_ISDIR(inode->i_mode)) > + return acl ? 
-EACCES : 0; > + break; > + default: > + return -EINVAL; > + } > + > + if (acl) { > + struct xfs_acl *xfs_acl; > + int len; > + > + xfs_acl = kzalloc(sizeof(struct xfs_acl), GFP_KERNEL); > + if (!xfs_acl) > + return -ENOMEM; > + > + xfs_acl_to_disk(xfs_acl, acl); > + len = sizeof(struct xfs_acl) - > + (sizeof(struct xfs_acl_entry) * > + (XFS_ACL_MAX_ENTRIES - acl->a_count)); > + > + error = -xfs_attr_set(ip, ea_name, (char *)xfs_acl, > + len, ATTR_ROOT); > + > + kfree(xfs_acl); > + } else { > + error = -xfs_attr_remove(ip, ea_name, ATTR_ROOT); > + /* > + * If the attribute didn't exist to start with that's fine. > + */ > + if (error == -ENOATTR) > + error = 0; > + } > + > + if (!error) { > + if (*p_acl && *p_acl != XFS_ACL_NOT_CACHED) > + posix_acl_release(*p_acl); > + *p_acl = posix_acl_dup(acl); > + } > + return error; > +} > + > +static int xfs_check_acl(struct inode *inode, int mask) > +{ > + struct xfs_inode *ip = XFS_I(inode); > + > + xfs_itrace_entry(ip); > + > + if (!XFS_IFORK_Q(ip)) > + return -EAGAIN; > + > + if (ip->i_acl == XFS_ACL_NOT_CACHED) { > + struct posix_acl *acl = xfs_get_acl(inode, ACL_TYPE_ACCESS); > + if (IS_ERR(acl)) > + return PTR_ERR(acl); > + posix_acl_release(acl); > + } > + > + if (ip->i_acl) > + return posix_acl_permission(inode, ip->i_acl, mask); > + return -EAGAIN; > +} > + > +int xfs_vn_permission(struct inode *inode, int mask, struct nameidata *nd) > +{ > + return generic_permission(inode, mask, xfs_check_acl); > +} > + > +/* > + * Extended attribute handlers > + */ > +static int xfs_xattr_get_acl(struct inode *inode, int type, > + void *buffer, size_t size) > +{ > + struct posix_acl *acl; > + int error; > + > + acl = xfs_get_acl(inode, type); > + if (IS_ERR(acl)) > + return PTR_ERR(acl); > + if (acl == NULL) > + return -ENODATA; > + error = posix_acl_to_xattr(acl, buffer, size); > + posix_acl_release(acl); > + > + return error; > +} > + > +/* > + * Helper to propagate i_mode the xfs_inode. 
> + */ > +static int xfs_set_mode(struct inode *inode, mode_t mode) > +{ > + int error = 0; > + > + if (mode != inode->i_mode) { > + struct bhv_vattr va = { > + .va_mask = XFS_AT_MODE, > + .va_mode = mode, > + }; > + > + va.va_mask = XFS_AT_MODE; > + va.va_mode = mode; > + > + error = -xfs_setattr(XFS_I(inode), &va, 0, sys_cred); > + inode->i_mode = mode; > + } > + > + return error; > +} > + > +static int xfs_xattr_set_acl(struct inode *inode, int type, > + const void *value, size_t size) > +{ > + struct posix_acl *acl; > + int error; > + > + if ((current->fsuid != inode->i_uid) && !capable(CAP_FOWNER)) > + return -EPERM; > + > + if (value) { > + acl = posix_acl_from_xattr(value, size); > + if (IS_ERR(acl)) > + return PTR_ERR(acl); > + else if (acl) { > + error = posix_acl_valid(acl); > + if (error) > + goto release_and_out; > + if (acl->a_count > XFS_ACL_MAX_ENTRIES) { > + error = -EINVAL; > + goto release_and_out; > + } > + > + if (type == ACL_TYPE_ACCESS) { > + mode_t mode = inode->i_mode; > + error = posix_acl_equiv_mode(acl, &mode); > + if (error < 0) > + return error; > + if (error == 0) { > + posix_acl_release(acl); > + acl = NULL; > + } > + error = xfs_set_mode(inode, mode); > + if (error) > + goto release_and_out; > + } > + } > + } else > + acl = NULL; > + > + error = xfs_set_acl(inode, type, acl); > +release_and_out: > + posix_acl_release(acl); > + return error; > +} > + > +static int xfs_acl_exists(struct inode *inode, char *name) > +{ > + int len = sizeof(struct xfs_acl); > + > + return xfs_attr_get(XFS_I(inode), name, NULL, &len, > + ATTR_ROOT|ATTR_KERNOVAL, sys_cred); > +} > + > +static int posix_acl_access_get(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + return xfs_xattr_get_acl(inode, ACL_TYPE_ACCESS, data, size); > +} > + > +static int posix_acl_access_set(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + return xfs_xattr_set_acl(inode, ACL_TYPE_ACCESS, data, size); > +} > + > 
+static int posix_acl_access_remove(struct inode *inode, char *name, int xflags) > +{ > + return xfs_xattr_set_acl(inode, ACL_TYPE_ACCESS, NULL, 0); > +} > + > +static int posix_acl_access_exists(struct inode *inode) > +{ > + return xfs_acl_exists(inode, SGI_ACL_FILE); > +} > + > +static int posix_acl_default_get(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + return xfs_xattr_get_acl(inode, ACL_TYPE_DEFAULT, data, size); > +} > + > +static int posix_acl_default_set(struct inode *inode, char *name, void *data, > + size_t size, int xflags) > +{ > + if (!S_ISDIR(inode->i_mode)) > + return data ? -EACCES : 0; > + return xfs_xattr_set_acl(inode, ACL_TYPE_DEFAULT, data, size); > +} > + > +static int posix_acl_default_remove(struct inode *inode, char *name, int xflags) > +{ > + return xfs_xattr_set_acl(inode, ACL_TYPE_DEFAULT, NULL, 0); > +} > + > +int posix_acl_default_exists(struct inode *inode) > +{ > + if (!S_ISDIR(inode->i_mode)) > + return 0; > + return xfs_acl_exists(inode, SGI_ACL_DEFAULT); > +} > + > +struct attrnames posix_acl_access = { > + .attr_name = "posix_acl_access", > + .attr_namelen = sizeof("posix_acl_access") - 1, > + .attr_get = posix_acl_access_get, > + .attr_set = posix_acl_access_set, > + .attr_remove = posix_acl_access_remove, > + .attr_exists = posix_acl_access_exists, > +}; > + > +struct attrnames posix_acl_default = { > + .attr_name = "posix_acl_default", > + .attr_namelen = sizeof("posix_acl_default") - 1, > + .attr_get = posix_acl_default_get, > + .attr_set = posix_acl_default_set, > + .attr_remove = posix_acl_default_remove, > + .attr_exists = posix_acl_default_exists, > +}; > + > +/* > + * Unlike the other functions in this file this returns positive errors. 
> + */ > +int xfs_inherit_acl(struct inode *inode, struct posix_acl *default_acl) > +{ > + struct xfs_inode *ip = XFS_I(inode); > + struct posix_acl *clone; > + mode_t mode; > + int error = 0; > + > + if (S_ISDIR(inode->i_mode)) { > + error = xfs_set_acl(inode, ACL_TYPE_DEFAULT, default_acl); > + if (error) > + return -error; > + } > + > + clone = posix_acl_clone(default_acl, GFP_KERNEL); > + if (!clone) > + return ENOMEM; > + > + mode = inode->i_mode; > + error = posix_acl_create_masq(clone, &mode); > + if (error < 0) > + goto out_release_clone; > + > + error = xfs_set_mode(inode, mode); > + if (error > 0) > + error = xfs_set_acl(inode, ACL_TYPE_ACCESS, clone); > + xfs_iflags_set(ip, XFS_IMODIFIED); > + > + out_release_clone: > + posix_acl_release(clone); > + return -error; > +} > + > +int xfs_acl_chmod(struct inode *inode) > +{ > + struct posix_acl *acl, *clone; > + int error; > + > + if (S_ISLNK(inode->i_mode)) > + return -EOPNOTSUPP; > + > + acl = xfs_get_acl(inode, ACL_TYPE_ACCESS); > + if (IS_ERR(acl) || !acl) > + return PTR_ERR(acl); > + > + clone = posix_acl_clone(acl, GFP_KERNEL); > + posix_acl_release(acl); > + if (!clone) > + return -ENOMEM; > + > + error = posix_acl_chmod_masq(clone, inode->i_mode); > + if (!error) > + error = xfs_set_acl(inode, ACL_TYPE_ACCESS, clone); > + > + posix_acl_release(clone); > + return error; > +} > + > +void xfs_inode_init_acls(struct xfs_inode *ip) > +{ > + ip->i_acl = XFS_ACL_NOT_CACHED; > + ip->i_default_acl = XFS_ACL_NOT_CACHED; > +} > + > +static void xfs_clear_acl(struct posix_acl **aclp) > +{ > + if (*aclp != XFS_ACL_NOT_CACHED) { > + posix_acl_release(*aclp); > + *aclp = XFS_ACL_NOT_CACHED; > + } > +} > + > +void xfs_inode_clear_acls(struct xfs_inode *ip) > +{ > + xfs_clear_acl(&ip->i_acl); > + xfs_clear_acl(&ip->i_default_acl); > +} > Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_iops.c 2008-02-05 
08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c 2008-02-07 09:17:11.000000000 +0100 > @@ -51,6 +51,7 @@ > #include > #include > #include > +#include > #include > #include > > @@ -272,8 +273,7 @@ xfs_vn_mknod( > { > struct inode *inode; > struct xfs_inode *ip = NULL; > - xfs_acl_t *default_acl = NULL; > - attrexists_t test_default_acl = _ACL_DEFAULT_EXISTS; > + struct posix_acl *default_acl = NULL; > int error; > > /* > @@ -283,18 +283,14 @@ xfs_vn_mknod( > if (unlikely(!sysv_valid_dev(rdev) || MAJOR(rdev) & ~0x1ff)) > return -EINVAL; > > - if (test_default_acl && test_default_acl(dir)) { > - if (!_ACL_ALLOC(default_acl)) { > - return -ENOMEM; > - } > - if (!_ACL_GET_DEFAULT(dir, default_acl)) { > - _ACL_FREE(default_acl); > - default_acl = NULL; > - } > - } > + if (IS_POSIXACL(dir)) { > + default_acl = xfs_get_acl(dir, ACL_TYPE_DEFAULT); > + if (IS_ERR(default_acl)) > + return -PTR_ERR(default_acl); > > - if (IS_POSIXACL(dir) && !default_acl) > - mode &= ~current->fs->umask; > + if (!default_acl) > + mode &= ~current->fs->umask; > + } > > switch (mode & S_IFMT) { > case S_IFCHR: > @@ -323,11 +319,11 @@ xfs_vn_mknod( > goto out_cleanup_inode; > > if (default_acl) { > - error = _ACL_INHERIT(inode, mode, default_acl); > + error = xfs_inherit_acl(inode, default_acl); > if (unlikely(error)) > goto out_cleanup_inode; > xfs_iflags_set(ip, XFS_IMODIFIED); > - _ACL_FREE(default_acl); > + posix_acl_release(default_acl); > } > > > @@ -340,8 +336,7 @@ xfs_vn_mknod( > out_cleanup_inode: > xfs_cleanup_inode(dir, inode, dentry, mode); > out_free_acl: > - if (default_acl) > - _ACL_FREE(default_acl); > + posix_acl_release(default_acl); > return -error; > } > > @@ -545,38 +540,6 @@ xfs_vn_put_link( > kfree(s); > } > > -#ifdef CONFIG_XFS_POSIX_ACL > -STATIC int > -xfs_check_acl( > - struct inode *inode, > - int mask) > -{ > - struct xfs_inode *ip = XFS_I(inode); > - int error; > - > - xfs_itrace_entry(ip); > - > - if (XFS_IFORK_Q(ip)) { > - error = 
xfs_acl_iaccess(ip, mask, NULL); > - if (error != -1) > - return -error; > - } > - > - return -EAGAIN; > -} > - > -STATIC int > -xfs_vn_permission( > - struct inode *inode, > - int mask, > - struct nameidata *nd) > -{ > - return generic_permission(inode, mask, xfs_check_acl); > -} > -#else > -#define xfs_vn_permission NULL > -#endif > - > STATIC int > xfs_vn_getattr( > struct vfsmount *mnt, > @@ -689,6 +652,9 @@ xfs_vn_setattr( > error = xfs_setattr(XFS_I(inode), &vattr, flags, NULL); > if (likely(!error)) > vn_revalidate(vn_from_inode(inode)); > + > + if (!error && (attr->ia_valid & ATTR_MODE)) > + error = -xfs_acl_chmod(inode); > return -error; > } > > Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.h > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_iops.h 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.h 2008-02-07 09:15:35.000000000 +0100 > @@ -26,6 +26,12 @@ extern const struct file_operations xfs_ > extern const struct file_operations xfs_dir_file_operations; > extern const struct file_operations xfs_invis_file_operations; > > +#ifdef CONFIG_XFS_POSIX_ACL > +int xfs_vn_permission(struct inode *inode, int mask, struct nameidata *nd); > +#else > +#define xfs_vn_permission NULL > +#endif > + > > struct xfs_inode; > extern void xfs_ichgtime(struct xfs_inode *, int); > Index: linux-2.6-xfs/fs/xfs/xfs_acl.h > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_acl.h 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_acl.h 2008-02-07 09:15:35.000000000 +0100 > @@ -18,27 +18,25 @@ > #ifndef __XFS_ACL_H__ > #define __XFS_ACL_H__ > > +struct inode; > +struct posix_acl; > +struct xfs_inode; > + > + > /* > * Access Control Lists > */ > -typedef __uint16_t xfs_acl_perm_t; > -typedef __int32_t xfs_acl_type_t; > -typedef __int32_t xfs_acl_tag_t; > -typedef __int32_t xfs_acl_id_t; > - > #define 
XFS_ACL_MAX_ENTRIES 25 > #define XFS_ACL_NOT_PRESENT (-1) > > -typedef struct xfs_acl_entry { > - xfs_acl_tag_t ae_tag; > - xfs_acl_id_t ae_id; > - xfs_acl_perm_t ae_perm; > -} xfs_acl_entry_t; > - > -typedef struct xfs_acl { > - __int32_t acl_cnt; > - xfs_acl_entry_t acl_entry[XFS_ACL_MAX_ENTRIES]; > -} xfs_acl_t; > +struct xfs_acl { > + __be32 acl_cnt; > + struct xfs_acl_entry { > + __be32 ae_tag; > + __be32 ae_id; > + __be16 ae_perm; > + } acl_entry[XFS_ACL_MAX_ENTRIES]; > +}; > > /* On-disk XFS extended attribute names */ > #define SGI_ACL_FILE "SGI_ACL_FILE" > @@ -49,51 +47,31 @@ typedef struct xfs_acl { > > #ifdef CONFIG_XFS_POSIX_ACL > > -struct vattr; > -struct xfs_inode; > - > -extern struct kmem_zone *xfs_acl_zone; > -#define xfs_acl_zone_init(zone, name) \ > - (zone) = kmem_zone_init(sizeof(xfs_acl_t), (name)) > -#define xfs_acl_zone_destroy(zone) kmem_zone_destroy(zone) > - > -extern int xfs_acl_inherit(bhv_vnode_t *, mode_t mode, xfs_acl_t *); > -extern int xfs_acl_iaccess(struct xfs_inode *, mode_t, cred_t *); > -extern int xfs_acl_vtoacl(bhv_vnode_t *, xfs_acl_t *, xfs_acl_t *); > -extern int xfs_acl_vhasacl_access(bhv_vnode_t *); > -extern int xfs_acl_vhasacl_default(bhv_vnode_t *); > -extern int xfs_acl_vset(bhv_vnode_t *, void *, size_t, int); > -extern int xfs_acl_vget(bhv_vnode_t *, void *, size_t, int); > -extern int xfs_acl_vremove(bhv_vnode_t *, int); > - > -#define _ACL_TYPE_ACCESS 1 > -#define _ACL_TYPE_DEFAULT 2 > -#define _ACL_PERM_INVALID(perm) ((perm) & ~(ACL_READ|ACL_WRITE|ACL_EXECUTE)) > - > -#define _ACL_INHERIT(c,m,d) (xfs_acl_inherit(c,m,d)) > -#define _ACL_GET_ACCESS(pv,pa) (xfs_acl_vtoacl(pv,pa,NULL) == 0) > -#define _ACL_GET_DEFAULT(pv,pd) (xfs_acl_vtoacl(pv,NULL,pd) == 0) > -#define _ACL_ACCESS_EXISTS xfs_acl_vhasacl_access > -#define _ACL_DEFAULT_EXISTS xfs_acl_vhasacl_default > - > -#define _ACL_ALLOC(a) ((a) = kmem_zone_alloc(xfs_acl_zone, KM_SLEEP)) > -#define _ACL_FREE(a) ((a)? 
kmem_zone_free(xfs_acl_zone, (a)):(void)0) > +struct posix_acl *xfs_get_acl(struct inode *inode, int type); > +int xfs_inherit_acl(struct inode *inode, struct posix_acl *default_acl); > +int xfs_acl_chmod(struct inode *inode); > +void xfs_inode_init_acls(struct xfs_inode *ip); > +void xfs_inode_clear_acls(struct xfs_inode *ip); > > #else > -#define xfs_acl_zone_init(zone,name) > -#define xfs_acl_zone_destroy(zone) > -#define xfs_acl_vset(v,p,sz,t) (-EOPNOTSUPP) > -#define xfs_acl_vget(v,p,sz,t) (-EOPNOTSUPP) > -#define xfs_acl_vremove(v,t) (-EOPNOTSUPP) > -#define xfs_acl_vhasacl_access(v) (0) > -#define xfs_acl_vhasacl_default(v) (0) > -#define _ACL_ALLOC(a) (1) /* successfully allocate nothing */ > -#define _ACL_FREE(a) ((void)0) > -#define _ACL_INHERIT(c,m,d) (0) > -#define _ACL_GET_ACCESS(pv,pa) (0) > -#define _ACL_GET_DEFAULT(pv,pd) (0) > -#define _ACL_ACCESS_EXISTS (NULL) > -#define _ACL_DEFAULT_EXISTS (NULL) > -#endif > > +static inline struct posix_acl *xfs_get_acl(struct inode *inode, int type) > +{ > + BUG(); > +} > +static inline int xfs_inherit_acl(struct inode *inode, > + struct posix_acl *default_acl) > +{ > + BUG(); > +} > + > +static inline void xfs_inode_init_acls(struct xfs_inode *ip) > +{ > +} > + > +static inline void xfs_inode_clear_acls(struct xfs_inode *ip) > +{ > +} > + > +#endif /* CONFIG_XFS_POSIX_ACL */ > #endif /* __XFS_ACL_H__ */ > Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-07 09:15:35.000000000 +0100 > @@ -52,7 +52,6 @@ > #include "xfs_dir2_block.h" > #include "xfs_dir2_node.h" > #include "xfs_dir2_trace.h" > -#include "xfs_acl.h" > #include "xfs_attr.h" > #include "xfs_attr_leaf.h" > #include "xfs_inode_item.h" > @@ -183,10 +182,6 @@ EXPORT_SYMBOL(uuid_table_remove); > EXPORT_SYMBOL(vn_hold); > 
EXPORT_SYMBOL(vn_revalidate); > > -#if defined(CONFIG_XFS_POSIX_ACL) > -EXPORT_SYMBOL(xfs_acl_vtoacl); > -EXPORT_SYMBOL(xfs_acl_inherit); > -#endif > EXPORT_SYMBOL(xfs_alloc_buftarg); > EXPORT_SYMBOL(xfs_flush_buftarg); > EXPORT_SYMBOL(xfs_free_buftarg); > Index: linux-2.6-xfs/fs/xfs/xfs_attr.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_attr.c 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_attr.c 2008-02-07 09:15:35.000000000 +0100 > @@ -58,8 +58,6 @@ > */ > > #define ATTR_SYSCOUNT 2 > -static struct attrnames posix_acl_access; > -static struct attrnames posix_acl_default; > static struct attrnames *attr_system_names[ATTR_SYSCOUNT]; > > /*======================================================================== > @@ -2427,80 +2425,6 @@ xfs_attr_trace_enter(int type, char *whe > * System (pseudo) namespace attribute interface routines. > *========================================================================*/ > > -STATIC int > -posix_acl_access_set( > - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) > -{ > - return xfs_acl_vset(vp, data, size, _ACL_TYPE_ACCESS); > -} > - > -STATIC int > -posix_acl_access_remove( > - bhv_vnode_t *vp, char *name, int xflags) > -{ > - return xfs_acl_vremove(vp, _ACL_TYPE_ACCESS); > -} > - > -STATIC int > -posix_acl_access_get( > - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) > -{ > - return xfs_acl_vget(vp, data, size, _ACL_TYPE_ACCESS); > -} > - > -STATIC int > -posix_acl_access_exists( > - bhv_vnode_t *vp) > -{ > - return xfs_acl_vhasacl_access(vp); > -} > - > -STATIC int > -posix_acl_default_set( > - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) > -{ > - return xfs_acl_vset(vp, data, size, _ACL_TYPE_DEFAULT); > -} > - > -STATIC int > -posix_acl_default_get( > - bhv_vnode_t *vp, char *name, void *data, size_t size, int xflags) > -{ > - return xfs_acl_vget(vp, data, size, _ACL_TYPE_DEFAULT); 
> -} > - > -STATIC int > -posix_acl_default_remove( > - bhv_vnode_t *vp, char *name, int xflags) > -{ > - return xfs_acl_vremove(vp, _ACL_TYPE_DEFAULT); > -} > - > -STATIC int > -posix_acl_default_exists( > - bhv_vnode_t *vp) > -{ > - return xfs_acl_vhasacl_default(vp); > -} > - > -static struct attrnames posix_acl_access = { > - .attr_name = "posix_acl_access", > - .attr_namelen = sizeof("posix_acl_access") - 1, > - .attr_get = posix_acl_access_get, > - .attr_set = posix_acl_access_set, > - .attr_remove = posix_acl_access_remove, > - .attr_exists = posix_acl_access_exists, > -}; > - > -static struct attrnames posix_acl_default = { > - .attr_name = "posix_acl_default", > - .attr_namelen = sizeof("posix_acl_default") - 1, > - .attr_get = posix_acl_default_get, > - .attr_set = posix_acl_default_set, > - .attr_remove = posix_acl_default_remove, > - .attr_exists = posix_acl_default_exists, > -}; > - > static struct attrnames *attr_system_names[] = > { &posix_acl_access, &posix_acl_default }; > > Index: linux-2.6-xfs/fs/xfs/xfs_attr.h > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_attr.h 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_attr.h 2008-02-07 09:15:35.000000000 +0100 > @@ -61,6 +61,8 @@ extern struct attrnames attr_secure; > extern struct attrnames attr_system; > extern struct attrnames attr_trusted; > extern struct attrnames *attr_namespaces[ATTR_NAMECOUNT]; > +extern struct attrnames posix_acl_access; > +extern struct attrnames posix_acl_default; > > extern attrnames_t *attr_lookup_namespace(char *, attrnames_t **, int); > extern int attr_generic_list(bhv_vnode_t *, void *, size_t, int, ssize_t *); > Index: linux-2.6-xfs/fs/xfs/xfs_vfsops.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_vfsops.c 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_vfsops.c 2008-02-07 09:15:35.000000000 +0100 > @@ -78,7 +78,6 
@@ xfs_init(void) > kmem_zone_init(sizeof(xfs_da_state_t), "xfs_da_state"); > xfs_dabuf_zone = kmem_zone_init(sizeof(xfs_dabuf_t), "xfs_dabuf"); > xfs_ifork_zone = kmem_zone_init(sizeof(xfs_ifork_t), "xfs_ifork"); > - xfs_acl_zone_init(xfs_acl_zone, "xfs_acl"); > xfs_mru_cache_init(); > xfs_filestream_init(); > > @@ -160,7 +159,6 @@ xfs_cleanup(void) > xfs_refcache_destroy(); > xfs_filestream_uninit(); > xfs_mru_cache_uninit(); > - xfs_acl_zone_destroy(xfs_acl_zone); > > #ifdef XFS_DIR2_TRACE > ktrace_free(xfs_dir2_trace_buf); > Index: linux-2.6-xfs/fs/xfs/xfs_acl.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_acl.c 2008-02-05 08:43:31.000000000 +0100 > +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 > @@ -1,903 +0,0 @@ > -/* > - * Copyright (c) 2001-2002,2005 Silicon Graphics, Inc. > - * All Rights Reserved. > - * > - * This program is free software; you can redistribute it and/or > - * modify it under the terms of the GNU General Public License as > - * published by the Free Software Foundation. > - * > - * This program is distributed in the hope that it would be useful, > - * but WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > - * GNU General Public License for more details. 
> - * > - * You should have received a copy of the GNU General Public License > - * along with this program; if not, write the Free Software Foundation, > - * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > - */ > -#include "xfs.h" > -#include "xfs_fs.h" > -#include "xfs_types.h" > -#include "xfs_bit.h" > -#include "xfs_inum.h" > -#include "xfs_ag.h" > -#include "xfs_dir2.h" > -#include "xfs_bmap_btree.h" > -#include "xfs_alloc_btree.h" > -#include "xfs_ialloc_btree.h" > -#include "xfs_dir2_sf.h" > -#include "xfs_attr_sf.h" > -#include "xfs_dinode.h" > -#include "xfs_inode.h" > -#include "xfs_btree.h" > -#include "xfs_acl.h" > -#include "xfs_attr.h" > -#include "xfs_vnodeops.h" > - > -#include > -#include > - > -STATIC int xfs_acl_setmode(bhv_vnode_t *, xfs_acl_t *, int *); > -STATIC void xfs_acl_filter_mode(mode_t, xfs_acl_t *); > -STATIC void xfs_acl_get_endian(xfs_acl_t *); > -STATIC int xfs_acl_access(uid_t, gid_t, xfs_acl_t *, mode_t, cred_t *); > -STATIC int xfs_acl_invalid(xfs_acl_t *); > -STATIC void xfs_acl_sync_mode(mode_t, xfs_acl_t *); > -STATIC void xfs_acl_get_attr(bhv_vnode_t *, xfs_acl_t *, int, int, int *); > -STATIC void xfs_acl_set_attr(bhv_vnode_t *, xfs_acl_t *, int, int *); > -STATIC int xfs_acl_allow_set(bhv_vnode_t *, int); > - > -kmem_zone_t *xfs_acl_zone; > - > - > -/* > - * Test for existence of access ACL attribute as efficiently as possible. > - */ > -int > -xfs_acl_vhasacl_access( > - bhv_vnode_t *vp) > -{ > - int error; > - > - xfs_acl_get_attr(vp, NULL, _ACL_TYPE_ACCESS, ATTR_KERNOVAL, &error); > - return (error == 0); > -} > - > -/* > - * Test for existence of default ACL attribute as efficiently as possible. > - */ > -int > -xfs_acl_vhasacl_default( > - bhv_vnode_t *vp) > -{ > - int error; > - > - if (!VN_ISDIR(vp)) > - return 0; > - xfs_acl_get_attr(vp, NULL, _ACL_TYPE_DEFAULT, ATTR_KERNOVAL, &error); > - return (error == 0); > -} > - > -/* > - * Convert from extended attribute representation to in-memory for XFS. 
> - */ > -STATIC int > -posix_acl_xattr_to_xfs( > - posix_acl_xattr_header *src, > - size_t size, > - xfs_acl_t *dest) > -{ > - posix_acl_xattr_entry *src_entry; > - xfs_acl_entry_t *dest_entry; > - int n; > - > - if (!src || !dest) > - return EINVAL; > - > - if (size < sizeof(posix_acl_xattr_header)) > - return EINVAL; > - > - if (src->a_version != cpu_to_le32(POSIX_ACL_XATTR_VERSION)) > - return EOPNOTSUPP; > - > - memset(dest, 0, sizeof(xfs_acl_t)); > - dest->acl_cnt = posix_acl_xattr_count(size); > - if (dest->acl_cnt < 0 || dest->acl_cnt > XFS_ACL_MAX_ENTRIES) > - return EINVAL; > - > - /* > - * acl_set_file(3) may request that we set default ACLs with > - * zero length -- defend (gracefully) against that here. > - */ > - if (!dest->acl_cnt) > - return 0; > - > - src_entry = (posix_acl_xattr_entry *)((char *)src + sizeof(*src)); > - dest_entry = &dest->acl_entry[0]; > - > - for (n = 0; n < dest->acl_cnt; n++, src_entry++, dest_entry++) { > - dest_entry->ae_perm = le16_to_cpu(src_entry->e_perm); > - if (_ACL_PERM_INVALID(dest_entry->ae_perm)) > - return EINVAL; > - dest_entry->ae_tag = le16_to_cpu(src_entry->e_tag); > - switch(dest_entry->ae_tag) { > - case ACL_USER: > - case ACL_GROUP: > - dest_entry->ae_id = le32_to_cpu(src_entry->e_id); > - break; > - case ACL_USER_OBJ: > - case ACL_GROUP_OBJ: > - case ACL_MASK: > - case ACL_OTHER: > - dest_entry->ae_id = ACL_UNDEFINED_ID; > - break; > - default: > - return EINVAL; > - } > - } > - if (xfs_acl_invalid(dest)) > - return EINVAL; > - > - return 0; > -} > - > -/* > - * Comparison function called from xfs_sort(). > - * Primary key is ae_tag, secondary key is ae_id. 
> - */ > -STATIC int > -xfs_acl_entry_compare( > - const void *va, > - const void *vb) > -{ > - xfs_acl_entry_t *a = (xfs_acl_entry_t *)va, > - *b = (xfs_acl_entry_t *)vb; > - > - if (a->ae_tag == b->ae_tag) > - return (a->ae_id - b->ae_id); > - return (a->ae_tag - b->ae_tag); > -} > - > -/* > - * Convert from in-memory XFS to extended attribute representation. > - */ > -STATIC int > -posix_acl_xfs_to_xattr( > - xfs_acl_t *src, > - posix_acl_xattr_header *dest, > - size_t size) > -{ > - int n; > - size_t new_size = posix_acl_xattr_size(src->acl_cnt); > - posix_acl_xattr_entry *dest_entry; > - xfs_acl_entry_t *src_entry; > - > - if (size < new_size) > - return -ERANGE; > - > - /* Need to sort src XFS ACL by */ > - xfs_sort(src->acl_entry, src->acl_cnt, sizeof(src->acl_entry[0]), > - xfs_acl_entry_compare); > - > - dest->a_version = cpu_to_le32(POSIX_ACL_XATTR_VERSION); > - dest_entry = &dest->a_entries[0]; > - src_entry = &src->acl_entry[0]; > - for (n = 0; n < src->acl_cnt; n++, dest_entry++, src_entry++) { > - dest_entry->e_perm = cpu_to_le16(src_entry->ae_perm); > - if (_ACL_PERM_INVALID(src_entry->ae_perm)) > - return -EINVAL; > - dest_entry->e_tag = cpu_to_le16(src_entry->ae_tag); > - switch (src_entry->ae_tag) { > - case ACL_USER: > - case ACL_GROUP: > - dest_entry->e_id = cpu_to_le32(src_entry->ae_id); > - break; > - case ACL_USER_OBJ: > - case ACL_GROUP_OBJ: > - case ACL_MASK: > - case ACL_OTHER: > - dest_entry->e_id = cpu_to_le32(ACL_UNDEFINED_ID); > - break; > - default: > - return -EINVAL; > - } > - } > - return new_size; > -} > - > -int > -xfs_acl_vget( > - bhv_vnode_t *vp, > - void *acl, > - size_t size, > - int kind) > -{ > - int error; > - xfs_acl_t *xfs_acl = NULL; > - posix_acl_xattr_header *ext_acl = acl; > - int flags = 0; > - > - VN_HOLD(vp); > - if(size) { > - if (!(_ACL_ALLOC(xfs_acl))) { > - error = ENOMEM; > - goto out; > - } > - memset(xfs_acl, 0, sizeof(xfs_acl_t)); > - } else > - flags = ATTR_KERNOVAL; > - > - xfs_acl_get_attr(vp, xfs_acl, 
kind, flags, &error); > - if (error) > - goto out; > - > - if (!size) { > - error = -posix_acl_xattr_size(XFS_ACL_MAX_ENTRIES); > - } else { > - if (xfs_acl_invalid(xfs_acl)) { > - error = EINVAL; > - goto out; > - } > - if (kind == _ACL_TYPE_ACCESS) { > - bhv_vattr_t va; > - > - va.va_mask = XFS_AT_MODE; > - error = xfs_getattr(xfs_vtoi(vp), &va, 0); > - if (error) > - goto out; > - xfs_acl_sync_mode(va.va_mode, xfs_acl); > - } > - error = -posix_acl_xfs_to_xattr(xfs_acl, ext_acl, size); > - } > -out: > - VN_RELE(vp); > - if(xfs_acl) > - _ACL_FREE(xfs_acl); > - return -error; > -} > - > -int > -xfs_acl_vremove( > - bhv_vnode_t *vp, > - int kind) > -{ > - int error; > - > - VN_HOLD(vp); > - error = xfs_acl_allow_set(vp, kind); > - if (!error) { > - error = xfs_attr_remove(xfs_vtoi(vp), > - kind == _ACL_TYPE_DEFAULT? > - SGI_ACL_DEFAULT: SGI_ACL_FILE, > - ATTR_ROOT); > - if (error == ENOATTR) > - error = 0; /* 'scool */ > - } > - VN_RELE(vp); > - return -error; > -} > - > -int > -xfs_acl_vset( > - bhv_vnode_t *vp, > - void *acl, > - size_t size, > - int kind) > -{ > - posix_acl_xattr_header *ext_acl = acl; > - xfs_acl_t *xfs_acl; > - int error; > - int basicperms = 0; /* more than std unix perms? */ > - > - if (!acl) > - return -EINVAL; > - > - if (!(_ACL_ALLOC(xfs_acl))) > - return -ENOMEM; > - > - error = posix_acl_xattr_to_xfs(ext_acl, size, xfs_acl); > - if (error) { > - _ACL_FREE(xfs_acl); > - return -error; > - } > - if (!xfs_acl->acl_cnt) { > - _ACL_FREE(xfs_acl); > - return 0; > - } > - > - VN_HOLD(vp); > - error = xfs_acl_allow_set(vp, kind); > - if (error) > - goto out; > - > - /* Incoming ACL exists, set file mode based on its value */ > - if (kind == _ACL_TYPE_ACCESS) > - xfs_acl_setmode(vp, xfs_acl, &basicperms); > - > - /* > - * If we have more than std unix permissions, set up the actual attr. > - * Otherwise, delete any existing attr. 
This prevents us from > - * having actual attrs for permissions that can be stored in the > - * standard permission bits. > - */ > - if (!basicperms) { > - xfs_acl_set_attr(vp, xfs_acl, kind, &error); > - } else { > - xfs_acl_vremove(vp, _ACL_TYPE_ACCESS); > - } > - > -out: > - VN_RELE(vp); > - _ACL_FREE(xfs_acl); > - return -error; > -} > - > -int > -xfs_acl_iaccess( > - xfs_inode_t *ip, > - mode_t mode, > - cred_t *cr) > -{ > - xfs_acl_t *acl; > - int rval; > - > - if (!(_ACL_ALLOC(acl))) > - return -1; > - > - /* If the file has no ACL return -1. */ > - rval = sizeof(xfs_acl_t); > - if (xfs_attr_fetch(ip, SGI_ACL_FILE, SGI_ACL_FILE_SIZE, > - (char *)acl, &rval, ATTR_ROOT | ATTR_KERNACCESS, cr)) { > - _ACL_FREE(acl); > - return -1; > - } > - xfs_acl_get_endian(acl); > - > - /* If the file has an empty ACL return -1. */ > - if (acl->acl_cnt == XFS_ACL_NOT_PRESENT) { > - _ACL_FREE(acl); > - return -1; > - } > - > - /* Synchronize ACL with mode bits */ > - xfs_acl_sync_mode(ip->i_d.di_mode, acl); > - > - rval = xfs_acl_access(ip->i_d.di_uid, ip->i_d.di_gid, acl, mode, cr); > - _ACL_FREE(acl); > - return rval; > -} > - > -STATIC int > -xfs_acl_allow_set( > - bhv_vnode_t *vp, > - int kind) > -{ > - xfs_inode_t *ip = xfs_vtoi(vp); > - bhv_vattr_t va; > - int error; > - > - if (vp->i_flags & (S_IMMUTABLE|S_APPEND)) > - return EPERM; > - if (kind == _ACL_TYPE_DEFAULT && !VN_ISDIR(vp)) > - return ENOTDIR; > - if (vp->i_sb->s_flags & MS_RDONLY) > - return EROFS; > - va.va_mask = XFS_AT_UID; > - error = xfs_getattr(ip, &va, 0); > - if (error) > - return error; > - if (va.va_uid != current->fsuid && !capable(CAP_FOWNER)) > - return EPERM; > - return error; > -} > - > -/* > - * Note: cr is only used here for the capability check if the ACL test fails. > - * It is not used to find out the credentials uid or groups etc, as was > - * done in IRIX. It is assumed that the uid and groups for the current > - * thread are taken from "current" instead of the cr parameter. 
> - */ > -STATIC int > -xfs_acl_access( > - uid_t fuid, > - gid_t fgid, > - xfs_acl_t *fap, > - mode_t md, > - cred_t *cr) > -{ > - xfs_acl_entry_t matched; > - int i, allows; > - int maskallows = -1; /* true, but not 1, either */ > - int seen_userobj = 0; > - > - matched.ae_tag = 0; /* Invalid type */ > - matched.ae_perm = 0; > - > - for (i = 0; i < fap->acl_cnt; i++) { > - /* > - * Break out if we've got a user_obj entry or > - * a user entry and the mask (and have processed USER_OBJ) > - */ > - if (matched.ae_tag == ACL_USER_OBJ) > - break; > - if (matched.ae_tag == ACL_USER) { > - if (maskallows != -1 && seen_userobj) > - break; > - if (fap->acl_entry[i].ae_tag != ACL_MASK && > - fap->acl_entry[i].ae_tag != ACL_USER_OBJ) > - continue; > - } > - /* True if this entry allows the requested access */ > - allows = ((fap->acl_entry[i].ae_perm & md) == md); > - > - switch (fap->acl_entry[i].ae_tag) { > - case ACL_USER_OBJ: > - seen_userobj = 1; > - if (fuid != current->fsuid) > - continue; > - matched.ae_tag = ACL_USER_OBJ; > - matched.ae_perm = allows; > - break; > - case ACL_USER: > - if (fap->acl_entry[i].ae_id != current->fsuid) > - continue; > - matched.ae_tag = ACL_USER; > - matched.ae_perm = allows; > - break; > - case ACL_GROUP_OBJ: > - if ((matched.ae_tag == ACL_GROUP_OBJ || > - matched.ae_tag == ACL_GROUP) && !allows) > - continue; > - if (!in_group_p(fgid)) > - continue; > - matched.ae_tag = ACL_GROUP_OBJ; > - matched.ae_perm = allows; > - break; > - case ACL_GROUP: > - if ((matched.ae_tag == ACL_GROUP_OBJ || > - matched.ae_tag == ACL_GROUP) && !allows) > - continue; > - if (!in_group_p(fap->acl_entry[i].ae_id)) > - continue; > - matched.ae_tag = ACL_GROUP; > - matched.ae_perm = allows; > - break; > - case ACL_MASK: > - maskallows = allows; > - break; > - case ACL_OTHER: > - if (matched.ae_tag != 0) > - continue; > - matched.ae_tag = ACL_OTHER; > - matched.ae_perm = allows; > - break; > - } > - } > - /* > - * First possibility is that no matched entry 
allows access. > - * The capability to override DAC may exist, so check for it. > - */ > - switch (matched.ae_tag) { > - case ACL_OTHER: > - case ACL_USER_OBJ: > - if (matched.ae_perm) > - return 0; > - break; > - case ACL_USER: > - case ACL_GROUP_OBJ: > - case ACL_GROUP: > - if (maskallows && matched.ae_perm) > - return 0; > - break; > - case 0: > - break; > - } > - > - /* EACCES tells generic_permission to check for capability overrides */ > - return EACCES; > -} > -EXPORT_SYMBOL(xfs_acl_access); > - > -/* > - * ACL validity checker. > - * This acl validation routine checks each ACL entry read in makes sense. > - */ > -STATIC int > -xfs_acl_invalid( > - xfs_acl_t *aclp) > -{ > - xfs_acl_entry_t *entry, *e; > - int user = 0, group = 0, other = 0, mask = 0; > - int mask_required = 0; > - int i, j; > - > - if (!aclp) > - goto acl_invalid; > - > - if (aclp->acl_cnt > XFS_ACL_MAX_ENTRIES) > - goto acl_invalid; > - > - for (i = 0; i < aclp->acl_cnt; i++) { > - entry = &aclp->acl_entry[i]; > - switch (entry->ae_tag) { > - case ACL_USER_OBJ: > - if (user++) > - goto acl_invalid; > - break; > - case ACL_GROUP_OBJ: > - if (group++) > - goto acl_invalid; > - break; > - case ACL_OTHER: > - if (other++) > - goto acl_invalid; > - break; > - case ACL_USER: > - case ACL_GROUP: > - for (j = i + 1; j < aclp->acl_cnt; j++) { > - e = &aclp->acl_entry[j]; > - if (e->ae_id == entry->ae_id && > - e->ae_tag == entry->ae_tag) > - goto acl_invalid; > - } > - mask_required++; > - break; > - case ACL_MASK: > - if (mask++) > - goto acl_invalid; > - break; > - default: > - goto acl_invalid; > - } > - } > - if (!user || !group || !other || (mask_required && !mask)) > - goto acl_invalid; > - else > - return 0; > -acl_invalid: > - return EINVAL; > -} > - > -/* > - * Do ACL endian conversion. 
> - */ > -STATIC void > -xfs_acl_get_endian( > - xfs_acl_t *aclp) > -{ > - xfs_acl_entry_t *ace, *end; > - > - INT_SET(aclp->acl_cnt, ARCH_CONVERT, aclp->acl_cnt); > - end = &aclp->acl_entry[0]+aclp->acl_cnt; > - for (ace = &aclp->acl_entry[0]; ace < end; ace++) { > - INT_SET(ace->ae_tag, ARCH_CONVERT, ace->ae_tag); > - INT_SET(ace->ae_id, ARCH_CONVERT, ace->ae_id); > - INT_SET(ace->ae_perm, ARCH_CONVERT, ace->ae_perm); > - } > -} > - > -/* > - * Get the ACL from the EA and do endian conversion. > - */ > -STATIC void > -xfs_acl_get_attr( > - bhv_vnode_t *vp, > - xfs_acl_t *aclp, > - int kind, > - int flags, > - int *error) > -{ > - int len = sizeof(xfs_acl_t); > - > - ASSERT((flags & ATTR_KERNOVAL) ? (aclp == NULL) : 1); > - flags |= ATTR_ROOT; > - *error = xfs_attr_get(xfs_vtoi(vp), > - kind == _ACL_TYPE_ACCESS ? > - SGI_ACL_FILE : SGI_ACL_DEFAULT, > - (char *)aclp, &len, flags, sys_cred); > - if (*error || (flags & ATTR_KERNOVAL)) > - return; > - xfs_acl_get_endian(aclp); > -} > - > -/* > - * Set the EA with the ACL and do endian conversion. > - */ > -STATIC void > -xfs_acl_set_attr( > - bhv_vnode_t *vp, > - xfs_acl_t *aclp, > - int kind, > - int *error) > -{ > - xfs_acl_entry_t *ace, *newace, *end; > - xfs_acl_t *newacl; > - int len; > - > - if (!(_ACL_ALLOC(newacl))) { > - *error = ENOMEM; > - return; > - } > - > - len = sizeof(xfs_acl_t) - > - (sizeof(xfs_acl_entry_t) * (XFS_ACL_MAX_ENTRIES - aclp->acl_cnt)); > - end = &aclp->acl_entry[0]+aclp->acl_cnt; > - for (ace = &aclp->acl_entry[0], newace = &newacl->acl_entry[0]; > - ace < end; > - ace++, newace++) { > - INT_SET(newace->ae_tag, ARCH_CONVERT, ace->ae_tag); > - INT_SET(newace->ae_id, ARCH_CONVERT, ace->ae_id); > - INT_SET(newace->ae_perm, ARCH_CONVERT, ace->ae_perm); > - } > - INT_SET(newacl->acl_cnt, ARCH_CONVERT, aclp->acl_cnt); > - *error = xfs_attr_set(xfs_vtoi(vp), > - kind == _ACL_TYPE_ACCESS ? 
> - SGI_ACL_FILE: SGI_ACL_DEFAULT, > - (char *)newacl, len, ATTR_ROOT); > - _ACL_FREE(newacl); > -} > - > -int > -xfs_acl_vtoacl( > - bhv_vnode_t *vp, > - xfs_acl_t *access_acl, > - xfs_acl_t *default_acl) > -{ > - bhv_vattr_t va; > - int error = 0; > - > - if (access_acl) { > - /* > - * Get the Access ACL and the mode. If either cannot > - * be obtained for some reason, invalidate the access ACL. > - */ > - xfs_acl_get_attr(vp, access_acl, _ACL_TYPE_ACCESS, 0, &error); > - if (!error) { > - /* Got the ACL, need the mode... */ > - va.va_mask = XFS_AT_MODE; > - error = xfs_getattr(xfs_vtoi(vp), &va, 0); > - } > - > - if (error) > - access_acl->acl_cnt = XFS_ACL_NOT_PRESENT; > - else /* We have a good ACL and the file mode, synchronize. */ > - xfs_acl_sync_mode(va.va_mode, access_acl); > - } > - > - if (default_acl) { > - xfs_acl_get_attr(vp, default_acl, _ACL_TYPE_DEFAULT, 0, &error); > - if (error) > - default_acl->acl_cnt = XFS_ACL_NOT_PRESENT; > - } > - return error; > -} > - > -/* > - * This function retrieves the parent directory's acl, processes it > - * and lets the child inherit the acl(s) that it should. > - */ > -int > -xfs_acl_inherit( > - bhv_vnode_t *vp, > - mode_t mode, > - xfs_acl_t *pdaclp) > -{ > - xfs_acl_t *cacl; > - int error = 0; > - int basicperms = 0; > - > - /* > - * If the parent does not have a default ACL, or it's an > - * invalid ACL, we're done. > - */ > - if (!vp) > - return 0; > - if (!pdaclp || xfs_acl_invalid(pdaclp)) > - return 0; > - > - /* > - * Copy the default ACL of the containing directory to > - * the access ACL of the new file and use the mode that > - * was passed in to set up the correct initial values for > - * the u::,g::[m::], and o:: entries. This is what makes > - * umask() "work" with ACL's. 
> - */ > - > - if (!(_ACL_ALLOC(cacl))) > - return ENOMEM; > - > - memcpy(cacl, pdaclp, sizeof(xfs_acl_t)); > - xfs_acl_filter_mode(mode, cacl); > - xfs_acl_setmode(vp, cacl, &basicperms); > - > - /* > - * Set the Default and Access ACL on the file. The mode is already > - * set on the file, so we don't need to worry about that. > - * > - * If the new file is a directory, its default ACL is a copy of > - * the containing directory's default ACL. > - */ > - if (VN_ISDIR(vp)) > - xfs_acl_set_attr(vp, pdaclp, _ACL_TYPE_DEFAULT, &error); > - if (!error && !basicperms) > - xfs_acl_set_attr(vp, cacl, _ACL_TYPE_ACCESS, &error); > - _ACL_FREE(cacl); > - return error; > -} > - > -/* > - * Set up the correct mode on the file based on the supplied ACL. This > - * makes sure that the mode on the file reflects the state of the > - * u::,g::[m::], and o:: entries in the ACL. Since the mode is where > - * the ACL is going to get the permissions for these entries, we must > - * synchronize the mode whenever we set the ACL on a file. > - */ > -STATIC int > -xfs_acl_setmode( > - bhv_vnode_t *vp, > - xfs_acl_t *acl, > - int *basicperms) > -{ > - bhv_vattr_t va; > - xfs_acl_entry_t *ap; > - xfs_acl_entry_t *gap = NULL; > - int i, error, nomask = 1; > - > - *basicperms = 1; > - > - if (acl->acl_cnt == XFS_ACL_NOT_PRESENT) > - return 0; > - > - /* > - * Copy the u::, g::, o::, and m:: bits from the ACL into the > - * mode. The m:: bits take precedence over the g:: bits. 
> - */ > - va.va_mask = XFS_AT_MODE; > - error = xfs_getattr(xfs_vtoi(vp), &va, 0); > - if (error) > - return error; > - > - va.va_mask = XFS_AT_MODE; > - va.va_mode &= ~(S_IRWXU|S_IRWXG|S_IRWXO); > - ap = acl->acl_entry; > - for (i = 0; i < acl->acl_cnt; ++i) { > - switch (ap->ae_tag) { > - case ACL_USER_OBJ: > - va.va_mode |= ap->ae_perm << 6; > - break; > - case ACL_GROUP_OBJ: > - gap = ap; > - break; > - case ACL_MASK: /* more than just standard modes */ > - nomask = 0; > - va.va_mode |= ap->ae_perm << 3; > - *basicperms = 0; > - break; > - case ACL_OTHER: > - va.va_mode |= ap->ae_perm; > - break; > - default: /* more than just standard modes */ > - *basicperms = 0; > - break; > - } > - ap++; > - } > - > - /* Set the group bits from ACL_GROUP_OBJ if there's no ACL_MASK */ > - if (gap && nomask) > - va.va_mode |= gap->ae_perm << 3; > - > - return xfs_setattr(xfs_vtoi(vp), &va, 0, sys_cred); > -} > - > -/* > - * The permissions for the special ACL entries (u::, g::[m::], o::) are > - * actually stored in the file mode (if there is both a group and a mask, > - * the group is stored in the ACL entry and the mask is stored on the file). > - * This allows the mode to remain automatically in sync with the ACL without > - * the need for a call-back to the ACL system at every point where the mode > - * could change. This function takes the permissions from the specified mode > - * and places it in the supplied ACL. > - * > - * This implementation draws its validity from the fact that, when the ACL > - * was assigned, the mode was copied from the ACL. > - * If the mode did not change, therefore, the mode remains exactly what was > - * taken from the special ACL entries at assignment. > - * If a subsequent chmod() was done, the POSIX spec says that the change in > - * mode must cause an update to the ACL seen at user level and used for > - * access checks. 
Before and after a mode change, therefore, the file mode > - * most accurately reflects what the special ACL entries should permit/deny. > - * > - * CAVEAT: If someone sets the SGI_ACL_FILE attribute directly, > - * the existing mode bits will override whatever is in the > - * ACL. Similarly, if there is a pre-existing ACL that was > - * never in sync with its mode (owing to a bug in 6.5 and > - * before), it will now magically (or mystically) be > - * synchronized. This could cause slight astonishment, but > - * it is better than inconsistent permissions. > - * > - * The supplied ACL is a template that may contain any combination > - * of special entries. These are treated as place holders when we fill > - * out the ACL. This routine does not add or remove special entries, it > - * simply unites each special entry with its associated set of permissions. > - */ > -STATIC void > -xfs_acl_sync_mode( > - mode_t mode, > - xfs_acl_t *acl) > -{ > - int i, nomask = 1; > - xfs_acl_entry_t *ap; > - xfs_acl_entry_t *gap = NULL; > - > - /* > - * Set ACL entries. POSIX1003.1eD16 requires that the MASK > - * be set instead of the GROUP entry, if there is a MASK. > - */ > - for (ap = acl->acl_entry, i = 0; i < acl->acl_cnt; ap++, i++) { > - switch (ap->ae_tag) { > - case ACL_USER_OBJ: > - ap->ae_perm = (mode >> 6) & 0x7; > - break; > - case ACL_GROUP_OBJ: > - gap = ap; > - break; > - case ACL_MASK: > - nomask = 0; > - ap->ae_perm = (mode >> 3) & 0x7; > - break; > - case ACL_OTHER: > - ap->ae_perm = mode & 0x7; > - break; > - default: > - break; > - } > - } > - /* Set the ACL_GROUP_OBJ if there's no ACL_MASK */ > - if (gap && nomask) > - gap->ae_perm = (mode >> 3) & 0x7; > -} > - > -/* > - * When inheriting an Access ACL from a directory Default ACL, > - * the ACL bits are set to the intersection of the ACL default > - * permission bits and the file permission bits in mode. If there > - * are no permission bits on the file then we must not give them > - * the ACL. 
This is what what makes umask() work with ACLs. > - */ > -STATIC void > -xfs_acl_filter_mode( > - mode_t mode, > - xfs_acl_t *acl) > -{ > - int i, nomask = 1; > - xfs_acl_entry_t *ap; > - xfs_acl_entry_t *gap = NULL; > - > - /* > - * Set ACL entries. POSIX1003.1eD16 requires that the MASK > - * be merged with GROUP entry, if there is a MASK. > - */ > - for (ap = acl->acl_entry, i = 0; i < acl->acl_cnt; ap++, i++) { > - switch (ap->ae_tag) { > - case ACL_USER_OBJ: > - ap->ae_perm &= (mode >> 6) & 0x7; > - break; > - case ACL_GROUP_OBJ: > - gap = ap; > - break; > - case ACL_MASK: > - nomask = 0; > - ap->ae_perm &= (mode >> 3) & 0x7; > - break; > - case ACL_OTHER: > - ap->ae_perm &= mode & 0x7; > - break; > - default: > - break; > - } > - } > - /* Set the ACL_GROUP_OBJ if there's no ACL_MASK */ > - if (gap && nomask) > - gap->ae_perm &= (mode >> 3) & 0x7; > -} > Index: linux-2.6-xfs/fs/xfs/Makefile > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/Makefile 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/Makefile 2008-02-07 09:15:35.000000000 +0100 > @@ -29,7 +29,7 @@ obj-$(CONFIG_XFS_QUOTA) += quota/ > obj-$(CONFIG_XFS_DMAPI) += dmapi/ > > xfs-$(CONFIG_XFS_RT) += xfs_rtalloc.o > -xfs-$(CONFIG_XFS_POSIX_ACL) += xfs_acl.o > +xfs-$(CONFIG_XFS_POSIX_ACL) += $(XFS_LINUX)/xfs_acl.o > xfs-$(CONFIG_PROC_FS) += $(XFS_LINUX)/xfs_stats.o > xfs-$(CONFIG_SYSCTL) += $(XFS_LINUX)/xfs_sysctl.o > xfs-$(CONFIG_COMPAT) += $(XFS_LINUX)/xfs_ioctl32.o > Index: linux-2.6-xfs/fs/xfs/xfs_inode.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.c 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_inode.c 2008-02-07 09:15:35.000000000 +0100 > @@ -52,6 +52,7 @@ > #include "xfs_acl.h" > #include "xfs_filestream.h" > #include "xfs_vnodeops.h" > +#include "xfs_acl.h" > > kmem_zone_t *xfs_ifork_zone; > kmem_zone_t *xfs_inode_zone; > @@ -870,6 +871,7 @@ 
xfs_iread( > ip->i_mount = mp; > atomic_set(&ip->i_iocount, 0); > spin_lock_init(&ip->i_flags_lock); > + xfs_inode_init_acls(ip); > > /* > * Get pointer's to the on-disk inode and the buffer containing it. > @@ -2793,6 +2795,8 @@ xfs_idestroy( > } > xfs_inode_item_destroy(ip); > } > + > + xfs_inode_clear_acls(ip); > kmem_zone_free(xfs_inode_zone, ip); > } > > Index: linux-2.6-xfs/fs/xfs/xfs_inode.h > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.h 2008-02-05 08:43:31.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_inode.h 2008-02-07 09:15:35.000000000 +0100 > @@ -18,6 +18,7 @@ > #ifndef __XFS_INODE_H__ > #define __XFS_INODE_H__ > > +struct posix_acl; > struct xfs_dinode; > struct xfs_dinode_core; > > @@ -258,6 +259,11 @@ typedef struct xfs_inode { > xfs_fsize_t i_size; /* in-memory size */ > xfs_fsize_t i_new_size; /* size when write completes */ > atomic_t i_iocount; /* outstanding I/O count */ > + > +#ifdef CONFIG_XFS_POSIX_ACL > + struct posix_acl *i_acl; > + struct posix_acl *i_default_acl; > +#endif > /* Trace buffers per inode. 
*/ > #ifdef XFS_INODE_TRACE > struct ktrace *i_trace; /* general inode trace */ > Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.c 2008-02-07 09:15:55.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.c 2008-02-07 09:16:07.000000000 +0100 > @@ -77,132 +77,6 @@ xfs_open( > } > > /* > - * xfs_getattr > - */ > -int > -xfs_getattr( > - xfs_inode_t *ip, > - bhv_vattr_t *vap, > - int flags) > -{ > - bhv_vnode_t *vp = XFS_ITOV(ip); > - xfs_mount_t *mp = ip->i_mount; > - > - xfs_itrace_entry(ip); > - > - if (XFS_FORCED_SHUTDOWN(mp)) > - return XFS_ERROR(EIO); > - > - if (!(flags & ATTR_LAZY)) > - xfs_ilock(ip, XFS_ILOCK_SHARED); > - > - vap->va_size = XFS_ISIZE(ip); > - if (vap->va_mask == XFS_AT_SIZE) > - goto all_done; > - > - vap->va_nblocks = > - XFS_FSB_TO_BB(mp, ip->i_d.di_nblocks + ip->i_delayed_blks); > - vap->va_nodeid = ip->i_ino; > -#if XFS_BIG_INUMS > - vap->va_nodeid += mp->m_inoadd; > -#endif > - vap->va_nlink = ip->i_d.di_nlink; > - > - /* > - * Quick exit for non-stat callers > - */ > - if ((vap->va_mask & > - ~(XFS_AT_SIZE|XFS_AT_FSID|XFS_AT_NODEID| > - XFS_AT_NLINK|XFS_AT_BLKSIZE)) == 0) > - goto all_done; > - > - /* > - * Copy from in-core inode. > - */ > - vap->va_mode = ip->i_d.di_mode; > - vap->va_uid = ip->i_d.di_uid; > - vap->va_gid = ip->i_d.di_gid; > - vap->va_projid = ip->i_d.di_projid; > - > - /* > - * Check vnode type block/char vs. everything else. 
> - */ > - switch (ip->i_d.di_mode & S_IFMT) { > - case S_IFBLK: > - case S_IFCHR: > - vap->va_rdev = ip->i_df.if_u2.if_rdev; > - vap->va_blocksize = BLKDEV_IOSIZE; > - break; > - default: > - vap->va_rdev = 0; > - > - if (!(XFS_IS_REALTIME_INODE(ip))) { > - vap->va_blocksize = xfs_preferred_iosize(mp); > - } else { > - > - /* > - * If the file blocks are being allocated from a > - * realtime partition, then return the inode's > - * realtime extent size or the realtime volume's > - * extent size. > - */ > - vap->va_blocksize = > - xfs_get_extsz_hint(ip) << mp->m_sb.sb_blocklog; > - } > - break; > - } > - > - vn_atime_to_timespec(vp, &vap->va_atime); > - vap->va_mtime.tv_sec = ip->i_d.di_mtime.t_sec; > - vap->va_mtime.tv_nsec = ip->i_d.di_mtime.t_nsec; > - vap->va_ctime.tv_sec = ip->i_d.di_ctime.t_sec; > - vap->va_ctime.tv_nsec = ip->i_d.di_ctime.t_nsec; > - > - /* > - * Exit for stat callers. See if any of the rest of the fields > - * to be filled in are needed. > - */ > - if ((vap->va_mask & > - (XFS_AT_XFLAGS|XFS_AT_EXTSIZE|XFS_AT_NEXTENTS|XFS_AT_ANEXTENTS| > - XFS_AT_GENCOUNT|XFS_AT_VCODE)) == 0) > - goto all_done; > - > - /* > - * Convert di_flags to xflags. > - */ > - vap->va_xflags = xfs_ip2xflags(ip); > - > - /* > - * Exit for inode revalidate. See if any of the rest of > - * the fields to be filled in are needed. > - */ > - if ((vap->va_mask & > - (XFS_AT_EXTSIZE|XFS_AT_NEXTENTS|XFS_AT_ANEXTENTS| > - XFS_AT_GENCOUNT|XFS_AT_VCODE)) == 0) > - goto all_done; > - > - vap->va_extsize = ip->i_d.di_extsize << mp->m_sb.sb_blocklog; > - vap->va_nextents = > - (ip->i_df.if_flags & XFS_IFEXTENTS) ? > - ip->i_df.if_bytes / sizeof(xfs_bmbt_rec_t) : > - ip->i_d.di_nextents; > - if (ip->i_afp) > - vap->va_anextents = > - (ip->i_afp->if_flags & XFS_IFEXTENTS) ? 
> - ip->i_afp->if_bytes / sizeof(xfs_bmbt_rec_t) : > - ip->i_d.di_anextents; > - else > - vap->va_anextents = 0; > - vap->va_gen = ip->i_d.di_gen; > - > - all_done: > - if (!(flags & ATTR_LAZY)) > - xfs_iunlock(ip, XFS_ILOCK_SHARED); > - return 0; > -} > - > - > -/* > * xfs_setattr > */ > int > Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.h > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.h 2008-02-07 09:15:48.000000000 +0100 > +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.h 2008-02-07 09:15:53.000000000 +0100 > @@ -15,7 +15,6 @@ struct xfs_iomap; > > > int xfs_open(struct xfs_inode *ip); > -int xfs_getattr(struct xfs_inode *ip, struct bhv_vattr *vap, int flags); > int xfs_setattr(struct xfs_inode *ip, struct bhv_vattr *vap, int flags, > struct cred *credp); > int xfs_readlink(struct xfs_inode *ip, char *link); > From owner-xfs@oss.sgi.com Thu Feb 14 00:47:37 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 00:47:42 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1E8lXZE018569 for ; Thu, 14 Feb 2008 00:47:37 -0800 X-ASG-Debug-ID: 1202978876-674701280000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ug-out-1314.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id F1284E1220C for ; Thu, 14 Feb 2008 00:47:57 -0800 (PST) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.173]) by cuda.sgi.com with ESMTP id LV3UvzLGn1HVeDg2 for ; Thu, 14 Feb 2008 00:47:57 -0800 (PST) Received: by ug-out-1314.google.com with SMTP id o29so1946148ugd.20 for ; Thu, 14 Feb 2008 00:47:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=TAU1jLQny2T4l29TN4fe6MO8rndsq1RD0rKuclf7Pkg=; b=v5VkVKgopD5wQv4AC38BGWy8hGppiot8B8OMCK3ZOXeM7a2vX5U5qMhkOXpFgwTebNUj81BQ//4LZYuP3cPOkHarDrdNrNjgKicK2SG1zTRuPZ40JQuX4R6Wf+9BRSibHRH3Z9xBYDS67Tv1RA/GD+Rb+/DjVNwWOrEclIlRdH4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=XxaGrJ/8gp4KKTRBg2E4v9uwzU8owjLBNzEPPVMIMJKXy5MR3tgfg8n8s+nvi9AVMmKRRyPPYr3vPRHspfXZ8q4VMfYM03n4S7ac8l3FApqnqOZs+/JnUAd+20HG1rAkcQmBrYu7MX0ClwhsNC5U7WBKHdS+SdbvmN9i+5KzD6U= Received: by 10.151.109.11 with SMTP id l11mr384399ybm.52.1202978481553; Thu, 14 Feb 2008 00:41:21 -0800 (PST) Received: by 10.150.191.13 with HTTP; Thu, 14 Feb 2008 00:41:21 -0800 (PST) Message-ID: <1a4a774c0802140041h49a88b9l281f5ac3213381a2@mail.gmail.com> Date: Thu, 14 Feb 2008 09:41:21 +0100 From: "=?ISO-8859-1?Q?Christian_R=F8snes?=" To: "David Chinner" X-ASG-Orig-Subj: Re: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c Subject: Re: XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c Cc: xfs@oss.sgi.com In-Reply-To: <20080213214551.GR155407@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Disposition: inline References: <1a4a774c0802130251h657a52f7lb97942e7afdf6e3f@mail.gmail.com> <20080213214551.GR155407@sgi.com> X-Barracuda-Connect: ug-out-1314.google.com[66.249.92.173] X-Barracuda-Start-Time: 1202978877 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= 
X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42216 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5805/Wed Feb 13 15:29:12 2008 on oss.sgi.com X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id m1E8lbZE018571 X-archive-position: 14432 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.rosnes@gmail.com Precedence: bulk X-list: xfs On Wed, Feb 13, 2008 at 10:45 PM, David Chinner wrote: > On Wed, Feb 13, 2008 at 11:51:51AM +0100, Christian Røsnes wrote: > > Kernel: 2.6.17.7 (custom compiled) > > > > Are there any known XFS problems with this kernel version and nearly > > full partitions ? > > Yes. Deadlocks that weren't properly fixed until 2.6.18 (partially > fixed in 2.6.17) and an accounting problem in the transaction code > that leads to the shutdown you are seeing. The accounting problem is > fixed by this commit: > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=45c34141126a89da07197d5b89c04c6847f1171a > > which I think went into 2.6.22. > > Luckily, neither of these problems result in corruption. > > > > I'm thinking about upgrading the kernel to a newer version, to see if > > it fixes this problem. > > Are there any known XFS problems with version 2.6.24.2 ? > > Yes - a problem with readdir. The fix is currently in the stable > queue (i.e for 2.6.24.3): > > http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=commit;h=ee864b866419890b019352412c7bc9634d96f61b > > So we are just waiting for Greg to release 2.6.24.3 now. > Thanks. I ran memtest overnight just to be sure and no errors were found. I'll wait for 2.6.24.3, and then upgrade the kernel. 
Christian From owner-xfs@oss.sgi.com Thu Feb 14 03:18:01 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 03:18:05 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_54 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1EBHxqC028059 for ; Thu, 14 Feb 2008 03:18:01 -0800 X-ASG-Debug-ID: 1202987902-4f7401890000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mta5.srv.hcvlny.cv.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 421E35D647C for ; Thu, 14 Feb 2008 03:18:22 -0800 (PST) Received: from mta5.srv.hcvlny.cv.net (mta5.srv.hcvlny.cv.net [167.206.4.200]) by cuda.sgi.com with ESMTP id P7a5u82JiibqHnpc for ; Thu, 14 Feb 2008 03:18:22 -0800 (PST) Received: from freyr.home (ool-44c218a8.dyn.optonline.net [68.194.24.168]) by mta5.srv.hcvlny.cv.net (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) with ESMTP id <0JW7009UHXLA60U0@mta5.srv.hcvlny.cv.net> for xfs@oss.sgi.com; Thu, 14 Feb 2008 02:46:30 -0500 (EST) Received: by freyr.home (Postfix, from userid 1000) id 2B7BC8DC760; Thu, 14 Feb 2008 02:45:40 -0500 (EST) Date: Thu, 14 Feb 2008 02:45:39 -0500 From: "Josef 'Jeff' Sipek" X-ASG-Orig-Subj: [PATCH 1/1] XFS: replace *_IDELETE with *_IKEEP Subject: [PATCH 1/1] XFS: replace *_IDELETE with *_IKEEP In-reply-to: <47B3B6AE.4030505@sandeen.net> To: xfs@oss.sgi.com Cc: sandeen@sandeen.net, "Josef 'Jeff' Sipek" Message-id: <1202975139-10546-1-git-send-email-jeffpc@josefsipek.net> X-Mailer: git-send-email 1.5.4.rc2.85.g9de45-dirty Content-transfer-encoding: 7BIT References: <47B3B6AE.4030505@sandeen.net> X-Barracuda-Connect: mta5.srv.hcvlny.cv.net[167.206.4.200] X-Barracuda-Start-Time: 1202987903 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 
-2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.97 X-Barracuda-Spam-Status: No, SCORE=-0.97 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=BSF_RULE_7582B X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42226 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 1.05 BSF_RULE_7582B BODY: Custom Rule 7582B X-Virus-Scanned: ClamAV 0.91.2/5806/Thu Feb 14 01:02:39 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14433 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jeffpc@josefsipek.net Precedence: bulk X-list: xfs Change the *_IDELETE flags to *_IKEEP, and flip the logic as necessary. This completely eliminates the no-no-no-idelete madness. Additionally, "ikeep" or "noikeep" is always displayed in /proc/mounts option string. This should help clear up any confusion about what the current mode is. 
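For illustration only, and not part of the patch itself: the flipped logic described above can be sketched in a few lines of C. The names OLD_MOUNT_IDELETE, NEW_MOUNT_IKEEP, old_may_delete(), new_may_delete(), convert_flags() and verify_flag_flip() are hypothetical stand-ins invented for this sketch; only the bit position (1ULL << 18) and the shape of the two tests are taken from the hunks that follow. The sketch assumes the only behavioural change is the sense of that one bit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the mount flag. Both names occupy bit 18,
 * mirroring the xfs_mount.h hunk where XFS_MOUNT_IDELETE is renamed to
 * XFS_MOUNT_IKEEP without changing its value.
 */
#define OLD_MOUNT_IDELETE (1ULL << 18) /* old sense: delete empty inode clusters */
#define NEW_MOUNT_IKEEP   (1ULL << 18) /* new sense: keep empty inode clusters */

/* Old-style test: empty clusters are removed only when IDELETE is set. */
static int old_may_delete(uint64_t m_flags)
{
	return (m_flags & OLD_MOUNT_IDELETE) != 0;
}

/* New-style test: empty clusters are removed unless IKEEP is set. */
static int new_may_delete(uint64_t m_flags)
{
	return !(m_flags & NEW_MOUNT_IKEEP);
}

/* Re-express an old-style flag word in the new encoding by flipping bit 18. */
static uint64_t convert_flags(uint64_t old_flags)
{
	return old_flags ^ OLD_MOUNT_IDELETE;
}

/* Check that both encodings agree for the "ikeep" and "noikeep" cases. */
static void verify_flag_flip(void)
{
	uint64_t noikeep_old = OLD_MOUNT_IDELETE; /* "noikeep": clusters deleted */
	uint64_t ikeep_old = 0;                   /* "ikeep": clusters kept */

	assert(old_may_delete(noikeep_old) ==
	       new_may_delete(convert_flags(noikeep_old)));
	assert(old_may_delete(ikeep_old) ==
	       new_may_delete(convert_flags(ikeep_old)));
}
```

With this encoding a zeroed flag word means "noikeep", which is why the parser only has to set the bit when "ikeep" is requested and clear it otherwise.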
Signed-off-by: Josef 'Jeff' Sipek --- fs/xfs/linux-2.6/xfs_super.c | 11 ++++++----- fs/xfs/xfs_clnt.h | 2 +- fs/xfs/xfs_ialloc.c | 2 +- fs/xfs/xfs_mount.h | 2 +- fs/xfs/xfs_vfsops.c | 4 ++-- fs/xfs/xfsidbg.c | 2 +- 6 files changed, 12 insertions(+), 11 deletions(-) diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c index a0b1235..5c343a0 100644 --- a/fs/xfs/linux-2.6/xfs_super.c +++ b/fs/xfs/linux-2.6/xfs_super.c @@ -171,7 +171,7 @@ xfs_parseargs( char *this_char, *value, *eov; int dsunit, dswidth, vol_dsunit, vol_dswidth; int iosize; - int ikeep = 0; + int ikeep = 0; /* don't keep by default */ args->flags |= XFSMNT_BARRIER; args->flags2 |= XFSMNT2_COMPAT_IOSIZE; @@ -303,9 +303,9 @@ xfs_parseargs( args->flags &= ~XFSMNT_BARRIER; } else if (!strcmp(this_char, MNTOPT_IKEEP)) { ikeep = 1; - args->flags &= ~XFSMNT_IDELETE; + args->flags |= XFSMNT_IKEEP; } else if (!strcmp(this_char, MNTOPT_NOIKEEP)) { - args->flags |= XFSMNT_IDELETE; + args->flags &= ~XFSMNT_IKEEP; } else if (!strcmp(this_char, MNTOPT_LARGEIO)) { args->flags2 &= ~XFSMNT2_COMPAT_IOSIZE; } else if (!strcmp(this_char, MNTOPT_NOLARGEIO)) { @@ -411,7 +411,7 @@ xfs_parseargs( * supplied, then they are honored. 
*/ if (!(args->flags & XFSMNT_DMAPI) && !ikeep) - args->flags |= XFSMNT_IDELETE; + args->flags &= ~XFSMNT_IKEEP; if ((args->flags & XFSMNT_NOALIGN) != XFSMNT_NOALIGN) { if (dsunit) { @@ -446,6 +446,7 @@ xfs_showargs( { static struct proc_xfs_info xfs_info_set[] = { /* the few simple ones we can get from the mount struct */ + { XFS_MOUNT_IKEEP, "," MNTOPT_IKEEP }, { XFS_MOUNT_WSYNC, "," MNTOPT_WSYNC }, { XFS_MOUNT_INO64, "," MNTOPT_INO64 }, { XFS_MOUNT_NOALIGN, "," MNTOPT_NOALIGN }, @@ -461,7 +462,7 @@ xfs_showargs( }; static struct proc_xfs_info xfs_info_unset[] = { /* the few simple ones we can get from the mount struct */ - { XFS_MOUNT_IDELETE, "," MNTOPT_IKEEP }, + { XFS_MOUNT_IKEEP, "," MNTOPT_NOIKEEP }, { XFS_MOUNT_COMPAT_IOSIZE, "," MNTOPT_LARGEIO }, { XFS_MOUNT_BARRIER, "," MNTOPT_NOBARRIER }, { XFS_MOUNT_SMALL_INUMS, "," MNTOPT_64BITINODE }, diff --git a/fs/xfs/xfs_clnt.h b/fs/xfs/xfs_clnt.h index d16c1b9..d5d1e60 100644 --- a/fs/xfs/xfs_clnt.h +++ b/fs/xfs/xfs_clnt.h @@ -86,7 +86,7 @@ struct xfs_mount_args { #define XFSMNT_NOUUID 0x01000000 /* Ignore fs uuid */ #define XFSMNT_DMAPI 0x02000000 /* enable dmapi/xdsm */ #define XFSMNT_BARRIER 0x04000000 /* use write barriers */ -#define XFSMNT_IDELETE 0x08000000 /* inode cluster delete */ +#define XFSMNT_IKEEP 0x08000000 /* inode cluster delete */ #define XFSMNT_SWALLOC 0x10000000 /* turn on stripe width * allocation */ #define XFSMNT_DIRSYNC 0x40000000 /* sync creat,link,unlink,rename diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c index 1409c2d..badf745 100644 --- a/fs/xfs/xfs_ialloc.c +++ b/fs/xfs/xfs_ialloc.c @@ -1053,7 +1053,7 @@ xfs_difree( /* * When an inode cluster is free, it becomes eligible for removal */ - if ((mp->m_flags & XFS_MOUNT_IDELETE) && + if (!(mp->m_flags & XFS_MOUNT_IKEEP) && (rec.ir_freecount == XFS_IALLOC_INODES(mp))) { *delete = 1; diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index 435d625..87ee8b8 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -366,7 
+366,7 @@ typedef struct xfs_mount { #define XFS_MOUNT_SMALL_INUMS (1ULL << 15) /* users wants 32bit inodes */ #define XFS_MOUNT_NOUUID (1ULL << 16) /* ignore uuid during mount */ #define XFS_MOUNT_BARRIER (1ULL << 17) -#define XFS_MOUNT_IDELETE (1ULL << 18) /* delete empty inode clusters*/ +#define XFS_MOUNT_IKEEP (1ULL << 18) /* keep empty inode clusters*/ #define XFS_MOUNT_SWALLOC (1ULL << 19) /* turn on stripe width * allocation */ #define XFS_MOUNT_RDONLY (1ULL << 20) /* read-only fs */ diff --git a/fs/xfs/xfs_vfsops.c b/fs/xfs/xfs_vfsops.c index a0f287e..e809b1c 100644 --- a/fs/xfs/xfs_vfsops.c +++ b/fs/xfs/xfs_vfsops.c @@ -279,8 +279,8 @@ xfs_start_flags( mp->m_readio_log = mp->m_writeio_log = ap->iosizelog; } - if (ap->flags & XFSMNT_IDELETE) - mp->m_flags |= XFS_MOUNT_IDELETE; + if (ap->flags & XFSMNT_IKEEP) + mp->m_flags |= XFS_MOUNT_IKEEP; if (ap->flags & XFSMNT_DIRSYNC) mp->m_flags |= XFS_MOUNT_DIRSYNC; if (ap->flags & XFSMNT_ATTR2) diff --git a/fs/xfs/xfsidbg.c b/fs/xfs/xfsidbg.c index aa029da..a875351 100644 --- a/fs/xfs/xfsidbg.c +++ b/fs/xfs/xfsidbg.c @@ -6282,7 +6282,7 @@ xfsidbg_xmount(xfs_mount_t *mp) "SMALL_INUMS", /* 0x8000 */ "NOUUID", /* 0x10000 */ "BARRIER", /* 0x20000 */ - "IDELETE", /* 0x40000 */ + "IKEEP", /* 0x40000 */ "SWALLOC", /* 0x80000 */ "RDONLY", /* 0x100000 */ "DIRSYNC", /* 0x200000 */ -- 1.5.4.rc2.85.g9de45-dirty From owner-xfs@oss.sgi.com Thu Feb 14 08:31:20 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 08:31:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1EGVJW5029623 for ; Thu, 14 Feb 2008 08:31:20 -0800 X-ASG-Debug-ID: 1203006702-1936002f0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from 
verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A5DB2E1C2D7 for ; Thu, 14 Feb 2008 08:31:42 -0800 (PST) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 6UjEXHKa6eYrTdrE for ; Thu, 14 Feb 2008 08:31:42 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m1EGVaF3031701 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Thu, 14 Feb 2008 17:31:36 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m1EGVawc031699 for xfs@oss.sgi.com; Thu, 14 Feb 2008 17:31:36 +0100 Date: Thu, 14 Feb 2008 17:31:35 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] kill xfs_rwlock/xfs_rwunlock Subject: Re: [PATCH] kill xfs_rwlock/xfs_rwunlock Message-ID: <20080214163135.GA31502@lst.de> References: <20080103125641.GC5331@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080103125641.GC5331@lst.de> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1203006703 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42245 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5809/Thu Feb 14 02:52:07 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14434 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Thu, Jan 03, 2008 at 01:56:41PM 
+0100, Christoph Hellwig wrote: > We can just use xfs_ilock/xfs_iunlock instead and get rid of the ugly > bhv_vrwlock_t. Here's an updated version that applies after the refcache removal and the successive fixup have hit the tree: Index: linux-2.6-xfs/fs/xfs/dmapi/xfs_dm.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/dmapi/xfs_dm.c 2008-02-08 05:20:51.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/dmapi/xfs_dm.c 2008-02-14 17:15:57.000000000 +0100 @@ -138,7 +138,7 @@ xfs_dm_send_data_event( xfs_off_t offset, size_t length, int flags, - bhv_vrwlock_t *locktype) + int *lock_flags) { int error; xfs_inode_t *ip; @@ -150,8 +150,8 @@ xfs_dm_send_data_event( ip = xfs_vtoi(vp); do { dmstate = ip->i_d.di_dmstate; - if (locktype) - xfs_rwunlock(ip, *locktype); + if (lock_flags) + xfs_iunlock(ip, *lock_flags); up_rw_sems(inode, flags); @@ -161,8 +161,8 @@ xfs_dm_send_data_event( down_rw_sems(inode, flags); - if (locktype) - xfs_rwlock(ip, *locktype); + if (lock_flags) + xfs_ilock(ip, *lock_flags); } while (!error && (ip->i_d.di_dmstate != dmstate)); return error; @@ -3085,7 +3085,6 @@ xfs_dm_send_mmap_event( xfs_inode_t *ip; int error = 0; dm_eventtype_t max_event = DM_EVENT_READ; - bhv_vrwlock_t locktype; xfs_fsize_t filesize; xfs_off_t length, end_of_area, evsize, offset; int iolock; @@ -3140,20 +3139,16 @@ xfs_dm_send_mmap_event( if (evsize < 0) evsize = 0; - if (max_event == DM_EVENT_READ) { - locktype = VRWLOCK_READ; + if (max_event == DM_EVENT_READ) iolock = XFS_IOLOCK_SHARED; - } - else { - locktype = VRWLOCK_WRITE; + else iolock = XFS_IOLOCK_EXCL; - } xfs_ilock(ip, iolock); /* If write possible, try a DMAPI write event */ if (max_event == DM_EVENT_WRITE && DM_EVENT_ENABLED(ip, max_event)) { error = xfs_dm_send_data_event(max_event, vp, offset, - evsize, 0, &locktype); + evsize, 0, &iolock); goto out_unlock; } @@ -3162,7 +3157,7 @@ xfs_dm_send_mmap_event( */ if (DM_EVENT_ENABLED(ip, DM_EVENT_READ)) { error = 
xfs_dm_send_data_event(DM_EVENT_READ, vp, offset, - evsize, 0, &locktype); + evsize, 0, &iolock); } out_unlock: xfs_iunlock(ip, iolock); Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_aops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_aops.c 2008-02-08 05:20:52.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_aops.c 2008-02-14 17:15:57.000000000 +0100 @@ -1532,9 +1532,9 @@ xfs_vm_bmap( struct xfs_inode *ip = XFS_I(inode); xfs_itrace_entry(XFS_I(inode)); - xfs_rwlock(ip, VRWLOCK_READ); + xfs_ilock(ip, XFS_IOLOCK_SHARED); xfs_flush_pages(ip, (xfs_off_t)0, -1, 0, FI_REMAPF); - xfs_rwunlock(ip, VRWLOCK_READ); + xfs_iunlock(ip, XFS_IOLOCK_SHARED); return generic_block_bmap(mapping, block, xfs_get_blocks); } Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-13 14:24:34.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-14 17:15:57.000000000 +0100 @@ -262,8 +262,6 @@ EXPORT_SYMBOL(xfs_mountfs); EXPORT_SYMBOL(xfs_qm_dqcheck); EXPORT_SYMBOL(xfs_readsb); EXPORT_SYMBOL(xfs_read_buf); -EXPORT_SYMBOL(xfs_rwlock); -EXPORT_SYMBOL(xfs_rwunlock); EXPORT_SYMBOL(xfs_setattr); EXPORT_SYMBOL(xfs_attr_get); EXPORT_SYMBOL(xfs_attr_set); Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_lrw.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_lrw.c 2008-02-08 05:20:52.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_lrw.c 2008-02-14 17:15:57.000000000 +0100 @@ -228,11 +228,11 @@ xfs_read( xfs_ilock(ip, XFS_IOLOCK_SHARED); if (DM_EVENT_ENABLED(ip, DM_EVENT_READ) && !(ioflags & IO_INVIS)) { - bhv_vrwlock_t locktype = VRWLOCK_READ; int dmflags = FILP_DELAY_FLAG(file) | DM_SEM_FLAG_RD(ioflags); + int iolock = XFS_IOLOCK_SHARED; ret = -XFS_SEND_DATA(mp, DM_EVENT_READ, vp, *offset, size, - dmflags, &locktype); + dmflags, 
&iolock); if (ret) { xfs_iunlock(ip, XFS_IOLOCK_SHARED); if (unlikely(ioflags & IO_ISDIRECT)) @@ -287,11 +287,11 @@ xfs_splice_read( xfs_ilock(ip, XFS_IOLOCK_SHARED); if (DM_EVENT_ENABLED(ip, DM_EVENT_READ) && !(ioflags & IO_INVIS)) { - bhv_vrwlock_t locktype = VRWLOCK_READ; + int iolock = XFS_IOLOCK_SHARED; int error; error = XFS_SEND_DATA(mp, DM_EVENT_READ, vp, *ppos, count, - FILP_DELAY_FLAG(infilp), &locktype); + FILP_DELAY_FLAG(infilp), &iolock); if (error) { xfs_iunlock(ip, XFS_IOLOCK_SHARED); return -error; @@ -330,11 +330,11 @@ xfs_splice_write( xfs_ilock(ip, XFS_IOLOCK_EXCL); if (DM_EVENT_ENABLED(ip, DM_EVENT_WRITE) && !(ioflags & IO_INVIS)) { - bhv_vrwlock_t locktype = VRWLOCK_WRITE; + int iolock = XFS_IOLOCK_EXCL; int error; error = XFS_SEND_DATA(mp, DM_EVENT_WRITE, vp, *ppos, count, - FILP_DELAY_FLAG(outfilp), &locktype); + FILP_DELAY_FLAG(outfilp), &iolock); if (error) { xfs_iunlock(ip, XFS_IOLOCK_EXCL); return -error; @@ -580,7 +580,6 @@ xfs_write( xfs_fsize_t isize, new_size; int iolock; int eventsent = 0; - bhv_vrwlock_t locktype; size_t ocount = 0, count; loff_t pos; int need_i_mutex; @@ -607,11 +606,9 @@ xfs_write( relock: if (ioflags & IO_ISDIRECT) { iolock = XFS_IOLOCK_SHARED; - locktype = VRWLOCK_WRITE_DIRECT; need_i_mutex = 0; } else { iolock = XFS_IOLOCK_EXCL; - locktype = VRWLOCK_WRITE; need_i_mutex = 1; mutex_lock(&inode->i_mutex); } @@ -635,8 +632,7 @@ start: xfs_iunlock(xip, XFS_ILOCK_EXCL); error = XFS_SEND_DATA(xip->i_mount, DM_EVENT_WRITE, vp, - pos, count, - dmflags, &locktype); + pos, count, dmflags, &iolock); if (error) { goto out_unlock_internal; } @@ -667,7 +663,6 @@ start: if (!need_i_mutex && (VN_CACHED(vp) || pos > xip->i_size)) { xfs_iunlock(xip, XFS_ILOCK_EXCL|iolock); iolock = XFS_IOLOCK_EXCL; - locktype = VRWLOCK_WRITE; need_i_mutex = 1; mutex_lock(&inode->i_mutex); xfs_ilock(xip, XFS_ILOCK_EXCL|iolock); @@ -744,7 +739,6 @@ retry: mutex_unlock(&inode->i_mutex); iolock = XFS_IOLOCK_SHARED; - locktype = VRWLOCK_WRITE_DIRECT; 
need_i_mutex = 0; } @@ -781,7 +775,7 @@ retry: if (ret == -ENOSPC && DM_EVENT_ENABLED(xip, DM_EVENT_NOSPACE) && !(ioflags & IO_INVIS)) { - xfs_rwunlock(xip, locktype); + xfs_iunlock(xip, iolock); if (need_i_mutex) mutex_unlock(&inode->i_mutex); error = XFS_SEND_NAMESP(xip->i_mount, DM_EVENT_NOSPACE, vp, @@ -789,7 +783,7 @@ retry: 0, 0, 0); /* Delay flag intentionally unused */ if (need_i_mutex) mutex_lock(&inode->i_mutex); - xfs_rwlock(xip, locktype); + xfs_ilock(xip, iolock); if (error) goto out_unlock_internal; pos = xip->i_size; @@ -817,7 +811,8 @@ retry: /* Handle various SYNC-type writes */ if ((file->f_flags & O_SYNC) || IS_SYNC(inode)) { int error2; - xfs_rwunlock(xip, locktype); + + xfs_iunlock(xip, iolock); if (need_i_mutex) mutex_unlock(&inode->i_mutex); error2 = sync_page_range(inode, mapping, pos, ret); @@ -825,7 +820,7 @@ retry: error = error2; if (need_i_mutex) mutex_lock(&inode->i_mutex); - xfs_rwlock(xip, locktype); + xfs_ilock(xip, iolock); error2 = xfs_write_sync_logforce(mp, xip); if (!error) error = error2; @@ -846,7 +841,7 @@ retry: xip->i_d.di_size = xip->i_size; xfs_iunlock(xip, XFS_ILOCK_EXCL); } - xfs_rwunlock(xip, locktype); + xfs_iunlock(xip, iolock); out_unlock_mutex: if (need_i_mutex) mutex_unlock(&inode->i_mutex); Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_vnode.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_vnode.h 2008-02-14 17:15:53.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_vnode.h 2008-02-14 17:15:57.000000000 +0100 @@ -46,18 +46,6 @@ static inline struct inode *vn_to_inode( } /* - * Values for the vop_rwlock/rwunlock flags parameter. - */ -typedef enum bhv_vrwlock { - VRWLOCK_NONE, - VRWLOCK_READ, - VRWLOCK_WRITE, - VRWLOCK_WRITE_DIRECT, - VRWLOCK_TRY_READ, - VRWLOCK_TRY_WRITE -} bhv_vrwlock_t; - -/* * Return values for xfs_inactive. 
A return value of * VN_INACTIVE_NOCACHE implies that the file system behavior * has disassociated its state and bhv_desc_t from the vnode. Index: linux-2.6-xfs/fs/xfs/xfs_mount.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_mount.h 2008-02-14 17:15:50.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_mount.h 2008-02-14 17:15:57.000000000 +0100 @@ -67,7 +67,7 @@ struct xfs_mru_cache; */ typedef int (*xfs_send_data_t)(int, bhv_vnode_t *, - xfs_off_t, size_t, int, bhv_vrwlock_t *); + xfs_off_t, size_t, int, int *); typedef int (*xfs_send_mmap_t)(struct vm_area_struct *, uint); typedef int (*xfs_send_destroy_t)(bhv_vnode_t *, dm_right_t); typedef int (*xfs_send_namesp_t)(dm_eventtype_t, struct xfs_mount *, Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.c 2008-02-14 17:15:53.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.c 2008-02-14 17:16:23.000000000 +0100 @@ -3375,47 +3375,6 @@ std_return: } int -xfs_rwlock( - xfs_inode_t *ip, - bhv_vrwlock_t locktype) -{ - if (S_ISDIR(ip->i_d.di_mode)) - return 1; - if (locktype == VRWLOCK_WRITE) { - xfs_ilock(ip, XFS_IOLOCK_EXCL); - } else if (locktype == VRWLOCK_TRY_READ) { - return xfs_ilock_nowait(ip, XFS_IOLOCK_SHARED); - } else if (locktype == VRWLOCK_TRY_WRITE) { - return xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL); - } else { - ASSERT((locktype == VRWLOCK_READ) || - (locktype == VRWLOCK_WRITE_DIRECT)); - xfs_ilock(ip, XFS_IOLOCK_SHARED); - } - - return 1; -} - - -void -xfs_rwunlock( - xfs_inode_t *ip, - bhv_vrwlock_t locktype) -{ - if (S_ISDIR(ip->i_d.di_mode)) - return; - if (locktype == VRWLOCK_WRITE) { - xfs_iunlock(ip, XFS_IOLOCK_EXCL); - } else { - ASSERT((locktype == VRWLOCK_READ) || - (locktype == VRWLOCK_WRITE_DIRECT)); - xfs_iunlock(ip, XFS_IOLOCK_SHARED); - } - return; -} - - -int xfs_inode_flush( xfs_inode_t *ip, int flags) Index: 
linux-2.6-xfs/fs/xfs/xfs_vnodeops.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.h 2008-02-08 05:20:52.000000000 +0100 +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.h 2008-02-14 17:15:57.000000000 +0100 @@ -38,8 +38,6 @@ int xfs_readdir(struct xfs_inode *dp, vo int xfs_symlink(struct xfs_inode *dp, bhv_vname_t *dentry, char *target_path, mode_t mode, bhv_vnode_t **vpp, struct cred *credp); -int xfs_rwlock(struct xfs_inode *ip, bhv_vrwlock_t locktype); -void xfs_rwunlock(struct xfs_inode *ip, bhv_vrwlock_t locktype); int xfs_inode_flush(struct xfs_inode *ip, int flags); int xfs_set_dmattrs(struct xfs_inode *ip, u_int evmask, u_int16_t state); int xfs_reclaim(struct xfs_inode *ip); From owner-xfs@oss.sgi.com Thu Feb 14 15:45:41 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 15:45:46 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1ENjaA4002817 for ; Thu, 14 Feb 2008 15:45:40 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA03669; Fri, 15 Feb 2008 10:46:01 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1ENk0LF63839974; Fri, 15 Feb 2008 10:46:00 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1ENjxJo63635754; Fri, 15 Feb 2008 10:45:59 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 15 Feb 2008 10:45:59 +1100 From: David Chinner To: David Chinner Cc: Timothy Shimmin , xfs-dev , 
xfs-oss Subject: Re: [patch] Prevent AIL lock contention during transaction completion Message-ID: <20080214234559.GO155259@sgi.com> References: <20080121052330.GG155259@sgi.com> <4796E8C8.3030702@sgi.com> <20080123073446.GU155259@sgi.com> <479986F5.7070800@sgi.com> <20080125074235.GI155407@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080125074235.GI155407@sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5819/Thu Feb 14 14:29:43 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14435 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Fri, Jan 25, 2008 at 06:42:35PM +1100, David Chinner wrote: > On Fri, Jan 25, 2008 at 05:51:33PM +1100, Timothy Shimmin wrote: > > So do we really need to call xlog_assign_tail_lsn() then? > > Or are we just being conservative in case we missed something? > > Conservative - the last thing I want is to introduce a subtle > difference to the tail lsn in the log record because we didn't > update it immediately before writing it to disk. I think we are > probably safe removing it, but let's leave that until we've got some > wider test coverage on this change first.... Tim - did you finish the review of this? Testing on the 2048p machine appears to have been successful, so I'm just waiting on review ACKs now.... Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 14 15:47:39 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 15:47:42 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1ENlZtA003042 for ; Thu, 14 Feb 2008 15:47:37 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA03777; Fri, 15 Feb 2008 10:48:00 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1ENlxLF63091283; Fri, 15 Feb 2008 10:47:59 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1ENlwCf63851209; Fri, 15 Feb 2008 10:47:58 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 15 Feb 2008 10:47:58 +1100 From: David Chinner To: Timothy Shimmin Cc: David Chinner , xfs-dev , xfs-oss Subject: Re: [patch] Use atomics for iclog reference counting Message-ID: <20080214234758.GP155259@sgi.com> References: <20080121053021.GH155259@sgi.com> <4796CCF5.8010509@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <4796CCF5.8010509@sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5819/Thu Feb 14 14:29:43 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14436 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Wed, Jan 23, 2008 
at 04:13:25PM +1100, Timothy Shimmin wrote: > I'll have a look... Tim, have you had a chance to look at this one yet? I'd like to push this too, but I understand you are kinda busy right now :/ Cheers, Dave. > David Chinner wrote: > >Now that we update the log tail LSN less frequently on > >transaction completion, we pass the contention straight to > >the global block stat lock (l_iclog_lock) during transaction > >completion. > > > >We currently have to take this lock to decrement the iclog > >reference count. There is a reference count on each iclog, > >so we need to take the global lock for all refcount changes. > > > >When large numbers of processes are all doing small transactions, > >the iclog reference counts will be quite high, and the state change > >that absolutely requires the l_iclog_lock is the exception rather than > >the norm. > > > >Change the reference counting on the iclogs to use atomic_inc/dec > >so that we can use atomic_dec_and_lock during transaction completion > >and avoid the need for grabbing the l_iclog_lock for every reference > >count decrement except the one that matters - the last. 
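The "only the last decrement takes the lock" idea described above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel implementation: demo_iclog, demo_lock, demo_dec_and_lock() and demo_release() are hypothetical names, a C11 atomic_flag spinlock stands in for the l_icloglock spinlock, and demo_dec_and_lock() only approximates the semantics of the kernel's atomic_dec_and_lock() (decrement, and return true with the lock held only when the count reaches zero).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an iclog: only the fields the sketch needs. */
struct demo_iclog {
	atomic_int refcnt;	/* plays the role of ic_refcnt */
	int want_sync;		/* plays the role of XLOG_STATE_WANT_SYNC */
};

/* Hypothetical global lock, standing in for the iclog spinlock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void demo_spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set(l))
		;	/* spin until the flag was previously clear */
}

static void demo_spin_unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}

/*
 * Approximation of atomic_dec_and_lock(): drop one reference and
 * return true, with the lock held, only if the count reached zero.
 */
static bool demo_dec_and_lock(atomic_int *cnt, atomic_flag *lock)
{
	int old = atomic_load(cnt);

	/* Fast path: not the last reference, so no lock is needed. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
		/* CAS failure reloaded 'old'; re-check the loop condition. */
	}

	/* Slow path: possibly the last reference, decide under the lock. */
	demo_spin_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;	/* caller must unlock */
	demo_spin_unlock(lock);
	return false;
}

/* Returns 1 if this call dropped the last reference and a sync is wanted. */
static int demo_release(struct demo_iclog *ic)
{
	int sync = 0;

	assert(atomic_load(&ic->refcnt) > 0);
	if (!demo_dec_and_lock(&ic->refcnt, &demo_lock))
		return 0;	/* common case: lock never touched */

	if (ic->want_sync)
		sync = 1;
	demo_spin_unlock(&demo_lock);
	return sync;
}
```

With a reference count of 2, the first demo_release() call returns 0 without touching the lock; only the second call, which drops the count to zero, takes the lock and examines the state.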
> > > >Signed-off-by: Dave Chinner > >--- > > fs/xfs/xfs_log.c | 36 +++++++++++++++++++++--------------- > > fs/xfs/xfs_log_priv.h | 2 +- > > fs/xfs/xfsidbg.c | 2 +- > > 3 files changed, 23 insertions(+), 17 deletions(-) > > > >Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c > >=================================================================== > >--- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2008-01-21 > >16:16:51.804146394 +1100 > >+++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2008-01-21 16:23:35.369691221 +1100 > >@@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > > > > spin_lock(&log->l_icloglock); > > iclog = log->l_iclog; > >- iclog->ic_refcnt++; > >+ atomic_inc(&iclog->ic_refcnt); > > spin_unlock(&log->l_icloglock); > > xlog_state_want_sync(log, iclog); > > (void) xlog_state_release_iclog(log, iclog); > >@@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > > */ > > spin_lock(&log->l_icloglock); > > iclog = log->l_iclog; > >- iclog->ic_refcnt++; > >+ atomic_inc(&iclog->ic_refcnt); > > spin_unlock(&log->l_icloglock); > > > > xlog_state_want_sync(log, iclog); > >@@ -1405,7 +1405,7 @@ xlog_sync(xlog_t *log, > > int v2 = XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb); > > > > XFS_STATS_INC(xs_log_writes); > >- ASSERT(iclog->ic_refcnt == 0); > >+ ASSERT(atomic_read(&iclog->ic_refcnt) == 0); > > > > /* Add for LR header */ > > count_init = log->l_iclog_hsize + iclog->ic_offset; > >@@ -2311,7 +2311,7 @@ xlog_state_done_syncing( > > > > ASSERT(iclog->ic_state == XLOG_STATE_SYNCING || > > iclog->ic_state == XLOG_STATE_IOERROR); > >- ASSERT(iclog->ic_refcnt == 0); > >+ ASSERT(atomic_read(&iclog->ic_refcnt) == 0); > > ASSERT(iclog->ic_bwritecnt == 1 || iclog->ic_bwritecnt == 2); > > > > > >@@ -2393,7 +2393,7 @@ restart: > > ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE); > > head = &iclog->ic_header; > > > >- iclog->ic_refcnt++; /* prevents sync */ > >+ atomic_inc(&iclog->ic_refcnt); /* prevents sync */ > > log_offset = iclog->ic_offset; > > > > /* On the 1st write to an iclog, figure 
out lsn. This works > >@@ -2425,12 +2425,12 @@ restart: > > xlog_state_switch_iclogs(log, iclog, iclog->ic_size); > > > > /* If I'm the only one writing to this iclog, sync it to > > disk */ > >- if (iclog->ic_refcnt == 1) { > >+ if (atomic_read(&iclog->ic_refcnt) == 1) { > > spin_unlock(&log->l_icloglock); > > if ((error = xlog_state_release_iclog(log, iclog))) > > return error; > > } else { > >- iclog->ic_refcnt--; > >+ atomic_dec(&iclog->ic_refcnt); > > spin_unlock(&log->l_icloglock); > > } > > goto restart; > >@@ -2821,18 +2821,23 @@ xlog_state_release_iclog( > > { > > int sync = 0; /* do we sync? */ > > > >- spin_lock(&log->l_icloglock); > > if (iclog->ic_state & XLOG_STATE_IOERROR) { > > spin_unlock(&log->l_icloglock); > > return XFS_ERROR(EIO); > > } > > > >- ASSERT(iclog->ic_refcnt > 0); > >+ ASSERT(atomic_read(&iclog->ic_refcnt) > 0); > >+ if (!atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) > >+ return 0; > >+ > >+ if (iclog->ic_state & XLOG_STATE_IOERROR) { > >+ spin_unlock(&log->l_icloglock); > >+ return XFS_ERROR(EIO); > >+ } > > ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE || > > iclog->ic_state == XLOG_STATE_WANT_SYNC); > > > >- if (--iclog->ic_refcnt == 0 && > >- iclog->ic_state == XLOG_STATE_WANT_SYNC) { > >+ if (iclog->ic_state == XLOG_STATE_WANT_SYNC) { > > /* update tail before writing to iclog */ > > xlog_assign_tail_lsn(log->l_mp); > > sync++; > >@@ -2952,7 +2957,8 @@ xlog_state_sync_all(xlog_t *log, uint fl > > * previous iclog and go to sleep. 
> > */ > > if (iclog->ic_state == XLOG_STATE_DIRTY || > >- (iclog->ic_refcnt == 0 && iclog->ic_offset == 0)) { > >+ (atomic_read(&iclog->ic_refcnt) == 0 > >+ && iclog->ic_offset == 0)) { > > iclog = iclog->ic_prev; > > if (iclog->ic_state == XLOG_STATE_ACTIVE || > > iclog->ic_state == XLOG_STATE_DIRTY) > >@@ -2960,14 +2966,14 @@ xlog_state_sync_all(xlog_t *log, uint fl > > else > > goto maybe_sleep; > > } else { > >- if (iclog->ic_refcnt == 0) { > >+ if (atomic_read(&iclog->ic_refcnt) == 0) { > > /* We are the only one with access to this > > * iclog. Flush it out now. There should > > * be a roundoff of zero to show that someone > > * has already taken care of the roundoff > > from > > * the previous sync. > > */ > >- iclog->ic_refcnt++; > >+ atomic_inc(&iclog->ic_refcnt); > > lsn = be64_to_cpu(iclog->ic_header.h_lsn); > > xlog_state_switch_iclogs(log, iclog, 0); > > spin_unlock(&log->l_icloglock); > >@@ -3099,7 +3105,7 @@ try_again: > > already_slept = 1; > > goto try_again; > > } else { > >- iclog->ic_refcnt++; > >+ atomic_inc(&iclog->ic_refcnt); > > xlog_state_switch_iclogs(log, iclog, 0); > > spin_unlock(&log->l_icloglock); > > if (xlog_state_release_iclog(log, iclog)) > >Index: 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h > >=================================================================== > >--- 2.6.x-xfs-new.orig/fs/xfs/xfs_log_priv.h 2008-01-21 > >16:06:27.127557437 +1100 > >+++ 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h 2008-01-21 > >16:23:35.369691221 +1100 > >@@ -339,7 +339,7 @@ typedef struct xlog_iclog_fields { > > #endif > > int ic_size; > > int ic_offset; > >- int ic_refcnt; > >+ atomic_t ic_refcnt; > > int ic_bwritecnt; > > ushort_t ic_state; > > char *ic_datap; /* pointer to iclog data */ > >Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c > >=================================================================== > >--- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c 2008-01-21 > >16:06:27.127557437 +1100 > >+++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c 2008-01-21 16:23:35.385689220 +1100 > >@@ 
-5633,7 +5633,7 @@ xfsidbg_xiclog(xlog_in_core_t *iclog) > > #else > > NULL, > > #endif > >- iclog->ic_refcnt, iclog->ic_bwritecnt); > >+ atomic_read(&iclog->ic_refcnt), iclog->ic_bwritecnt); > > if (iclog->ic_state & XLOG_STATE_ALL) > > printflags(iclog->ic_state, ic_flags, " state:"); > > else > -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 14 15:52:10 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 15:52:13 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1ENq5Mt003602 for ; Thu, 14 Feb 2008 15:52:07 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA04006; Fri, 15 Feb 2008 10:52:25 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1ENqOLF63838836; Fri, 15 Feb 2008 10:52:25 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1ENqNAO63802828; Fri, 15 Feb 2008 10:52:23 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 15 Feb 2008 10:52:23 +1100 From: David Chinner To: xfs-dev Cc: xfs-oss Subject: [patch] remove icluster V2 Message-ID: <20080214235223.GQ155259@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5819/Thu Feb 14 14:29:43 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14437 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Remove the xfs_icluster structure and replace it with a radix tree lookup. We don't need to keep a list of inodes in each cluster around anymore as we can look them up quickly when we need to. The only time we need to do this now is during inode writeback. Factor the inode cluster writeback code out of xfs_iflush and convert it to use radix_tree_gang_lookup() instead of walking a list of inodes built when we first read in the inodes. This removes 3 pointers from each xfs_inode structure and the xfs_icluster structure per inode cluster. Hence we reduce the cache footprint of the xfs_inodes by between 5-10% depending on cluster sparseness. To be truly efficient we need a radix_tree_gang_lookup_range() call to stop searching once we are past the end of the cluster instead of trying to find a full cluster's worth of inodes. Before (ia64): $ cat /sys/slab/xfs_inode/object_size 536 After: $ cat /sys/slab/xfs_inode/object_size 512 Version 2: o fix bad inode list loop iteration on inodes returned by radix_tree_gang_lookup() in xfs_iflush_cluster Signed-off-by: Dave Chinner --- fs/xfs/linux-2.6/xfs_ksyms.c | 1 fs/xfs/xfs_iget.c | 49 ------- fs/xfs/xfs_inode.c | 268 ++++++++++++++++++++++++------------------- fs/xfs/xfs_inode.h | 16 -- fs/xfs/xfs_vfsops.c | 5 fs/xfs/xfsidbg.c | 4 6 files changed, 156 insertions(+), 187 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_iget.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_iget.c 2008-02-08 15:17:49.321523070 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_iget.c 2008-02-12 11:09:45.116925025 +1100 @@ -78,7 +78,6 @@ xfs_iget_core( xfs_inode_t *ip; xfs_inode_t *iq; int error; - xfs_icluster_t *icl, *new_icl = NULL; unsigned long first_index, mask; xfs_perag_t *pag; xfs_agino_t agino; @@ -229,11 +228,9 @@ finish_inode: } /* - * This is a bit messy - we preallocate everything we _might_ - * need before we pick up 
the ici lock. That way we don't have to - * juggle locks and go all the way back to the start. + * Preload the radix tree so we can insert safely under the + * write spinlock. */ - new_icl = kmem_zone_alloc(xfs_icluster_zone, KM_SLEEP); if (radix_tree_preload(GFP_KERNEL)) { delay(1); goto again; @@ -241,17 +238,6 @@ finish_inode: mask = ~(((XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_inodelog)) - 1); first_index = agino & mask; write_lock(&pag->pag_ici_lock); - - /* - * Find the cluster if it exists - */ - icl = NULL; - if (radix_tree_gang_lookup(&pag->pag_ici_root, (void**)&iq, - first_index, 1)) { - if ((XFS_INO_TO_AGINO(mp, iq->i_ino) & mask) == first_index) - icl = iq->i_cluster; - } - /* * insert the new inode */ @@ -266,30 +252,13 @@ finish_inode: } /* - * These values _must_ be set before releasing ihlock! + * These values _must_ be set before releasing the radix tree lock! */ ip->i_udquot = ip->i_gdquot = NULL; xfs_iflags_set(ip, XFS_INEW); - ASSERT(ip->i_cluster == NULL); - - if (!icl) { - spin_lock_init(&new_icl->icl_lock); - INIT_HLIST_HEAD(&new_icl->icl_inodes); - icl = new_icl; - new_icl = NULL; - } else { - ASSERT(!hlist_empty(&icl->icl_inodes)); - } - spin_lock(&icl->icl_lock); - hlist_add_head(&ip->i_cnode, &icl->icl_inodes); - ip->i_cluster = icl; - spin_unlock(&icl->icl_lock); - write_unlock(&pag->pag_ici_lock); radix_tree_preload_end(); - if (new_icl) - kmem_zone_free(xfs_icluster_zone, new_icl); /* * Link ip to its mount and thread it on the mount's inode list. @@ -528,18 +497,6 @@ xfs_iextract( xfs_put_perag(mp, pag); /* - * Remove from cluster list - */ - mp = ip->i_mount; - spin_lock(&ip->i_cluster->icl_lock); - hlist_del(&ip->i_cnode); - spin_unlock(&ip->i_cluster->icl_lock); - - /* was last inode in cluster? */ - if (hlist_empty(&ip->i_cluster->icl_inodes)) - kmem_zone_free(xfs_icluster_zone, ip->i_cluster); - - /* * Remove from mount's inode list. 
*/ XFS_MOUNT_ILOCK(mp); Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.c 2008-02-08 15:17:49.321523070 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.c 2008-02-12 11:09:28.499082126 +1100 @@ -55,7 +55,6 @@ kmem_zone_t *xfs_ifork_zone; kmem_zone_t *xfs_inode_zone; -kmem_zone_t *xfs_icluster_zone; /* * Used in xfs_itruncate(). This is the maximum number of extents @@ -2994,6 +2993,153 @@ xfs_iflush_fork( return 0; } +STATIC int +xfs_iflush_cluster( + xfs_inode_t *ip, + xfs_buf_t *bp) +{ + xfs_mount_t *mp = ip->i_mount; + xfs_perag_t *pag = xfs_get_perag(mp, ip->i_ino); + unsigned long first_index, mask; + int ilist_size; + xfs_inode_t **ilist; + xfs_inode_t *iq; + xfs_inode_log_item_t *iip; + int nr_found; + int clcount = 0; + int bufwasdelwri; + int i; + + ASSERT(pag->pagi_inodeok); + ASSERT(pag->pag_ici_init); + + ilist_size = XFS_INODE_CLUSTER_SIZE(mp) * sizeof(xfs_inode_t *); + ilist = kmem_alloc(ilist_size, KM_MAYFAIL); + if (!ilist) + return 0; + + mask = ~(((XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_inodelog)) - 1); + first_index = XFS_INO_TO_AGINO(mp, ip->i_ino) & mask; + read_lock(&pag->pag_ici_lock); + /* really need a gang lookup range call here */ + nr_found = radix_tree_gang_lookup(&pag->pag_ici_root, (void**)ilist, + first_index, + XFS_INODE_CLUSTER_SIZE(mp)); + if (nr_found == 0) + goto out_free; + + for (i = 0; i < nr_found; i++) { + iq = ilist[i]; + if (iq == ip) + continue; + /* if the inode lies outside this cluster, we're done. */ + if ((XFS_INO_TO_AGINO(mp, iq->i_ino) & mask) != first_index) + break; + /* + * Do an un-protected check to see if the inode is dirty and + * is a candidate for flushing. These checks will be repeated + * later after the appropriate locks are acquired. 
+ */ + iip = iq->i_itemp; + if ((iq->i_update_core == 0) && + ((iip == NULL) || + !(iip->ili_format.ilf_fields & XFS_ILOG_ALL)) && + xfs_ipincount(iq) == 0) { + continue; + } + + /* + * Try to get locks. If any are unavailable or it is pinned, + * then this inode cannot be flushed and is skipped. + */ + + if (!xfs_ilock_nowait(iq, XFS_ILOCK_SHARED)) + continue; + if (!xfs_iflock_nowait(iq)) { + xfs_iunlock(iq, XFS_ILOCK_SHARED); + continue; + } + if (xfs_ipincount(iq)) { + xfs_ifunlock(iq); + xfs_iunlock(iq, XFS_ILOCK_SHARED); + continue; + } + + /* + * arriving here means that this inode can be flushed. First + * re-check that it's dirty before flushing. + */ + iip = iq->i_itemp; + if ((iq->i_update_core != 0) || ((iip != NULL) && + (iip->ili_format.ilf_fields & XFS_ILOG_ALL))) { + int error; + error = xfs_iflush_int(iq, bp); + if (error) { + xfs_iunlock(iq, XFS_ILOCK_SHARED); + goto cluster_corrupt_out; + } + clcount++; + } else { + xfs_ifunlock(iq); + } + xfs_iunlock(iq, XFS_ILOCK_SHARED); + } + + if (clcount) { + XFS_STATS_INC(xs_icluster_flushcnt); + XFS_STATS_ADD(xs_icluster_flushinode, clcount); + } + +out_free: + read_unlock(&pag->pag_ici_lock); + kmem_free(ilist, ilist_size); + return 0; + + +cluster_corrupt_out: + /* + * Corruption detected in the clustering loop. Invalidate the + * inode buffer and shut down the filesystem. + */ + read_unlock(&pag->pag_ici_lock); + /* + * Clean up the buffer. If it was B_DELWRI, just release it -- + * brelse can handle it with no problems. If not, shut down the + * filesystem before releasing the buffer. + */ + bufwasdelwri = XFS_BUF_ISDELAYWRITE(bp); + if (bufwasdelwri) + xfs_buf_relse(bp); + + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + + if (!bufwasdelwri) { + /* + * Just like incore_relse: if we have b_iodone functions, + * mark the buffer as an error and call them. Otherwise + * mark it as stale and brelse. 
+ */ + if (XFS_BUF_IODONE_FUNC(bp)) { + XFS_BUF_CLR_BDSTRAT_FUNC(bp); + XFS_BUF_UNDONE(bp); + XFS_BUF_STALE(bp); + XFS_BUF_SHUT(bp); + XFS_BUF_ERROR(bp,EIO); + xfs_biodone(bp); + } else { + XFS_BUF_STALE(bp); + xfs_buf_relse(bp); + } + } + + /* + * Unlocks the flush lock + */ + xfs_iflush_abort(iq); + kmem_free(ilist, ilist_size); + return XFS_ERROR(EFSCORRUPTED); +} + /* * xfs_iflush() will write a modified inode's changes out to the * inode's on disk home. The caller must have the inode lock held @@ -3013,13 +3159,8 @@ xfs_iflush( xfs_dinode_t *dip; xfs_mount_t *mp; int error; - /* REFERENCED */ - xfs_inode_t *iq; - int clcount; /* count of inodes clustered */ - int bufwasdelwri; - struct hlist_node *entry; - enum { INT_DELWRI = (1 << 0), INT_ASYNC = (1 << 1) }; int noblock = (flags == XFS_IFLUSH_ASYNC_NOBLOCK); + enum { INT_DELWRI = (1 << 0), INT_ASYNC = (1 << 1) }; XFS_STATS_INC(xs_iflush_count); @@ -3138,9 +3279,8 @@ xfs_iflush( * First flush out the inode that xfs_iflush was called with. */ error = xfs_iflush_int(ip, bp); - if (error) { + if (error) goto corrupt_out; - } /* * If the buffer is pinned then push on the log now so we won't @@ -3153,70 +3293,9 @@ xfs_iflush( * inode clustering: * see if other inodes can be gathered into this write */ - spin_lock(&ip->i_cluster->icl_lock); - ip->i_cluster->icl_buf = bp; - - clcount = 0; - hlist_for_each_entry(iq, entry, &ip->i_cluster->icl_inodes, i_cnode) { - if (iq == ip) - continue; - - /* - * Do an un-protected check to see if the inode is dirty and - * is a candidate for flushing. These checks will be repeated - * later after the appropriate locks are acquired. - */ - iip = iq->i_itemp; - if ((iq->i_update_core == 0) && - ((iip == NULL) || - !(iip->ili_format.ilf_fields & XFS_ILOG_ALL)) && - xfs_ipincount(iq) == 0) { - continue; - } - - /* - * Try to get locks. If any are unavailable, - * then this inode cannot be flushed and is skipped. 
- */ - - /* get inode locks (just i_lock) */ - if (xfs_ilock_nowait(iq, XFS_ILOCK_SHARED)) { - /* get inode flush lock */ - if (xfs_iflock_nowait(iq)) { - /* check if pinned */ - if (xfs_ipincount(iq) == 0) { - /* arriving here means that - * this inode can be flushed. - * first re-check that it's - * dirty - */ - iip = iq->i_itemp; - if ((iq->i_update_core != 0)|| - ((iip != NULL) && - (iip->ili_format.ilf_fields & XFS_ILOG_ALL))) { - clcount++; - error = xfs_iflush_int(iq, bp); - if (error) { - xfs_iunlock(iq, - XFS_ILOCK_SHARED); - goto cluster_corrupt_out; - } - } else { - xfs_ifunlock(iq); - } - } else { - xfs_ifunlock(iq); - } - } - xfs_iunlock(iq, XFS_ILOCK_SHARED); - } - } - spin_unlock(&ip->i_cluster->icl_lock); - - if (clcount) { - XFS_STATS_INC(xs_icluster_flushcnt); - XFS_STATS_ADD(xs_icluster_flushinode, clcount); - } + error = xfs_iflush_cluster(ip, bp); + if (error) + goto cluster_corrupt_out; if (flags & INT_DELWRI) { xfs_bdwrite(mp, bp); @@ -3230,52 +3309,11 @@ xfs_iflush( corrupt_out: xfs_buf_relse(bp); xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); - xfs_iflush_abort(ip); - /* - * Unlocks the flush lock - */ - return XFS_ERROR(EFSCORRUPTED); - cluster_corrupt_out: - /* Corruption detected in the clustering loop. Invalidate the - * inode buffer and shut down the filesystem. - */ - spin_unlock(&ip->i_cluster->icl_lock); - - /* - * Clean up the buffer. If it was B_DELWRI, just release it -- - * brelse can handle it with no problems. If not, shut down the - * filesystem before releasing the buffer. - */ - if ((bufwasdelwri= XFS_BUF_ISDELAYWRITE(bp))) { - xfs_buf_relse(bp); - } - - xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); - - if(!bufwasdelwri) { - /* - * Just like incore_relse: if we have b_iodone functions, - * mark the buffer as an error and call them. Otherwise - * mark it as stale and brelse. 
- */ - if (XFS_BUF_IODONE_FUNC(bp)) { - XFS_BUF_CLR_BDSTRAT_FUNC(bp); - XFS_BUF_UNDONE(bp); - XFS_BUF_STALE(bp); - XFS_BUF_SHUT(bp); - XFS_BUF_ERROR(bp,EIO); - xfs_biodone(bp); - } else { - XFS_BUF_STALE(bp); - xfs_buf_relse(bp); - } - } - - xfs_iflush_abort(iq); /* * Unlocks the flush lock */ + xfs_iflush_abort(ip); return XFS_ERROR(EFSCORRUPTED); } Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.h 2008-02-08 15:17:49.321523070 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.h 2008-02-08 16:00:31.062714635 +1100 @@ -133,19 +133,6 @@ typedef struct dm_attrs_s { } dm_attrs_t; /* - * This is the xfs inode cluster structure. This structure is used by - * xfs_iflush to find inodes that share a cluster and can be flushed to disk at - * the same time. - */ -typedef struct xfs_icluster { - struct hlist_head icl_inodes; /* list of inodes on cluster */ - xfs_daddr_t icl_blkno; /* starting block number of - * the cluster */ - struct xfs_buf *icl_buf; /* the inode buffer */ - spinlock_t icl_lock; /* inode list lock */ -} xfs_icluster_t; - -/* * This is the xfs in-core inode structure. * Most of the on-disk inode is embedded in the i_d field. 
* @@ -252,8 +239,6 @@ typedef struct xfs_inode { unsigned int i_delayed_blks; /* count of delay alloc blks */ xfs_icdinode_t i_d; /* most of ondisk inode */ - xfs_icluster_t *i_cluster; /* cluster list header */ - struct hlist_node i_cnode; /* cluster link node */ xfs_fsize_t i_size; /* in-memory size */ xfs_fsize_t i_new_size; /* size when write completes */ @@ -598,7 +583,6 @@ void xfs_inobp_check(struct xfs_mount * #define xfs_inobp_check(mp, bp) #endif /* DEBUG */ -extern struct kmem_zone *xfs_icluster_zone; extern struct kmem_zone *xfs_ifork_zone; extern struct kmem_zone *xfs_inode_zone; extern struct kmem_zone *xfs_ili_zone; Index: 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vfsops.c 2008-02-08 15:17:49.325522555 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c 2008-02-08 16:00:31.062714635 +1100 @@ -113,9 +113,6 @@ xfs_init(void) xfs_ili_zone = kmem_zone_init_flags(sizeof(xfs_inode_log_item_t), "xfs_ili", KM_ZONE_SPREAD, NULL); - xfs_icluster_zone = - kmem_zone_init_flags(sizeof(xfs_icluster_t), "xfs_icluster", - KM_ZONE_SPREAD, NULL); /* * Allocate global trace buffers. 
@@ -153,7 +150,6 @@ xfs_cleanup(void) extern kmem_zone_t *xfs_inode_zone; extern kmem_zone_t *xfs_efd_zone; extern kmem_zone_t *xfs_efi_zone; - extern kmem_zone_t *xfs_icluster_zone; xfs_cleanup_procfs(); xfs_sysctl_unregister(); @@ -189,7 +185,6 @@ xfs_cleanup(void) kmem_zone_destroy(xfs_efi_zone); kmem_zone_destroy(xfs_ifork_zone); kmem_zone_destroy(xfs_ili_zone); - kmem_zone_destroy(xfs_icluster_zone); } /* Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_ksyms.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-08 15:17:49.465504551 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_ksyms.c 2008-02-08 16:00:31.078712575 +1100 @@ -211,7 +211,6 @@ EXPORT_SYMBOL(xfs_bulkstat); EXPORT_SYMBOL(xfs_bunmapi); EXPORT_SYMBOL(xfs_bwrite); EXPORT_SYMBOL(xfs_change_file_space); -EXPORT_SYMBOL(xfs_icluster_zone); EXPORT_SYMBOL(xfs_dev_is_read_only); EXPORT_SYMBOL(xfs_dir_ialloc); EXPORT_SYMBOL(xfs_error_report); Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c 2008-02-08 15:17:49.485501979 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c 2008-02-08 16:00:31.094710514 +1100 @@ -6465,10 +6465,6 @@ xfsidbg_xnode(xfs_inode_t *ip) qprintf(" dir trace 0x%p\n", ip->i_dir_trace); #endif kdb_printf("\n"); - kdb_printf("icluster 0x%p cnext 0x%p cprev 0x%p\n", - ip->i_cluster, - ip->i_cnode.next, - ip->i_cnode.pprev); xfs_xnode_fork("data", &ip->i_df); xfs_xnode_fork("attr", ip->i_afp); kdb_printf("\n"); -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 14 15:52:44 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 15:52:48 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com 
(larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1ENqfDa003713 for ; Thu, 14 Feb 2008 15:52:43 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA04028; Fri, 15 Feb 2008 10:53:01 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1ENr0LF62988587; Fri, 15 Feb 2008 10:53:01 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1ENqxep63852684; Fri, 15 Feb 2008 10:52:59 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 15 Feb 2008 10:52:59 +1100 From: David Chinner To: David Chinner Cc: xfs-dev , xfs-oss Subject: Re: [patch] Use xfs_inode_clean() in more places Message-ID: <20080214235259.GR155259@sgi.com> References: <20080121051647.GF155259@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080121051647.GF155259@sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5819/Thu Feb 14 14:29:43 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14439 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs ping? On Mon, Jan 21, 2008 at 04:16:47PM +1100, David Chinner wrote: > Use xfs_inode_clean() in more places. 
> > Version 2: > - remove eye-hurting STATIC_INLINE > - make check less verbose > > Signed-off-by: Dave Chinner > --- > fs/xfs/xfs_inode.c | 27 +++++---------------------- > fs/xfs/xfs_inode_item.h | 8 ++++++++ > fs/xfs/xfs_vnodeops.c | 4 +--- > 3 files changed, 14 insertions(+), 25 deletions(-) > > Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.c 2008-01-21 16:06:27.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.c 2008-01-21 16:08:47.893673473 +1100 > @@ -2118,13 +2118,6 @@ xfs_iunlink_remove( > return 0; > } > > -STATIC_INLINE int xfs_inode_clean(xfs_inode_t *ip) > -{ > - return (((ip->i_itemp == NULL) || > - !(ip->i_itemp->ili_format.ilf_fields & XFS_ILOG_ALL)) && > - (ip->i_update_core == 0)); > -} > - > STATIC void > xfs_ifree_cluster( > xfs_inode_t *free_ip, > @@ -3004,7 +2997,6 @@ xfs_iflush_cluster( > int ilist_size; > xfs_inode_t *ilist; > xfs_inode_t *iq; > - xfs_inode_log_item_t *iip; > int nr_found; > int clcount = 0; > int bufwasdelwri; > @@ -3038,13 +3030,8 @@ xfs_iflush_cluster( > * is a candidate for flushing. These checks will be repeated > * later after the appropriate locks are acquired. > */ > - iip = iq->i_itemp; > - if ((iq->i_update_core == 0) && > - ((iip == NULL) || > - !(iip->ili_format.ilf_fields & XFS_ILOG_ALL)) && > - xfs_ipincount(iq) == 0) { > + if (xfs_inode_clean(iq) && xfs_ipincount(iq) == 0) > continue; > - } > > /* > * Try to get locks. If any are unavailable or it is pinned, > @@ -3067,10 +3054,8 @@ xfs_iflush_cluster( > * arriving here means that this inode can be flushed. First > * re-check that it's dirty before flushing. 
> */ > - iip = iq->i_itemp; > - if ((iq->i_update_core != 0) || ((iip != NULL) && > - (iip->ili_format.ilf_fields & XFS_ILOG_ALL))) { > - int error; > + if (!xfs_inode_clean(iq)) { > + int error; > error = xfs_iflush_int(iq, bp); > if (error) { > xfs_iunlock(iq, XFS_ILOCK_SHARED); > @@ -3174,8 +3159,7 @@ xfs_iflush( > * If the inode isn't dirty, then just release the inode > * flush lock and do nothing. > */ > - if ((ip->i_update_core == 0) && > - ((iip == NULL) || !(iip->ili_format.ilf_fields & XFS_ILOG_ALL))) { > + if (xfs_inode_clean(ip)) { > ASSERT((iip != NULL) ? > !(iip->ili_item.li_flags & XFS_LI_IN_AIL) : 1); > xfs_ifunlock(ip); > @@ -3341,8 +3325,7 @@ xfs_iflush_int( > * If the inode isn't dirty, then just release the inode > * flush lock and do nothing. > */ > - if ((ip->i_update_core == 0) && > - ((iip == NULL) || !(iip->ili_format.ilf_fields & XFS_ILOG_ALL))) { > + if (xfs_inode_clean(ip)) { > xfs_ifunlock(ip); > return 0; > } > Index: 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vnodeops.c 2008-01-21 16:06:27.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c 2008-01-21 16:08:47.897672964 +1100 > @@ -3490,7 +3490,6 @@ xfs_inode_flush( > int flags) > { > xfs_mount_t *mp = ip->i_mount; > - xfs_inode_log_item_t *iip = ip->i_itemp; > int error = 0; > > if (XFS_FORCED_SHUTDOWN(mp)) > @@ -3500,8 +3499,7 @@ xfs_inode_flush( > * Bypass inodes which have already been cleaned by > * the inode flush clustering code inside xfs_iflush > */ > - if ((ip->i_update_core == 0) && > - ((iip == NULL) || !(iip->ili_format.ilf_fields & XFS_ILOG_ALL))) > + if (xfs_inode_clean(ip)) > return 0; > > /* > Index: 2.6.x-xfs-new/fs/xfs/xfs_inode_item.h > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode_item.h 2008-01-21 16:06:27.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_inode_item.h 2008-01-21 16:14:34.001674576 
+1100 > @@ -168,6 +168,14 @@ static inline int xfs_ilog_fext(int w) > return (w == XFS_DATA_FORK ? XFS_ILOG_DEXT : XFS_ILOG_AEXT); > } > > +static inline int xfs_inode_clean(xfs_inode_t *ip) > +{ > + return (!ip->i_itemp || > + !(ip->i_itemp->ili_format.ilf_fields & XFS_ILOG_ALL)) && > + !ip->i_update_core; > +} > + > + > #ifdef __KERNEL__ > > extern void xfs_inode_item_init(struct xfs_inode *, struct xfs_mount *); -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 14 15:52:38 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 15:52:41 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1ENqVD8003674 for ; Thu, 14 Feb 2008 15:52:36 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA04015; Fri, 15 Feb 2008 10:52:52 +1100 Message-ID: <47B4D454.4010600@sgi.com> Date: Fri, 15 Feb 2008 10:52:52 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: David Chinner CC: xfs-dev , xfs-oss Subject: Re: [patch] Prevent AIL lock contention during transaction completion References: <20080121052330.GG155259@sgi.com> <4796E8C8.3030702@sgi.com> <20080123073446.GU155259@sgi.com> <479986F5.7070800@sgi.com> <20080125074235.GI155407@sgi.com> <20080214234559.GO155259@sgi.com> In-Reply-To: <20080214234559.GO155259@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5819/Thu Feb 14 14:29:43 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14438 X-ecartis-version: Ecartis 
v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs David Chinner wrote: > On Fri, Jan 25, 2008 at 06:42:35PM +1100, David Chinner wrote: >> On Fri, Jan 25, 2008 at 05:51:33PM +1100, Timothy Shimmin wrote: >>> So do we really need to call xlog_assign_tail_lsn() then? >>> Or are we just being conservative in case we missed something? >> Conservative - the last thing I want is to introduce a subtle >> difference to the tail lsn in the log record because we didn't >> update it immediately before writing it to disk. I think we are >> probably safe removing it, but lets leave that until we got some >> wider test coverage on this change first.... > > Tim - did you finish the review of this? Testing on the 2048p > machine appears to have been successful, so I'm just waiting > on review ACKs now.... > > Cheers, > > Dave. Yep, that was an ACK. --Tim From owner-xfs@oss.sgi.com Thu Feb 14 16:24:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 16:24:15 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F0O7bP010206 for ; Thu, 14 Feb 2008 16:24:10 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA05039; Fri, 15 Feb 2008 11:24:31 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1F0OULF63760634; Fri, 15 Feb 2008 11:24:31 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1F0OT1563830839; Fri, 15 Feb 2008 11:24:29 +1100 (AEDT) 
X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 15 Feb 2008 11:24:29 +1100 From: David Chinner To: David Chinner Cc: Timothy Shimmin , xfs-dev , xfs-oss Subject: Re: [patch] Use atomics for iclog reference counting Message-ID: <20080215002429.GT155259@sgi.com> References: <20080121053021.GH155259@sgi.com> <4796CCF5.8010509@sgi.com> <20080214234758.GP155259@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20080214234758.GP155259@sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5819/Thu Feb 14 14:29:43 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14440 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Fri, Feb 15, 2008 at 10:47:58AM +1100, David Chinner wrote: > On Wed, Jan 23, 2008 at 04:13:25PM +1100, Timothy Shimmin wrote: > > I'll have a look... > > Tim, have you had a chance to look at this one yet? I'd like to > push this too, but I understand you are kinda busy right now :/ FWIW, you might want to review this version ;) ---- Now that we update the log tail LSN less frequently on transaction completion, we pass the contention straight to the global log state lock (l_iclog_lock) during transaction completion. We currently have to take this lock to decrement the iclog reference count. There is a reference count on each iclog, so we need to take the global lock for all refcount changes. When large numbers of processes are all doing small transactions, the iclog reference counts will be quite high, and the state change that absolutely requires the l_iclog_lock is the exception rather than the norm.
Change the reference counting on the iclogs to use atomic_inc/dec so that we can use atomic_dec_and_lock during transaction completion and avoid the need for grabbing the l_iclog_lock for every reference count decrement except the one that matters - the last. Version 2: o remove spurious unlock in shutdown path in xlog_state_release_iclog() Signed-off-by: Dave Chinner --- fs/xfs/xfs_log.c | 36 ++++++++++++++++++++---------------- fs/xfs/xfs_log_priv.h | 2 +- fs/xfs/xfsidbg.c | 2 +- 3 files changed, 22 insertions(+), 18 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2008-02-15 11:19:08.076544539 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2008-02-15 11:20:22.558911855 +1100 @@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) spin_lock(&log->l_icloglock); iclog = log->l_iclog; - iclog->ic_refcnt++; + atomic_inc(&iclog->ic_refcnt); spin_unlock(&log->l_icloglock); xlog_state_want_sync(log, iclog); (void) xlog_state_release_iclog(log, iclog); @@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) */ spin_lock(&log->l_icloglock); iclog = log->l_iclog; - iclog->ic_refcnt++; + atomic_inc(&iclog->ic_refcnt); spin_unlock(&log->l_icloglock); xlog_state_want_sync(log, iclog); @@ -1405,7 +1405,7 @@ xlog_sync(xlog_t *log, int v2 = XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb); XFS_STATS_INC(xs_log_writes); - ASSERT(iclog->ic_refcnt == 0); + ASSERT(atomic_read(&iclog->ic_refcnt) == 0); /* Add for LR header */ count_init = log->l_iclog_hsize + iclog->ic_offset; @@ -2312,7 +2312,7 @@ xlog_state_done_syncing( ASSERT(iclog->ic_state == XLOG_STATE_SYNCING || iclog->ic_state == XLOG_STATE_IOERROR); - ASSERT(iclog->ic_refcnt == 0); + ASSERT(atomic_read(&iclog->ic_refcnt) == 0); ASSERT(iclog->ic_bwritecnt == 1 || iclog->ic_bwritecnt == 2); @@ -2394,7 +2394,7 @@ restart: ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE); head = &iclog->ic_header; - iclog->ic_refcnt++; /* prevents 
sync */ + atomic_inc(&iclog->ic_refcnt); /* prevents sync */ log_offset = iclog->ic_offset; /* On the 1st write to an iclog, figure out lsn. This works @@ -2426,12 +2426,12 @@ restart: xlog_state_switch_iclogs(log, iclog, iclog->ic_size); /* If I'm the only one writing to this iclog, sync it to disk */ - if (iclog->ic_refcnt == 1) { + if (atomic_read(&iclog->ic_refcnt) == 1) { spin_unlock(&log->l_icloglock); if ((error = xlog_state_release_iclog(log, iclog))) return error; } else { - iclog->ic_refcnt--; + atomic_dec(&iclog->ic_refcnt); spin_unlock(&log->l_icloglock); } goto restart; @@ -2822,18 +2822,21 @@ xlog_state_release_iclog( { int sync = 0; /* do we sync? */ - spin_lock(&log->l_icloglock); + if (iclog->ic_state & XLOG_STATE_IOERROR) + return XFS_ERROR(EIO); + + ASSERT(atomic_read(&iclog->ic_refcnt) > 0); + if (!atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) + return 0; + if (iclog->ic_state & XLOG_STATE_IOERROR) { spin_unlock(&log->l_icloglock); return XFS_ERROR(EIO); } - - ASSERT(iclog->ic_refcnt > 0); ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE || iclog->ic_state == XLOG_STATE_WANT_SYNC); - if (--iclog->ic_refcnt == 0 && - iclog->ic_state == XLOG_STATE_WANT_SYNC) { + if (iclog->ic_state == XLOG_STATE_WANT_SYNC) { /* update tail before writing to iclog */ xlog_assign_tail_lsn(log->l_mp); sync++; @@ -2953,7 +2956,8 @@ xlog_state_sync_all(xlog_t *log, uint fl * previous iclog and go to sleep. */ if (iclog->ic_state == XLOG_STATE_DIRTY || - (iclog->ic_refcnt == 0 && iclog->ic_offset == 0)) { + (atomic_read(&iclog->ic_refcnt) == 0 + && iclog->ic_offset == 0)) { iclog = iclog->ic_prev; if (iclog->ic_state == XLOG_STATE_ACTIVE || iclog->ic_state == XLOG_STATE_DIRTY) @@ -2961,14 +2965,14 @@ xlog_state_sync_all(xlog_t *log, uint fl else goto maybe_sleep; } else { - if (iclog->ic_refcnt == 0) { + if (atomic_read(&iclog->ic_refcnt) == 0) { /* We are the only one with access to this * iclog. Flush it out now. 
There should * be a roundoff of zero to show that someone * has already taken care of the roundoff from * the previous sync. */ - iclog->ic_refcnt++; + atomic_inc(&iclog->ic_refcnt); lsn = be64_to_cpu(iclog->ic_header.h_lsn); xlog_state_switch_iclogs(log, iclog, 0); spin_unlock(&log->l_icloglock); @@ -3100,7 +3104,7 @@ try_again: already_slept = 1; goto try_again; } else { - iclog->ic_refcnt++; + atomic_inc(&iclog->ic_refcnt); xlog_state_switch_iclogs(log, iclog, 0); spin_unlock(&log->l_icloglock); if (xlog_state_release_iclog(log, iclog)) Index: 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log_priv.h 2008-02-15 11:19:08.080544022 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h 2008-02-15 11:19:14.403726218 +1100 @@ -339,7 +339,7 @@ typedef struct xlog_iclog_fields { #endif int ic_size; int ic_offset; - int ic_refcnt; + atomic_t ic_refcnt; int ic_bwritecnt; ushort_t ic_state; char *ic_datap; /* pointer to iclog data */ Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c 2008-02-15 11:19:08.096541953 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c 2008-02-15 11:19:14.407725701 +1100 @@ -5633,7 +5633,7 @@ xfsidbg_xiclog(xlog_in_core_t *iclog) #else NULL, #endif - iclog->ic_refcnt, iclog->ic_bwritecnt); + atomic_read(&iclog->ic_refcnt), iclog->ic_bwritecnt); if (iclog->ic_state & XLOG_STATE_ALL) printflags(iclog->ic_state, ic_flags, " state:"); else -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 14 19:34:44 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 19:34:46 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com 
[134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F3YZKP020751 for ; Thu, 14 Feb 2008 19:34:39 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA09923; Fri, 15 Feb 2008 14:34:53 +1100 Message-ID: <47B5085D.30409@sgi.com> Date: Fri, 15 Feb 2008 14:34:53 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: David Chinner CC: xfs-dev , xfs-oss Subject: Re: [patch] Use atomics for iclog reference counting References: <20080121053021.GH155259@sgi.com> <4796CCF5.8010509@sgi.com> <20080214234758.GP155259@sgi.com> <20080215002429.GT155259@sgi.com> In-Reply-To: <20080215002429.GT155259@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5826/Thu Feb 14 18:57:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14441 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Dave, So a bunch of incs/decs/tests converted to the atomic versions. And the interesting stuff appears to be in xlog_state_release_iclog(). Okay that looks reasonable. If the decrement of the cnt doesn't go down to zero then we just return straight away - because we won't be going to sync anything. And if we do go to zero then we take the lock and continue. Why do we test for the error/EIO beforehand now too? Because we don't want to return 0 if we have an error to return? Seems good. 
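For reference, the dec-and-lock behaviour discussed above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel implementation: `dec_and_lock`, `struct obj`, `state_lock` and `release` are hypothetical names, and C11 atomics plus a pthread mutex stand in for `atomic_t` and the `l_icloglock` spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an iclog: a refcount plus a "want sync" flag. */
struct obj {
	atomic_int	refcnt;
	bool		want_sync;
};

/* Hypothetical stand-in for the global log state lock. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Userspace approximation of atomic_dec_and_lock(): returns true with
 * the lock held only if this call dropped the count to zero.
 */
static bool dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
	int old = atomic_load(cnt);

	/* Fast path: while the count is above 1, decrement without the lock. */
	while (old > 1) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: we may hold the last reference, so lock first. */
	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;		/* count hit zero, lock stays held */
	pthread_mutex_unlock(lock);
	return false;
}

/* Returns 1 if the caller dropped the last reference and a sync is wanted. */
static int release(struct obj *o)
{
	int sync = 0;

	assert(atomic_load(&o->refcnt) > 0);
	if (!dec_and_lock(&o->refcnt, &state_lock))
		return 0;		/* not the last reference: no lock taken */

	if (o->want_sync)
		sync = 1;
	pthread_mutex_unlock(&state_lock);
	return sync;
}
```

With a starting count of 3, the first two `release()` calls return 0 without touching `state_lock`; only the third takes the lock and examines the state.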
In the 1st 2 cases of the patch: > @@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > > spin_lock(&log->l_icloglock); > iclog = log->l_iclog; > - iclog->ic_refcnt++; > + atomic_inc(&iclog->ic_refcnt); > spin_unlock(&log->l_icloglock); > xlog_state_want_sync(log, iclog); > (void) xlog_state_release_iclog(log, iclog); > @@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > */ > spin_lock(&log->l_icloglock); > iclog = log->l_iclog; > - iclog->ic_refcnt++; > + atomic_inc(&iclog->ic_refcnt); > spin_unlock(&log->l_icloglock); Do we still really need to take the lock etc? --Tim David Chinner wrote: > On Fri, Feb 15, 2008 at 10:47:58AM +1100, David Chinner wrote: >> On Wed, Jan 23, 2008 at 04:13:25PM +1100, Timothy Shimmin wrote: >>> I'll have a look... >> Tim, have you had a chance to look at this one yet? I'd like to >> push this too, but I understand you are kinda busy right now :/ > > FWIW, you might want to review this version ;) > > ---- > > Now that we update the log tail LSN less frequently on > transaction completion, we pass the contention straight to > the global log state lock (l_iclog_lock) during transaction > completion. > > We currently have to take this lock to decrement the iclog > reference count. There is a reference count on each iclog, > so we need to take the global lock for all refcount changes. > > When large numbers of processes are all doing small transactions, > the iclog reference counts will be quite high, and the state change > that absolutely requires the l_iclog_lock is the exception rather than > the norm. > > Change the reference counting on the iclogs to use atomic_inc/dec > so that we can use atomic_dec_and_lock during transaction completion > and avoid the need for grabbing the l_iclog_lock for every reference > count decrement except the one that matters - the last.
> > Version 2: > o remove spurious unlock in shutdown path in xlog_state_release_iclog() > > Signed-off-by: Dave Chinner > --- > fs/xfs/xfs_log.c | 36 ++++++++++++++++++++---------------- > fs/xfs/xfs_log_priv.h | 2 +- > fs/xfs/xfsidbg.c | 2 +- > 3 files changed, 22 insertions(+), 18 deletions(-) > > Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2008-02-15 11:19:08.076544539 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2008-02-15 11:20:22.558911855 +1100 > @@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > > spin_lock(&log->l_icloglock); > iclog = log->l_iclog; > - iclog->ic_refcnt++; > + atomic_inc(&iclog->ic_refcnt); > spin_unlock(&log->l_icloglock); > xlog_state_want_sync(log, iclog); > (void) xlog_state_release_iclog(log, iclog); > @@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > */ > spin_lock(&log->l_icloglock); > iclog = log->l_iclog; > - iclog->ic_refcnt++; > + atomic_inc(&iclog->ic_refcnt); > spin_unlock(&log->l_icloglock); > > xlog_state_want_sync(log, iclog); > @@ -1405,7 +1405,7 @@ xlog_sync(xlog_t *log, > int v2 = XFS_SB_VERSION_HASLOGV2(&log->l_mp->m_sb); > > XFS_STATS_INC(xs_log_writes); > - ASSERT(iclog->ic_refcnt == 0); > + ASSERT(atomic_read(&iclog->ic_refcnt) == 0); > > /* Add for LR header */ > count_init = log->l_iclog_hsize + iclog->ic_offset; > @@ -2312,7 +2312,7 @@ xlog_state_done_syncing( > > ASSERT(iclog->ic_state == XLOG_STATE_SYNCING || > iclog->ic_state == XLOG_STATE_IOERROR); > - ASSERT(iclog->ic_refcnt == 0); > + ASSERT(atomic_read(&iclog->ic_refcnt) == 0); > ASSERT(iclog->ic_bwritecnt == 1 || iclog->ic_bwritecnt == 2); > > > @@ -2394,7 +2394,7 @@ restart: > ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE); > head = &iclog->ic_header; > > - iclog->ic_refcnt++; /* prevents sync */ > + atomic_inc(&iclog->ic_refcnt); /* prevents sync */ > log_offset = iclog->ic_offset; > > /* On the 1st write to an iclog, figure out 
lsn. This works > @@ -2426,12 +2426,12 @@ restart: > xlog_state_switch_iclogs(log, iclog, iclog->ic_size); > > /* If I'm the only one writing to this iclog, sync it to disk */ > - if (iclog->ic_refcnt == 1) { > + if (atomic_read(&iclog->ic_refcnt) == 1) { > spin_unlock(&log->l_icloglock); > if ((error = xlog_state_release_iclog(log, iclog))) > return error; > } else { > - iclog->ic_refcnt--; > + atomic_dec(&iclog->ic_refcnt); > spin_unlock(&log->l_icloglock); > } > goto restart; > @@ -2822,18 +2822,21 @@ xlog_state_release_iclog( > { > int sync = 0; /* do we sync? */ > > - spin_lock(&log->l_icloglock); > + if (iclog->ic_state & XLOG_STATE_IOERROR) > + return XFS_ERROR(EIO); > + > + ASSERT(atomic_read(&iclog->ic_refcnt) > 0); > + if (!atomic_dec_and_lock(&iclog->ic_refcnt, &log->l_icloglock)) > + return 0; > + > if (iclog->ic_state & XLOG_STATE_IOERROR) { > spin_unlock(&log->l_icloglock); > return XFS_ERROR(EIO); > } > - > - ASSERT(iclog->ic_refcnt > 0); > ASSERT(iclog->ic_state == XLOG_STATE_ACTIVE || > iclog->ic_state == XLOG_STATE_WANT_SYNC); > > - if (--iclog->ic_refcnt == 0 && > - iclog->ic_state == XLOG_STATE_WANT_SYNC) { > + if (iclog->ic_state == XLOG_STATE_WANT_SYNC) { > /* update tail before writing to iclog */ > xlog_assign_tail_lsn(log->l_mp); > sync++; > @@ -2953,7 +2956,8 @@ xlog_state_sync_all(xlog_t *log, uint fl > * previous iclog and go to sleep. > */ > if (iclog->ic_state == XLOG_STATE_DIRTY || > - (iclog->ic_refcnt == 0 && iclog->ic_offset == 0)) { > + (atomic_read(&iclog->ic_refcnt) == 0 > + && iclog->ic_offset == 0)) { > iclog = iclog->ic_prev; > if (iclog->ic_state == XLOG_STATE_ACTIVE || > iclog->ic_state == XLOG_STATE_DIRTY) > @@ -2961,14 +2965,14 @@ xlog_state_sync_all(xlog_t *log, uint fl > else > goto maybe_sleep; > } else { > - if (iclog->ic_refcnt == 0) { > + if (atomic_read(&iclog->ic_refcnt) == 0) { > /* We are the only one with access to this > * iclog. Flush it out now. 
There should > * be a roundoff of zero to show that someone > * has already taken care of the roundoff from > * the previous sync. > */ > - iclog->ic_refcnt++; > + atomic_inc(&iclog->ic_refcnt); > lsn = be64_to_cpu(iclog->ic_header.h_lsn); > xlog_state_switch_iclogs(log, iclog, 0); > spin_unlock(&log->l_icloglock); > @@ -3100,7 +3104,7 @@ try_again: > already_slept = 1; > goto try_again; > } else { > - iclog->ic_refcnt++; > + atomic_inc(&iclog->ic_refcnt); > xlog_state_switch_iclogs(log, iclog, 0); > spin_unlock(&log->l_icloglock); > if (xlog_state_release_iclog(log, iclog)) > Index: 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log_priv.h 2008-02-15 11:19:08.080544022 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_log_priv.h 2008-02-15 11:19:14.403726218 +1100 > @@ -339,7 +339,7 @@ typedef struct xlog_iclog_fields { > #endif > int ic_size; > int ic_offset; > - int ic_refcnt; > + atomic_t ic_refcnt; > int ic_bwritecnt; > ushort_t ic_state; > char *ic_datap; /* pointer to iclog data */ > Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c 2008-02-15 11:19:08.096541953 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c 2008-02-15 11:19:14.407725701 +1100 > @@ -5633,7 +5633,7 @@ xfsidbg_xiclog(xlog_in_core_t *iclog) > #else > NULL, > #endif > - iclog->ic_refcnt, iclog->ic_bwritecnt); > + atomic_read(&iclog->ic_refcnt), iclog->ic_bwritecnt); > if (iclog->ic_state & XLOG_STATE_ALL) > printflags(iclog->ic_state, ic_flags, " state:"); > else > From owner-xfs@oss.sgi.com Thu Feb 14 20:23:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 20:23:39 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com 
(cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1F4NZHh029261 for ; Thu, 14 Feb 2008 20:23:36 -0800 X-ASG-Debug-ID: 1203049438-7da600f60000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4790EE242D2; Thu, 14 Feb 2008 20:23:59 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id VCiJ14p09REcQmPz; Thu, 14 Feb 2008 20:23:59 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JPs7C-0007jY-3h; Fri, 15 Feb 2008 04:23:58 +0000 Date: Thu, 14 Feb 2008 23:23:58 -0500 From: Christoph Hellwig To: David Chinner Cc: xfs-dev , xfs-oss X-ASG-Orig-Subj: Re: [patch] Use xfs_inode_clean() in more places Subject: Re: [patch] Use xfs_inode_clean() in more places Message-ID: <20080215042358.GA29715@infradead.org> References: <20080121051647.GF155259@sgi.com> <20080214235259.GR155259@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080214235259.GR155259@sgi.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1203049440 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42281 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5826/Thu Feb 14 18:57:35 2008 on oss.sgi.com 
X-Virus-Status: Clean X-archive-position: 14442 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Fri, Feb 15, 2008 at 10:52:59AM +1100, David Chinner wrote: > ping? Looks good. From owner-xfs@oss.sgi.com Thu Feb 14 20:25:59 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 20:26:03 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1F4PwgI029708 for ; Thu, 14 Feb 2008 20:25:59 -0800 X-ASG-Debug-ID: 1203049582-129703360000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 869BB5DC101; Thu, 14 Feb 2008 20:26:22 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id 6aXWZAoxq4P8mp4b; Thu, 14 Feb 2008 20:26:22 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JPs9W-00082I-8r; Fri, 15 Feb 2008 04:26:22 +0000 Date: Thu, 14 Feb 2008 23:26:22 -0500 From: Christoph Hellwig To: David Chinner Cc: xfs-dev , xfs-oss X-ASG-Orig-Subj: Re: [patch] remove icluster V2 Subject: Re: [patch] remove icluster V2 Message-ID: <20080215042622.GB29715@infradead.org> References: <20080214235223.GQ155259@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080214235223.GQ155259@sgi.com> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] 
X-Barracuda-Start-Time: 1203049583 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42281 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5826/Thu Feb 14 18:57:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14443 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs looks fine to me. From owner-xfs@oss.sgi.com Thu Feb 14 20:26:53 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 20:26:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F4QnHY029918 for ; Thu, 14 Feb 2008 20:26:51 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA10872; Fri, 15 Feb 2008 15:27:14 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1F4RDLF63628506; Fri, 15 Feb 2008 15:27:13 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1F4RCMJ63966698; Fri, 15 Feb 2008 15:27:12 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 15 Feb 2008 15:27:12 +1100 From: David 
Chinner To: Timothy Shimmin Cc: David Chinner , xfs-dev , xfs-oss Subject: Re: [patch] Use atomics for iclog reference counting Message-ID: <20080215042712.GY155259@sgi.com> References: <20080121053021.GH155259@sgi.com> <4796CCF5.8010509@sgi.com> <20080214234758.GP155259@sgi.com> <20080215002429.GT155259@sgi.com> <47B5085D.30409@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47B5085D.30409@sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5826/Thu Feb 14 18:57:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14444 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Fri, Feb 15, 2008 at 02:34:53PM +1100, Timothy Shimmin wrote: > Dave, > > So a bunch of incs/decs/tests converted to the atomic versions. > And the interesting stuff appears to be in xlog_state_release_iclog(). > Okay that looks reasonable. > If the decrement of the cnt doesn't go down to zero then we just > return straight away - because we won't be going to sync anything. > And if we do go to zero then we take the lock and continue. > Why do we test for the error/EIO beforehand now too? > Because we don't want to return 0 if we have an error to return? Right. Effectively it retains the same behaviour as the old code. i.e. A call to xlog_state_release_iclog() with an elevated refcount used to return EIO if the log had been shutdown and we need the initial (unlocked) check to retain that behaviour. However, this check is racy and so in the case where the last ref goes away and we get the lock we need to check again when we can't possibly race with a shutdown state change. > Seems good. 
> > In the 1st 2 cases of the patch: > > @@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > > > > spin_lock(&log->l_icloglock); > > iclog = log->l_iclog; > > - iclog->ic_refcnt++; > > + atomic_inc(&iclog->ic_refcnt); > > spin_unlock(&log->l_icloglock); > > xlog_state_want_sync(log, iclog); > > (void) xlog_state_release_iclog(log, iclog); > > @@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) > > */ > > spin_lock(&log->l_icloglock); > > iclog = log->l_iclog; > > - iclog->ic_refcnt++; > > + atomic_inc(&iclog->ic_refcnt); > > spin_unlock(&log->l_icloglock); > > Do we still really need to take the lock etc? log->iclog is protected by the l_icloglock as well, so the lock needs to be retained to prevent races when reading and taking a reference to it. IOWs, the l_icloglock still synchronises increments and the final decrement on an iclog; we only need the atomic counter to enable unlocked refcount decrements when the refcount is > 1. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Feb 14 21:04:51 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 21:04:54 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F54lDF032401 for ; Thu, 14 Feb 2008 21:04:49 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA11726; Fri, 15 Feb 2008 16:05:07 +1100 Message-ID: <47B51D82.1060509@sgi.com> Date: Fri, 15 Feb 2008 16:05:06 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: David Chinner CC: xfs-dev , xfs-oss Subject: Re: [patch] 
Use atomics for iclog reference counting References: <20080121053021.GH155259@sgi.com> <4796CCF5.8010509@sgi.com> <20080214234758.GP155259@sgi.com> <20080215002429.GT155259@sgi.com> <47B5085D.30409@sgi.com> <20080215042712.GY155259@sgi.com> In-Reply-To: <20080215042712.GY155259@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/5826/Thu Feb 14 18:57:35 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14445 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs David Chinner wrote: > On Fri, Feb 15, 2008 at 02:34:53PM +1100, Timothy Shimmin wrote: >> In the 1st 2 cases of the patch: >>> @@ -675,7 +675,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) >>> >>> spin_lock(&log->l_icloglock); >>> iclog = log->l_iclog; >>> - iclog->ic_refcnt++; >>> + atomic_inc(&iclog->ic_refcnt); >>> spin_unlock(&log->l_icloglock); >>> xlog_state_want_sync(log, iclog); >>> (void) xlog_state_release_iclog(log, iclog); >>> @@ -713,7 +713,7 @@ xfs_log_unmount_write(xfs_mount_t *mp) >>> */ >>> spin_lock(&log->l_icloglock); >>> iclog = log->l_iclog; >>> - iclog->ic_refcnt++; >>> + atomic_inc(&iclog->ic_refcnt); >>> spin_unlock(&log->l_icloglock); >> Do we still really need to take the lock etc? > > log->iclog is protected by the l_icloglock as well, Ah, yep :) > so the lock > needs to be retained to prevent races when reading and taking a > reference to it. IOWs, the l_icloglock still synchronises increments > and the final decrement on an iclog; we only need the atomic counter > to enable unlocked refcount decrements when the refcount is > 1. > Yep. 
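The locking scheme agreed on in this exchange (increments and the final decrement stay under l_icloglock, while decrements from a refcount greater than 1 go lock-free) can be sketched in userspace C. This is only an illustration under stated assumptions: it uses C11 atomics and a pthread mutex in place of the kernel's atomic_t and spinlock, and demo_iclog, demo_dec_and_lock() and demo_release() are hypothetical names, not the XFS code.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an iclog: only the field this sketch needs. */
struct demo_iclog {
    atomic_int refcnt;
};

/*
 * Userspace analogue of the kernel's atomic_dec_and_lock(): drop one
 * reference and return true, with *lock held, only if the count hit zero.
 */
static bool demo_dec_and_lock(atomic_int *cnt, pthread_mutex_t *lock)
{
    int old = atomic_load(cnt);

    /* Fast path: while the count is above 1, decrement without the lock. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(cnt, &old, old - 1))
            return false;
        /* A failed compare-exchange reloaded 'old'; re-check it. */
    }

    /* Slow path: this may be the last reference, so take the lock first. */
    pthread_mutex_lock(lock);
    if (atomic_fetch_sub(cnt, 1) == 1)
        return true;            /* caller now holds the lock */
    pthread_mutex_unlock(lock);
    return false;
}

/* Release one reference; returns true if this call dropped the last one. */
static bool demo_release(struct demo_iclog *ic, pthread_mutex_t *lock)
{
    if (!demo_dec_and_lock(&ic->refcnt, lock))
        return false;
    /* Last reference: any state change would happen here, under the lock. */
    pthread_mutex_unlock(lock);
    return true;
}
```

With a count of 2, the first demo_release() call takes the fast path and never touches the mutex; only the second call, which drops the count to zero, acquires it.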
--Tim From owner-xfs@oss.sgi.com Thu Feb 14 22:53:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 22:53:19 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F6rBGk006737 for ; Thu, 14 Feb 2008 22:53:14 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA14133; Fri, 15 Feb 2008 17:53:30 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 60A3058C4C11; Fri, 15 Feb 2008 17:53:30 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTIAL TAKE 970925 - Factor xfs_itobp() and xfs_inotobp() Message-Id: <20080215065330.60A3058C4C11@chook.melbourne.sgi.com> Date: Fri, 15 Feb 2008 17:53:30 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/5829/Thu Feb 14 20:00:17 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14446 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Factor xfs_itobp() and xfs_inotobp(). The only difference between the functions is one passes an inode for the lookup, the other passes an inode number. However, they don't do the same validity checking or set all the same state on the buffer that is returned yet they should. Factor the functions into a common implementation. 
Date: Fri Feb 15 17:53:02 AEDT 2008 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: lachlan@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30500a fs/xfs/xfs_inode.c - 1.491 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.491&r2=text&tr2=1.490&f=h - Factor xfs_itobp() and xfs_inotobp() to use a common implementation. From owner-xfs@oss.sgi.com Thu Feb 14 23:16:10 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 23:16:15 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F7G3rG008454 for ; Thu, 14 Feb 2008 23:16:09 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA14616; Fri, 15 Feb 2008 18:16:24 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 5616558C4C11; Fri, 15 Feb 2008 18:16:24 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTIAL TAKE 970925 - Don't block pdflush when writing back inodes Message-Id: <20080215071624.5616558C4C11@chook.melbourne.sgi.com> Date: Fri, 15 Feb 2008 18:16:24 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/5829/Thu Feb 14 20:00:17 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14447 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Don't block pdflush when writing back inodes When pdflush is writing back inodes, it can get stuck on inode cluster buffers 
that are currently under I/O. This occurs when we write data to multiple inodes in the same inode cluster at the same time. Effectively, delayed allocation marks the inode dirty during the data writeback. Hence if the inode cluster was flushed during the writeback of the first inode, the writeback of the second inode will block waiting for the inode cluster write to complete before writing it again for the newly dirtied inode. Basically, we want to prevent this from happening so we don't block pdflush and slow down all of writeback. Hence we introduce a non-blocking async inode flush flag that pdflush uses. If this flag is set, we use non-blocking operations (e.g. try locks) wherever we can to avoid blocking or issuing extra I/O. Date: Fri Feb 15 18:15:54 AEDT 2008 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: lachlan@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30501a fs/xfs/xfs_vnodeops.c - 1.734 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.734&r2=text&tr2=1.733&f=h - make xfs_inode_flush() specify non-blocking inode flushes and kill dead FLUSH_LOG code. fs/xfs/xfs_itable.c - 1.161 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_itable.c.diff?r1=text&tr1=1.161&r2=text&tr2=1.160&f=h - Added new buffer flag parameter to xfs_itobp(). fs/xfs/xfs_log_recover.c - 1.334 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.334&r2=text&tr2=1.333&f=h - Added new buffer flag parameter to xfs_itobp(). fs/xfs/xfs_inode.c - 1.492 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.492&r2=text&tr2=1.491&f=h - introduce new non-blocking inode flush options into the writeout code. If we specify a non-blocking flush, try as hard as possible not to get stuck anywhere and return EAGAIN instead. 
fs/xfs/xfs_inode.h - 1.242 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.h.diff?r1=text&tr1=1.242&r2=text&tr2=1.241&f=h - Added new buffer flag parameter to xfs_itobp(). fs/xfs/xfs_trans_buf.c - 1.129 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_trans_buf.c.diff?r1=text&tr1=1.129&r2=text&tr2=1.128&f=h - Added trylock support to xfs_trans_read_buf() for non-blocking access to buffers. fs/xfs/linux-2.6/xfs_vnode.h - 1.144 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_vnode.h.diff?r1=text&tr1=1.144&r2=text&tr2=1.143&f=h - FLUSH_INODE flag is no longer needed when calling xfs_inode_flush(). FLUSH_LOG flag is no longer used. fs/xfs/linux-2.6/xfs_super.c - 1.409 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_super.c.diff?r1=text&tr1=1.409&r2=text&tr2=1.408&f=h - FLUSH_INODE flag is no longer needed when calling xfs_inode_flush(). From owner-xfs@oss.sgi.com Thu Feb 14 23:30:20 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 23:30:24 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F7UFHA017846 for ; Thu, 14 Feb 2008 23:30:18 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA15066; Fri, 15 Feb 2008 18:30:36 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 0EE4B58C4C11; Fri, 15 Feb 2008 18:30:36 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 977460 - Remove the xfs_icluster structure Message-Id: <20080215073036.0EE4B58C4C11@chook.melbourne.sgi.com> Date: Fri, 15 Feb 2008 18:30:36 +1100 (EST) From: 
dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/5829/Thu Feb 14 20:00:17 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14448 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Remove the xfs_icluster structure and replace it with a radix tree lookup. We don't need to keep a list of inodes in each cluster around anymore as we can look them up quickly when we need to. The only time we need to do this now is during inode writeback. Factor the inode cluster writeback code out of xfs_iflush and convert it to use radix_tree_gang_lookup() instead of walking a list of inodes built when we first read in the inodes. This removes 3 pointers from each xfs_inode structure and the xfs_icluster structure per inode cluster. Hence we reduce the cache footprint of the xfs_inodes by 5-10% depending on cluster sparseness. To be truly efficient we need a radix_tree_gang_lookup_range() call to stop searching once we are past the end of the cluster instead of trying to find a full cluster's worth of inodes. Before (ia64): $ cat /sys/slab/xfs_inode/object_size 536 After: $ cat /sys/slab/xfs_inode/object_size 512 Date: Fri Feb 15 18:29:55 AEDT 2008 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: hch@infradead.org The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30502a fs/xfs/xfsidbg.c - 1.343 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfsidbg.c.diff?r1=text&tr1=1.343&r2=text&tr2=1.342&f=h - remove the cluster structures from the inode cache and use the radix trees for lookups instead. fs/xfs/xfs_vfsops.c - 1.553 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.553&r2=text&tr2=1.552&f=h - remove the cluster structures from the inode cache and use the radix trees for lookups instead. 
fs/xfs/xfs_iget.c - 1.239 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_iget.c.diff?r1=text&tr1=1.239&r2=text&tr2=1.238&f=h - remove the cluster structures from the inode cache and use the radix trees for lookups instead. fs/xfs/xfs_inode.c - 1.493 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.493&r2=text&tr2=1.492&f=h - remove the cluster structures from the inode cache and use the radix trees for lookups instead. fs/xfs/xfs_inode.h - 1.243 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.h.diff?r1=text&tr1=1.243&r2=text&tr2=1.242&f=h - remove the cluster structures from the inode cache and use the radix trees for lookups instead. fs/xfs/linux-2.6/xfs_ksyms.c - 1.80 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_ksyms.c.diff?r1=text&tr1=1.80&r2=text&tr2=1.79&f=h - the xfs_icluster_zone is no longer... From owner-xfs@oss.sgi.com Thu Feb 14 23:37:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 23:37:29 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F7bMQr018461 for ; Thu, 14 Feb 2008 23:37:25 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA15210; Fri, 15 Feb 2008 18:37:43 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id B336358C4C11; Fri, 15 Feb 2008 18:37:43 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 977461 - Use xfs_inode_clean() in more places Message-Id: <20080215073743.B336358C4C11@chook.melbourne.sgi.com> Date: Fri, 15 Feb 2008 18:37:43 +1100 (EST) From: 
dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/5829/Thu Feb 14 20:00:17 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14449 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Use xfs_inode_clean() in more places Remove open coded checks for whether the inode is clean and replace them with an inlined function. Date: Fri Feb 15 18:37:27 AEDT 2008 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: hch@infradead.org The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30503a fs/xfs/xfs_vnodeops.c - 1.735 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.735&r2=text&tr2=1.734&f=h - Use xfs_inode_clean() rather than open coding the check. fs/xfs/xfs_inode_item.h - 1.50 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode_item.h.diff?r1=text&tr1=1.50&r2=text&tr2=1.49&f=h - Use xfs_inode_clean() rather than open coding the check. fs/xfs/xfs_inode.c - 1.494 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.494&r2=text&tr2=1.493&f=h - Use xfs_inode_clean() rather than open coding the check. 
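The idea behind this checkin, replacing repeated open-coded "is the inode clean?" tests with one inline helper, can be illustrated roughly as follows. The structures and field names below are simplified, hypothetical stand-ins and do not reflect the real xfs_inode layout or the exact condition xfs_inode_clean() tests.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the real XFS structures. */
struct demo_log_item {
    unsigned int dirty_fields;   /* bitmask of logged but unflushed fields */
};

struct demo_inode {
    struct demo_log_item *item;  /* NULL if the inode was never logged */
    bool core_dirty;             /* in-core fields changed without logging */
};

/* One helper that answers "is this inode clean?" for every caller. */
static inline bool demo_inode_clean(const struct demo_inode *ip)
{
    return (ip->item == NULL || ip->item->dirty_fields == 0) &&
           !ip->core_dirty;
}
```

Callers then test demo_inode_clean(ip) instead of repeating the compound condition, so a later change to what "clean" means only has to be made in one place.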
From owner-xfs@oss.sgi.com Thu Feb 14 23:50:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 23:50:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F7oDOG019413 for ; Thu, 14 Feb 2008 23:50:15 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA15490; Fri, 15 Feb 2008 18:50:34 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 30F6058C4C11; Fri, 15 Feb 2008 18:50:34 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTIAL TAKE 975671 - Prevent AIL lock contention during transaction completion Message-Id: <20080215075034.30F6058C4C11@chook.melbourne.sgi.com> Date: Fri, 15 Feb 2008 18:50:34 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/5829/Thu Feb 14 20:00:17 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14450 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Prevent AIL lock contention during transaction completion When hundreds of processors attempt to commit transactions at the same time, they can contend on the AIL lock when updating the tail LSN held in the in-core log structure. At the moment, the tail LSN is only needed when actually writing out an iclog, so it really does not need to be updated on every single transaction completion - only those that result in switching iclogs and flushing them to disk. 
The result is that we reduce the number of times we need to grab the AIL lock and the log grant lock by up to two orders of magnitude on large processor count machines. The problem has previously been hidden by AIL lock contention walking the AIL list which was recently solved and uncovered this issue. Date: Fri Feb 15 18:49:54 AEDT 2008 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: tes@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30504a fs/xfs/xfs_log.c - 1.347 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.c.diff?r1=text&tr1=1.347&r2=text&tr2=1.346&f=h - Only update the tail lsn when we need to write it to disk rather than every time we release an iclog to reduce lock contention on the AIL lock. From owner-xfs@oss.sgi.com Thu Feb 14 23:56:03 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 14 Feb 2008 23:56:08 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1F7tuEw019935 for ; Thu, 14 Feb 2008 23:56:02 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA15579; Fri, 15 Feb 2008 18:56:16 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 3993958C4C11; Fri, 15 Feb 2008 18:56:16 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 975671 - Use atomics for iclog reference counting Message-Id: <20080215075616.3993958C4C11@chook.melbourne.sgi.com> Date: Fri, 15 Feb 2008 18:56:16 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/5829/Thu Feb 14 
20:00:17 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14451 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Use atomics for iclog reference counting Now that we update the log tail LSN less frequently on transaction completion, we pass the contention straight to the global log state lock (l_icloglock) during transaction completion. We currently have to take this lock to decrement the iclog reference count. There is a reference count on each iclog, so we need to take the global lock for all refcount changes. When large numbers of processes are all doing small transactions, the iclog reference counts will be quite high, and the state change that absolutely requires the l_icloglock is the exception rather than the norm. Change the reference counting on the iclogs to use atomic_inc/dec so that we can use atomic_dec_and_lock during transaction completion and avoid the need for grabbing the l_icloglock for every reference count decrement except the one that matters - the last. Date: Fri Feb 15 18:55:54 AEDT 2008 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: tes@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30505a fs/xfs/xfsidbg.c - 1.344 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfsidbg.c.diff?r1=text&tr1=1.344&r2=text&tr2=1.343&f=h - use correct atomic accessor for the iclog refcount. fs/xfs/xfs_log.c - 1.348 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.c.diff?r1=text&tr1=1.348&r2=text&tr2=1.347&f=h - Reduce contention on the iclog state lock by using atomic reference counters for the iclogs and only grabbing the iclog lock on transaction completion when the last reference to the iclog is being removed. 
fs/xfs/xfs_log_priv.h - 1.126 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_priv.h.diff?r1=text&tr1=1.126&r2=text&tr2=1.125&f=h - change ic_refcnt to an atomic_t. From owner-xfs@oss.sgi.com Fri Feb 15 08:19:11 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 08:19:20 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1FGJAhI016839 for ; Fri, 15 Feb 2008 08:19:11 -0800 X-ASG-Debug-ID: 1203092374-2513006e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 87D2D5DE7C3 for ; Fri, 15 Feb 2008 08:19:35 -0800 (PST) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id Jv8NY5qEItgxRLkc for ; Fri, 15 Feb 2008 08:19:35 -0800 (PST) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1JQ3HC-0003Rv-9e; Fri, 15 Feb 2008 16:19:02 +0000 Date: Fri, 15 Feb 2008 11:19:02 -0500 From: Christoph Hellwig To: "Josef 'Jeff' Sipek" Cc: xfs@oss.sgi.com, sandeen@sandeen.net X-ASG-Orig-Subj: Re: [PATCH 1/1] XFS: replace *_IDELETE with *_IKEEP Subject: Re: [PATCH 1/1] XFS: replace *_IDELETE with *_IKEEP Message-ID: <20080215161902.GA32398@infradead.org> References: <47B3B6AE.4030505@sandeen.net> <1202975139-10546-1-git-send-email-jeffpc@josefsipek.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1202975139-10546-1-git-send-email-jeffpc@josefsipek.net> User-Agent: Mutt/1.5.17 (2007-11-01) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html 
X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1203092375 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42297 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5830/Fri Feb 15 05:07:34 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14452 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Thu, Feb 14, 2008 at 02:45:39AM -0500, Josef 'Jeff' Sipek wrote: > Change the *_IDELETE flags to *_IKEEP, and flip the logic as necessary. > > This completely eliminates the no-no-no-idelete madness. > > Additionally, "ikeep" or "noikeep" is always displayed in /proc/mounts > option string. This should help clear up any confusion about what the > current mode is. Looks fine to me, and I think the changed display in /proc//mounts is fine as well. 
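The flag flip under review can be pictured with a small sketch. It assumes a single positive "keep" bit; the flag name and value and both helper functions are hypothetical and are not taken from the patch itself.

```c
#include <stdbool.h>

/* Hypothetical flag value; the real XFS mount flag differs. */
#define DEMO_MOUNT_IKEEP 0x1u

/* Always report one of the two option strings, as the patch describes. */
static const char *demo_ikeep_option(unsigned int flags)
{
    return (flags & DEMO_MOUNT_IKEEP) ? "ikeep" : "noikeep";
}

/* With a positive "keep" flag, the test for freeing reads without a
 * double negative: empty inode clusters may be freed unless ikeep is set. */
static bool demo_may_free_inode_clusters(unsigned int flags)
{
    return !(flags & DEMO_MOUNT_IKEEP);
}
```

The point of the sketch is only that a positively named flag keeps each test to a single negation at most, which is what removes the "no-no-no-idelete" reading problem.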
From owner-xfs@oss.sgi.com Fri Feb 15 08:23:56 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 08:23:59 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1FGNtx4017323 for ; Fri, 15 Feb 2008 08:23:56 -0800 X-ASG-Debug-ID: 1203092659-135f00890000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id D07E3E28B60 for ; Fri, 15 Feb 2008 08:24:19 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id jKf69w4R7docCIsh for ; Fri, 15 Feb 2008 08:24:19 -0800 (PST) Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id BB1F31807DEF8; Fri, 15 Feb 2008 10:23:45 -0600 (CST) Message-ID: <47B5BC8F.2090102@sandeen.net> Date: Fri, 15 Feb 2008 10:23:43 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Christoph Hellwig CC: "Josef 'Jeff' Sipek" , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 1/1] XFS: replace *_IDELETE with *_IKEEP Subject: Re: [PATCH 1/1] XFS: replace *_IDELETE with *_IKEEP References: <47B3B6AE.4030505@sandeen.net> <1202975139-10546-1-git-send-email-jeffpc@josefsipek.net> <20080215161902.GA32398@infradead.org> In-Reply-To: <20080215161902.GA32398@infradead.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1203092660 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com 
X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42296 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5830/Fri Feb 15 05:07:34 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14453 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Christoph Hellwig wrote: > On Thu, Feb 14, 2008 at 02:45:39AM -0500, Josef 'Jeff' Sipek wrote: >> Change the *_IDELETE flags to *_IKEEP, and flip the logic as necessary. >> >> This completely eliminates the no-no-no-idelete madness. >> >> Additionally, "ikeep" or "noikeep" is always displayed in /proc/mounts >> option string. This should help clear up any confusion about what the >> current mode is. > > Looks fine to me, and I think the changed display in /proc//mounts > is fine aswell. > IMHO if we want to display defaults, then we should probably change it so that all defaults are displayed, and not make noikeep special in this respect. (oh, and "noquota" is already there, too) Doesn't matter much to me either way but it should be consistent across all options, I think. 
-Eric From owner-xfs@oss.sgi.com Fri Feb 15 10:51:08 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 10:51:10 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=BAYES_50,MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1FIp50L023894 for ; Fri, 15 Feb 2008 10:51:08 -0800 X-ASG-Debug-ID: 1203101489-6e1b00050000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from fg-out-1718.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id EB0A95DF7F3 for ; Fri, 15 Feb 2008 10:51:29 -0800 (PST) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.154]) by cuda.sgi.com with ESMTP id 81fNVozFyZ1BE3im for ; Fri, 15 Feb 2008 10:51:29 -0800 (PST) Received: by fg-out-1718.google.com with SMTP id e12so541257fga.8 for ; Fri, 15 Feb 2008 10:51:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:user-agent:mime-version:to:cc:subject:content-type:content-transfer-encoding; bh=v0vg0EycYPvG2K66qKcM8GfJDmytvHb0krLF76+PocQ=; b=F0w6RD33IOsoxJna0Z70RSsbP6KEgsPUMwtFRkMJ9FqaaiwHOslmUwM3IMO0y8WpSmiT0uAGhKF9F+TnF2bgHA/Ue9gCogoe+XiLumB6kGmaQd610p8cFQO5hLGgYBEfZKyaJ0ihr++R00OO8PkSX2SD6AsBrCEtcrMeMIYqO2I= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject:content-type:content-transfer-encoding; b=Xj3wOa7WfACCTZbCGLQq3+VWTkN8T9SHQoYbZ788XHheK7VEew68HfPcboBqkjE/8pstju1hMwh0TlZizTMgsqOrqTKcZboTfiesilwDhDCGvK3w8XheZea8oRY7ch/CIPVRBFrcbBJqS8eEL80IvlVB0fbWnr7s9Oqr/HRWpdQ= Received: by 10.86.65.11 with SMTP id n11mr2832377fga.4.1203101091695; Fri, 15 Feb 2008 10:44:51 -0800 (PST) Received: from ?192.168.0.2? 
( [86.123.240.62]) by mx.google.com with ESMTPS id e11sm4916838fga.5.2008.02.15.10.44.49 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 15 Feb 2008 10:44:50 -0800 (PST) Message-ID: <47B5DD9C.3080906@gmail.com> Date: Fri, 15 Feb 2008 20:44:44 +0200 From: =?ISO-8859-1?Q?T=F6r=F6k_Edwin?= User-Agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080109) MIME-Version: 1.0 To: Arjan van de Ven CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Marking inode dirty latency > 1000 msec on XFS! Subject: Marking inode dirty latency > 1000 msec on XFS! Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: fg-out-1718.google.com[72.14.220.154] X-Barracuda-Start-Time: 1203101490 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42304 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5832/Fri Feb 15 08:26:21 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14454 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: edwintorok@gmail.com Precedence: bulk X-list: xfs Hi, Using LatencyTOP 0.3, on the latest 2.6.25-git I see latencies of over a second on __mark_ inode_dirty. [see a stacktrace at end of this email] I tried to locate xfs's implementation of super_operations.dirty_inode, but it isn't specified in xfs_super.c. I don't know how mark_inode_dirty ends up calling xfs_trans_commit, but is it required to commit the dirty status of an inode to the transaction log? FWIW, this is a slow laptop hdd (5400 rpm, ST96812AS), but latency of 1 second is still big. Are there any settings I can tweak to reduce latency? 
LatencyTOP output during a 'svn up' on llvm-gcc source tree:

Cause                                               Maximum      Percentage
Marking inode dirty                                 1105.8 msec    7.8 %
_xfs_buf_ioapply default_wake_function xlog_state_  1065.2 msec    7.0 %
Deleting an inode                                    964.8 msec   20.0 %
_xfs_buf_ioapply default_wake_function xlog_state_   780.1 msec    8.3 %
_xfs_buf_ioapply default_wake_function xlog_state_   679.4 msec    3.3 %
_xfs_buf_ioapply default_wake_function xlog_state_   610.1 msec    5.6 %
XFS I/O wait                                         585.9 msec   12.6 %
_xfs_buf_ioapply default_wake_function xlog_state_   528.8 msec    6.8 %
Creating block layer request                         499.6 msec    5.7 %

Earlier I've seen this latencyTOP output too:

Cause                                               Maximum      Percentage
XFS I/O wait                                         407.6 msec   53.4 %
Marking inode dirty                                  173.0 msec    0.9 %
Writing a page to disk                               141.6 msec   42.6 %
__generic_unplug_device default_wake_function xfs_    86.0 msec    0.3 %
Page fault                                            44.1 msec    0.2 %
kobject_put put_device blk_start_queueing __generi    15.9 msec    0.1 %
Scheduler: waiting for cpubuf_find kmem_zone_alloc    12.4 msec    2.2 %
put_device scsi_request_fn blk_start_queueing defa     4.9 msec    0.0 %
Waiting for event (poll)                               4.7 msec    0.4 %

Process svn (10685)
Writing a page to disk                                23.9 msec   55.9 %
XFS I/O wait                                          15.9 msec   35.2 %
Scheduler: waiting for cpu                             0.8 msec    8.9 %

Raw output from /proc/latency shows stacktrace:

7 93862 26567 _xfs_buf_ioapply default_wake_function xlog_state_get_iclog_space xlog_state_release_iclog xlog_write xfs_log_write _xfs_trans_commit __mark_inode_dirty igrab xfs_create xfs_vn_mknod security_inode_permission
1 96331 96331 default_wake_function xlog_state_get_iclog_space xlog_state_release_iclog xlog_write xfs_log_write _xfs_trans_commit __mark_inode_dirty igrab xfs_create xfs_vn_mknod security_inode_permission xfs_vn_permission

Best regards,
--Edwin

From owner-xfs@oss.sgi.com Fri Feb 15 11:15:35 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 11:15:40 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.0
required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_92 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1FJFYFX025377 for ; Fri, 15 Feb 2008 11:15:35 -0800 X-ASG-Debug-ID: 1203102958-6ddf00380000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bob.dscon.sk (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3BFB5E31BD7 for ; Fri, 15 Feb 2008 11:15:58 -0800 (PST) Received: from bob.dscon.sk (bob.dscon.sk [88.86.113.10]) by cuda.sgi.com with ESMTP id ovvSlOVVHsN3heIE for ; Fri, 15 Feb 2008 11:15:58 -0800 (PST) Received: by bob.dscon.sk (Postfix, from userid 1007) id 54893DC359; Fri, 15 Feb 2008 20:16:37 +0100 (CET) Date: Fri, 15 Feb 2008 20:16:37 +0100 To: xfs@oss.sgi.com X-ASG-Orig-Subj: rewrite very slow Subject: rewrite very slow Message-ID: <20080215191636.GC4859@bob.dscon.sk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.13 (2006-08-11) From: xfs@bob.dscon.sk (DS) X-Barracuda-Connect: bob.dscon.sk[88.86.113.10] X-Barracuda-Start-Time: 1203102959 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42305 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5832/Fri Feb 15 08:26:21 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14455 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@bob.dscon.sk Precedence: bulk X-list: xfs Hello, I need some help to tunning my storage. 
A very simple test to begin with (Perl script test.pl):

#!/usr/bin/perl
$time=time();
for ($i=1;$i<100;$i++)
{
open (SUBOR,">$i.txt");
print SUBOR "aaaaaaaaaaaaaaaaaaa\n";
close (SUBOR);
print "WRITE $i. FILE\n";
}
$time2=time();
$rozdiel_casov=$time2-$time;
print "TIME ".$rozdiel_casov." sekund\n";

First run:

file1:/mnt/hosting/test# ./test.pl
WRITE 1. FILE
WRITE 2. FILE
...
WRITE 98. FILE
WRITE 99. FILE
TIME 0 sekund

Every subsequent run (which rewrites the existing files) is very, very slow:

file1:/mnt/hosting/test# ./test.pl
WRITE 1. FILE
WRITE 2. FILE
...
WRITE 98. FILE
WRITE 99. FILE
TIME 43 sekund

I tried it on ext3. There every run takes only 1-3 seconds.

I tried tuning my storage. My latest mount options:
(rw,noatime,nodiratime,nobarrier,osyncisosync,logbsize=256k,logbufs=8,quota)
I tried every configuration I could think of (barrier, nobarrier, osyncisosync, logbsize=XX, logbufs=2-8); nothing helped.

file1:/mnt/hosting/test# xfs_info /mnt/hosting/
meta-data=/dev/mapper/vghosting-hosting isize=256    agcount=32, agsize=7626752 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=244056064, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

I tested it on 4 other XFS storage systems and the results were similar.

Any ideas?

Thanks.
Dusan From owner-xfs@oss.sgi.com Fri Feb 15 16:35:45 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 16:35:49 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_44, J_CHICKENPOX_52,J_CHICKENPOX_74 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G0ZhHJ019114 for ; Fri, 15 Feb 2008 16:35:44 -0800 X-ASG-Debug-ID: 1203122167-1ea903cd0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 89FE1E344FD for ; Fri, 15 Feb 2008 16:36:07 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id WeYXcmt42aRiNcmY for ; Fri, 15 Feb 2008 16:36:07 -0800 (PST) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C8BEA18004B44; Fri, 15 Feb 2008 18:36:06 -0600 (CST) Message-ID: <47B62FF6.2000903@sandeen.net> Date: Fri, 15 Feb 2008 18:36:06 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: DS CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: rewrite very slow Subject: Re: rewrite very slow References: <20080215191636.GC4859@bob.dscon.sk> In-Reply-To: <20080215191636.GC4859@bob.dscon.sk> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1203122168 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 
QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42319 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14456 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs

DS wrote:
> Hello,
>
> I need some help to tunning my storage.

...

> TIME 43 sekund

What kernel? When I test on my 2.6.23.9-85.fc8 and 2.6.22.5 boxes, I
see 2 and 7 seconds for rewrite, respectively.

But granted, on ext3 I get 0 seconds for every run.

Also, the difference appears to be O_TRUNC (which the Perl script does);
if I code it in C:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int i;
	int fd;
	char file[16];	/* room for "99.txt" plus the terminating NUL */
	static const char buf[] = "aaaaaaaaaaaaaaaaaaa\n";

	for (i = 0; i < 100; i++) {
		snprintf(file, sizeof(file), "%d.txt", i);
		/* O_TRUNC is the flag under test; drop it to compare */
		fd = open(file, O_CREAT|O_RDWR|O_TRUNC, 0644);
		if (fd < 0) {
			perror(file);
			return 1;
		}
		/* write the payload without its terminating NUL */
		if (write(fd, buf, sizeof(buf) - 1) < 0)
			perror("write");
		close(fd);
	}
	return 0;
}

rewrite is a bit slower w/ O_TRUNC in place, plenty fast w/o it. Not
sure about the xfs/ext3 difference... this is probably a side-effect of
flushes xfs put into place on truncate (IIRC?)
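
The O_TRUNC comparison described above can be sketched as a small timing helper. This is only a minimal sketch under stated assumptions, not the program from the message: the function name `rewrite_files`, its `prefix` parameter, and the use of `clock_gettime(CLOCK_MONOTONIC)` for timing are all chosen here for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: (re)write `count` small files named
 * <prefix><i>.txt, opening each with O_CREAT|O_WRONLY plus
 * `extra_flags` (pass O_TRUNC or 0). Returns the elapsed
 * wall-clock time in seconds, or -1.0 on the first error.
 */
static double rewrite_files(const char *prefix, int count, int extra_flags)
{
	static const char payload[] = "aaaaaaaaaaaaaaaaaaa\n";
	struct timespec start, end;
	char name[256];

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < count; i++) {
		snprintf(name, sizeof(name), "%s%d.txt", prefix, i);
		int fd = open(name, O_CREAT | O_WRONLY | extra_flags, 0644);
		if (fd < 0)
			return -1.0;
		/* write the payload without its terminating NUL */
		if (write(fd, payload, sizeof(payload) - 1) < 0) {
			close(fd);
			return -1.0;
		}
		close(fd);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling `rewrite_files("t", 100, 0)` and then `rewrite_files("t", 100, O_TRUNC)` on files that already exist gives the two numbers to compare. The actual timings depend on the kernel, the filesystem, and the disk, so none are claimed here.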
-Eric From owner-xfs@oss.sgi.com Fri Feb 15 17:33:48 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 17:33:53 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G1XlNV021699 for ; Fri, 15 Feb 2008 17:33:48 -0800 X-ASG-Debug-ID: 1203125651-6776001c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from flyingAngel.upjs.sk (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 960115E1BCD for ; Fri, 15 Feb 2008 17:34:11 -0800 (PST) Received: from flyingAngel.upjs.sk (static113-109.rudna.net [212.20.113.109]) by cuda.sgi.com with ESMTP id ZXKQA7fB1OQa1UtD for ; Fri, 15 Feb 2008 17:34:11 -0800 (PST) Received: by flyingAngel.upjs.sk (Postfix, from userid 500) id A46C928E65A; Sat, 16 Feb 2008 02:34:08 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by flyingAngel.upjs.sk (Postfix) with ESMTP id 90D212235A2 for ; Sat, 16 Feb 2008 02:34:08 +0100 (CET) Date: Sat, 16 Feb 2008 02:34:08 +0100 (CET) From: Jan Derfinak To: xfs@oss.sgi.com X-ASG-Orig-Subj: Differences in mkfs.xfs and xfs_info output. Subject: Differences in mkfs.xfs and xfs_info output. 
Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Barracuda-Connect: static113-109.rudna.net[212.20.113.109] X-Barracuda-Start-Time: 1203125652 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42320 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14457 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: ja@mail.upjs.sk Precedence: bulk X-list: xfs Hello. I found following problem with xfs_info (xfs_grows -p xfs_info) command: # mkfs.xfs -f /dev/loop0 meta-data=/dev/loop0 isize=256 agcount=4, agsize=32000 blks = sectsz=512 attr=2 data = bsize=4096 blocks=128000, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 log =internal log bsize=4096 blocks=1200, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 # mount /dev/loop0 /mnt/usb # xfs_info /mnt/usb meta-data=/dev/loop0 isize=256 agcount=4, agsize=32000 blks = sectsz=512 attr=0 data = bsize=4096 blocks=128000, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 log =internal bsize=4096 blocks=1200, version=2 = sectsz=512 sunit=0 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 Can somebody explain the difference in attr and lazy-count parameters? Which output is the right one? 
mkfs.xfs version 2.9.6
SGI-XFS CVS-2008-02-14_08:00_UTC with ACLs, large block/inode numbers, no debug enabled
SGI XFS Quota Management subsystem
architecture x86_64

# xfs_db -r /dev/loop0
xfs_db> sb
xfs_db> print
magicnum = 0x58465342
blocksize = 4096
dblocks = 128000
rblocks = 0
rextents = 0
uuid = 925f5530-62d4-4385-ad9e-e5c72e2fb609
logstart = 65540
rootino = 128
rbmino = 129
rsumino = 130
rextsize = 1
agblocks = 32000
agcount = 4
rbmblocks = 0
logblocks = 1200
versionnum = 0xb4a4
sectsize = 512
inodesize = 256
inopblock = 16
fname = "\000\000\000\000\000\000\000\000\000\000\000\000"
blocklog = 12
sectlog = 9
inodelog = 8
inopblog = 4
agblklog = 15
rextslog = 0
inprogress = 0
imax_pct = 25
icount = 64
ifree = 61
fdblocks = 126780
frextents = 0
uquotino = 0
gquotino = 0
qflags = 0
flags = 0
shared_vn = 0
inoalignmt = 2
unit = 0
width = 0
dirblklog = 0
logsectlog = 0
logsectsize = 0
logsunit = 1
features2 = 0

Thanks,
Jan

--

From owner-xfs@oss.sgi.com Fri Feb 15 18:46:28 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 18:46:31 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_44, J_CHICKENPOX_52,J_CHICKENPOX_74 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1G2kNnT023329 for ; Fri, 15 Feb 2008 18:46:27 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA22040; Sat, 16 Feb 2008 13:46:40 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1G2kcLF65152734; Sat, 16 Feb 2008 13:46:38 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id
m1G2kY0c65147798; Sat, 16 Feb 2008 13:46:34 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Sat, 16 Feb 2008 13:46:34 +1100 From: David Chinner To: Eric Sandeen Cc: DS , xfs@oss.sgi.com Subject: Re: rewrite very slow Message-ID: <20080216024634.GU155407@sgi.com> References: <20080215191636.GC4859@bob.dscon.sk> <47B62FF6.2000903@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47B62FF6.2000903@sandeen.net> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14458 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Fri, Feb 15, 2008 at 06:36:06PM -0600, Eric Sandeen wrote: > DS wrote: > > Hello, > > > > I need some help to tunning my storage. > > ... > > TIME 43 sekund > > > What kernel? when I test on my 2.6.23.9-85.fc8 and 2.6.22.5 boxes, I > see 2 and 7 seconds for rewrite, respectively. > > but granted, on ext3 I get 0 seconds for every run. > > Also the difference appears to be O_TRUNC (which the perl script does); > if I code it in c: > > #include > #include > #include > > void main(void) > { > int i; > int fd; > char file[4]; > > for (i = 0; i < 100; i++) { > sprintf(file, "%d.txt", i); > fd = open(file, O_CREAT|O_RDWR|O_TRUNC, 0644); > write(fd, "aaaaaaaaaaaaaaaaaaa\n"); > close(fd); > } > } > > rewrite is a bit slower w/ O_TRUNC in place, plenty fast w/o it. Not > sure about the xfs/ext3 difference... this is probably a side-effect of > flushes xfs put into place on truncate (IIRC?) Yup - after a truncate we use flush-on-close semantics if the file is closed before pdflush does writeback. yes, it has a measurable impact on silly microbenchmarks like this, but nobody even noticed it when we introduced this code 2-3 years ago... 
Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Feb 15 20:10:23 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 20:10:29 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G4ALwK025769 for ; Fri, 15 Feb 2008 20:10:23 -0800 X-ASG-Debug-ID: 1203135045-6ad7007f0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5D5BBE369A9 for ; Fri, 15 Feb 2008 20:10:45 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id YxV1235eBnHOE4NK for ; Fri, 15 Feb 2008 20:10:45 -0800 (PST) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3EADE18004B4C; Fri, 15 Feb 2008 22:10:12 -0600 (CST) Message-ID: <47B66223.4080604@sandeen.net> Date: Fri, 15 Feb 2008 22:10:11 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Jan Derfinak CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Differences in mkfs.xfs and xfs_info output. Subject: Re: Differences in mkfs.xfs and xfs_info output. 
References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1203135046 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42323 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14459 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Jan Derfinak wrote: > Hello. > > I found following problem with xfs_info (xfs_grows -p xfs_info) command: > > # mkfs.xfs -f /dev/loop0 > meta-data=/dev/loop0 isize=256 agcount=4, agsize=32000 blks > = sectsz=512 attr=2 > data = bsize=4096 blocks=128000, imaxpct=25 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 > log =internal log bsize=4096 blocks=1200, version=2 > = sectsz=512 sunit=0 blks, lazy-count=1 > realtime =none extsz=4096 blocks=0, rtextents=0 > # mount /dev/loop0 /mnt/usb > # xfs_info /mnt/usb > meta-data=/dev/loop0 isize=256 agcount=4, agsize=32000 blks > = sectsz=512 attr=0 > data = bsize=4096 blocks=128000, imaxpct=25 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 > log =internal bsize=4096 blocks=1200, version=2 > = sectsz=512 sunit=0 blks, lazy-count=0 > realtime =none extsz=4096 blocks=0, rtextents=0 > > Can somebody explain the difference in attr and lazy-count parameters? > Which output is the right one? ... 
> features2 = 0

Looks like neither XFS_SB_VERSION2_ATTR2BIT nor
XFS_SB_VERSION2_LAZYSBCOUNTBIT is in fact set.

You're on x86_64 aren't you...

This reminds me of an email I sent in 2006....

> If you do a fresh mkfs.xfs on x86_64, with -i attr=2, and dump out the
> superblock (or look at it with xfs_db) you will find that although the
> versionnum says that there is a morebits bit, the features2 flag is 0.
>
> if you dd/hexdump the superblock, you will find the attr2 flag, but at
> the wrong offset.
>
> This is because the xfs_sb_t struct is padded out to 64 bits on 64-bit
> arches, and the xfs_xlatesb() routine and xfs_sb_info[] array take this
> padding to mean that the last item is 4 bytes bigger than it is, and
> treats sb_features2 as 8 bytes not four. This then gets endian-flipped out...

which is exactly what is (still) going on.

if you hexdump out the filesystem that was made you'll see:

....
          sbqf  sv vn inode_algn   sbunit      sbwidth
000000b0  00 00 00 00 00 00 00 02  00 00 00 00 00 00 00 00  |................|
          bl sl lgss  logsunit     features2   (nothing)
000000c0  00 00 00 00 00 00 00 01  00 00 00 00 00 00 00 08  |................|

that "08" out in no mans land is what would *like* to be features2 - and
if it were, it'd give you attr2.

Urk.
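
The padding problem described above can be reproduced in miniature. The sketch below is illustrative only and uses none of the real XFS code: `struct mini_sb`, `derived_features2_size`, `flip_field`, and `translate_tail` are hypothetical names, and the host is modelled as little-endian by building the in-memory bytes explicitly. It shows how deriving the last field's width from `sizeof` of the whole struct yields 8 instead of 4 on x86_64, and how byte-swapping 8 bytes instead of 4 moves the flag into the padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical miniature superblock: one 64-bit member followed by a
 * trailing 32-bit field, mirroring how sb_features2 ends the real
 * structure. Where uint64_t is 8-byte aligned (x86_64) the compiler
 * appends 4 bytes of tail padding after features2.
 */
struct mini_sb {
	uint64_t icount;
	uint32_t features2;
};

/*
 * Width of the last field as a table terminated by sizeof() would
 * derive it: 8 on x86_64, 4 on ABIs that align uint64_t to 4 bytes.
 */
static size_t derived_features2_size(void)
{
	return sizeof(struct mini_sb) - offsetof(struct mini_sb, features2);
}

/*
 * Byte-reverse one field, which is what a little-endian host does when
 * converting a value to a big-endian on-disk format.
 */
static void flip_field(uint8_t *dst, const uint8_t *src, size_t size)
{
	for (size_t i = 0; i < size; i++)
		dst[i] = src[size - 1 - i];
}

/*
 * Produce the 8 on-disk bytes starting at features2, treating the
 * field as field_size bytes wide (4 is correct, 8 reproduces the bug).
 * field_size must not exceed 8.
 */
static void translate_tail(uint32_t features2, size_t field_size,
			   uint8_t disk[8])
{
	uint8_t mem[8] = { 0 };	/* little-endian value, then zeroed padding */

	for (int i = 0; i < 4; i++)
		mem[i] = (uint8_t)(features2 >> (8 * i));
	memset(disk, 0, 8);
	flip_field(disk, mem, field_size);
}
```

With a `field_size` of 8, the value 0x8 comes out as `00 00 00 00 00 00 00 08`, the same tail as in the hexdump: the first four bytes, where features2 is read back, are zero. With a `field_size` of 4 it comes out as `00 00 00 08 00 00 00 00`.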
-Eric From owner-xfs@oss.sgi.com Fri Feb 15 20:33:43 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 20:33:45 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G4XgGO030965 for ; Fri, 15 Feb 2008 20:33:42 -0800 X-ASG-Debug-ID: 1203136446-2bd000f50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 808285E21E2 for ; Fri, 15 Feb 2008 20:34:06 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id VMWRnOHoZoUBBJNN for ; Fri, 15 Feb 2008 20:34:06 -0800 (PST) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id BA57818004B4B; Fri, 15 Feb 2008 22:33:33 -0600 (CST) Message-ID: <47B6679C.2040109@sandeen.net> Date: Fri, 15 Feb 2008 22:33:32 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Jan Derfinak CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Differences in mkfs.xfs and xfs_info output. Subject: Re: Differences in mkfs.xfs and xfs_info output. 
References: <47B66223.4080604@sandeen.net> In-Reply-To: <47B66223.4080604@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1203136447 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42324 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14460 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Eric Sandeen wrote: > Jan Derfinak wrote: > >> features2 = 0 >> > > > Looks like neither XFS_SB_VERSION2_ATTR2BIT nor > XFS_SB_VERSION2_LAZYSBCOUNTBIT is in fact set. > > You're on x86_64 aren't you... > > This reminds me of an email I sent in 2006.... > Does this make it happier for you? Hmm actually the kernel needs the analogous change too. And then probably something to fix up all the misformatted filesystems out there.... yuck. 
Hmm and I don't know how lazy-sb-count is getting lost, yet :)

Index: xfsprogs/include/xfs_sb.h
===================================================================
--- xfsprogs.orig/include/xfs_sb.h
+++ xfsprogs/include/xfs_sb.h
@@ -151,6 +151,7 @@ typedef struct xfs_sb
 	__uint16_t sb_logsectsize; /* sector size for the log, bytes */
 	__uint32_t sb_logsunit; /* stripe unit size for the log */
 	__uint32_t sb_features2; /* additional feature bits */
+	__uint32_t sb_dummy; /* explicit padding */
 } xfs_sb_t;
 
 /*
@@ -169,7 +170,7 @@ typedef enum {
 	XFS_SBS_GQUOTINO, XFS_SBS_QFLAGS, XFS_SBS_FLAGS, XFS_SBS_SHARED_VN,
 	XFS_SBS_INOALIGNMT, XFS_SBS_UNIT, XFS_SBS_WIDTH, XFS_SBS_DIRBLKLOG,
 	XFS_SBS_LOGSECTLOG, XFS_SBS_LOGSECTSIZE, XFS_SBS_LOGSUNIT,
-	XFS_SBS_FEATURES2,
+	XFS_SBS_FEATURES2, XFS_SBS_DUMMY,
 	XFS_SBS_FIELDCOUNT
 } xfs_sb_field_t;
 
Index: xfsprogs/libxfs/xfs_mount.c
===================================================================
--- xfsprogs.orig/libxfs/xfs_mount.c
+++ xfsprogs/libxfs/xfs_mount.c
@@ -140,6 +140,7 @@ static struct {
     { offsetof(xfs_sb_t, sb_logsectsize),0 },
     { offsetof(xfs_sb_t, sb_logsunit), 0 },
     { offsetof(xfs_sb_t, sb_features2), 0 },
+    { offsetof(xfs_sb_t, sb_dummy), 0 },
     { sizeof(xfs_sb_t), 0 }
 };
 

From owner-xfs@oss.sgi.com Fri Feb 15 20:36:07 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 20:36:10 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G4a58r031328 for ; Fri, 15 Feb 2008 20:36:07 -0800 X-ASG-Debug-ID: 1203136590-4ae403150000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 55AC45E21FC for ; Fri, 15 Feb 2008 20:36:30
-0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id yC6SacGR2QiEfnBc for ; Fri, 15 Feb 2008 20:36:30 -0800 (PST) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 18BBC18004B4C; Fri, 15 Feb 2008 22:36:30 -0600 (CST) Message-ID: <47B6684D.4060502@sandeen.net> Date: Fri, 15 Feb 2008 22:36:29 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Jan Derfinak CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Differences in mkfs.xfs and xfs_info output. Subject: Re: Differences in mkfs.xfs and xfs_info output. References: <47B66223.4080604@sandeen.net> <47B6679C.2040109@sandeen.net> In-Reply-To: <47B6679C.2040109@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1203136591 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42324 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14461 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Eric Sandeen wrote: > Hmm and I don't know how lazy-sb-count is getting lost, yet :) oh same reason of course. it's just that on my mkfs lazy-count isn't default... 
-Eric From owner-xfs@oss.sgi.com Fri Feb 15 20:41:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 20:41:34 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G4fVHM031787 for ; Fri, 15 Feb 2008 20:41:32 -0800 X-ASG-Debug-ID: 1203136916-6a7400c30000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C75AFE364C4 for ; Fri, 15 Feb 2008 20:41:56 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id huPkmAxRnNUr2XH7 for ; Fri, 15 Feb 2008 20:41:56 -0800 (PST) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id AD85418004B4B; Fri, 15 Feb 2008 22:41:54 -0600 (CST) Message-ID: <47B66991.5040504@sandeen.net> Date: Fri, 15 Feb 2008 22:41:53 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Jan Derfinak CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Differences in mkfs.xfs and xfs_info output. Subject: Re: Differences in mkfs.xfs and xfs_info output. 
References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1203136916 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42324 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14462 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs

Jan Derfinak wrote:
> Hello.
>
> I found following problem with xfs_info (xfs_grows -p xfs_info) command:
>
> # mkfs.xfs -f /dev/loop0
> meta-data=/dev/loop0             isize=256    agcount=4, agsize=32000 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=128000, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096
> log      =internal log           bsize=4096   blocks=1200, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> # mount /dev/loop0 /mnt/usb
> # xfs_info /mnt/usb
> meta-data=/dev/loop0             isize=256    agcount=4, agsize=32000 blks
>          =                       sectsz=512   attr=0
> data     =                       bsize=4096   blocks=128000, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096
> log      =internal               bsize=4096   blocks=1200, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=4096   blocks=0, rtextents=0

...

sorry for replying to my own thread 100 times, but... do you happen to
have a 32-bit mkfs and a 64-bit kernel?

-Eric From owner-xfs@oss.sgi.com Fri Feb 15 21:07:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 21:07:29 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G57Hw1032508 for ; Fri, 15 Feb 2008 21:07:18 -0800 X-ASG-Debug-ID: 1203138462-2b5702c30000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from rv-out-0910.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 17A0FE36B18 for ; Fri, 15 Feb 2008 21:07:42 -0800 (PST) Received: from rv-out-0910.google.com (rv-out-0910.google.com [209.85.198.187]) by cuda.sgi.com with ESMTP id 1IHnTHutEqBCBf1p for ; Fri, 15 Feb 2008 21:07:42 -0800 (PST) Received: by rv-out-0910.google.com with SMTP id k20so576545rvb.32 for ; Fri, 15 Feb 2008 21:07:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition:x-google-sender-auth; bh=njBDmaQlkhjjyabhz8BqP8PSJu0arxXKmZMeRXfkT8g=; b=B3rQJ/z/CqQR2RfK5PPLtk3YiPCBIqSk+XDiaivAlF0KF4XG3d1i9XAXsICFaJ4p33eYMmOZUZgvO9aYuuzzf1qfnTt/eunI3BxJOwtmu+wTbBUaMzmM9uk8QQ3L6+BYRNz3CJKLbIOMP3i7BSmaIcKu3EC6M272yI1aCh1qYxs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:sender:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition:x-google-sender-auth; b=go2eHfIBQVbTR+kJJb5so+jRWsYRKHy3dt+ISNLfSPzH46HfEWsXi4vFF7UuHgPq3YxBwWjmW23ecyHR6IN641jZXOVkHBlz/X65Qy1wtvse4O/AvIFBiNn+jrotmiPb5MeIBoEzumAvEO93nhrMNM+QiOb293J2CyKos33SdhY= Received: by 10.142.187.2 with SMTP id k2mr299435wff.77.1203138071067; Fri, 
15 Feb 2008 21:01:11 -0800 (PST) Received: by 10.142.166.3 with HTTP; Fri, 15 Feb 2008 21:01:10 -0800 (PST) Message-ID: Date: Fri, 15 Feb 2008 21:01:10 -0800 From: "Jeff Breidenbach" To: xfs@oss.sgi.com X-ASG-Orig-Subj: tuning, many small files, small blocksize Subject: tuning, many small files, small blocksize MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Google-Sender-Auth: eb2a6231184f4aad X-Barracuda-Connect: rv-out-0910.google.com[209.85.198.187] X-Barracuda-Start-Time: 1203138463 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42325 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14463 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jeff@jab.org Precedence: bulk X-list: xfs I'm testing xfs for use in storing 100 million+ small files (roughly 4 to 10KB each) and some directories will contain tens of thousands of files. There will be a lot of random reading, and also some random writing, and very little deletion. The underlying disks use Linux software RAID-1 managed by mdadm with 5X redundancy, i.e. 5 drives that completely mirror each other. I am setting up the xfs partition now, and have only played with blocksize so far. 512 byte blocks are most space efficient, 1024 byte blocks cost 3.3% additional space, and 4096 byte blocks cost 22.3% additional space.
I do not know of a good way to benchmark filesystem speed; iozone -s 5 did not provide meaningful results due to poor timing quantization. My questions are: a) Should I just go with the 512 byte blocksize or is that going to be bad for some performance reason? Going to 1024 is no problem, but I'd prefer not to waste 20% of the partition capacity by using 4096. b) Are there any other mkfs.xfs parameters that I should play with? Thanks for any response; I did do quite some searching for recommended tuning parameters, but did not find definitive answers. The general consensus was that xfs does pretty good tuning itself, but almost none of the published benchmarks or recommendations go with small blocksizes and I want to make sure I'm not about to do something totally stupid. Like quadruple the number of seeks on the disk. From owner-xfs@oss.sgi.com Fri Feb 15 21:40:37 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 21:40:42 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_44, J_CHICKENPOX_52,J_CHICKENPOX_74 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G5eZJI001085 for ; Fri, 15 Feb 2008 21:40:36 -0800 X-ASG-Debug-ID: 1203140459-283c01f70000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bob.dscon.sk (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 56F6E5E24BC for ; Fri, 15 Feb 2008 21:41:00 -0800 (PST) Received: from bob.dscon.sk (bob.dscon.sk [88.86.113.10]) by cuda.sgi.com with ESMTP id Qf8KdOWpaBq1LUrY for ; Fri, 15 Feb 2008 21:41:00 -0800 (PST) Received: by bob.dscon.sk (Postfix, from userid 1007) id 802E7DC359; Sat, 16 Feb 2008 06:41:42 +0100 (CET) Date: Sat, 16 Feb 2008 06:41:42 +0100 To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: rewrite very slow
Subject: Re: rewrite very slow Message-ID: <20080216054142.GD4859@bob.dscon.sk> References: <20080215191636.GC4859@bob.dscon.sk> <47B62FF6.2000903@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47B62FF6.2000903@sandeen.net> User-Agent: Mutt/1.5.13 (2006-08-11) From: xfs@bob.dscon.sk (DS) X-Barracuda-Connect: bob.dscon.sk[88.86.113.10] X-Barracuda-Start-Time: 1203140461 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42326 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14464 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@bob.dscon.sk Precedence: bulk X-list: xfs Test configuration: Linux kernel 2.6.23.1 #1 SMP 2x Intel(R) Xeon(TM) CPU 2.40GHz with HT iSCSI storage (1TB - 7 SATA disks in RAID6, 2GB cache controller) Yes, your "test" works fine: file1:/mnt/hosting/test# time ./test real 0m0.334s user 0m0.000s sys 0m0.000s Is there any way to get it to work for perl/php/other scripts/programs? DS On Fri, Feb 15, 2008 at 06:36:06PM -0600, Eric Sandeen wrote: > DS wrote: > > Hello, > > > > I need some help to tunning my storage. > > ... > > TIME 43 sekund > > > What kernel? when I test on my 2.6.23.9-85.fc8 and 2.6.22.5 boxes, I > see 2 and 7 seconds for rewrite, respectively. > > but granted, on ext3 I get 0 seconds for every run.
> > Also the difference appears to be O_TRUNC (which the perl script does); > if I code it in c: > > #include > #include > #include > > void main(void) > { > int i; > int fd; > char file[4]; > > for (i = 0; i < 100; i++) { > sprintf(file, "%d.txt", i); > fd = open(file, O_CREAT|O_RDWR|O_TRUNC, 0644); > write(fd, "aaaaaaaaaaaaaaaaaaa\n"); > close(fd); > } > } > > rewrite is a bit slower w/ O_TRUNC in place, plenty fast w/o it. Not > sure about the xfs/ext3 difference... this is probably a side-effect of > flushes xfs put into place on truncate (IIRC?) > > -Eric > > From owner-xfs@oss.sgi.com Fri Feb 15 21:43:14 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 21:43:18 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_44, J_CHICKENPOX_52,J_CHICKENPOX_74 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G5hDRt001355 for ; Fri, 15 Feb 2008 21:43:14 -0800 X-ASG-Debug-ID: 1203140617-6deb01790000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bob.dscon.sk (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A8C76E36C89 for ; Fri, 15 Feb 2008 21:43:38 -0800 (PST) Received: from bob.dscon.sk (bob.dscon.sk [88.86.113.10]) by cuda.sgi.com with ESMTP id AQP5yjTF50jviAkp for ; Fri, 15 Feb 2008 21:43:38 -0800 (PST) Received: by bob.dscon.sk (Postfix, from userid 1007) id D85FCDC359; Sat, 16 Feb 2008 06:43:49 +0100 (CET) Date: Sat, 16 Feb 2008 06:43:49 +0100 To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: rewrite very slow Subject: Re: rewrite very slow Message-ID: <20080216054349.GE4859@bob.dscon.sk> References: <20080215191636.GC4859@bob.dscon.sk> <47B62FF6.2000903@sandeen.net> <20080216024634.GU155407@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: <20080216024634.GU155407@sgi.com> User-Agent: Mutt/1.5.13 (2006-08-11) From: xfs@bob.dscon.sk (DS) X-Barracuda-Connect: bob.dscon.sk[88.86.113.10] X-Barracuda-Start-Time: 1203140618 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests= X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42327 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14465 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: xfs@bob.dscon.sk Precedence: bulk X-list: xfs Hmm, and what can I do to get it to work now? DS On Sat, Feb 16, 2008 at 01:46:34PM +1100, David Chinner wrote: > On Fri, Feb 15, 2008 at 06:36:06PM -0600, Eric Sandeen wrote: > > DS wrote: > > > Hello, > > > > > > I need some help to tunning my storage. > > > > ... > > > TIME 43 sekund > > > > > > What kernel? when I test on my 2.6.23.9-85.fc8 and 2.6.22.5 boxes, I > > see 2 and 7 seconds for rewrite, respectively. > > > > but granted, on ext3 I get 0 seconds for every run. > > > > Also the difference appears to be O_TRUNC (which the perl script does); > > if I code it in c: > > > > #include > > #include > > #include > > > > void main(void) > > { > > int i; > > int fd; > > char file[4]; > > > > for (i = 0; i < 100; i++) { > > sprintf(file, "%d.txt", i); > > fd = open(file, O_CREAT|O_RDWR|O_TRUNC, 0644); > > write(fd, "aaaaaaaaaaaaaaaaaaa\n"); > > close(fd); > > } > > } > > > > rewrite is a bit slower w/ O_TRUNC in place, plenty fast w/o it. Not > > sure about the xfs/ext3 difference...
this is probably a side-effect of > > flushes xfs put into place on truncate (IIRC?) > > Yup - after a truncate we use flush-on-close semantics if the file > is closed before pdflush does writeback. yes, it has a measurable > impact on silly microbenchmarks like this, but nobody even noticed > it when we introduced this code 2-3 years ago... > > Cheers, > > Dave. > -- > Dave Chinner > Principal Engineer > SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Feb 15 23:19:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 23:19:38 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m1G7JWcC004136 for ; Fri, 15 Feb 2008 23:19:34 -0800 X-ASG-Debug-ID: 1203146395-5bfe00190000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 46B055E2966; Fri, 15 Feb 2008 23:19:56 -0800 (PST) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id CD22cgL0DPFJUeRy; Fri, 15 Feb 2008 23:19:56 -0800 (PST) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m1G7JdF3010595 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Sat, 16 Feb 2008 08:19:40 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m1G7JdCh010593; Sat, 16 Feb 2008 08:19:39 +0100 Date: Sat, 16 Feb 2008 08:19:39 +0100 From: Christoph Hellwig To: Timothy Shimmin Cc: Christoph Hellwig , xfs@oss.sgi.com, a.gruenbacher@computer.org X-ASG-Orig-Subj: Re: [PATCH, RFC] use generic ACL code Subject: Re: [PATCH, RFC] use generic ACL code Message-ID: <20080216071939.GA10578@lst.de> 
References: <20080207083222.GA14317@lst.de> <47B3C701.6090409@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <47B3C701.6090409@sgi.com> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1203146398 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.82 X-Barracuda-Spam-Status: No, SCORE=-0.82 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=3.0 tests=COMMA_SUBJECT, MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.1, rules version 3.1.42329 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 COMMA_SUBJECT Subject is like 'Re: FDSDS, this is a subject' 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14466 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Thu, Feb 14, 2008 at 03:43:45PM +1100, Timothy Shimmin wrote: > Hi Christoph, > > Been going thru some v4 acl code but a couple of comments: > > (1) it looks like you decided that an xfs_iget_acl and xfs_iset_acl > (basing on the ext3 code of Andreas) > are not worth it and you'd prefer to do the code directly. I was actually looking at jfs because I was involved with the creation of that code and it seemed a tad cleaner than ext2/ext3. I'm not sure we want the helpers, but we might need the locking in there. 
From owner-xfs@oss.sgi.com Fri Feb 15 23:40:08 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 15 Feb 2008 23:40:12 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43, J_CHICKENPOX_66 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m1G7e2mm004769 for ; Fri, 15 Feb 2008 23:40:06 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA29339; Sat, 16 Feb 2008 18:40:22 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id m1G7eLLF64733784; Sat, 16 Feb 2008 18:40:21 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id m1G7eJqE65228997; Sat, 16 Feb 2008 18:40:19 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Sat, 16 Feb 2008 18:40:19 +1100 From: David Chinner To: Jan Derfinak Cc: xfs@oss.sgi.com Subject: Re: Differences in mkfs.xfs and xfs_info output. Message-ID: <20080216074019.GV155407@sgi.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/5833/Fri Feb 15 11:30:30 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 14467 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Sat, Feb 16, 2008 at 02:34:08AM +0100, Jan Derfinak wrote: > Hello. 
>
> I found following problem with xfs_info (xfs_grows -p xfs_info) command:
>
> # mkfs.xfs -f /dev/loop0
> meta-data=/dev/loop0             isize=256    agcount=4, agsize=32000 blks
>          =                       sectsz=512   attr=2
> data     =                       bsize=4096   blocks=128000, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096
> log      =internal log           bsize=4096   blocks=1200, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> # mount /dev/loop0 /mnt/usb
> # xfs_info /mnt/usb
> meta-data=/dev/loop0             isize=256    agcount=4, agsize=32000 blks
>          =                       sectsz=512   attr=0
> data     =                       bsize=4096   blocks=128000, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096
> log      =internal               bsize=4096   blocks=1200, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
> Can somebody explain the difference in attr and lazy-count parameters?
> Which output is the right one?

.....

> # xfs_db -r /dev/loop0
> xfs_db> sb
> xfs_db> print
....
> features2 = 0

That's why. I've been meaning to push out a patch to fix this - just
haven't had time.

The patch below should fix the problem - mkfs.xfs is writing the
features2 field to the wrong location in the superblock, and this
patch detects and corrects it.

You'll probably see the output:

XFS: correcting sb_features alignment problem

in dmesg when you first mount the filesystem with the patched kernel,
and then it won't appear again (unless you mkfs it again).

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/xfs_mount.c | 34 ++++++++++++++++++++++++++++------ fs/xfs/xfs_sb.h | 37 ++++++++++++++++++++++++++++++++++--- 2 files changed, 62 insertions(+), 9 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.c 2007-11-22 10:25:25.590278381 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.c 2007-11-22 10:31:27.891999961 +1100 @@ -44,7 +44,7 @@ #include "xfs_quota.h" #include "xfs_fsops.h" -STATIC void xfs_mount_log_sbunit(xfs_mount_t *, __int64_t); +STATIC void xfs_mount_log_sb(xfs_mount_t *, __int64_t); STATIC int xfs_uuid_mount(xfs_mount_t *); STATIC void xfs_uuid_unmount(xfs_mount_t *mp); STATIC void xfs_unmountfs_wait(xfs_mount_t *); @@ -119,6 +119,7 @@ static const struct { { offsetof(xfs_sb_t, sb_logsectsize),0 }, { offsetof(xfs_sb_t, sb_logsunit), 0 }, { offsetof(xfs_sb_t, sb_features2), 0 }, + { offsetof(xfs_sb_t, sb_bad_features2), 0 }, { sizeof(xfs_sb_t), 0 } }; @@ -455,6 +456,7 @@ xfs_sb_from_disk( to->sb_logsectsize = be16_to_cpu(from->sb_logsectsize); to->sb_logsunit = be32_to_cpu(from->sb_logsunit); to->sb_features2 = be32_to_cpu(from->sb_features2); + to->sb_bad_features2 = be32_to_cpu(from->sb_bad_features2); } /* @@ -976,6 +978,26 @@ xfs_mountfs( xfs_mount_common(mp, sbp); /* + * Check for a bad features2 field alignment. This happened on + * some platforms due to xfs_sb_t not being 64bit size aligned + * when sb_features was added and hence the compiler put it in + * the wrong place. + * + * If we detect a bad field, we or the set bits into the existing + * features2 field in case it has already been modified and we + * don't want to lose any features. Zero the bad one and mark + * the two fields as needing updates once the transaction subsystem + * is online. 
+ */ + if (xfs_sb_has_bad_features2(sbp)) { + cmn_err(CE_WARN, + "XFS: correcting sb_features alignment problem"); + sbp->sb_features2 |= sbp->sb_bad_features2; + sbp->sb_bad_features2 = 0; + update_flags |= XFS_SB_FEATURES2 | XFS_SB_BAD_FEATURES2; + } + + /* * Check if sb_agblocks is aligned at stripe boundary * If sb_agblocks is NOT aligned turn off m_dalign since * allocator alignment is within an ag, therefore ag has @@ -1165,11 +1187,10 @@ xfs_mountfs( } /* - * If fs is not mounted readonly, then update the superblock - * unit and width changes. + * If fs is not mounted readonly, then update the superblock c