From owner-xfs@oss.sgi.com Fri Aug 1 00:29:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 00:29:34 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_72 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m717TP56015704 for ; Fri, 1 Aug 2008 00:29:25 -0700 X-ASG-Debug-ID: 1217575837-16fe03ce0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 79D85EEAF81 for ; Fri, 1 Aug 2008 00:30:37 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id FgzSmxucNgrc8tx0 for ; Fri, 01 Aug 2008 00:30:37 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEALlWkkh5LDlw/2dsb2JhbACLIKVP X-IronPort-AV: E=Sophos;i="4.31,291,1215354600"; d="scan'208";a="162074320" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 01 Aug 2008 17:00:34 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KOp5t-0006ji-HJ; Fri, 01 Aug 2008 17:30:33 +1000 Date: Fri, 1 Aug 2008 17:30:33 +1000 From: Dave Chinner To: Jasper Bryant-Greene Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com, hch@lst.de X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Message-ID: <20080801073033.GF6201@disturbed> Mail-Followup-To: Jasper Bryant-Greene , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, hch@lst.de References: <1217553598.3860.16.camel@luna.unix.geek.nz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: <1217553598.3860.16.camel@luna.unix.geek.nz> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217575838 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1425 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17291 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs [cc'd xfs@oss.sgi.com] On Fri, Aug 01, 2008 at 01:19:58PM +1200, Jasper Bryant-Greene wrote: > I mount a bunch of XFS filesystems with "noikeep,attr2,noatime", and a > couple also with "ro". > > Sometimes I want to remount one of the "ro" ones "rw", to make changes. > This doesn't work anymore in 2.6.27-rc1-next-20080730. The last kernel I > tried which it worked in was 2.6.26 release. > > luna ~ # mount -o remount,rw /usr > mount: /usr not mounted already, or bad option > > luna ~ # dmesg | tail -n 1 > [18702.291344] XFS: mount option "noikeep" not supported for remount > > Is this intended behaviour? Side effect of this commit: http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-2.6.git;a=commit;h=0327f9d799ebb96f67c80dd732b1fdb09527365e Christoph? FWIW, noikeep is the default, so you don't need to specify it. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 01:11:35 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 01:11:38 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m718BYBO019934 for ; Fri, 1 Aug 2008 01:11:34 -0700 X-ASG-Debug-ID: 1217578366-0e6d00de0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from lucidpixels.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C260E1967F12 for ; Fri, 1 Aug 2008 01:12:46 -0700 (PDT) Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by cuda.sgi.com with ESMTP id 8LxbVS60esK6GoHG for ; Fri, 01 Aug 2008 01:12:46 -0700 (PDT) Received: by lucidpixels.com (Postfix, from userid 1001) id D87A06E22A; Fri, 1 Aug 2008 04:12:45 -0400 (EDT) Date: Fri, 1 Aug 2008 04:12:45 -0400 (EDT) From: Justin Piszcz To: Matt Garman cc: David Greaves , linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: 12 Veliciraptors again w/x4 card (~1gbyte/sec aggregate read)! Subject: Re: 12 Veliciraptors again w/x4 card (~1gbyte/sec aggregate read)! 
In-Reply-To: <20080801034150.GA32026@sewage.raw-sewage.fake> Message-ID: References: <4872A2F4.2010604@dgreaves.com> <20080801034150.GA32026@sewage.raw-sewage.fake> User-Agent: Alpine 1.10 (DEB 962 2008-03-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Barracuda-Connect: lucidpixels.com[75.144.35.66] X-Barracuda-Start-Time: 1217578367 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1428 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17292 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs On Thu, 31 Jul 2008, Matt Garman wrote: > On Tue, Jul 08, 2008 at 04:39:34AM -0400, Justin Piszcz wrote: >> On Tue, 8 Jul 2008, Justin Piszcz wrote: >>> On Tue, 8 Jul 2008, David Greaves wrote: >>>> Justin Piszcz wrote: >>>>> Each PCI-e x1 card has 1 veliciraptor on it now. >>>>> Got an x4 card wit 4 sata ports: >>>> Useful - which card? >>> >>> StarTech 4 Port PCI Express x4 SATA II Card Model PEXSATA24E >>> >>> >> Chipset: Marvell 88SX7042 >> SATA Connectivity: Use four internal ports at the same time or two internal >> and two external ports >> >> Which is fully supported in the latest kernels (didn't try an old kernel): >> >> linux-2.6.25.10/drivers/ata/sata_mv.c >> >> /* Marvell 7042 support */ >> { PCI_VDEVICE(MARVELL, 0x7042), chip_7042 }, > > How have that card and the array attached to it been doing since you > originally posted this? 
No problems to report yet, working well. I wish I had multiple x4 or x16 slots and could use those instead of x1 cards (the mobo would need to support the I/O as well)..

>
> Doesn't that HighPoint 23xx card also use the Marvell 88SX7042 chip?

I remember reading about a 'locked' 64kb read problem, but not any corruption issue.

> I remember seeing the threads about the HighPoint card silently
> corrupting data by silently writing over parts of the disk(s) with

Also, I MD5SUM -c (verify) 100-150GiB of data daily or so to make sure it is OK; so far, no problems.

> its own info. I'm guessing that's not a "feature" of the Marvell
> chip, but still... admittedly irrational fear here :)

Have not seen it yet.

>
> It would be nice if that Supermicro 8-port SATA card (AOC-SAT2-MV8)
> was available in PCIe... I wonder if it's possible to just graft
> multiple SiI 3132 controllers (2 SATA each) on a single PCIe card?

Yes, I asked Supermicro about it and they never mailed me back about a PCIe version.. That's a good question, I suppose if it was easy it would have already been done?

Justin.
From owner-xfs@oss.sgi.com Fri Aug 1 05:31:44 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 05:32:11 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71CVik8013327 for ; Fri, 1 Aug 2008 05:31:44 -0700 X-ASG-Debug-ID: 1217593976-7dab00b50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BF7C434EA5E for ; Fri, 1 Aug 2008 05:32:56 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id ZGkFRW4HSHMJf3oq for ; Fri, 01 Aug 2008 05:32:56 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEAAmdkkh5LDlw/2dsb2JhbACLIKYk X-IronPort-AV: E=Sophos;i="4.31,293,1215354600"; d="scan'208";a="162219518" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 01 Aug 2008 22:02:54 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KOtoT-0004qh-3S; Fri, 01 Aug 2008 22:32:53 +1000 Date: Fri, 1 Aug 2008 22:32:53 +1000 From: Dave Chinner To: Timothy Shimmin Cc: "Linda A. Walsh" , xfs-oss X-ASG-Orig-Subj: Re: Question about extended attributes... Subject: Re: Question about extended attributes... Message-ID: <20080801123253.GG6201@disturbed> Mail-Followup-To: Timothy Shimmin , "Linda A. 
Walsh" , xfs-oss References: <48925495.7040804@tlinx.org> <4892B361.9030900@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4892B361.9030900@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217593977 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1444 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17293 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 04:55:29PM +1000, Timothy Shimmin wrote: > Hi Linda, > > Linda A. Walsh wrote: > > my man page says extended xfs attributes can have 256-byte names > > with up to 64K of data. > > > > Is there a limit on the number of extended attributes max data size or > > name size? > > > > I.e. could I have 1000 attributes with 64K of data each? > > > Yep. > > > Is there a strong reason why the file and data sizes were limited to > > 256/64K? .... > I'm not sure why 64K was chosen for a value size limit. Because changes to EAs are journalled. Hence there must be a bound size limit because log space is limited. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 12:01:31 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:01:35 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71J1TRn016095 for ; Fri, 1 Aug 2008 12:01:31 -0700 X-ASG-Debug-ID: 1217617361-6f9802950000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 568E5EECD81; Fri, 1 Aug 2008 12:02:41 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id 2iJ922mYLnWryOEb; Fri, 01 Aug 2008 12:02:41 -0700 (PDT) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1KOztg-0002UY-FB; Fri, 01 Aug 2008 19:02:40 +0000 Date: Fri, 1 Aug 2008 15:02:40 -0400 From: Christoph Hellwig To: Lachlan McIlroy , torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, xfs@oss.sgi.com, akpm@linux-foundation.org X-ASG-Orig-Subj: Re: [GIT PULL] XFS update for 2.6.27 Subject: Re: [GIT PULL] XFS update for 2.6.27 Message-ID: <20080801190240.GA23604@infradead.org> References: <20080731025328.6328358C4C3F@chook.melbourne.sgi.com> <20080731034952.GC13395@disturbed> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080731034952.GC13395@disturbed> User-Agent: Mutt/1.5.18 (2008-05-17) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1217617363 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1468 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17294 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs

On Thu, Jul 31, 2008 at 01:49:52PM +1000, Dave Chinner wrote:
> author of that patch. I don't know why there's duplicates of it
> in the tree - the only thing I can see is that the committer's
> email address is not stable. There is a growing number of
> duplicate commits like this in the XFS tree.....

Yes, I pointed this out before. But at this point I'd rather have the XFS changes in .27 and not completely miss the window even if that means some patches are not correctly attributed to me.
From owner-xfs@oss.sgi.com Fri Aug 1 12:21:33 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:21:35 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JLXsL022388 for ; Fri, 1 Aug 2008 12:21:33 -0700 X-ASG-Debug-ID: 1217618566-184601f80000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0A890EECFA5; Fri, 1 Aug 2008 12:22:46 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id oDawIsWtqxBJF8Yk; Fri, 01 Aug 2008 12:22:46 -0700 (PDT) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1KP0D7-0007pX-Tg; Fri, 01 Aug 2008 19:22:45 +0000 Date: Fri, 1 Aug 2008 15:22:45 -0400 From: Christoph Hellwig To: Barry Naujok Cc: "xfs@oss.sgi.com" X-ASG-Orig-Subj: Re: [REVIEW] Cleanup more dir v1 macros and stuff Subject: Re: [REVIEW] Cleanup more dir v1 macros and stuff Message-ID: <20080801192245.GA26807@infradead.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.18 (2008-05-17) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1217618567 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code 
version 3.2, rules version 3.2.1.1472 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17295 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Tue, Jul 29, 2008 at 11:29:36AM +1000, Barry Naujok wrote: > Well, it seems I found a few remnants (cookie crumbs) of dirv1 stuff > in the kernel! This patch removes those remnants! Looks good. From owner-xfs@oss.sgi.com Fri Aug 1 12:30:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:30:37 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_72, RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JUV42023276 for ; Fri, 1 Aug 2008 12:30:34 -0700 X-ASG-Debug-ID: 1217619103-5b8a01b40000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0F44419693FD for ; Fri, 1 Aug 2008 12:31:43 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id jHIXvIJMLnwE85BA for ; Fri, 01 Aug 2008 12:31:43 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JVYIF001202 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:31:34 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71JVXYn001200; Fri, 1 Aug 2008 21:31:33 +0200 Date: Fri, 1 Aug 2008 21:31:33 +0200 From: Christoph Hellwig To: 
Jasper Bryant-Greene , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Message-ID: <20080801193133.GA838@lst.de> References: <1217553598.3860.16.camel@luna.unix.geek.nz> <20080801073033.GF6201@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080801073033.GF6201@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619105 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1473 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17296 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 05:30:33PM +1000, Dave Chinner wrote: > On Fri, Aug 01, 2008 at 01:19:58PM +1200, Jasper Bryant-Greene wrote: > > I mount a bunch of XFS filesystems with "noikeep,attr2,noatime", and a > > couple also with "ro". > > > > Sometimes I want to remount one of the "ro" ones "rw", to make changes. > > This doesn't work anymore in 2.6.27-rc1-next-20080730. The last kernel I > > tried which it worked in was 2.6.26 release. 
> >
> > luna ~ # mount -o remount,rw /usr
> > mount: /usr not mounted already, or bad option
> >
> > luna ~ # dmesg | tail -n 1
> > [18702.291344] XFS: mount option "noikeep" not supported for remount
> >
> > Is this intended behaviour?
>
> Side effect of this commit:
>
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-2.6.git;a=commit;h=0327f9d799ebb96f67c80dd732b1fdb09527365e
>
> Christoph?

It's most likely fallout, but I wonder why. To get this behaviour mount would have to add all the options it finds in /proc/self/mounts to the command line. Added the util-linux-ng list to shed some light on this, but I suspect I'll have to change xfs_fs_remount to check if the option passed in is identical to the one we already have and only reject it when it changes due to this mount(1) dumbness.

From owner-xfs@oss.sgi.com Fri Aug 1 12:33:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:33:06 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JX4eW023720 for ; Fri, 1 Aug 2008 12:33:04 -0700 X-ASG-Debug-ID: 1217619256-35d602cf0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id DA78779F8F1 for ; Fri, 1 Aug 2008 12:34:16 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 5SzCTZFbpCnClorY for ; Fri, 01 Aug 2008 12:34:16 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JYIIF001361 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:34:18 +0200 Received: (from hch@localhost) by verein.lst.de
(8.12.3/8.12.3/Debian-6.6) id m71JYIN9001359; Fri, 1 Aug 2008 21:34:18 +0200 Date: Fri, 1 Aug 2008 21:34:18 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/21] btree cleanups and unification Subject: Re: [PATCH 00/21] btree cleanups and unification Message-ID: <20080801193418.GA1263@lst.de> References: <20080729193000.GA19104@lst.de> <20080730005338.GG13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730005338.GG13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619257 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1472 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17297 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs

On Wed, Jul 30, 2008 at 10:53:39AM +1000, Dave Chinner wrote:
> On Tue, Jul 29, 2008 at 09:30:00PM +0200, Christoph Hellwig wrote:
> > This is the full btree unification based on Dave's initial patches plus
> > various cleanups. This second version of the post now contains the complete
> > btree refactoring, and passes XFSQA.
>
> Cool. Are the first few patches unchanged from the ones I reviewed
> earlier?

As mentioned below, only the trace and readahead patches changed a little bit.
From owner-xfs@oss.sgi.com Fri Aug 1 12:33:59 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:34:04 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JXnU9023922 for ; Fri, 1 Aug 2008 12:33:59 -0700 X-ASG-Debug-ID: 1217619302-35b5029f0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 102A379F655 for ; Fri, 1 Aug 2008 12:35:02 -0700 (PDT) Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id MSZf9a3CUzBCftQU for ; Fri, 01 Aug 2008 12:35:02 -0700 (PDT) Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3CD61AC6275; Fri, 1 Aug 2008 14:35:01 -0500 (CDT) Message-ID: <48936564.706@sandeen.net> Date: Fri, 01 Aug 2008 14:35:00 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.16 (Macintosh/20080707) MIME-Version: 1.0 To: Christoph Hellwig CC: Barry Naujok , "xfs@oss.sgi.com" X-ASG-Orig-Subj: Re: [REVIEW] Cleanup more dir v1 macros and stuff Subject: Re: [REVIEW] Cleanup more dir v1 macros and stuff References: <20080801192245.GA26807@infradead.org> In-Reply-To: <20080801192245.GA26807@infradead.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1217619303 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of 
TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1472 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17298 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Christoph Hellwig wrote: > On Tue, Jul 29, 2008 at 11:29:36AM +1000, Barry Naujok wrote: >> Well, it seems I found a few remnants (cookie crumbs) of dirv1 stuff >> in the kernel! This patch removes those remnants! > > > Looks good. > > is the 0x7fffffff stuff in xfs_file.c also a dir v1 remnant? cvs history seems to indicate so.... -Eric From owner-xfs@oss.sgi.com Fri Aug 1 12:35:11 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:35:12 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_64, RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JZA4p024489 for ; Fri, 1 Aug 2008 12:35:10 -0700 X-ASG-Debug-ID: 1217619382-435602f80000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3B64C1969487 for ; Fri, 1 Aug 2008 12:36:23 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 1BydtQl8xroo0TOQ for ; Fri, 01 Aug 2008 12:36:23 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JaPIF001471 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 
verify=NO); Fri, 1 Aug 2008 21:36:25 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71JaPaP001469; Fri, 1 Aug 2008 21:36:25 +0200 Date: Fri, 1 Aug 2008 21:36:25 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 08/21] make btree tracing generic Subject: Re: [PATCH 08/21] make btree tracing generic Message-ID: <20080801193625.GB1263@lst.de> References: <20080729193044.GI19104@lst.de> <20080730013920.GI13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730013920.GI13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619384 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1473 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17299 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 11:39:20AM +1000, Dave Chinner wrote: > > + void (*trace_enter)(struct xfs_btree_cur *, const char *, > > + char *, int, int, __psunsigned_t, > > + __psunsigned_t, __psunsigned_t, > > + __psunsigned_t, __psunsigned_t, > > + __psunsigned_t, __psunsigned_t, > > + __psunsigned_t, __psunsigned_t, > > + __psunsigned_t, __psunsigned_t); > > Would it be better to use a 
'trace args' structure here rather
> than passing a heap of parameters? memset(args, 0,...) rather than
> passing a whole heap of zeros in most cases seems like a better
> approach to me, esp. as they all get cast to (void *) anyway....

Well, we need to cast to void * somewhere, and going from struct to arguments list isn't helpful either. Changing the whole ktrace thing to an args structure and/or varargs would be nice, but is not in scope for this patchset..

> > +static void
> > +xfsidb_btree_trace_record(
>
> shouldn't these all use "xfsidbg" prefixes?

Probably.

From owner-xfs@oss.sgi.com Fri Aug 1 12:38:46 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:38:49 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JckLP024998 for ; Fri, 1 Aug 2008 12:38:46 -0700 X-ASG-Debug-ID: 1217619598-351602cd0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1390179F96A for ; Fri, 1 Aug 2008 12:39:58 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id Zr3kXKc4lVE7HkcM for ; Fri, 01 Aug 2008 12:39:58 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71Je0IF001662 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:40:00 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71Je0kl001660; Fri, 1 Aug 2008 21:40:00 +0200 Date: Fri, 1 Aug 2008 21:40:00 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 09/21] implement generic
xfs_btree_increment Subject: Re: [PATCH 09/21] implement generic xfs_btree_increment Message-ID: <20080801194000.GC1263@lst.de> References: <20080729193053.GJ19104@lst.de> <20080730020645.GJ13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730020645.GJ13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619600 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1472 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17300 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 12:06:45PM +1000, Dave Chinner wrote: > > +int xfs_btree_increment(struct xfs_btree_cur *, int, int *); > > I think for these interfaces it woul dbe better to include > variable names in the prototypes. i.e.: > > int xfs_btree_increment(struct xfs_btree_cur *cur, int level, int *stat); Well, all this and much more is documented at the implementation site, so I'd avoid this churn. If there are strong feelings for it I can change the prototypes. 
> Can kill the else there:
>
>     if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
>         return be64_to_cpu(ptr->l) == NULLFSBLOCK;
>     return be32_to_cpu(ptr->s) == NULLAGBLOCK;

Normally I'd agree with you, but for the short/long pointer case the
if/else makes the symmetry of the code clear and is thus IMHO more
readable.

> > +            ptr->s = block->bb_u.s.bb_rightsib;
> > +        else
> > +            ptr->s = block->bb_u.s.bb_leftsib;
> > +    }
> > +}
>
> Should we use ternary notation for this? i.e:

I find the if/else much easier to read.

> Hmmm - if xfs_btree_check_block() returns an error, we won't release
> the buffer in the error handling path. While this may not be a
> problem right now (as the only error is EFSCORRUPTED) we want to be
> able to recover from errors here in the future so we should really
> exit cleanly from this function....

Fixed.

> Would it be better here to explicitly test this rather than assert?
> ie.:
>
>     if (lev == cur->bc_nlevels) {
>         if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
>             goto out0;
>         ASSERT(0);
>         error = EFSCORRUPTED;
>         goto error0;
>     }
>
> So that we get an explicit error reported in the production systems
> rather than carrying on and dying a horrible death somewhere else....

Yes, this is much better. But we're getting onto a slippery slope here
of introducing too many improvements. For the initial patches I'd like
to stay as close as possible to the old btree implementations. Anyway,
this change is trivial enough and I've added it.
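[Editorial sketch, not part of the original mail.] The explicit check discussed above can be illustrated outside the kernel roughly as follows. All names prefixed with sketch_ or SKETCH_ are hypothetical stand-ins, not XFS symbols, and the error value is an arbitrary placeholder for EFSCORRUPTED. The debug-only ASSERT(0) from the quoted suggestion is left out so the sketch keeps running after detecting the bad case.

```c
#include <assert.h>

/* Hypothetical stand-ins for EFSCORRUPTED and XFS_BTREE_ROOT_IN_INODE. */
#define SKETCH_EFSCORRUPTED	117
#define SKETCH_ROOT_IN_INODE	(1u << 0)

/* Minimal stand-in for the two cursor fields the check looks at. */
struct sketch_cur {
	unsigned int	flags;
	int		nlevels;
};

/*
 * Decide what to do once the walk up the tree stopped at level "lev".
 *
 * Returns 0 and sets *stat to 1 if a usable level was found.
 * Returns 0 and sets *stat to 0 if we ran off the top of a tree whose
 * root lives in an inode (a legitimate "nothing more to do" case).
 * Returns SKETCH_EFSCORRUPTED if we ran off the top of any other tree,
 * so production builds report the problem instead of carrying on.
 */
static int sketch_check_level(const struct sketch_cur *cur, int lev, int *stat)
{
	*stat = 0;
	if (lev == cur->nlevels) {
		if (cur->flags & SKETCH_ROOT_IN_INODE)
			return 0;
		/* A debug build would ASSERT(0) here before returning. */
		return SKETCH_EFSCORRUPTED;
	}
	*stat = 1;
	return 0;
}
```

The point of the design is that the unexpected case yields an error code in every build, while the assertion only adds an early trap in debug builds.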
From owner-xfs@oss.sgi.com Fri Aug 1 12:39:05 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:39:06 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71Jd4IY025126 for ; Fri, 1 Aug 2008 12:39:04 -0700 X-ASG-Debug-ID: 1217619616-56c701d70000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 564701969148 for ; Fri, 1 Aug 2008 12:40:17 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id ZfNPWglSUvzbnPnl for ; Fri, 01 Aug 2008 12:40:17 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JeJIF001690 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:40:19 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71JeJBL001688; Fri, 1 Aug 2008 21:40:19 +0200 Date: Fri, 1 Aug 2008 21:40:19 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 10/21] implement generic xfs_btree_decrement Subject: Re: [PATCH 10/21] implement generic xfs_btree_decrement Message-ID: <20080801194019.GD1263@lst.de> References: <20080729193058.GK19104@lst.de> <20080730020919.GK13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730020919.GK13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619618 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by 
cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1473 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17301 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 12:09:19PM +1000, Dave Chinner wrote: > On Tue, Jul 29, 2008 at 09:30:58PM +0200, Christoph Hellwig wrote: > > From: Dave Chinner > > > > [hch: split out from bigger patch and minor adaptions] > > > > Signed-off-by: Christoph Hellwig > .... > > + /* > > + * If we went off the root then we are seriously confused. > > + * or the root of the tree is in an inode. > > + */ > > + if (lev == cur->bc_nlevels) { > > + ASSERT(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE); > > + goto out0; > > + } > > Same question as for the increment case. Ok, changed. 
From owner-xfs@oss.sgi.com Fri Aug 1 12:42:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:42:28 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JgN2D025847 for ; Fri, 1 Aug 2008 12:42:25 -0700 X-ASG-Debug-ID: 1217619815-033902d30000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 76B7EEECE54 for ; Fri, 1 Aug 2008 12:43:36 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id jyQHmKs3e9O8FkXt for ; Fri, 01 Aug 2008 12:43:36 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JhaIF001834 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:43:36 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71Jha3U001831; Fri, 1 Aug 2008 21:43:36 +0200 Date: Fri, 1 Aug 2008 21:43:36 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 11/21] implement generic xfs_btree_lookup Subject: Re: [PATCH 11/21] implement generic xfs_btree_lookup Message-ID: <20080801194336.GE1263@lst.de> References: <20080729193104.GL19104@lst.de> <20080730045937.GL13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730045937.GL13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619817 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com 
at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1474 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17302 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 02:59:37PM +1000, Dave Chinner wrote: > > + if (level == 0) { > > + union xfs_btree_rec *krp; > > + > > + krp = cur->bc_ops->rec_addr(cur, keyno, block); > > + cur->bc_ops->init_key_from_rec(cur, kp, krp); > > + return kp; > > I think the record pointer variable "krp" is confusing with "kp", the key > pointer. It's a record pointer; 'rp' it should be. Ok, fixed. > Can you make the {} consistent here and move the comment for the > else? i.e. Yeah, done. > > + /* Return if we succeeded or not. */ > > + if (keyno == 0 || keyno > xfs_btree_get_numrecs(block)) > > + *stat = 0; > > + else > > + *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); > > Can probably kill all the extra () in that. I've turned this into an if / else if / else chain with the superfluous braces removed. > > --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-07-15 17:46:52.000000000 +0200 > > +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 2008-07-15 17:51:41.000000000 +0200 > > @@ -90,6 +90,54 @@ STATIC int xfs_alloc_ag_vextent_small(xf > > */ > > > > /* > > + * Lookup the record equal to [bno, len] in the btree given by cur.
> > + */ > > +STATIC int /* error */ > > +xfs_alloc_lookup_eq( > > Should these be xfs_allocbt_lookup_*() to be consistent > with all the other allocbt functions (and inobt_lookup/bmbt_lookup)? Currently only the btree_ops methods are named allocbt. Comments on that scheme would be appreciated. From owner-xfs@oss.sgi.com Fri Aug 1 12:43:39 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:43:42 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71Jhdh8026218 for ; Fri, 1 Aug 2008 12:43:39 -0700 X-ASG-Debug-ID: 1217619890-034b02cc0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3D7BEEECE8D for ; Fri, 1 Aug 2008 12:44:51 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id kn13qPxLKWg7KvlN for ; Fri, 01 Aug 2008 12:44:51 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JirIF001895 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:44:53 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71Jirfa001893; Fri, 1 Aug 2008 21:44:53 +0200 Date: Fri, 1 Aug 2008 21:44:52 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 12/21] implement generic xfs_btree_updkey Subject: Re: [PATCH 12/21] implement generic xfs_btree_updkey Message-ID: <20080801194452.GF1263@lst.de> References: <20080729193110.GM19104@lst.de> <20080730050936.GM13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: <20080730050936.GM13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619893 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1474 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17303 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 03:09:36PM +1000, Dave Chinner wrote: > > + int ptr; > > +#ifdef DEBUG > > + int error; > > +#endif > > This can be scoped inside the for loop. Done. > And even then I think we might not need an error variable - it can > only return EFSCORRUPTED, so: > > #ifdef DEBUG > if (xfs_btree_check_block(cur, block, level, bp)) { > XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); > return EFSCORRUPTED; > } > #endif > > Would remove the need for the error variable. I'm not a big fan of losing detailed error information. Depending on how the bigger btree blocks work out we might return other errors here in the near future. 
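[Editorial sketch, not part of the original mail.] The trade-off in this exchange, collapsing every failure into one code versus keeping an error variable so the callee's code survives, can be shown with a small stand-alone example. Every name here is hypothetical; sketch_check_block() merely returns whatever error it is told to simulate and is not the real xfs_btree_check_block().

```c
#include <assert.h>

/* Hypothetical error values standing in for EIO and EFSCORRUPTED. */
#define SKETCH_EIO		5
#define SKETCH_EFSCORRUPTED	117

/* Stand-in for a block check: returns the error it is asked to simulate. */
static int sketch_check_block(int simulated_error)
{
	return simulated_error;
}

/* Variant without an error variable: any failure becomes "corrupted". */
static int sketch_updkey_collapsing(int simulated_error)
{
	if (sketch_check_block(simulated_error))
		return SKETCH_EFSCORRUPTED;
	return 0;
}

/* Variant with a local error variable: the callee's code is passed on. */
static int sketch_updkey_propagating(int simulated_error)
{
	int error = sketch_check_block(simulated_error);

	if (error)
		return error;
	return 0;
}
```

As long as the check can only fail one way the two variants behave the same; they diverge as soon as the check gains a second failure mode, which is the concern raised in the reply.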
From owner-xfs@oss.sgi.com Fri Aug 1 12:45:10 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:45:13 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JjAMj026682 for ; Fri, 1 Aug 2008 12:45:10 -0700 X-ASG-Debug-ID: 1217619982-034002e60000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 2D79DEECEAF for ; Fri, 1 Aug 2008 12:46:22 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id HeRMpGKsBZpdtR9O for ; Fri, 01 Aug 2008 12:46:22 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JkOIF001957 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:46:24 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71JkO7E001955; Fri, 1 Aug 2008 21:46:24 +0200 Date: Fri, 1 Aug 2008 21:46:24 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 13/21] implement generic xfs_btree_update Subject: Re: [PATCH 13/21] implement generic xfs_btree_update Message-ID: <20080801194624.GG1263@lst.de> References: <20080729193116.GN19104@lst.de> <20080730052959.GN13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730052959.GN13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217619984 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com 
at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1474 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17304 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 03:29:59PM +1000, Dave Chinner wrote: > Oh, it's been moved inside the update code itself. So, why always call > the update function and then check the ptr? Why not the way it was > originally done? Because all three callers do different checks, and I could not prove that they are either identical or harmless for the other cases. We can clean this mess up later in small standalone patches. > > + /* updated last record information */ > > + void (*update_lastrec)(struct xfs_btree_cur *, > > + struct xfs_btree_block *, > > + union xfs_btree_rec *, int, int); > > Can you add the variable names to the prototype parameters? Done.
From owner-xfs@oss.sgi.com Fri Aug 1 12:45:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:45:37 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JjY4x026877 for ; Fri, 1 Aug 2008 12:45:34 -0700 X-ASG-Debug-ID: 1217620006-5b8a020e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B011C19694BD for ; Fri, 1 Aug 2008 12:46:46 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id UqLSddrE3S7uYl8U for ; Fri, 01 Aug 2008 12:46:46 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JkmIF001985 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:46:48 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71Jkm4a001983; Fri, 1 Aug 2008 21:46:48 +0200 Date: Fri, 1 Aug 2008 21:46:48 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 14/21] add get_maxrecs btree operation Subject: Re: [PATCH 14/21] add get_maxrecs btree operation Message-ID: <20080801194648.GH1263@lst.de> References: <20080729193121.GO19104@lst.de> <20080730053455.GO13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730053455.GO13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217620007 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at 
sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1473 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17305 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 03:34:55PM +1000, Dave Chinner wrote: > On Tue, Jul 29, 2008 at 09:31:21PM +0200, Christoph Hellwig wrote: > > Factor xfs_btree_maxrecs into a per-btree operation. > > > > > > Signed-off-by: Christoph Hellwig > > ->get_maxrecs was extracted out of my original patchset, right? > > I didn't apply it to removing xfs_btree_maxrecs(), though. Good > idea. Looks fine to me. Yeah, should have added a comment that this has been taken from the big patch. 
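[Editorial sketch, not part of the original mail.] Factoring a "maximum records per block" computation into a per-btree operation can be pictured with a small ops vector. The structure and function names below are hypothetical, and the header, record, key and pointer sizes are illustrative assumptions rather than values taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_cur;

/* Hypothetical per-btree operations vector with a single method. */
struct sketch_btree_ops {
	int (*get_maxrecs)(const struct sketch_cur *cur, int level);
};

/* Minimal cursor: which btree flavour it walks and the block size. */
struct sketch_cur {
	const struct sketch_btree_ops	*ops;
	size_t				blocksize;
};

/* Assumed layout: 16-byte header, 8-byte records, 8-byte keys, 4-byte ptrs. */
static int sketch_short_maxrecs(const struct sketch_cur *cur, int level)
{
	size_t space = cur->blocksize - 16;

	/* Level 0 blocks hold records, higher levels hold key/ptr pairs. */
	return (int)(level == 0 ? space / 8 : space / (8 + 4));
}

/* Assumed layout: 24-byte header, 16-byte records, 8-byte keys, 8-byte ptrs. */
static int sketch_long_maxrecs(const struct sketch_cur *cur, int level)
{
	size_t space = cur->blocksize - 24;

	return (int)(level == 0 ? space / 16 : space / (8 + 8));
}

static const struct sketch_btree_ops sketch_short_ops = {
	.get_maxrecs = sketch_short_maxrecs,
};

static const struct sketch_btree_ops sketch_long_ops = {
	.get_maxrecs = sketch_long_maxrecs,
};

/* Generic code only goes through the ops vector, never a type switch. */
static int sketch_get_maxrecs(const struct sketch_cur *cur, int level)
{
	return cur->ops->get_maxrecs(cur, level);
}
```

Generic callers then need no knowledge of the block layout; each btree type supplies its own arithmetic.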
From owner-xfs@oss.sgi.com Fri Aug 1 12:48:00 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:48:01 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_64,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JlxZH027520 for ; Fri, 1 Aug 2008 12:48:00 -0700 X-ASG-Debug-ID: 1217620151-5b9201fa0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 35F471623F45 for ; Fri, 1 Aug 2008 12:49:12 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id A2VJIBGzgXV6IG3Q for ; Fri, 01 Aug 2008 12:49:12 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JnEIF002080 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:49:14 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71JnEt0002078; Fri, 1 Aug 2008 21:49:14 +0200 Date: Fri, 1 Aug 2008 21:49:14 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 15/21] implement generic xfs_btree_rshift Subject: Re: [PATCH 15/21] implement generic xfs_btree_rshift Message-ID: <20080801194914.GI1263@lst.de> References: <20080729193125.GP19104@lst.de> <20080730060808.GP13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730060808.GP13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217620153 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1473 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17306 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs > > +{ > > + ASSERT(from >= 0 && from <= 1000); > > + ASSERT(to >= 0 && to <= 1000); > > + ASSERT(numptrs >= 0); > > Those numbers are not safe. I plucked them out of thin air to verify > validity on 4k block size filesystem which had (IIRC) a max of about > 500 ptrs to a block. It was throwaway debug code to find a problem. > Larger block sizes can well exceed 1000. So realistically, the only > valid assert there is this one: > > ASSERT(numptrs >= 0); Yeah, I've actually managed to trigger it now. The >=0 checks for from and to still make sense, although they might be overkill. 
> How about:
>
>     union xfs_btree_ptr *pp;
>     xfs_caddr_t *block = XFS_BUF_TO_BLOCK(bp);
>     xfs_caddr_t start; /* first byte offset logged */
>     xfs_caddr_t end; /* last byte offset logged */
>
>     pp = cur->bc_ops->ptr_addr(cur, 1, XFS_BUF_TO_BLOCK(bp));
>
>     if (cur->bc_flags & XFS_BTREE_LONG_PTRS) {
>         __be64 *lpp = &pp->l;
>
>         start = (xfs_caddr_t)&lpp[first - 1] - block;
>         end = ((xfs_caddr_t)&lpp[last] - 1) - block;
>     } else {
>         __be32 *spp = &pp->s;
>
>         start = (xfs_caddr_t)&spp[first - 1] - block;
>         end = ((xfs_caddr_t)&spp[last] - 1) - block;
>     }
>
>     xfs_trans_log_buf(cur->bc_tp, bp, (int)start, (int)end);
>
> That makes it much easier to read (to me, anyway).

Yes, absolutely. And there's also another set of useless braces. I've
also applied a similar cleanup to the log_keys and log_recs
implementations.

> > + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY);
> > + XFS_BTREE_TRACE_ARGBI(cur, bp, fields);
> > +
> > + if (bp) {
> > +     xfs_btree_offsets(fields,
> > +         (cur->bc_flags & XFS_BTREE_LONG_PTRS) ?
> > +             loffsets : soffsets,
>                                   ^^
> Some stray whitespace there.

Fixed.

> > +         XFS_BB_NUM_BITS, &first, &last);
> > +     xfs_trans_log_buf(cur->bc_tp, bp, first, last);
> > + } else {
> > +     /* XXX(hch): maybe factor out into a method? */
> > +     xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip,
> > +         XFS_ILOG_FBROOT(cur->bc_private.b.whichfork));
>
> I don't think it is necessary at this point.

It's the only leakage of the detailed inode root implementation into
the generic code, so I'm still wondering whether a method would be
better.
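[Editorial sketch, not part of the original mail.] The byte-range arithmetic in the quoted suggestion (first logged byte is the start of pointer first, last logged byte is the final byte of pointer last, both 1-based) can be checked in isolation. The function and parameter names are hypothetical, and the sketch works on plain byte offsets within a block rather than on kernel pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the inclusive byte range covering pointers first..last
 * (1-based indices) of a pointer array that begins ptrs_off bytes into
 * the block.  long_ptrs selects 64-bit entries, otherwise 32-bit.
 */
static void sketch_ptr_log_range(size_t ptrs_off, int long_ptrs,
				 int first, int last,
				 size_t *start, size_t *end)
{
	size_t ptr_size = long_ptrs ? sizeof(uint64_t) : sizeof(uint32_t);

	/* Start of entry "first": skip the first - 1 entries before it. */
	*start = ptrs_off + (size_t)(first - 1) * ptr_size;
	/* Last byte of entry "last": one byte before entry last + 1. */
	*end = ptrs_off + (size_t)last * ptr_size - 1;
}
```

For example, with the array at offset 100 and 64-bit pointers, entries 1 through 2 cover bytes 100 to 115 inclusive.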
From owner-xfs@oss.sgi.com Fri Aug 1 12:51:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:51:37 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JpZmo028078 for ; Fri, 1 Aug 2008 12:51:36 -0700 X-ASG-Debug-ID: 1217620367-584301f00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id DC4321969680 for ; Fri, 1 Aug 2008 12:52:47 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 1CMxQ23uQF61qxEA for ; Fri, 01 Aug 2008 12:52:47 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JqnIF002292 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:52:50 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71JqngI002290; Fri, 1 Aug 2008 21:52:49 +0200 Date: Fri, 1 Aug 2008 21:52:49 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 16/21] implement generic xfs_btree_lshift Subject: Re: [PATCH 16/21] implement generic xfs_btree_lshift Message-ID: <20080801195249.GJ1263@lst.de> References: <20080729193132.GQ19104@lst.de> <20080730062422.GQ13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730062422.GQ13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217620368 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at 
sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1475 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17307 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs > > +xfs_btree_copy_ptrs( > > + struct xfs_btree_cur *cur, > > + union xfs_btree_ptr *src_ptr, > > + union xfs_btree_ptr *dst_ptr, > > + int numptrs) > > +{ > > + ASSERT(numptrs > 0); > > + > > + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) > > + memcpy(dst_ptr, src_ptr, numptrs * sizeof(__be64)); > > + else > > + memcpy(dst_ptr, src_ptr, numptrs * sizeof(__be32)); > > +} > > These should really use memmove, not memcpy. There is no guarantee > the source and destination do not overlap. > > At minimum, we need comments to say this must only be used to > copy between blocks, and xfs_btree_move_ptrs() must be used to > copy within a block. I note the original patchset of mine > commented on this distinction when defining the ->move_* and > ->copy_* operations. > > FWIW, that also helps explain why they have different interfaces... There were some comments in the pre-walkthru cleanup version but they were already lost in that patch. But yes, adding some comments makes sense. Or moving back to a single helper that, unlike your very first version, always passes src and dst pointers and always uses memmove. > This can be moved inside the branch it is used in. Done. > Move the XFS_BTREE_STATS_ADD() above the comment to match the rshift > code. Done.
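[Editorial sketch, not part of the original mail.] The memcpy versus memmove distinction raised above comes down to whether source and destination may overlap. A minimal illustration, using hypothetical helper names and plain 32-bit entries instead of the real pointer unions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Shift entries within ONE array.  Source and destination ranges may
 * overlap, so memmove() is required; memcpy() would be undefined here.
 */
static void sketch_move_ptrs(uint32_t *base, int src_idx, int dst_idx,
			     int numptrs)
{
	assert(numptrs >= 0);
	memmove(&base[dst_idx], &base[src_idx],
		(size_t)numptrs * sizeof(*base));
}

/*
 * Copy entries between two DIFFERENT arrays.  The caller guarantees
 * that the ranges do not overlap, which is what makes memcpy() valid.
 */
static void sketch_copy_ptrs(uint32_t *dst, const uint32_t *src, int numptrs)
{
	assert(numptrs >= 0);
	memcpy(dst, src, (size_t)numptrs * sizeof(*dst));
}
```

A single helper that always takes source and destination pointers and always calls memmove() would be correct for both uses, which is the alternative mentioned in the reply.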
From owner-xfs@oss.sgi.com Fri Aug 1 12:53:56 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:53:58 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JrtNe028501 for ; Fri, 1 Aug 2008 12:53:55 -0700 X-ASG-Debug-ID: 1217620507-763103280000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BDFA0EED08C for ; Fri, 1 Aug 2008 12:55:07 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id LG5LbHyakmEweHdm for ; Fri, 01 Aug 2008 12:55:07 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71Jt7IF002453 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:55:07 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71Jt7jP002451; Fri, 1 Aug 2008 21:55:07 +0200 Date: Fri, 1 Aug 2008 21:55:07 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 17/21] implement generic xfs_btree_split Subject: Re: [PATCH 17/21] implement generic xfs_btree_split Message-ID: <20080801195507.GK1263@lst.de> References: <20080729192113.493074843@verein.lst.de> <20080729193137.GR19104@lst.de> <20080730065349.GR13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730065349.GR13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217620509 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1474 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17308 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs > This is where I begin to question this approach (i.e. using > helpers like this rather than specific ops like I did). It's > taken me 4 or 5 patches to put my finger on it. > > The intent of this factorisation is to make implementing new btree > structures easy, not making the current code better or more > manageable. The first thing we need is btrees with different > header blocks (self describing information, CRCs, etc). This above > function will suddenly have four combinations to deal with - long and > short, version 1 and version 2 header formats. The more we change, > the more this complicates these helpers. That is why I pushed > seemingly trivial stuff out to operations vectors - because of the > future flexibility it allowed in implementation of new btrees..... > > I don't see this as a problem for this patch series, but I can see that > some of this work will end up being converted back to ops vectors > as soon as we start modifying between structures.... Maybe. But even when we convert it to ops vectors it should not be the btree implementation vector, but a btree_block_ops that's implemented once instead of duplicated for the alloc vs ialloc btree.
And for now having all this in xfs_btree.c makes reading and working on the patch series easier, so.. > > + /* need to sort out how callers deal with failures first */ > > + ASSERT(!(flags & XFS_BUF_TRYLOCK)); > > + > > + d = xfs_btree_ptr_to_daddr(cur, ptr); > > + *bpp = xfs_trans_get_buf(cur->bc_tp, mp->m_ddev_targp, d, > > + mp->m_bsize, flags); > > + > > + ASSERT(*bpp); > > + ASSERT(!XFS_BUF_GETERROR(*bpp)); > > xfs_trans_get_buf() can return NULL, right? Only when XFS_BUF_TRYLOCK is set, which it never is for the btree code, and the assert above makes sure we catch any new caller early. > > + /* block allocation / freeing */ > > + int (*alloc_block)(struct xfs_btree_cur *cur, > > + union xfs_btree_ptr *sbno, > > + union xfs_btree_ptr *nbno, > > start_bno, new_bno. Done. From owner-xfs@oss.sgi.com Fri Aug 1 12:54:42 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 12:54:44 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71JsgKN028830 for ; Fri, 1 Aug 2008 12:54:42 -0700 X-ASG-Debug-ID: 1217620554-6f97039f0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 27313EEC4A1 for ; Fri, 1 Aug 2008 12:55:54 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id urN8ecjl7BPmuyPE for ; Fri, 01 Aug 2008 12:55:54 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71JtuIF002510 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 21:55:56 +0200 Received: (from hch@localhost) by 
verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71JtuW6002508; Fri, 1 Aug 2008 21:55:56 +0200 Date: Fri, 1 Aug 2008 21:55:56 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 21/21] clean up xfs_bmap_btree.c Subject: Re: [PATCH 21/21] clean up xfs_bmap_btree.c Message-ID: <20080801195556.GL1263@lst.de> References: <20080729193200.GV19104@lst.de> <20080730073245.GV13395@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080730073245.GV13395@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217620556 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1474 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17309 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Wed, Jul 30, 2008 at 05:32:45PM +1000, Dave Chinner wrote: > > + ASSERT(nrecs == crecs); > > + kp = xfs_bmbt_key_addr(cur, 1, block); > > + ckp = xfs_bmbt_key_addr(cur, 1, cblock); > > + memcpy(kp, ckp, nrecs * sizeof(xfs_bmbt_key_t)); > > Should really be xfs_bmbt_copy_keys(), right? Yes, fixed. > > + memcpy(pp, cpp, nrecs * sizeof(xfs_bmbt_key_t)); > > xfs_bmbt_copy_recs()? Yes, fixed. 
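A minimal sketch of why the review above prefers named copy helpers over open-coded memcpy() calls: the element size is taken from the pointer type, so a call site cannot pair pointer arrays with a key-sized sizeof by accident. The demo_* types and functions are hypothetical stand-ins for illustration, not the real xfs_bmbt_* definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a btree key and a btree block pointer. */
typedef struct demo_key {
	uint64_t	startoff;
} demo_key_t;

typedef uint64_t demo_ptr_t;

/*
 * Copy numkeys keys from one block to a different block.
 * memcpy() is only safe here because src and dst never overlap.
 */
static void
demo_copy_keys(demo_key_t *dst, const demo_key_t *src, int numkeys)
{
	assert(numkeys >= 0);
	/* sizeof(*dst) follows the key type, whatever it is. */
	memcpy(dst, src, (size_t)numkeys * sizeof(*dst));
}

/*
 * Copy numptrs block pointers from one block to a different block.
 * The size comes from the pointer type, not from the key type.
 */
static void
demo_copy_ptrs(demo_ptr_t *dst, const demo_ptr_t *src, int numptrs)
{
	assert(numptrs >= 0);
	memcpy(dst, src, (size_t)numptrs * sizeof(*dst));
}
```

With helpers of this shape, passing a pointer array to the key helper is a type error that the compiler reports, instead of a silent size mismatch.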
From owner-xfs@oss.sgi.com Fri Aug 1 14:17:04 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 14:17:05 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m71LH2bR002221 for ; Fri, 1 Aug 2008 14:17:03 -0700 X-ASG-Debug-ID: 1217625484-3fa702930000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BEC82EEDE55 for ; Fri, 1 Aug 2008 14:18:14 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id NDHJcMaEBADVbetj for ; Fri, 01 Aug 2008 14:18:14 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m71LI7IF006672 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 1 Aug 2008 23:18:07 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m71LI6cK006670; Fri, 1 Aug 2008 23:18:06 +0200 Date: Fri, 1 Aug 2008 23:18:06 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 2/3] optimize xfs_ichgtime Subject: Re: [PATCH 2/3] optimize xfs_ichgtime Message-ID: <20080801211806.GA6529@lst.de> References: <20080726063335.GB22603@lst.de> <20080726091230.GS5947@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080726091230.GS5947@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217625495 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com 
X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1480 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17310 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Sat, Jul 26, 2008 at 07:12:30PM +1000, Dave Chinner wrote: > On Sat, Jul 26, 2008 at 08:33:35AM +0200, Christoph Hellwig wrote: > > Port a little optimization from file_update_time to xfs_ichgtime, and > > only update the timestamp and mark the inode dirty if the timestamp > > actually changes in the timer tick resolution supported by the running > > kernel. > > Looks ok. 
Updated version that removes the I_NEW check (I could do this in patch 1,
but doing it here causes less churn)

Signed-off-by: Christoph Hellwig

Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c
===================================================================
--- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_iops.c	2008-07-26 11:14:45.000000000 +0200
+++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_iops.c	2008-07-26 11:15:54.000000000 +0200
@@ -97,17 +97,23 @@ xfs_ichgtime(
 {
 	struct inode	*inode = VFS_I(ip);
 	timespec_t	tv;
+	int		sync_it = 0;
 
-	nanotime(&tv);
-	if (flags & XFS_ICHGTIME_MOD) {
+	tv = current_fs_time(inode->i_sb);
+
+	if ((flags & XFS_ICHGTIME_MOD) &&
+	    !timespec_equal(&inode->i_mtime, &tv)) {
 		inode->i_mtime = tv;
 		ip->i_d.di_mtime.t_sec = (__int32_t)tv.tv_sec;
 		ip->i_d.di_mtime.t_nsec = (__int32_t)tv.tv_nsec;
+		sync_it = 1;
 	}
-	if (flags & XFS_ICHGTIME_CHG) {
+	if ((flags & XFS_ICHGTIME_CHG) &&
+	    !timespec_equal(&inode->i_ctime, &tv)) {
 		inode->i_ctime = tv;
 		ip->i_d.di_ctime.t_sec = (__int32_t)tv.tv_sec;
 		ip->i_d.di_ctime.t_nsec = (__int32_t)tv.tv_nsec;
+		sync_it = 1;
 	}
 
 	/*
@@ -119,10 +125,11 @@ xfs_ichgtime(
 	 * ensure that the compiler does not reorder the update
 	 * of i_update_core above the timestamp updates above.
 	 */
-	SYNCHRONIZE();
-	ip->i_update_core = 1;
-	if (!(inode->i_state & I_NEW))
+	if (sync_it) {
+		SYNCHRONIZE();
+		ip->i_update_core = 1;
 		mark_inode_dirty_sync(inode);
+	}
 }
 
 /*

From owner-xfs@oss.sgi.com Fri Aug 1 17:33:57 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 17:34:29 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m720Xv7a019133 for ; Fri, 1 Aug 2008 17:33:57 -0700 X-ASG-Debug-ID: 1217637309-26d303aa0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 09D5B2FF491 for ; Fri, 1 Aug 2008 17:35:10 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id mrCcq1W0WvGEFFte for ; Fri, 01 Aug 2008 17:35:10 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEAMlFk0h5LDlw/2dsb2JhbACLH6Rf X-IronPort-AV: E=Sophos;i="4.31,296,1215354600"; d="scan'208";a="162464048" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 02 Aug 2008 10:05:08 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KP55O-0005Th-RK; Sat, 02 Aug 2008 10:35:06 +1000 Date: Sat, 2 Aug 2008 10:35:06 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 09/21] implement generic xfs_btree_increment Subject: Re: [PATCH 09/21] implement generic xfs_btree_increment Message-ID: <20080802003506.GJ6201@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080729193053.GJ19104@lst.de> 
<20080730020645.GJ13395@disturbed> <20080801194000.GC1263@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080801194000.GC1263@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217637311 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1492 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17311 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:40:00PM +0200, Christoph Hellwig wrote: > On Wed, Jul 30, 2008 at 12:06:45PM +1000, Dave Chinner wrote: > > > +int xfs_btree_increment(struct xfs_btree_cur *, int, int *); > > > > I think for these interfaces it would be better to include > > variable names in the prototypes. i.e.: > > > > int xfs_btree_increment(struct xfs_btree_cur *cur, int level, int *stat); > > Well, all this and much more is documented at the implementation site, > so I'd avoid this churn. If there are strong feelings for it I > can change the prototypes. Not a big deal, really. I was just looking at consistency with the other prototypes around this one.... > > Would it be better here to explicitly test this rather than assert? 
> > ie.: > > > > if (lev == cur->bc_nlevels) { > > if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) > > goto out0; > > ASSERT(0); > > error = EFSCORRUPTED; > > goto error0; > > } > > > > So that we get an explicit error reported in the production systems > > rather than carrying on and dying a horrible death somewhere else.... > > Yes, this is much better. But we're getting on a slippery slope here > to introduce too many improvements. For the initial patches I'd like > to stay as close as possible to the old btree implementations. Sure, but this particular piece of code is different to the original btree anyway because it has to handle two different cases. Better to get it right first time, IMO. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 18:12:52 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 18:12:53 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m721Cpum021806 for ; Fri, 1 Aug 2008 18:12:51 -0700 X-ASG-Debug-ID: 1217639642-37be01040000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C2466196B220 for ; Fri, 1 Aug 2008 18:14:03 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id CfDGaCG3TOEDPv5A for ; Fri, 01 Aug 2008 18:14:03 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEAFBQk0h5LDlw/2dsb2JhbACLH6RY X-IronPort-AV: E=Sophos;i="4.31,296,1215354600"; d="scan'208";a="162481693" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by 
ipmail01.adl6.internode.on.net with ESMTP; 02 Aug 2008 10:44:01 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KP5h2-0006LB-NO; Sat, 02 Aug 2008 11:14:00 +1000 Date: Sat, 2 Aug 2008 11:14:00 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 11/21] implement generic xfs_btree_lookup Subject: Re: [PATCH 11/21] implement generic xfs_btree_lookup Message-ID: <20080802011400.GK6201@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080729193104.GL19104@lst.de> <20080730045937.GL13395@disturbed> <20080801194336.GE1263@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080801194336.GE1263@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217639644 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1495 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17312 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:43:36PM +0200, Christoph Hellwig wrote: > On Wed, Jul 30, 2008 at 02:59:37PM +1000, Dave Chinner wrote: > > > --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-07-15 17:46:52.000000000 +0200 > > > +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 
2008-07-15 17:51:41.000000000 +0200 > > > @@ -90,6 +90,54 @@ STATIC int xfs_alloc_ag_vextent_small(xf > > > */ > > > > > > /* > > > + * Lookup the record equal to [bno, len] in the btree given by cur. > > > + */ > > > +STATIC int /* error */ > > > +xfs_alloc_lookup_eq( > > > > Should these be xfs_allocbt_lookup_*() to be consistent > > with all the other allocbt functions (and inobt_lookup/bmbt_lookup)? > > Currently only the btree_ops methods are named allocbt. Comments on > that scheme would be appreciated. I think I used 'abt' for it. Perhaps 'albt'? allocbt is probably ok, though, just longer. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 18:13:31 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 18:13:39 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m721DVv5022110 for ; Fri, 1 Aug 2008 18:13:31 -0700 X-ASG-Debug-ID: 1217639684-6f1801a50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 27DE6EF0C34 for ; Fri, 1 Aug 2008 18:14:44 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id PN8bTZHTJPa5s2Ya for ; Fri, 01 Aug 2008 18:14:44 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEAFBQk0h5LDlw/2dsb2JhbACLH6RY X-IronPort-AV: E=Sophos;i="4.31,296,1215354600"; d="scan'208";a="162482320" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 02 Aug 2008 10:44:42 +0930 Received: from dave by disturbed 
with local (Exim 4.69) (envelope-from ) id 1KP5hi-0006MG-IA; Sat, 02 Aug 2008 11:14:42 +1000 Date: Sat, 2 Aug 2008 11:14:42 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 12/21] implement generic xfs_btree_updkey Subject: Re: [PATCH 12/21] implement generic xfs_btree_updkey Message-ID: <20080802011442.GL6201@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080729193110.GM19104@lst.de> <20080730050936.GM13395@disturbed> <20080801194452.GF1263@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080801194452.GF1263@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217639685 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1496 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17313 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:44:52PM +0200, Christoph Hellwig wrote: > On Wed, Jul 30, 2008 at 03:09:36PM +1000, Dave Chinner wrote: > > And even then I think we might not need an error variable - it can > > only return EFSCORRUPTED, so: > > > > #ifdef DEBUG > > if (xfs_btree_check_block(cur, block, level, bp)) { > > XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); > > 
return EFSCORRUPTED; > > } > > #endif > > > > Would remove the need for the error variable. > > I'm not a big fan of losing detailed error information. Depending > on how the bigger btree blocks work out we might return other errors > here in the near future. Fair enough. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 18:14:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 18:14:17 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m721ECDk022516 for ; Fri, 1 Aug 2008 18:14:12 -0700 X-ASG-Debug-ID: 1217639724-6b9301920000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BB57F352C7A for ; Fri, 1 Aug 2008 18:15:25 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id F6hpy0xbA8HlFo7W for ; Fri, 01 Aug 2008 18:15:25 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEAFBQk0h5LDlw/2dsb2JhbACLH6RY X-IronPort-AV: E=Sophos;i="4.31,296,1215354600"; d="scan'208";a="162482876" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 02 Aug 2008 10:45:24 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KP5iN-0006Ni-Ot; Sat, 02 Aug 2008 11:15:23 +1000 Date: Sat, 2 Aug 2008 11:15:23 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 13/21] implement generic xfs_btree_update Subject: Re: [PATCH 13/21] implement generic xfs_btree_update 
Message-ID: <20080802011523.GM6201@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080729193116.GN19104@lst.de> <20080730052959.GN13395@disturbed> <20080801194624.GG1263@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080801194624.GG1263@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217639725 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1495 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17314 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:46:24PM +0200, Christoph Hellwig wrote: > On Wed, Jul 30, 2008 at 03:29:59PM +1000, Dave Chinner wrote: > > Oh, it's been moved inside the update code itself. So, why always call > > the update function and then check the ptr? Why not the way it was > > originally done? > > Because all three callers do different checks, and I could not prove > that they are either identical or harmless for the other cases. > We can clean this mess up later in small standalone patches. Ok. Sounds like a good plan. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 18:19:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 18:19:34 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m721JVdF023234 for ; Fri, 1 Aug 2008 18:19:32 -0700 X-ASG-Debug-ID: 1217640044-6b9401bf0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C81F6352B29 for ; Fri, 1 Aug 2008 18:20:45 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id GDD0G901SnYBkZkT for ; Fri, 01 Aug 2008 18:20:45 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEAFBQk0h5LDlw/2dsb2JhbACLH6RY X-IronPort-AV: E=Sophos;i="4.31,296,1215354600"; d="scan'208";a="162486324" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 02 Aug 2008 10:50:43 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KP5nW-0006Ug-Sz; Sat, 02 Aug 2008 11:20:42 +1000 Date: Sat, 2 Aug 2008 11:20:42 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 15/21] implement generic xfs_btree_rshift Subject: Re: [PATCH 15/21] implement generic xfs_btree_rshift Message-ID: <20080802012042.GN6201@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080729193125.GP19104@lst.de> <20080730060808.GP13395@disturbed> <20080801194914.GI1263@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline 
Content-Transfer-Encoding: 8bit In-Reply-To: <20080801194914.GI1263@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217640045 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1496 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17315 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:49:14PM +0200, Christoph Hellwig wrote: > > > + XFS_BB_NUM_BITS, &first, &last); > > > + xfs_trans_log_buf(cur->bc_tp, bp, first, last); > > > + } else { > > > + /* XXX(hch): maybe factor out into a method? */ > > > + xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, > > > + XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); > > > > I don't think it is necessary at this point. > > It's the only leakage of the detailed inode root implementation into > the generic code, so I'm still wondering whether a method would be > better. Ah, right. yes, it probably would be cleaner to do it as a separate method, but I don't think it's that important right now. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 18:26:54 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 18:26:59 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m721QrVa024081 for ; Fri, 1 Aug 2008 18:26:54 -0700 X-ASG-Debug-ID: 1217640485-6d1e02400000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BC3CFEF0C88 for ; Fri, 1 Aug 2008 18:28:06 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id WbR3G9u1mV6l9rsl for ; Fri, 01 Aug 2008 18:28:06 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEANhTk0h5LDlw/2dsb2JhbACLH6RN X-IronPort-AV: E=Sophos;i="4.31,296,1215354600"; d="scan'208";a="162489900" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 02 Aug 2008 10:58:04 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KP5ud-0006dz-Qq; Sat, 02 Aug 2008 11:28:03 +1000 Date: Sat, 2 Aug 2008 11:28:03 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 16/21] implement generic xfs_btree_lshift Subject: Re: [PATCH 16/21] implement generic xfs_btree_lshift Message-ID: <20080802012803.GO6201@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080729193132.GQ19104@lst.de> <20080730062422.GQ13395@disturbed> <20080801195249.GJ1263@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: 
inline In-Reply-To: <20080801195249.GJ1263@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217640486 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1496 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17316 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:52:49PM +0200, Christoph Hellwig wrote: > > > +xfs_btree_copy_ptrs( > > > + struct xfs_btree_cur *cur, > > > + union xfs_btree_ptr *src_ptr, > > > + union xfs_btree_ptr *dst_ptr, > > > + int numptrs) > > > +{ > > > + ASSERT(numptrs > 0); > > > + > > > + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) > > > + memcpy(dst_ptr, src_ptr, numptrs * sizeof(__be64)); > > > + else > > > + memcpy(dst_ptr, src_ptr, numptrs * sizeof(__be32)); > > > +} > > > > These should really use memmove, not memcpy. There is no guarantee > > the source and destination do not overlap. > > > > At minimum, we need comments to say this must only be used to > > copy between blocks, and xfs_btree_move_ptrs() must be used to > > copy within a block. I note the original patchset of mine > > commented on this distinction when defining the ->move_* and > > ->copy_* operations. > > > > FWIW, that also helps explain why they have different interfaces... 
> > There were some comments in the pre-walkthru cleanup version but they > were already lost in that patch. But yes, adding some comments makes > sense. Or moving back to a single one that unlike your very first > version always passes src and dst pointers and always uses memmove. It might make sense to go back to a single implementation, though at the time I did it it made sense to split the move/copy operations because it made both cases simpler. Seeing as you've stuck more closely to the original structure of the code, the distinction is not as great, so it might be best to go back to a single memmove based interface. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Fri Aug 1 18:32:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Fri, 01 Aug 2008 18:32:33 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m721WQEG024896 for ; Fri, 1 Aug 2008 18:32:26 -0700 X-ASG-Debug-ID: 1217640819-6d24021c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6B481EF0CAF for ; Fri, 1 Aug 2008 18:33:39 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id L2kMhz3OCbU2m57g for ; Fri, 01 Aug 2008 18:33:39 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEANhTk0h5LDlw/2dsb2JhbACLH6RN X-IronPort-AV: E=Sophos;i="4.31,296,1215354600"; d="scan'208";a="162492958" Received: from ppp121-44-57-112.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.57.112]) by ipmail01.adl6.internode.on.net with ESMTP; 02 Aug 2008 11:03:37 +0930 Received: 
from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KP5zR-0006l3-2s; Sat, 02 Aug 2008 11:33:01 +1000 Date: Sat, 2 Aug 2008 11:33:01 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 17/21] implement generic xfs_btree_split Subject: Re: [PATCH 17/21] implement generic xfs_btree_split Message-ID: <20080802013301.GP6201@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080729192113.493074843@verein.lst.de> <20080729193137.GR19104@lst.de> <20080730065349.GR13395@disturbed> <20080801195507.GK1263@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080801195507.GK1263@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217640820 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0209 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1496 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17317 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:55:07PM +0200, Christoph Hellwig wrote: > > This is where I begin to question this approach (i.e. using > > helpers like this rather than specific ops like I did). It's > > taken me 4 or 5 patches to put my finger on it. 
> > > > The intent of this factorisation is to make implementing new btree > > structures easy, not making the current code better or more > > manageable. The first thing we need is btrees with different > > header blocks (self describing information, CRCs, etc). This above > > function will suddenly have four combinations to deal with - long and > > short, version 1 and version 2 header formats. The more we change, > > the more this complicates these helpers. That is why I pushed > > seemingly trivial stuff out to operations vectors - because of the > > future flexibility it allowed in implementation of new btrees..... > > > > I don't see this as a problem for this patch series, but I can see that > > some of this work will end up being converted back to ops vectors > > as soon as we start modifying between structures.... > > Maybe. But even when we convert it to ops vectors it should not > be the btree implementation vector, but a btree_block_ops that's > implemented once instead of duplicated for the alloc vs ialloc > btree. Yes, that makes sense - I had sort of headed that way, but it was not fully thought out... > And for now having all this in xfs_btree.c makes reading > and working on the patch series easier, so.. Fair enough - do the work when we implement a new btree ;) Cheers, Dave. 
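For illustration only, the btree_block_ops idea discussed above could look roughly like the following userspace sketch. This is not code from the patch series: every demo_* name, the simplified header layouts and the choice of methods are assumptions made for the example. The point it tries to show is that the vector is keyed on the block header format (short vs. long sibling pointers), so btrees that share a format share one implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified block headers (not the real on-disk layout). */
struct demo_sblock {		/* short form: 32-bit sibling pointers */
	uint32_t magic;
	uint16_t level;
	uint16_t numrecs;
	uint32_t leftsib;
	uint32_t rightsib;
};

struct demo_lblock {		/* long form: 64-bit sibling pointers */
	uint32_t magic;
	uint16_t level;
	uint16_t numrecs;
	uint64_t leftsib;
	uint64_t rightsib;
};

/* One ops vector per header format, not one per btree type. */
struct demo_block_ops {
	size_t   (*hdr_len)(void);
	uint64_t (*get_rightsib)(const void *block);
	void     (*set_rightsib)(void *block, uint64_t sib);
};

static size_t demo_short_hdr_len(void)
{
	return sizeof(struct demo_sblock);
}

static uint64_t demo_short_get_rightsib(const void *block)
{
	return ((const struct demo_sblock *)block)->rightsib;
}

static void demo_short_set_rightsib(void *block, uint64_t sib)
{
	((struct demo_sblock *)block)->rightsib = (uint32_t)sib;
}

static size_t demo_long_hdr_len(void)
{
	return sizeof(struct demo_lblock);
}

static uint64_t demo_long_get_rightsib(const void *block)
{
	return ((const struct demo_lblock *)block)->rightsib;
}

static void demo_long_set_rightsib(void *block, uint64_t sib)
{
	((struct demo_lblock *)block)->rightsib = sib;
}

/* Implemented once; usable by every btree with short-form blocks. */
static const struct demo_block_ops demo_short_block_ops = {
	demo_short_hdr_len, demo_short_get_rightsib, demo_short_set_rightsib,
};

/* Implemented once; usable by every btree with long-form blocks. */
static const struct demo_block_ops demo_long_block_ops = {
	demo_long_hdr_len, demo_long_get_rightsib, demo_long_set_rightsib,
};

/* Hypothetical per-btree descriptor that only points at the block ops. */
struct demo_btree {
	const char *name;
	const struct demo_block_ops *bops;
};

/* Generic code never needs to know which header format it is handling. */
static void demo_link_right(const struct demo_btree *bt, void *block,
			    uint64_t newsib)
{
	assert(bt != NULL && bt->bops != NULL);
	bt->bops->set_rightsib(block, newsib);
}
```

Under these assumptions a new header format (say, one carrying a CRC) would add a third demo_block_ops instance rather than another branch in each generic helper.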
-- Dave Chinner david@fromorbit.com
From owner-xfs@oss.sgi.com Sat Aug 2 08:10:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Aug 2008 08:10:38 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m72FABhT004897 for ; Sat, 2 Aug 2008 08:10:12 -0700 X-ASG-Debug-ID: 1217689882-5a85017e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 47B2FEF24FA for ; Sat, 2 Aug 2008 08:11:22 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id O2iB3ZBlHUSWBgts for ; Sat, 02 Aug 2008 08:11:22 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m72FBOIF019777 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Sat, 2 Aug 2008 17:11:24 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m72FBOBo019775 for xfs@oss.sgi.com; Sat, 2 Aug 2008 17:11:24 +0200 Date: Sat, 2 Aug 2008 17:11:24 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] cleanup xfs_sb.h feature flag helpers Subject: [PATCH] cleanup xfs_sb.h feature flag helpers Message-ID: <20080802151124.GA19689@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217689885 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 
QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1551 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17319 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs The various inlines in xfs_sb.h that deal with the superblock version and feature flags were converted from macros a while ago, and this shows in the odd coding style full of useless braces and backslashes and the avoidance of conditionals. Clean these up to look like normal C code. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_sb.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_sb.h 2008-07-24 22:27:36.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_sb.h 2008-07-24 22:47:12.000000000 +0200 @@ -296,30 +296,34 @@ typedef enum { #define XFS_SB_VERSION_NUM(sbp) ((sbp)->sb_versionnum & XFS_SB_VERSION_NUMBITS) -#ifdef __KERNEL__ static inline int xfs_sb_good_version(xfs_sb_t *sbp) { - return (((sbp->sb_versionnum >= XFS_SB_VERSION_1) && \ - (sbp->sb_versionnum <= XFS_SB_VERSION_3)) || \ - ((XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - !((sbp->sb_versionnum & ~XFS_SB_VERSION_OKREALBITS) || \ - ((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) && \ - (sbp->sb_features2 & ~XFS_SB_VERSION2_OKREALBITS))) && \ - (sbp->sb_shared_vn <= XFS_SB_MAX_SHARED_VN))); -} + /* We always support version 1-3 */ + if (sbp->sb_versionnum >= XFS_SB_VERSION_1 && + sbp->sb_versionnum <= XFS_SB_VERSION_3) + return 1; + + /* We support version 4 if all feature bits are supported */ + if (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) { + if ((sbp->sb_versionnum & ~XFS_SB_VERSION_OKREALBITS) || + 
((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) && + (sbp->sb_features2 & ~XFS_SB_VERSION2_OKREALBITS))) + return 0; + +#ifdef __KERNEL__ + if (sbp->sb_shared_vn > XFS_SB_MAX_SHARED_VN) + return 0; #else -static inline int xfs_sb_good_version(xfs_sb_t *sbp) -{ - return (((sbp->sb_versionnum >= XFS_SB_VERSION_1) && \ - (sbp->sb_versionnum <= XFS_SB_VERSION_3)) || \ - ((XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - !((sbp->sb_versionnum & ~XFS_SB_VERSION_OKREALBITS) || \ - ((sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT) && \ - (sbp->sb_features2 & ~XFS_SB_VERSION2_OKREALBITS))) && \ - (!(sbp->sb_versionnum & XFS_SB_VERSION_SHAREDBIT) || \ - (sbp->sb_shared_vn <= XFS_SB_MAX_SHARED_VN)))); + if ((sbp->sb_versionnum & XFS_SB_VERSION_SHAREDBIT) && + sbp->sb_shared_vn > XFS_SB_MAX_SHARED_VN) + return 0; +#endif + + return 1; + } + + return 0; } -#endif /* __KERNEL__ */ /* * Detect a mismatched features2 field. Older kernels read/wrote @@ -332,123 +336,127 @@ static inline int xfs_sb_has_mismatched_ static inline unsigned xfs_sb_version_tonew(unsigned v) { - return ((((v) == XFS_SB_VERSION_1) ? \ - 0 : \ - (((v) == XFS_SB_VERSION_2) ? \ - XFS_SB_VERSION_ATTRBIT : \ - (XFS_SB_VERSION_ATTRBIT | XFS_SB_VERSION_NLINKBIT))) | \ - XFS_SB_VERSION_4); + if (v == XFS_SB_VERSION_1) + return XFS_SB_VERSION_4; + + if (v == XFS_SB_VERSION_2) + return XFS_SB_VERSION_4 | XFS_SB_VERSION_ATTRBIT; + + return XFS_SB_VERSION_4 | XFS_SB_VERSION_ATTRBIT | + XFS_SB_VERSION_NLINKBIT; } static inline unsigned xfs_sb_version_toold(unsigned v) { - return (((v) & (XFS_SB_VERSION_QUOTABIT | XFS_SB_VERSION_ALIGNBIT)) ? \ - 0 : \ - (((v) & XFS_SB_VERSION_NLINKBIT) ? \ - XFS_SB_VERSION_3 : \ - (((v) & XFS_SB_VERSION_ATTRBIT) ? 
\ - XFS_SB_VERSION_2 : \ - XFS_SB_VERSION_1))); + if (v & (XFS_SB_VERSION_QUOTABIT | XFS_SB_VERSION_ALIGNBIT)) + return 0; + if (v & XFS_SB_VERSION_NLINKBIT) + return XFS_SB_VERSION_3; + if (v & XFS_SB_VERSION_ATTRBIT) + return XFS_SB_VERSION_2; + return XFS_SB_VERSION_1; } static inline int xfs_sb_version_hasattr(xfs_sb_t *sbp) { - return ((sbp)->sb_versionnum == XFS_SB_VERSION_2) || \ - ((sbp)->sb_versionnum == XFS_SB_VERSION_3) || \ - ((XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_ATTRBIT)); + return sbp->sb_versionnum == XFS_SB_VERSION_2 || + sbp->sb_versionnum == XFS_SB_VERSION_3 || + (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_ATTRBIT)); } static inline void xfs_sb_version_addattr(xfs_sb_t *sbp) { - (sbp)->sb_versionnum = (((sbp)->sb_versionnum == XFS_SB_VERSION_1) ? \ - XFS_SB_VERSION_2 : \ - ((XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) ? \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_ATTRBIT) : \ - (XFS_SB_VERSION_4 | XFS_SB_VERSION_ATTRBIT))); + if (sbp->sb_versionnum == XFS_SB_VERSION_1) + sbp->sb_versionnum = XFS_SB_VERSION_2; + else if (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) + sbp->sb_versionnum |= XFS_SB_VERSION_ATTRBIT; + else + sbp->sb_versionnum = XFS_SB_VERSION_4 | XFS_SB_VERSION_ATTRBIT; } static inline int xfs_sb_version_hasnlink(xfs_sb_t *sbp) { - return ((sbp)->sb_versionnum == XFS_SB_VERSION_3) || \ - ((XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_NLINKBIT)); + return sbp->sb_versionnum == XFS_SB_VERSION_3 || + (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_NLINKBIT)); } static inline void xfs_sb_version_addnlink(xfs_sb_t *sbp) { - (sbp)->sb_versionnum = ((sbp)->sb_versionnum <= XFS_SB_VERSION_2 ? 
\ - XFS_SB_VERSION_3 : \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_NLINKBIT)); + if (sbp->sb_versionnum <= XFS_SB_VERSION_2) + sbp->sb_versionnum = XFS_SB_VERSION_3; + else + sbp->sb_versionnum |= XFS_SB_VERSION_NLINKBIT; } static inline int xfs_sb_version_hasquota(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_QUOTABIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_QUOTABIT); } static inline void xfs_sb_version_addquota(xfs_sb_t *sbp) { - (sbp)->sb_versionnum = \ - (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 ? \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_QUOTABIT) : \ - (xfs_sb_version_tonew((sbp)->sb_versionnum) | \ - XFS_SB_VERSION_QUOTABIT)); + if (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) + sbp->sb_versionnum |= XFS_SB_VERSION_QUOTABIT; + else + sbp->sb_versionnum = xfs_sb_version_tonew(sbp->sb_versionnum) | + XFS_SB_VERSION_QUOTABIT; } static inline int xfs_sb_version_hasalign(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_ALIGNBIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_ALIGNBIT); } static inline int xfs_sb_version_hasdalign(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_DALIGNBIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_DALIGNBIT); } static inline int xfs_sb_version_hasshared(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_SHAREDBIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_SHAREDBIT); } static inline int xfs_sb_version_hasdirv2(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_DIRV2BIT); + return 
XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_DIRV2BIT); } static inline int xfs_sb_version_haslogv2(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_LOGV2BIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_LOGV2BIT); } static inline int xfs_sb_version_hasextflgbit(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_EXTFLGBIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_EXTFLGBIT); } static inline int xfs_sb_version_hassector(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_SECTORBIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_SECTORBIT); } static inline int xfs_sb_version_hasasciici(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && (sbp->sb_versionnum & XFS_SB_VERSION_BORGBIT); } static inline int xfs_sb_version_hasmorebits(xfs_sb_t *sbp) { - return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4) && \ - ((sbp)->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT); + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_4 && + (sbp->sb_versionnum & XFS_SB_VERSION_MOREBITSBIT); } /* @@ -463,22 +471,20 @@ static inline int xfs_sb_version_hasmore static inline int xfs_sb_version_haslazysbcount(xfs_sb_t *sbp) { - return (xfs_sb_version_hasmorebits(sbp) && \ - ((sbp)->sb_features2 & XFS_SB_VERSION2_LAZYSBCOUNTBIT)); + return xfs_sb_version_hasmorebits(sbp) && + (sbp->sb_features2 & XFS_SB_VERSION2_LAZYSBCOUNTBIT); } static inline int xfs_sb_version_hasattr2(xfs_sb_t *sbp) { - return (xfs_sb_version_hasmorebits(sbp)) && \ - ((sbp)->sb_features2 & XFS_SB_VERSION2_ATTR2BIT); + return xfs_sb_version_hasmorebits(sbp) 
&& + (sbp->sb_features2 & XFS_SB_VERSION2_ATTR2BIT); } static inline void xfs_sb_version_addattr2(xfs_sb_t *sbp) { - ((sbp)->sb_versionnum = \ - ((sbp)->sb_versionnum | XFS_SB_VERSION_MOREBITSBIT), \ - ((sbp)->sb_features2 = \ - ((sbp)->sb_features2 | XFS_SB_VERSION2_ATTR2BIT))); + sbp->sb_versionnum |= XFS_SB_VERSION_MOREBITSBIT; + sbp->sb_features2 |= XFS_SB_VERSION2_ATTR2BIT; } static inline void xfs_sb_version_removeattr2(xfs_sb_t *sbp) From owner-xfs@oss.sgi.com Sat Aug 2 08:14:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Aug 2008 08:14:41 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_25, J_CHICKENPOX_54,J_CHICKENPOX_55,J_CHICKENPOX_61,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m72FEYI7005704 for ; Sat, 2 Aug 2008 08:14:34 -0700 X-ASG-Debug-ID: 1217690144-5578007e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6F410353E6B for ; Sat, 2 Aug 2008 08:15:45 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id yt1yFZt6yWuWZNjr for ; Sat, 02 Aug 2008 08:15:45 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m72FFjIF020042 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Sat, 2 Aug 2008 17:15:46 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m72FFjmm020040 for xfs@oss.sgi.com; Sat, 2 Aug 2008 17:15:45 +0200 Date: Sat, 2 Aug 2008 17:15:45 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] cleanup dmapi filesystem interface Subject: [PATCH] cleanup dmapi 
filesystem interface Message-ID: <20080802151545.GB19689@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217690147 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1550 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17320 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Currently the filesystem passes in a small operation vector where the most important method is one to obtain more methods in a really awkward way and the file_system_type as a search key. Change this to one single operation vector (renamed to struct dmapi_operations) that includes all methods and the search key. Inside the dmapi remove the utterly convoluted mess dealing with the operations and replace it with a trivial linked list that can be searched to find the right operations vector. Note that this does not fix the existing lifetime issues where the operations can go away before they are used. This is left for another patch. 
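As a rough userspace illustration of the "trivial linked list that can be searched" described above (not the code from this patch), the sketch below keeps one operations vector per filesystem type on a singly linked list and looks it up by the filesystem-type pointer. All demo_* names are hypothetical; the kernel version embeds a struct list_head and needs locking, both of which are left out here. Like the patch, the sketch does nothing about the lifetime problem: a caller can still hold a pointer to a vector after it has been unregistered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file_system_type; used only as a key. */
struct demo_fs_type {
	const char *name;
};

/* Single operations vector that carries both the methods and the key. */
struct demo_dmapi_ops {
	struct demo_dmapi_ops *next;		/* stand-in for list_head */
	const struct demo_fs_type *fstype;	/* search key */
	int (*get_config)(int flag);		/* one example method */
};

/* Head of the registration list; no locking in this sketch. */
static struct demo_dmapi_ops *demo_ops_list;

/* Example method: simply echoes its argument. */
static int demo_get_config(int flag)
{
	return flag;
}

/* Add an operations vector to the front of the list. */
static void demo_register(struct demo_dmapi_ops *ops)
{
	assert(ops != NULL);
	ops->next = demo_ops_list;
	demo_ops_list = ops;
}

/* Unlink an operations vector; a vector that is not on the list is ignored. */
static void demo_unregister(struct demo_dmapi_ops *ops)
{
	struct demo_dmapi_ops **pp;

	for (pp = &demo_ops_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ops) {
			*pp = ops->next;
			ops->next = NULL;
			return;
		}
	}
}

/* Linear search by filesystem type; returns NULL if none is registered. */
static struct demo_dmapi_ops *demo_find(const struct demo_fs_type *fstype)
{
	struct demo_dmapi_ops *p;

	for (p = demo_ops_list; p != NULL; p = p->next) {
		if (p->fstype == fstype)
			return p;
	}
	return NULL;
}
```

A linear search is a reasonable fit because only a handful of filesystem types would ever register, so anything more elaborate than a list would buy little.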
Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/dmapi/dmapi_kern.h =================================================================== --- linux-2.6-xfs.orig/fs/dmapi/dmapi_kern.h 2008-08-01 23:54:17.000000000 +0200 +++ linux-2.6-xfs/fs/dmapi/dmapi_kern.h 2008-08-02 01:38:39.000000000 +0200 @@ -35,6 +35,8 @@ #include +struct dmapi_operations; + union sys_dmapi_uarg { void *p; __u64 u; @@ -66,27 +68,9 @@ struct dm_handle_t; #define DM_FLAGS_NDELAY 0x001 /* return EAGAIN after dm_pending() */ #define DM_FLAGS_UNWANTED 0x002 /* event not in fsys dm_eventset_t */ -/* Possible code levels reported by dm_code_level(). */ - -#define DM_CLVL_INIT 0 /* DMAPI prior to X/Open compliance */ #define DM_CLVL_XOPEN 1 /* X/Open compliant DMAPI */ -/* - * Filesystem operations accessed by the DMAPI core. - */ -struct filesystem_dmapi_operations { - int (*get_fsys_vector)(struct super_block *sb, void *addr); - int (*fh_to_inode)(struct super_block *sb, struct inode **ip, - dm_fid_t *fid); - const struct file_operations * (*get_invis_ops)(struct inode *ip); - int (*inode_to_fh)(struct inode *ip, dm_fid_t *fid, - dm_fsid_t *fsid ); - void (*get_fsid)(struct super_block *sb, dm_fsid_t *fsid); -#define HAVE_DM_QUEUE_FLUSH - int (*flushing)(struct inode *ip); -}; - /* Prototypes used outside of the DMI module/directory. */ @@ -133,8 +117,6 @@ void dm_send_unmount_event( int retcode, int flags); -int dm_code_level(void); - int dm_ip_to_handle ( struct inode *ip, dm_handle_t *handlep); @@ -146,78 +128,10 @@ int dm_release_threads( int errno); void dmapi_register( - struct file_system_type *fstype, - struct filesystem_dmapi_operations *dmapiops); + struct dmapi_operations *dmapiops); void dmapi_unregister( - struct file_system_type *fstype); - -int dmapi_registered( - struct file_system_type *fstype, - struct filesystem_dmapi_operations **dmapiops); - - -/* The following prototypes and definitions are used by DMAPI as its - interface into the filesystem code. 
Communication between DMAPI and the - filesystem are established as follows: - 1. DMAPI uses the VFS_DMAPI_FSYS_VECTOR to ask for the addresses - of all the functions within the filesystem that it may need to call. - 2. The filesystem returns an array of function name/address pairs which - DMAPI builds into a function vector. - The VFS_DMAPI_FSYS_VECTOR call is only made one time for a particular - filesystem type. From then on, DMAPI uses its function vector to call the - filesystem functions directly. Functions in the array which DMAPI doesn't - recognize are ignored. A dummy function which returns ENOSYS is used for - any function that DMAPI needs but which was not provided by the filesystem. - If XFS doesn't recognize the VFS_DMAPI_FSYS_VECTOR, DMAPI assumes that it - doesn't have the X/Open support code; in this case DMAPI uses the XFS-code - originally bundled within DMAPI. - - The goal of this interface is allow incremental changes to be made to - both the filesystem and to DMAPI while minimizing inter-patch dependencies, - and to eventually allow DMAPI to support multiple filesystem types at the - same time should that become necessary. 
-*/ - -typedef enum { - DM_FSYS_CLEAR_INHERIT = 0, - DM_FSYS_CREATE_BY_HANDLE = 1, - DM_FSYS_DOWNGRADE_RIGHT = 2, - DM_FSYS_GET_ALLOCINFO_RVP = 3, - DM_FSYS_GET_BULKALL_RVP = 4, - DM_FSYS_GET_BULKATTR_RVP = 5, - DM_FSYS_GET_CONFIG = 6, - DM_FSYS_GET_CONFIG_EVENTS = 7, - DM_FSYS_GET_DESTROY_DMATTR = 8, - DM_FSYS_GET_DIOINFO = 9, - DM_FSYS_GET_DIRATTRS_RVP = 10, - DM_FSYS_GET_DMATTR = 11, - DM_FSYS_GET_EVENTLIST = 12, - DM_FSYS_GET_FILEATTR = 13, - DM_FSYS_GET_REGION = 14, - DM_FSYS_GETALL_DMATTR = 15, - DM_FSYS_GETALL_INHERIT = 16, - DM_FSYS_INIT_ATTRLOC = 17, - DM_FSYS_MKDIR_BY_HANDLE = 18, - DM_FSYS_PROBE_HOLE = 19, - DM_FSYS_PUNCH_HOLE = 20, - DM_FSYS_READ_INVIS_RVP = 21, - DM_FSYS_RELEASE_RIGHT = 22, - DM_FSYS_REMOVE_DMATTR = 23, - DM_FSYS_REQUEST_RIGHT = 24, - DM_FSYS_SET_DMATTR = 25, - DM_FSYS_SET_EVENTLIST = 26, - DM_FSYS_SET_FILEATTR = 27, - DM_FSYS_SET_INHERIT = 28, - DM_FSYS_SET_REGION = 29, - DM_FSYS_SYMLINK_BY_HANDLE = 30, - DM_FSYS_SYNC_BY_HANDLE = 31, - DM_FSYS_UPGRADE_RIGHT = 32, - DM_FSYS_WRITE_INVIS_RVP = 33, - DM_FSYS_OBJ_REF_HOLD = 34, - DM_FSYS_MAX = 35 -} dm_fsys_switch_t; - + struct dmapi_operations *dmapiops); #define DM_FSYS_OBJ 0x1 /* object refers to a fsys handle */ @@ -466,63 +380,58 @@ typedef int (*dm_fsys_write_invis_rvp_t) typedef void (*dm_fsys_obj_ref_hold_t)( struct inode *ip); +/* + * Filesystem operations accessed by the DMAPI core. + */ +typedef struct dmapi_operations { + struct list_head list; + struct file_system_type *fstype; -/* Structure definitions used by the VFS_DMAPI_FSYS_VECTOR call. 
*/ - -typedef struct { - dm_fsys_switch_t func_no; /* function number */ - union { - dm_fsys_clear_inherit_t clear_inherit; - dm_fsys_create_by_handle_t create_by_handle; - dm_fsys_downgrade_right_t downgrade_right; - dm_fsys_get_allocinfo_rvp_t get_allocinfo_rvp; - dm_fsys_get_bulkall_rvp_t get_bulkall_rvp; - dm_fsys_get_bulkattr_rvp_t get_bulkattr_rvp; - dm_fsys_get_config_t get_config; - dm_fsys_get_config_events_t get_config_events; - dm_fsys_get_destroy_dmattr_t get_destroy_dmattr; - dm_fsys_get_dioinfo_t get_dioinfo; - dm_fsys_get_dirattrs_rvp_t get_dirattrs_rvp; - dm_fsys_get_dmattr_t get_dmattr; - dm_fsys_get_eventlist_t get_eventlist; - dm_fsys_get_fileattr_t get_fileattr; - dm_fsys_get_region_t get_region; - dm_fsys_getall_dmattr_t getall_dmattr; - dm_fsys_getall_inherit_t getall_inherit; - dm_fsys_init_attrloc_t init_attrloc; - dm_fsys_mkdir_by_handle_t mkdir_by_handle; - dm_fsys_probe_hole_t probe_hole; - dm_fsys_punch_hole_t punch_hole; - dm_fsys_read_invis_rvp_t read_invis_rvp; - dm_fsys_release_right_t release_right; - dm_fsys_remove_dmattr_t remove_dmattr; - dm_fsys_request_right_t request_right; - dm_fsys_set_dmattr_t set_dmattr; - dm_fsys_set_eventlist_t set_eventlist; - dm_fsys_set_fileattr_t set_fileattr; - dm_fsys_set_inherit_t set_inherit; - dm_fsys_set_region_t set_region; - dm_fsys_symlink_by_handle_t symlink_by_handle; - dm_fsys_sync_by_handle_t sync_by_handle; - dm_fsys_upgrade_right_t upgrade_right; - dm_fsys_write_invis_rvp_t write_invis_rvp; - dm_fsys_obj_ref_hold_t obj_ref_hold; - } u_fc; -} fsys_function_vector_t; - -struct dm_fcntl_vector { - int code_level; - int count; /* Number of functions in the vector */ - fsys_function_vector_t *vecp; -}; -typedef struct dm_fcntl_vector dm_fcntl_vector_t; + int (*fh_to_inode)(struct super_block *sb, struct inode **ip, + dm_fid_t *fid); + const struct file_operations * (*get_invis_ops)(struct inode *ip); + int (*inode_to_fh)(struct inode *ip, dm_fid_t *fid, + dm_fsid_t *fsid); + void 
(*get_fsid)(struct super_block *sb, dm_fsid_t *fsid); +#define HAVE_DM_QUEUE_FLUSH + int (*flushing)(struct inode *ip); -struct dm_fcntl_mapevent { - size_t length; /* length of transfer */ - dm_eventtype_t max_event; /* Maximum (WRITE or READ) event */ - int error; /* returned error code */ -}; -typedef struct dm_fcntl_mapevent dm_fcntl_mapevent_t; + dm_fsys_clear_inherit_t clear_inherit; + dm_fsys_create_by_handle_t create_by_handle; + dm_fsys_downgrade_right_t downgrade_right; + dm_fsys_get_allocinfo_rvp_t get_allocinfo_rvp; + dm_fsys_get_bulkall_rvp_t get_bulkall_rvp; + dm_fsys_get_bulkattr_rvp_t get_bulkattr_rvp; + dm_fsys_get_config_t get_config; + dm_fsys_get_config_events_t get_config_events; + dm_fsys_get_destroy_dmattr_t get_destroy_dmattr; + dm_fsys_get_dioinfo_t get_dioinfo; + dm_fsys_get_dirattrs_rvp_t get_dirattrs_rvp; + dm_fsys_get_dmattr_t get_dmattr; + dm_fsys_get_eventlist_t get_eventlist; + dm_fsys_get_fileattr_t get_fileattr; + dm_fsys_get_region_t get_region; + dm_fsys_getall_dmattr_t getall_dmattr; + dm_fsys_getall_inherit_t getall_inherit; + dm_fsys_init_attrloc_t init_attrloc; + dm_fsys_mkdir_by_handle_t mkdir_by_handle; + dm_fsys_probe_hole_t probe_hole; + dm_fsys_punch_hole_t punch_hole; + dm_fsys_read_invis_rvp_t read_invis_rvp; + dm_fsys_release_right_t release_right; + dm_fsys_remove_dmattr_t remove_dmattr; + dm_fsys_request_right_t request_right; + dm_fsys_set_dmattr_t set_dmattr; + dm_fsys_set_eventlist_t set_eventlist; + dm_fsys_set_fileattr_t set_fileattr; + dm_fsys_set_inherit_t set_inherit; + dm_fsys_set_region_t set_region; + dm_fsys_symlink_by_handle_t symlink_by_handle; + dm_fsys_sync_by_handle_t sync_by_handle; + dm_fsys_upgrade_right_t upgrade_right; + dm_fsys_write_invis_rvp_t write_invis_rvp; + dm_fsys_obj_ref_hold_t obj_ref_hold; +} dm_fsys_vector_t; #endif /* __KERNEL__ */ Index: linux-2.6-xfs/fs/dmapi/dmapi_mountinfo.c =================================================================== --- 
linux-2.6-xfs.orig/fs/dmapi/dmapi_mountinfo.c 2008-08-01 23:54:17.000000000 +0200 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,527 +0,0 @@ -/* - * Copyright (c) 2000-2005 Silicon Graphics, Inc. All Rights Reserved. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of version 2 of the GNU General Public License as - * published by the Free Software Foundation. - * - * This program is distributed in the hope that it would be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. - * - * Further, this software is distributed without any warranty that it is - * free of the rightful claim of any third person regarding infringement - * or the like. Any license provided herein, whether implied or - * otherwise, applies only to this software file. Patent licenses, if - * any, provided herein do not apply to combinations of this program with - * other software, or any other product whatsoever. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write the Free Software Foundation, Inc., 59 - * Temple Place - Suite 330, Boston MA 02111-1307, USA. - * - * Contact information: Silicon Graphics, Inc., 1600 Amphitheatre Pkwy, - * Mountain View, CA 94043, or: - * - * http://www.sgi.com - * - * For further information regarding this notice, see: - * - * http://oss.sgi.com/projects/GenInfo/SGIGPLNoticeExplan/ - */ -#include "dmapi.h" -#include "dmapi_kern.h" -#include "dmapi_private.h" - -static LIST_HEAD(dm_fsys_map); -static spinlock_t dm_fsys_lock = SPIN_LOCK_UNLOCKED; - -int -dm_code_level(void) -{ - return DM_CLVL_XOPEN; /* initial X/Open compliant release */ -} - - -/* Dummy routine which is stored in each function vector slot for which the - filesystem provides no function of its own. If an application calls the - function, he will just get ENOSYS. 
-*/ - -static int -dm_enosys(void) -{ - return -ENOSYS; /* function not supported by filesystem */ -} - - -/* dm_query_fsys_for_vector() asks a filesystem for its list of supported - DMAPI functions, and builds a dm_vector_map_t structure based upon the - reply. We ignore functions supported by the filesystem which we do not - know about, and we substitute the subroutine 'dm_enosys' for each function - we know about but the filesystem does not support. -*/ - -static void -dm_query_fsys_for_vector( - dm_vector_map_t *map) -{ - struct super_block *sb = map->sb; - fsys_function_vector_t *vecp; - dm_fcntl_vector_t vecrq; - dm_fsys_vector_t *vptr; - struct filesystem_dmapi_operations *dmapiops = map->dmapiops; - int error; - int i; - - - /* Allocate a function vector and initialize all fields with a - dummy function that returns ENOSYS. - */ - - vptr = map->vptr = kmem_cache_alloc(dm_fsys_vptr_cachep, GFP_KERNEL); - if (vptr == NULL) { - printk("%s/%d: kmem_cache_alloc(dm_fsys_vptr_cachep) returned NULL\n", __FUNCTION__, __LINE__); - return; - } - - vptr->code_level = 0; - vptr->clear_inherit = (dm_fsys_clear_inherit_t)dm_enosys; - vptr->create_by_handle = (dm_fsys_create_by_handle_t)dm_enosys; - vptr->downgrade_right = (dm_fsys_downgrade_right_t)dm_enosys; - vptr->get_allocinfo_rvp = (dm_fsys_get_allocinfo_rvp_t)dm_enosys; - vptr->get_bulkall_rvp = (dm_fsys_get_bulkall_rvp_t)dm_enosys; - vptr->get_bulkattr_rvp = (dm_fsys_get_bulkattr_rvp_t)dm_enosys; - vptr->get_config = (dm_fsys_get_config_t)dm_enosys; - vptr->get_config_events = (dm_fsys_get_config_events_t)dm_enosys; - vptr->get_destroy_dmattr = (dm_fsys_get_destroy_dmattr_t)dm_enosys; - vptr->get_dioinfo = (dm_fsys_get_dioinfo_t)dm_enosys; - vptr->get_dirattrs_rvp = (dm_fsys_get_dirattrs_rvp_t)dm_enosys; - vptr->get_dmattr = (dm_fsys_get_dmattr_t)dm_enosys; - vptr->get_eventlist = (dm_fsys_get_eventlist_t)dm_enosys; - vptr->get_fileattr = (dm_fsys_get_fileattr_t)dm_enosys; - vptr->get_region = 
(dm_fsys_get_region_t)dm_enosys; - vptr->getall_dmattr = (dm_fsys_getall_dmattr_t)dm_enosys; - vptr->getall_inherit = (dm_fsys_getall_inherit_t)dm_enosys; - vptr->init_attrloc = (dm_fsys_init_attrloc_t)dm_enosys; - vptr->mkdir_by_handle = (dm_fsys_mkdir_by_handle_t)dm_enosys; - vptr->probe_hole = (dm_fsys_probe_hole_t)dm_enosys; - vptr->punch_hole = (dm_fsys_punch_hole_t)dm_enosys; - vptr->read_invis_rvp = (dm_fsys_read_invis_rvp_t)dm_enosys; - vptr->release_right = (dm_fsys_release_right_t)dm_enosys; - vptr->request_right = (dm_fsys_request_right_t)dm_enosys; - vptr->remove_dmattr = (dm_fsys_remove_dmattr_t)dm_enosys; - vptr->set_dmattr = (dm_fsys_set_dmattr_t)dm_enosys; - vptr->set_eventlist = (dm_fsys_set_eventlist_t)dm_enosys; - vptr->set_fileattr = (dm_fsys_set_fileattr_t)dm_enosys; - vptr->set_inherit = (dm_fsys_set_inherit_t)dm_enosys; - vptr->set_region = (dm_fsys_set_region_t)dm_enosys; - vptr->symlink_by_handle = (dm_fsys_symlink_by_handle_t)dm_enosys; - vptr->sync_by_handle = (dm_fsys_sync_by_handle_t)dm_enosys; - vptr->upgrade_right = (dm_fsys_upgrade_right_t)dm_enosys; - vptr->write_invis_rvp = (dm_fsys_write_invis_rvp_t)dm_enosys; - vptr->obj_ref_hold = (dm_fsys_obj_ref_hold_t)dm_enosys; - - /* Issue a call to the filesystem in order to obtain - its vector of filesystem-specific DMAPI routines. - */ - - vecrq.count = 0; - vecrq.vecp = NULL; - - error = -ENOSYS; - ASSERT(dmapiops); - if (dmapiops->get_fsys_vector) - error = dmapiops->get_fsys_vector(sb, (caddr_t)&vecrq); - - /* If we still have an error at this point, then the filesystem simply - does not support DMAPI, so we give up with all functions set to - ENOSYS. - */ - - if (error || vecrq.count == 0) { - kmem_cache_free(dm_fsys_vptr_cachep, vptr); - map->vptr = NULL; - return; - } - - /* The request succeeded and we were given a vector which we need to - map to our current level. Overlay the dummy function with every - filesystem function we understand. 
- */ - - vptr->code_level = vecrq.code_level; - vecp = vecrq.vecp; - for (i = 0; i < vecrq.count; i++) { - switch (vecp[i].func_no) { - case DM_FSYS_CLEAR_INHERIT: - vptr->clear_inherit = vecp[i].u_fc.clear_inherit; - break; - case DM_FSYS_CREATE_BY_HANDLE: - vptr->create_by_handle = vecp[i].u_fc.create_by_handle; - break; - case DM_FSYS_DOWNGRADE_RIGHT: - vptr->downgrade_right = vecp[i].u_fc.downgrade_right; - break; - case DM_FSYS_GET_ALLOCINFO_RVP: - vptr->get_allocinfo_rvp = vecp[i].u_fc.get_allocinfo_rvp; - break; - case DM_FSYS_GET_BULKALL_RVP: - vptr->get_bulkall_rvp = vecp[i].u_fc.get_bulkall_rvp; - break; - case DM_FSYS_GET_BULKATTR_RVP: - vptr->get_bulkattr_rvp = vecp[i].u_fc.get_bulkattr_rvp; - break; - case DM_FSYS_GET_CONFIG: - vptr->get_config = vecp[i].u_fc.get_config; - break; - case DM_FSYS_GET_CONFIG_EVENTS: - vptr->get_config_events = vecp[i].u_fc.get_config_events; - break; - case DM_FSYS_GET_DESTROY_DMATTR: - vptr->get_destroy_dmattr = vecp[i].u_fc.get_destroy_dmattr; - break; - case DM_FSYS_GET_DIOINFO: - vptr->get_dioinfo = vecp[i].u_fc.get_dioinfo; - break; - case DM_FSYS_GET_DIRATTRS_RVP: - vptr->get_dirattrs_rvp = vecp[i].u_fc.get_dirattrs_rvp; - break; - case DM_FSYS_GET_DMATTR: - vptr->get_dmattr = vecp[i].u_fc.get_dmattr; - break; - case DM_FSYS_GET_EVENTLIST: - vptr->get_eventlist = vecp[i].u_fc.get_eventlist; - break; - case DM_FSYS_GET_FILEATTR: - vptr->get_fileattr = vecp[i].u_fc.get_fileattr; - break; - case DM_FSYS_GET_REGION: - vptr->get_region = vecp[i].u_fc.get_region; - break; - case DM_FSYS_GETALL_DMATTR: - vptr->getall_dmattr = vecp[i].u_fc.getall_dmattr; - break; - case DM_FSYS_GETALL_INHERIT: - vptr->getall_inherit = vecp[i].u_fc.getall_inherit; - break; - case DM_FSYS_INIT_ATTRLOC: - vptr->init_attrloc = vecp[i].u_fc.init_attrloc; - break; - case DM_FSYS_MKDIR_BY_HANDLE: - vptr->mkdir_by_handle = vecp[i].u_fc.mkdir_by_handle; - break; - case DM_FSYS_PROBE_HOLE: - vptr->probe_hole = vecp[i].u_fc.probe_hole; - break; - case 
DM_FSYS_PUNCH_HOLE: - vptr->punch_hole = vecp[i].u_fc.punch_hole; - break; - case DM_FSYS_READ_INVIS_RVP: - vptr->read_invis_rvp = vecp[i].u_fc.read_invis_rvp; - break; - case DM_FSYS_RELEASE_RIGHT: - vptr->release_right = vecp[i].u_fc.release_right; - break; - case DM_FSYS_REMOVE_DMATTR: - vptr->remove_dmattr = vecp[i].u_fc.remove_dmattr; - break; - case DM_FSYS_REQUEST_RIGHT: - vptr->request_right = vecp[i].u_fc.request_right; - break; - case DM_FSYS_SET_DMATTR: - vptr->set_dmattr = vecp[i].u_fc.set_dmattr; - break; - case DM_FSYS_SET_EVENTLIST: - vptr->set_eventlist = vecp[i].u_fc.set_eventlist; - break; - case DM_FSYS_SET_FILEATTR: - vptr->set_fileattr = vecp[i].u_fc.set_fileattr; - break; - case DM_FSYS_SET_INHERIT: - vptr->set_inherit = vecp[i].u_fc.set_inherit; - break; - case DM_FSYS_SET_REGION: - vptr->set_region = vecp[i].u_fc.set_region; - break; - case DM_FSYS_SYMLINK_BY_HANDLE: - vptr->symlink_by_handle = vecp[i].u_fc.symlink_by_handle; - break; - case DM_FSYS_SYNC_BY_HANDLE: - vptr->sync_by_handle = vecp[i].u_fc.sync_by_handle; - break; - case DM_FSYS_UPGRADE_RIGHT: - vptr->upgrade_right = vecp[i].u_fc.upgrade_right; - break; - case DM_FSYS_WRITE_INVIS_RVP: - vptr->write_invis_rvp = vecp[i].u_fc.write_invis_rvp; - break; - case DM_FSYS_OBJ_REF_HOLD: - vptr->obj_ref_hold = vecp[i].u_fc.obj_ref_hold; - break; - default: /* ignore ones we don't understand */ - break; - } - } -} - - -/* Must hold dm_fsys_lock. - * This returns the prototype for all instances of the fstype. 
- */ -static dm_vector_map_t * -dm_fsys_map_by_fstype( - struct file_system_type *fstype) -{ - struct list_head *p; - dm_vector_map_t *proto = NULL; - dm_vector_map_t *m; - - ASSERT_ALWAYS(fstype); - list_for_each(p, &dm_fsys_map) { - m = list_entry(p, dm_vector_map_t, ftype_list); - if (m->f_type == fstype) { - proto = m; - break; - } - } - return proto; -} - - -/* Must hold dm_fsys_lock */ -static dm_vector_map_t * -dm_fsys_map_by_sb( - struct super_block *sb) -{ - struct list_head *p; - dm_vector_map_t *proto; - dm_vector_map_t *m; - dm_vector_map_t *foundmap = NULL; - - proto = dm_fsys_map_by_fstype(sb->s_type); - if(proto == NULL) { - return NULL; - } - - list_for_each(p, &proto->sb_list) { - m = list_entry(p, dm_vector_map_t, sb_list); - if (m->sb == sb) { - foundmap = m; - break; - } - } - return foundmap; -} - - -#ifdef CONFIG_DMAPI_DEBUG -static void -sb_list( - struct super_block *sb) -{ - struct list_head *p; - dm_vector_map_t *proto; - dm_vector_map_t *m; - - proto = dm_fsys_map_by_fstype(sb->s_type); - ASSERT(proto); - -printk("%s/%d: Current sb_list\n", __FUNCTION__, __LINE__); - list_for_each(p, &proto->sb_list) { - m = list_entry(p, dm_vector_map_t, sb_list); -printk("%s/%d: map 0x%p, sb 0x%p, vptr 0x%p, dmapiops 0x%p\n", __FUNCTION__, __LINE__, m, m->sb, m->vptr, m->dmapiops); - } -printk("%s/%d: Done sb_list\n", __FUNCTION__, __LINE__); -} -#else -#define sb_list(x) -#endif - -#ifdef CONFIG_DMAPI_DEBUG -static void -ftype_list(void) -{ - struct list_head *p; - dm_vector_map_t *m; - -printk("%s/%d: Current ftype_list\n", __FUNCTION__, __LINE__); - list_for_each(p, &dm_fsys_map) { - m = list_entry(p, dm_vector_map_t, ftype_list); - printk("%s/%d: FS 0x%p, ftype 0x%p %s\n", __FUNCTION__, __LINE__, m, m->f_type, m->f_type->name); - } -printk("%s/%d: Done ftype_list\n", __FUNCTION__, __LINE__); -} -#else -#define ftype_list() -#endif - -/* Ask for vptr for this filesystem instance. - * The caller knows this inode is on a dmapi-managed filesystem. 
- */ -dm_fsys_vector_t * -dm_fsys_vector( - struct inode *ip) -{ - dm_vector_map_t *map; - - spin_lock(&dm_fsys_lock); - ftype_list(); - map = dm_fsys_map_by_sb(ip->i_sb); - spin_unlock(&dm_fsys_lock); - ASSERT(map); - ASSERT(map->vptr); - return map->vptr; -} - - -/* Ask for the dmapiops for this filesystem instance. The caller is - * also asking if this is a dmapi-managed filesystem. - */ -struct filesystem_dmapi_operations * -dm_fsys_ops( - struct super_block *sb) -{ - dm_vector_map_t *proto = NULL; - dm_vector_map_t *map; - - spin_lock(&dm_fsys_lock); - ftype_list(); - sb_list(sb); - map = dm_fsys_map_by_sb(sb); - if (map == NULL) - proto = dm_fsys_map_by_fstype(sb->s_type); - spin_unlock(&dm_fsys_lock); - - if ((map == NULL) && (proto == NULL)) - return NULL; - - if (map == NULL) { - /* Find out if it's dmapi-managed */ - dm_vector_map_t *m; - - ASSERT(proto); - m = kmem_cache_alloc(dm_fsys_map_cachep, GFP_KERNEL); - if (m == NULL) { - printk("%s/%d: kmem_cache_alloc(dm_fsys_map_cachep) returned NULL\n", __FUNCTION__, __LINE__); - return NULL; - } - memset(m, 0, sizeof(*m)); - m->dmapiops = proto->dmapiops; - m->f_type = sb->s_type; - m->sb = sb; - INIT_LIST_HEAD(&m->sb_list); - INIT_LIST_HEAD(&m->ftype_list); - - dm_query_fsys_for_vector(m); - if (m->vptr == NULL) { - /* This isn't dmapi-managed */ - kmem_cache_free(dm_fsys_map_cachep, m); - return NULL; - } - - spin_lock(&dm_fsys_lock); - if ((map = dm_fsys_map_by_sb(sb)) == NULL) - list_add(&m->sb_list, &proto->sb_list); - spin_unlock(&dm_fsys_lock); - - if (map) { - kmem_cache_free(dm_fsys_vptr_cachep, m->vptr); - kmem_cache_free(dm_fsys_map_cachep, m); - } - else { - map = m; - } - } - - return map->dmapiops; -} - - - -/* Called when a filesystem instance is unregistered from dmapi */ -void -dm_fsys_ops_release( - struct super_block *sb) -{ - dm_vector_map_t *map; - - spin_lock(&dm_fsys_lock); - ASSERT(!list_empty(&dm_fsys_map)); - map = dm_fsys_map_by_sb(sb); - ASSERT(map); - list_del(&map->sb_list); - 
spin_unlock(&dm_fsys_lock); - - ASSERT(map->vptr); - kmem_cache_free(dm_fsys_vptr_cachep, map->vptr); - kmem_cache_free(dm_fsys_map_cachep, map); -} - - -/* Called by a filesystem module that is loading into the kernel. - * This creates a new dm_vector_map_t which serves as the prototype - * for instances of this fstype and also provides the list_head - * for instances of this fstype. The prototypes are the only ones - * on the fstype_list, and will never be on the sb_list. - */ -void -dmapi_register( - struct file_system_type *fstype, - struct filesystem_dmapi_operations *dmapiops) -{ - dm_vector_map_t *proto; - - proto = kmem_cache_alloc(dm_fsys_map_cachep, GFP_KERNEL); - if (proto == NULL) { - printk("%s/%d: kmem_cache_alloc(dm_fsys_map_cachep) returned NULL\n", __FUNCTION__, __LINE__); - return; - } - memset(proto, 0, sizeof(*proto)); - proto->dmapiops = dmapiops; - proto->f_type = fstype; - INIT_LIST_HEAD(&proto->sb_list); - INIT_LIST_HEAD(&proto->ftype_list); - - spin_lock(&dm_fsys_lock); - ASSERT(dm_fsys_map_by_fstype(fstype) == NULL); - list_add(&proto->ftype_list, &dm_fsys_map); - ftype_list(); - spin_unlock(&dm_fsys_lock); -} - -/* Called by a filesystem module that is unloading from the kernel */ -void -dmapi_unregister( - struct file_system_type *fstype) -{ - struct list_head *p; - dm_vector_map_t *proto; - dm_vector_map_t *m; - - spin_lock(&dm_fsys_lock); - ASSERT(!list_empty(&dm_fsys_map)); - proto = dm_fsys_map_by_fstype(fstype); - ASSERT(proto); - list_del(&proto->ftype_list); - spin_unlock(&dm_fsys_lock); - - p = &proto->sb_list; - while (!list_empty(p)) { - m = list_entry(p->next, dm_vector_map_t, sb_list); - list_del(&m->sb_list); - ASSERT(m->vptr); - kmem_cache_free(dm_fsys_vptr_cachep, m->vptr); - kmem_cache_free(dm_fsys_map_cachep, m); - } - kmem_cache_free(dm_fsys_map_cachep, proto); -} - - -int -dmapi_registered( - struct file_system_type *fstype, - struct filesystem_dmapi_operations **dmapiops) -{ - return 0; -} Index: 
linux-2.6-xfs/fs/xfs/dmapi/xfs_dm.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/dmapi/xfs_dm.c 2008-08-02 00:43:29.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/dmapi/xfs_dm.c 2008-08-02 01:36:44.000000000 +0200 @@ -2930,103 +2930,6 @@ xfs_dm_obj_ref_hold( } -static fsys_function_vector_t xfs_fsys_vector[DM_FSYS_MAX]; - - -STATIC int -xfs_dm_get_dmapiops( - struct super_block *sb, - void *addr) -{ - static int initialized = 0; - dm_fcntl_vector_t *vecrq; - fsys_function_vector_t *vecp; - int i = 0; - - vecrq = (dm_fcntl_vector_t *)addr; - vecrq->count = - sizeof(xfs_fsys_vector) / sizeof(xfs_fsys_vector[0]); - vecrq->vecp = xfs_fsys_vector; - if (initialized) - return(0); - vecrq->code_level = DM_CLVL_XOPEN; - vecp = xfs_fsys_vector; - - vecp[i].func_no = DM_FSYS_CLEAR_INHERIT; - vecp[i++].u_fc.clear_inherit = xfs_dm_clear_inherit; - vecp[i].func_no = DM_FSYS_CREATE_BY_HANDLE; - vecp[i++].u_fc.create_by_handle = xfs_dm_create_by_handle; - vecp[i].func_no = DM_FSYS_DOWNGRADE_RIGHT; - vecp[i++].u_fc.downgrade_right = xfs_dm_downgrade_right; - vecp[i].func_no = DM_FSYS_GET_ALLOCINFO_RVP; - vecp[i++].u_fc.get_allocinfo_rvp = xfs_dm_get_allocinfo_rvp; - vecp[i].func_no = DM_FSYS_GET_BULKALL_RVP; - vecp[i++].u_fc.get_bulkall_rvp = xfs_dm_get_bulkall_rvp; - vecp[i].func_no = DM_FSYS_GET_BULKATTR_RVP; - vecp[i++].u_fc.get_bulkattr_rvp = xfs_dm_get_bulkattr_rvp; - vecp[i].func_no = DM_FSYS_GET_CONFIG; - vecp[i++].u_fc.get_config = xfs_dm_get_config; - vecp[i].func_no = DM_FSYS_GET_CONFIG_EVENTS; - vecp[i++].u_fc.get_config_events = xfs_dm_get_config_events; - vecp[i].func_no = DM_FSYS_GET_DESTROY_DMATTR; - vecp[i++].u_fc.get_destroy_dmattr = xfs_dm_get_destroy_dmattr; - vecp[i].func_no = DM_FSYS_GET_DIOINFO; - vecp[i++].u_fc.get_dioinfo = xfs_dm_get_dioinfo; - vecp[i].func_no = DM_FSYS_GET_DIRATTRS_RVP; - vecp[i++].u_fc.get_dirattrs_rvp = xfs_dm_get_dirattrs_rvp; - vecp[i].func_no = DM_FSYS_GET_DMATTR; - 
vecp[i++].u_fc.get_dmattr = xfs_dm_get_dmattr; - vecp[i].func_no = DM_FSYS_GET_EVENTLIST; - vecp[i++].u_fc.get_eventlist = xfs_dm_get_eventlist; - vecp[i].func_no = DM_FSYS_GET_FILEATTR; - vecp[i++].u_fc.get_fileattr = xfs_dm_get_fileattr; - vecp[i].func_no = DM_FSYS_GET_REGION; - vecp[i++].u_fc.get_region = xfs_dm_get_region; - vecp[i].func_no = DM_FSYS_GETALL_DMATTR; - vecp[i++].u_fc.getall_dmattr = xfs_dm_getall_dmattr; - vecp[i].func_no = DM_FSYS_GETALL_INHERIT; - vecp[i++].u_fc.getall_inherit = xfs_dm_getall_inherit; - vecp[i].func_no = DM_FSYS_INIT_ATTRLOC; - vecp[i++].u_fc.init_attrloc = xfs_dm_init_attrloc; - vecp[i].func_no = DM_FSYS_MKDIR_BY_HANDLE; - vecp[i++].u_fc.mkdir_by_handle = xfs_dm_mkdir_by_handle; - vecp[i].func_no = DM_FSYS_PROBE_HOLE; - vecp[i++].u_fc.probe_hole = xfs_dm_probe_hole; - vecp[i].func_no = DM_FSYS_PUNCH_HOLE; - vecp[i++].u_fc.punch_hole = xfs_dm_punch_hole; - vecp[i].func_no = DM_FSYS_READ_INVIS_RVP; - vecp[i++].u_fc.read_invis_rvp = xfs_dm_read_invis_rvp; - vecp[i].func_no = DM_FSYS_RELEASE_RIGHT; - vecp[i++].u_fc.release_right = xfs_dm_release_right; - vecp[i].func_no = DM_FSYS_REMOVE_DMATTR; - vecp[i++].u_fc.remove_dmattr = xfs_dm_remove_dmattr; - vecp[i].func_no = DM_FSYS_REQUEST_RIGHT; - vecp[i++].u_fc.request_right = xfs_dm_request_right; - vecp[i].func_no = DM_FSYS_SET_DMATTR; - vecp[i++].u_fc.set_dmattr = xfs_dm_set_dmattr; - vecp[i].func_no = DM_FSYS_SET_EVENTLIST; - vecp[i++].u_fc.set_eventlist = xfs_dm_set_eventlist; - vecp[i].func_no = DM_FSYS_SET_FILEATTR; - vecp[i++].u_fc.set_fileattr = xfs_dm_set_fileattr; - vecp[i].func_no = DM_FSYS_SET_INHERIT; - vecp[i++].u_fc.set_inherit = xfs_dm_set_inherit; - vecp[i].func_no = DM_FSYS_SET_REGION; - vecp[i++].u_fc.set_region = xfs_dm_set_region; - vecp[i].func_no = DM_FSYS_SYMLINK_BY_HANDLE; - vecp[i++].u_fc.symlink_by_handle = xfs_dm_symlink_by_handle; - vecp[i].func_no = DM_FSYS_SYNC_BY_HANDLE; - vecp[i++].u_fc.sync_by_handle = xfs_dm_sync_by_handle; - vecp[i].func_no = 
DM_FSYS_UPGRADE_RIGHT; - vecp[i++].u_fc.upgrade_right = xfs_dm_upgrade_right; - vecp[i].func_no = DM_FSYS_WRITE_INVIS_RVP; - vecp[i++].u_fc.write_invis_rvp = xfs_dm_write_invis_rvp; - vecp[i].func_no = DM_FSYS_OBJ_REF_HOLD; - vecp[i++].u_fc.obj_ref_hold = xfs_dm_obj_ref_hold; - - return(0); -} - - /* xfs_dm_send_mmap_event - send events needed for memory mapping a file. * * This is a workaround called for files that are about to be @@ -3291,12 +3194,47 @@ xfs_dm_get_fsid( /* * Filesystem operations accessed by the DMAPI core. */ -static struct filesystem_dmapi_operations xfs_dmapiops = { - .get_fsys_vector = xfs_dm_get_dmapiops, +static struct dmapi_operations xfs_dmapiops = { + .fstype = &xfs_fs_type, .fh_to_inode = xfs_dm_fh_to_inode, .get_invis_ops = xfs_dm_get_invis_ops, .inode_to_fh = xfs_dm_inode_to_fh, .get_fsid = xfs_dm_get_fsid, + .clear_inherit = xfs_dm_clear_inherit, + .create_by_handle = xfs_dm_create_by_handle, + .downgrade_right = xfs_dm_downgrade_right, + .get_allocinfo_rvp = xfs_dm_get_allocinfo_rvp, + .get_bulkall_rvp = xfs_dm_get_bulkall_rvp, + .get_bulkattr_rvp = xfs_dm_get_bulkattr_rvp, + .get_config = xfs_dm_get_config, + .get_config_events = xfs_dm_get_config_events, + .get_destroy_dmattr = xfs_dm_get_destroy_dmattr, + .get_dioinfo = xfs_dm_get_dioinfo, + .get_dirattrs_rvp = xfs_dm_get_dirattrs_rvp, + .get_dmattr = xfs_dm_get_dmattr, + .get_eventlist = xfs_dm_get_eventlist, + .get_fileattr = xfs_dm_get_fileattr, + .get_region = xfs_dm_get_region, + .getall_dmattr = xfs_dm_getall_dmattr, + .getall_inherit = xfs_dm_getall_inherit, + .init_attrloc = xfs_dm_init_attrloc, + .mkdir_by_handle = xfs_dm_mkdir_by_handle, + .probe_hole = xfs_dm_probe_hole, + .punch_hole = xfs_dm_punch_hole, + .read_invis_rvp = xfs_dm_read_invis_rvp, + .release_right = xfs_dm_release_right, + .remove_dmattr = xfs_dm_remove_dmattr, + .request_right = xfs_dm_request_right, + .set_dmattr = xfs_dm_set_dmattr, + .set_eventlist = xfs_dm_set_eventlist, + .set_fileattr = 
xfs_dm_set_fileattr, + .set_inherit = xfs_dm_set_inherit, + .set_region = xfs_dm_set_region, + .symlink_by_handle = xfs_dm_symlink_by_handle, + .sync_by_handle = xfs_dm_sync_by_handle, + .upgrade_right = xfs_dm_upgrade_right, + .write_invis_rvp = xfs_dm_write_invis_rvp, + .obj_ref_hold = xfs_dm_obj_ref_hold, }; static int __init @@ -3304,14 +3242,14 @@ xfs_dm_init(void) { printk(KERN_INFO "SGI XFS Data Management API subsystem\n"); - dmapi_register(&xfs_fs_type, &xfs_dmapiops); + dmapi_register(&xfs_dmapiops); return 0; } static void __exit xfs_dm_exit(void) { - dmapi_unregister(&xfs_fs_type); + dmapi_unregister(&xfs_dmapiops); } MODULE_AUTHOR("Silicon Graphics, Inc."); Index: linux-2.6-xfs/fs/dmapi/dmapi_private.h =================================================================== --- linux-2.6-xfs.orig/fs/dmapi/dmapi_private.h 2008-08-01 23:54:17.000000000 +0200 +++ linux-2.6-xfs/fs/dmapi/dmapi_private.h 2008-08-02 01:38:20.000000000 +0200 @@ -44,8 +44,6 @@ extern struct kmem_cache *dm_fsreg_cachep; extern struct kmem_cache *dm_tokdata_cachep; extern struct kmem_cache *dm_session_cachep; -extern struct kmem_cache *dm_fsys_map_cachep; -extern struct kmem_cache *dm_fsys_vptr_cachep; typedef struct dm_tokdata { struct dm_tokdata *td_next; @@ -243,60 +241,6 @@ typedef struct dm_fsreg { #define DM_MAX_MSG_DATA 3960 - -/* Supported filesystem function vector functions. 
*/ - - -typedef struct { - int code_level; - dm_fsys_clear_inherit_t clear_inherit; - dm_fsys_create_by_handle_t create_by_handle; - dm_fsys_downgrade_right_t downgrade_right; - dm_fsys_get_allocinfo_rvp_t get_allocinfo_rvp; - dm_fsys_get_bulkall_rvp_t get_bulkall_rvp; - dm_fsys_get_bulkattr_rvp_t get_bulkattr_rvp; - dm_fsys_get_config_t get_config; - dm_fsys_get_config_events_t get_config_events; - dm_fsys_get_destroy_dmattr_t get_destroy_dmattr; - dm_fsys_get_dioinfo_t get_dioinfo; - dm_fsys_get_dirattrs_rvp_t get_dirattrs_rvp; - dm_fsys_get_dmattr_t get_dmattr; - dm_fsys_get_eventlist_t get_eventlist; - dm_fsys_get_fileattr_t get_fileattr; - dm_fsys_get_region_t get_region; - dm_fsys_getall_dmattr_t getall_dmattr; - dm_fsys_getall_inherit_t getall_inherit; - dm_fsys_init_attrloc_t init_attrloc; - dm_fsys_mkdir_by_handle_t mkdir_by_handle; - dm_fsys_probe_hole_t probe_hole; - dm_fsys_punch_hole_t punch_hole; - dm_fsys_read_invis_rvp_t read_invis_rvp; - dm_fsys_release_right_t release_right; - dm_fsys_remove_dmattr_t remove_dmattr; - dm_fsys_request_right_t request_right; - dm_fsys_set_dmattr_t set_dmattr; - dm_fsys_set_eventlist_t set_eventlist; - dm_fsys_set_fileattr_t set_fileattr; - dm_fsys_set_inherit_t set_inherit; - dm_fsys_set_region_t set_region; - dm_fsys_symlink_by_handle_t symlink_by_handle; - dm_fsys_sync_by_handle_t sync_by_handle; - dm_fsys_upgrade_right_t upgrade_right; - dm_fsys_write_invis_rvp_t write_invis_rvp; - dm_fsys_obj_ref_hold_t obj_ref_hold; -} dm_fsys_vector_t; - - -typedef struct { - struct list_head ftype_list; /* list of fstypes */ - struct list_head sb_list; /* list of sb's per fstype */ - struct file_system_type *f_type; - struct filesystem_dmapi_operations *dmapiops; - dm_fsys_vector_t *vptr; - struct super_block *sb; -} dm_vector_map_t; - - extern dm_session_t *dm_sessions; /* head of session list */ extern dm_fsreg_t *dm_registers; extern lock_t dm_reg_lock; /* lock for registration list */ @@ -462,12 +406,9 @@ void 
dm_remove_fsys_entry( dm_fsys_vector_t *dm_fsys_vector( struct inode *ip); -struct filesystem_dmapi_operations *dm_fsys_ops( +struct dmapi_operations *dm_fsys_ops( struct super_block *sb); -void dm_fsys_ops_release( - struct super_block *sb); - int dm_waitfor_disp_session( struct super_block *sb, dm_tokevent_t *tevp, Index: linux-2.6-xfs/fs/dmapi/dmapi_sysent.c =================================================================== --- linux-2.6-xfs.orig/fs/dmapi/dmapi_sysent.c 2008-08-01 23:54:17.000000000 +0200 +++ linux-2.6-xfs/fs/dmapi/dmapi_sysent.c 2008-08-02 01:42:42.000000000 +0200 @@ -54,8 +54,56 @@ struct kmem_cache *dm_fsreg_cachep = NULL; struct kmem_cache *dm_tokdata_cachep = NULL; struct kmem_cache *dm_session_cachep = NULL; -struct kmem_cache *dm_fsys_map_cachep = NULL; -struct kmem_cache *dm_fsys_vptr_cachep = NULL; + + +static LIST_HEAD(dm_fsys_map); +static DEFINE_SPINLOCK(dm_fsys_lock); + + +/* + * Ask for the dmapiops for this filesystem instance. The caller is + * also asking if this is a dmapi-managed filesystem. + */ +struct dmapi_operations *dm_fsys_ops(struct super_block *sb) +{ + struct dmapi_operations *ops; + + spin_lock(&dm_fsys_lock); + list_for_each_entry(ops, &dm_fsys_map, list) { + if (ops->fstype == sb->s_type) { + spin_unlock(&dm_fsys_lock); + return ops; + } + } + spin_unlock(&dm_fsys_lock); + return NULL; +} + +struct dmapi_operations *dm_fsys_vector(struct inode *inode) +{ + return dm_fsys_ops(inode->i_sb); +} + +/* + * Called by a filesystem module that is loading into the kernel + * to register its dmapi operation vector. + */ +void dmapi_register(struct dmapi_operations *dmapiops) +{ + spin_lock(&dm_fsys_lock); + list_add_tail(&dmapiops->list, &dm_fsys_map); + spin_unlock(&dm_fsys_lock); +} + +/* + * Called by a filesystem module that is unloading from the kernel. 
+ */ +void dmapi_unregister(struct dmapi_operations *dmapiops) +{ + spin_lock(&dm_fsys_lock); + list_del(&dmapiops->list); + spin_unlock(&dm_fsys_lock); +} static int dmapi_ioctl(struct inode *inode, struct file *file, unsigned int cmd, @@ -741,28 +789,15 @@ int __init dmapi_init(void) if (dm_session_cachep == NULL) goto out_free_fsreg_cachep; - dm_fsys_map_cachep = kmem_cache_create("dm_fsys_map", - sizeof(dm_vector_map_t), 0, 0, NULL); - if (dm_fsys_map_cachep == NULL) - goto out_free_session_cachep; - dm_fsys_vptr_cachep = kmem_cache_create("dm_fsys_vptr", - sizeof(dm_fsys_vector_t), 0, 0, NULL); - if (dm_fsys_vptr_cachep == NULL) - goto out_free_fsys_map_cachep; - ret = misc_register(&dmapi_dev); if (ret) { printk(KERN_ERR "dmapi_init: misc_register returned %d\n", ret); - goto out_free_fsys_vptr_cachep; + goto out_free_session_cachep; } dmapi_init_procfs(dmapi_dev.minor); return 0; - out_free_fsys_vptr_cachep: - kmem_cache_destroy(dm_fsys_vptr_cachep); - out_free_fsys_map_cachep: - kmem_cache_destroy(dm_fsys_map_cachep); out_free_session_cachep: kmem_cache_destroy(dm_session_cachep); out_free_fsreg_cachep: @@ -781,8 +816,6 @@ void __exit dmapi_uninit(void) kmem_cache_destroy(dm_tokdata_cachep); kmem_cache_destroy(dm_fsreg_cachep); kmem_cache_destroy(dm_session_cachep); - kmem_cache_destroy(dm_fsys_map_cachep); - kmem_cache_destroy(dm_fsys_vptr_cachep); } #endif @@ -801,5 +834,4 @@ EXPORT_SYMBOL(dm_send_destroy_event); EXPORT_SYMBOL(dm_ip_to_handle); EXPORT_SYMBOL(dmapi_register); EXPORT_SYMBOL(dmapi_unregister); -EXPORT_SYMBOL(dmapi_registered); EXPORT_SYMBOL(dm_release_threads); Index: linux-2.6-xfs/fs/dmapi/dmapi_register.c =================================================================== --- linux-2.6-xfs.orig/fs/dmapi/dmapi_register.c 2008-08-02 01:35:34.000000000 +0200 +++ linux-2.6-xfs/fs/dmapi/dmapi_register.c 2008-08-02 01:38:00.000000000 +0200 @@ -206,7 +206,7 @@ dm_add_fsys_entry( void *msg; unsigned long lc; /* lock cookie */ dm_fsid_t fsid; - 
struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; dops = dm_fsys_ops(sb); ASSERT(dops); @@ -297,7 +297,7 @@ dm_change_fsys_entry( int seq_error; unsigned long lc; /* lock cookie */ dm_fsid_t fsid; - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; /* Find the filesystem referenced by the sb's fsid_t. This should always succeed. @@ -393,7 +393,7 @@ dm_remove_fsys_entry( dm_fsreg_t **fsrpp; dm_fsreg_t *fsrp; unsigned long lc; /* lock cookie */ - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; dm_fsid_t fsid; dops = dm_fsys_ops(sb); @@ -464,7 +464,6 @@ dm_remove_fsys_entry( remove_proc_entry(buf, NULL); } #endif - dm_fsys_ops_release(sb); sv_destroy(&fsrp->fr_dispq); sv_destroy(&fsrp->fr_queue); spinlock_destroy(&fsrp->fr_lock); @@ -506,7 +505,7 @@ dm_handle_to_ip( struct super_block *sb; struct inode *ip; int filetype; - struct filesystem_dmapi_operations *dmapiops; + struct dmapi_operations *dmapiops; if ((fsrp = dm_find_fsreg_and_lock(&handlep->ha_fsid, &lc)) == NULL) return NULL; @@ -591,7 +590,7 @@ dm_ip_to_handle( dm_fid_t fid; dm_fsid_t fsid; int hsize; - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; dops = dm_fsys_ops(ip->i_sb); ASSERT(dops); @@ -683,7 +682,7 @@ dm_waitfor_disp( dm_session_t *s; dm_fsreg_t *fsrp; dm_fsid_t fsid; - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; dops = dm_fsys_ops(sb); ASSERT(dops); @@ -864,7 +863,7 @@ dm_path_to_hdl( struct inode *inode; size_t len; char *name; - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; /* XXX get things straightened out so getname() works here? */ if (!(len = strnlen_user(path, PATH_MAX))) @@ -940,7 +939,7 @@ dm_path_to_fshdl( struct inode *inode; size_t len; char *name; - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; /* XXX get things straightened out so getname() works here? 
*/ if(!(len = strnlen_user(path, PATH_MAX))) @@ -1577,7 +1576,7 @@ dm_open_by_handle_rvp( } if (td_type == DM_TDT_REG) { - struct filesystem_dmapi_operations *dmapiops; + struct dmapi_operations *dmapiops; dmapiops = dm_fsys_ops(inodep->i_sb); if (dmapiops && dmapiops->get_invis_ops) { /* invisible operation should not change atime */ Index: linux-2.6-xfs/fs/dmapi/Makefile =================================================================== --- linux-2.6-xfs.orig/fs/dmapi/Makefile 2008-08-02 01:42:50.000000000 +0200 +++ linux-2.6-xfs/fs/dmapi/Makefile 2008-08-02 01:42:56.000000000 +0200 @@ -46,7 +46,6 @@ dmapi-y += dmapi_sysent.o \ dmapi_handle.o \ dmapi_hole.o \ dmapi_io.o \ - dmapi_mountinfo.o \ dmapi_region.o \ dmapi_register.o \ dmapi_right.o \ Index: linux-2.6-xfs/fs/dmapi/dmapi_event.c =================================================================== --- linux-2.6-xfs.orig/fs/dmapi/dmapi_event.c 2008-08-02 01:37:00.000000000 +0200 +++ linux-2.6-xfs/fs/dmapi/dmapi_event.c 2008-08-02 01:37:14.000000000 +0200 @@ -199,7 +199,7 @@ dm_sb_data( dm_right_t right) { dm_tokdata_t *tdp; - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; dm_fsid_t fsid; dops = dm_fsys_ops(sb); Index: linux-2.6-xfs/fs/dmapi/dmapi_session.c =================================================================== --- linux-2.6-xfs.orig/fs/dmapi/dmapi_session.c 2008-08-02 01:37:19.000000000 +0200 +++ linux-2.6-xfs/fs/dmapi/dmapi_session.c 2008-08-02 01:37:29.000000000 +0200 @@ -1381,7 +1381,7 @@ dm_enqueue( return tevp->te_reply; } else { - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; dm_tokdata_t *tdp; int errno = 0; @@ -1767,7 +1767,7 @@ dm_release_threads( int i; int found_events = 0; dm_fsid_t fsid; - struct filesystem_dmapi_operations *dops; + struct dmapi_operations *dops; ASSERT(sb); dops = dm_fsys_ops(sb); From owner-xfs@oss.sgi.com Sat Aug 2 08:30:21 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Aug 2008 08:30:22 -0700 
(PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m72FUK3R011567 for ; Sat, 2 Aug 2008 08:30:21 -0700 X-ASG-Debug-ID: 1217691093-233f01690000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BBD8C1972F01 for ; Sat, 2 Aug 2008 08:31:33 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id Tp0wGIyr9dOUxEEY for ; Sat, 02 Aug 2008 08:31:33 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m72FVZIF020876 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Sat, 2 Aug 2008 17:31:35 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m72FVZRu020874; Sat, 2 Aug 2008 17:31:35 +0200 Date: Sat, 2 Aug 2008 17:31:35 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 15/21] implement generic xfs_btree_rshift Subject: Re: [PATCH 15/21] implement generic xfs_btree_rshift Message-ID: <20080802153135.GD19689@lst.de> References: <20080729193125.GP19104@lst.de> <20080730060808.GP13395@disturbed> <20080801194914.GI1263@lst.de> <20080802012042.GN6201@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080802012042.GN6201@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217691094 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 
X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1552 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17321 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Sat, Aug 02, 2008 at 11:20:42AM +1000, Dave Chinner wrote: > > It's the only leakage of the detailed inode root implementation into > > the generic code, so I'm still wondering whether a method would be > > better. > > Ah, right. yes, it probably would be cleaner to do it as a > separate method, but I don't think it's that important right now. 
That's why I left the XXX in, this is something to get right eventually, just not now :) From owner-xfs@oss.sgi.com Sat Aug 2 08:33:49 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Aug 2008 08:33:51 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m72FXnjW012065 for ; Sat, 2 Aug 2008 08:33:49 -0700 X-ASG-Debug-ID: 1217691301-525e01790000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 85B68353CDF for ; Sat, 2 Aug 2008 08:35:02 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id V4VEmyKZfFRJKZ8j for ; Sat, 02 Aug 2008 08:35:02 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m72FZ3IF021004 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Sat, 2 Aug 2008 17:35:04 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m72FZ31Y021002 for xfs@oss.sgi.com; Sat, 2 Aug 2008 17:35:03 +0200 Date: Sat, 2 Aug 2008 17:35:03 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 16/21] implement generic xfs_btree_lshift Subject: Re: [PATCH 16/21] implement generic xfs_btree_lshift Message-ID: <20080802153503.GE19689@lst.de> References: <20080729193132.GQ19104@lst.de> <20080730062422.GQ13395@disturbed> <20080801195249.GJ1263@lst.de> <20080802012803.GO6201@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080802012803.GO6201@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: 
verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217691303 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1551 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17322 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Sat, Aug 02, 2008 at 11:28:03AM +1000, Dave Chinner wrote: > It might make sense to go back to a single implementation, > though at the time I did it it made sense to split the move/copy > operations because it made both cases simpler. Seeing as you've > stuck more closely to the original structure of the code, the > distinction is not as great as so it might be best to go back to a > single memmove based interface. Btw, one other idea I still have in my mind is to add rec_len and key_len methods to the core btree code, that way quite a few methods (ptr_addr, key_addr, rec_addr, set_key, move_keys, move_recs, copy_keys, copy_recs, log_keys, and log_recs) could be implemented in common code, leaving the actual btree implementations really small. 
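To make the rec_len/key_len idea above a bit more concrete, here is a minimal, self-contained sketch of how address, copy, move and log-range helpers could live in common code once the core knows the key size. Everything in it is hypothetical: the demo_* names, the use of plain size fields instead of methods, and the flat "key area" layout are illustration-only assumptions, not the XFS structures or the interface proposed in this series. Records would follow the same pattern with rec_len.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical per-btree description. In this sketch the only thing a
 * btree implementation has to supply is the size of its keys/records.
 */
struct demo_btree_ops {
	size_t	key_len;	/* bytes per key */
	size_t	rec_len;	/* bytes per record (unused below) */
};

/* Address of the 1-based key 'index' within the key area 'keys'. */
static void *
demo_key_addr(const struct demo_btree_ops *ops, void *keys, int index)
{
	assert(index >= 1);
	return (char *)keys + (size_t)(index - 1) * ops->key_len;
}

/* Copy 'numkeys' keys between non-overlapping areas. */
static void
demo_copy_keys(const struct demo_btree_ops *ops, void *dst,
	       const void *src, int numkeys)
{
	assert(numkeys >= 0);
	memcpy(dst, src, (size_t)numkeys * ops->key_len);
}

/* Shift 'numkeys' keys within one block; the areas may overlap. */
static void
demo_move_keys(const struct demo_btree_ops *ops, void *dst,
	       const void *src, int numkeys)
{
	assert(numkeys >= 0);
	memmove(dst, src, (size_t)numkeys * ops->key_len);
}

/*
 * Byte range to log for keys first..last (1-based, inclusive), where
 * 'keys_offset' is the offset of the key area from the block start.
 * 'end' is the offset of the last byte of key 'last'.
 */
static void
demo_key_log_range(const struct demo_btree_ops *ops, size_t keys_offset,
		   int first, int last, size_t *start, size_t *end)
{
	assert(first >= 1 && last >= first);
	*start = keys_offset + (size_t)(first - 1) * ops->key_len;
	*end = keys_offset + (size_t)last * ops->key_len - 1;
}
```

With helpers along these lines, a btree implementation would only fill in the sizes, and a single memmove-based interface could cover both the move and the copy case if the split turns out not to be worth keeping.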
From owner-xfs@oss.sgi.com Sat Aug 2 08:40:05 2008 Received: with ECARTIS (v1.0.0; list xfs); Sat, 02 Aug 2008 08:40:07 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m72Fe4Ge012848 for ; Sat, 2 Aug 2008 08:40:05 -0700 X-ASG-Debug-ID: 1217691676-233f02120000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C77301973016 for ; Sat, 2 Aug 2008 08:41:17 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id kK7alXz41Ee9w5tj for ; Sat, 02 Aug 2008 08:41:17 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m72FfJIF021247 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Sat, 2 Aug 2008 17:41:19 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m72FfJa5021245 for xfs@oss.sgi.com; Sat, 2 Aug 2008 17:41:19 +0200 Date: Sat, 2 Aug 2008 17:41:19 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] kill xfs_buf_iostart Subject: [PATCH] kill xfs_buf_iostart Message-ID: <20080802154119.GA21145@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217691677 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= 
X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1552 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17323 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs xfs_buf_iostart is a "shared" helper for xfs_buf_read_flags, xfs_bawrite, and xfs_bdwrite, except that there isn't much shared code, only special cases for each caller. So remove this function and move the functionality into the callers. xfs_bawrite and xfs_bdwrite are now big enough to be moved out of line, and the read path used by xfs_buf_read_flags is moved into a new helper called _xfs_buf_read. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_buf.c 2008-05-15 14:15:45.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.c 2008-05-15 14:38:41.000000000 +0200 @@ -630,6 +630,29 @@ xfs_buf_get_flags( return NULL; } +STATIC int +_xfs_buf_read( + xfs_buf_t *bp, + xfs_buf_flags_t flags) +{ + int status; + + XB_TRACE(bp, "_xfs_buf_read", (unsigned long)flags); + + ASSERT(!(flags & (XBF_DELWRI|XBF_WRITE))); + ASSERT(bp->b_bn != XFS_BUF_DADDR_NULL); + + bp->b_flags &= ~(XBF_WRITE | XBF_ASYNC | XBF_DELWRI | \ + XBF_READ_AHEAD | _XBF_RUN_QUEUES); + bp->b_flags |= flags & (XBF_READ | XBF_ASYNC | \ + XBF_READ_AHEAD | _XBF_RUN_QUEUES); + + status = xfs_buf_iorequest(bp); + if (!status && !(flags & XBF_ASYNC)) + status = xfs_buf_iowait(bp); + return status; +} + xfs_buf_t * xfs_buf_read_flags( xfs_buftarg_t *target, @@ -646,7 +669,7 @@ xfs_buf_read_flags( if (!XFS_BUF_ISDONE(bp)) { XB_TRACE(bp, "read", (unsigned long)flags); XFS_STATS_INC(xb_get_read); - xfs_buf_iostart(bp, 
flags); + _xfs_buf_read(bp, flags); } else if (flags & XBF_ASYNC) { XB_TRACE(bp, "read_async", (unsigned long)flags); /* @@ -1052,50 +1075,39 @@ xfs_buf_ioerror( XB_TRACE(bp, "ioerror", (unsigned long)error); } -/* - * Initiate I/O on a buffer, based on the flags supplied. - * The b_iodone routine in the buffer supplied will only be called - * when all of the subsidiary I/O requests, if any, have been completed. - */ int -xfs_buf_iostart( - xfs_buf_t *bp, - xfs_buf_flags_t flags) +xfs_bawrite( + void *mp, + struct xfs_buf *bp) { - int status = 0; + XB_TRACE(bp, "bawrite", 0); - XB_TRACE(bp, "iostart", (unsigned long)flags); + ASSERT(bp->b_bn != XFS_BUF_DADDR_NULL); - if (flags & XBF_DELWRI) { - bp->b_flags &= ~(XBF_READ | XBF_WRITE | XBF_ASYNC); - bp->b_flags |= flags & (XBF_DELWRI | XBF_ASYNC); - xfs_buf_delwri_queue(bp, 1); - return 0; - } + xfs_buf_delwri_dequeue(bp); - bp->b_flags &= ~(XBF_READ | XBF_WRITE | XBF_ASYNC | XBF_DELWRI | \ - XBF_READ_AHEAD | _XBF_RUN_QUEUES); - bp->b_flags |= flags & (XBF_READ | XBF_WRITE | XBF_ASYNC | \ - XBF_READ_AHEAD | _XBF_RUN_QUEUES); + bp->b_flags &= ~(XBF_READ | XBF_DELWRI | XBF_READ_AHEAD); + bp->b_flags |= (XBF_WRITE | XBF_ASYNC | _XBF_RUN_QUEUES); + + bp->b_fspriv3 = mp; + bp->b_strat = xfs_bdstrat_cb; + return xfs_bdstrat_cb(bp); +} - BUG_ON(bp->b_bn == XFS_BUF_DADDR_NULL); +void +xfs_bdwrite( + void *mp, + struct xfs_buf *bp) +{ + XB_TRACE(bp, "bdwrite", 0); - /* For writes allow an alternate strategy routine to precede - * the actual I/O request (which may not be issued at all in - * a shutdown situation, for example). - */ - status = (flags & XBF_WRITE) ? - xfs_buf_iostrategy(bp) : xfs_buf_iorequest(bp); + bp->b_strat = xfs_bdstrat_cb; + bp->b_fspriv3 = mp; - /* Wait for I/O if we are not an async request. - * Note: async I/O request completion will release the buffer, - * and that can already be done by this point. So using the - * buffer pointer from here on, after async I/O, is invalid. 
- */ - if (!status && !(flags & XBF_ASYNC)) - status = xfs_buf_iowait(bp); + bp->b_flags &= ~XBF_READ; + bp->b_flags |= (XBF_DELWRI | XBF_ASYNC); - return status; + xfs_buf_delwri_queue(bp, 1); } STATIC_INLINE void Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_buf.h 2008-05-15 14:15:45.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_buf.h 2008-05-15 14:24:26.000000000 +0200 @@ -206,9 +206,10 @@ extern void xfs_buf_lock(xfs_buf_t *); extern void xfs_buf_unlock(xfs_buf_t *); /* Buffer Read and Write Routines */ +extern int xfs_bawrite(void *mp, xfs_buf_t *bp); +extern void xfs_bdwrite(void *mp, xfs_buf_t *bp); extern void xfs_buf_ioend(xfs_buf_t *, int); extern void xfs_buf_ioerror(xfs_buf_t *, int); -extern int xfs_buf_iostart(xfs_buf_t *, xfs_buf_flags_t); extern int xfs_buf_iorequest(xfs_buf_t *); extern int xfs_buf_iowait(xfs_buf_t *); extern void xfs_buf_iomove(xfs_buf_t *, size_t, size_t, xfs_caddr_t, @@ -358,14 +359,6 @@ extern void xfs_buf_trace(xfs_buf_t *, c #define XFS_BUF_TARGET(bp) ((bp)->b_target) #define XFS_BUFTARG_NAME(target) xfs_buf_target_name(target) -static inline int xfs_bawrite(void *mp, xfs_buf_t *bp) -{ - bp->b_fspriv3 = mp; - bp->b_strat = xfs_bdstrat_cb; - xfs_buf_delwri_dequeue(bp); - return xfs_buf_iostart(bp, XBF_WRITE | XBF_ASYNC | _XBF_RUN_QUEUES); -} - static inline void xfs_buf_relse(xfs_buf_t *bp) { if (!bp->b_relse) @@ -406,17 +399,6 @@ static inline int XFS_bwrite(xfs_buf_t * return error; } -/* - * No error can be returned from xfs_buf_iostart for delwri - * buffers as they are queued and no I/O is issued. 
- */ -static inline void xfs_bdwrite(void *mp, xfs_buf_t *bp) -{ - bp->b_strat = xfs_bdstrat_cb; - bp->b_fspriv3 = mp; - (void)xfs_buf_iostart(bp, XBF_DELWRI | XBF_ASYNC); -} - #define XFS_bdstrat(bp) xfs_buf_iorequest(bp) #define xfs_iowait(bp) xfs_buf_iowait(bp) Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ksyms.c 2008-05-15 14:25:03.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c 2008-05-15 14:25:12.000000000 +0200 @@ -160,7 +160,8 @@ EXPORT_SYMBOL(xfs_file_operations); EXPORT_SYMBOL(xfs_invis_file_operations); EXPORT_SYMBOL(xfs_buf_delwri_dequeue); EXPORT_SYMBOL(_xfs_buf_find); -EXPORT_SYMBOL(xfs_buf_iostart); +EXPORT_SYMBOL(xfs_bawrite); +EXPORT_SYMBOL(xfs_bdwrite); EXPORT_SYMBOL(xfs_buf_ispin); #ifdef DEBUG EXPORT_SYMBOL(xfs_buf_lock_value); 
From owner-xfs@oss.sgi.com Sun Aug 3 18:33:10 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:33:27 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741X9eZ032035 for ; Sun, 3 Aug 2008 18:33:10 -0700 X-ASG-Debug-ID: 1217813660-40db01c40000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 48D5EEF6A71 for ; Sun, 3 Aug 2008 18:34:21 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id UcDt2BhN1uWeeoO6 for ; Sun, 03 Aug 2008 18:34:21 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741YMIF009137 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:34:22 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741YMY7009135 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:34:22 +0200 Date: Mon, 4 Aug 2008 03:34:22 +0200 From: Christoph Hellwig To: 
xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 13/26] implement generic xfs_btree_updkey Subject: [PATCH 13/26] implement generic xfs_btree_updkey Message-ID: <20080804013422.GN8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-updkey User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813663 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17338 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs From: Dave Chinner Note that there are many > 80 char lines introduced due to the xfs_btree_key casts. But the places where this happens are throw-away code once the whole btree code gets merged into a common implementation. The same is true for the temporary define mapping xfs_alloc_log_keys to the new name. All old users will be gone after a few patches. 
[hch: split out from bigger patch and minor adaptions] Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 13:46:01.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 23:11:28.000000000 +0200 @@ -1371,3 +1371,50 @@ error0: XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); return error; } + +/* + * Update keys at all levels from here to the root along the cursor's path. + */ +int +xfs_btree_updkey( + struct xfs_btree_cur *cur, + union xfs_btree_key *keyp, + int level) +{ + struct xfs_btree_block *block; + struct xfs_buf *bp; + union xfs_btree_key *kp; + int ptr; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGIK(cur, level, keyp); + + ASSERT(!(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) || level >= 1); + + /* + * Go up the tree from this level toward the root. + * At each level, update the key value to the value input. + * Stop when we reach a level where the cursor isn't pointing + * at the first entry in the block. 
+ */ + for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { +#ifdef DEBUG + int error; +#endif + block = xfs_btree_get_block(cur, level, &bp); +#ifdef DEBUG + error = xfs_btree_check_block(cur, block, level, bp); + if (error) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; + } +#endif + ptr = cur->bc_ptrs[level]; + kp = cur->bc_ops->key_addr(cur, ptr, block); + cur->bc_ops->copy_keys(cur, keyp, kp, 1); + cur->bc_ops->log_keys(cur, bp, ptr, ptr); + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; +} Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 13:46:01.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 23:11:28.000000000 +0200 @@ -205,6 +205,16 @@ struct xfs_btree_ops { __int64_t (*key_diff)(struct xfs_btree_cur *cur, union xfs_btree_key *key); + /* copy bits of btree blocks between blocks */ + void (*copy_keys)(struct xfs_btree_cur *cur, + union xfs_btree_key *src_key, + union xfs_btree_key *dst_key, int numkeys); + + /* log changes to btree structures */ + void (*log_keys)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + int first, int last); + + /* btree tracing */ #ifdef XFS_BTREE_TRACE void (*trace_enter)(struct xfs_btree_cur *, const char *, @@ -523,6 +533,7 @@ xfs_btree_setbuf( int xfs_btree_increment(struct xfs_btree_cur *, int, int *); int xfs_btree_decrement(struct xfs_btree_cur *, int, int *); int xfs_btree_lookup(struct xfs_btree_cur *, xfs_lookup_t, int *); +int xfs_btree_updkey(struct xfs_btree_cur *, union xfs_btree_key *, int); /* Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-02 13:46:01.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-02 23:11:50.000000000 +0200 @@ -35,6 +35,7 @@ #include "xfs_dinode.h" #include "xfs_inode.h" #include "xfs_btree.h" 
+#include "xfs_btree_trace.h" #include "xfs_ialloc.h" #include "xfs_alloc.h" #include "xfs_error.h" @@ -44,7 +45,9 @@ */ STATIC void xfs_alloc_log_block(xfs_trans_t *, xfs_buf_t *, int); -STATIC void xfs_alloc_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); +STATIC void xfs_allocbt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); +#define xfs_alloc_log_keys(c, b, i, j) \ + xfs_allocbt_log_keys(c, b, i, j) STATIC void xfs_alloc_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC void xfs_alloc_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC int xfs_alloc_lshift(xfs_btree_cur_t *, int, int *); @@ -52,7 +55,6 @@ STATIC int xfs_alloc_newroot(xfs_btree_c STATIC int xfs_alloc_rshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_alloc_split(xfs_btree_cur_t *, int, xfs_agblock_t *, xfs_alloc_key_t *, xfs_btree_cur_t **, int *); -STATIC int xfs_alloc_updkey(xfs_btree_cur_t *, xfs_alloc_key_t *, int); /* * Internal functions. @@ -265,7 +267,7 @@ xfs_alloc_delrec( * If we deleted the leftmost entry in the block, update the * key values above us in the tree. */ - if (ptr == 1 && (error = xfs_alloc_updkey(cur, lkp, level + 1))) + if (ptr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)lkp, level + 1))) return error; /* * If the number of records remaining in the block is at least @@ -798,7 +800,7 @@ xfs_alloc_insrec( /* * If we inserted at the start of a block, update the parents' keys. */ - if (optr == 1 && (error = xfs_alloc_updkey(cur, &key, level + 1))) + if (optr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, level + 1))) return error; /* * Look to see if the longest extent in the allocation group @@ -859,28 +861,6 @@ xfs_alloc_log_block( } /* - * Log keys from a btree block (nonleaf). 
- */ -STATIC void -xfs_alloc_log_keys( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int kfirst, /* index of first key to log */ - int klast) /* index of last key to log */ -{ - xfs_alloc_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - xfs_alloc_key_t *kp; /* key pointer in btree block */ - int last; /* last byte offset logged */ - - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - kp = XFS_ALLOC_KEY_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); -} - -/* * Log block pointer fields from a btree block (nonleaf). */ STATIC void @@ -1068,7 +1048,7 @@ xfs_alloc_lshift( /* * Update the parent key values of right. */ - if ((error = xfs_alloc_updkey(cur, rkp, level + 1))) + if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)rkp, level + 1))) return error; /* * Slide the cursor value left one. @@ -1354,7 +1334,7 @@ xfs_alloc_rshift( i = xfs_btree_lastrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); if ((error = xfs_btree_increment(tcur, level, &i)) || - (error = xfs_alloc_updkey(tcur, rkp, level + 1))) + (error = xfs_btree_updkey(tcur, (union xfs_btree_key *)rkp, level + 1))) goto error0; xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); *stat = 1; @@ -1520,45 +1500,6 @@ xfs_alloc_split( } /* - * Update keys at all levels from here to the root along the cursor's path. - */ -STATIC int /* error */ -xfs_alloc_updkey( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_alloc_key_t *keyp, /* new key value to update to */ - int level) /* starting level for update */ -{ - int ptr; /* index of key in block */ - - /* - * Go up the tree from this level toward the root. - * At each level, update the key value to the value input. - * Stop when we reach a level where the cursor isn't pointing - * at the first entry in the block. 
- */ - for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { - xfs_alloc_block_t *block; /* btree block */ - xfs_buf_t *bp; /* buffer for block */ -#ifdef DEBUG - int error; /* error return value */ -#endif - xfs_alloc_key_t *kp; /* ptr to btree block keys */ - - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - ptr = cur->bc_ptrs[level]; - kp = XFS_ALLOC_KEY_ADDR(block, ptr, cur); - *kp = *keyp; - xfs_alloc_log_keys(cur, bp, ptr, ptr); - } - return 0; -} - -/* * Externally visible routines. */ @@ -1765,7 +1706,7 @@ xfs_alloc_update( key.ar_startblock = cpu_to_be32(bno); key.ar_blockcount = cpu_to_be32(len); - if ((error = xfs_alloc_updkey(cur, &key, 1))) + if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, 1))) return error; } return 0; @@ -1853,6 +1794,44 @@ xfs_allocbt_key_diff( return (__int64_t)be32_to_cpu(kp->ar_startblock) - rec->ar_startblock; } +STATIC void +xfs_allocbt_copy_keys( + struct xfs_btree_cur *cur, + union xfs_btree_key *src_key, + union xfs_btree_key *dst_key, + int numkeys) +{ + ASSERT(numkeys >= 0); + memcpy(dst_key, src_key, numkeys * sizeof(xfs_alloc_key_t)); +} + +/* + * Log keys from a btree block (nonleaf). 
+ */ +STATIC void +xfs_allocbt_log_keys( + struct xfs_btree_cur *cur, /* btree cursor */ + struct xfs_buf *bp, /* buffer containing btree block */ + int first, /* index of first key to log */ + int last) /* index of last key to log */ +{ + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_alloc_key_t *kp = XFS_ALLOC_KEY_ADDR(block, 1, cur); + char *baddr = (char *)block; + int start; /* first byte offset logged */ + int end; /* last byte offset logged */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, first, last); + + start = (char *)&kp[first - 1] - baddr; + end = (char *)&kp[last] - 1 - baddr; + xfs_trans_log_buf(cur->bc_tp, bp, start, end); + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} + + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_allocbt_trace_buf; @@ -1928,6 +1907,8 @@ static const struct xfs_btree_ops xfs_al .key_addr = xfs_allocbt_key_addr, .rec_addr = xfs_allocbt_rec_addr, .key_diff = xfs_allocbt_key_diff, + .copy_keys = xfs_allocbt_copy_keys, + .log_keys = xfs_allocbt_log_keys, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_allocbt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-02 13:46:01.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-02 23:11:59.000000000 +0200 @@ -35,6 +35,7 @@ #include "xfs_dinode.h" #include "xfs_inode.h" #include "xfs_btree.h" +#include "xfs_btree_trace.h" #include "xfs_ialloc.h" #include "xfs_alloc.h" #include "xfs_error.h" @@ -48,7 +49,6 @@ STATIC int xfs_inobt_newroot(xfs_btree_c STATIC int xfs_inobt_rshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_inobt_split(xfs_btree_cur_t *, int, xfs_agblock_t *, xfs_inobt_key_t *, xfs_btree_cur_t **, int *); -STATIC int xfs_inobt_updkey(xfs_btree_cur_t *, xfs_inobt_key_t *, int); /* * Single level of the xfs_inobt_delete record deletion routine. 
@@ -214,7 +214,7 @@ xfs_inobt_delrec( * If we deleted the leftmost entry in the block, update the * key values above us in the tree. */ - if (ptr == 1 && (error = xfs_inobt_updkey(cur, kp, level + 1))) + if (ptr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)kp, level + 1))) return error; /* * If the number of records remaining in the block is at least @@ -723,7 +723,7 @@ xfs_inobt_insrec( /* * If we inserted at the start of a block, update the parents' keys. */ - if (optr == 1 && (error = xfs_inobt_updkey(cur, &key, level + 1))) + if (optr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, level + 1))) return error; /* * Return the new block number, if any. @@ -763,28 +763,6 @@ xfs_inobt_log_block( } /* - * Log keys from a btree block (nonleaf). - */ -STATIC void -xfs_inobt_log_keys( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int kfirst, /* index of first key to log */ - int klast) /* index of last key to log */ -{ - xfs_inobt_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - xfs_inobt_key_t *kp; /* key pointer in btree block */ - int last; /* last byte offset logged */ - - block = XFS_BUF_TO_INOBT_BLOCK(bp); - kp = XFS_INOBT_KEY_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); -} - -/* * Log block pointer fields from a btree block (nonleaf). */ STATIC void @@ -960,7 +938,7 @@ xfs_inobt_lshift( /* * Update the parent key values of right. */ - if ((error = xfs_inobt_updkey(cur, rkp, level + 1))) + if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)rkp, level + 1))) return error; /* * Slide the cursor value left one. 
@@ -1238,7 +1216,7 @@ xfs_inobt_rshift( return error; xfs_btree_lastrec(tcur, level); if ((error = xfs_btree_increment(tcur, level, &i)) || - (error = xfs_inobt_updkey(tcur, rkp, level + 1))) { + (error = xfs_btree_updkey(tcur, (union xfs_btree_key *)rkp, level + 1))) { xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); return error; } @@ -1407,45 +1385,6 @@ xfs_inobt_split( } /* - * Update keys at all levels from here to the root along the cursor's path. - */ -STATIC int /* error */ -xfs_inobt_updkey( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_inobt_key_t *keyp, /* new key value to update to */ - int level) /* starting level for update */ -{ - int ptr; /* index of key in block */ - - /* - * Go up the tree from this level toward the root. - * At each level, update the key value to the value input. - * Stop when we reach a level where the cursor isn't pointing - * at the first entry in the block. - */ - for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { - xfs_buf_t *bp; /* buffer for block */ - xfs_inobt_block_t *block; /* btree block */ -#ifdef DEBUG - int error; /* error return value */ -#endif - xfs_inobt_key_t *kp; /* ptr to btree block keys */ - - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - ptr = cur->bc_ptrs[level]; - kp = XFS_INOBT_KEY_ADDR(block, ptr, cur); - *kp = *keyp; - xfs_inobt_log_keys(cur, bp, ptr, ptr); - } - return 0; -} - -/* * Externally visible routines. 
*/ @@ -1637,7 +1576,7 @@ xfs_inobt_update( xfs_inobt_key_t key; /* key containing [ino] */ key.ir_startino = cpu_to_be32(ino); - if ((error = xfs_inobt_updkey(cur, &key, 1))) + if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, 1))) return error; } return 0; @@ -1711,6 +1650,43 @@ xfs_inobt_key_diff( cur->bc_rec.i.ir_startino; } +STATIC void +xfs_inobt_copy_keys( + struct xfs_btree_cur *cur, + union xfs_btree_key *src_key, + union xfs_btree_key *dst_key, + int numkeys) +{ + ASSERT(numkeys >= 0); + memcpy(dst_key, src_key, numkeys * sizeof(xfs_inobt_key_t)); +} + +/* + * Log keys from a btree block (nonleaf). + */ +STATIC void +xfs_inobt_log_keys( + struct xfs_btree_cur *cur, /* btree cursor */ + struct xfs_buf *bp, /* buffer containing btree block */ + int first, /* index of first key to log */ + int last) /* index of last key to log */ +{ + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_inobt_key_t *kp = XFS_INOBT_KEY_ADDR(block, 1, cur); + char *baddr = (char *)block; + int start; /* first byte offset logged */ + int end; /* last byte offset logged */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, first, last); + + start = (char *)&kp[first - 1] - baddr; + end = (char *)&kp[last] - 1 - baddr; + xfs_trans_log_buf(cur->bc_tp, bp, start, end); + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_inobt_trace_buf; @@ -1786,6 +1762,8 @@ static const struct xfs_btree_ops xfs_in .key_addr = xfs_inobt_key_addr, .rec_addr = xfs_inobt_rec_addr, .key_diff = xfs_inobt_key_diff, + .copy_keys = xfs_inobt_copy_keys, + .log_keys = xfs_inobt_log_keys, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_inobt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 13:46:01.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 23:12:14.000000000 +0200 @@ -56,7 
+56,6 @@ STATIC int xfs_bmbt_lshift(xfs_btree_cur STATIC int xfs_bmbt_rshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_bmbt_split(xfs_btree_cur_t *, int, xfs_fsblock_t *, __uint64_t *, xfs_btree_cur_t **, int *); -STATIC int xfs_bmbt_updkey(xfs_btree_cur_t *, xfs_bmbt_key_t *, int); #undef EXIT @@ -211,7 +210,7 @@ xfs_bmbt_delrec( *stat = 1; return 0; } - if (ptr == 1 && (error = xfs_bmbt_updkey(cur, kp, level + 1))) { + if (ptr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)kp, level + 1))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -635,7 +634,7 @@ xfs_bmbt_insrec( kp + ptr); } #endif - if (optr == 1 && (error = xfs_bmbt_updkey(cur, &key, level + 1))) { + if (optr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, level + 1))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; } @@ -741,42 +740,6 @@ xfs_bmbt_killroot( } /* - * Log key values from the btree block. - */ -STATIC void -xfs_bmbt_log_keys( - xfs_btree_cur_t *cur, - xfs_buf_t *bp, - int kfirst, - int klast) -{ - xfs_trans_t *tp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGBII(cur, bp, kfirst, klast); - tp = cur->bc_tp; - if (bp) { - xfs_bmbt_block_t *block; - int first; - xfs_bmbt_key_t *kp; - int last; - - block = XFS_BUF_TO_BMBT_BLOCK(bp); - kp = XFS_BMAP_KEY_DADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(tp, bp, first, last); - } else { - xfs_inode_t *ip; - - ip = cur->bc_private.b.ip; - xfs_trans_log_inode(tp, ip, - XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); -} - -/* * Log pointer values from the btree block. 
*/ STATIC void @@ -935,7 +898,7 @@ xfs_bmbt_lshift( key.br_startoff = cpu_to_be64(xfs_bmbt_disk_get_startoff(rrp)); rkp = &key; } - if ((error = xfs_bmbt_updkey(cur, rkp, level + 1))) { + if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)rkp, level + 1))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; } @@ -1067,7 +1030,7 @@ xfs_bmbt_rshift( goto error1; } XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_bmbt_updkey(tcur, rkp, level + 1))) { + if ((error = xfs_btree_updkey(tcur, (union xfs_btree_key *)rkp, level + 1))) { XFS_BMBT_TRACE_CURSOR(tcur, ERROR); goto error1; } @@ -1276,44 +1239,6 @@ xfs_bmbt_split( return 0; } - -/* - * Update keys for the record. - */ -STATIC int -xfs_bmbt_updkey( - xfs_btree_cur_t *cur, - xfs_bmbt_key_t *keyp, /* on-disk format */ - int level) -{ - xfs_bmbt_block_t *block; - xfs_buf_t *bp; -#ifdef DEBUG - int error; -#endif - xfs_bmbt_key_t *kp; - int ptr; - - ASSERT(level >= 1); - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGIK(cur, level, keyp); - for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { - block = xfs_bmbt_get_block(cur, level, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - ptr = cur->bc_ptrs[level]; - kp = XFS_BMAP_KEY_IADDR(block, ptr, cur); - *kp = *keyp; - xfs_bmbt_log_keys(cur, bp, ptr, ptr); - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - return 0; -} - /* * Convert on-disk form of btree root to in-memory form. 
*/ @@ -2039,7 +1964,7 @@ xfs_bmbt_update( return 0; } key.br_startoff = cpu_to_be64(off); - if ((error = xfs_bmbt_updkey(cur, &key, 1))) { + if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, 1))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; } @@ -2157,6 +2082,48 @@ xfs_bmbt_key_diff( cur->bc_rec.b.br_startoff; } +STATIC void +xfs_bmbt_copy_keys( + struct xfs_btree_cur *cur, + union xfs_btree_key *src_key, + union xfs_btree_key *dst_key, + int numkeys) +{ + ASSERT(numkeys >= 0); + memcpy(dst_key, src_key, numkeys * sizeof(xfs_bmbt_key_t)); +} + +/* + * Log key values from the btree block. + */ +STATIC void +xfs_bmbt_log_keys( + struct xfs_btree_cur *cur, + struct xfs_buf *bp, + int first, + int last) +{ + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, first, last); + + if (bp) { + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_bmbt_key_t *kp = XFS_BMAP_KEY_DADDR(block, 1, cur); + char *baddr = (char *)block; + int start; + int end; + + start = (char *)&kp[first - 1] - baddr; + end = (char *)&kp[last] - 1 - baddr; + xfs_trans_log_buf(cur->bc_tp, bp, start, end); + } else { + xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, + XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_bmbt_trace_buf; @@ -2252,6 +2219,8 @@ static const struct xfs_btree_ops xfs_bm .key_addr = xfs_bmbt_key_addr, .rec_addr = xfs_bmbt_rec_addr, .key_diff = xfs_bmbt_key_diff, + .copy_keys = xfs_bmbt_copy_keys, + .log_keys = xfs_bmbt_log_keys, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_bmbt_trace_enter, -- From owner-xfs@oss.sgi.com Sun Aug 3 18:33:03 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:33:19 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_61, J_CHICKENPOX_65,RDNS_NONE autolearn=no 
version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741X3nS031933 for ; Sun, 3 Aug 2008 18:33:03 -0700 X-ASG-Debug-ID: 1217813654-16b402d80000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0FD7A197643B for ; Sun, 3 Aug 2008 18:34:14 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id VxoVVPMGVwgDETFi for ; Sun, 03 Aug 2008 18:34:14 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741YFIF009118 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:34:16 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741YF0W009116 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:34:15 +0200 Date: Mon, 4 Aug 2008 03:34:15 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 12/26] implement generic xfs_btree_lookup Subject: [PATCH 12/26] implement generic xfs_btree_lookup Message-ID: <20080804013415.GM8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-lookup User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813656 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1688 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject 
contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17337 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs From: Dave Chinner [hch: split out from bigger patch and minor adaptions] Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 02:53:26.000000000 +0200 @@ -1151,3 +1151,224 @@ error0: return error; } + +STATIC int +xfs_btree_lookup_get_block( + struct xfs_btree_cur *cur, /* btree cursor */ + int level, /* level in the btree */ + union xfs_btree_ptr *pp, /* ptr to btree block */ + struct xfs_btree_block **blkp) /* return btree block */ +{ + struct xfs_buf *bp; /* buffer pointer for btree block */ + int error = 0; + + /* special case the root block if in an inode */ + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (level == cur->bc_nlevels - 1)) { + *blkp = cur->bc_ops->get_root_from_inode(cur); + return 0; + } + + /* + * If the old buffer at this level is for the disk address we are + * looking for, re-use it. + * + * Otherwise throw it away and get a new one. + */ + bp = cur->bc_bufs[level]; + if (bp && XFS_BUF_ADDR(bp) == xfs_btree_ptr_to_daddr(cur, pp)) { + *blkp = XFS_BUF_TO_BLOCK(bp); + return 0; + } + + error = xfs_btree_read_buf_block(cur, pp, level, 0, blkp, &bp); + if (error) + return error; + + xfs_btree_setbuf(cur, level, bp); + return 0; +} + +/* + * Get current search key. For level 0 we don't actually have a key + * structure so we make one up from the record. For all other levels + * we just return the right key.
+ */ +STATIC union xfs_btree_key * +xfs_lookup_get_search_key( + struct xfs_btree_cur *cur, + int level, + int keyno, + struct xfs_btree_block *block, + union xfs_btree_key *kp) +{ + if (level == 0) { + union xfs_btree_rec *rp; + + rp = cur->bc_ops->rec_addr(cur, keyno, block); + cur->bc_ops->init_key_from_rec(cur, kp, rp); + return kp; + } + + return cur->bc_ops->key_addr(cur, keyno, block); +} + +/* + * Lookup the record. The cursor is made to point to it, based on dir. + * Return 0 if can't find any such record, 1 for success. + */ +int /* error */ +xfs_btree_lookup( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_lookup_t dir, /* <=, ==, or >= */ + int *stat) /* success/failure */ +{ + struct xfs_btree_block *block; /* current btree block */ + __int64_t diff; /* difference for the current key */ + int error; /* error return value */ + int keyno; /* current key number */ + int level; /* level in the btree */ + union xfs_btree_ptr *pp; /* ptr to btree block */ + union xfs_btree_ptr ptr; /* ptr to btree block */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, dir); + + XFS_BTREE_STATS_INC(cur, lookup); + + block = NULL; + keyno = 0; + + /* initialise start pointer from cursor */ + cur->bc_ops->init_ptr_from_cur(cur, &ptr); + pp = &ptr; + + /* + * Iterate over each level in the btree, starting at the root. + * For each level above the leaves, find the key we need, based + * on the lookup record, then follow the corresponding block + * pointer down to the next level. + */ + for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { + /* Get the block we need to do the lookup on. */ + error = xfs_btree_lookup_get_block(cur, level, pp, &block); + if (error) + goto error0; + + if (diff == 0) { + /* + * If we already had a key match at a higher level, we + * know we need to use the first entry in this block. + */ + keyno = 1; + } else { + /* Otherwise search this block. Do a binary search. 
*/ + + int high; /* high entry number */ + int low; /* low entry number */ + + /* Set low and high entry numbers, 1-based. */ + low = 1; + high = xfs_btree_get_numrecs(block); + if (!high) { + /* Block is empty, must be an empty leaf. */ + ASSERT(level == 0 && cur->bc_nlevels == 1); + + cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + } + + /* Binary search the block. */ + while (low <= high) { + union xfs_btree_key key; + union xfs_btree_key *kp; + + XFS_BTREE_STATS_INC(cur, compare); + + /* keyno is average of low and high. */ + keyno = (low + high) >> 1; + + /* Get current search key */ + kp = xfs_lookup_get_search_key(cur, level, + keyno, block, &key); + + /* + * Compute difference to get next direction: + * - less than, move right + * - greater than, move left + * - equal, we're done + */ + diff = cur->bc_ops->key_diff(cur, kp); + if (diff < 0) + low = keyno + 1; + else if (diff > 0) + high = keyno - 1; + else + break; + } + } + + /* + * If there are more levels, set up for the next level + * by getting the block number and filling in the cursor. + */ + if (level > 0) { + /* + * If we moved left, need the previous key number, + * unless there isn't one. + */ + if (diff > 0 && --keyno < 1) + keyno = 1; + pp = cur->bc_ops->ptr_addr(cur, keyno, block); + +#ifdef DEBUG + error = xfs_btree_check_ptr(cur, pp, 0, level); + if (error) + goto error0; +#endif + cur->bc_ptrs[level] = keyno; + } + } + + /* Done with the search. See if we need to adjust the results. */ + if (dir != XFS_LOOKUP_LE && diff < 0) { + keyno++; + /* + * If ge search and we went off the end of the block, but it's + * not the last block, we're in the wrong block. 
+ */ + xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB); + if (dir == XFS_LOOKUP_GE && + keyno > xfs_btree_get_numrecs(block) && + !xfs_btree_ptr_is_null(cur, &ptr)) { + int i; + + cur->bc_ptrs[0] = keyno; + error = xfs_btree_increment(cur, 0, &i); + if (error) + goto error0; + XFS_WANT_CORRUPTED_RETURN(i == 1); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + } + } else if (dir == XFS_LOOKUP_LE && diff > 0) + keyno--; + cur->bc_ptrs[0] = keyno; + + /* Return if we succeeded or not. */ + if (keyno == 0 || keyno > xfs_btree_get_numrecs(block)) + *stat = 0; + else if (dir != XFS_LOOKUP_EQ || diff == 0) + *stat = 1; + else + *stat = 0; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 02:53:26.000000000 +0200 @@ -194,6 +194,21 @@ struct xfs_btree_ops { /* return address of btree structures */ union xfs_btree_ptr *(*ptr_addr)(struct xfs_btree_cur *cur, int index, struct xfs_btree_block *block); + union xfs_btree_key *(*key_addr)(struct xfs_btree_cur *cur, int index, + struct xfs_btree_block *block); + union xfs_btree_rec *(*rec_addr)(struct xfs_btree_cur *cur, int index, + struct xfs_btree_block *block); + + /* init values of btree structures */ + void (*init_key_from_rec)(struct xfs_btree_cur *cur, + union xfs_btree_key *key, + union xfs_btree_rec *rec); + void (*init_ptr_from_cur)(struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr); + + /* difference between key value and cursor value */ + __int64_t (*key_diff)(struct xfs_btree_cur *cur, + union xfs_btree_key *key); /* btree tracing */ #ifdef XFS_BTREE_TRACE @@ -512,6 +527,7 @@ xfs_btree_setbuf( int xfs_btree_increment(struct xfs_btree_cur *, int, int *); int 
xfs_btree_decrement(struct xfs_btree_cur *, int, int *); +int xfs_btree_lookup(struct xfs_btree_cur *, xfs_lookup_t, int *); /* Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-02 02:54:04.000000000 +0200 @@ -938,223 +938,6 @@ xfs_alloc_log_recs( } /* - * Lookup the record. The cursor is made to point to it, based on dir. - * Return 0 if can't find any such record, 1 for success. - */ -STATIC int /* error */ -xfs_alloc_lookup( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_lookup_t dir, /* <=, ==, or >= */ - int *stat) /* success/failure */ -{ - xfs_agblock_t agbno; /* a.g. relative btree block number */ - xfs_agnumber_t agno; /* allocation group number */ - xfs_alloc_block_t *block=NULL; /* current btree block */ - int diff; /* difference for the current key */ - int error; /* error return value */ - int keyno=0; /* current key number */ - int level; /* level in the btree */ - xfs_mount_t *mp; /* file system mount point */ - - XFS_STATS_INC(xs_abt_lookup); - /* - * Get the allocation group header, and the root block number. - */ - mp = cur->bc_mp; - - { - xfs_agf_t *agf; /* a.g. freespace header */ - - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - agno = be32_to_cpu(agf->agf_seqno); - agbno = be32_to_cpu(agf->agf_roots[cur->bc_btnum]); - } - /* - * Iterate over each level in the btree, starting at the root. - * For each level above the leaves, find the key we need, based - * on the lookup record, then follow the corresponding block - * pointer down to the next level. - */ - for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { - xfs_buf_t *bp; /* buffer pointer for btree block */ - xfs_daddr_t d; /* disk address of btree block */ - - /* - * Get the disk address we're looking for. 
- */ - d = XFS_AGB_TO_DADDR(mp, agno, agbno); - /* - * If the old buffer at this level is for a different block, - * throw it away, otherwise just use it. - */ - bp = cur->bc_bufs[level]; - if (bp && XFS_BUF_ADDR(bp) != d) - bp = NULL; - if (!bp) { - /* - * Need to get a new buffer. Read it, then - * set it in the cursor, releasing the old one. - */ - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, agno, - agbno, 0, &bp, XFS_ALLOC_BTREE_REF))) - return error; - xfs_btree_setbuf(cur, level, bp); - /* - * Point to the btree block, now that we have the buffer - */ - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, level, - bp))) - return error; - } else - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - /* - * If we already had a key match at a higher level, we know - * we need to use the first entry in this block. - */ - if (diff == 0) - keyno = 1; - /* - * Otherwise we need to search this block. Do a binary search. - */ - else { - int high; /* high entry number */ - xfs_alloc_key_t *kkbase=NULL;/* base of keys in block */ - xfs_alloc_rec_t *krbase=NULL;/* base of records in block */ - int low; /* low entry number */ - - /* - * Get a pointer to keys or records. - */ - if (level > 0) - kkbase = XFS_ALLOC_KEY_ADDR(block, 1, cur); - else - krbase = XFS_ALLOC_REC_ADDR(block, 1, cur); - /* - * Set low and high entry numbers, 1-based. - */ - low = 1; - if (!(high = be16_to_cpu(block->bb_numrecs))) { - /* - * If the block is empty, the tree must - * be an empty leaf. - */ - ASSERT(level == 0 && cur->bc_nlevels == 1); - cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; - *stat = 0; - return 0; - } - /* - * Binary search the block. - */ - while (low <= high) { - xfs_extlen_t blockcount; /* key value */ - xfs_agblock_t startblock; /* key value */ - - XFS_STATS_INC(xs_abt_compare); - /* - * keyno is average of low and high. - */ - keyno = (low + high) >> 1; - /* - * Get startblock & blockcount. 
- */ - if (level > 0) { - xfs_alloc_key_t *kkp; - - kkp = kkbase + keyno - 1; - startblock = be32_to_cpu(kkp->ar_startblock); - blockcount = be32_to_cpu(kkp->ar_blockcount); - } else { - xfs_alloc_rec_t *krp; - - krp = krbase + keyno - 1; - startblock = be32_to_cpu(krp->ar_startblock); - blockcount = be32_to_cpu(krp->ar_blockcount); - } - /* - * Compute difference to get next direction. - */ - if (cur->bc_btnum == XFS_BTNUM_BNO) - diff = (int)startblock - - (int)cur->bc_rec.a.ar_startblock; - else if (!(diff = (int)blockcount - - (int)cur->bc_rec.a.ar_blockcount)) - diff = (int)startblock - - (int)cur->bc_rec.a.ar_startblock; - /* - * Less than, move right. - */ - if (diff < 0) - low = keyno + 1; - /* - * Greater than, move left. - */ - else if (diff > 0) - high = keyno - 1; - /* - * Equal, we're done. - */ - else - break; - } - } - /* - * If there are more levels, set up for the next level - * by getting the block number and filling in the cursor. - */ - if (level > 0) { - /* - * If we moved left, need the previous key number, - * unless there isn't one. - */ - if (diff > 0 && --keyno < 1) - keyno = 1; - agbno = be32_to_cpu(*XFS_ALLOC_PTR_ADDR(block, keyno, cur)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, agbno, level))) - return error; -#endif - cur->bc_ptrs[level] = keyno; - } - } - /* - * Done with the search. - * See if we need to adjust the results. - */ - if (dir != XFS_LOOKUP_LE && diff < 0) { - keyno++; - /* - * If ge search and we went off the end of the block, but it's - * not the last block, we're in the wrong block. - */ - if (dir == XFS_LOOKUP_GE && - keyno > be16_to_cpu(block->bb_numrecs) && - be32_to_cpu(block->bb_rightsib) != NULLAGBLOCK) { - int i; - - cur->bc_ptrs[0] = keyno; - if ((error = xfs_btree_increment(cur, 0, &i))) - return error; - XFS_WANT_CORRUPTED_RETURN(i == 1); - *stat = 1; - return 0; - } - } - else if (dir == XFS_LOOKUP_LE && diff > 0) - keyno--; - cur->bc_ptrs[0] = keyno; - /* - * Return if we succeeded or not. 
- */ - if (keyno == 0 || keyno > be16_to_cpu(block->bb_numrecs)) - *stat = 0; - else - *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); - return 0; -} - -/* * Move 1 record left from cur/level if possible. * Update cur to reflect the new path. */ @@ -1919,53 +1702,6 @@ xfs_alloc_insert( } /* - * Lookup the record equal to [bno, len] in the btree given by cur. - */ -int /* error */ -xfs_alloc_lookup_eq( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t bno, /* starting block of extent */ - xfs_extlen_t len, /* length of extent */ - int *stat) /* success/failure */ -{ - cur->bc_rec.a.ar_startblock = bno; - cur->bc_rec.a.ar_blockcount = len; - return xfs_alloc_lookup(cur, XFS_LOOKUP_EQ, stat); -} - -/* - * Lookup the first record greater than or equal to [bno, len] - * in the btree given by cur. - */ -int /* error */ -xfs_alloc_lookup_ge( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t bno, /* starting block of extent */ - xfs_extlen_t len, /* length of extent */ - int *stat) /* success/failure */ -{ - cur->bc_rec.a.ar_startblock = bno; - cur->bc_rec.a.ar_blockcount = len; - return xfs_alloc_lookup(cur, XFS_LOOKUP_GE, stat); -} - -/* - * Lookup the first record less than or equal to [bno, len] - * in the btree given by cur. - */ -int /* error */ -xfs_alloc_lookup_le( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t bno, /* starting block of extent */ - xfs_extlen_t len, /* length of extent */ - int *stat) /* success/failure */ -{ - cur->bc_rec.a.ar_startblock = bno; - cur->bc_rec.a.ar_blockcount = len; - return xfs_alloc_lookup(cur, XFS_LOOKUP_LE, stat); -} - -/* * Update the record referred to by cur, to the value given by [bno, len]. * This either works (return 0) or gets an EFSCORRUPTED error. 
*/ @@ -2044,6 +1780,31 @@ xfs_allocbt_dup_cursor( cur->bc_btnum); } +STATIC void +xfs_allocbt_init_key_from_rec( + struct xfs_btree_cur *cur, + union xfs_btree_key *key, + union xfs_btree_rec *rec) +{ + ASSERT(rec->alloc.ar_startblock != 0); + + key->alloc.ar_startblock = rec->alloc.ar_startblock; + key->alloc.ar_blockcount = rec->alloc.ar_blockcount; +} + +STATIC void +xfs_allocbt_init_ptr_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + struct xfs_agf *agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); + + ASSERT(cur->bc_private.a.agno == be32_to_cpu(agf->agf_seqno)); + ASSERT(agf->agf_roots[cur->bc_btnum] != 0); + + ptr->s = agf->agf_roots[cur->bc_btnum]; +} + STATIC union xfs_btree_ptr * xfs_allocbt_ptr_addr( struct xfs_btree_cur *cur, @@ -2053,6 +1814,45 @@ xfs_allocbt_ptr_addr( return (union xfs_btree_ptr *)XFS_ALLOC_PTR_ADDR(block, index, cur); } +STATIC union xfs_btree_key * +xfs_allocbt_key_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_key *)XFS_ALLOC_KEY_ADDR(block, index, cur); +} + +STATIC union xfs_btree_rec * +xfs_allocbt_rec_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_rec *)XFS_ALLOC_REC_ADDR(block, index, cur); +} + +STATIC __int64_t +xfs_allocbt_key_diff( + struct xfs_btree_cur *cur, + union xfs_btree_key *key) +{ + xfs_alloc_rec_incore_t *rec = &cur->bc_rec.a; + xfs_alloc_key_t *kp = &key->alloc; + __int64_t diff; + + if (cur->bc_btnum == XFS_BTNUM_BNO) { + return (__int64_t)be32_to_cpu(kp->ar_startblock) - + rec->ar_startblock; + } + + diff = (__int64_t)be32_to_cpu(kp->ar_blockcount) - rec->ar_blockcount; + if (diff) + return diff; + + return (__int64_t)be32_to_cpu(kp->ar_startblock) - rec->ar_startblock; +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_allocbt_trace_buf; @@ -2122,7 +1922,12 @@ xfs_allocbt_trace_record( static const struct xfs_btree_ops xfs_allocbt_ops = { .dup_cursor = 
xfs_allocbt_dup_cursor, + .init_key_from_rec = xfs_allocbt_init_key_from_rec, + .init_ptr_from_cur = xfs_allocbt_init_ptr_from_cur, .ptr_addr = xfs_allocbt_ptr_addr, + .key_addr = xfs_allocbt_key_addr, + .rec_addr = xfs_allocbt_rec_addr, + .key_diff = xfs_allocbt_key_diff, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_allocbt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-02 02:54:40.000000000 +0200 @@ -829,212 +829,6 @@ xfs_inobt_log_recs( } /* - * Lookup the record. The cursor is made to point to it, based on dir. - * Return 0 if can't find any such record, 1 for success. - */ -STATIC int /* error */ -xfs_inobt_lookup( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_lookup_t dir, /* <=, ==, or >= */ - int *stat) /* success/failure */ -{ - xfs_agblock_t agbno; /* a.g. relative btree block number */ - xfs_agnumber_t agno; /* allocation group number */ - xfs_inobt_block_t *block=NULL; /* current btree block */ - __int64_t diff; /* difference for the current key */ - int error; /* error return value */ - int keyno=0; /* current key number */ - int level; /* level in the btree */ - xfs_mount_t *mp; /* file system mount point */ - - /* - * Get the allocation group header, and the root block number. - */ - mp = cur->bc_mp; - { - xfs_agi_t *agi; /* a.g. inode header */ - - agi = XFS_BUF_TO_AGI(cur->bc_private.a.agbp); - agno = be32_to_cpu(agi->agi_seqno); - agbno = be32_to_cpu(agi->agi_root); - } - /* - * Iterate over each level in the btree, starting at the root. - * For each level above the leaves, find the key we need, based - * on the lookup record, then follow the corresponding block - * pointer down to the next level. 
- */ - for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { - xfs_buf_t *bp; /* buffer pointer for btree block */ - xfs_daddr_t d; /* disk address of btree block */ - - /* - * Get the disk address we're looking for. - */ - d = XFS_AGB_TO_DADDR(mp, agno, agbno); - /* - * If the old buffer at this level is for a different block, - * throw it away, otherwise just use it. - */ - bp = cur->bc_bufs[level]; - if (bp && XFS_BUF_ADDR(bp) != d) - bp = NULL; - if (!bp) { - /* - * Need to get a new buffer. Read it, then - * set it in the cursor, releasing the old one. - */ - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - agno, agbno, 0, &bp, XFS_INO_BTREE_REF))) - return error; - xfs_btree_setbuf(cur, level, bp); - /* - * Point to the btree block, now that we have the buffer - */ - block = XFS_BUF_TO_INOBT_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, level, - bp))) - return error; - } else - block = XFS_BUF_TO_INOBT_BLOCK(bp); - /* - * If we already had a key match at a higher level, we know - * we need to use the first entry in this block. - */ - if (diff == 0) - keyno = 1; - /* - * Otherwise we need to search this block. Do a binary search. - */ - else { - int high; /* high entry number */ - xfs_inobt_key_t *kkbase=NULL;/* base of keys in block */ - xfs_inobt_rec_t *krbase=NULL;/* base of records in block */ - int low; /* low entry number */ - - /* - * Get a pointer to keys or records. - */ - if (level > 0) - kkbase = XFS_INOBT_KEY_ADDR(block, 1, cur); - else - krbase = XFS_INOBT_REC_ADDR(block, 1, cur); - /* - * Set low and high entry numbers, 1-based. - */ - low = 1; - if (!(high = be16_to_cpu(block->bb_numrecs))) { - /* - * If the block is empty, the tree must - * be an empty leaf. - */ - ASSERT(level == 0 && cur->bc_nlevels == 1); - cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; - *stat = 0; - return 0; - } - /* - * Binary search the block. 
- */ - while (low <= high) { - xfs_agino_t startino; /* key value */ - - /* - * keyno is average of low and high. - */ - keyno = (low + high) >> 1; - /* - * Get startino. - */ - if (level > 0) { - xfs_inobt_key_t *kkp; - - kkp = kkbase + keyno - 1; - startino = be32_to_cpu(kkp->ir_startino); - } else { - xfs_inobt_rec_t *krp; - - krp = krbase + keyno - 1; - startino = be32_to_cpu(krp->ir_startino); - } - /* - * Compute difference to get next direction. - */ - diff = (__int64_t) - startino - cur->bc_rec.i.ir_startino; - /* - * Less than, move right. - */ - if (diff < 0) - low = keyno + 1; - /* - * Greater than, move left. - */ - else if (diff > 0) - high = keyno - 1; - /* - * Equal, we're done. - */ - else - break; - } - } - /* - * If there are more levels, set up for the next level - * by getting the block number and filling in the cursor. - */ - if (level > 0) { - /* - * If we moved left, need the previous key number, - * unless there isn't one. - */ - if (diff > 0 && --keyno < 1) - keyno = 1; - agbno = be32_to_cpu(*XFS_INOBT_PTR_ADDR(block, keyno, cur)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, agbno, level))) - return error; -#endif - cur->bc_ptrs[level] = keyno; - } - } - /* - * Done with the search. - * See if we need to adjust the results. - */ - if (dir != XFS_LOOKUP_LE && diff < 0) { - keyno++; - /* - * If ge search and we went off the end of the block, but it's - * not the last block, we're in the wrong block. - */ - if (dir == XFS_LOOKUP_GE && - keyno > be16_to_cpu(block->bb_numrecs) && - be32_to_cpu(block->bb_rightsib) != NULLAGBLOCK) { - int i; - - cur->bc_ptrs[0] = keyno; - if ((error = xfs_btree_increment(cur, 0, &i))) - return error; - ASSERT(i == 1); - *stat = 1; - return 0; - } - } - else if (dir == XFS_LOOKUP_LE && diff > 0) - keyno--; - cur->bc_ptrs[0] = keyno; - /* - * Return if we succeeded or not. 
- */ - if (keyno == 0 || keyno > be16_to_cpu(block->bb_numrecs)) - *stat = 0; - else - *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); - return 0; -} - -/* * Move 1 record left from cur/level if possible. * Update cur to reflect the new path. */ @@ -1798,59 +1592,6 @@ xfs_inobt_insert( } /* - * Lookup the record equal to ino in the btree given by cur. - */ -int /* error */ -xfs_inobt_lookup_eq( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agino_t ino, /* starting inode of chunk */ - __int32_t fcnt, /* free inode count */ - xfs_inofree_t free, /* free inode mask */ - int *stat) /* success/failure */ -{ - cur->bc_rec.i.ir_startino = ino; - cur->bc_rec.i.ir_freecount = fcnt; - cur->bc_rec.i.ir_free = free; - return xfs_inobt_lookup(cur, XFS_LOOKUP_EQ, stat); -} - -/* - * Lookup the first record greater than or equal to ino - * in the btree given by cur. - */ -int /* error */ -xfs_inobt_lookup_ge( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agino_t ino, /* starting inode of chunk */ - __int32_t fcnt, /* free inode count */ - xfs_inofree_t free, /* free inode mask */ - int *stat) /* success/failure */ -{ - cur->bc_rec.i.ir_startino = ino; - cur->bc_rec.i.ir_freecount = fcnt; - cur->bc_rec.i.ir_free = free; - return xfs_inobt_lookup(cur, XFS_LOOKUP_GE, stat); -} - -/* - * Lookup the first record less than or equal to ino - * in the btree given by cur. - */ -int /* error */ -xfs_inobt_lookup_le( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agino_t ino, /* starting inode of chunk */ - __int32_t fcnt, /* free inode count */ - xfs_inofree_t free, /* free inode mask */ - int *stat) /* success/failure */ -{ - cur->bc_rec.i.ir_startino = ino; - cur->bc_rec.i.ir_freecount = fcnt; - cur->bc_rec.i.ir_free = free; - return xfs_inobt_lookup(cur, XFS_LOOKUP_LE, stat); -} - -/* * Update the record referred to by cur, to the value given * by [ino, fcnt, free]. * This either works (return 0) or gets an EFSCORRUPTED error. 
@@ -1910,6 +1651,30 @@ xfs_inobt_dup_cursor( cur->bc_private.a.agbp, cur->bc_private.a.agno); } +STATIC void +xfs_inobt_init_key_from_rec( + struct xfs_btree_cur *cur, + union xfs_btree_key *key, + union xfs_btree_rec *rec) +{ + key->inobt.ir_startino = rec->inobt.ir_startino; +} + +/* + * intial value of ptr for lookup + */ +STATIC void +xfs_inobt_init_ptr_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + struct xfs_agi *agi = XFS_BUF_TO_AGI(cur->bc_private.a.agbp); + + ASSERT(cur->bc_private.a.agno == be32_to_cpu(agi->agi_seqno)); + + ptr->s = agi->agi_root; +} + STATIC union xfs_btree_ptr * xfs_inobt_ptr_addr( struct xfs_btree_cur *cur, @@ -1919,6 +1684,33 @@ xfs_inobt_ptr_addr( return (union xfs_btree_ptr *)XFS_INOBT_PTR_ADDR(block, index, cur); } +STATIC union xfs_btree_key * +xfs_inobt_key_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_key *)XFS_INOBT_KEY_ADDR(block, index, cur); +} + +STATIC union xfs_btree_rec * +xfs_inobt_rec_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_rec *)XFS_INOBT_REC_ADDR(block, index, cur); +} + +STATIC __int64_t +xfs_inobt_key_diff( + struct xfs_btree_cur *cur, + union xfs_btree_key *key) +{ + return (__int64_t)be32_to_cpu(key->inobt.ir_startino) - + cur->bc_rec.i.ir_startino; +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_inobt_trace_buf; @@ -1988,7 +1780,12 @@ xfs_inobt_trace_record( static const struct xfs_btree_ops xfs_inobt_ops = { .dup_cursor = xfs_inobt_dup_cursor, + .init_key_from_rec = xfs_inobt_init_key_from_rec, + .init_ptr_from_cur = xfs_inobt_init_ptr_from_cur, .ptr_addr = xfs_inobt_ptr_addr, + .key_addr = xfs_inobt_key_addr, + .rec_addr = xfs_inobt_rec_addr, + .key_diff = xfs_inobt_key_diff, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_inobt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- 
linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 02:55:15.000000000 +0200 @@ -813,146 +813,6 @@ xfs_bmbt_log_ptrs( } /* - * Lookup the record. The cursor is made to point to it, based on dir. - */ -STATIC int /* error */ -xfs_bmbt_lookup( - xfs_btree_cur_t *cur, - xfs_lookup_t dir, - int *stat) /* success/failure */ -{ - xfs_bmbt_block_t *block=NULL; - xfs_buf_t *bp; - xfs_daddr_t d; - xfs_sfiloff_t diff; - int error; /* error return value */ - xfs_fsblock_t fsbno=0; - int high; - int i; - int keyno=0; - xfs_bmbt_key_t *kkbase=NULL; - xfs_bmbt_key_t *kkp; - xfs_bmbt_rec_t *krbase=NULL; - xfs_bmbt_rec_t *krp; - int level; - int low; - xfs_mount_t *mp; - xfs_bmbt_ptr_t *pp; - xfs_bmbt_irec_t *rp; - xfs_fileoff_t startoff; - xfs_trans_t *tp; - - XFS_STATS_INC(xs_bmbt_lookup); - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, (int)dir); - tp = cur->bc_tp; - mp = cur->bc_mp; - rp = &cur->bc_rec.b; - for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { - if (level < cur->bc_nlevels - 1) { - d = XFS_FSB_TO_DADDR(mp, fsbno); - bp = cur->bc_bufs[level]; - if (bp && XFS_BUF_ADDR(bp) != d) - bp = NULL; - if (!bp) { - if ((error = xfs_btree_read_bufl(mp, tp, fsbno, - 0, &bp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - xfs_btree_setbuf(cur, level, bp); - block = XFS_BUF_TO_BMBT_BLOCK(bp); - if ((error = xfs_btree_check_lblock(cur, block, - level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } else - block = XFS_BUF_TO_BMBT_BLOCK(bp); - } else - block = xfs_bmbt_get_block(cur, level, &bp); - if (diff == 0) - keyno = 1; - else { - if (level > 0) - kkbase = XFS_BMAP_KEY_IADDR(block, 1, cur); - else - krbase = XFS_BMAP_REC_IADDR(block, 1, cur); - low = 1; - if (!(high = be16_to_cpu(block->bb_numrecs))) { - ASSERT(level == 0); - cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 
0; - return 0; - } - while (low <= high) { - XFS_STATS_INC(xs_bmbt_compare); - keyno = (low + high) >> 1; - if (level > 0) { - kkp = kkbase + keyno - 1; - startoff = be64_to_cpu(kkp->br_startoff); - } else { - krp = krbase + keyno - 1; - startoff = xfs_bmbt_disk_get_startoff(krp); - } - diff = (xfs_sfiloff_t) - (startoff - rp->br_startoff); - if (diff < 0) - low = keyno + 1; - else if (diff > 0) - high = keyno - 1; - else - break; - } - } - if (level > 0) { - if (diff > 0 && --keyno < 1) - keyno = 1; - pp = XFS_BMAP_PTR_IADDR(block, keyno, cur); - fsbno = be64_to_cpu(*pp); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr(cur, fsbno, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - cur->bc_ptrs[level] = keyno; - } - } - if (dir != XFS_LOOKUP_LE && diff < 0) { - keyno++; - /* - * If ge search and we went off the end of the block, but it's - * not the last block, we're in the wrong block. - */ - if (dir == XFS_LOOKUP_GE && keyno > be16_to_cpu(block->bb_numrecs) && - be64_to_cpu(block->bb_rightsib) != NULLDFSBNO) { - cur->bc_ptrs[0] = keyno; - if ((error = xfs_btree_increment(cur, 0, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - XFS_WANT_CORRUPTED_RETURN(i == 1); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - } - else if (dir == XFS_LOOKUP_LE && diff > 0) - keyno--; - cur->bc_ptrs[0] = keyno; - if (keyno == 0 || keyno > be16_to_cpu(block->bb_numrecs)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - } else { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); - } - return 0; -} - -/* * Move 1 record left from cur/level if possible. * Update cur to reflect the new path. 
*/ @@ -1809,34 +1669,6 @@ xfs_bmbt_log_recs( XFS_BMBT_TRACE_CURSOR(cur, EXIT); } -int /* error */ -xfs_bmbt_lookup_eq( - xfs_btree_cur_t *cur, - xfs_fileoff_t off, - xfs_fsblock_t bno, - xfs_filblks_t len, - int *stat) /* success/failure */ -{ - cur->bc_rec.b.br_startoff = off; - cur->bc_rec.b.br_startblock = bno; - cur->bc_rec.b.br_blockcount = len; - return xfs_bmbt_lookup(cur, XFS_LOOKUP_EQ, stat); -} - -int /* error */ -xfs_bmbt_lookup_ge( - xfs_btree_cur_t *cur, - xfs_fileoff_t off, - xfs_fsblock_t bno, - xfs_filblks_t len, - int *stat) /* success/failure */ -{ - cur->bc_rec.b.br_startoff = off; - cur->bc_rec.b.br_startblock = bno; - cur->bc_rec.b.br_blockcount = len; - return xfs_bmbt_lookup(cur, XFS_LOOKUP_GE, stat); -} - /* * Give the bmap btree a new root block. Copy the old broot contents * down into a real block and make the broot point to it. @@ -2271,6 +2103,24 @@ xfs_bmbt_get_root_from_inode( return (struct xfs_btree_block *)ifp->if_broot; } +STATIC void +xfs_bmbt_init_key_from_rec( + struct xfs_btree_cur *cur, + union xfs_btree_key *key, + union xfs_btree_rec *rec) +{ + key->bmbt.br_startoff = + cpu_to_be64(xfs_bmbt_disk_get_startoff(&rec->bmbt)); +} + +STATIC void +xfs_bmbt_init_ptr_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + ptr->l = 0; +} + STATIC union xfs_btree_ptr * xfs_bmbt_ptr_addr( struct xfs_btree_cur *cur, @@ -2281,6 +2131,33 @@ xfs_bmbt_ptr_addr( XFS_BMAP_PTR_IADDR(&block->bb_h, index, cur); } +STATIC union xfs_btree_key * +xfs_bmbt_key_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_key *)XFS_BMAP_KEY_IADDR(block, index, cur); +} + +STATIC union xfs_btree_rec * +xfs_bmbt_rec_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_rec *)XFS_BMAP_REC_IADDR(block, index, cur); +} + +STATIC __int64_t +xfs_bmbt_key_diff( + struct xfs_btree_cur *cur, + union xfs_btree_key *key) +{ + return 
(__int64_t)be64_to_cpu(key->bmbt.br_startoff) - + cur->bc_rec.b.br_startoff; +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_bmbt_trace_buf; @@ -2370,7 +2247,12 @@ xfs_bmbt_trace_record( static const struct xfs_btree_ops xfs_bmbt_ops = { .dup_cursor = xfs_bmbt_dup_cursor, .get_root_from_inode = xfs_bmbt_get_root_from_inode, + .init_key_from_rec = xfs_bmbt_init_key_from_rec, + .init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur, .ptr_addr = xfs_bmbt_ptr_addr, + .key_addr = xfs_bmbt_key_addr, + .rec_addr = xfs_bmbt_rec_addr, + .key_diff = xfs_bmbt_key_diff, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_bmbt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_alloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 2008-08-02 02:53:26.000000000 +0200 @@ -90,6 +90,54 @@ STATIC int xfs_alloc_ag_vextent_small(xf */ /* + * Lookup the record equal to [bno, len] in the btree given by cur. + */ +STATIC int /* error */ +xfs_alloc_lookup_eq( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agblock_t bno, /* starting block of extent */ + xfs_extlen_t len, /* length of extent */ + int *stat) /* success/failure */ +{ + cur->bc_rec.a.ar_startblock = bno; + cur->bc_rec.a.ar_blockcount = len; + return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); +} + +/* + * Lookup the first record greater than or equal to [bno, len] + * in the btree given by cur. + */ +STATIC int /* error */ +xfs_alloc_lookup_ge( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agblock_t bno, /* starting block of extent */ + xfs_extlen_t len, /* length of extent */ + int *stat) /* success/failure */ +{ + cur->bc_rec.a.ar_startblock = bno; + cur->bc_rec.a.ar_blockcount = len; + return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); +} + +/* + * Lookup the first record less than or equal to [bno, len] + * in the btree given by cur. 
+ */ +STATIC int /* error */ +xfs_alloc_lookup_le( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agblock_t bno, /* starting block of extent */ + xfs_extlen_t len, /* length of extent */ + int *stat) /* success/failure */ +{ + cur->bc_rec.a.ar_startblock = bno; + cur->bc_rec.a.ar_blockcount = len; + return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); +} + + +/* * Compute aligned version of the found extent. * Takes alignment and min length into account. */ Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.h 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h 2008-08-02 02:53:26.000000000 +0200 @@ -114,26 +114,6 @@ extern int xfs_alloc_get_rec(struct xfs_ extern int xfs_alloc_insert(struct xfs_btree_cur *cur, int *stat); /* - * Lookup the record equal to [bno, len] in the btree given by cur. - */ -extern int xfs_alloc_lookup_eq(struct xfs_btree_cur *cur, xfs_agblock_t bno, - xfs_extlen_t len, int *stat); - -/* - * Lookup the first record greater than or equal to [bno, len] - * in the btree given by cur. - */ -extern int xfs_alloc_lookup_ge(struct xfs_btree_cur *cur, xfs_agblock_t bno, - xfs_extlen_t len, int *stat); - -/* - * Lookup the first record less than or equal to [bno, len] - * in the btree given by cur. - */ -extern int xfs_alloc_lookup_le(struct xfs_btree_cur *cur, xfs_agblock_t bno, - xfs_extlen_t len, int *stat); - -/* * Update the record referred to by cur, to the value given by [bno, len]. * This either works (return 0) or gets an EFSCORRUPTED error. */ Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c 2008-08-02 02:53:26.000000000 +0200 @@ -402,6 +402,35 @@ xfs_bmap_disk_count_leaves( * Bmap internal routines. 
*/ +STATIC int /* error */ +xfs_bmbt_lookup_eq( + struct xfs_btree_cur *cur, + xfs_fileoff_t off, + xfs_fsblock_t bno, + xfs_filblks_t len, + int *stat) /* success/failure */ +{ + cur->bc_rec.b.br_startoff = off; + cur->bc_rec.b.br_startblock = bno; + cur->bc_rec.b.br_blockcount = len; + return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); +} + +STATIC int /* error */ +xfs_bmbt_lookup_ge( + struct xfs_btree_cur *cur, + xfs_fileoff_t off, + xfs_fsblock_t bno, + xfs_filblks_t len, + int *stat) /* success/failure */ +{ + cur->bc_rec.b.br_startoff = off; + cur->bc_rec.b.br_startblock = bno; + cur->bc_rec.b.br_blockcount = len; + return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); +} + + /* * Called from xfs_bmap_add_attrfork to handle btree format files. */ Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.c 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.c 2008-08-02 02:53:26.000000000 +0200 @@ -119,6 +119,59 @@ xfs_ialloc_cluster_alignment( } /* + * Lookup the record equal to ino in the btree given by cur. + */ +STATIC int /* error */ +xfs_inobt_lookup_eq( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agino_t ino, /* starting inode of chunk */ + __int32_t fcnt, /* free inode count */ + xfs_inofree_t free, /* free inode mask */ + int *stat) /* success/failure */ +{ + cur->bc_rec.i.ir_startino = ino; + cur->bc_rec.i.ir_freecount = fcnt; + cur->bc_rec.i.ir_free = free; + return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); +} + +/* + * Lookup the first record greater than or equal to ino + * in the btree given by cur. 
+ */ +int /* error */ +xfs_inobt_lookup_ge( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agino_t ino, /* starting inode of chunk */ + __int32_t fcnt, /* free inode count */ + xfs_inofree_t free, /* free inode mask */ + int *stat) /* success/failure */ +{ + cur->bc_rec.i.ir_startino = ino; + cur->bc_rec.i.ir_freecount = fcnt; + cur->bc_rec.i.ir_free = free; + return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); +} + +/* + * Lookup the first record less than or equal to ino + * in the btree given by cur. + */ +int /* error */ +xfs_inobt_lookup_le( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agino_t ino, /* starting inode of chunk */ + __int32_t fcnt, /* free inode count */ + xfs_inofree_t free, /* free inode mask */ + int *stat) /* success/failure */ +{ + cur->bc_rec.i.ir_startino = ino; + cur->bc_rec.i.ir_freecount = fcnt; + cur->bc_rec.i.ir_free = free; + return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); +} + +/* * Allocate new inodes in the allocation group specified by agbp. * Return 0 for success, else error code. */ Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.h 2008-08-02 02:49:48.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.h 2008-08-02 02:53:26.000000000 +0200 @@ -154,6 +154,21 @@ xfs_ialloc_pagi_init( struct xfs_trans *tp, /* transaction pointer */ xfs_agnumber_t agno); /* allocation group number */ +/* + * Lookup the first record greater than or equal to ino + * in the btree given by cur. + */ +int xfs_inobt_lookup_ge(struct xfs_btree_cur *cur, xfs_agino_t ino, + __int32_t fcnt, xfs_inofree_t free, int *stat); + +/* + * Lookup the first record less than or equal to ino + * in the btree given by cur. 
+ */ +int xfs_inobt_lookup_le(struct xfs_btree_cur *cur, xfs_agino_t ino, + __int32_t fcnt, xfs_inofree_t free, int *stat); + + #endif /* __KERNEL__ */ #endif /* __XFS_IALLOC_H__ */ Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.h 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h 2008-08-02 02:53:26.000000000 +0200 @@ -136,26 +136,6 @@ extern int xfs_inobt_get_rec(struct xfs_ extern int xfs_inobt_insert(struct xfs_btree_cur *cur, int *stat); /* - * Lookup the record equal to ino in the btree given by cur. - */ -extern int xfs_inobt_lookup_eq(struct xfs_btree_cur *cur, xfs_agino_t ino, - __int32_t fcnt, xfs_inofree_t free, int *stat); - -/* - * Lookup the first record greater than or equal to ino - * in the btree given by cur. - */ -extern int xfs_inobt_lookup_ge(struct xfs_btree_cur *cur, xfs_agino_t ino, - __int32_t fcnt, xfs_inofree_t free, int *stat); - -/* - * Lookup the first record less than or equal to ino - * in the btree given by cur. - */ -extern int xfs_inobt_lookup_le(struct xfs_btree_cur *cur, xfs_agino_t ino, - __int32_t fcnt, xfs_inofree_t free, int *stat); - -/* * Update the record referred to by cur, to the value given * by [ino, fcnt, free]. * This either works (return 0) or gets an EFSCORRUPTED error. 
Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-08-02 02:53:11.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-08-02 02:53:26.000000000 +0200 @@ -254,10 +254,6 @@ extern int xfs_bmbt_insert(struct xfs_bt extern void xfs_bmbt_log_block(struct xfs_btree_cur *, struct xfs_buf *, int); extern void xfs_bmbt_log_recs(struct xfs_btree_cur *, struct xfs_buf *, int, int); -extern int xfs_bmbt_lookup_eq(struct xfs_btree_cur *, xfs_fileoff_t, - xfs_fsblock_t, xfs_filblks_t, int *); -extern int xfs_bmbt_lookup_ge(struct xfs_btree_cur *, xfs_fileoff_t, - xfs_fsblock_t, xfs_filblks_t, int *); /* * Give the bmap btree a new root block. Copy the old broot contents -- From owner-xfs@oss.sgi.com Sun Aug 3 18:30:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:31:18 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741UjIx030300 for ; Sun, 3 Aug 2008 18:30:47 -0700 X-ASG-Debug-ID: 1217813518-7399019f0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0E8AA3566AA for ; Sun, 3 Aug 2008 18:31:58 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id ey3983KkYRQG0zxE for ; Sun, 03 Aug 2008 18:31:58 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741VwIF008856 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:31:58 +0200 Received: (from hch@localhost) by verein.lst.de 
(8.12.3/8.12.3/Debian-6.6) id m741Vwm9008854 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:31:58 +0200 Date: Mon, 4 Aug 2008 03:31:58 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 00/26] generic btree implementation, version 3 Subject: [PATCH 00/26] generic btree implementation, version 3 Message-ID: <20080804013158.GA8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813520 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17325 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Firstly, all but one of Dave's review comments are addressed. There is also a new first patch to clean up the definition of struct xfs_btree_block, which I would consider an ASAP merge candidate. But there are also some bigger changes. For one, I reimplemented the newroot and killroot methods. In both cases one method was used for both block-rooted and inode-rooted btrees while doing something quite different for each. This patchset gets rid of these two methods completely and implements both cases in xfs_btree.c.
For new_root that was as easy as calling xfs_btree_new_root directly for the !XFS_BTREE_ROOT_IN_INODE case, moving xfs_bmbt_newroot into the common code and calling it for the other case. The XFS_BTREE_ROOT_IN_INODE case for kill_root was similarly easy to solve by just moving the code around and making it block/rec/key/ptr size agnostic, but the !XFS_BTREE_ROOT_IN_INODE case needed a little care. The alloc and inobt trees were in fact using almost exactly the same sequence built from ->set_root and ->free_block, but the odd way of passing the block to be freed made this non-obvious. By changing the calling convention to the normal one this case could be unified. I've left in some nasty asserts to ensure the assumptions about these blocks weren't wrong. The other big news is the last patch in the series, which adds rec_len and key_len members to the btree_ops structure and uses them to make the ptr_addr, key_addr, rec_addr, set_key, set_rec, move_keys, move_recs, copy_keys, copy_recs, log_keys and log_recs methods generic. This patch by itself saves a few hundred lines of code and more than two kilobytes in the binary image. With this, any worse generated code that needs to multiply by variables instead of constants should easily be offset by the reduced instruction cache footprint. The patch is so far not really commented and more a proof of concept - if everyone agrees with this approach the changes will be merged into the earlier patches. Now even with this move the set_*, move_* and copy_* interfaces are left as-is. All the old discussion still applies, except that things might be a little clearer now that there's just one implementation each and everything is contained in one file. I think keeping the copy_* interfaces is a good idea; they are now basically just typed memcpy wrappers - although we should switch the order of the src and dst arguments to be the same as memcpy.
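To make the "sized memcpy wrapper" idea concrete, here is a minimal sketch; it is not code from this series. It assumes a hypothetical struct demo_btree_ops that models only the rec_len and key_len members mentioned above, a made-up 16-byte block header (DEMO_HDR_LEN), and demo_* helper names invented purely for illustration. The argument order follows memcpy (dst first), as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the btree ops structure: only the two size
 * members named in the cover letter are modelled here.
 */
struct demo_btree_ops {
	size_t	rec_len;	/* size of one on-disk record */
	size_t	key_len;	/* size of one on-disk key */
};

/* Assumed layout: a fixed-size header followed by packed records. */
#define DEMO_HDR_LEN	16

/*
 * Generic rec_addr: the index is 1-based and the stride comes from
 * ops->rec_len, so this is a multiply by a variable, not a constant.
 */
static void *
demo_rec_addr(const struct demo_btree_ops *ops, void *block, int index)
{
	assert(index >= 1);
	return (char *)block + DEMO_HDR_LEN +
		(size_t)(index - 1) * ops->rec_len;
}

/* copy_recs as a sized memcpy wrapper, arguments in memcpy order. */
static void
demo_copy_recs(const struct demo_btree_ops *ops, void *dst,
	       const void *src, int numrecs)
{
	assert(numrecs >= 0);
	memcpy(dst, src, (size_t)numrecs * ops->rec_len);
}

/* move_recs in the same style; memmove allows overlapping ranges. */
static void
demo_move_recs(const struct demo_btree_ops *ops, void *dst,
	       const void *src, int numrecs)
{
	assert(numrecs >= 0);
	memmove(dst, src, (size_t)numrecs * ops->rec_len);
}
```

In this shape all addressing is left to the caller, who combines demo_rec_addr with the copy or move helper; the key variants would look the same with key_len in place of rec_len.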
The set_* interfaces can probably go away - over copy_ they just add the index based addressing which most callers don't use anyway. I'm not sure what to do with move_* - these are the most ugly helpers, so maybe we should just make them memmove wrappers in the style of copy_ and leave all addressing to the callers. With all these changes the stats for this series are now:

 37 files changed, 6309 insertions(+), 7873 deletions(-)

   text    data     bss     dec     hex filename
 631577    4227    3092  638896   9bfb0 fs/xfs/xfs.ko.base
 614298    4435    3124  621857   97d21 fs/xfs/xfs.ko.btree

--
From owner-xfs@oss.sgi.com Sun Aug 3 18:32:40 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:32:48 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_63, J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741WdiT031460 for ; Sun, 3 Aug 2008 18:32:39 -0700 X-ASG-Debug-ID: 1217813630-2dae03580000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1EFB7EF6A60 for ; Sun, 3 Aug 2008 18:33:51 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id IQ7XvVvg40BTKJXf for ; Sun, 03 Aug 2008 18:33:51 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741XpIF009081 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:33:52 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741XpUa009079 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:33:51 +0200 Date: Mon, 4 Aug 2008 03:33:51 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH
11/26] implement generic xfs_btree_decrement Subject: [PATCH 11/26] implement generic xfs_btree_decrement Message-ID: <20080804013351.GL8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-decrement User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813633 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17336 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs From: Dave Chinner [hch: split out from bigger patch and minor adaptions] Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-01 18:15:06.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-01 18:16:29.000000000 +0200 @@ -1052,3 +1052,102 @@ error0: XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); return error; } + +/* + * Decrement cursor by one record at the level. + * For nonzero levels the leaf-ward information is untouched. 
+ */ +int /* error */ +xfs_btree_decrement( + struct xfs_btree_cur *cur, + int level, + int *stat) /* success/failure */ +{ + struct xfs_btree_block *block; + xfs_buf_t *bp; + int error; /* error return value */ + int lev; + union xfs_btree_ptr ptr; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + + ASSERT(level < cur->bc_nlevels); + + /* Read-ahead to the left at this level. */ + xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); + + /* We're done if we remain in the block after the decrement. */ + if (--cur->bc_ptrs[level] > 0) + goto out1; + + /* Get a pointer to the btree block. */ + block = xfs_btree_get_block(cur, level, &bp); + +#ifdef DEBUG + error = xfs_btree_check_block(cur, block, level, bp); + if (error) + goto error0; +#endif + + /* Fail if we just went off the left edge of the tree. */ + xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_LEFTSIB); + if (xfs_btree_ptr_is_null(cur, &ptr)) + goto out0; + + XFS_BTREE_STATS_INC(cur, decrement); + + /* + * March up the tree decrementing pointers. + * Stop when we don't go off the left edge of a block. + */ + for (lev = level + 1; lev < cur->bc_nlevels; lev++) { + if (--cur->bc_ptrs[lev] > 0) + break; + /* Read-ahead the left block for the next loop. */ + xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); + } + + /* + * If we went off the root then we are seriously confused. + * or the root of the tree is in an inode. + */ + if (lev == cur->bc_nlevels) { + if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) + goto out0; + ASSERT(0); + error = EFSCORRUPTED; + goto error0; + } + ASSERT(lev < cur->bc_nlevels); + + /* + * Now walk back down the tree, fixing up the cursor's buffer + * pointers and key numbers. 
+ */ + for (block = xfs_btree_get_block(cur, lev, &bp); lev > level; ) { + union xfs_btree_ptr *ptrp; + + ptrp = cur->bc_ops->ptr_addr(cur, cur->bc_ptrs[lev], block); + error = xfs_btree_read_buf_block(cur, ptrp, --lev, + 0, &block, &bp); + if (error) + goto error0; + xfs_btree_setbuf(cur, lev, bp); + cur->bc_ptrs[lev] = xfs_btree_get_numrecs(block); + } +out1: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + Index: linux-2.6-xfs/fs/xfs/xfs_alloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 2008-08-01 18:16:14.000000000 +0200 @@ -961,7 +961,7 @@ xfs_alloc_ag_vextent_near( args->minlen, &ltbnoa, &ltlena); if (ltlena >= args->minlen) break; - if ((error = xfs_alloc_decrement(bno_cur_lt, 0, &i))) + if ((error = xfs_btree_decrement(bno_cur_lt, 0, &i))) goto error0; if (!i) { xfs_btree_del_cursor(bno_cur_lt, @@ -1162,7 +1162,7 @@ xfs_alloc_ag_vextent_near( /* * Fell off the left end. */ - if ((error = xfs_alloc_decrement( + if ((error = xfs_btree_decrement( bno_cur_lt, 0, &i))) goto error0; if (!i) { @@ -1321,7 +1321,7 @@ xfs_alloc_ag_vextent_size( bestflen = flen; bestfbno = fbno; for (;;) { - if ((error = xfs_alloc_decrement(cnt_cur, 0, &i))) + if ((error = xfs_btree_decrement(cnt_cur, 0, &i))) goto error0; if (i == 0) break; @@ -1416,7 +1416,7 @@ xfs_alloc_ag_vextent_small( xfs_extlen_t flen; int i; - if ((error = xfs_alloc_decrement(ccur, 0, &i))) + if ((error = xfs_btree_decrement(ccur, 0, &i))) goto error0; if (i) { if ((error = xfs_alloc_get_rec(ccur, &fbno, &flen, &i))) @@ -1607,7 +1607,7 @@ xfs_free_ag_extent( /* * Move the by-block cursor back to the left neighbor.
*/ - if ((error = xfs_alloc_decrement(bno_cur, 0, &i))) + if ((error = xfs_btree_decrement(bno_cur, 0, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); #ifdef DEBUG @@ -1653,7 +1653,7 @@ xfs_free_ag_extent( * Back up the by-block cursor to the left neighbor, and * update its length. */ - if ((error = xfs_alloc_decrement(bno_cur, 0, &i))) + if ((error = xfs_btree_decrement(bno_cur, 0, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); nbno = ltbno; Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-01 18:16:14.000000000 +0200 @@ -256,7 +256,7 @@ xfs_alloc_delrec( xfs_btree_setbuf(cur, level, NULL); cur->bc_nlevels--; } else if (level > 0 && - (error = xfs_alloc_decrement(cur, level, &i))) + (error = xfs_btree_decrement(cur, level, &i))) return error; *stat = 1; return 0; @@ -272,7 +272,7 @@ xfs_alloc_delrec( * the minimum, we're done. 
*/ if (numrecs >= XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if (level > 0 && (error = xfs_alloc_decrement(cur, level, &i))) + if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) return error; *stat = 1; return 0; @@ -336,7 +336,7 @@ xfs_alloc_delrec( xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); if (level > 0 && - (error = xfs_alloc_decrement(cur, level, + (error = xfs_btree_decrement(cur, level, &i))) return error; *stat = 1; @@ -352,7 +352,7 @@ xfs_alloc_delrec( if (lbno != NULLAGBLOCK) { i = xfs_btree_firstrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_decrement(tcur, level, &i))) + if ((error = xfs_btree_decrement(tcur, level, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); } @@ -368,7 +368,7 @@ xfs_alloc_delrec( */ i = xfs_btree_firstrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_decrement(tcur, level, &i))) + if ((error = xfs_btree_decrement(tcur, level, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); xfs_btree_firstrec(tcur, level); @@ -468,7 +468,7 @@ xfs_alloc_delrec( * Just return. This is probably a logic error, but it's not fatal. */ else { - if (level > 0 && (error = xfs_alloc_decrement(cur, level, &i))) + if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) return error; *stat = 1; return 0; @@ -1780,90 +1780,6 @@ xfs_alloc_updkey( */ /* - * Decrement cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -int /* error */ -xfs_alloc_decrement( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ -{ - xfs_alloc_block_t *block; /* btree block */ - int error; /* error return value */ - int lev; /* btree level */ - - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the left at this level. - */ - xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); - /* - * Decrement the ptr at this level. 
If we're still in the block - * then we're done. - */ - if (--cur->bc_ptrs[level] > 0) { - *stat = 1; - return 0; - } - /* - * Get a pointer to the btree block. - */ - block = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[level]); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, - cur->bc_bufs[level]))) - return error; -#endif - /* - * If we just went off the left edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * March up the tree decrementing pointers. - * Stop when we don't go off the left edge of a block. - */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - if (--cur->bc_ptrs[lev] > 0) - break; - /* - * Read-ahead the left block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); - } - /* - * If we went off the root then we are seriously confused. - */ - ASSERT(lev < cur->bc_nlevels); - /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. - */ - for (block = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[lev]); lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - xfs_buf_t *bp; /* buffer pointer for block */ - - agbno = be32_to_cpu(*XFS_ALLOC_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, agbno, 0, &bp, - XFS_ALLOC_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = be16_to_cpu(block->bb_numrecs); - } - *stat = 1; - return 0; -} - -/* * Delete the record pointed to by cur. * The cursor refers to the place where the record was (could be inserted) * when the operation returns. 
@@ -1889,7 +1805,7 @@ xfs_alloc_delete( if (i == 0) { for (level = 1; level < cur->bc_nlevels; level++) { if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_alloc_decrement(cur, level, &i))) + if ((error = xfs_btree_decrement(cur, level, &i))) return error; break; } Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.h 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h 2008-08-01 18:16:14.000000000 +0200 @@ -95,12 +95,6 @@ typedef struct xfs_btree_sblock xfs_allo XFS_BTREE_PTR_ADDR(xfs_alloc, bb, i, XFS_ALLOC_BLOCK_MAXRECS(1, cur)) /* - * Decrement cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -extern int xfs_alloc_decrement(struct xfs_btree_cur *cur, int level, int *stat); - -/* * Delete the record pointed to by cur. * The cursor refers to the place where the record was (could be inserted) * when the operation returns. 
Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c 2008-08-01 18:16:14.000000000 +0200 @@ -820,7 +820,7 @@ xfs_bmap_add_extent_delay_real( if ((error = xfs_bmbt_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_bmbt_update(cur, LEFT.br_startoff, @@ -1381,13 +1381,13 @@ xfs_bmap_add_extent_unwritten_real( if ((error = xfs_bmbt_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_bmbt_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_bmbt_update(cur, LEFT.br_startoff, @@ -1430,7 +1430,7 @@ xfs_bmap_add_extent_unwritten_real( if ((error = xfs_bmbt_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_bmbt_update(cur, LEFT.br_startoff, @@ -1473,7 +1473,7 @@ xfs_bmap_add_extent_unwritten_real( if ((error = xfs_bmbt_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_bmbt_update(cur, new->br_startoff, @@ -1556,7 +1556,7 @@ xfs_bmap_add_extent_unwritten_real( PREV.br_blockcount - new->br_blockcount, oldext))) goto done; - 
if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; if (xfs_bmbt_update(cur, LEFT.br_startoff, LEFT.br_startblock, @@ -2108,7 +2108,7 @@ xfs_bmap_add_extent_hole_real( if ((error = xfs_bmbt_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_bmbt_update(cur, left.br_startoff, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-01 18:16:14.000000000 +0200 @@ -203,7 +203,7 @@ xfs_bmbt_delrec( XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &j))) { + if (level > 0 && (error = xfs_btree_decrement(cur, level, &j))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -216,7 +216,7 @@ xfs_bmbt_delrec( goto error0; } if (numrecs >= XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &j))) { + if (level > 0 && (error = xfs_btree_decrement(cur, level, &j))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -237,7 +237,7 @@ xfs_bmbt_delrec( XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &i))) { + if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -282,7 +282,7 @@ xfs_bmbt_delrec( xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); tcur = NULL; if (level > 0) { - if ((error = xfs_bmbt_decrement(cur, + if ((error = xfs_btree_decrement(cur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); @@ -298,7 +298,7 @@ xfs_bmbt_delrec( if (lbno != NULLFSBLOCK) { i = xfs_btree_firstrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, 
error0); - if ((error = xfs_bmbt_decrement(tcur, level, &i))) { + if ((error = xfs_btree_decrement(tcur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -311,7 +311,7 @@ xfs_bmbt_delrec( /* * decrement to last in block */ - if ((error = xfs_bmbt_decrement(tcur, level, &i))) { + if ((error = xfs_btree_decrement(tcur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -383,7 +383,7 @@ xfs_bmbt_delrec( } lrecs = be16_to_cpu(left->bb_numrecs); } else { - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &i))) { + if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -1487,80 +1487,6 @@ xfs_bmdr_to_bmbt( } /* - * Decrement cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -int /* error */ -xfs_bmbt_decrement( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ -{ - xfs_bmbt_block_t *block; - xfs_buf_t *bp; - int error; /* error return value */ - xfs_fsblock_t fsbno; - int lev; - xfs_mount_t *mp; - xfs_trans_t *tp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - ASSERT(level < cur->bc_nlevels); - - xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); - - if (--cur->bc_ptrs[level] > 0) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - block = xfs_bmbt_get_block(cur, level, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (be64_to_cpu(block->bb_leftsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - if (--cur->bc_ptrs[lev] > 0) - break; - xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); - } - if (lev == cur->bc_nlevels) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - tp = cur->bc_tp; - mp = cur->bc_mp; - for (block = 
xfs_bmbt_get_block(cur, lev, &bp); lev > level; ) { - fsbno = be64_to_cpu(*XFS_BMAP_PTR_IADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufl(mp, tp, fsbno, 0, &bp, - XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_BMBT_BLOCK(bp); - if ((error = xfs_btree_check_lblock(cur, block, lev, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - cur->bc_ptrs[lev] = be16_to_cpu(block->bb_numrecs); - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; -} - -/* * Delete the record pointed to by cur. */ int /* error */ @@ -1582,7 +1508,7 @@ xfs_bmbt_delete( if (i == 0) { for (level = 1; level < cur->bc_nlevels; level++) { if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_bmbt_decrement(cur, level, + if ((error = xfs_btree_decrement(cur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-08-01 18:16:14.000000000 +0200 @@ -237,7 +237,6 @@ typedef struct xfs_btree_lblock xfs_bmbt * Prototypes for xfs_bmap.c to call. 
*/ extern void xfs_bmdr_to_bmbt(xfs_bmdr_block_t *, int, xfs_bmbt_block_t *, int); -extern int xfs_bmbt_decrement(struct xfs_btree_cur *, int, int *); extern int xfs_bmbt_delete(struct xfs_btree_cur *, int *); extern void xfs_bmbt_get_all(xfs_bmbt_rec_host_t *r, xfs_bmbt_irec_t *s); extern xfs_bmbt_block_t *xfs_bmbt_get_block(struct xfs_btree_cur *cur, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.c 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.c 2008-08-01 18:16:14.000000000 +0200 @@ -739,7 +739,7 @@ nextag: /* * Search left with tcur, back up 1 record. */ - if ((error = xfs_inobt_decrement(tcur, 0, &i))) + if ((error = xfs_btree_decrement(tcur, 0, &i))) goto error1; doneleft = !i; if (!doneleft) { @@ -815,7 +815,7 @@ nextag: * further left. */ if (useleft) { - if ((error = xfs_inobt_decrement(tcur, 0, + if ((error = xfs_btree_decrement(tcur, 0, &i))) goto error1; doneleft = !i; Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-01 18:16:14.000000000 +0200 @@ -205,7 +205,7 @@ xfs_inobt_delrec( cur->bc_bufs[level] = NULL; cur->bc_nlevels--; } else if (level > 0 && - (error = xfs_inobt_decrement(cur, level, &i))) + (error = xfs_btree_decrement(cur, level, &i))) return error; *stat = 1; return 0; @@ -222,7 +222,7 @@ xfs_inobt_delrec( */ if (numrecs >= XFS_INOBT_BLOCK_MINRECS(level, cur)) { if (level > 0 && - (error = xfs_inobt_decrement(cur, level, &i))) + (error = xfs_btree_decrement(cur, level, &i))) return error; *stat = 1; return 0; @@ -286,7 +286,7 @@ xfs_inobt_delrec( xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); if (level > 0 && - (error = xfs_inobt_decrement(cur, level, + (error = xfs_btree_decrement(cur, level, &i))) return error; 
*stat = 1; @@ -301,7 +301,7 @@ xfs_inobt_delrec( rrecs = be16_to_cpu(right->bb_numrecs); if (lbno != NULLAGBLOCK) { xfs_btree_firstrec(tcur, level); - if ((error = xfs_inobt_decrement(tcur, level, &i))) + if ((error = xfs_btree_decrement(tcur, level, &i))) goto error0; } } @@ -315,7 +315,7 @@ xfs_inobt_delrec( * previous block. */ xfs_btree_firstrec(tcur, level); - if ((error = xfs_inobt_decrement(tcur, level, &i))) + if ((error = xfs_btree_decrement(tcur, level, &i))) goto error0; xfs_btree_firstrec(tcur, level); /* @@ -414,7 +414,7 @@ xfs_inobt_delrec( * Just return. This is probably a logic error, but it's not fatal. */ else { - if (level > 0 && (error = xfs_inobt_decrement(cur, level, &i))) + if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) return error; *stat = 1; return 0; @@ -1656,90 +1656,6 @@ xfs_inobt_updkey( */ /* - * Decrement cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -int /* error */ -xfs_inobt_decrement( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ -{ - xfs_inobt_block_t *block; /* btree block */ - int error; - int lev; /* btree level */ - - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the left at this level. - */ - xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); - /* - * Decrement the ptr at this level. If we're still in the block - * then we're done. - */ - if (--cur->bc_ptrs[level] > 0) { - *stat = 1; - return 0; - } - /* - * Get a pointer to the btree block. - */ - block = XFS_BUF_TO_INOBT_BLOCK(cur->bc_bufs[level]); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, - cur->bc_bufs[level]))) - return error; -#endif - /* - * If we just went off the left edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * March up the tree decrementing pointers. 
- * Stop when we don't go off the left edge of a block. - */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - if (--cur->bc_ptrs[lev] > 0) - break; - /* - * Read-ahead the left block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); - } - /* - * If we went off the root then we are seriously confused. - */ - ASSERT(lev < cur->bc_nlevels); - /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. - */ - for (block = XFS_BUF_TO_INOBT_BLOCK(cur->bc_bufs[lev]); lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - xfs_buf_t *bp; /* buffer containing btree block */ - - agbno = be32_to_cpu(*XFS_INOBT_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, agbno, 0, &bp, - XFS_INO_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_INOBT_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = be16_to_cpu(block->bb_numrecs); - } - *stat = 1; - return 0; -} - -/* * Delete the record pointed to by cur. * The cursor refers to the place where the record was (could be inserted) * when the operation returns. @@ -1765,7 +1681,7 @@ xfs_inobt_delete( if (i == 0) { for (level = 1; level < cur->bc_nlevels; level++) { if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_inobt_decrement(cur, level, &i))) + if ((error = xfs_btree_decrement(cur, level, &i))) return error; break; } Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.h 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h 2008-08-01 18:16:14.000000000 +0200 @@ -117,12 +117,6 @@ typedef struct xfs_btree_sblock xfs_inob i, XFS_INOBT_BLOCK_MAXRECS(1, cur))) /* - * Decrement cursor by one record at the level. 
- * For nonzero levels the leaf-ward information is untouched. - */ -extern int xfs_inobt_decrement(struct xfs_btree_cur *cur, int level, int *stat); - -/* * Delete the record pointed to by cur. * The cursor refers to the place where the record was (could be inserted) * when the operation returns. Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-01 18:10:44.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-01 18:16:14.000000000 +0200 @@ -511,6 +511,7 @@ xfs_btree_setbuf( */ int xfs_btree_increment(struct xfs_btree_cur *, int, int *); +int xfs_btree_decrement(struct xfs_btree_cur *, int, int *); /* -- From owner-xfs@oss.sgi.com Sun Aug 3 18:32:21 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:32:31 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_42, RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741WLBb031172 for ; Sun, 3 Aug 2008 18:32:21 -0700 X-ASG-Debug-ID: 1217813612-16a602eb0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 129BA197642B for ; Sun, 3 Aug 2008 18:33:33 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id anQwla8VGpIOuv5F for ; Sun, 03 Aug 2008 18:33:33 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741XYIF009045 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:33:34 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741XYsr009043 for 
xfs@oss.sgi.com; Mon, 4 Aug 2008 03:33:34 +0200 Date: Mon, 4 Aug 2008 03:33:34 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 09/26] make btree tracing generic Subject: [PATCH 09/26] make btree tracing generic Message-ID: <20080804013334.GJ8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-btree-trace User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813614 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.22 X-Barracuda-Spam-Status: No, SCORE=-1.22 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_MJ615, MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1688 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words 0.20 BSF_SC0_MJ615 Custom Rule MJ615 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17334 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs make the existing bmap btree tracing generic so that it applies to all btree types. Some fragments lifted from a patch by Dave Chinner. 
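As a rough illustration of what "generic" means here (a sketch only, not code from this patch): the shared trace helpers no longer know about bmap records; they ask the cursor's ops vector to decode the key and to store the entry. Every demo_* name below is made up for the example, and the ktrace buffer is replaced by a couple of fields on the cursor so the snippet stands alone in userspace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct demo_cur;

/*
 * Hypothetical per-btree-type hooks, loosely modelled on the trace_*
 * members the patch adds to struct xfs_btree_ops. Not the kernel API.
 */
struct demo_ops {
	void (*trace_enter)(struct demo_cur *cur, const char *func,
			    int line, uint64_t hi, uint64_t lo);
	void (*trace_key)(struct demo_cur *cur, const void *key,
			  uint64_t *l0);
};

struct demo_cur {
	const struct demo_ops	*ops;
	const char		*name;
	uint64_t		last_hi;	/* stand-in for a trace buffer */
	uint64_t		last_lo;
	int			entries;
};

/* Shared "enter" hook: remember the last entry and print it. */
static void demo_record(struct demo_cur *cur, const char *func,
			int line, uint64_t hi, uint64_t lo)
{
	cur->last_hi = hi;
	cur->last_lo = lo;
	cur->entries++;
	printf("%s: %s:%d key=%llu:%llu\n", cur->name, func, line,
	       (unsigned long long)hi, (unsigned long long)lo);
}

/* Assumed key layout for the example: a 64-bit file offset. */
static void demo_bmap_trace_key(struct demo_cur *cur, const void *key,
				uint64_t *l0)
{
	(void)cur;
	*l0 = *(const uint64_t *)key;
}

/* Assumed key layout for the example: a 32-bit block number. */
static void demo_alloc_trace_key(struct demo_cur *cur, const void *key,
				 uint64_t *l0)
{
	(void)cur;
	*l0 = *(const uint32_t *)key;
}

static const struct demo_ops demo_bmap_ops = {
	.trace_enter	= demo_record,
	.trace_key	= demo_bmap_trace_key,
};

static const struct demo_ops demo_alloc_ops = {
	.trace_enter	= demo_record,
	.trace_key	= demo_alloc_trace_key,
};

/* Generic helper: all type-specific work goes through cur->ops. */
static void demo_trace_argk(const char *func, struct demo_cur *cur,
			    const void *key, int line)
{
	uint64_t l0;

	assert(cur->ops && cur->ops->trace_key && cur->ops->trace_enter);
	cur->ops->trace_key(cur, key, &l0);
	cur->ops->trace_enter(cur, func, line, l0 >> 32, (uint32_t)l0);
}

#define DEMO_TRACE_ARGK(c, k) \
	demo_trace_argk(__func__, c, k, __LINE__)

/* Trace one key on each btree type; returns the number of entries. */
int demo_run(void)
{
	struct demo_cur bmap = { .ops = &demo_bmap_ops, .name = "bmap" };
	struct demo_cur alloc = { .ops = &demo_alloc_ops, .name = "alloc" };
	uint64_t off = 0x100000002ULL;
	uint32_t blk = 7;

	DEMO_TRACE_ARGK(&bmap, &off);
	DEMO_TRACE_ARGK(&alloc, &blk);
	return bmap.entries + alloc.entries;
}
```

The same DEMO_TRACE_ARGK call site works for either cursor; only the ops table differs, which is the property that lets the XFS_BMBT_TRACE_ macros become thin wrappers around generic ones.
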
Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-07-29 16:23:23.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-07-29 16:52:06.000000000 +0200 @@ -190,6 +190,25 @@ struct xfs_btree_ops { /* get inode rooted btree root */ struct xfs_btree_block *(*get_root_from_inode)(struct xfs_btree_cur *); + + /* btree tracing */ +#ifdef XFS_BTREE_TRACE + void (*trace_enter)(struct xfs_btree_cur *, const char *, + char *, int, int, __psunsigned_t, + __psunsigned_t, __psunsigned_t, + __psunsigned_t, __psunsigned_t, + __psunsigned_t, __psunsigned_t, + __psunsigned_t, __psunsigned_t, + __psunsigned_t, __psunsigned_t); + void (*trace_cursor)(struct xfs_btree_cur *, __uint32_t *, + __uint64_t *, __uint64_t *); + void (*trace_key)(struct xfs_btree_cur *, + union xfs_btree_key *, __uint64_t *, + __uint64_t *); + void (*trace_record)(struct xfs_btree_cur *, + union xfs_btree_rec *, __uint64_t *, + __uint64_t *, __uint64_t *); +#endif }; /* Index: linux-2.6-xfs/fs/xfs/xfs_btree_trace.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6-xfs/fs/xfs/xfs_btree_trace.c 2008-07-29 16:59:40.000000000 +0200 @@ -0,0 +1,247 @@ +/* + * Copyright (c) 2008 Silicon Graphics, Inc. + * All Rights Reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ +#include "xfs.h" +#include "xfs_types.h" +#include "xfs_inum.h" +#include "xfs_bmap_btree.h" +#include "xfs_alloc_btree.h" +#include "xfs_ialloc_btree.h" +#include "xfs_inode.h" +#include "xfs_btree.h" +#include "xfs_btree_trace.h" + +STATIC void +xfs_btree_trace_ptr( + struct xfs_btree_cur *cur, + union xfs_btree_ptr ptr, + __psunsigned_t *high, + __psunsigned_t *low) +{ + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + __u64 val = be64_to_cpu(ptr.l); + *high = val >> 32; + *low = (int)val; + } else { + *high = 0; + *low = be32_to_cpu(ptr.s); + } +} + +/* + * Add a trace buffer entry for arguments, for a buffer & 1 integer arg. + */ +void +xfs_btree_trace_argbi( + const char *func, + struct xfs_btree_cur *cur, + struct xfs_buf *b, + int i, + int line) +{ + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGBI, + line, (__psunsigned_t)b, i, 0, 0, 0, 0, 0, + 0, 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for a buffer & 2 integer args. + */ +void +xfs_btree_trace_argbii( + const char *func, + struct xfs_btree_cur *cur, + struct xfs_buf *b, + int i0, + int i1, + int line) +{ + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGBII, + line, (__psunsigned_t)b, i0, i1, 0, 0, 0, 0, + 0, 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for 3 block-length args + * and an integer arg. + */ +void +xfs_btree_trace_argfffi( + const char *func, + struct xfs_btree_cur *cur, + xfs_dfiloff_t o, + xfs_dfsbno_t b, + xfs_dfilblks_t i, + int j, + int line) +{ + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGFFFI, line, + o >> 32, (int)o, b >> 32, (int)b, + i >> 32, (int)i, (int)j, 0, + 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for one integer arg. 
+ */ +void +xfs_btree_trace_argi( + const char *func, + struct xfs_btree_cur *cur, + int i, + int line) +{ + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGI, + line, i, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for int, fsblock, key. + */ +void +xfs_btree_trace_argipk( + const char *func, + struct xfs_btree_cur *cur, + int i, + union xfs_btree_ptr ptr, + union xfs_btree_key *key, + int line) +{ + __psunsigned_t high, low; + __uint64_t l0, l1; + + xfs_btree_trace_ptr(cur, ptr, &high, &low); + cur->bc_ops->trace_key(cur, key, &l0, &l1); + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGIPK, + line, i, high, low, + l0 >> 32, (int)l0, + l1 >> 32, (int)l1, + 0, 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for int, fsblock, rec. + */ +void +xfs_btree_trace_argipr( + const char *func, + struct xfs_btree_cur *cur, + int i, + union xfs_btree_ptr ptr, + union xfs_btree_rec *rec, + int line) +{ + __psunsigned_t high, low; + __uint64_t l0, l1, l2; + + xfs_btree_trace_ptr(cur, ptr, &high, &low); + cur->bc_ops->trace_record(cur, rec, &l0, &l1, &l2); + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGIPR, + line, i, + high, low, + l0 >> 32, (int)l0, + l1 >> 32, (int)l1, + l2 >> 32, (int)l2, + 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for int, key. + */ +void +xfs_btree_trace_argik( + const char *func, + struct xfs_btree_cur *cur, + int i, + union xfs_btree_key *key, + int line) +{ + __uint64_t l0, l1; + + cur->bc_ops->trace_key(cur, key, &l0, &l1); + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGIK, + line, i, + l0 >> 32, (int)l0, + l1 >> 32, (int)l1, + 0, 0, 0, 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for record. 
+ */ +void +xfs_btree_trace_argr( + const char *func, + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec, + int line) +{ + __uint64_t l0, l1, l2; + + cur->bc_ops->trace_record(cur, rec, &l0, &l1, &l2); + cur->bc_ops->trace_enter(cur, func, XBT_ARGS, XFS_BTREE_KTRACE_ARGR, + line, + l0 >> 32, (int)l0, + l1 >> 32, (int)l1, + l2 >> 32, (int)l2, + 0, 0, 0, 0, 0); +} + +/* + * Add a trace buffer entry for the cursor/operation. + */ +void +xfs_btree_trace_cursor( + const char *func, + struct xfs_btree_cur *cur, + int type, + int line) +{ + __uint32_t s0; + __uint64_t l0, l1; + char *s; + + switch (type) { + case XBT_ARGS: + s = "args"; + break; + case XBT_ENTRY: + s = "entry"; + break; + case XBT_ERROR: + s = "error"; + break; + case XBT_EXIT: + s = "exit"; + break; + default: + s = "unknown"; + break; + } + + cur->bc_ops->trace_cursor(cur, &s0, &l0, &l1); + cur->bc_ops->trace_enter(cur, func, s, XFS_BTREE_KTRACE_CUR, line, + s0, + l0 >> 32, (int)l0, + l1 >> 32, (int)l1, + (__psunsigned_t)cur->bc_bufs[0], + (__psunsigned_t)cur->bc_bufs[1], + (__psunsigned_t)cur->bc_bufs[2], + (__psunsigned_t)cur->bc_bufs[3], + (cur->bc_ptrs[0] << 16) | cur->bc_ptrs[1], + (cur->bc_ptrs[2] << 16) | cur->bc_ptrs[3]); +} Index: linux-2.6-xfs/fs/xfs/Makefile =================================================================== --- linux-2.6-xfs.orig/fs/xfs/Makefile 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/Makefile 2008-07-29 16:23:24.000000000 +0200 @@ -81,7 +81,8 @@ xfs-y += xfs_alloc.o \ xfs_dmops.o \ xfs_qmops.o -xfs-$(CONFIG_XFS_TRACE) += xfs_dir2_trace.o +xfs-$(CONFIG_XFS_TRACE) += xfs_btree_trace.o \ + xfs_dir2_trace.o # Objects in linux/ xfs-y += $(addprefix $(XFS_LINUX)/, \ Index: linux-2.6-xfs/fs/xfs/xfs.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs.h 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs.h 2008-07-29 16:23:24.000000000 +0200 @@ -30,7 +30,7 @@ #define XFS_ATTR_TRACE 1 #define 
XFS_BLI_TRACE 1 #define XFS_BMAP_TRACE 1 -#define XFS_BMBT_TRACE 1 +#define XFS_BTREE_TRACE 1 #define XFS_DIR2_TRACE 1 #define XFS_DQUOT_TRACE 1 #define XFS_ILOCK_TRACE 1 Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-07-29 16:57:57.000000000 +0200 @@ -37,16 +37,13 @@ #include "xfs_inode_item.h" #include "xfs_alloc.h" #include "xfs_btree.h" +#include "xfs_btree_trace.h" #include "xfs_ialloc.h" #include "xfs_itable.h" #include "xfs_bmap.h" #include "xfs_error.h" #include "xfs_quota.h" -#if defined(XFS_BMBT_TRACE) -ktrace_t *xfs_bmbt_trace_buf; -#endif - /* * Prototypes for internal btree functions. */ @@ -61,245 +58,33 @@ STATIC int xfs_bmbt_split(xfs_btree_cur_ __uint64_t *, xfs_btree_cur_t **, int *); STATIC int xfs_bmbt_updkey(xfs_btree_cur_t *, xfs_bmbt_key_t *, int); - -#if defined(XFS_BMBT_TRACE) - -static char ARGS[] = "args"; -static char ENTRY[] = "entry"; -static char ERROR[] = "error"; #undef EXIT -static char EXIT[] = "exit"; -/* - * Add a trace buffer entry for the arguments given to the routine, - * generic form. 
- */ -STATIC void -xfs_bmbt_trace_enter( - const char *func, - xfs_btree_cur_t *cur, - char *s, - int type, - int line, - __psunsigned_t a0, - __psunsigned_t a1, - __psunsigned_t a2, - __psunsigned_t a3, - __psunsigned_t a4, - __psunsigned_t a5, - __psunsigned_t a6, - __psunsigned_t a7, - __psunsigned_t a8, - __psunsigned_t a9, - __psunsigned_t a10) -{ - xfs_inode_t *ip; - int whichfork; - - ip = cur->bc_private.b.ip; - whichfork = cur->bc_private.b.whichfork; - ktrace_enter(xfs_bmbt_trace_buf, - (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), - (void *)func, (void *)s, (void *)ip, (void *)cur, - (void *)a0, (void *)a1, (void *)a2, (void *)a3, - (void *)a4, (void *)a5, (void *)a6, (void *)a7, - (void *)a8, (void *)a9, (void *)a10); - ASSERT(ip->i_btrace); - ktrace_enter(ip->i_btrace, - (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), - (void *)func, (void *)s, (void *)ip, (void *)cur, - (void *)a0, (void *)a1, (void *)a2, (void *)a3, - (void *)a4, (void *)a5, (void *)a6, (void *)a7, - (void *)a8, (void *)a9, (void *)a10); -} -/* - * Add a trace buffer entry for arguments, for a buffer & 1 integer arg. - */ -STATIC void -xfs_bmbt_trace_argbi( - const char *func, - xfs_btree_cur_t *cur, - xfs_buf_t *b, - int i, - int line) -{ - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGBI, line, - (__psunsigned_t)b, i, 0, 0, - 0, 0, 0, 0, - 0, 0, 0); -} - -/* - * Add a trace buffer entry for arguments, for a buffer & 2 integer args. - */ -STATIC void -xfs_bmbt_trace_argbii( - const char *func, - xfs_btree_cur_t *cur, - xfs_buf_t *b, - int i0, - int i1, - int line) -{ - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGBII, line, - (__psunsigned_t)b, i0, i1, 0, - 0, 0, 0, 0, - 0, 0, 0); -} - -/* - * Add a trace buffer entry for arguments, for 3 block-length args - * and an integer arg. 
- */ -STATIC void -xfs_bmbt_trace_argfffi( - const char *func, - xfs_btree_cur_t *cur, - xfs_dfiloff_t o, - xfs_dfsbno_t b, - xfs_dfilblks_t i, - int j, - int line) -{ - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGFFFI, line, - o >> 32, (int)o, b >> 32, (int)b, - i >> 32, (int)i, (int)j, 0, - 0, 0, 0); -} - -/* - * Add a trace buffer entry for arguments, for one integer arg. - */ -STATIC void -xfs_bmbt_trace_argi( - const char *func, - xfs_btree_cur_t *cur, - int i, - int line) -{ - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGI, line, - i, 0, 0, 0, - 0, 0, 0, 0, - 0, 0, 0); -} - -/* - * Add a trace buffer entry for arguments, for int, fsblock, key. - */ -STATIC void -xfs_bmbt_trace_argifk( - const char *func, - xfs_btree_cur_t *cur, - int i, - xfs_fsblock_t f, - xfs_dfiloff_t o, - int line) -{ - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGIFK, line, - i, (xfs_dfsbno_t)f >> 32, (int)f, o >> 32, - (int)o, 0, 0, 0, - 0, 0, 0); -} - -/* - * Add a trace buffer entry for arguments, for int, fsblock, rec. - */ -STATIC void -xfs_bmbt_trace_argifr( - const char *func, - xfs_btree_cur_t *cur, - int i, - xfs_fsblock_t f, - xfs_bmbt_rec_t *r, - int line) -{ - xfs_dfsbno_t b; - xfs_dfilblks_t c; - xfs_dfsbno_t d; - xfs_dfiloff_t o; - xfs_bmbt_irec_t s; - - d = (xfs_dfsbno_t)f; - xfs_bmbt_disk_get_all(r, &s); - o = (xfs_dfiloff_t)s.br_startoff; - b = (xfs_dfsbno_t)s.br_startblock; - c = s.br_blockcount; - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGIFR, line, - i, d >> 32, (int)d, o >> 32, - (int)o, b >> 32, (int)b, c >> 32, - (int)c, 0, 0); -} - -/* - * Add a trace buffer entry for arguments, for int, key. 
- */ -STATIC void -xfs_bmbt_trace_argik( - const char *func, - xfs_btree_cur_t *cur, - int i, - xfs_bmbt_key_t *k, - int line) -{ - xfs_dfiloff_t o; - - o = be64_to_cpu(k->br_startoff); - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGIFK, line, - i, o >> 32, (int)o, 0, - 0, 0, 0, 0, - 0, 0, 0); -} - -/* - * Add a trace buffer entry for the cursor/operation. - */ -STATIC void -xfs_bmbt_trace_cursor( - const char *func, - xfs_btree_cur_t *cur, - char *s, - int line) -{ - xfs_bmbt_rec_host_t r; - - xfs_bmbt_set_all(&r, &cur->bc_rec.b); - xfs_bmbt_trace_enter(func, cur, s, XFS_BMBT_KTRACE_CUR, line, - (cur->bc_nlevels << 24) | (cur->bc_private.b.flags << 16) | - cur->bc_private.b.allocated, - r.l0 >> 32, (int)r.l0, - r.l1 >> 32, (int)r.l1, - (unsigned long)cur->bc_bufs[0], (unsigned long)cur->bc_bufs[1], - (unsigned long)cur->bc_bufs[2], (unsigned long)cur->bc_bufs[3], - (cur->bc_ptrs[0] << 16) | cur->bc_ptrs[1], - (cur->bc_ptrs[2] << 16) | cur->bc_ptrs[3]); -} - -#define XFS_BMBT_TRACE_ARGBI(c,b,i) \ - xfs_bmbt_trace_argbi(__func__, c, b, i, __LINE__) -#define XFS_BMBT_TRACE_ARGBII(c,b,i,j) \ - xfs_bmbt_trace_argbii(__func__, c, b, i, j, __LINE__) -#define XFS_BMBT_TRACE_ARGFFFI(c,o,b,i,j) \ - xfs_bmbt_trace_argfffi(__func__, c, o, b, i, j, __LINE__) -#define XFS_BMBT_TRACE_ARGI(c,i) \ - xfs_bmbt_trace_argi(__func__, c, i, __LINE__) -#define XFS_BMBT_TRACE_ARGIFK(c,i,f,s) \ - xfs_bmbt_trace_argifk(__func__, c, i, f, s, __LINE__) -#define XFS_BMBT_TRACE_ARGIFR(c,i,f,r) \ - xfs_bmbt_trace_argifr(__func__, c, i, f, r, __LINE__) -#define XFS_BMBT_TRACE_ARGIK(c,i,k) \ - xfs_bmbt_trace_argik(__func__, c, i, k, __LINE__) -#define XFS_BMBT_TRACE_CURSOR(c,s) \ - xfs_bmbt_trace_cursor(__func__, c, s, __LINE__) -#else -#define XFS_BMBT_TRACE_ARGBI(c,b,i) -#define XFS_BMBT_TRACE_ARGBII(c,b,i,j) -#define XFS_BMBT_TRACE_ARGFFFI(c,o,b,i,j) -#define XFS_BMBT_TRACE_ARGI(c,i) -#define XFS_BMBT_TRACE_ARGIFK(c,i,f,s) -#define XFS_BMBT_TRACE_ARGIFR(c,i,f,r) -#define 
XFS_BMBT_TRACE_ARGIK(c,i,k) -#define XFS_BMBT_TRACE_CURSOR(c,s) -#endif /* XFS_BMBT_TRACE */ +#define ENTRY XBT_ENTRY +#define ERROR XBT_ERROR +#define EXIT XBT_EXIT + +/* + * Keep the XFS_BMBT_TRACE_ names around for now until all code using them + * is converted to be generic and thus switches to the XFS_BTREE_TRACE_ names. + */ +#define XFS_BMBT_TRACE_ARGBI(c,b,i) \ + XFS_BTREE_TRACE_ARGBI(c,b,i) +#define XFS_BMBT_TRACE_ARGBII(c,b,i,j) \ + XFS_BTREE_TRACE_ARGBII(c,b,i,j) +#define XFS_BMBT_TRACE_ARGFFFI(c,o,b,i,j) \ + XFS_BTREE_TRACE_ARGFFFI(c,o,b,i,j) +#define XFS_BMBT_TRACE_ARGI(c,i) \ + XFS_BTREE_TRACE_ARGI(c,i) +#define XFS_BMBT_TRACE_ARGIFK(c,i,f,s) \ + XFS_BTREE_TRACE_ARGIPK(c,i,(union xfs_btree_ptr)f,s) +#define XFS_BMBT_TRACE_ARGIFR(c,i,f,r) \ + XFS_BTREE_TRACE_ARGIPR(c,i, \ + (union xfs_btree_ptr)f, (union xfs_btree_rec *)r) +#define XFS_BMBT_TRACE_ARGIK(c,i,k) \ + XFS_BTREE_TRACE_ARGIK(c,i,(union xfs_btree_key *)k) +#define XFS_BMBT_TRACE_CURSOR(c,s) \ + XFS_BTREE_TRACE_CURSOR(c,s) /* @@ -1485,7 +1270,8 @@ xfs_bmbt_split( xfs_bmbt_rec_t *rrp; /* right record pointer */ XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGIFK(cur, level, *bnop, *startoff); + // disable until merged into common code +// XFS_BMBT_TRACE_ARGIFK(cur, level, *bnop, *startoff); args.tp = cur->bc_tp; args.mp = cur->bc_mp; lbp = cur->bc_bufs[level]; @@ -2639,9 +2425,102 @@ xfs_bmbt_get_root_from_inode( return (struct xfs_btree_block *)ifp->if_broot; } +#ifdef XFS_BTREE_TRACE + +ktrace_t *xfs_bmbt_trace_buf; + +STATIC void +xfs_bmbt_trace_enter( + struct xfs_btree_cur *cur, + const char *func, + char *s, + int type, + int line, + __psunsigned_t a0, + __psunsigned_t a1, + __psunsigned_t a2, + __psunsigned_t a3, + __psunsigned_t a4, + __psunsigned_t a5, + __psunsigned_t a6, + __psunsigned_t a7, + __psunsigned_t a8, + __psunsigned_t a9, + __psunsigned_t a10) +{ + struct xfs_inode *ip = cur->bc_private.b.ip; + int whichfork = cur->bc_private.b.whichfork; + + 
ktrace_enter(xfs_bmbt_trace_buf, + (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), + (void *)func, (void *)s, (void *)ip, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); + ktrace_enter(ip->i_btrace, + (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), + (void *)func, (void *)s, (void *)ip, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); +} + +STATIC void +xfs_bmbt_trace_cursor( + struct xfs_btree_cur *cur, + __uint32_t *s0, + __uint64_t *l0, + __uint64_t *l1) +{ + struct xfs_bmbt_rec_host r; + + xfs_bmbt_set_all(&r, &cur->bc_rec.b); + + *s0 = (cur->bc_nlevels << 24) | + (cur->bc_private.b.flags << 16) | + cur->bc_private.b.allocated; + *l0 = r.l0; + *l1 = r.l1; +} + +STATIC void +xfs_bmbt_trace_key( + struct xfs_btree_cur *cur, + union xfs_btree_key *key, + __uint64_t *l0, + __uint64_t *l1) +{ + *l0 = be64_to_cpu(key->bmbt.br_startoff); + *l1 = 0; +} + +STATIC void +xfs_bmbt_trace_record( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec, + __uint64_t *l0, + __uint64_t *l1, + __uint64_t *l2) +{ + struct xfs_bmbt_irec irec; + + xfs_bmbt_disk_get_all(&rec->bmbt, &irec); + *l0 = irec.br_startoff; + *l1 = irec.br_startblock; + *l2 = irec.br_blockcount; +} +#endif /* XFS_BTREE_TRACE */ + static const struct xfs_btree_ops xfs_bmbt_ops = { .dup_cursor = xfs_bmbt_dup_cursor, .get_root_from_inode = xfs_bmbt_get_root_from_inode, + +#ifdef XFS_BTREE_TRACE + .trace_enter = xfs_bmbt_trace_enter, + .trace_cursor = xfs_bmbt_trace_cursor, + .trace_key = xfs_bmbt_trace_key, + .trace_record = xfs_bmbt_trace_record, +#endif }; /* Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_ksyms.c 2008-07-29 16:23:20.000000000 +0200 +++ 
linux-2.6-xfs/fs/xfs/linux-2.6/xfs_ksyms.c 2008-07-29 16:23:24.000000000 +0200 @@ -39,6 +39,7 @@ #include "xfs_dinode.h" #include "xfs_inode.h" #include "xfs_btree.h" +#include "xfs_btree_trace.h" #include "xfs_ialloc.h" #include "xfs_bmap.h" #include "xfs_rtalloc.h" @@ -97,7 +98,9 @@ EXPORT_SYMBOL(xfs_alloc_trace_buf); #ifdef XFS_BMAP_TRACE EXPORT_SYMBOL(xfs_bmap_trace_buf); #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE +EXPORT_SYMBOL(xfs_allocbt_trace_buf); +EXPORT_SYMBOL(xfs_inobt_trace_buf); EXPORT_SYMBOL(xfs_bmbt_trace_buf); #endif #ifdef XFS_ATTR_TRACE Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_super.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_super.c 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_super.c 2008-07-29 16:23:24.000000000 +0200 @@ -35,6 +35,7 @@ #include "xfs_dinode.h" #include "xfs_inode.h" #include "xfs_btree.h" +#include "xfs_btree_trace.h" #include "xfs_ialloc.h" #include "xfs_bmap.h" #include "xfs_rtalloc.h" @@ -1744,10 +1745,18 @@ xfs_alloc_trace_bufs(void) if (!xfs_bmap_trace_buf) goto out_free_alloc_trace; #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE + xfs_allocbt_trace_buf = ktrace_alloc(XFS_ALLOCBT_TRACE_SIZE, KM_MAYFAIL); + if (!xfs_allocbt_trace_buf) + goto out_free_bmap_trace; + + xfs_inobt_trace_buf = ktrace_alloc(XFS_INOBT_TRACE_SIZE, KM_MAYFAIL); + if (!xfs_inobt_trace_buf) + goto out_free_allocbt_trace; + xfs_bmbt_trace_buf = ktrace_alloc(XFS_BMBT_TRACE_SIZE, KM_MAYFAIL); if (!xfs_bmbt_trace_buf) - goto out_free_bmap_trace; + goto out_free_inobt_trace; #endif #ifdef XFS_ATTR_TRACE xfs_attr_trace_buf = ktrace_alloc(XFS_ATTR_TRACE_SIZE, KM_MAYFAIL); @@ -1769,8 +1778,12 @@ xfs_alloc_trace_bufs(void) ktrace_free(xfs_attr_trace_buf); out_free_bmbt_trace: #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE ktrace_free(xfs_bmbt_trace_buf); + out_free_inobt_trace: + ktrace_free(xfs_inobt_trace_buf); + 
out_free_allocbt_trace: + ktrace_free(xfs_allocbt_trace_buf); out_free_bmap_trace: #endif #ifdef XFS_BMAP_TRACE @@ -1793,8 +1806,10 @@ xfs_free_trace_bufs(void) #ifdef XFS_ATTR_TRACE ktrace_free(xfs_attr_trace_buf); #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE ktrace_free(xfs_bmbt_trace_buf); + ktrace_free(xfs_inobt_trace_buf); + ktrace_free(xfs_allocbt_trace_buf); #endif #ifdef XFS_BMAP_TRACE ktrace_free(xfs_bmap_trace_buf); Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-07-29 16:52:06.000000000 +0200 @@ -233,24 +233,6 @@ typedef struct xfs_btree_lblock xfs_bmbt #ifdef __KERNEL__ -#if defined(XFS_BMBT_TRACE) -/* - * Trace buffer entry types. - */ -#define XFS_BMBT_KTRACE_ARGBI 1 -#define XFS_BMBT_KTRACE_ARGBII 2 -#define XFS_BMBT_KTRACE_ARGFFFI 3 -#define XFS_BMBT_KTRACE_ARGI 4 -#define XFS_BMBT_KTRACE_ARGIFK 5 -#define XFS_BMBT_KTRACE_ARGIFR 6 -#define XFS_BMBT_KTRACE_ARGIK 7 -#define XFS_BMBT_KTRACE_CUR 8 - -#define XFS_BMBT_TRACE_SIZE 4096 /* size of global trace buffer */ -#define XFS_BMBT_KTRACE_SIZE 32 /* size of per-inode trace buffer */ -extern ktrace_t *xfs_bmbt_trace_buf; -#endif - /* * Prototypes for xfs_bmap.c to call. 
*/ Index: linux-2.6-xfs/fs/xfs/xfs_inode.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.c 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_inode.c 2008-07-29 16:23:24.000000000 +0200 @@ -41,6 +41,7 @@ #include "xfs_buf_item.h" #include "xfs_inode_item.h" #include "xfs_btree.h" +#include "xfs_btree_trace.h" #include "xfs_alloc.h" #include "xfs_ialloc.h" #include "xfs_bmap.h" @@ -841,7 +842,7 @@ xfs_iread( #ifdef XFS_BMAP_TRACE ip->i_xtrace = ktrace_alloc(XFS_BMAP_KTRACE_SIZE, KM_SLEEP); #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE ip->i_btrace = ktrace_alloc(XFS_BMBT_KTRACE_SIZE, KM_SLEEP); #endif #ifdef XFS_RW_TRACE @@ -2600,7 +2601,7 @@ xfs_idestroy( #ifdef XFS_BMAP_TRACE ktrace_free(ip->i_xtrace); #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE ktrace_free(ip->i_btrace); #endif #ifdef XFS_RW_TRACE Index: linux-2.6-xfs/fs/xfs/xfs_inode.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_inode.h 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_inode.h 2008-07-29 16:23:24.000000000 +0200 @@ -251,7 +251,7 @@ typedef struct xfs_inode { #ifdef XFS_BMAP_TRACE struct ktrace *i_xtrace; /* inode extent list trace */ #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE struct ktrace *i_btrace; /* inode bmap btree trace */ #endif #ifdef XFS_RW_TRACE Index: linux-2.6-xfs/fs/xfs/xfsidbg.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfsidbg.c 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfsidbg.c 2008-07-29 16:59:55.000000000 +0200 @@ -45,6 +45,7 @@ #include "xfs_dinode.h" #include "xfs_inode.h" #include "xfs_btree.h" +#include "xfs_btree_trace.h" #include "xfs_buf_item.h" #include "xfs_inode_item.h" #include "xfs_extfree_item.h" @@ -88,7 +89,7 @@ static void xfsidbg_xattrtrace(int); static void xfsidbg_xblitrace(xfs_buf_log_item_t *); #endif #ifdef 
XFS_BMAP_TRACE -static void xfsidbg_xbmatrace(int); +static void xfsidbg_xbtatrace(int, struct ktrace *, int); static void xfsidbg_xbmitrace(xfs_inode_t *); static void xfsidbg_xbmstrace(xfs_inode_t *); static void xfsidbg_xbxatrace(int); @@ -133,7 +134,7 @@ static void xfsidbg_xattrleaf(xfs_attr_l static void xfsidbg_xattrsf(xfs_attr_shortform_t *); static void xfsidbg_xbirec(xfs_bmbt_irec_t *r); static void xfsidbg_xbmalla(xfs_bmalloca_t *); -static void xfsidbg_xbrec(xfs_bmbt_rec_host_t *); +static void xfsidbg_xbmbtrec(xfs_bmbt_rec_host_t *); static void xfsidbg_xbroot(xfs_inode_t *); static void xfsidbg_xbroota(xfs_inode_t *); static void xfsidbg_xbtcur(xfs_btree_cur_t *); @@ -198,9 +199,6 @@ static int xfs_attr_trace_entry(ktrace_e #ifdef XFS_BMAP_TRACE static int xfs_bmap_trace_entry(ktrace_entry_t *ktep); #endif -#ifdef XFS_BMAP_TRACE -static int xfs_bmbt_trace_entry(ktrace_entry_t *ktep); -#endif #ifdef XFS_DIR2_TRACE static int xfs_dir2_trace_entry(ktrace_entry_t *ktep); #endif @@ -421,10 +419,52 @@ static int kdbm_xfs_xbmatrace( if (diag) return diag; - xfsidbg_xbmatrace((int) addr); + xfsidbg_xbtatrace(XFS_BTNUM_BMAP, xfs_bmbt_trace_buf, (int)addr); return 0; } +static int kdbm_xfs_xbtaatrace( + int argc, + const char **argv) +{ + unsigned long addr; + int nextarg = 1; + long offset = 0; + int diag; + + if (argc != 1) + return KDB_ARGCOUNT; + + diag = kdbgetaddrarg(argc, argv, &nextarg, &addr, &offset, NULL); + if (diag) + return diag; + + /* also contains XFS_BTNUM_CNT */ + xfsidbg_xbtatrace(XFS_BTNUM_BNO, xfs_allocbt_trace_buf, (int)addr); + return 0; +} + +static int kdbm_xfs_xbtiatrace( + int argc, + const char **argv) +{ + unsigned long addr; + int nextarg = 1; + long offset = 0; + int diag; + + if (argc != 1) + return KDB_ARGCOUNT; + + diag = kdbgetaddrarg(argc, argv, &nextarg, &addr, &offset, NULL); + if (diag) + return diag; + + xfsidbg_xbtatrace(XFS_BTNUM_INO, xfs_inobt_trace_buf, (int)addr); + return 0; +} + + static int kdbm_xfs_xbmitrace( 
int argc, const char **argv) @@ -902,7 +942,7 @@ static int kdbm_xfs_xbrec( if (diag) return diag; - xfsidbg_xbrec((xfs_bmbt_rec_host_t *) addr); + xfsidbg_xbmbtrec((xfs_bmbt_rec_host_t *) addr); return 0; } @@ -2273,6 +2313,10 @@ static struct xif xfsidbg_funcs[] = { "Dump XFS bmap btree per-inode trace" }, { "xbmstrc", kdbm_xfs_xbmstrace, "", "Dump XFS bmap btree inode trace" }, + { "xbtaatrc", kdbm_xfs_xbtaatrace, "", + "Dump XFS alloc btree count trace" }, + { "xbtiatrc", kdbm_xfs_xbtiatrace, "", + "Dump XFS inobt btree count trace" }, #endif { "xbrec", kdbm_xfs_xbrec, "", "Dump XFS bmap record"}, @@ -2706,16 +2750,115 @@ xfs_bmap_trace_entry(ktrace_entry_t *kte return 1; } +static void +xfsidb_btree_trace_record( + int btnum, + unsigned long l0, + unsigned long l1, + unsigned long l2, + unsigned long l3, + unsigned long l4, + unsigned long l5) +{ + switch (btnum) { + case XFS_BTNUM_BMAP: + { + struct xfs_bmbt_irec s; + + s.br_startoff = ((xfs_dfiloff_t)l0 << 32) | (xfs_dfiloff_t)l1; + s.br_startblock = ((xfs_dfsbno_t)l2 << 32) | (xfs_dfsbno_t)l3; + s.br_blockcount = ((xfs_dfilblks_t)l4 << 32) | (xfs_dfilblks_t)l5; + + xfsidbg_xbirec(&s); + break; + } + case XFS_BTNUM_BNO: + case XFS_BTNUM_CNT: + qprintf(" startblock = %d, blockcount = %d\n", + (unsigned int)l0, (unsigned int)l2); + break; + case XFS_BTNUM_INO: + qprintf(" startino = %d, freecount = %d, free = %lld\n", + (unsigned int)l0, (unsigned int)l2, + ((xfs_inofree_t)l4 << 32) | (xfs_inofree_t)l5); + break; + default: + break; + } +} + +static void +xfsidb_btree_trace_key( + int btnum, + unsigned long l0, + unsigned long l1, + unsigned long l2, + unsigned long l3) +{ + switch (btnum) { + case XFS_BTNUM_BMAP: + qprintf(" startoff 0x%x%08x\n", (unsigned int)l0, (unsigned int)l1); + break; + case XFS_BTNUM_BNO: + case XFS_BTNUM_CNT: + qprintf(" startblock %d blockcount %d\n", + (unsigned int)l0, (unsigned int)l3); + break; + case XFS_BTNUM_INO: + qprintf(" startino %d\n", (unsigned int)l0); + break; + 
default: + break; + } +} + +static void +xfsidb_btree_trace_cursor( + int btnum, + unsigned long l0, + unsigned long l1, + unsigned long l2, + unsigned long l3, + unsigned long l4) +{ + switch (btnum) { + case XFS_BTNUM_BMAP: + { + struct xfs_bmbt_rec_host r; + + qprintf(" nlevels %ld flags %ld allocated %ld ", + (l0 >> 24) & 0xff, (l0 >> 16) & 0xff, l0 & 0xffff); + + r.l0 = (unsigned long long)l1 << 32 | l2; + r.l1 = (unsigned long long)l3 << 32 | l4; + + xfsidbg_xbmbtrec(&r); + break; + } + case XFS_BTNUM_BNO: + case XFS_BTNUM_CNT: + qprintf(" agno %d startblock %d blockcount %d\n", + (unsigned int)l0, (unsigned int)l1, (unsigned int)l3); + break; + case XFS_BTNUM_INO: + qprintf(" agno %d startino %d free %lld\n", + (unsigned int)l0, (unsigned int)l1, + ((xfs_inofree_t)l3 << 32) | (xfs_inofree_t)l4); + break; + default: + break; + } +} + /* * Print xfs bmap btree trace buffer entry. */ static int -xfs_bmbt_trace_entry( - ktrace_entry_t *ktep) +xfs_btree_trace_entry( + int btnum, + ktrace_entry_t *ktep) { int line; - xfs_bmbt_rec_host_t r; - xfs_bmbt_irec_t s; int type; int whichfork; @@ -2732,18 +2875,18 @@ xfs_bmbt_trace_entry( "da"[whichfork], (xfs_btree_cur_t *)ktep->val[4]); switch (type) { - case XFS_BMBT_KTRACE_ARGBI: + case XFS_BTREE_KTRACE_ARGBI: qprintf(" buf 0x%p i %ld\n", (xfs_buf_t *)ktep->val[5], (long)ktep->val[6]); break; - case XFS_BMBT_KTRACE_ARGBII: + case XFS_BTREE_KTRACE_ARGBII: qprintf(" buf 0x%p i0 %ld i1 %ld\n", (xfs_buf_t *)ktep->val[5], (long)ktep->val[6], (long)ktep->val[7]); break; - case XFS_BMBT_KTRACE_ARGFFFI: + case XFS_BTREE_KTRACE_ARGFFFI: qprintf(" o 0x%x%08x b 0x%x%08x i 0x%x%08x j %ld\n", (unsigned int)(long)ktep->val[5], (unsigned int)(long)ktep->val[6], @@ -2753,11 +2896,11 @@ xfs_bmbt_trace_entry( (unsigned int)(long)ktep->val[10], (long)ktep->val[11]); break; - case XFS_BMBT_KTRACE_ARGI: + case XFS_BTREE_KTRACE_ARGI: qprintf(" i 0x%lx\n", (long)ktep->val[5]); break; - case XFS_BMBT_KTRACE_ARGIFK: + case 
XFS_BTREE_KTRACE_ARGIPK: qprintf(" i 0x%lx f 0x%x%08x o 0x%x%08x\n", (long)ktep->val[5], (unsigned int)(long)ktep->val[6], @@ -2765,39 +2908,45 @@ xfs_bmbt_trace_entry( (unsigned int)(long)ktep->val[8], (unsigned int)(long)ktep->val[9]); break; - case XFS_BMBT_KTRACE_ARGIFR: + case XFS_BTREE_KTRACE_ARGIPR: qprintf(" i 0x%lx f 0x%x%08x ", (long)ktep->val[5], (unsigned int)(long)ktep->val[6], (unsigned int)(long)ktep->val[7]); - s.br_startoff = (xfs_fileoff_t) - (((xfs_dfiloff_t)(unsigned long)ktep->val[8] << 32) | - (xfs_dfiloff_t)(unsigned long)ktep->val[9]); - s.br_startblock = (xfs_fsblock_t) - (((xfs_dfsbno_t)(unsigned long)ktep->val[10] << 32) | - (xfs_dfsbno_t)(unsigned long)ktep->val[11]); - s.br_blockcount = (xfs_filblks_t) - (((xfs_dfilblks_t)(unsigned long)ktep->val[12] << 32) | - (xfs_dfilblks_t)(unsigned long)ktep->val[13]); - xfsidbg_xbirec(&s); - break; - case XFS_BMBT_KTRACE_ARGIK: - qprintf(" i 0x%lx o 0x%x%08x\n", - (long)ktep->val[5], - (unsigned int)(long)ktep->val[6], - (unsigned int)(long)ktep->val[7]); - break; - case XFS_BMBT_KTRACE_CUR: - qprintf(" nlevels %ld flags %ld allocated %ld ", - ((long)ktep->val[5] >> 24) & 0xff, - ((long)ktep->val[5] >> 16) & 0xff, - (long)ktep->val[5] & 0xffff); - r.l0 = (unsigned long long)(unsigned long)ktep->val[6] << 32 | - (unsigned long)ktep->val[7]; - r.l1 = (unsigned long long)(unsigned long)ktep->val[8] << 32 | - (unsigned long)ktep->val[9]; + xfsidb_btree_trace_record(btnum, + (unsigned long)ktep->val[8], + (unsigned long)ktep->val[9], + (unsigned long)ktep->val[10], + (unsigned long)ktep->val[11], + (unsigned long)ktep->val[12], + (unsigned long)ktep->val[13]); + break; + case XFS_BTREE_KTRACE_ARGIK: + qprintf(" i 0x%lx\n", (long)ktep->val[5]); + xfsidb_btree_trace_key(btnum, + (unsigned long)ktep->val[6], + (unsigned long)ktep->val[7], + (unsigned long)ktep->val[8], + (unsigned long)ktep->val[9]); + break; + case XFS_BTREE_KTRACE_ARGR: + xfsidb_btree_trace_record(btnum, + (unsigned long)ktep->val[5], + 
(unsigned long)ktep->val[6], + (unsigned long)ktep->val[7], + (unsigned long)ktep->val[8], + (unsigned long)ktep->val[9], + (unsigned long)ktep->val[10]); + break; + case XFS_BTREE_KTRACE_CUR: + + xfsidb_btree_trace_cursor(btnum, + (unsigned long)ktep->val[5], + (unsigned long)ktep->val[6], + (unsigned long)ktep->val[7], + (unsigned long)ktep->val[8], + (unsigned long)ktep->val[9]); - xfsidbg_xbrec(&r); qprintf(" bufs 0x%p 0x%p 0x%p 0x%p ", (xfs_buf_t *)ktep->val[10], (xfs_buf_t *)ktep->val[11], @@ -4383,21 +4532,21 @@ xfsidbg_xbmalla(xfs_bmalloca_t *a) #ifdef XFS_BMAP_TRACE /* - * Print out the last "count" entries in the bmap btree trace buffer. + * Print out the last "count" entries in a btree trace buffer. * The "a" is for "all" inodes. */ static void -xfsidbg_xbmatrace(int count) +xfsidbg_xbtatrace(int btnum, struct ktrace *trace_buf, int count) { ktrace_entry_t *ktep; ktrace_snap_t kts; int nentries; int skip_entries; - if (xfs_bmbt_trace_buf == NULL) { + if (trace_buf == NULL) { qprintf("The xfs bmap btree trace buffer is not initialized\n"); return; } - nentries = ktrace_nentries(xfs_bmbt_trace_buf); + nentries = ktrace_nentries(trace_buf); if (count == -1) { count = nentries; } @@ -4406,23 +4555,23 @@ xfsidbg_xbmatrace(int count) return; } - ktep = ktrace_first(xfs_bmbt_trace_buf, &kts); + ktep = ktrace_first(trace_buf, &kts); if (count != nentries) { /* * Skip the total minus the number to look at minus one * for the entry returned by ktrace_first(). 
*/ skip_entries = nentries - count - 1; - ktep = ktrace_skip(xfs_bmbt_trace_buf, skip_entries, &kts); + ktep = ktrace_skip(trace_buf, skip_entries, &kts); if (ktep == NULL) { qprintf("Skipped them all\n"); return; } } while (ktep != NULL) { - if (xfs_bmbt_trace_entry(ktep)) + if (xfs_btree_trace_entry(btnum, ktep)) qprintf("\n"); - ktep = ktrace_next(xfs_bmbt_trace_buf, &kts); + ktep = ktrace_next(trace_buf, &kts); } } @@ -4442,7 +4591,7 @@ xfsidbg_xbmitrace(xfs_inode_t *ip) ktep = ktrace_first(ip->i_btrace, &kts); while (ktep != NULL) { - if (xfs_bmbt_trace_entry(ktep)) + if (xfs_btree_trace_entry(XFS_BTNUM_BMAP, ktep)) qprintf("\n"); ktep = ktrace_next(ip->i_btrace, &kts); } @@ -4465,7 +4614,7 @@ xfsidbg_xbmstrace(xfs_inode_t *ip) ktep = ktrace_first(xfs_bmbt_trace_buf, &kts); while (ktep != NULL) { if ((xfs_inode_t *)(ktep->val[2]) == ip) { - if (xfs_bmbt_trace_entry(ktep)) + if (xfs_btree_trace_entry(XFS_BTNUM_BMAP, ktep)) qprintf("\n"); } ktep = ktrace_next(xfs_bmbt_trace_buf, &kts); @@ -4477,7 +4626,7 @@ xfsidbg_xbmstrace(xfs_inode_t *ip) * Print xfs bmap record */ static void -xfsidbg_xbrec(xfs_bmbt_rec_host_t *r) +xfsidbg_xbmbtrec(xfs_bmbt_rec_host_t *r) { xfs_bmbt_irec_t irec; @@ -6398,7 +6547,7 @@ xfsidbg_xnode(xfs_inode_t *ip) #ifdef XFS_BMAP_TRACE qprintf(" bmap_trace 0x%p\n", ip->i_xtrace); #endif -#ifdef XFS_BMBT_TRACE +#ifdef XFS_BTREE_TRACE qprintf(" bmbt trace 0x%p\n", ip->i_btrace); #endif #ifdef XFS_RW_TRACE Index: linux-2.6-xfs/fs/xfs/xfs_btree_trace.h =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-2.6-xfs/fs/xfs/xfs_btree_trace.h 2008-07-29 16:53:44.000000000 +0200 @@ -0,0 +1,116 @@ +/* + * Copyright (c) 2008 Silicon Graphics, Inc. + * All Rights Reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ +#ifndef __XFS_BTREE_TRACE_H__ +#define __XFS_BTREE_TRACE_H__ + +struct xfs_btree_cur; +struct xfs_buf; + + +/* + * Trace hooks. + * i,j = integer (32 bit) + * b = btree block buffer (xfs_buf_t) + * p = btree ptr + * r = btree record + * k = btree key + */ + +#ifdef XFS_BTREE_TRACE + +/* + * Trace buffer entry types. + */ +#define XFS_BTREE_KTRACE_ARGBI 1 +#define XFS_BTREE_KTRACE_ARGBII 2 +#define XFS_BTREE_KTRACE_ARGFFFI 3 +#define XFS_BTREE_KTRACE_ARGI 4 +#define XFS_BTREE_KTRACE_ARGIPK 5 +#define XFS_BTREE_KTRACE_ARGIPR 6 +#define XFS_BTREE_KTRACE_ARGIK 7 +#define XFS_BTREE_KTRACE_ARGR 8 +#define XFS_BTREE_KTRACE_CUR 9 + +/* + * Sub-types for cursor traces. 
+ */ +#define XBT_ARGS 0 +#define XBT_ENTRY 1 +#define XBT_ERROR 2 +#define XBT_EXIT 3 + +void xfs_btree_trace_argbi(const char *, struct xfs_btree_cur *, + struct xfs_buf *, int, int); +void xfs_btree_trace_argbii(const char *, struct xfs_btree_cur *, + struct xfs_buf *, int, int, int); +void xfs_btree_trace_argfffi(const char *, struct xfs_btree_cur *, + xfs_dfiloff_t, xfs_dfsbno_t, xfs_dfilblks_t, int, int); +void xfs_btree_trace_argi(const char *, struct xfs_btree_cur *, int, int); +void xfs_btree_trace_argipk(const char *, struct xfs_btree_cur *, int, + union xfs_btree_ptr, union xfs_btree_key *, int); +void xfs_btree_trace_argipr(const char *, struct xfs_btree_cur *, int, + union xfs_btree_ptr, union xfs_btree_rec *, int); +void xfs_btree_trace_argik(const char *, struct xfs_btree_cur *, int, + union xfs_btree_key *, int); +void xfs_btree_trace_argr(const char *, struct xfs_btree_cur *, + union xfs_btree_rec *, int); +void xfs_btree_trace_cursor(const char *, struct xfs_btree_cur *, int, int); + + +#define XFS_ALLOCBT_TRACE_SIZE 4096 /* size of global trace buffer */ +extern ktrace_t *xfs_allocbt_trace_buf; + +#define XFS_INOBT_TRACE_SIZE 4096 /* size of global trace buffer */ +extern ktrace_t *xfs_inobt_trace_buf; + +#define XFS_BMBT_TRACE_SIZE 4096 /* size of global trace buffer */ +#define XFS_BMBT_KTRACE_SIZE 32 /* size of per-inode trace buffer */ +extern ktrace_t *xfs_bmbt_trace_buf; + + +#define XFS_BTREE_TRACE_ARGBI(c,b,i) \ + xfs_btree_trace_argbi(__func__, c, b, i, __LINE__) +#define XFS_BTREE_TRACE_ARGBII(c,b,i,j) \ + xfs_btree_trace_argbii(__func__, c, b, i, j, __LINE__) +#define XFS_BTREE_TRACE_ARGFFFI(c,o,b,i,j) \ + xfs_btree_trace_argfffi(__func__, c, o, b, i, j, __LINE__) +#define XFS_BTREE_TRACE_ARGI(c,i) \ + xfs_btree_trace_argi(__func__, c, i, __LINE__) +#define XFS_BTREE_TRACE_ARGIPK(c,i,p,k) \ + xfs_btree_trace_argipk(__func__, c, i, p, k, __LINE__) +#define XFS_BTREE_TRACE_ARGIPR(c,i,p,r) \ + xfs_btree_trace_argipr(__func__, c, i, p, r, 
__LINE__) +#define XFS_BTREE_TRACE_ARGIK(c,i,k) \ + xfs_btree_trace_argik(__func__, c, i, k, __LINE__) +#define XFS_BTREE_TRACE_ARGR(c,r) \ + xfs_btree_trace_argr(__func__, c, r, __LINE__) +#define XFS_BTREE_TRACE_CURSOR(c,t) \ + xfs_btree_trace_cursor(__func__, c, t, __LINE__) +#else +#define XFS_BTREE_TRACE_ARGBI(c,b,i) +#define XFS_BTREE_TRACE_ARGBII(c,b,i,j) +#define XFS_BTREE_TRACE_ARGFFFI(c,o,b,i,j) +#define XFS_BTREE_TRACE_ARGI(c,i) +#define XFS_BTREE_TRACE_ARGIPK(c,i,p,s) +#define XFS_BTREE_TRACE_ARGIPR(c,i,p,r) +#define XFS_BTREE_TRACE_ARGIK(c,i,k) +#define XFS_BTREE_TRACE_ARGR(c,r) +#define XFS_BTREE_TRACE_CURSOR(c,t) +#endif /* XFS_BTREE_TRACE */ + +#endif /* __XFS_BTREE_TRACE_H__ */ Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-07-29 16:52:06.000000000 +0200 @@ -2219,8 +2219,82 @@ xfs_allocbt_dup_cursor( cur->bc_btnum); } +#ifdef XFS_BTREE_TRACE + +ktrace_t *xfs_allocbt_trace_buf; + +STATIC void +xfs_allocbt_trace_enter( + struct xfs_btree_cur *cur, + const char *func, + char *s, + int type, + int line, + __psunsigned_t a0, + __psunsigned_t a1, + __psunsigned_t a2, + __psunsigned_t a3, + __psunsigned_t a4, + __psunsigned_t a5, + __psunsigned_t a6, + __psunsigned_t a7, + __psunsigned_t a8, + __psunsigned_t a9, + __psunsigned_t a10) +{ + ktrace_enter(xfs_allocbt_trace_buf, (void *)(__psint_t)type, + (void *)func, (void *)s, NULL, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); +} + +STATIC void +xfs_allocbt_trace_cursor( + struct xfs_btree_cur *cur, + __uint32_t *s0, + __uint64_t *l0, + __uint64_t *l1) +{ + *s0 = cur->bc_private.a.agno; + *l0 = cur->bc_rec.a.ar_startblock; + *l1 = cur->bc_rec.a.ar_blockcount; +} + +STATIC void +xfs_allocbt_trace_key( + 
struct xfs_btree_cur *cur, + union xfs_btree_key *key, + __uint64_t *l0, + __uint64_t *l1) +{ + *l0 = be32_to_cpu(key->alloc.ar_startblock); + *l1 = be32_to_cpu(key->alloc.ar_blockcount); +} + +STATIC void +xfs_allocbt_trace_record( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec, + __uint64_t *l0, + __uint64_t *l1, + __uint64_t *l2) +{ + *l0 = be32_to_cpu(rec->alloc.ar_startblock); + *l1 = be32_to_cpu(rec->alloc.ar_blockcount); + *l2 = 0; +} +#endif /* XFS_BTREE_TRACE */ + static const struct xfs_btree_ops xfs_allocbt_ops = { .dup_cursor = xfs_allocbt_dup_cursor, + +#ifdef XFS_BTREE_TRACE + .trace_enter = xfs_allocbt_trace_enter, + .trace_cursor = xfs_allocbt_trace_cursor, + .trace_key = xfs_allocbt_trace_key, + .trace_record = xfs_allocbt_trace_record, +#endif }; /* Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-07-29 16:23:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-07-29 16:52:06.000000000 +0200 @@ -2085,8 +2085,82 @@ xfs_inobt_dup_cursor( cur->bc_private.a.agbp, cur->bc_private.a.agno); } +#ifdef XFS_BTREE_TRACE + +ktrace_t *xfs_inobt_trace_buf; + +STATIC void +xfs_inobt_trace_enter( + struct xfs_btree_cur *cur, + const char *func, + char *s, + int type, + int line, + __psunsigned_t a0, + __psunsigned_t a1, + __psunsigned_t a2, + __psunsigned_t a3, + __psunsigned_t a4, + __psunsigned_t a5, + __psunsigned_t a6, + __psunsigned_t a7, + __psunsigned_t a8, + __psunsigned_t a9, + __psunsigned_t a10) +{ + ktrace_enter(xfs_inobt_trace_buf, (void *)(__psint_t)type, + (void *)func, (void *)s, NULL, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); +} + +STATIC void +xfs_inobt_trace_cursor( + struct xfs_btree_cur *cur, + __uint32_t *s0, + __uint64_t *l0, + __uint64_t *l1) +{ + *s0 = cur->bc_private.a.agno; + *l0 = 
cur->bc_rec.i.ir_startino; + *l1 = cur->bc_rec.i.ir_free; +} + +STATIC void +xfs_inobt_trace_key( + struct xfs_btree_cur *cur, + union xfs_btree_key *key, + __uint64_t *l0, + __uint64_t *l1) +{ + *l0 = be32_to_cpu(key->inobt.ir_startino); + *l1 = 0; +} + +STATIC void +xfs_inobt_trace_record( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec, + __uint64_t *l0, + __uint64_t *l1, + __uint64_t *l2) +{ + *l0 = be32_to_cpu(rec->inobt.ir_startino); + *l1 = be32_to_cpu(rec->inobt.ir_freecount); + *l2 = be64_to_cpu(rec->inobt.ir_free); +} +#endif /* XFS_BTREE_TRACE */ + static const struct xfs_btree_ops xfs_inobt_ops = { .dup_cursor = xfs_inobt_dup_cursor, + +#ifdef XFS_BTREE_TRACE + .trace_enter = xfs_inobt_trace_enter, + .trace_cursor = xfs_inobt_trace_cursor, + .trace_key = xfs_inobt_trace_key, + .trace_record = xfs_inobt_trace_record, +#endif }; /* -- From owner-xfs@oss.sgi.com Sun Aug 3 18:31:53 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:31:55 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741Vr1N030687 for ; Sun, 3 Aug 2008 18:31:53 -0700 X-ASG-Debug-ID: 1217813585-73a201b00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id E1F203566B9 for ; Sun, 3 Aug 2008 18:33:06 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id IjxGNwkmMV5Y74BJ for ; Sun, 03 Aug 2008 18:33:06 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741X6IF008953 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; 
	Mon, 4 Aug 2008 03:33:06 +0200
Received: (from hch@localhost)
	by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741X6c4008951
	for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:33:06 +0200
Date: Mon, 4 Aug 2008 03:33:06 +0200
From: Christoph Hellwig
To: xfs@oss.sgi.com
X-ASG-Orig-Subj: [PATCH 06/26] refactor xfs_btree_readahead
Subject: [PATCH 06/26] refactor xfs_btree_readahead
Message-ID: <20080804013306.GG8819@lst.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline; filename=xfs-btree-readahead-refactor
User-Agent: Mutt/1.3.28i
X-Scanned-By: MIMEDefang 2.39
X-Barracuda-Connect: verein.lst.de[213.95.11.210]
X-Barracuda-Start-Time: 1217813587
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -2.02
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=
X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 17331
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: hch@lst.de
Precedence: bulk
X-list: xfs

From: Dave Chinner

Refactor xfs_btree_readahead to make it more readable:

 (a) remove the inline xfs_btree_readahead wrapper and move all checks out
     of line into the main routine.
(b) factor out helpers for short/long form btrees (c) move check for root in inodes from the callers into xfs_btree_readahead [hch: split out from a big patch and minor cleanups] Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 02:11:17.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 03:03:28.000000000 +0200 @@ -709,66 +709,84 @@ xfs_btree_reada_bufs( xfs_baread(mp->m_ddev_targp, d, mp->m_bsize * count); } +STATIC int +xfs_btree_readahead_lblock( + struct xfs_btree_cur *cur, + int lr, + struct xfs_btree_block *block) +{ + int rval = 0; + xfs_fsblock_t left = be64_to_cpu(block->bb_u.l.bb_leftsib); + xfs_fsblock_t right = be64_to_cpu(block->bb_u.l.bb_rightsib); + + if ((lr & XFS_BTCUR_LEFTRA) && left != NULLDFSBNO) { + xfs_btree_reada_bufl(cur->bc_mp, left, 1); + rval++; + } + + if ((lr & XFS_BTCUR_RIGHTRA) && right != NULLDFSBNO) { + xfs_btree_reada_bufl(cur->bc_mp, right, 1); + rval++; + } + + return rval; +} + +STATIC int +xfs_btree_readahead_sblock( + struct xfs_btree_cur *cur, + int lr, + struct xfs_btree_block *block) +{ + int rval = 0; + xfs_agblock_t left = be32_to_cpu(block->bb_u.s.bb_leftsib); + xfs_agblock_t right = be32_to_cpu(block->bb_u.s.bb_rightsib); + + + if ((lr & XFS_BTCUR_LEFTRA) && left != NULLAGBLOCK) { + xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, + left, 1); + rval++; + } + + if ((lr & XFS_BTCUR_RIGHTRA) && right != NULLAGBLOCK) { + xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, + right, 1); + rval++; + } + + return rval; +} + /* * Read-ahead btree blocks, at the given level. * Bits in lr are set from XFS_BTCUR_{LEFT,RIGHT}RA. 
*/ int -xfs_btree_readahead_core( - xfs_btree_cur_t *cur, /* btree cursor */ +xfs_btree_readahead( + struct xfs_btree_cur *cur, /* btree cursor */ int lev, /* level in btree */ int lr) /* left/right bits */ { - xfs_alloc_block_t *a; - xfs_bmbt_block_t *b; - xfs_inobt_block_t *i; - int rval = 0; + struct xfs_btree_block *block; + + /* + * No readahead needed if we are at the root level and the + * btree root is stored in the inode. + */ + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (lev == cur->bc_nlevels - 1)) + return 0; + + if ((cur->bc_ra[lev] | lr) == cur->bc_ra[lev]) + return 0; - ASSERT(cur->bc_bufs[lev] != NULL); cur->bc_ra[lev] |= lr; - switch (cur->bc_btnum) { - case XFS_BTNUM_BNO: - case XFS_BTNUM_CNT: - a = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[lev]); - if ((lr & XFS_BTCUR_LEFTRA) && be32_to_cpu(a->bb_leftsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, - be32_to_cpu(a->bb_leftsib), 1); - rval++; - } - if ((lr & XFS_BTCUR_RIGHTRA) && be32_to_cpu(a->bb_rightsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, - be32_to_cpu(a->bb_rightsib), 1); - rval++; - } - break; - case XFS_BTNUM_BMAP: - b = XFS_BUF_TO_BMBT_BLOCK(cur->bc_bufs[lev]); - if ((lr & XFS_BTCUR_LEFTRA) && be64_to_cpu(b->bb_leftsib) != NULLDFSBNO) { - xfs_btree_reada_bufl(cur->bc_mp, be64_to_cpu(b->bb_leftsib), 1); - rval++; - } - if ((lr & XFS_BTCUR_RIGHTRA) && be64_to_cpu(b->bb_rightsib) != NULLDFSBNO) { - xfs_btree_reada_bufl(cur->bc_mp, be64_to_cpu(b->bb_rightsib), 1); - rval++; - } - break; - case XFS_BTNUM_INO: - i = XFS_BUF_TO_INOBT_BLOCK(cur->bc_bufs[lev]); - if ((lr & XFS_BTCUR_LEFTRA) && be32_to_cpu(i->bb_leftsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, - be32_to_cpu(i->bb_leftsib), 1); - rval++; - } - if ((lr & XFS_BTCUR_RIGHTRA) && be32_to_cpu(i->bb_rightsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, - be32_to_cpu(i->bb_rightsib), 1); - rval++; 
- } - break; - default: - ASSERT(0); - } - return rval; + block = XFS_BUF_TO_BLOCK(cur->bc_bufs[lev]); + + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) + return xfs_btree_readahead_lblock(cur, lr, block); + return xfs_btree_readahead_sblock(cur, lr, block); } /* Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 02:11:17.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 03:00:21.000000000 +0200 @@ -429,23 +429,10 @@ xfs_btree_reada_bufs( * Bits in lr are set from XFS_BTCUR_{LEFT,RIGHT}RA. */ int /* readahead block count */ -xfs_btree_readahead_core( - xfs_btree_cur_t *cur, /* btree cursor */ - int lev, /* level in btree */ - int lr); /* left/right bits */ - -static inline int /* readahead block count */ xfs_btree_readahead( xfs_btree_cur_t *cur, /* btree cursor */ int lev, /* level in btree */ - int lr) /* left/right bits */ -{ - if ((cur->bc_ra[lev] | lr) == cur->bc_ra[lev]) - return 0; - - return xfs_btree_readahead_core(cur, lev, lr); -} - + int lr); /* left/right bits */ /* * Set the buffer for level "lev" in the cursor to bp, releasing Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 02:11:17.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 03:00:18.000000000 +0200 @@ -1721,8 +1721,9 @@ xfs_bmbt_decrement( XFS_BMBT_TRACE_CURSOR(cur, ENTRY); XFS_BMBT_TRACE_ARGI(cur, level); ASSERT(level < cur->bc_nlevels); - if (level < cur->bc_nlevels - 1) - xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); + + xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); + if (--cur->bc_ptrs[level] > 0) { XFS_BMBT_TRACE_CURSOR(cur, EXIT); *stat = 1; @@ -1743,8 +1744,7 @@ xfs_bmbt_decrement( for (lev = level + 1; lev < cur->bc_nlevels; lev++) { if (--cur->bc_ptrs[lev] > 0) break; - if (lev < cur->bc_nlevels - 1) - 
xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); + xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); } if (lev == cur->bc_nlevels) { XFS_BMBT_TRACE_CURSOR(cur, EXIT); @@ -1995,8 +1995,8 @@ xfs_bmbt_increment( XFS_BMBT_TRACE_CURSOR(cur, ENTRY); XFS_BMBT_TRACE_ARGI(cur, level); ASSERT(level < cur->bc_nlevels); - if (level < cur->bc_nlevels - 1) - xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); + + xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); block = xfs_bmbt_get_block(cur, level, &bp); #ifdef DEBUG if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { @@ -2024,8 +2024,7 @@ xfs_bmbt_increment( #endif if (++cur->bc_ptrs[lev] <= be16_to_cpu(block->bb_numrecs)) break; - if (lev < cur->bc_nlevels - 1) - xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); + xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); } if (lev == cur->bc_nlevels) { XFS_BMBT_TRACE_CURSOR(cur, EXIT); -- From owner-xfs@oss.sgi.com Sun Aug 3 18:31:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:31:19 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741VCrJ030341 for ; Sun, 3 Aug 2008 18:31:12 -0700 X-ASG-Debug-ID: 1217813543-739601ce0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4409D3566AC for ; Sun, 3 Aug 2008 18:32:24 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 9Y6DgqHx3WB2TX45 for ; Sun, 03 Aug 2008 18:32:24 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741WPIF008896 (version=TLSv1/SSLv3 
cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:32:25 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741WPaN008894 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:32:25 +0200 Date: Mon, 4 Aug 2008 03:32:25 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 02/26] split up xfs_btree_init_cursor Subject: [PATCH 02/26] split up xfs_btree_init_cursor Message-ID: <20080804013225.GC8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-btree-split-init-cursor User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813546 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17327 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs xfs_btree_init_cursor contains close to no shared code for the different btrees and will get even more non-common code in the future. Split it up into one routine per btree type. Because xfs_btree_dup_cursor needs to call the init routine for a generic btree cursor, add a new btree operations vector that contains a dup_cursor method that initializes a new cursor based on an existing one. The btree operations vector is based on an idea and code from Dave Chinner and will be used for more operations later on. 
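For illustration only (not part of the patch): a minimal userspace sketch of the operations-vector idea described above. All names here (struct cursor, struct cursor_ops, alloc_init, bmap_init, cursor_dup) are hypothetical stand-ins and not the XFS structures; calloc stands in for the kernel zone allocator. It only shows how generic code can duplicate a cursor through a per-type dup_cursor method without switching on the btree type.

```c
#include <assert.h>
#include <stdlib.h>

struct cursor;

/* Hypothetical operations vector: one method per type-specific step. */
struct cursor_ops {
	struct cursor *(*dup_cursor)(struct cursor *cur);
};

/* Simplified stand-in for a btree cursor; fields are illustrative only. */
struct cursor {
	const struct cursor_ops	*ops;
	int			nlevels;
	int			flags;	/* only used by the "bmap"-like type */
};

/* Per-type init routines, declared up front so the dup methods can use them. */
static struct cursor *alloc_init(int nlevels);
static struct cursor *bmap_init(int nlevels);

static struct cursor *alloc_dup(struct cursor *cur)
{
	return alloc_init(cur->nlevels);
}

static struct cursor *bmap_dup(struct cursor *cur)
{
	struct cursor *new = bmap_init(cur->nlevels);

	/* Copy state that the init routine does not set up. */
	if (new)
		new->flags = cur->flags;
	return new;
}

static const struct cursor_ops alloc_ops = { .dup_cursor = alloc_dup };
static const struct cursor_ops bmap_ops = { .dup_cursor = bmap_dup };

static struct cursor *alloc_init(int nlevels)
{
	struct cursor *cur = calloc(1, sizeof(*cur));

	if (!cur)
		return NULL;
	cur->ops = &alloc_ops;
	cur->nlevels = nlevels;
	return cur;
}

static struct cursor *bmap_init(int nlevels)
{
	struct cursor *cur = calloc(1, sizeof(*cur));

	if (!cur)
		return NULL;
	cur->ops = &bmap_ops;
	cur->nlevels = nlevels;
	return cur;
}

/* Generic code: no switch on the cursor type, just dispatch via the vector. */
static struct cursor *cursor_dup(struct cursor *cur)
{
	assert(cur && cur->ops && cur->ops->dup_cursor);
	return cur->ops->dup_cursor(cur);
}
```

In this sketch a caller would create a cursor with alloc_init() or bmap_init() and later call cursor_dup() on it without knowing which kind it holds; supporting another type then means adding one init routine and one ops table rather than extending a switch statement.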
Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 04:01:06.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 04:01:46.000000000 +0200 @@ -131,6 +131,11 @@ extern const __uint32_t xfs_magics[]; #define XFS_BTREE_MAXLEVELS 8 /* max of all btrees */ +struct xfs_btree_ops { + /* cursor operations */ + struct xfs_btree_cur *(*dup_cursor)(struct xfs_btree_cur *); +}; + /* * Btree cursor structure. * This collects all information needed by the btree code in one place. @@ -139,6 +144,7 @@ typedef struct xfs_btree_cur { struct xfs_trans *bc_tp; /* transaction we're in, if any */ struct xfs_mount *bc_mp; /* file system mount struct */ + const struct xfs_btree_ops *bc_ops; union { xfs_alloc_rec_incore_t a; xfs_bmbt_irec_t b; @@ -308,20 +314,6 @@ xfs_btree_get_bufs( uint lock); /* lock flags for get_buf */ /* - * Allocate a new btree cursor. - * The cursor is either for allocation (A) or bmap (B). - */ -xfs_btree_cur_t * /* new btree cursor */ -xfs_btree_init_cursor( - struct xfs_mount *mp, /* file system mount point */ - struct xfs_trans *tp, /* transaction pointer */ - struct xfs_buf *agbp, /* (A only) buffer for agf structure */ - xfs_agnumber_t agno, /* (A only) allocation group number */ - xfs_btnum_t btnum, /* btree identifier */ - struct xfs_inode *ip, /* (B only) inode owning the btree */ - int whichfork); /* (B only) data/attr fork */ - -/* * Check for the cursor referring to the last block at the given level. 
*/ int /* 1=is last block, 0=not last block */ Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 04:01:38.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 04:01:46.000000000 +0200 @@ -387,16 +387,17 @@ xfs_btree_dup_cursor( tp = cur->bc_tp; mp = cur->bc_mp; + /* * Allocate a new cursor like the old one. */ - new = xfs_btree_init_cursor(mp, tp, cur->bc_private.a.agbp, - cur->bc_private.a.agno, cur->bc_btnum, cur->bc_private.b.ip, - cur->bc_private.b.whichfork); + new = cur->bc_ops->dup_cursor(cur); + /* * Copy the record currently in the cursor. */ new->bc_rec = cur->bc_rec; + /* * For each level current, re-get the buffer and copy the ptr value. */ @@ -416,15 +417,6 @@ xfs_btree_dup_cursor( } else new->bc_bufs[i] = NULL; } - /* - * For bmap btrees, copy the firstblock, flist, and flags values, - * since init cursor doesn't get them. - */ - if (new->bc_btnum == XFS_BTNUM_BMAP) { - new->bc_private.b.firstblock = cur->bc_private.b.firstblock; - new->bc_private.b.flist = cur->bc_private.b.flist; - new->bc_private.b.flags = cur->bc_private.b.flags; - } *ncur = new; return 0; } @@ -505,97 +497,6 @@ xfs_btree_get_bufs( } /* - * Allocate a new btree cursor. - * The cursor is either for allocation (A) or bmap (B) or inodes (I). 
- */ -xfs_btree_cur_t * /* new btree cursor */ -xfs_btree_init_cursor( - xfs_mount_t *mp, /* file system mount point */ - xfs_trans_t *tp, /* transaction pointer */ - xfs_buf_t *agbp, /* (A only) buffer for agf structure */ - /* (I only) buffer for agi structure */ - xfs_agnumber_t agno, /* (AI only) allocation group number */ - xfs_btnum_t btnum, /* btree identifier */ - xfs_inode_t *ip, /* (B only) inode owning the btree */ - int whichfork) /* (B only) data or attr fork */ -{ - xfs_agf_t *agf; /* (A) allocation group freespace */ - xfs_agi_t *agi; /* (I) allocation group inodespace */ - xfs_btree_cur_t *cur; /* return value */ - xfs_ifork_t *ifp; /* (I) inode fork pointer */ - int nlevels=0; /* number of levels in the btree */ - - ASSERT(xfs_btree_cur_zone != NULL); - /* - * Allocate a new cursor. - */ - cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_SLEEP); - /* - * Deduce the number of btree levels from the arguments. - */ - switch (btnum) { - case XFS_BTNUM_BNO: - case XFS_BTNUM_CNT: - agf = XFS_BUF_TO_AGF(agbp); - nlevels = be32_to_cpu(agf->agf_levels[btnum]); - break; - case XFS_BTNUM_BMAP: - ifp = XFS_IFORK_PTR(ip, whichfork); - nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1; - break; - case XFS_BTNUM_INO: - agi = XFS_BUF_TO_AGI(agbp); - nlevels = be32_to_cpu(agi->agi_level); - break; - default: - ASSERT(0); - } - /* - * Fill in the common fields. - */ - cur->bc_tp = tp; - cur->bc_mp = mp; - cur->bc_nlevels = nlevels; - cur->bc_btnum = btnum; - cur->bc_blocklog = mp->m_sb.sb_blocklog; - /* - * Fill in private fields. - */ - switch (btnum) { - case XFS_BTNUM_BNO: - case XFS_BTNUM_CNT: - /* - * Allocation btree fields. - */ - cur->bc_private.a.agbp = agbp; - cur->bc_private.a.agno = agno; - break; - case XFS_BTNUM_INO: - /* - * Inode allocation btree fields. - */ - cur->bc_private.a.agbp = agbp; - cur->bc_private.a.agno = agno; - break; - case XFS_BTNUM_BMAP: - /* - * Bmap btree fields. 
- */ - cur->bc_private.b.forksize = XFS_IFORK_SIZE(ip, whichfork); - cur->bc_private.b.ip = ip; - cur->bc_private.b.firstblock = NULLFSBLOCK; - cur->bc_private.b.flist = NULL; - cur->bc_private.b.allocated = 0; - cur->bc_private.b.flags = 0; - cur->bc_private.b.whichfork = whichfork; - break; - default: - ASSERT(0); - } - return cur; -} - -/* * Check for the cursor referring to the last block at the given level. */ int /* 1=is last block, 0=not last block */ Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 04:01:46.000000000 +0200 @@ -2608,3 +2608,62 @@ xfs_check_nostate_extents( } return 0; } + + +STATIC struct xfs_btree_cur * +xfs_bmbt_dup_cursor( + struct xfs_btree_cur *cur) +{ + struct xfs_btree_cur *new; + + new = xfs_bmbt_init_cursor(cur->bc_mp, cur->bc_tp, + cur->bc_private.b.ip, cur->bc_private.b.whichfork); + + /* + * Copy the firstblock, flist, and flags values, + * since init cursor doesn't get them. + */ + new->bc_private.b.firstblock = cur->bc_private.b.firstblock; + new->bc_private.b.flist = cur->bc_private.b.flist; + new->bc_private.b.flags = cur->bc_private.b.flags; + + return new; +} + +static const struct xfs_btree_ops xfs_bmbt_ops = { + .dup_cursor = xfs_bmbt_dup_cursor, +}; + +/* + * Allocate a new bmap btree cursor. 
+ */ +struct xfs_btree_cur * /* new bmap btree cursor */ +xfs_bmbt_init_cursor( + struct xfs_mount *mp, /* file system mount point */ + struct xfs_trans *tp, /* transaction pointer */ + struct xfs_inode *ip, /* inode owning the btree */ + int whichfork) /* data or attr fork */ +{ + struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork); + struct xfs_btree_cur *cur; + + cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_SLEEP); + + cur->bc_tp = tp; + cur->bc_mp = mp; + cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1; + cur->bc_btnum = XFS_BTNUM_BMAP; + cur->bc_blocklog = mp->m_sb.sb_blocklog; + + cur->bc_ops = &xfs_bmbt_ops; + + cur->bc_private.b.forksize = XFS_IFORK_SIZE(ip, whichfork); + cur->bc_private.b.ip = ip; + cur->bc_private.b.firstblock = NULLFSBLOCK; + cur->bc_private.b.flist = NULL; + cur->bc_private.b.allocated = 0; + cur->bc_private.b.flags = 0; + cur->bc_private.b.whichfork = whichfork; + + return cur; +} Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-02 04:01:46.000000000 +0200 @@ -2076,3 +2076,44 @@ xfs_inobt_update( } return 0; } + +STATIC struct xfs_btree_cur * +xfs_inobt_dup_cursor( + struct xfs_btree_cur *cur) +{ + return xfs_inobt_init_cursor(cur->bc_mp, cur->bc_tp, + cur->bc_private.a.agbp, cur->bc_private.a.agno); +} + +static const struct xfs_btree_ops xfs_inobt_ops = { + .dup_cursor = xfs_inobt_dup_cursor, +}; + +/* + * Allocate a new inode btree cursor. 
+ */ +struct xfs_btree_cur * /* new inode btree cursor */ +xfs_inobt_init_cursor( + struct xfs_mount *mp, /* file system mount point */ + struct xfs_trans *tp, /* transaction pointer */ + struct xfs_buf *agbp, /* buffer for agi structure */ + xfs_agnumber_t agno) /* allocation group number */ +{ + struct xfs_agi *agi = XFS_BUF_TO_AGI(agbp); + struct xfs_btree_cur *cur; + + cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_SLEEP); + + cur->bc_tp = tp; + cur->bc_mp = mp; + cur->bc_nlevels = be32_to_cpu(agi->agi_level); + cur->bc_btnum = XFS_BTNUM_INO; + cur->bc_blocklog = mp->m_sb.sb_blocklog; + + cur->bc_ops = &xfs_inobt_ops; + + cur->bc_private.a.agbp = agbp; + cur->bc_private.a.agno = agno; + + return cur; +} Index: linux-2.6-xfs/fs/xfs/xfs_alloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-08-02 04:00:05.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 2008-08-02 04:01:46.000000000 +0200 @@ -640,8 +640,8 @@ xfs_alloc_ag_vextent_exact( /* * Allocate/initialize a cursor for the by-number freespace btree. */ - bno_cur = xfs_btree_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_BNO, NULL, 0); + bno_cur = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, + args->agno, XFS_BTNUM_BNO); /* * Lookup bno and minlen in the btree (minlen is irrelevant, really). * Look for the closest free block <= bno, it must contain bno @@ -696,8 +696,8 @@ xfs_alloc_ag_vextent_exact( * We are allocating agbno for rlen [agbno .. end] * Allocate/initialize a cursor for the by-size btree. 
*/ - cnt_cur = xfs_btree_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_CNT, NULL, 0); + cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, + args->agno, XFS_BTNUM_CNT); ASSERT(args->agbno + args->len <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); if ((error = xfs_alloc_fixup_trees(cnt_cur, bno_cur, fbno, flen, @@ -759,8 +759,8 @@ xfs_alloc_ag_vextent_near( /* * Get a cursor for the by-size btree. */ - cnt_cur = xfs_btree_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_CNT, NULL, 0); + cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, + args->agno, XFS_BTNUM_CNT); ltlen = 0; bno_cur_lt = bno_cur_gt = NULL; /* @@ -886,8 +886,8 @@ xfs_alloc_ag_vextent_near( /* * Set up a cursor for the by-bno tree. */ - bno_cur_lt = xfs_btree_init_cursor(args->mp, args->tp, - args->agbp, args->agno, XFS_BTNUM_BNO, NULL, 0); + bno_cur_lt = xfs_allocbt_init_cursor(args->mp, args->tp, + args->agbp, args->agno, XFS_BTNUM_BNO); /* * Fix up the btree entries. */ @@ -914,8 +914,8 @@ xfs_alloc_ag_vextent_near( /* * Allocate and initialize the cursor for the leftward search. */ - bno_cur_lt = xfs_btree_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_BNO, NULL, 0); + bno_cur_lt = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, + args->agno, XFS_BTNUM_BNO); /* * Lookup <= bno to find the leftward search's starting point. */ @@ -1267,8 +1267,8 @@ xfs_alloc_ag_vextent_size( /* * Allocate and initialize a cursor for the by-size btree. */ - cnt_cur = xfs_btree_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_CNT, NULL, 0); + cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, + args->agno, XFS_BTNUM_CNT); bno_cur = NULL; /* * Look for an entry >= maxlen+alignment-1 blocks. @@ -1372,8 +1372,8 @@ xfs_alloc_ag_vextent_size( /* * Allocate and initialize a cursor for the by-block tree. 
*/ - bno_cur = xfs_btree_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_BNO, NULL, 0); + bno_cur = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, + args->agno, XFS_BTNUM_BNO); if ((error = xfs_alloc_fixup_trees(cnt_cur, bno_cur, fbno, flen, rbno, rlen, XFSA_FIXUP_CNT_OK))) goto error0; @@ -1515,8 +1515,7 @@ xfs_free_ag_extent( /* * Allocate and initialize a cursor for the by-block btree. */ - bno_cur = xfs_btree_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_BNO, NULL, - 0); + bno_cur = xfs_allocbt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_BNO); cnt_cur = NULL; /* * Look for a neighboring block on the left (lower block numbers) @@ -1575,8 +1574,7 @@ xfs_free_ag_extent( /* * Now allocate and initialize a cursor for the by-size tree. */ - cnt_cur = xfs_btree_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_CNT, NULL, - 0); + cnt_cur = xfs_allocbt_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_CNT); /* * Have both left and right contiguous neighbors. * Merge all three into a single free block. Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-02 04:01:46.000000000 +0200 @@ -2209,3 +2209,48 @@ xfs_alloc_update( } return 0; } + +STATIC struct xfs_btree_cur * +xfs_allocbt_dup_cursor( + struct xfs_btree_cur *cur) +{ + return xfs_allocbt_init_cursor(cur->bc_mp, cur->bc_tp, + cur->bc_private.a.agbp, cur->bc_private.a.agno, + cur->bc_btnum); +} + +static const struct xfs_btree_ops xfs_allocbt_ops = { + .dup_cursor = xfs_allocbt_dup_cursor, +}; + +/* + * Allocate a new allocation btree cursor. 
+ */ +struct xfs_btree_cur * /* new alloc btree cursor */ +xfs_allocbt_init_cursor( + struct xfs_mount *mp, /* file system mount point */ + struct xfs_trans *tp, /* transaction pointer */ + struct xfs_buf *agbp, /* buffer for agf structure */ + xfs_agnumber_t agno, /* allocation group number */ + xfs_btnum_t btnum) /* btree identifier */ +{ + struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp); + struct xfs_btree_cur *cur; + + ASSERT(btnum == XFS_BTNUM_BNO || btnum == XFS_BTNUM_CNT); + + cur = kmem_zone_zalloc(xfs_btree_cur_zone, KM_SLEEP); + + cur->bc_tp = tp; + cur->bc_mp = mp; + cur->bc_nlevels = be32_to_cpu(agf->agf_levels[btnum]); + cur->bc_btnum = btnum; + cur->bc_blocklog = mp->m_sb.sb_blocklog; + + cur->bc_ops = &xfs_allocbt_ops; + + cur->bc_private.a.agbp = agbp; + cur->bc_private.a.agno = agno; + + return cur; +} Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.h 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h 2008-08-02 04:01:46.000000000 +0200 @@ -152,4 +152,9 @@ extern int xfs_alloc_lookup_le(struct xf extern int xfs_alloc_update(struct xfs_btree_cur *cur, xfs_agblock_t bno, xfs_extlen_t len); + +extern struct xfs_btree_cur *xfs_allocbt_init_cursor(struct xfs_mount *, + struct xfs_trans *, struct xfs_buf *, + xfs_agnumber_t, xfs_btnum_t); + #endif /* __XFS_ALLOC_BTREE_H__ */ Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c 2008-08-02 04:01:46.000000000 +0200 @@ -422,8 +422,7 @@ xfs_bmap_add_attrfork_btree( if (ip->i_df.if_broot_bytes <= XFS_IFORK_DSIZE(ip)) *flags |= XFS_ILOG_DBROOT; else { - cur = xfs_btree_init_cursor(mp, tp, NULL, 0, XFS_BTNUM_BMAP, ip, - XFS_DATA_FORK); + cur = xfs_bmbt_init_cursor(mp, tp, ip, XFS_DATA_FORK); cur->bc_private.b.flist 
= flist; cur->bc_private.b.firstblock = *firstblock; if ((error = xfs_bmbt_lookup_ge(cur, 0, 0, 0, &stat))) @@ -3441,8 +3440,7 @@ xfs_bmap_extents_to_btree( * Need a cursor. Can't allocate until bb_level is filled in. */ mp = ip->i_mount; - cur = xfs_btree_init_cursor(mp, tp, NULL, 0, XFS_BTNUM_BMAP, ip, - whichfork); + cur = xfs_bmbt_init_cursor(mp, tp, ip, whichfork); cur->bc_private.b.firstblock = *firstblock; cur->bc_private.b.flist = flist; cur->bc_private.b.flags = wasdel ? XFS_BTCUR_BPRV_WASDEL : 0; @@ -5029,8 +5027,7 @@ xfs_bmapi( if (abno == NULLFSBLOCK) break; if ((ifp->if_flags & XFS_IFBROOT) && !cur) { - cur = xfs_btree_init_cursor(mp, - tp, NULL, 0, XFS_BTNUM_BMAP, + cur = xfs_bmbt_init_cursor(mp, tp, ip, whichfork); cur->bc_private.b.firstblock = *firstblock; @@ -5147,9 +5144,8 @@ xfs_bmapi( */ ASSERT(mval->br_blockcount <= len); if ((ifp->if_flags & XFS_IFBROOT) && !cur) { - cur = xfs_btree_init_cursor(mp, - tp, NULL, 0, XFS_BTNUM_BMAP, - ip, whichfork); + cur = xfs_bmbt_init_cursor(mp, + tp, ip, whichfork); cur->bc_private.b.firstblock = *firstblock; cur->bc_private.b.flist = flist; @@ -5440,8 +5436,7 @@ xfs_bunmapi( logflags = 0; if (ifp->if_flags & XFS_IFBROOT) { ASSERT(XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_BTREE); - cur = xfs_btree_init_cursor(mp, tp, NULL, 0, XFS_BTNUM_BMAP, ip, - whichfork); + cur = xfs_bmbt_init_cursor(mp, tp, ip, whichfork); cur->bc_private.b.firstblock = *firstblock; cur->bc_private.b.flist = flist; cur->bc_private.b.flags = 0; Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-08-02 04:01:46.000000000 +0200 @@ -24,6 +24,7 @@ struct xfs_btree_cur; struct xfs_btree_lblock; struct xfs_mount; struct xfs_inode; +struct xfs_trans; /* * Bmap root header, on-disk form only. 
@@ -300,6 +301,9 @@ extern void xfs_bmbt_to_bmdr(xfs_bmbt_bl extern int xfs_bmbt_update(struct xfs_btree_cur *, xfs_fileoff_t, xfs_fsblock_t, xfs_filblks_t, xfs_exntst_t); +extern struct xfs_btree_cur *xfs_bmbt_init_cursor(struct xfs_mount *, + struct xfs_trans *, struct xfs_inode *, int); + #endif /* __KERNEL__ */ #endif /* __XFS_BMAP_BTREE_H__ */ Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.c 2008-08-02 04:00:05.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.c 2008-08-02 04:01:46.000000000 +0200 @@ -335,8 +335,7 @@ xfs_ialloc_ag_alloc( /* * Insert records describing the new inode chunk into the btree. */ - cur = xfs_btree_init_cursor(args.mp, tp, agbp, agno, - XFS_BTNUM_INO, (xfs_inode_t *)0, 0); + cur = xfs_inobt_init_cursor(args.mp, tp, agbp, agno); for (thisino = newino; thisino < newino + newlen; thisino += XFS_INODES_PER_CHUNK) { @@ -676,8 +675,7 @@ nextag: */ agno = tagno; *IO_agbp = NULL; - cur = xfs_btree_init_cursor(mp, tp, agbp, be32_to_cpu(agi->agi_seqno), - XFS_BTNUM_INO, (xfs_inode_t *)0, 0); + cur = xfs_inobt_init_cursor(mp, tp, agbp, be32_to_cpu(agi->agi_seqno)); /* * If pagino is 0 (this is the root inode allocation) use newino. * This must work because we've just allocated some. @@ -1022,8 +1020,7 @@ xfs_difree( /* * Initialize the cursor. 
*/ - cur = xfs_btree_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_INO, - (xfs_inode_t *)0, 0); + cur = xfs_inobt_init_cursor(mp, tp, agbp, agno); #ifdef DEBUG if (cur->bc_nlevels == 1) { int freecount = 0; @@ -1259,8 +1256,7 @@ xfs_dilocate( #endif /* DEBUG */ return error; } - cur = xfs_btree_init_cursor(mp, tp, agbp, agno, XFS_BTNUM_INO, - (xfs_inode_t *)0, 0); + cur = xfs_inobt_init_cursor(mp, tp, agbp, agno); if ((error = xfs_inobt_lookup_le(cur, agino, 0, 0, &i))) { #ifdef DEBUG xfs_fs_cmn_err(CE_ALERT, mp, "xfs_dilocate: " Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.h 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h 2008-08-02 04:01:46.000000000 +0200 @@ -175,4 +175,8 @@ extern int xfs_inobt_lookup_le(struct xf extern int xfs_inobt_update(struct xfs_btree_cur *cur, xfs_agino_t ino, __int32_t fcnt, xfs_inofree_t free); + +extern struct xfs_btree_cur *xfs_inobt_init_cursor(struct xfs_mount *, + struct xfs_trans *, struct xfs_buf *, xfs_agnumber_t); + #endif /* __XFS_IALLOC_BTREE_H__ */ Index: linux-2.6-xfs/fs/xfs/xfs_itable.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_itable.c 2008-08-02 03:59:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_itable.c 2008-08-02 04:01:46.000000000 +0200 @@ -416,8 +416,7 @@ xfs_bulkstat( /* * Allocate and initialize a btree cursor for ialloc btree. 
*/ - cur = xfs_btree_init_cursor(mp, NULL, agbp, agno, XFS_BTNUM_INO, - (xfs_inode_t *)0, 0); + cur = xfs_inobt_init_cursor(mp, NULL, agbp, agno); irbp = irbuf; irbufend = irbuf + nirbuf; end_of_ag = 0; @@ -842,8 +841,7 @@ xfs_inumbers( agino = 0; continue; } - cur = xfs_btree_init_cursor(mp, NULL, agbp, agno, - XFS_BTNUM_INO, (xfs_inode_t *)0, 0); + cur = xfs_inobt_init_cursor(mp, NULL, agbp, agno); error = xfs_inobt_lookup_ge(cur, agino, 0, 0, &tmp); if (error) { xfs_btree_del_cursor(cur, XFS_BTREE_ERROR); -- From owner-xfs@oss.sgi.com Sun Aug 3 18:33:22 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:33:29 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741XLFO032251 for ; Sun, 3 Aug 2008 18:33:21 -0700 X-ASG-Debug-ID: 1217813674-16b402e00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9359D1976448 for ; Sun, 3 Aug 2008 18:34:34 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id jmuM2grRkgb4HdxB for ; Sun, 03 Aug 2008 18:34:34 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741YaIF009168 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:34:37 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741YafG009166 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:34:36 +0200 Date: Mon, 4 Aug 2008 03:34:36 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 15/26] add get_maxrecs btree operation Subject: [PATCH 15/26] add get_maxrecs btree 
operation Message-ID: <20080804013436.GP8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-btree-get_maxrecs-method User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813675 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1688 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17339 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Factor xfs_btree_maxrecs into a per-btree operation. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 04:08:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 04:08:43.000000000 +0200 @@ -52,31 +52,6 @@ const __uint32_t xfs_magics[XFS_BTNUM_MA }; /* - * Checking routine: return maxrecs for the block. 
- */ -STATIC int /* number of records fitting in block */ -xfs_btree_maxrecs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_block_t *block) /* generic btree block pointer */ -{ - switch (cur->bc_btnum) { - case XFS_BTNUM_BNO: - case XFS_BTNUM_CNT: - return (int)XFS_ALLOC_BLOCK_MAXRECS( - be16_to_cpu(block->bb_level), cur); - case XFS_BTNUM_BMAP: - return (int)XFS_BMAP_BLOCK_IMAXRECS( - be16_to_cpu(block->bb_level), cur); - case XFS_BTNUM_INO: - return (int)XFS_INOBT_BLOCK_MAXRECS( - be16_to_cpu(block->bb_level), cur); - default: - ASSERT(0); - return 0; - } -} - -/* * External routines. */ @@ -208,7 +183,7 @@ xfs_btree_check_lblock( be32_to_cpu(block->bb_magic) == xfs_magics[cur->bc_btnum] && be16_to_cpu(block->bb_level) == level && be16_to_cpu(block->bb_numrecs) <= - xfs_btree_maxrecs(cur, (xfs_btree_block_t *)block) && + cur->bc_ops->get_maxrecs(cur, level) && block->bb_leftsib && (be64_to_cpu(block->bb_leftsib) == NULLDFSBNO || XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(block->bb_leftsib))) && @@ -245,7 +220,7 @@ xfs_btree_check_sblock( be32_to_cpu(block->bb_magic) == xfs_magics[cur->bc_btnum] && be16_to_cpu(block->bb_level) == level && be16_to_cpu(block->bb_numrecs) <= - xfs_btree_maxrecs(cur, (xfs_btree_block_t *)block) && + cur->bc_ops->get_maxrecs(cur, level) && (be32_to_cpu(block->bb_leftsib) == NULLAGBLOCK || be32_to_cpu(block->bb_leftsib) < agflen) && block->bb_leftsib && Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 04:08:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 04:08:43.000000000 +0200 @@ -200,6 +200,9 @@ struct xfs_btree_ops { union xfs_btree_rec *(*rec_addr)(struct xfs_btree_cur *cur, int index, struct xfs_btree_block *block); + /* records in block/level */ + int (*get_maxrecs)(struct xfs_btree_cur *cur, int level); + /* init values of btree structures */ void (*init_key_from_rec)(struct xfs_btree_cur *cur, 
union xfs_btree_key *key, Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-02 04:08:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-02 04:08:43.000000000 +0200 @@ -1655,6 +1655,14 @@ xfs_allocbt_update_lastrec( xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, XFS_AGF_LONGEST); } +STATIC int +xfs_allocbt_get_maxrecs( + struct xfs_btree_cur *cur, + int level) +{ + return cur->bc_mp->m_alloc_mxr[level != 0]; +} + STATIC void xfs_allocbt_init_key_from_rec( struct xfs_btree_cur *cur, @@ -1895,6 +1903,7 @@ xfs_allocbt_trace_record( static const struct xfs_btree_ops xfs_allocbt_ops = { .dup_cursor = xfs_allocbt_dup_cursor, .update_lastrec = xfs_allocbt_update_lastrec, + .get_maxrecs = xfs_allocbt_get_maxrecs, .init_key_from_rec = xfs_allocbt_init_key_from_rec, .init_ptr_from_cur = xfs_allocbt_init_ptr_from_cur, .ptr_addr = xfs_allocbt_ptr_addr, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 04:08:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 04:08:43.000000000 +0200 @@ -1955,6 +1955,14 @@ xfs_bmbt_get_root_from_inode( return (struct xfs_btree_block *)ifp->if_broot; } +STATIC int +xfs_bmbt_get_maxrecs( + struct xfs_btree_cur *cur, + int level) +{ + return XFS_BMAP_BLOCK_IMAXRECS(level, cur); +} + STATIC void xfs_bmbt_init_key_from_rec( struct xfs_btree_cur *cur, @@ -2179,6 +2187,7 @@ static const struct xfs_btree_ops xfs_bm .get_root_from_inode = xfs_bmbt_get_root_from_inode, .init_key_from_rec = xfs_bmbt_init_key_from_rec, .init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur, + .get_maxrecs = xfs_bmbt_get_maxrecs, .ptr_addr = xfs_bmbt_ptr_addr, .key_addr = xfs_bmbt_key_addr, .rec_addr = xfs_bmbt_rec_addr, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 
=================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-02 04:08:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-02 04:08:43.000000000 +0200 @@ -1516,6 +1516,14 @@ xfs_inobt_dup_cursor( cur->bc_private.a.agbp, cur->bc_private.a.agno); } +STATIC int +xfs_inobt_get_maxrecs( + struct xfs_btree_cur *cur, + int level) +{ + return cur->bc_mp->m_inobt_mxr[level != 0]; +} + STATIC void xfs_inobt_init_key_from_rec( struct xfs_btree_cur *cur, @@ -1720,6 +1728,7 @@ xfs_inobt_trace_record( static const struct xfs_btree_ops xfs_inobt_ops = { .dup_cursor = xfs_inobt_dup_cursor, + .get_maxrecs = xfs_inobt_get_maxrecs, .init_key_from_rec = xfs_inobt_init_key_from_rec, .init_ptr_from_cur = xfs_inobt_init_ptr_from_cur, .ptr_addr = xfs_inobt_ptr_addr, -- From owner-xfs@oss.sgi.com Sun Aug 3 18:31:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:31:21 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741VIVA030368 for ; Sun, 3 Aug 2008 18:31:18 -0700 X-ASG-Debug-ID: 1217813551-2db303130000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 2E79DEF6A14 for ; Sun, 3 Aug 2008 18:32:32 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id GZqlZoXGFmphuYia for ; Sun, 03 Aug 2008 18:32:32 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741WXIF008910 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:32:33 +0200 Received: 
(from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741WXmg008908 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:32:33 +0200 Date: Mon, 4 Aug 2008 03:32:33 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 03/26] add generic btree types Subject: [PATCH 03/26] add generic btree types Message-ID: <20080804013233.GD8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-btree-add-generic-types User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813553 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17328 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Add generic union types for btree pointers, keys and records. The generic btree pointer contains a 32 and 64bit big endian type for short and long form btrees, and the key and record contain the relevant type for each possible btree. Split out from a bigger patch from Dave Chinner and simplified a little further. 
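For illustration only, here is a minimal userspace sketch of the idea behind the generic pointer union: one union whose members hold big-endian values, with a flag selecting the short (32bit) or long (64bit) form. The names demo_btree_ptr, DEMO_LONG_PTRS, demo_ptr_set and demo_ptr_get are hypothetical and not part of this patch; the kernel code uses __be32/__be64 and its own conversion helpers instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for union xfs_btree_ptr. Both members are
 * meant to hold big-endian (disk format) bytes, never CPU-order values.
 */
union demo_btree_ptr {
	uint32_t	s;	/* short form ptr */
	uint64_t	l;	/* long form ptr */
};

/* Hypothetical flag choosing the long form, assumed for this sketch. */
#define DEMO_LONG_PTRS	0x1

/* Store val into *p as big-endian bytes, 4 or 8 bytes wide. */
static void
demo_ptr_set(union demo_btree_ptr *p, unsigned int flags, uint64_t val)
{
	unsigned char	buf[8];
	int		n = (flags & DEMO_LONG_PTRS) ? 8 : 4;
	int		i;

	for (i = 0; i < n; i++)
		buf[i] = (unsigned char)(val >> (8 * (n - 1 - i)));
	memset(p, 0, sizeof(*p));
	memcpy(p, buf, n);
}

/* Read the big-endian bytes in *p back into a CPU-order value. */
static uint64_t
demo_ptr_get(const union demo_btree_ptr *p, unsigned int flags)
{
	const unsigned char	*b = (const unsigned char *)p;
	int			n = (flags & DEMO_LONG_PTRS) ? 8 : 4;
	uint64_t		v = 0;
	int			i;

	for (i = 0; i < n; i++)
		v = (v << 8) | b[i];
	return v;
}
```

Callers that only pass the union around never need to know its width; only the set/get helpers look at the flag, which is roughly the property that lets common btree code handle both short and long form trees.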
Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 04:01:46.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 04:04:16.000000000 +0200 @@ -79,6 +79,31 @@ typedef struct xfs_btree_block { } bb_u; /* rest */ } xfs_btree_block_t; + /* + * Generic key, ptr and record wrapper structures. + * + * These are disk format structures, and are converted where necessary + * by the btree specific code that needs to interpret them. + */ +union xfs_btree_ptr { + __be32 s; /* short form ptr */ + __be64 l; /* long form ptr */ +}; + +union xfs_btree_key { + xfs_bmbt_key_t bmbt; + xfs_bmdr_key_t bmbr; /* bmbt root block */ + xfs_alloc_key_t alloc; + xfs_inobt_key_t inobt; +}; + +union xfs_btree_rec { + xfs_bmbt_rec_t bmbt; + xfs_bmdr_rec_t bmbr; /* bmbt root block */ + xfs_alloc_rec_t alloc; + xfs_inobt_rec_t inobt; +}; + /* * For logging record fields. 
*/ -- From owner-xfs@oss.sgi.com Sun Aug 3 18:33:49 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:33:57 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_31, J_CHICKENPOX_73 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741Xl19000378 for ; Sun, 3 Aug 2008 18:33:48 -0700 X-ASG-Debug-ID: 1217813698-40dc01b20000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id F3A97EF6ACC for ; Sun, 3 Aug 2008 18:34:59 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id dDFfYXUiP5eZLEtC for ; Sun, 03 Aug 2008 18:34:59 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741Z0IF009207 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:35:00 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741Z0uR009205 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:35:00 +0200 Date: Mon, 4 Aug 2008 03:35:00 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 17/26] implement generic xfs_btree_lshift Subject: [PATCH 17/26] implement generic xfs_btree_lshift Message-ID: <20080804013500.GR8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-lshift User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813700 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.22 
X-Barracuda-Spam-Status: No, SCORE=-1.22 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_MJ615, MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words 0.20 BSF_SC0_MJ615 Custom Rule MJ615 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17342 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Make the btree left shift code generic. Based on a patch from David Chinner with lots of changes to follow the original btree implementations more closely. While this loses some of the generic helper routines for inserting/moving/removing records it also solves some of the one off bugs in the original code and makes it easier to verify. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-03 13:32:52.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-03 13:33:04.000000000 +0200 @@ -985,6 +985,21 @@ xfs_btree_move_ptrs( } } +STATIC void +xfs_btree_copy_ptrs( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *src_ptr, + union xfs_btree_ptr *dst_ptr, + int numptrs) +{ + ASSERT(numptrs > 0); + + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) + memcpy(dst_ptr, src_ptr, numptrs * sizeof(__be64)); + else + memcpy(dst_ptr, src_ptr, numptrs * sizeof(__be32)); +} + /* * Log block pointer fields from a btree block (nonleaf). */ @@ -1612,6 +1627,186 @@ error0: } /* + * Move 1 record left from cur/level if possible. + * Update cur to reflect the new path. 
+ */ +int /* error */ +xfs_btree_lshift( + struct xfs_btree_cur *cur, + int level, + int *stat) /* success/failure */ +{ + union xfs_btree_key key; /* btree key */ + struct xfs_buf *lbp; /* left buffer pointer */ + struct xfs_btree_block *left; /* left btree block */ + int lrecs; /* left record count */ + struct xfs_buf *rbp; /* right buffer pointer */ + struct xfs_btree_block *right; /* right btree block */ + int rrecs; /* right record count */ + union xfs_btree_ptr lptr; /* left btree pointer */ + union xfs_btree_key *rkp = NULL; /* right btree key */ + union xfs_btree_ptr *rpp = NULL; /* right address pointer */ + union xfs_btree_rec *rrp = NULL; /* right record pointer */ + int error; /* error return value */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + level == cur->bc_nlevels - 1) + goto out0; + + /* Set up variables for this block as "right". */ + right = xfs_btree_get_block(cur, level, &rbp); + +#ifdef DEBUG + error = xfs_btree_check_block(cur, right, level, rbp); + if (error) + goto error0; +#endif + + /* If we've got no left sibling then we can't shift an entry left. */ + xfs_btree_get_sibling(cur, right, &lptr, XFS_BB_LEFTSIB); + if (xfs_btree_ptr_is_null(cur, &lptr)) + goto out0; + + /* + * If the cursor entry is the one that would be moved, don't + * do it... it's too complicated. + */ + if (cur->bc_ptrs[level] <= 1) + goto out0; + + /* Set up the left neighbor as "left". */ + error = xfs_btree_read_buf_block(cur, &lptr, level, 0, &left, &lbp); + if (error) + goto error0; + + /* If it's full, it can't take another entry. */ + lrecs = xfs_btree_get_numrecs(left); + if (lrecs == cur->bc_ops->get_maxrecs(cur, level)) + goto out0; + + rrecs = xfs_btree_get_numrecs(right); + + /* + * We add one entry to the left side and remove one for the right side. + * Account for it here, the changes will be updated on disk and logged + * later.
+ */ + lrecs++; + rrecs--; + + XFS_BTREE_STATS_INC(cur, lshift); + XFS_BTREE_STATS_ADD(cur, moves, 1); + + /* + * If non-leaf, copy a key and a ptr to the left block. + * Log the changes to the left block. + */ + if (level > 0) { + /* It's a non-leaf. Move keys and pointers. */ + union xfs_btree_key *lkp; /* left btree key */ + union xfs_btree_ptr *lpp; /* left address pointer */ + + lkp = cur->bc_ops->key_addr(cur, lrecs, left); + rkp = cur->bc_ops->key_addr(cur, 1, right); + + lpp = cur->bc_ops->ptr_addr(cur, lrecs, left); + rpp = cur->bc_ops->ptr_addr(cur, 1, right); +#ifdef DEBUG + error = xfs_btree_check_ptr(cur, rpp, 0, level); + if (error) + goto error0; +#endif + cur->bc_ops->copy_keys(cur, rkp, lkp, 1); + xfs_btree_copy_ptrs(cur, rpp, lpp, 1); + + cur->bc_ops->log_keys(cur, lbp, lrecs, lrecs); + xfs_btree_log_ptrs(cur, lbp, lrecs, lrecs); + + xfs_btree_check_key(cur->bc_btnum, + cur->bc_ops->key_addr(cur, lrecs - 1, left), + lkp); + } else { + /* It's a leaf. Move records. */ + union xfs_btree_rec *lrp; /* left record pointer */ + + lrp = cur->bc_ops->rec_addr(cur, lrecs, left); + rrp = cur->bc_ops->rec_addr(cur, 1, right); + + cur->bc_ops->copy_recs(cur, rrp, lrp, 1); + cur->bc_ops->log_recs(cur, lbp, lrecs, lrecs); + + xfs_btree_check_rec(cur->bc_btnum, + cur->bc_ops->rec_addr(cur, lrecs - 1, left), + lrp); + } + + xfs_btree_set_numrecs(left, lrecs); + xfs_btree_log_block(cur, lbp, XFS_BB_NUMRECS); + + xfs_btree_set_numrecs(right, rrecs); + xfs_btree_log_block(cur, rbp, XFS_BB_NUMRECS); + + /* + * Slide the contents of right down one entry. + */ + XFS_BTREE_STATS_ADD(cur, moves, rrecs - 1); + if (level > 0) { + /* It's a nonleaf. 
operate on keys and ptrs */ +#ifdef DEBUG + int i; /* loop index */ + + for (i = 0; i < rrecs; i++) { + error = xfs_btree_check_ptr(cur, rpp, i + 1, level); + if (error) + goto error0; + } +#endif + + cur->bc_ops->move_keys(cur, rkp, 1, 0, rrecs); + xfs_btree_move_ptrs(cur, rpp, 1, 0, rrecs); + cur->bc_ops->log_keys(cur, rbp, 1, rrecs); + xfs_btree_log_ptrs(cur, rbp, 1, rrecs); + } else { + /* It's a leaf. operate on records */ + + rrp = cur->bc_ops->rec_addr(cur, 1, right); + cur->bc_ops->move_recs(cur, rrp, 1, 0, rrecs); + cur->bc_ops->log_recs(cur, rbp, 1, rrecs); + + /* + * If it's the first record in the block, we'll need a key + * structure to pass up to the next level (updkey). + */ + cur->bc_ops->init_key_from_rec(cur, &key, rrp); + rkp = &key; + } + + /* Update the parent key values of right. */ + error = xfs_btree_updkey(cur, rkp, level + 1); + if (error) + goto error0; + + /* Slide the cursor value left one. */ + cur->bc_ptrs[level]--; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* * Move 1 record right from cur/level if possible. * Update cur to reflect the new path. 
*/ Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-03 13:30:46.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-03 13:33:04.000000000 +0200 @@ -52,7 +52,6 @@ STATIC void xfs_alloc_log_ptrs(xfs_btree STATIC void xfs_allocbt_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); #define xfs_alloc_log_recs(c, b, i, j) \ xfs_allocbt_log_recs(c, b, i, j) -STATIC int xfs_alloc_lshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_alloc_newroot(xfs_btree_cur_t *, int *); STATIC int xfs_alloc_split(xfs_btree_cur_t *, int, xfs_agblock_t *, xfs_alloc_key_t *, xfs_btree_cur_t **, int *); @@ -331,7 +330,7 @@ xfs_alloc_delrec( */ if (be16_to_cpu(right->bb_numrecs) - 1 >= XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_alloc_lshift(tcur, level, &i))) + if ((error = xfs_btree_lshift(tcur, level, &i))) goto error0; if (i) { ASSERT(be16_to_cpu(block->bb_numrecs) >= @@ -696,7 +695,7 @@ xfs_alloc_insrec( * Next, try shifting an entry to the left neighbor. */ else { - if ((error = xfs_alloc_lshift(cur, level, &i))) + if ((error = xfs_btree_lshift(cur, level, &i))) return error; if (i) optr = ptr = cur->bc_ptrs[level]; @@ -884,147 +883,6 @@ xfs_alloc_log_ptrs( } /* - * Move 1 record left from cur/level if possible. - * Update cur to reflect the new path. 
- */ -STATIC int /* error */ -xfs_alloc_lshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ -{ - int error; /* error return value */ -#ifdef DEBUG - int i; /* loop index */ -#endif - xfs_alloc_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left neighbor block */ - xfs_alloc_block_t *left; /* left neighbor btree block */ - int nrec; /* new number of left block entries */ - xfs_buf_t *rbp; /* buffer for right (current) block */ - xfs_alloc_block_t *right; /* right (current) btree block */ - xfs_alloc_key_t *rkp=NULL; /* key pointer for right block */ - xfs_alloc_ptr_t *rpp=NULL; /* address pointer for right block */ - xfs_alloc_rec_t *rrp=NULL; /* record pointer for right block */ - - /* - * Set up variables for this block as "right". - */ - rbp = cur->bc_bufs[level]; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; -#endif - /* - * If we've got no left sibling then we can't shift an entry left. - */ - if (be32_to_cpu(right->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... it's too complicated. - */ - if (cur->bc_ptrs[level] <= 1) { - *stat = 0; - return 0; - } - /* - * Set up the left neighbor as "left". - */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(right->bb_leftsib), - 0, &lbp, XFS_ALLOC_BTREE_REF))) - return error; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - /* - * If it's full, it can't take another entry. - */ - if (be16_to_cpu(left->bb_numrecs) == XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - nrec = be16_to_cpu(left->bb_numrecs) + 1; - /* - * If non-leaf, copy a key and a ptr to the left block. 
- */ - if (level > 0) { - xfs_alloc_key_t *lkp; /* key pointer for left block */ - xfs_alloc_ptr_t *lpp; /* address pointer for left block */ - - lkp = XFS_ALLOC_KEY_ADDR(left, nrec, cur); - rkp = XFS_ALLOC_KEY_ADDR(right, 1, cur); - *lkp = *rkp; - xfs_alloc_log_keys(cur, lbp, nrec, nrec); - lpp = XFS_ALLOC_PTR_ADDR(left, nrec, cur); - rpp = XFS_ALLOC_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*rpp), level))) - return error; -#endif - *lpp = *rpp; - xfs_alloc_log_ptrs(cur, lbp, nrec, nrec); - xfs_btree_check_key(cur->bc_btnum, lkp - 1, lkp); - } - /* - * If leaf, copy a record to the left block. - */ - else { - xfs_alloc_rec_t *lrp; /* record pointer for left block */ - - lrp = XFS_ALLOC_REC_ADDR(left, nrec, cur); - rrp = XFS_ALLOC_REC_ADDR(right, 1, cur); - *lrp = *rrp; - xfs_alloc_log_recs(cur, lbp, nrec, nrec); - xfs_btree_check_rec(cur->bc_btnum, lrp - 1, lrp); - } - /* - * Bump and log left's numrecs, decrement and log right's numrecs. - */ - be16_add(&left->bb_numrecs, 1); - xfs_alloc_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, -1); - xfs_alloc_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Slide the contents of right down one entry. 
- */ - if (level > 0) { -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(right->bb_numrecs); i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i + 1]), - level))) - return error; - } -#endif - memmove(rkp, rkp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp, rpp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); - xfs_alloc_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - xfs_alloc_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - } else { - memmove(rrp, rrp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - xfs_alloc_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - key.ar_startblock = rrp->ar_startblock; - key.ar_blockcount = rrp->ar_blockcount; - rkp = &key; - } - /* - * Update the parent key values of right. - */ - if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)rkp, level + 1))) - return error; - /* - * Slide the cursor value left one. - */ - cur->bc_ptrs[level]--; - *stat = 1; - return 0; -} - -/* * Allocate a new root block, fill it in. 
*/ STATIC int /* error */ Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-03 13:30:46.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-03 13:33:04.000000000 +0200 @@ -52,7 +52,6 @@ STATIC int xfs_bmbt_killroot(xfs_btree_cur_t *); STATIC void xfs_bmbt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC void xfs_bmbt_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC int xfs_bmbt_lshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_bmbt_split(xfs_btree_cur_t *, int, xfs_fsblock_t *, __uint64_t *, xfs_btree_cur_t **, int *); @@ -270,7 +269,7 @@ xfs_bmbt_delrec( bno = be64_to_cpu(right->bb_leftsib); if (be16_to_cpu(right->bb_numrecs) - 1 >= XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if ((error = xfs_bmbt_lshift(tcur, level, &i))) { + if ((error = xfs_btree_lshift(tcur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -544,7 +543,7 @@ xfs_bmbt_insrec( if (i) { /* nothing */ } else { - if ((error = xfs_bmbt_lshift(cur, level, &i))) { + if ((error = xfs_btree_lshift(cur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; } @@ -775,139 +774,6 @@ xfs_bmbt_log_ptrs( } /* - * Move 1 record left from cur/level if possible. - * Update cur to reflect the new path. 
- */ -STATIC int /* error */ -xfs_bmbt_lshift( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ -{ - int error; /* error return value */ -#ifdef DEBUG - int i; /* loop counter */ -#endif - xfs_bmbt_key_t key; /* bmap btree key */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_bmbt_block_t *left; /* left btree block */ - xfs_bmbt_key_t *lkp=NULL; /* left btree key */ - xfs_bmbt_ptr_t *lpp; /* left address pointer */ - int lrecs; /* left record count */ - xfs_bmbt_rec_t *lrp=NULL; /* left record pointer */ - xfs_mount_t *mp; /* file system mount point */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_bmbt_block_t *right; /* right btree block */ - xfs_bmbt_key_t *rkp=NULL; /* right btree key */ - xfs_bmbt_ptr_t *rpp=NULL; /* right address pointer */ - xfs_bmbt_rec_t *rrp=NULL; /* right record pointer */ - int rrecs; /* right record count */ - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - if (level == cur->bc_nlevels - 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - rbp = cur->bc_bufs[level]; - right = XFS_BUF_TO_BMBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (be64_to_cpu(right->bb_leftsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - if (cur->bc_ptrs[level] <= 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - mp = cur->bc_mp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, be64_to_cpu(right->bb_leftsib), 0, - &lbp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - left = XFS_BUF_TO_BMBT_BLOCK(lbp); - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (be16_to_cpu(left->bb_numrecs) == XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - lrecs = 
be16_to_cpu(left->bb_numrecs) + 1; - if (level > 0) { - lkp = XFS_BMAP_KEY_IADDR(left, lrecs, cur); - rkp = XFS_BMAP_KEY_IADDR(right, 1, cur); - *lkp = *rkp; - xfs_bmbt_log_keys(cur, lbp, lrecs, lrecs); - lpp = XFS_BMAP_PTR_IADDR(left, lrecs, cur); - rpp = XFS_BMAP_PTR_IADDR(right, 1, cur); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr_disk(cur, *rpp, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - *lpp = *rpp; - xfs_bmbt_log_ptrs(cur, lbp, lrecs, lrecs); - } else { - lrp = XFS_BMAP_REC_IADDR(left, lrecs, cur); - rrp = XFS_BMAP_REC_IADDR(right, 1, cur); - *lrp = *rrp; - xfs_bmbt_log_recs(cur, lbp, lrecs, lrecs); - } - left->bb_numrecs = cpu_to_be16(lrecs); - xfs_bmbt_log_block(cur, lbp, XFS_BB_NUMRECS); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(XFS_BTNUM_BMAP, lkp - 1, lkp); - else - xfs_btree_check_rec(XFS_BTNUM_BMAP, lrp - 1, lrp); -#endif - rrecs = be16_to_cpu(right->bb_numrecs) - 1; - right->bb_numrecs = cpu_to_be16(rrecs); - xfs_bmbt_log_block(cur, rbp, XFS_BB_NUMRECS); - if (level > 0) { -#ifdef DEBUG - for (i = 0; i < rrecs; i++) { - if ((error = xfs_btree_check_lptr_disk(cur, rpp[i + 1], - level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } -#endif - memmove(rkp, rkp + 1, rrecs * sizeof(*rkp)); - memmove(rpp, rpp + 1, rrecs * sizeof(*rpp)); - xfs_bmbt_log_keys(cur, rbp, 1, rrecs); - xfs_bmbt_log_ptrs(cur, rbp, 1, rrecs); - } else { - memmove(rrp, rrp + 1, rrecs * sizeof(*rrp)); - xfs_bmbt_log_recs(cur, rbp, 1, rrecs); - key.br_startoff = cpu_to_be64(xfs_bmbt_disk_get_startoff(rrp)); - rkp = &key; - } - if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)rkp, level + 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - cur->bc_ptrs[level]--; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; -} - -/* * Determine the extent state. 
*/ /* ARGSUSED */ Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-03 13:30:46.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-03 13:33:04.000000000 +0200 @@ -571,6 +571,7 @@ int xfs_btree_decrement(struct xfs_btree int xfs_btree_lookup(struct xfs_btree_cur *, xfs_lookup_t, int *); int xfs_btree_updkey(struct xfs_btree_cur *, union xfs_btree_key *, int); int xfs_btree_update(struct xfs_btree_cur *, union xfs_btree_rec *); +int xfs_btree_lshift(struct xfs_btree_cur *, int, int *); int xfs_btree_rshift(struct xfs_btree_cur *, int, int *); Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-03 13:30:46.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-03 13:33:04.000000000 +0200 @@ -44,7 +44,6 @@ STATIC void xfs_inobt_log_block(xfs_tran STATIC void xfs_inobt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC void xfs_inobt_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC void xfs_inobt_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC int xfs_inobt_lshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_inobt_newroot(xfs_btree_cur_t *, int *); STATIC int xfs_inobt_split(xfs_btree_cur_t *, int, xfs_agblock_t *, xfs_inobt_key_t *, xfs_btree_cur_t **, int *); @@ -277,7 +276,7 @@ xfs_inobt_delrec( */ if (be16_to_cpu(right->bb_numrecs) - 1 >= XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_inobt_lshift(tcur, level, &i))) + if ((error = xfs_btree_lshift(tcur, level, &i))) goto error0; if (i) { ASSERT(be16_to_cpu(block->bb_numrecs) >= @@ -617,7 +616,7 @@ xfs_inobt_insrec( * Next, try shifting an entry to the left neighbor. 
*/ else { - if ((error = xfs_inobt_lshift(cur, level, &i))) + if ((error = xfs_btree_lshift(cur, level, &i))) return error; if (i) { optr = ptr = cur->bc_ptrs[level]; @@ -784,148 +783,6 @@ xfs_inobt_log_ptrs( } /* - * Move 1 record left from cur/level if possible. - * Update cur to reflect the new path. - */ -STATIC int /* error */ -xfs_inobt_lshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ -{ - int error; /* error return value */ -#ifdef DEBUG - int i; /* loop index */ -#endif - xfs_inobt_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left neighbor block */ - xfs_inobt_block_t *left; /* left neighbor btree block */ - xfs_inobt_key_t *lkp=NULL; /* key pointer for left block */ - xfs_inobt_ptr_t *lpp; /* address pointer for left block */ - xfs_inobt_rec_t *lrp=NULL; /* record pointer for left block */ - int nrec; /* new number of left block entries */ - xfs_buf_t *rbp; /* buffer for right (current) block */ - xfs_inobt_block_t *right; /* right (current) btree block */ - xfs_inobt_key_t *rkp=NULL; /* key pointer for right block */ - xfs_inobt_ptr_t *rpp=NULL; /* address pointer for right block */ - xfs_inobt_rec_t *rrp=NULL; /* record pointer for right block */ - - /* - * Set up variables for this block as "right". - */ - rbp = cur->bc_bufs[level]; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; -#endif - /* - * If we've got no left sibling then we can't shift an entry left. - */ - if (be32_to_cpu(right->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... it's too complicated. - */ - if (cur->bc_ptrs[level] <= 1) { - *stat = 0; - return 0; - } - /* - * Set up the left neighbor as "left". 
- */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(right->bb_leftsib), - 0, &lbp, XFS_INO_BTREE_REF))) - return error; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - /* - * If it's full, it can't take another entry. - */ - if (be16_to_cpu(left->bb_numrecs) == XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - nrec = be16_to_cpu(left->bb_numrecs) + 1; - /* - * If non-leaf, copy a key and a ptr to the left block. - */ - if (level > 0) { - lkp = XFS_INOBT_KEY_ADDR(left, nrec, cur); - rkp = XFS_INOBT_KEY_ADDR(right, 1, cur); - *lkp = *rkp; - xfs_inobt_log_keys(cur, lbp, nrec, nrec); - lpp = XFS_INOBT_PTR_ADDR(left, nrec, cur); - rpp = XFS_INOBT_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*rpp), level))) - return error; -#endif - *lpp = *rpp; - xfs_inobt_log_ptrs(cur, lbp, nrec, nrec); - } - /* - * If leaf, copy a record to the left block. - */ - else { - lrp = XFS_INOBT_REC_ADDR(left, nrec, cur); - rrp = XFS_INOBT_REC_ADDR(right, 1, cur); - *lrp = *rrp; - xfs_inobt_log_recs(cur, lbp, nrec, nrec); - } - /* - * Bump and log left's numrecs, decrement and log right's numrecs. - */ - be16_add(&left->bb_numrecs, 1); - xfs_inobt_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(cur->bc_btnum, lkp - 1, lkp); - else - xfs_btree_check_rec(cur->bc_btnum, lrp - 1, lrp); -#endif - be16_add(&right->bb_numrecs, -1); - xfs_inobt_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Slide the contents of right down one entry. 
- */ - if (level > 0) { -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(right->bb_numrecs); i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i + 1]), - level))) - return error; - } -#endif - memmove(rkp, rkp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp, rpp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); - xfs_inobt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - xfs_inobt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - } else { - memmove(rrp, rrp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - xfs_inobt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - key.ir_startino = rrp->ir_startino; - rkp = &key; - } - /* - * Update the parent key values of right. - */ - if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)rkp, level + 1))) - return error; - /* - * Slide the cursor value left one. - */ - cur->bc_ptrs[level]--; - *stat = 1; - return 0; -} - -/* * Allocate a new root block, fill it in. */ STATIC int /* error */ -- From owner-xfs@oss.sgi.com Sun Aug 3 18:33:19 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:33:33 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_63 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741XINk032166 for ; Sun, 3 Aug 2008 18:33:18 -0700 X-ASG-Debug-ID: 1217813668-40cb01cf0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 779A8EF6A7C for ; Sun, 3 Aug 2008 18:34:29 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 1vt4xt0KGdXlO6nk for ; Sun, 03 Aug 2008 18:34:29 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by 
verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741YUIF009151 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:34:30 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741YUYl009149 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:34:30 +0200 Date: Mon, 4 Aug 2008 03:34:30 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 14/26] implement generic xfs_btree_update Subject: [PATCH 14/26] implement generic xfs_btree_update Message-ID: <20080804013430.GO8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=unknown-8bit Content-Disposition: inline; filename=xfs-common-btree-update Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813671 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.22 X-Barracuda-Spam-Status: No, SCORE=-1.22 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_MJ615, MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words 0.20 BSF_SC0_MJ615 Custom Rule MJ615 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17340 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs From: Dave Chinner The most complicated part here is the lastrec tracking for the alloc btree. Most logic is in the update_lastrec method which has to do some hopefully good enough dirty magic to maintain it. 
[hch: split out from bigger patch and a rework of the lastrec logic] Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 02:55:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 02:57:08.000000000 +0200 @@ -868,6 +868,30 @@ xfs_btree_get_sibling( } } +/* + * Return true if ptr is the last record in the btree and + * we need to track updates to this record. The decision + * will be further refined in the update_lastrec method. + */ +STATIC int +xfs_btree_is_lastrec( + struct xfs_btree_cur *cur, + struct xfs_btree_block *block, + int level) +{ + union xfs_btree_ptr ptr; + + if (level > 0) + return 0; + if (!(cur->bc_flags & XFS_BTREE_LASTREC_UPDATE)) + return 0; + + xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB); + if (!xfs_btree_ptr_is_null(cur, &ptr)) + return 0; + return 1; +} + STATIC xfs_daddr_t xfs_btree_ptr_to_daddr( struct xfs_btree_cur *cur, @@ -1419,3 +1443,66 @@ xfs_btree_updkey( XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); return 0; } + +/* + * Update the record referred to by cur to the value in the + * given record. This either works (return 0) or gets an + * EFSCORRUPTED error. + */ +int +xfs_btree_update( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec) +{ + struct xfs_btree_block *block; + struct xfs_buf *bp; + int error; + int ptr; + union xfs_btree_rec *rp; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGR(cur, rec); + + /* Pick up the current block. */ + block = xfs_btree_get_block(cur, 0, &bp); + +#ifdef DEBUG + error = xfs_btree_check_block(cur, block, 0, bp); + if (error) + goto error0; +#endif + /* Get the address of the rec to be updated. */ + ptr = cur->bc_ptrs[0]; + rp = cur->bc_ops->rec_addr(cur, ptr, block); + + /* Fill in the new contents and log them. 
*/ + cur->bc_ops->copy_recs(cur, rec, rp, 1); + cur->bc_ops->log_recs(cur, bp, ptr, ptr); + + /* + * If we are tracking the last record in the tree and + * we are at the far right edge of the tree, update it. + */ + if (xfs_btree_is_lastrec(cur, block, 0)) { + cur->bc_ops->update_lastrec(cur, block, rec, + ptr, LASTREC_UPDATE); + } + + /* Updating first rec in leaf. Pass new key value up to our parent. */ + if (ptr == 1) { + union xfs_btree_key key; + + cur->bc_ops->init_key_from_rec(cur, &key, rec); + error = xfs_btree_updkey(cur, &key, 1); + if (error) + goto error0; + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-02 02:55:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-02 02:58:05.000000000 +0200 @@ -49,7 +49,9 @@ STATIC void xfs_allocbt_log_keys(xfs_btr #define xfs_alloc_log_keys(c, b, i, j) \ xfs_allocbt_log_keys(c, b, i, j) STATIC void xfs_alloc_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_alloc_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); +STATIC void xfs_allocbt_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); +#define xfs_alloc_log_recs(c, b, i, j) \ + xfs_allocbt_log_recs(c, b, i, j) STATIC int xfs_alloc_lshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_alloc_newroot(xfs_btree_cur_t *, int *); STATIC int xfs_alloc_rshift(xfs_btree_cur_t *, int, int *); @@ -883,41 +885,6 @@ xfs_alloc_log_ptrs( } /* - * Log records from a btree block (leaf). 
- */ -STATIC void -xfs_alloc_log_recs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int rfirst, /* index of first record to log */ - int rlast) /* index of last record to log */ -{ - xfs_alloc_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_alloc_rec_t *rp; /* record pointer for btree block */ - - - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - rp = XFS_ALLOC_REC_ADDR(block, 1, cur); -#ifdef DEBUG - { - xfs_agf_t *agf; - xfs_alloc_rec_t *p; - - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - for (p = &rp[rfirst - 1]; p <= &rp[rlast - 1]; p++) - ASSERT(be32_to_cpu(p->ar_startblock) + - be32_to_cpu(p->ar_blockcount) <= - be32_to_cpu(agf->agf_length)); - } -#endif - first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); -} - -/* * Move 1 record left from cur/level if possible. * Update cur to reflect the new path. */ @@ -1642,83 +1609,50 @@ xfs_alloc_insert( return 0; } +STATIC struct xfs_btree_cur * +xfs_allocbt_dup_cursor( + struct xfs_btree_cur *cur) +{ + return xfs_allocbt_init_cursor(cur->bc_mp, cur->bc_tp, + cur->bc_private.a.agbp, cur->bc_private.a.agno, + cur->bc_btnum); +} + /* - * Update the record referred to by cur, to the value given by [bno, len]. - * This either works (return 0) or gets an EFSCORRUPTED error. 
+ * Update the longest extent in the AGF */ -int /* error */ -xfs_alloc_update( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t bno, /* starting block of extent */ - xfs_extlen_t len) /* length of extent */ +STATIC void +xfs_allocbt_update_lastrec( + struct xfs_btree_cur *cur, + struct xfs_btree_block *block, + union xfs_btree_rec *rec, + int ptr, + int reason) { - xfs_alloc_block_t *block; /* btree block to update */ - int error; /* error return value */ - int ptr; /* current record number (updating) */ + struct xfs_agf *agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); + xfs_agnumber_t seqno = be32_to_cpu(agf->agf_seqno); + __be32 len; - ASSERT(len > 0); - /* - * Pick up the a.g. freelist struct and the current block. - */ - block = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[0]); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, 0, cur->bc_bufs[0]))) - return error; -#endif - /* - * Get the address of the rec to be updated. - */ - ptr = cur->bc_ptrs[0]; - { - xfs_alloc_rec_t *rp; /* pointer to updated record */ + ASSERT(cur->bc_btnum == XFS_BTNUM_CNT); - rp = XFS_ALLOC_REC_ADDR(block, ptr, cur); + switch (reason) { + case LASTREC_UPDATE: /* - * Fill in the new contents and log them. + * If this is the last leaf block and it's the last record, + * then update the size of the longest extent in the AG. */ - rp->ar_startblock = cpu_to_be32(bno); - rp->ar_blockcount = cpu_to_be32(len); - xfs_alloc_log_recs(cur, cur->bc_bufs[0], ptr, ptr); - } - /* - * If it's the by-size btree and it's the last leaf block and - * it's the last record... then update the size of the longest - * extent in the a.g., which we cache in the a.g. freelist header. - */ - if (cur->bc_btnum == XFS_BTNUM_CNT && - be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK && - ptr == be16_to_cpu(block->bb_numrecs)) { - xfs_agf_t *agf; /* a.g. 
freespace header */ - xfs_agnumber_t seqno; - - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - seqno = be32_to_cpu(agf->agf_seqno); - cur->bc_mp->m_perag[seqno].pagf_longest = len; - agf->agf_longest = cpu_to_be32(len); - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_LONGEST); - } - /* - * Updating first record in leaf. Pass new key value up to our parent. - */ - if (ptr == 1) { - xfs_alloc_key_t key; /* key containing [bno, len] */ - - key.ar_startblock = cpu_to_be32(bno); - key.ar_blockcount = cpu_to_be32(len); - if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, 1))) - return error; + if (ptr != xfs_btree_get_numrecs(block)) + return; + len = rec->alloc.ar_blockcount; + break; + default: + ASSERT(0); + return; } - return 0; -} -STATIC struct xfs_btree_cur * -xfs_allocbt_dup_cursor( - struct xfs_btree_cur *cur) -{ - return xfs_allocbt_init_cursor(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agbp, cur->bc_private.a.agno, - cur->bc_btnum); + agf->agf_longest = len; + cur->bc_mp->m_perag[seqno].pagf_longest = be32_to_cpu(len); + xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, XFS_AGF_LONGEST); } STATIC void @@ -1805,6 +1739,17 @@ xfs_allocbt_copy_keys( memcpy(dst_key, src_key, numkeys * sizeof(xfs_alloc_key_t)); } +STATIC void +xfs_allocbt_copy_recs( + struct xfs_btree_cur *cur, + union xfs_btree_rec *src_rec, + union xfs_btree_rec *dst_rec, + int numrecs) +{ + ASSERT(numrecs >= 0); + memcpy(dst_rec, src_rec, numrecs * sizeof(xfs_alloc_rec_t)); +} + /* * Log keys from a btree block (nonleaf). 
*/ @@ -1831,6 +1776,54 @@ xfs_allocbt_log_keys( XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } +#ifdef DEBUG +STATIC void +xfs_alloc_log_recs_check( + struct xfs_btree_cur *cur, + xfs_alloc_rec_t *rp, + int first, + int last) +{ + struct xfs_agf *agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); + xfs_alloc_rec_t *p; + + for (p = &rp[first - 1]; p <= &rp[last - 1]; p++) { + ASSERT(be32_to_cpu(p->ar_startblock) + + be32_to_cpu(p->ar_blockcount) <= + be32_to_cpu(agf->agf_length)); + } +} +#else +#define xfs_alloc_log_recs_check(cur, bp, first, last) +#endif + +/* + * Log records from a btree block (leaf). + */ +STATIC void +xfs_alloc_log_recs( + struct xfs_btree_cur *cur, /* btree cursor */ + struct xfs_buf *bp, /* buffer containing btree block */ + int first, /* index of first record to log */ + int last) /* index of last record to log */ +{ + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_alloc_rec_t *rp = XFS_ALLOC_REC_ADDR(block, 1, cur); + char *baddr = (char *)block; + int start; /* first byte offset logged */ + int end; /* last byte offset logged */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, first, last); + + xfs_alloc_log_recs_check(cur, rp, first, last); + + start = (char *)&rp[first - 1] - baddr; + end = (char *)&rp[last] - 1 - baddr; + xfs_trans_log_buf(cur->bc_tp, bp, start, end); + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} #ifdef XFS_BTREE_TRACE @@ -1901,6 +1894,7 @@ xfs_allocbt_trace_record( static const struct xfs_btree_ops xfs_allocbt_ops = { .dup_cursor = xfs_allocbt_dup_cursor, + .update_lastrec = xfs_allocbt_update_lastrec, .init_key_from_rec = xfs_allocbt_init_key_from_rec, .init_ptr_from_cur = xfs_allocbt_init_ptr_from_cur, .ptr_addr = xfs_allocbt_ptr_addr, @@ -1908,7 +1902,9 @@ static const struct xfs_btree_ops xfs_al .rec_addr = xfs_allocbt_rec_addr, .key_diff = xfs_allocbt_key_diff, .copy_keys = xfs_allocbt_copy_keys, + .copy_recs = xfs_allocbt_copy_recs, .log_keys = xfs_allocbt_log_keys, + .log_recs = 
xfs_allocbt_log_recs, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_allocbt_trace_enter, @@ -1943,6 +1939,8 @@ xfs_allocbt_init_cursor( cur->bc_blocklog = mp->m_sb.sb_blocklog; cur->bc_ops = &xfs_allocbt_ops; + if (btnum == XFS_BTNUM_CNT) + cur->bc_flags = XFS_BTREE_LASTREC_UPDATE; cur->bc_private.a.agbp = agbp; cur->bc_private.a.agno = agno; Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 02:55:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 02:57:08.000000000 +0200 @@ -191,6 +191,12 @@ struct xfs_btree_ops { /* get inode rooted btree root */ struct xfs_btree_block *(*get_root_from_inode)(struct xfs_btree_cur *); + /* updated last record information */ + void (*update_lastrec)(struct xfs_btree_cur *cur, + struct xfs_btree_block *block, + union xfs_btree_rec *rec, + int ptr, int reason); + /* return address of btree structures */ union xfs_btree_ptr *(*ptr_addr)(struct xfs_btree_cur *cur, int index, struct xfs_btree_block *block); @@ -214,11 +220,15 @@ struct xfs_btree_ops { void (*copy_keys)(struct xfs_btree_cur *cur, union xfs_btree_key *src_key, union xfs_btree_key *dst_key, int numkeys); + void (*copy_recs)(struct xfs_btree_cur *cur, + union xfs_btree_rec *src_rec, + union xfs_btree_rec *dst_rec, int numkeys); /* log changes to btree structures */ void (*log_keys)(struct xfs_btree_cur *cur, struct xfs_buf *bp, int first, int last); - + void (*log_recs)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + int first, int last); /* btree tracing */ #ifdef XFS_BTREE_TRACE @@ -241,6 +251,12 @@ struct xfs_btree_ops { }; /* + * Reasons for the update_lastrec method to be called. + */ +#define LASTREC_UPDATE 0 + + +/* * Btree cursor structure. * This collects all information needed by the btree code in one place. 
*/ @@ -284,6 +300,7 @@ typedef struct xfs_btree_cur /* cursor flags */ #define XFS_BTREE_LONG_PTRS (1<<0) /* pointers are 64bits long */ #define XFS_BTREE_ROOT_IN_INODE (1<<1) /* root may be variable size */ +#define XFS_BTREE_LASTREC_UPDATE (1<<2) /* track last rec externally */ #define XFS_BTREE_NOERROR 0 @@ -539,6 +556,7 @@ int xfs_btree_increment(struct xfs_btree int xfs_btree_decrement(struct xfs_btree_cur *, int, int *); int xfs_btree_lookup(struct xfs_btree_cur *, xfs_lookup_t, int *); int xfs_btree_updkey(struct xfs_btree_cur *, union xfs_btree_key *, int); +int xfs_btree_update(struct xfs_btree_cur *, union xfs_btree_rec *); /* Index: linux-2.6-xfs/fs/xfs/xfs_alloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-08-02 02:53:26.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 2008-08-02 02:57:04.000000000 +0200 @@ -136,6 +136,23 @@ xfs_alloc_lookup_le( return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); } +/* + * Update the record referred to by cur to the value given + * by [bno, len]. + * This either works (return 0) or gets an EFSCORRUPTED error. + */ +STATIC int /* error */ +xfs_alloc_update( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agblock_t bno, /* starting block of extent */ + xfs_extlen_t len) /* length of extent */ +{ + union xfs_btree_rec rec; + + rec.alloc.ar_startblock = cpu_to_be32(bno); + rec.alloc.ar_blockcount = cpu_to_be32(len); + return xfs_btree_update(cur, &rec); +} /* * Compute aligned version of the found extent. 
Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.h 2008-08-02 02:53:26.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h 2008-08-02 02:57:04.000000000 +0200 @@ -113,13 +113,6 @@ extern int xfs_alloc_get_rec(struct xfs_ */ extern int xfs_alloc_insert(struct xfs_btree_cur *cur, int *stat); -/* - * Update the record referred to by cur, to the value given by [bno, len]. - * This either works (return 0) or gets an EFSCORRUPTED error. - */ -extern int xfs_alloc_update(struct xfs_btree_cur *cur, xfs_agblock_t bno, - xfs_extlen_t len); - extern struct xfs_btree_cur *xfs_allocbt_init_cursor(struct xfs_mount *, struct xfs_trans *, struct xfs_buf *, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.c 2008-08-02 02:53:26.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.c 2008-08-02 02:57:04.000000000 +0200 @@ -172,6 +172,26 @@ xfs_inobt_lookup_le( } /* + * Update the record referred to by cur to the value given + * by [ino, fcnt, free]. + * This either works (return 0) or gets an EFSCORRUPTED error. + */ +STATIC int /* error */ +xfs_inobt_update( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agino_t ino, /* starting inode of chunk */ + __int32_t fcnt, /* free inode count */ + xfs_inofree_t free) /* free inode mask */ +{ + union xfs_btree_rec rec; + + rec.inobt.ir_startino = cpu_to_be32(ino); + rec.inobt.ir_freecount = cpu_to_be32(fcnt); + rec.inobt.ir_free = cpu_to_be64(free); + return xfs_btree_update(cur, &rec); +} + +/* * Allocate new inodes in the allocation group specified by agbp. * Return 0 for success, else error code. 
*/ Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-02 02:55:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-02 02:58:27.000000000 +0200 @@ -785,28 +785,6 @@ xfs_inobt_log_ptrs( } /* - * Log records from a btree block (leaf). - */ -STATIC void -xfs_inobt_log_recs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int rfirst, /* index of first record to log */ - int rlast) /* index of last record to log */ -{ - xfs_inobt_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_inobt_rec_t *rp; /* record pointer for btree block */ - - block = XFS_BUF_TO_INOBT_BLOCK(bp); - rp = XFS_INOBT_REC_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); -} - -/* * Move 1 record left from cur/level if possible. * Update cur to reflect the new path. */ @@ -1530,58 +1508,6 @@ xfs_inobt_insert( return 0; } -/* - * Update the record referred to by cur, to the value given - * by [ino, fcnt, free]. - * This either works (return 0) or gets an EFSCORRUPTED error. - */ -int /* error */ -xfs_inobt_update( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agino_t ino, /* starting inode of chunk */ - __int32_t fcnt, /* free inode count */ - xfs_inofree_t free) /* free inode mask */ -{ - xfs_inobt_block_t *block; /* btree block to update */ - xfs_buf_t *bp; /* buffer containing btree block */ - int error; /* error return value */ - int ptr; /* current record number (updating) */ - xfs_inobt_rec_t *rp; /* pointer to updated record */ - - /* - * Pick up the current block. 
- */ - bp = cur->bc_bufs[0]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, 0, bp))) - return error; -#endif - /* - * Get the address of the rec to be updated. - */ - ptr = cur->bc_ptrs[0]; - rp = XFS_INOBT_REC_ADDR(block, ptr, cur); - /* - * Fill in the new contents and log them. - */ - rp->ir_startino = cpu_to_be32(ino); - rp->ir_freecount = cpu_to_be32(fcnt); - rp->ir_free = cpu_to_be64(free); - xfs_inobt_log_recs(cur, bp, ptr, ptr); - /* - * Updating first record in leaf. Pass new key value up to our parent. - */ - if (ptr == 1) { - xfs_inobt_key_t key; /* key containing [ino] */ - - key.ir_startino = cpu_to_be32(ino); - if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, 1))) - return error; - } - return 0; -} - STATIC struct xfs_btree_cur * xfs_inobt_dup_cursor( struct xfs_btree_cur *cur) @@ -1661,6 +1587,17 @@ xfs_inobt_copy_keys( memcpy(dst_key, src_key, numkeys * sizeof(xfs_inobt_key_t)); } +STATIC void +xfs_inobt_copy_recs( + struct xfs_btree_cur *cur, + union xfs_btree_rec *src_rec, + union xfs_btree_rec *dst_rec, + int numrecs) +{ + ASSERT(numrecs >= 0); + memcpy(dst_rec, src_rec, numrecs * sizeof(xfs_inobt_rec_t)); +} + /* * Log keys from a btree block (nonleaf). */ @@ -1687,6 +1624,33 @@ xfs_inobt_log_keys( XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } +/* + * Log records from a btree block (leaf). 
+ */ +STATIC void +xfs_inobt_log_recs( + struct xfs_btree_cur *cur, /* btree cursor */ + struct xfs_buf *bp, /* buffer containing btree block */ + int first, /* index of first record to log */ + int last) /* index of last record to log */ +{ + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_inobt_rec_t *rp = XFS_INOBT_REC_ADDR(block, 1, cur); + char *baddr = (char *)block; + int start; /* first byte offset logged */ + int end; /* last byte offset logged */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, first, last); + + start = (char *)&rp[first - 1] - baddr; + end = (char *)&rp[last] - 1 - baddr; + xfs_trans_log_buf(cur->bc_tp, bp, start, end); + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} + + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_inobt_trace_buf; @@ -1763,7 +1727,9 @@ static const struct xfs_btree_ops xfs_in .rec_addr = xfs_inobt_rec_addr, .key_diff = xfs_inobt_key_diff, .copy_keys = xfs_inobt_copy_keys, + .copy_recs = xfs_inobt_copy_recs, .log_keys = xfs_inobt_log_keys, + .log_recs = xfs_inobt_log_recs, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_inobt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.h 2008-08-02 02:53:26.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h 2008-08-02 02:57:04.000000000 +0200 @@ -135,14 +135,6 @@ extern int xfs_inobt_get_rec(struct xfs_ */ extern int xfs_inobt_insert(struct xfs_btree_cur *cur, int *stat); -/* - * Update the record referred to by cur, to the value given - * by [ino, fcnt, free]. - * This either works (return 0) or gets an EFSCORRUPTED error. 
- */ -extern int xfs_inobt_update(struct xfs_btree_cur *cur, xfs_agino_t ino, - __int32_t fcnt, xfs_inofree_t free); - extern struct xfs_btree_cur *xfs_inobt_init_cursor(struct xfs_mount *, struct xfs_trans *, struct xfs_buf *, xfs_agnumber_t); Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c 2008-08-02 02:53:26.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c 2008-08-02 02:57:04.000000000 +0200 @@ -430,6 +430,24 @@ xfs_bmbt_lookup_ge( return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); } +/* +* Update the record referred to by cur to the value given + * by [off, bno, len, state]. + * This either works (return 0) or gets an EFSCORRUPTED error. + */ +STATIC int +xfs_bmbt_update( + struct xfs_btree_cur *cur, + xfs_fileoff_t off, + xfs_fsblock_t bno, + xfs_filblks_t len, + xfs_exntst_t state) +{ + union xfs_btree_rec rec; + + xfs_bmbt_disk_set_allf(&rec.bmbt, off, bno, len, state); + return xfs_btree_update(cur, &rec); +} /* * Called from xfs_bmap_add_attrfork to handle btree format files. Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 02:55:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 02:57:36.000000000 +0200 @@ -1567,34 +1567,6 @@ xfs_bmbt_log_block( } /* - * Log record values from the btree block. 
- */ -void -xfs_bmbt_log_recs( - xfs_btree_cur_t *cur, - xfs_buf_t *bp, - int rfirst, - int rlast) -{ - xfs_bmbt_block_t *block; - int first; - int last; - xfs_bmbt_rec_t *rp; - xfs_trans_t *tp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGBII(cur, bp, rfirst, rlast); - ASSERT(bp); - tp = cur->bc_tp; - block = XFS_BUF_TO_BMBT_BLOCK(bp); - rp = XFS_BMAP_REC_DADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(tp, bp, first, last); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); -} - -/* * Give the bmap btree a new root block. Copy the old broot contents * down into a real block and make the broot point to it. */ @@ -1928,51 +1900,6 @@ xfs_bmbt_to_bmdr( } /* - * Update the record to the passed values. - */ -int -xfs_bmbt_update( - xfs_btree_cur_t *cur, - xfs_fileoff_t off, - xfs_fsblock_t bno, - xfs_filblks_t len, - xfs_exntst_t state) -{ - xfs_bmbt_block_t *block; - xfs_buf_t *bp; - int error; - xfs_bmbt_key_t key; - int ptr; - xfs_bmbt_rec_t *rp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGFFFI(cur, (xfs_dfiloff_t)off, (xfs_dfsbno_t)bno, - (xfs_dfilblks_t)len, (int)state); - block = xfs_bmbt_get_block(cur, 0, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, 0, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - ptr = cur->bc_ptrs[0]; - rp = XFS_BMAP_REC_IADDR(block, ptr, cur); - xfs_bmbt_disk_set_allf(rp, off, bno, len, state); - xfs_bmbt_log_recs(cur, bp, ptr, ptr); - if (ptr > 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - return 0; - } - key.br_startoff = cpu_to_be64(off); - if ((error = xfs_btree_updkey(cur, (union xfs_btree_key *)&key, 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - return 0; -} - -/* * Check extent records, which have just been read, for * any bit in the extent flag field. 
ASSERT on debug * kernels, as this condition should not occur. @@ -2094,6 +2021,17 @@ xfs_bmbt_copy_keys( memcpy(dst_key, src_key, numkeys * sizeof(xfs_bmbt_key_t)); } +STATIC void +xfs_bmbt_copy_recs( + struct xfs_btree_cur *cur, + union xfs_btree_rec *src_rec, + union xfs_btree_rec *dst_rec, + int numrecs) +{ + ASSERT(numrecs >= 0); + memcpy(dst_rec, src_rec, numrecs * sizeof(xfs_bmbt_rec_t)); +} + /* * Log key values from the btree block. */ @@ -2125,6 +2063,32 @@ xfs_bmbt_log_keys( XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } +/* + * Log record values from the btree block. + */ +void +xfs_bmbt_log_recs( + struct xfs_btree_cur *cur, + struct xfs_buf *bp, + int first, + int last) +{ + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_bmbt_rec_t *rp = XFS_BMAP_REC_DADDR(block, 1, cur); + char *baddr = (char *)block; + int start; + int end; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, first, last); + + start = (char *)&rp[first - 1] - baddr; + end = (char *)&rp[last] - 1 - baddr; + xfs_trans_log_buf(cur->bc_tp, bp, start, end); + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_bmbt_trace_buf; @@ -2221,7 +2185,9 @@ static const struct xfs_btree_ops xfs_bm .rec_addr = xfs_bmbt_rec_addr, .key_diff = xfs_bmbt_key_diff, .copy_keys = xfs_bmbt_copy_keys, + .copy_recs = xfs_bmbt_copy_recs, .log_keys = xfs_bmbt_log_keys, + .log_recs = xfs_bmbt_log_recs, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_bmbt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-08-02 02:53:26.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-08-02 02:57:04.000000000 +0200 @@ -274,8 +274,6 @@ extern void xfs_bmbt_disk_set_allf(xfs_b xfs_fsblock_t b, xfs_filblks_t c, xfs_exntst_t v); extern void xfs_bmbt_to_bmdr(xfs_bmbt_block_t *, int, xfs_bmdr_block_t *, int); -extern int 
xfs_bmbt_update(struct xfs_btree_cur *, xfs_fileoff_t, - xfs_fsblock_t, xfs_filblks_t, xfs_exntst_t); extern struct xfs_btree_cur *xfs_bmbt_init_cursor(struct xfs_mount *, struct xfs_trans *, struct xfs_inode *, int); -- From owner-xfs@oss.sgi.com Sun Aug 3 18:31:30 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:31:32 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741VU0U030453 for ; Sun, 3 Aug 2008 18:31:30 -0700 X-ASG-Debug-ID: 1217813562-739901a30000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3E8693566B0 for ; Sun, 3 Aug 2008 18:32:42 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 7QWqzMbP0KQFphDC for ; Sun, 03 Aug 2008 18:32:42 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741WiIF008922 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:32:44 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741WidF008920 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:32:44 +0200 Date: Mon, 4 Aug 2008 03:32:44 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 04/26] make btree root in inode support generic Subject: [PATCH 04/26] make btree root in inode support generic Message-ID: <20080804013244.GE8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-btree-generic-inode-root-support User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 
X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813564 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17329 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs The bmap btree is rooted in the inode and not in a single disk block. Make the support for this feature more generic by: (a) adding a btree flag to check for instead of the XFS_BTNUM_BMAP type, and (b) adding a get_root_from_inode btree operation to get the root btree block from the inode, with the implementation details left to the individual btree. Also clean up xfs_btree_get_block, which is the place where these two features are used. The XFS_BTREE_ROOT_IN_INODE flag is based upon a patch from Dave Chinner. 
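For readers who want the shape of this change without the kernel headers, here is a minimal userspace sketch of the same dispatch pattern: a cursor flag gates the special case, and an ops callback hides where the root actually lives. Every demo_* name is hypothetical and not part of XFS. As a simplifying assumption, a "buffer" here is just a pointer to the block itself, whereas the real code derives the block from an xfs_buf.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXLEVELS		8
#define DEMO_ROOT_IN_INODE	(1u << 0)	/* hypothetical cursor flag */

struct demo_block {
	int			numrecs;
};

struct demo_cur;

/* Per-btree operations; only the inode-root hook is modelled here. */
struct demo_ops {
	struct demo_block	*(*get_root_from_inode)(struct demo_cur *cur);
};

struct demo_cur {
	const struct demo_ops	*ops;
	unsigned int		flags;
	int			nlevels;
	struct demo_block	*bufs[DEMO_MAXLEVELS];	/* stand-in for per-level buffers */
	struct demo_block	*inode_root;		/* stand-in for the in-inode root */
};

/* One btree type's implementation of the hook. */
static struct demo_block *
demo_inode_root(struct demo_cur *cur)
{
	return cur->inode_root;
}

static const struct demo_ops demo_inode_rooted_ops = {
	.get_root_from_inode	= demo_inode_root,
};

/*
 * Generic lookup: the root level of an inode-rooted tree has no buffer
 * and is fetched through the callback; every other level comes from the
 * cursor's buffer array.
 */
static struct demo_block *
demo_get_block(struct demo_cur *cur, int level, struct demo_block **bufp)
{
	if ((cur->flags & DEMO_ROOT_IN_INODE) &&
	    level == cur->nlevels - 1) {
		*bufp = NULL;
		return cur->ops->get_root_from_inode(cur);
	}

	*bufp = cur->bufs[level];
	return *bufp;
}
```

The point of the split is that the generic code only tests a feature flag and never needs to know which btree type it is walking; a second inode-rooted btree would only have to set the flag and supply its own callback.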
Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 04:05:37.000000000 +0200 @@ -2630,8 +2630,19 @@ xfs_bmbt_dup_cursor( return new; } +STATIC struct xfs_btree_block * +xfs_bmbt_get_root_from_inode( + struct xfs_btree_cur *cur) +{ + struct xfs_ifork *ifp; + + ifp = XFS_IFORK_PTR(cur->bc_private.b.ip, cur->bc_private.b.whichfork); + return (struct xfs_btree_block *)ifp->if_broot; +} + static const struct xfs_btree_ops xfs_bmbt_ops = { .dup_cursor = xfs_bmbt_dup_cursor, + .get_root_from_inode = xfs_bmbt_get_root_from_inode, }; /* @@ -2656,6 +2667,7 @@ xfs_bmbt_init_cursor( cur->bc_blocklog = mp->m_sb.sb_blocklog; cur->bc_ops = &xfs_bmbt_ops; + cur->bc_flags = XFS_BTREE_ROOT_IN_INODE; cur->bc_private.b.forksize = XFS_IFORK_SIZE(ip, whichfork); cur->bc_private.b.ip = ip; Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 04:05:37.000000000 +0200 @@ -425,29 +425,20 @@ xfs_btree_dup_cursor( * Retrieve the block pointer from the cursor at the given level. * This may be a bmap btree root or from a buffer. 
*/ -STATIC xfs_btree_block_t * /* generic btree block pointer */ +STATIC struct xfs_btree_block * /* generic btree block pointer */ xfs_btree_get_block( - xfs_btree_cur_t *cur, /* btree cursor */ + struct xfs_btree_cur *cur, /* btree cursor */ int level, /* level in btree */ - xfs_buf_t **bpp) /* buffer containing the block */ + struct xfs_buf **bpp) /* buffer containing the block */ { - xfs_btree_block_t *block; /* return value */ - xfs_buf_t *bp; /* return buffer */ - xfs_ifork_t *ifp; /* inode fork pointer */ - int whichfork; /* data or attr fork */ - - if (cur->bc_btnum == XFS_BTNUM_BMAP && level == cur->bc_nlevels - 1) { - whichfork = cur->bc_private.b.whichfork; - ifp = XFS_IFORK_PTR(cur->bc_private.b.ip, whichfork); - block = (xfs_btree_block_t *)ifp->if_broot; - bp = NULL; - } else { - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_BLOCK(bp); + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (level == cur->bc_nlevels - 1)) { + *bpp = NULL; + return cur->bc_ops->get_root_from_inode(cur); } - ASSERT(block != NULL); - *bpp = bp; - return block; + + *bpp = cur->bc_bufs[level]; + return XFS_BUF_TO_BLOCK(*bpp); } /* Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 04:05:36.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 04:05:37.000000000 +0200 @@ -159,6 +159,9 @@ extern const __uint32_t xfs_magics[]; struct xfs_btree_ops { /* cursor operations */ struct xfs_btree_cur *(*dup_cursor)(struct xfs_btree_cur *); + + /* get inode rooted btree root */ + struct xfs_btree_block *(*get_root_from_inode)(struct xfs_btree_cur *); }; /* @@ -170,6 +173,7 @@ typedef struct xfs_btree_cur struct xfs_trans *bc_tp; /* transaction we're in, if any */ struct xfs_mount *bc_mp; /* file system mount struct */ const struct xfs_btree_ops *bc_ops; + uint bc_flags; /* btree features - below */ union { xfs_alloc_rec_incore_t a; xfs_bmbt_irec_t b; @@ -201,6 +205,10 
@@ typedef struct xfs_btree_cur } bc_private; /* per-btree type data */ } xfs_btree_cur_t; +/* cursor flags */ +#define XFS_BTREE_ROOT_IN_INODE (1<<1) /* root may be variable size */ + + #define XFS_BTREE_NOERROR 0 #define XFS_BTREE_ERROR 1 -- From owner-xfs@oss.sgi.com Sun Aug 3 18:32:13 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:32:17 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741WD2S031032 for ; Sun, 3 Aug 2008 18:32:13 -0700 X-ASG-Debug-ID: 1217813605-5a0302f50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 866A23566BC for ; Sun, 3 Aug 2008 18:33:26 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 8FXtPsBQXAztlsCN for ; Sun, 03 Aug 2008 18:33:26 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741XRIF009018 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Mon, 4 Aug 2008 03:33:27 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741XR6c009016; Mon, 4 Aug 2008 03:33:27 +0200 Date: Mon, 4 Aug 2008 03:33:27 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Dave Chinner X-ASG-Orig-Subj: [PATCH 08/26] add new btree statistics Subject: [PATCH 08/26] add new btree statistics Message-ID: <20080804013327.GI8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-btree-stats User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] 
X-Barracuda-Start-Time: 1217813607 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.92 X-Barracuda-Spam-Status: No, SCORE=-0.92 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_SA081 X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 1.10 BSF_SC0_SA081 Custom Rule SA081 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17333 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs From: Dave Chinner Introduce statistics coverage of all the btrees and cover all the btree operations, not just some. Invaluable for determining test code coverage of all the btree operations.... 
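As a rough illustration of how per-btree counters like these can be updated from generic code, the sketch below uses preprocessor token pasting to build the counter name from a btree-type prefix and a stat name, and a switch on the cursor's btree number to pick the prefix. It is a userspace approximation under stated assumptions: the demo_* and DEMO_* names are hypothetical, only two counters per tree are modelled, and a plain global struct stands in for the kernel's statistics storage.

```c
#include <assert.h>

enum demo_btnum { DEMO_BNO, DEMO_CNT, DEMO_BMAP, DEMO_INO };

/* Two counters per btree type, named <prefix>_2_<stat> as in the patch. */
struct demo_stats {
	unsigned int	xs_abtb_2_lookup, xs_abtb_2_moves;
	unsigned int	xs_abtc_2_lookup, xs_abtc_2_moves;
	unsigned int	xs_bmbt_2_lookup, xs_bmbt_2_moves;
	unsigned int	xs_ibt_2_lookup,  xs_ibt_2_moves;
};

static struct demo_stats demo_stats;

struct demo_stat_cur {
	enum demo_btnum	btnum;
};

/* Paste prefix and stat name into a field name, then add to it. */
#define DEMO_STATS_ADD_TYPE(type, stat, val) \
	(demo_stats.xs_ ## type ## _2_ ## stat += (val))

/* Select the prefix from the cursor's btree type at run time. */
#define DEMO_STATS_ADD(cur, stat, val) \
do { \
	switch ((cur)->btnum) { \
	case DEMO_BNO:  DEMO_STATS_ADD_TYPE(abtb, stat, val); break; \
	case DEMO_CNT:  DEMO_STATS_ADD_TYPE(abtc, stat, val); break; \
	case DEMO_BMAP: DEMO_STATS_ADD_TYPE(bmbt, stat, val); break; \
	case DEMO_INO:  DEMO_STATS_ADD_TYPE(ibt, stat, val); break; \
	} \
} while (0)

#define DEMO_STATS_INC(cur, stat)	DEMO_STATS_ADD(cur, stat, 1)
```

A call such as DEMO_STATS_INC(cur, lookup) therefore expands to a four-way switch in which each arm bumps a different field, so generic btree code can account an operation to the right tree without knowing its type at compile time.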
Signed-off-by: Dave Chinner Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_stats.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_stats.c 2008-07-14 17:24:05.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_stats.c 2008-07-14 17:29:17.000000000 +0200 @@ -53,6 +53,10 @@ xfs_read_xfsstats( { "icluster", XFSSTAT_END_INODE_CLUSTER }, { "vnodes", XFSSTAT_END_VNODE_OPS }, { "buf", XFSSTAT_END_BUF }, + { "abtb2", XFSSTAT_END_ABTB_V2 }, + { "abtc2", XFSSTAT_END_ABTC_V2 }, + { "bmbt2", XFSSTAT_END_BMBT_V2 }, + { "ibt2", XFSSTAT_END_IBT_V2 }, }; /* Loop over all stats groups */ Index: linux-2.6-xfs/fs/xfs/linux-2.6/xfs_stats.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/linux-2.6/xfs_stats.h 2008-07-14 17:24:05.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/linux-2.6/xfs_stats.h 2008-07-14 17:29:17.000000000 +0200 @@ -118,6 +118,71 @@ struct xfsstats { __uint32_t xb_page_retries; __uint32_t xb_page_found; __uint32_t xb_get_read; +/* Version 2 btree counters */ +#define XFSSTAT_END_ABTB_V2 (XFSSTAT_END_BUF+15) + __uint32_t xs_abtb_2_lookup; + __uint32_t xs_abtb_2_compare; + __uint32_t xs_abtb_2_insrec; + __uint32_t xs_abtb_2_delrec; + __uint32_t xs_abtb_2_newroot; + __uint32_t xs_abtb_2_killroot; + __uint32_t xs_abtb_2_increment; + __uint32_t xs_abtb_2_decrement; + __uint32_t xs_abtb_2_lshift; + __uint32_t xs_abtb_2_rshift; + __uint32_t xs_abtb_2_split; + __uint32_t xs_abtb_2_join; + __uint32_t xs_abtb_2_alloc; + __uint32_t xs_abtb_2_free; + __uint32_t xs_abtb_2_moves; +#define XFSSTAT_END_ABTC_V2 (XFSSTAT_END_ABTB_V2+15) + __uint32_t xs_abtc_2_lookup; + __uint32_t xs_abtc_2_compare; + __uint32_t xs_abtc_2_insrec; + __uint32_t xs_abtc_2_delrec; + __uint32_t xs_abtc_2_newroot; + __uint32_t xs_abtc_2_killroot; + __uint32_t xs_abtc_2_increment; + __uint32_t xs_abtc_2_decrement; + __uint32_t xs_abtc_2_lshift; + __uint32_t 
xs_abtc_2_rshift; + __uint32_t xs_abtc_2_split; + __uint32_t xs_abtc_2_join; + __uint32_t xs_abtc_2_alloc; + __uint32_t xs_abtc_2_free; + __uint32_t xs_abtc_2_moves; +#define XFSSTAT_END_BMBT_V2 (XFSSTAT_END_ABTC_V2+15) + __uint32_t xs_bmbt_2_lookup; + __uint32_t xs_bmbt_2_compare; + __uint32_t xs_bmbt_2_insrec; + __uint32_t xs_bmbt_2_delrec; + __uint32_t xs_bmbt_2_newroot; + __uint32_t xs_bmbt_2_killroot; + __uint32_t xs_bmbt_2_increment; + __uint32_t xs_bmbt_2_decrement; + __uint32_t xs_bmbt_2_lshift; + __uint32_t xs_bmbt_2_rshift; + __uint32_t xs_bmbt_2_split; + __uint32_t xs_bmbt_2_join; + __uint32_t xs_bmbt_2_alloc; + __uint32_t xs_bmbt_2_free; + __uint32_t xs_bmbt_2_moves; +#define XFSSTAT_END_IBT_V2 (XFSSTAT_END_BMBT_V2+15) + __uint32_t xs_ibt_2_lookup; + __uint32_t xs_ibt_2_compare; + __uint32_t xs_ibt_2_insrec; + __uint32_t xs_ibt_2_delrec; + __uint32_t xs_ibt_2_newroot; + __uint32_t xs_ibt_2_killroot; + __uint32_t xs_ibt_2_increment; + __uint32_t xs_ibt_2_decrement; + __uint32_t xs_ibt_2_lshift; + __uint32_t xs_ibt_2_rshift; + __uint32_t xs_ibt_2_split; + __uint32_t xs_ibt_2_join; + __uint32_t xs_ibt_2_alloc; + __uint32_t xs_ibt_2_free; + __uint32_t xs_ibt_2_moves; /* Extra precision counters */ __uint64_t xs_xstrat_bytes; __uint64_t xs_write_bytes; Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-07-14 17:28:45.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-07-14 17:29:17.000000000 +0200 @@ -126,6 +126,34 @@ union xfs_btree_rec { extern const __uint32_t xfs_magics[]; /* + * Generic stats interface + */ +#define __XFS_BTREE_STATS_INC(type, stat) \ + XFS_STATS_INC(xs_ ## type ## _2_ ## stat) +#define XFS_BTREE_STATS_INC(cur, stat) \ +do { \ + switch (cur->bc_btnum) { \ + case XFS_BTNUM_BNO: __XFS_BTREE_STATS_INC(abtb, stat); break; \ + case XFS_BTNUM_CNT: __XFS_BTREE_STATS_INC(abtc, stat); break; \ + case XFS_BTNUM_BMAP: 
__XFS_BTREE_STATS_INC(bmbt, stat); break; \ + case XFS_BTNUM_INO: __XFS_BTREE_STATS_INC(ibt, stat); break; \ + case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break; \ + } \ +} while (0) + +#define __XFS_BTREE_STATS_ADD(type, stat, val) \ + XFS_STATS_ADD(xs_ ## type ## _2_ ## stat, val) +#define XFS_BTREE_STATS_ADD(cur, stat, val) \ +do { \ + switch (cur->bc_btnum) { \ + case XFS_BTNUM_BNO: __XFS_BTREE_STATS_ADD(abtb, stat, val); break; \ + case XFS_BTNUM_CNT: __XFS_BTREE_STATS_ADD(abtc, stat, val); break; \ + case XFS_BTNUM_BMAP: __XFS_BTREE_STATS_ADD(bmbt, stat, val); break; \ + case XFS_BTNUM_INO: __XFS_BTREE_STATS_ADD(ibt, stat, val); break; \ + case XFS_BTNUM_MAX: ASSERT(0); /* fucking gcc */ ; break; \ + } \ +} while (0) +/* * Maximum and minimum records in a btree block. * Given block size, type prefix, and leaf flag (0 or 1). * The divisor below is equivalent to lf ? (e1) : (e2) but that produces -- From owner-xfs@oss.sgi.com Sun Aug 3 18:32:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:32:39 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_61, J_CHICKENPOX_62,J_CHICKENPOX_63,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741WVZB031325 for ; Sun, 3 Aug 2008 18:32:31 -0700 X-ASG-Debug-ID: 1217813622-5a0202c50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1257B3566BE for ; Sun, 3 Aug 2008 18:33:43 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id kS6le3E1qqkxPsxS for ; Sun, 03 Aug 2008 18:33:43 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de 
(8.12.3/8.12.3/Debian-7.1) with ESMTP id m741XiIF009061 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:33:44 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741XhAn009059 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:33:43 +0200 Date: Mon, 4 Aug 2008 03:33:43 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 10/26] implement generic xfs_btree_increment Subject: [PATCH 10/26] implement generic xfs_btree_increment Message-ID: <20080804013343.GK8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-increment User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813625 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.22 X-Barracuda-Spam-Status: No, SCORE=-1.22 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_MJ615, MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words 0.20 BSF_SC0_MJ615 Custom Rule MJ615 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17335 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs From: Dave Chinner Because this is the first major generic btree routine, this patch includes some infrastructure: firstly, a few routines to deal with a btree block that can be either in short or long form, and secondly xfs_btree_read_buf_block, which is the new central routine to read a btree
block given a cursor. [hch: split out from bigger patch and minor adaptions] Signed-off-by: Christoph Hellwig Notes: - xfs_btree_read_buf_block now includes a xfs_btree_check_block, removed duplicate call Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 04:05:55.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 04:06:20.000000000 +0200 @@ -186,6 +186,10 @@ struct xfs_btree_ops { /* get inode rooted btree root */ struct xfs_btree_block *(*get_root_from_inode)(struct xfs_btree_cur *); + /* return address of btree structures */ + union xfs_btree_ptr *(*ptr_addr)(struct xfs_btree_cur *cur, int index, + struct xfs_btree_block *block); + /* btree tracing */ #ifdef XFS_BTREE_TRACE void (*trace_enter)(struct xfs_btree_cur *, const char *, @@ -497,6 +501,23 @@ xfs_btree_setbuf( int lev, /* level in btree */ struct xfs_buf *bp); /* new buffer to set */ +/* + * Common btree core entry points. + */ + +int xfs_btree_increment(struct xfs_btree_cur *, int, int *); + + +/* + * Helpers. 
+ */ + +static inline int +xfs_btree_get_numrecs(struct xfs_btree_block *block) +{ + return be16_to_cpu(block->bb_numrecs); +} + #endif /* __KERNEL__ */ Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 04:05:50.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 04:05:57.000000000 +0200 @@ -35,6 +35,7 @@ #include "xfs_dinode.h" #include "xfs_inode.h" #include "xfs_btree.h" +#include "xfs_btree_trace.h" #include "xfs_ialloc.h" #include "xfs_error.h" @@ -829,3 +830,224 @@ xfs_btree_setbuf( cur->bc_ra[lev] |= XFS_BTCUR_RIGHTRA; } } + +STATIC int +xfs_btree_ptr_is_null( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) + return be64_to_cpu(ptr->l) == NULLFSBLOCK; + else + return be32_to_cpu(ptr->s) == NULLAGBLOCK; +} + +/* + * Get/set/init sibling pointers + */ +STATIC void +xfs_btree_get_sibling( + struct xfs_btree_cur *cur, + struct xfs_btree_block *block, + union xfs_btree_ptr *ptr, + int lr) +{ + ASSERT(lr == XFS_BB_LEFTSIB || lr == XFS_BB_RIGHTSIB); + + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + if (lr == XFS_BB_RIGHTSIB) + ptr->l = block->bb_u.l.bb_rightsib; + else + ptr->l = block->bb_u.l.bb_leftsib; + } else { + if (lr == XFS_BB_RIGHTSIB) + ptr->s = block->bb_u.s.bb_rightsib; + else + ptr->s = block->bb_u.s.bb_leftsib; + } +} + +STATIC xfs_daddr_t +xfs_btree_ptr_to_daddr( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + ASSERT(be64_to_cpu(ptr->l) != NULLFSBLOCK); + + return XFS_FSB_TO_DADDR(cur->bc_mp, be64_to_cpu(ptr->l)); + } else { + ASSERT(cur->bc_private.a.agno != NULLAGNUMBER); + ASSERT(be32_to_cpu(ptr->s) != NULLAGBLOCK); + + return XFS_AGB_TO_DADDR(cur->bc_mp, cur->bc_private.a.agno, + be32_to_cpu(ptr->s)); + } +} + +STATIC void +xfs_btree_set_refs( + struct xfs_btree_cur *cur, + struct xfs_buf *bp) +{ + switch 
(cur->bc_btnum) { + case XFS_BTNUM_BNO: + case XFS_BTNUM_CNT: + XFS_BUF_SET_VTYPE_REF(bp, B_FS_MAP, XFS_ALLOC_BTREE_REF); + break; + case XFS_BTNUM_INO: + XFS_BUF_SET_VTYPE_REF(bp, B_FS_INOMAP, XFS_INO_BTREE_REF); + break; + case XFS_BTNUM_BMAP: + XFS_BUF_SET_VTYPE_REF(bp, B_FS_MAP, XFS_BMAP_BTREE_REF); + break; + default: + ASSERT(0); + } +} + +/* + * Read in the buffer at the given ptr and return the buffer and + * the block pointer within the buffer. + */ +STATIC int +xfs_btree_read_buf_block( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr, + int level, + int flags, + struct xfs_btree_block **block, + struct xfs_buf **bpp) +{ + struct xfs_mount *mp = cur->bc_mp; + xfs_daddr_t d; + int error; + + /* need to sort out how callers deal with failures first */ + ASSERT(!(flags & XFS_BUF_TRYLOCK)); + + d = xfs_btree_ptr_to_daddr(cur, ptr); + error = xfs_trans_read_buf(mp, cur->bc_tp, mp->m_ddev_targp, d, + mp->m_bsize, flags, bpp); + if (error) + return error; + + ASSERT(*bpp != NULL); + ASSERT(!XFS_BUF_GETERROR(*bpp)); + + xfs_btree_set_refs(cur, *bpp); + *block = XFS_BUF_TO_BLOCK(*bpp); + + error = xfs_btree_check_block(cur, *block, level, *bpp); + if (error) + xfs_trans_brelse(cur->bc_tp, *bpp); + return error; +} + +/* + * Increment cursor by one record at the level. + * For nonzero levels the leaf-ward information is untouched. + */ +int /* error */ +xfs_btree_increment( + struct xfs_btree_cur *cur, + int level, + int *stat) /* success/failure */ +{ + struct xfs_btree_block *block; + union xfs_btree_ptr ptr; + struct xfs_buf *bp; + int error; /* error return value */ + int lev; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + + ASSERT(level < cur->bc_nlevels); + + /* Read-ahead to the right at this level. */ + xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); + + /* Get a pointer to the btree block. 
*/ + block = xfs_btree_get_block(cur, level, &bp); + +#ifdef DEBUG + error = xfs_btree_check_block(cur, block, level, bp); + if (error) + goto error0; +#endif + + /* We're done if we remain in the block after the increment. */ + if (++cur->bc_ptrs[level] <= xfs_btree_get_numrecs(block)) + goto out1; + + /* Fail if we just went off the right edge of the tree. */ + xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB); + if (xfs_btree_ptr_is_null(cur, &ptr)) + goto out0; + + XFS_BTREE_STATS_INC(cur, increment); + + /* + * March up the tree incrementing pointers. + * Stop when we don't go off the right edge of a block. + */ + for (lev = level + 1; lev < cur->bc_nlevels; lev++) { + block = xfs_btree_get_block(cur, lev, &bp); + +#ifdef DEBUG + error = xfs_btree_check_block(cur, block, lev, bp); + if (error) + goto error0; +#endif + + if (++cur->bc_ptrs[lev] <= xfs_btree_get_numrecs(block)) + break; + + /* Read-ahead the right block for the next loop. */ + xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); + } + + /* + * If we went off the root then we are either seriously + * confused or have the tree root in an inode. + */ + if (lev == cur->bc_nlevels) { + if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) + goto out0; + ASSERT(0); + error = EFSCORRUPTED; + goto error0; + } + ASSERT(lev < cur->bc_nlevels); + + /* + * Now walk back down the tree, fixing up the cursor's buffer + * pointers and key numbers. 
+ */ + for (block = xfs_btree_get_block(cur, lev, &bp); lev > level; ) { + union xfs_btree_ptr *ptrp; + + ptrp = cur->bc_ops->ptr_addr(cur, cur->bc_ptrs[lev], block); + error = xfs_btree_read_buf_block(cur, ptrp, --lev, + 0, &block, &bp); + if (error) + goto error0; + + xfs_btree_setbuf(cur, lev, bp); + cur->bc_ptrs[lev] = 1; + } +out1: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-02 04:05:55.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-02 04:05:57.000000000 +0200 @@ -303,7 +303,7 @@ xfs_alloc_delrec( */ i = xfs_btree_lastrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_increment(tcur, level, &i))) + if ((error = xfs_btree_increment(tcur, level, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); i = xfs_btree_lastrec(tcur, level); @@ -517,7 +517,7 @@ xfs_alloc_delrec( * us, increment the cursor at that level. */ else if (level + 1 < cur->bc_nlevels && - (error = xfs_alloc_increment(cur, level + 1, &i))) + (error = xfs_btree_increment(cur, level + 1, &i))) return error; /* * Fix up the number of records in the surviving block. 
@@ -1134,7 +1134,7 @@ xfs_alloc_lookup( int i; cur->bc_ptrs[0] = keyno; - if ((error = xfs_alloc_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); *stat = 1; @@ -1570,7 +1570,7 @@ xfs_alloc_rshift( return error; i = xfs_btree_lastrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_increment(tcur, level, &i)) || + if ((error = xfs_btree_increment(tcur, level, &i)) || (error = xfs_alloc_updkey(tcur, rkp, level + 1))) goto error0; xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); @@ -1943,97 +1943,6 @@ xfs_alloc_get_rec( } /* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -int /* error */ -xfs_alloc_increment( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ -{ - xfs_alloc_block_t *block; /* btree block */ - xfs_buf_t *bp; /* tree block buffer */ - int error; /* error return value */ - int lev; /* btree level */ - - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the right at this level. - */ - xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); - /* - * Get a pointer to the btree block. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - /* - * Increment the ptr at this level. If we're still in the block - * then we're done. - */ - if (++cur->bc_ptrs[level] <= be16_to_cpu(block->bb_numrecs)) { - *stat = 1; - return 0; - } - /* - * If we just went off the right edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * March up the tree incrementing pointers. - * Stop when we don't go off the right edge of a block. 
- */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - bp = cur->bc_bufs[lev]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; -#endif - if (++cur->bc_ptrs[lev] <= be16_to_cpu(block->bb_numrecs)) - break; - /* - * Read-ahead the right block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); - } - /* - * If we went off the root then we are seriously confused. - */ - ASSERT(lev < cur->bc_nlevels); - /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. - */ - for (bp = cur->bc_bufs[lev], block = XFS_BUF_TO_ALLOC_BLOCK(bp); - lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - - agbno = be32_to_cpu(*XFS_ALLOC_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, agbno, 0, &bp, - XFS_ALLOC_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = 1; - } - *stat = 1; - return 0; -} - -/* * Insert the current record at the point referenced by cur. * The cursor may be inconsistent on return if splits have been done. 
*/ @@ -2219,6 +2128,15 @@ xfs_allocbt_dup_cursor( cur->bc_btnum); } +STATIC union xfs_btree_ptr * +xfs_allocbt_ptr_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_ptr *)XFS_ALLOC_PTR_ADDR(block, index, cur); +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_allocbt_trace_buf; @@ -2288,6 +2206,7 @@ xfs_allocbt_trace_record( static const struct xfs_btree_ops xfs_allocbt_ops = { .dup_cursor = xfs_allocbt_dup_cursor, + .ptr_addr = xfs_allocbt_ptr_addr, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_allocbt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.h 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h 2008-08-02 04:05:57.000000000 +0200 @@ -114,12 +114,6 @@ extern int xfs_alloc_get_rec(struct xfs_ xfs_extlen_t *len, int *stat); /* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -extern int xfs_alloc_increment(struct xfs_btree_cur *cur, int level, int *stat); - -/* * Insert the current record at the point referenced by cur. * The cursor may be inconsistent on return if splits have been done. 
*/ Index: linux-2.6-xfs/fs/xfs/xfs_alloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 2008-08-02 04:05:57.000000000 +0200 @@ -818,7 +818,7 @@ xfs_alloc_ag_vextent_near( XFS_WANT_CORRUPTED_GOTO(i == 1, error0); if (ltlen >= args->minlen) break; - if ((error = xfs_alloc_increment(cnt_cur, 0, &i))) + if ((error = xfs_btree_increment(cnt_cur, 0, &i))) goto error0; } while (i); ASSERT(ltlen >= args->minlen); @@ -828,7 +828,7 @@ xfs_alloc_ag_vextent_near( i = cnt_cur->bc_ptrs[0]; for (j = 1, blen = 0, bdiff = 0; !error && j && (blen < args->maxlen || bdiff > 0); - error = xfs_alloc_increment(cnt_cur, 0, &j)) { + error = xfs_btree_increment(cnt_cur, 0, &j)) { /* * For each entry, decide if it's better than * the previous best entry. @@ -938,7 +938,7 @@ xfs_alloc_ag_vextent_near( * Increment the cursor, so we will point at the entry just right * of the leftward entry if any, or to the leftmost entry. */ - if ((error = xfs_alloc_increment(bno_cur_gt, 0, &i))) + if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) goto error0; if (!i) { /* @@ -977,7 +977,7 @@ xfs_alloc_ag_vextent_near( args->minlen, &gtbnoa, &gtlena); if (gtlena >= args->minlen) break; - if ((error = xfs_alloc_increment(bno_cur_gt, 0, &i))) + if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) goto error0; if (!i) { xfs_btree_del_cursor(bno_cur_gt, @@ -1066,7 +1066,7 @@ xfs_alloc_ag_vextent_near( /* * Fell off the right end. */ - if ((error = xfs_alloc_increment( + if ((error = xfs_btree_increment( bno_cur_gt, 0, &i))) goto error0; if (!i) { @@ -1548,7 +1548,7 @@ xfs_free_ag_extent( * Look for a neighboring block on the right (higher block numbers) * that is contiguous with this space. 
*/ - if ((error = xfs_alloc_increment(bno_cur, 0, &haveright))) + if ((error = xfs_btree_increment(bno_cur, 0, &haveright))) goto error0; if (haveright) { /* Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-02 04:05:55.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-02 04:05:57.000000000 +0200 @@ -253,7 +253,7 @@ xfs_inobt_delrec( */ i = xfs_btree_lastrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_inobt_increment(tcur, level, &i))) + if ((error = xfs_btree_increment(tcur, level, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); i = xfs_btree_lastrec(tcur, level); @@ -463,7 +463,7 @@ xfs_inobt_delrec( * us, increment the cursor at that level. */ else if (level + 1 < cur->bc_nlevels && - (error = xfs_alloc_increment(cur, level + 1, &i))) + (error = xfs_btree_increment(cur, level + 1, &i))) return error; /* * Fix up the number of records in the surviving block. @@ -1014,7 +1014,7 @@ xfs_inobt_lookup( int i; cur->bc_ptrs[0] = keyno; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) return error; ASSERT(i == 1); *stat = 1; @@ -1443,7 +1443,7 @@ xfs_inobt_rshift( if ((error = xfs_btree_dup_cursor(cur, &tcur))) return error; xfs_btree_lastrec(tcur, level); - if ((error = xfs_inobt_increment(tcur, level, &i)) || + if ((error = xfs_btree_increment(tcur, level, &i)) || (error = xfs_inobt_updkey(tcur, rkp, level + 1))) { xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); return error; @@ -1821,97 +1821,6 @@ xfs_inobt_get_rec( } /* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. 
- */ -int /* error */ -xfs_inobt_increment( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ -{ - xfs_inobt_block_t *block; /* btree block */ - xfs_buf_t *bp; /* buffer containing btree block */ - int error; /* error return value */ - int lev; /* btree level */ - - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the right at this level. - */ - xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); - /* - * Get a pointer to the btree block. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - /* - * Increment the ptr at this level. If we're still in the block - * then we're done. - */ - if (++cur->bc_ptrs[level] <= be16_to_cpu(block->bb_numrecs)) { - *stat = 1; - return 0; - } - /* - * If we just went off the right edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * March up the tree incrementing pointers. - * Stop when we don't go off the right edge of a block. - */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - bp = cur->bc_bufs[lev]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; -#endif - if (++cur->bc_ptrs[lev] <= be16_to_cpu(block->bb_numrecs)) - break; - /* - * Read-ahead the right block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); - } - /* - * If we went off the root then we are seriously confused. - */ - ASSERT(lev < cur->bc_nlevels); - /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. 
- */ - for (bp = cur->bc_bufs[lev], block = XFS_BUF_TO_INOBT_BLOCK(bp); - lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - - agbno = be32_to_cpu(*XFS_INOBT_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, agbno, 0, &bp, - XFS_INO_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_INOBT_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = 1; - } - *stat = 1; - return 0; -} - -/* * Insert the current record at the point referenced by cur. * The cursor may be inconsistent on return if splits have been done. */ @@ -2085,6 +1994,15 @@ xfs_inobt_dup_cursor( cur->bc_private.a.agbp, cur->bc_private.a.agno); } +STATIC union xfs_btree_ptr * +xfs_inobt_ptr_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_ptr *)XFS_INOBT_PTR_ADDR(block, index, cur); +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_inobt_trace_buf; @@ -2154,6 +2072,7 @@ xfs_inobt_trace_record( static const struct xfs_btree_ops xfs_inobt_ops = { .dup_cursor = xfs_inobt_dup_cursor, + .ptr_addr = xfs_inobt_ptr_addr, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_inobt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.c 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.c 2008-08-02 04:05:57.000000000 +0200 @@ -695,7 +695,7 @@ nextag: goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } while (i == 1); @@ -753,7 +753,7 @@ nextag: /* * Search right with cur, go forward 1 record. 
*/ - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error1; doneright = !i; if (!doneright) { @@ -835,7 +835,7 @@ nextag: * further right. */ else { - if ((error = xfs_inobt_increment(cur, 0, + if ((error = xfs_btree_increment(cur, 0, &i))) goto error1; doneright = !i; @@ -890,7 +890,7 @@ nextag: XFS_WANT_CORRUPTED_GOTO(i == 1, error0); if (rec.ir_freecount > 0) break; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); } @@ -924,7 +924,7 @@ nextag: goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } while (i == 1); ASSERT(freecount == be32_to_cpu(agi->agi_freecount) || @@ -1033,7 +1033,7 @@ xfs_difree( goto error0; if (i) { freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } } while (i == 1); @@ -1138,7 +1138,7 @@ xfs_difree( goto error0; if (i) { freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } } while (i == 1); Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.h 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h 2008-08-02 04:05:57.000000000 +0200 @@ -136,12 +136,6 @@ extern int xfs_inobt_get_rec(struct xfs_ __int32_t *fcnt, xfs_inofree_t *free, int *stat); /* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -extern int xfs_inobt_increment(struct xfs_btree_cur *cur, int level, int *stat); - -/* * Insert the current record at the point referenced by cur. 
* The cursor may be inconsistent on return if splits have been done. */ Index: linux-2.6-xfs/fs/xfs/xfs_itable.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_itable.c 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_itable.c 2008-08-02 04:05:57.000000000 +0200 @@ -471,7 +471,7 @@ xfs_bulkstat( * In any case, increment to the next record. */ if (!error) - error = xfs_inobt_increment(cur, 0, &tmp); + error = xfs_btree_increment(cur, 0, &tmp); } else { /* * Start of ag. Lookup the first inode chunk. @@ -538,7 +538,7 @@ xfs_bulkstat( * Set agino to after this chunk and bump the cursor. */ agino = gino + XFS_INODES_PER_CHUNK; - error = xfs_inobt_increment(cur, 0, &tmp); + error = xfs_btree_increment(cur, 0, &tmp); cond_resched(); } /* @@ -885,7 +885,7 @@ xfs_inumbers( bufidx = 0; } if (left) { - error = xfs_inobt_increment(cur, 0, &tmp); + error = xfs_btree_increment(cur, 0, &tmp); if (error) { xfs_btree_del_cursor(cur, XFS_BTREE_ERROR); cur = NULL; Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c 2008-08-02 04:05:35.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c 2008-08-02 04:05:57.000000000 +0200 @@ -1646,7 +1646,7 @@ xfs_bmap_add_extent_unwritten_real( PREV.br_blockcount - new->br_blockcount, oldext))) goto done; - if ((error = xfs_bmbt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto done; if ((error = xfs_bmbt_update(cur, new->br_startoff, new->br_startblock, @@ -3253,7 +3253,7 @@ xfs_bmap_del_extent( got.br_startblock, temp, got.br_state))) goto done; - if ((error = xfs_bmbt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto done; cur->bc_rec.b = new; error = xfs_bmbt_insert(cur, &i); Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- 
linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-02 04:05:55.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-02 04:06:30.000000000 +0200 @@ -254,7 +254,7 @@ xfs_bmbt_delrec( if (rbno != NULLFSBLOCK) { i = xfs_btree_lastrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_bmbt_increment(tcur, level, &i))) { + if ((error = xfs_btree_increment(tcur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -445,7 +445,7 @@ xfs_bmbt_delrec( cur->bc_bufs[level] = lbp; cur->bc_ptrs[level] += lrecs; cur->bc_ra[level] = 0; - } else if ((error = xfs_bmbt_increment(cur, level + 1, &i))) { + } else if ((error = xfs_btree_increment(cur, level + 1, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -929,7 +929,7 @@ xfs_bmbt_lookup( if (dir == XFS_LOOKUP_GE && keyno > be16_to_cpu(block->bb_numrecs) && be64_to_cpu(block->bb_rightsib) != NULLDFSBNO) { cur->bc_ptrs[0] = keyno; - if ((error = xfs_bmbt_increment(cur, 0, &i))) { + if ((error = xfs_btree_increment(cur, 0, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; } @@ -1202,7 +1202,7 @@ xfs_bmbt_rshift( } i = xfs_btree_lastrec(tcur, level); XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_bmbt_increment(tcur, level, &i))) { + if ((error = xfs_btree_increment(tcur, level, &i))) { XFS_BMBT_TRACE_CURSOR(tcur, ERROR); goto error1; } @@ -1761,86 +1761,6 @@ xfs_bmbt_disk_get_startoff( } /* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. 
- */ -int /* error */ -xfs_bmbt_increment( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ -{ - xfs_bmbt_block_t *block; - xfs_buf_t *bp; - int error; /* error return value */ - xfs_fsblock_t fsbno; - int lev; - xfs_mount_t *mp; - xfs_trans_t *tp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - ASSERT(level < cur->bc_nlevels); - - xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); - block = xfs_bmbt_get_block(cur, level, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (++cur->bc_ptrs[level] <= be16_to_cpu(block->bb_numrecs)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - if (be64_to_cpu(block->bb_rightsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - block = xfs_bmbt_get_block(cur, lev, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, lev, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (++cur->bc_ptrs[lev] <= be16_to_cpu(block->bb_numrecs)) - break; - xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); - } - if (lev == cur->bc_nlevels) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - tp = cur->bc_tp; - mp = cur->bc_mp; - for (block = xfs_bmbt_get_block(cur, lev, &bp); lev > level; ) { - fsbno = be64_to_cpu(*XFS_BMAP_PTR_IADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufl(mp, tp, fsbno, 0, &bp, - XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_BMBT_BLOCK(bp); - if ((error = xfs_btree_check_lblock(cur, block, lev, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - cur->bc_ptrs[lev] = 1; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; -} - -/* * Insert the current record at the 
point referenced by cur. * * A multi-level split of the tree on insert will invalidate the original @@ -2425,6 +2345,15 @@ xfs_bmbt_get_root_from_inode( return (struct xfs_btree_block *)ifp->if_broot; } +STATIC union xfs_btree_ptr * +xfs_bmbt_ptr_addr( + struct xfs_btree_cur *cur, + int index, + struct xfs_btree_block *block) +{ + return (union xfs_btree_ptr *)XFS_BMAP_PTR_IADDR(block, index, cur); +} + #ifdef XFS_BTREE_TRACE ktrace_t *xfs_bmbt_trace_buf; @@ -2514,6 +2443,7 @@ xfs_bmbt_trace_record( static const struct xfs_btree_ops xfs_bmbt_ops = { .dup_cursor = xfs_bmbt_dup_cursor, .get_root_from_inode = xfs_bmbt_get_root_from_inode, + .ptr_addr = xfs_bmbt_ptr_addr, #ifdef XFS_BTREE_TRACE .trace_enter = xfs_bmbt_trace_enter, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-08-02 04:05:55.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-08-02 04:05:57.000000000 +0200 @@ -251,7 +251,6 @@ extern void xfs_bmbt_disk_get_all(xfs_bm extern xfs_filblks_t xfs_bmbt_disk_get_blockcount(xfs_bmbt_rec_t *r); extern xfs_fileoff_t xfs_bmbt_disk_get_startoff(xfs_bmbt_rec_t *r); -extern int xfs_bmbt_increment(struct xfs_btree_cur *, int, int *); extern int xfs_bmbt_insert(struct xfs_btree_cur *, int *); extern void xfs_bmbt_log_block(struct xfs_btree_cur *, struct xfs_buf *, int); extern void xfs_bmbt_log_recs(struct xfs_btree_cur *, struct xfs_buf *, int, -- From owner-xfs@oss.sgi.com Sun Aug 3 18:33:29 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:33:40 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_64,J_CHICKENPOX_66,J_CHICKENPOX_73,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) 
with ESMTP id m741XR13032394 for ; Sun, 3 Aug 2008 18:33:28 -0700 Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 51B4A1976449 for ; Sun, 3 Aug 2008 18:34:40 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id 0Ek3iYo5HJdN4VKP for ; Sun, 03 Aug 2008 18:34:40 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741YfIF009184 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:34:42 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741YfKb009182 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:34:41 +0200 Date: Mon, 4 Aug 2008 03:34:41 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com Subject: [PATCH 16/26] implement generic xfs_btree_rshift Message-ID: <20080804013441.GQ8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-rshift User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status:
Clean X-archive-position: 17341 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Make the btree right shift code generic. Based on a patch from David Chinner with lots of changes to follow the original btree implementations more closely. While this loses some of the generic helper routines for inserting/moving/removing records it also solves some of the one off bugs in the original code and makes it easier to verify. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-03 13:23:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-03 13:32:52.000000000 +0200 @@ -34,6 +34,7 @@ #include "xfs_attr_sf.h" #include "xfs_dinode.h" #include "xfs_inode.h" +#include "xfs_inode_item.h" #include "xfs_btree.h" #include "xfs_btree_trace.h" #include "xfs_ialloc.h" @@ -943,6 +944,137 @@ xfs_btree_read_buf_block( return error; } +STATIC void +xfs_btree_set_ptr( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr_addr, + int index, + union xfs_btree_ptr *newptr) +{ + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + __be64 *pp = &ptr_addr->l; + + pp[index] = newptr->l; + } else { + __be32 *pp = &ptr_addr->s; + + pp[index] = newptr->s; + } +} + +STATIC void +xfs_btree_move_ptrs( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *base, + int from, + int to, + int numptrs) +{ + ASSERT(from >= 0); + ASSERT(to >= 0); + ASSERT(numptrs >= 0); + + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + __be64 *lpp = &base->l; + + memmove(&lpp[to], &lpp[from], numptrs * sizeof(*lpp)); + } else { + __be32 *spp = &base->s; + + memmove(&spp[to], &spp[from], numptrs * sizeof(*spp)); + } +} + +/* + * Log block pointer fields from a btree block (nonleaf). 
+ */ +STATIC void +xfs_btree_log_ptrs( + struct xfs_btree_cur *cur, /* btree cursor */ + struct xfs_buf *bp, /* buffer containing btree block */ + int first, /* index of first pointer to log */ + int last) /* index of last pointer to log */ +{ + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, first, last); + + if (bp) { + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + char *baddr = (char *)block; + union xfs_btree_ptr *pp; + int start; /* first byte offset logged */ + int end; /* last byte offset logged */ + + pp = cur->bc_ops->ptr_addr(cur, 1, block); + + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + __be64 *lpp = &pp->l; + + start = (char *)&lpp[first - 1] - baddr; + end = (char *)&lpp[last] - 1 - baddr; + } else { + __be32 *spp = &pp->s; + + start = (char *)&spp[first - 1] - baddr; + end = (char *)&spp[last] - 1 - baddr; + } + + xfs_trans_log_buf(cur->bc_tp, bp, start, end); + } else { + xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, + XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} + +/* + * Log fields from the short form btree block header. + */ +STATIC void +xfs_btree_log_block( + struct xfs_btree_cur *cur, /* btree cursor */ + struct xfs_buf *bp, /* buffer containing btree block */ + int fields) /* mask of fields: XFS_BB_... 
*/ +{ + int first; /* first byte offset logged */ + int last; /* last byte offset logged */ + static const short soffsets[] = { /* table of offsets (short) */ + offsetof(struct xfs_btree_sblock, bb_magic), + offsetof(struct xfs_btree_sblock, bb_level), + offsetof(struct xfs_btree_sblock, bb_numrecs), + offsetof(struct xfs_btree_sblock, bb_leftsib), + offsetof(struct xfs_btree_sblock, bb_rightsib), + sizeof(struct xfs_btree_sblock) + }; + static const short loffsets[] = { /* table of offsets (long) */ + offsetof(struct xfs_btree_lblock, bb_magic), + offsetof(struct xfs_btree_lblock, bb_level), + offsetof(struct xfs_btree_lblock, bb_numrecs), + offsetof(struct xfs_btree_lblock, bb_leftsib), + offsetof(struct xfs_btree_lblock, bb_rightsib), + sizeof(struct xfs_btree_lblock) + }; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBI(cur, bp, fields); + + if (bp) { + xfs_btree_offsets(fields, + (cur->bc_flags & XFS_BTREE_LONG_PTRS) ? + loffsets : soffsets, + XFS_BB_NUM_BITS, &first, &last); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + } else { + /* XXX(hch): maybe factor out into a method? */ + xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, + XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); +} + + /* * Increment cursor by one record at the level. * For nonzero levels the leaf-ward information is untouched. @@ -1149,7 +1281,6 @@ error0: return error; } - STATIC int xfs_btree_lookup_get_block( struct xfs_btree_cur *cur, /* btree cursor */ @@ -1480,3 +1611,177 @@ error0: return error; } +/* + * Move 1 record right from cur/level if possible. + * Update cur to reflect the new path. 
+ */ +int /* error */ +xfs_btree_rshift( + struct xfs_btree_cur *cur, + int level, + int *stat) /* success/failure */ +{ + union xfs_btree_key key; /* btree key */ + struct xfs_buf *lbp; /* left buffer pointer */ + struct xfs_btree_block *left; /* left btree block */ + struct xfs_buf *rbp; /* right buffer pointer */ + struct xfs_btree_block *right; /* right btree block */ + struct xfs_btree_cur *tcur; /* temporary btree cursor */ + union xfs_btree_ptr rptr; /* right block pointer */ + union xfs_btree_key *rkp; /* right btree key */ + int rrecs; /* right record count */ + int lrecs; /* left record count */ + int error; /* error return value */ + int i; /* loop counter */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (level == cur->bc_nlevels - 1)) + goto out0; + + /* Set up variables for this block as "left". */ + left = xfs_btree_get_block(cur, level, &lbp); + +#ifdef DEBUG + error = xfs_btree_check_block(cur, left, level, lbp); + if (error) + goto error0; +#endif + + /* If we've got no right sibling then we can't shift an entry right. */ + xfs_btree_get_sibling(cur, left, &rptr, XFS_BB_RIGHTSIB); + if (xfs_btree_ptr_is_null(cur, &rptr)) + goto out0; + + /* + * If the cursor entry is the one that would be moved, don't + * do it... it's too complicated. + */ + lrecs = xfs_btree_get_numrecs(left); + if (cur->bc_ptrs[level] >= lrecs) + goto out0; + + /* Set up the right neighbor as "right". */ + error = xfs_btree_read_buf_block(cur, &rptr, level, 0, &right, &rbp); + if (error) + goto error0; + + /* If it's full, it can't take another entry. */ + rrecs = xfs_btree_get_numrecs(right); + if (rrecs == cur->bc_ops->get_maxrecs(cur, level)) + goto out0; + + XFS_BTREE_STATS_INC(cur, rshift); + XFS_BTREE_STATS_ADD(cur, moves, rrecs); + + /* + * Make a hole at the start of the right neighbor block, then + * copy the last left block entry to the hole. 
+ */ + if (level > 0) { + /* It's a nonleaf. make a hole in the keys and ptrs */ + union xfs_btree_key *lkp; + union xfs_btree_ptr *lpp; + union xfs_btree_ptr *rpp; + + lkp = cur->bc_ops->key_addr(cur, lrecs, left); + lpp = cur->bc_ops->ptr_addr(cur, lrecs, left); + rkp = cur->bc_ops->key_addr(cur, 1, right); + rpp = cur->bc_ops->ptr_addr(cur, 1, right); + +#ifdef DEBUG + for (i = rrecs - 1; i >= 0; i--) { + error = xfs_btree_check_ptr(cur, rpp, i, level); + if (error) + goto error0; + } +#endif + + cur->bc_ops->move_keys(cur, rkp, 0, 1, rrecs); + xfs_btree_move_ptrs(cur, rpp, 0, 1, rrecs); + +#ifdef DEBUG + error = xfs_btree_check_ptr(cur, lpp, 0, level); + if (error) + goto error0; +#endif + + /* Now put the new data in, and log it. */ + cur->bc_ops->set_key(cur, rkp, 0, lkp); + xfs_btree_set_ptr(cur, rpp, 0, lpp); + + cur->bc_ops->log_keys(cur, rbp, 1, rrecs + 1); + xfs_btree_log_ptrs(cur, rbp, 1, rrecs + 1); + + xfs_btree_check_key(cur->bc_btnum, rkp, + cur->bc_ops->key_addr(cur, 2, right)); + } else { + /* It's a leaf. make a hole in the records */ + union xfs_btree_rec *lrp; + union xfs_btree_rec *rrp; + + lrp = cur->bc_ops->rec_addr(cur, lrecs, left); + rrp = cur->bc_ops->rec_addr(cur, 1, right); + + cur->bc_ops->move_recs(cur, rrp, 0, 1, rrecs); + + /* Now put the new data in, and log it. */ + cur->bc_ops->set_rec(cur, rrp, 0, lrp); + cur->bc_ops->log_recs(cur, rbp, 1, rrecs + 1); + + cur->bc_ops->init_key_from_rec(cur, &key, rrp); + rkp = &key; + + xfs_btree_check_rec(cur->bc_btnum, rrp, + cur->bc_ops->rec_addr(cur, 2, right)); + } + + /* + * Decrement and log left's numrecs, bump and log right's numrecs. + */ + xfs_btree_set_numrecs(left, --lrecs); + xfs_btree_log_block(cur, lbp, XFS_BB_NUMRECS); + + xfs_btree_set_numrecs(right, ++rrecs); + xfs_btree_log_block(cur, rbp, XFS_BB_NUMRECS); + + /* + * Using a temporary cursor, update the parent key values of the + * block on the right. 
+ */ + error = xfs_btree_dup_cursor(cur, &tcur); + if (error) + goto error0; + i = xfs_btree_lastrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error1); + + error = xfs_btree_increment(tcur, level, &i); + if (error) + goto error1; + + error = xfs_btree_updkey(tcur, rkp, level + 1); + if (error) + goto error1; + + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; + +error1: + XFS_BTREE_TRACE_CURSOR(tcur, XBT_ERROR); + xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); + return error; +} Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-03 13:23:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-03 13:30:46.000000000 +0200 @@ -214,6 +214,22 @@ struct xfs_btree_ops { __int64_t (*key_diff)(struct xfs_btree_cur *cur, union xfs_btree_key *key); + /* set values of btree structures */ + void (*set_key)(struct xfs_btree_cur *cur, + union xfs_btree_key *oldkey, int index, + union xfs_btree_key *newkey); + void (*set_rec)(struct xfs_btree_cur *cur, + union xfs_btree_rec *oldrec, int index, + union xfs_btree_rec *newrec); + + /* move bits of btree blocks within a block */ + void (*move_keys)(struct xfs_btree_cur *cur, + union xfs_btree_key *base, int src_index, + int dst_index, int numkeys); + void (*move_recs)(struct xfs_btree_cur *cur, + union xfs_btree_rec *base, int src_index, + int dst_index, int numrecs); + /* copy bits of btree blocks between blocks */ void (*copy_keys)(struct xfs_btree_cur *cur, union xfs_btree_key *src_key, @@ -555,6 +571,7 @@ int xfs_btree_decrement(struct xfs_btree int xfs_btree_lookup(struct xfs_btree_cur *, xfs_lookup_t, int *); int xfs_btree_updkey(struct xfs_btree_cur *, union xfs_btree_key *, int); int 
xfs_btree_update(struct xfs_btree_cur *, union xfs_btree_rec *); +int xfs_btree_rshift(struct xfs_btree_cur *, int, int *); /* @@ -567,6 +584,12 @@ xfs_btree_get_numrecs(struct xfs_btree_b return be16_to_cpu(block->bb_numrecs); } +static inline void +xfs_btree_set_numrecs(struct xfs_btree_block *block, __uint16_t numrecs) +{ + block->bb_numrecs = cpu_to_be16(numrecs); +} + #endif /* __KERNEL__ */ Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-03 13:23:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-03 13:30:46.000000000 +0200 @@ -54,7 +54,6 @@ STATIC void xfs_allocbt_log_recs(xfs_btr xfs_allocbt_log_recs(c, b, i, j) STATIC int xfs_alloc_lshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_alloc_newroot(xfs_btree_cur_t *, int *); -STATIC int xfs_alloc_rshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_alloc_split(xfs_btree_cur_t *, int, xfs_agblock_t *, xfs_alloc_key_t *, xfs_btree_cur_t **, int *); @@ -396,7 +395,7 @@ xfs_alloc_delrec( */ if (be16_to_cpu(left->bb_numrecs) - 1 >= XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_alloc_rshift(tcur, level, &i))) + if ((error = xfs_btree_rshift(tcur, level, &i))) goto error0; if (i) { ASSERT(be16_to_cpu(block->bb_numrecs) >= @@ -688,7 +687,7 @@ xfs_alloc_insrec( /* * First, try shifting an entry to the right neighbor. */ - if ((error = xfs_alloc_rshift(cur, level, &i))) + if ((error = xfs_btree_rshift(cur, level, &i))) return error; if (i) { /* nothing */ @@ -1181,137 +1180,6 @@ xfs_alloc_newroot( } /* - * Move 1 record right from cur/level if possible. - * Update cur to reflect the new path. 
- */ -STATIC int /* error */ -xfs_alloc_rshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ -{ - int error; /* error return value */ - int i; /* loop index */ - xfs_alloc_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left (current) block */ - xfs_alloc_block_t *left; /* left (current) btree block */ - xfs_buf_t *rbp; /* buffer for right neighbor block */ - xfs_alloc_block_t *right; /* right neighbor btree block */ - xfs_alloc_key_t *rkp; /* key pointer for right block */ - xfs_btree_cur_t *tcur; /* temporary cursor */ - - /* - * Set up variables for this block as "left". - */ - lbp = cur->bc_bufs[level]; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; -#endif - /* - * If we've got no right sibling then we can't shift an entry right. - */ - if (be32_to_cpu(left->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... it's too complicated. - */ - if (cur->bc_ptrs[level] >= be16_to_cpu(left->bb_numrecs)) { - *stat = 0; - return 0; - } - /* - * Set up the right neighbor as "right". - */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(left->bb_rightsib), - 0, &rbp, XFS_ALLOC_BTREE_REF))) - return error; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - /* - * If it's full, it can't take another entry. - */ - if (be16_to_cpu(right->bb_numrecs) == XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - /* - * Make a hole at the start of the right neighbor block, then - * copy the last left block entry to the hole. 
- */ - if (level > 0) { - xfs_alloc_key_t *lkp; /* key pointer for left block */ - xfs_alloc_ptr_t *lpp; /* address pointer for left block */ - xfs_alloc_ptr_t *rpp; /* address pointer for right block */ - - lkp = XFS_ALLOC_KEY_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - lpp = XFS_ALLOC_PTR_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rkp = XFS_ALLOC_KEY_ADDR(right, 1, cur); - rpp = XFS_ALLOC_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = be16_to_cpu(right->bb_numrecs) - 1; i >= 0; i--) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memmove(rkp + 1, rkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp + 1, rpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*lpp), level))) - return error; -#endif - *rkp = *lkp; - *rpp = *lpp; - xfs_alloc_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_alloc_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_btree_check_key(cur->bc_btnum, rkp, rkp + 1); - } else { - xfs_alloc_rec_t *lrp; /* record pointer for left block */ - xfs_alloc_rec_t *rrp; /* record pointer for right block */ - - lrp = XFS_ALLOC_REC_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rrp = XFS_ALLOC_REC_ADDR(right, 1, cur); - memmove(rrp + 1, rrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - *rrp = *lrp; - xfs_alloc_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - key.ar_startblock = rrp->ar_startblock; - key.ar_blockcount = rrp->ar_blockcount; - rkp = &key; - xfs_btree_check_rec(cur->bc_btnum, rrp, rrp + 1); - } - /* - * Decrement and log left's numrecs, bump and log right's numrecs. - */ - be16_add(&left->bb_numrecs, -1); - xfs_alloc_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, 1); - xfs_alloc_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Using a temporary cursor, update the parent key values of the - * block on the right. 
- */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_increment(tcur, level, &i)) || - (error = xfs_btree_updkey(tcur, (union xfs_btree_key *)rkp, level + 1))) - goto error0; - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - *stat = 1; - return 0; -error0: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; -} - -/* * Split cur/level block in half. * Return new block number and its first record (to be inserted into parent). */ @@ -1737,6 +1605,64 @@ xfs_allocbt_key_diff( } STATIC void +xfs_allocbt_set_key( + struct xfs_btree_cur *cur, + union xfs_btree_key *key_addr, + int index, + union xfs_btree_key *newkey) +{ + xfs_alloc_key_t *kp = &key_addr->alloc; + + kp[index] = newkey->alloc; +} + +STATIC void +xfs_allocbt_set_rec( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec_addr, + int index, + union xfs_btree_rec *newrec) +{ + xfs_alloc_rec_t *rp = &rec_addr->alloc; + + rp[index] = newrec->alloc; +} + +STATIC void +xfs_allocbt_move_keys( + struct xfs_btree_cur *cur, + union xfs_btree_key *base, + int from, + int to, + int numkeys) +{ + xfs_alloc_key_t *kp = &base->alloc; + + ASSERT(from >= 0); + ASSERT(to >= 0); + ASSERT(numkeys >= 0); + + memmove(&kp[to], &kp[from], numkeys * sizeof(*kp)); +} + +STATIC void +xfs_allocbt_move_recs( + struct xfs_btree_cur *cur, + union xfs_btree_rec *base, + int from, + int to, + int numrecs) +{ + xfs_alloc_rec_t *rp = &base->alloc; + + ASSERT(from >= 0); + ASSERT(to >= 0); + ASSERT(numrecs >= 0); + + memmove(&rp[to], &rp[from], numrecs * sizeof(*rp)); +} + +STATIC void xfs_allocbt_copy_keys( struct xfs_btree_cur *cur, union xfs_btree_key *src_key, @@ -1910,6 +1836,10 @@ static const struct xfs_btree_ops xfs_al .key_addr = xfs_allocbt_key_addr, .rec_addr = xfs_allocbt_rec_addr, .key_diff = xfs_allocbt_key_diff, + .set_key = xfs_allocbt_set_key, + .set_rec = xfs_allocbt_set_rec, + .move_keys = 
xfs_allocbt_move_keys, + .move_recs = xfs_allocbt_move_recs, .copy_keys = xfs_allocbt_copy_keys, .copy_recs = xfs_allocbt_copy_recs, .log_keys = xfs_allocbt_log_keys, Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-03 13:23:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-03 13:30:46.000000000 +0200 @@ -46,7 +46,6 @@ STATIC void xfs_inobt_log_ptrs(xfs_btree STATIC void xfs_inobt_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC int xfs_inobt_lshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_inobt_newroot(xfs_btree_cur_t *, int *); -STATIC int xfs_inobt_rshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_inobt_split(xfs_btree_cur_t *, int, xfs_agblock_t *, xfs_inobt_key_t *, xfs_btree_cur_t **, int *); @@ -338,7 +337,7 @@ xfs_inobt_delrec( */ if (be16_to_cpu(left->bb_numrecs) - 1 >= XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_inobt_rshift(tcur, level, &i))) + if ((error = xfs_btree_rshift(tcur, level, &i))) goto error0; if (i) { ASSERT(be16_to_cpu(block->bb_numrecs) >= @@ -609,7 +608,7 @@ xfs_inobt_insrec( /* * First, try shifting an entry to the right neighbor. */ - if ((error = xfs_inobt_rshift(cur, level, &i))) + if ((error = xfs_btree_rshift(cur, level, &i))) return error; if (i) { /* nothing */ @@ -1074,136 +1073,6 @@ xfs_inobt_newroot( } /* - * Move 1 record right from cur/level if possible. - * Update cur to reflect the new path. 
- */ -STATIC int /* error */ -xfs_inobt_rshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ -{ - int error; /* error return value */ - int i; /* loop index */ - xfs_inobt_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left (current) block */ - xfs_inobt_block_t *left; /* left (current) btree block */ - xfs_inobt_key_t *lkp; /* key pointer for left block */ - xfs_inobt_ptr_t *lpp; /* address pointer for left block */ - xfs_inobt_rec_t *lrp; /* record pointer for left block */ - xfs_buf_t *rbp; /* buffer for right neighbor block */ - xfs_inobt_block_t *right; /* right neighbor btree block */ - xfs_inobt_key_t *rkp; /* key pointer for right block */ - xfs_inobt_ptr_t *rpp; /* address pointer for right block */ - xfs_inobt_rec_t *rrp=NULL; /* record pointer for right block */ - xfs_btree_cur_t *tcur; /* temporary cursor */ - - /* - * Set up variables for this block as "left". - */ - lbp = cur->bc_bufs[level]; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; -#endif - /* - * If we've got no right sibling then we can't shift an entry right. - */ - if (be32_to_cpu(left->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... it's too complicated. - */ - if (cur->bc_ptrs[level] >= be16_to_cpu(left->bb_numrecs)) { - *stat = 0; - return 0; - } - /* - * Set up the right neighbor as "right". - */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(left->bb_rightsib), - 0, &rbp, XFS_INO_BTREE_REF))) - return error; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - /* - * If it's full, it can't take another entry. 
- */ - if (be16_to_cpu(right->bb_numrecs) == XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - /* - * Make a hole at the start of the right neighbor block, then - * copy the last left block entry to the hole. - */ - if (level > 0) { - lkp = XFS_INOBT_KEY_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - lpp = XFS_INOBT_PTR_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rkp = XFS_INOBT_KEY_ADDR(right, 1, cur); - rpp = XFS_INOBT_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = be16_to_cpu(right->bb_numrecs) - 1; i >= 0; i--) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memmove(rkp + 1, rkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp + 1, rpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*lpp), level))) - return error; -#endif - *rkp = *lkp; - *rpp = *lpp; - xfs_inobt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_inobt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - } else { - lrp = XFS_INOBT_REC_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rrp = XFS_INOBT_REC_ADDR(right, 1, cur); - memmove(rrp + 1, rrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - *rrp = *lrp; - xfs_inobt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - key.ir_startino = rrp->ir_startino; - rkp = &key; - } - /* - * Decrement and log left's numrecs, bump and log right's numrecs. - */ - be16_add(&left->bb_numrecs, -1); - xfs_inobt_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, 1); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(cur->bc_btnum, rkp, rkp + 1); - else - xfs_btree_check_rec(cur->bc_btnum, rrp, rrp + 1); -#endif - xfs_inobt_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Using a temporary cursor, update the parent key values of the - * block on the right. 
- */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - xfs_btree_lastrec(tcur, level); - if ((error = xfs_btree_increment(tcur, level, &i)) || - (error = xfs_btree_updkey(tcur, (union xfs_btree_key *)rkp, level + 1))) { - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; - } - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - *stat = 1; - return 0; -} - -/* * Split cur/level block in half. * Return new block number and its first record (to be inserted into parent). */ @@ -1585,6 +1454,64 @@ xfs_inobt_key_diff( } STATIC void +xfs_inobt_set_key( + struct xfs_btree_cur *cur, + union xfs_btree_key *key_addr, + int index, + union xfs_btree_key *newkey) +{ + xfs_inobt_key_t *kp = &key_addr->inobt; + + kp[index] = newkey->inobt; +} + +STATIC void +xfs_inobt_set_rec( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec_addr, + int index, + union xfs_btree_rec *newrec) +{ + xfs_inobt_rec_t *rp = &rec_addr->inobt; + + rp[index] = newrec->inobt; +} + +STATIC void +xfs_inobt_move_keys( + struct xfs_btree_cur *cur, + union xfs_btree_key *base, + int from, + int to, + int numkeys) +{ + xfs_inobt_key_t *kp = &base->inobt; + + ASSERT(from >= 0); + ASSERT(to >= 0); + ASSERT(numkeys >= 0); + + memmove(&kp[to], &kp[from], numkeys * sizeof(*kp)); +} + +STATIC void +xfs_inobt_move_recs( + struct xfs_btree_cur *cur, + union xfs_btree_rec *base, + int from, + int to, + int numrecs) +{ + xfs_inobt_rec_t *rp = &base->inobt; + + ASSERT(from >= 0); + ASSERT(to >= 0); + ASSERT(numrecs >= 0); + + memmove(&rp[to], &rp[from], numrecs * sizeof(*rp)); +} + +STATIC void xfs_inobt_copy_keys( struct xfs_btree_cur *cur, union xfs_btree_key *src_key, @@ -1735,6 +1662,11 @@ static const struct xfs_btree_ops xfs_in .key_addr = xfs_inobt_key_addr, .rec_addr = xfs_inobt_rec_addr, .key_diff = xfs_inobt_key_diff, + .set_key = xfs_inobt_set_key, + .set_rec = xfs_inobt_set_rec, + .move_keys = xfs_inobt_move_keys, + .move_recs = xfs_inobt_move_recs, + .copy_keys = 
xfs_inobt_copy_keys, .copy_recs = xfs_inobt_copy_recs, .log_keys = xfs_inobt_log_keys, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-03 13:23:30.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-03 13:30:46.000000000 +0200 @@ -53,7 +53,6 @@ STATIC int xfs_bmbt_killroot(xfs_btree_c STATIC void xfs_bmbt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC void xfs_bmbt_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); STATIC int xfs_bmbt_lshift(xfs_btree_cur_t *, int, int *); -STATIC int xfs_bmbt_rshift(xfs_btree_cur_t *, int, int *); STATIC int xfs_bmbt_split(xfs_btree_cur_t *, int, xfs_fsblock_t *, __uint64_t *, xfs_btree_cur_t **, int *); @@ -327,7 +326,7 @@ xfs_bmbt_delrec( bno = be64_to_cpu(left->bb_rightsib); if (be16_to_cpu(left->bb_numrecs) - 1 >= XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if ((error = xfs_bmbt_rshift(tcur, level, &i))) { + if ((error = xfs_btree_rshift(tcur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); goto error0; } @@ -538,7 +537,7 @@ xfs_bmbt_insrec( logflags); block = xfs_bmbt_get_block(cur, level, &bp); } else { - if ((error = xfs_bmbt_rshift(cur, level, &i))) { + if ((error = xfs_btree_rshift(cur, level, &i))) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; } @@ -909,143 +908,6 @@ xfs_bmbt_lshift( } /* - * Move 1 record right from cur/level if possible. - * Update cur to reflect the new path. 
- */ -STATIC int /* error */ -xfs_bmbt_rshift( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ -{ - int error; /* error return value */ - int i; /* loop counter */ - xfs_bmbt_key_t key; /* bmap btree key */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_bmbt_block_t *left; /* left btree block */ - xfs_bmbt_key_t *lkp; /* left btree key */ - xfs_bmbt_ptr_t *lpp; /* left address pointer */ - xfs_bmbt_rec_t *lrp; /* left record pointer */ - xfs_mount_t *mp; /* file system mount point */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_bmbt_block_t *right; /* right btree block */ - xfs_bmbt_key_t *rkp; /* right btree key */ - xfs_bmbt_ptr_t *rpp; /* right address pointer */ - xfs_bmbt_rec_t *rrp=NULL; /* right record pointer */ - struct xfs_btree_cur *tcur; /* temporary btree cursor */ - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - if (level == cur->bc_nlevels - 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - lbp = cur->bc_bufs[level]; - left = XFS_BUF_TO_BMBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (be64_to_cpu(left->bb_rightsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - if (cur->bc_ptrs[level] >= be16_to_cpu(left->bb_numrecs)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - mp = cur->bc_mp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, be64_to_cpu(left->bb_rightsib), 0, - &rbp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - right = XFS_BUF_TO_BMBT_BLOCK(rbp); - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (be16_to_cpu(right->bb_numrecs) == XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - if (level > 0) { - lkp = 
XFS_BMAP_KEY_IADDR(left, be16_to_cpu(left->bb_numrecs), cur); - lpp = XFS_BMAP_PTR_IADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rkp = XFS_BMAP_KEY_IADDR(right, 1, cur); - rpp = XFS_BMAP_PTR_IADDR(right, 1, cur); -#ifdef DEBUG - for (i = be16_to_cpu(right->bb_numrecs) - 1; i >= 0; i--) { - if ((error = xfs_btree_check_lptr_disk(cur, rpp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } -#endif - memmove(rkp + 1, rkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp + 1, rpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr_disk(cur, *lpp, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - *rkp = *lkp; - *rpp = *lpp; - xfs_bmbt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_bmbt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - } else { - lrp = XFS_BMAP_REC_IADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rrp = XFS_BMAP_REC_IADDR(right, 1, cur); - memmove(rrp + 1, rrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - *rrp = *lrp; - xfs_bmbt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - key.br_startoff = cpu_to_be64(xfs_bmbt_disk_get_startoff(rrp)); - rkp = &key; - } - be16_add(&left->bb_numrecs, -1); - xfs_bmbt_log_block(cur, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, 1); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(XFS_BTNUM_BMAP, rkp, rkp + 1); - else - xfs_btree_check_rec(XFS_BTNUM_BMAP, rrp, rrp + 1); -#endif - xfs_bmbt_log_block(cur, rbp, XFS_BB_NUMRECS); - if ((error = xfs_btree_dup_cursor(cur, &tcur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_increment(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(tcur, ERROR); - goto error1; - } - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_updkey(tcur, (union xfs_btree_key *)rkp, level + 1))) { - 
XFS_BMBT_TRACE_CURSOR(tcur, ERROR); - goto error1; - } - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; -error0: - XFS_BMBT_TRACE_CURSOR(cur, ERROR); -error1: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; -} - -/* * Determine the extent state. */ /* ARGSUSED */ @@ -2018,6 +1880,64 @@ xfs_bmbt_key_diff( } STATIC void +xfs_bmbt_set_key( + struct xfs_btree_cur *cur, + union xfs_btree_key *key_addr, + int index, + union xfs_btree_key *newkey) +{ + xfs_bmbt_key_t *kp = &key_addr->bmbt; + + kp[index] = newkey->bmbt; +} + +STATIC void +xfs_bmbt_set_rec( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec_addr, + int index, + union xfs_btree_rec *newrec) +{ + xfs_bmbt_rec_t *rp = &rec_addr->bmbt; + + rp[index] = newrec->bmbt; +} + +STATIC void +xfs_bmbt_move_keys( + struct xfs_btree_cur *cur, + union xfs_btree_key *base, + int from, + int to, + int numkeys) +{ + xfs_bmbt_key_t *kp = &base->bmbt; + + ASSERT(from >= 0 && from <= 1000); + ASSERT(to >= 0 && to <= 1000); + ASSERT(numkeys >= 0); + + memmove(&kp[to], &kp[from], numkeys * sizeof(*kp)); +} + +STATIC void +xfs_bmbt_move_recs( + struct xfs_btree_cur *cur, + union xfs_btree_rec *base, + int from, + int to, + int numrecs) +{ + xfs_bmbt_rec_t *rp = &base->bmbt; + + ASSERT(from >= 0); + ASSERT(to >= 0); + ASSERT(numrecs >= 0); + + memmove(&rp[to], &rp[from], numrecs * sizeof(*rp)); +} + +STATIC void xfs_bmbt_copy_keys( struct xfs_btree_cur *cur, union xfs_btree_key *src_key, @@ -2192,6 +2112,10 @@ static const struct xfs_btree_ops xfs_bm .key_addr = xfs_bmbt_key_addr, .rec_addr = xfs_bmbt_rec_addr, .key_diff = xfs_bmbt_key_diff, + .set_key = xfs_bmbt_set_key, + .set_rec = xfs_bmbt_set_rec, + .move_keys = xfs_bmbt_move_keys, + .move_recs = xfs_bmbt_move_recs, .copy_keys = xfs_bmbt_copy_keys, .copy_recs = xfs_bmbt_copy_recs, .log_keys = xfs_bmbt_log_keys, -- From owner-xfs@oss.sgi.com Sun Aug 3 18:31:38 2008 Received: with ECARTIS (v1.0.0; 
list xfs); Sun, 03 Aug 2008 18:31:39 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741VbV7030527 for ; Sun, 3 Aug 2008 18:31:38 -0700 X-ASG-Debug-ID: 1217813571-16ad02eb0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0CB3C1976421 for ; Sun, 3 Aug 2008 18:32:51 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id hWOxQTv4umzBpmsD for ; Sun, 03 Aug 2008 18:32:51 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741WrIF008934 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:32:53 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741WruO008932 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:32:53 +0200 Date: Mon, 4 Aug 2008 03:32:53 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 05/26] add a long pointers flag to xfs_btree_cur Subject: [PATCH 05/26] add a long pointers flag to xfs_btree_cur Message-ID: <20080804013253.GF8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-btree-long-flag User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813572 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.82 X-Barracuda-Spam-Status: No, SCORE=-1.82 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 
tests=BSF_SC0_MJ615 X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1688 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.20 BSF_SC0_MJ615 Custom Rule MJ615 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17330 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Add a flag to the xfs btree cursor when using long (64bit) block pointers instead of checking btnum == XFS_BTNUM_BMAP. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-07-11 11:12:37.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-07-11 11:13:15.000000000 +0200 @@ -90,7 +90,7 @@ xfs_btree_check_block( int level, /* level of the btree block */ xfs_buf_t *bp) /* buffer containing block, if any */ { - if (XFS_BTREE_LONG_PTRS(cur->bc_btnum)) + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) xfs_btree_check_lblock(cur, (xfs_btree_lblock_t *)block, level, bp); else @@ -500,7 +500,7 @@ xfs_btree_islastblock( block = xfs_btree_get_block(cur, level, &bp); xfs_btree_check_block(cur, block, level, bp); - if (XFS_BTREE_LONG_PTRS(cur->bc_btnum)) + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) return be64_to_cpu(block->bb_u.l.bb_rightsib) == NULLDFSBNO; else return be32_to_cpu(block->bb_u.s.bb_rightsib) == NULLAGBLOCK; @@ -792,7 +792,7 @@ xfs_btree_setbuf( if (!bp) return; b = XFS_BUF_TO_BLOCK(bp); - if (XFS_BTREE_LONG_PTRS(cur->bc_btnum)) { + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { if (be64_to_cpu(b->bb_u.l.bb_leftsib) == NULLDFSBNO) cur->bc_ra[lev] |= XFS_BTCUR_LEFTRA; if (be64_to_cpu(b->bb_u.l.bb_rightsib) == NULLDFSBNO) Index: linux-2.6-xfs/fs/xfs/xfs_btree.h 
=================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-07-11 11:12:37.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-07-11 11:13:15.000000000 +0200 @@ -121,11 +121,6 @@ union xfs_btree_rec { #define XFS_BB_ALL_BITS ((1 << XFS_BB_NUM_BITS) - 1) /* - * Boolean to select which form of xfs_btree_block_t.bb_u to use. - */ -#define XFS_BTREE_LONG_PTRS(btnum) ((btnum) == XFS_BTNUM_BMAP) - -/* * Magic numbers for btree blocks. */ extern const __uint32_t xfs_magics[]; @@ -211,6 +206,7 @@ typedef struct xfs_btree_cur } xfs_btree_cur_t; /* cursor flags */ +#define XFS_BTREE_LONG_PTRS (1<<0) /* pointers are 64bits long */ #define XFS_BTREE_ROOT_IN_INODE (1<<1) /* root may be variable size */ Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-07-11 11:12:37.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-07-11 11:13:15.000000000 +0200 @@ -2667,7 +2667,7 @@ xfs_bmbt_init_cursor( cur->bc_blocklog = mp->m_sb.sb_blocklog; cur->bc_ops = &xfs_bmbt_ops; - cur->bc_flags = XFS_BTREE_ROOT_IN_INODE; + cur->bc_flags = XFS_BTREE_LONG_PTRS | XFS_BTREE_ROOT_IN_INODE; cur->bc_private.b.forksize = XFS_IFORK_SIZE(ip, whichfork); cur->bc_private.b.ip = ip; -- From owner-xfs@oss.sgi.com Sun Aug 3 18:30:57 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:31:18 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741Uv02030321 for ; Sun, 3 Aug 2008 18:30:57 -0700 X-ASG-Debug-ID: 1217813529-16b502c50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost 
[127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 844F3197641E for ; Sun, 3 Aug 2008 18:32:09 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id nUE1F5FU4zlqEWE0 for ; Sun, 03 Aug 2008 18:32:09 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741WBIF008884 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:32:11 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741WBVb008881 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:32:11 +0200 Date: Mon, 4 Aug 2008 03:32:11 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 01/26] kill struct xfs_btree_hdr Subject: [PATCH 01/26] kill struct xfs_btree_hdr Message-ID: <20080804013211.GB8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-kill-btree-hdr User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813531 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1688 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17326 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs This type is only embedded in struct xfs_btree_block and never used directly. 
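To illustrate that embedding, here is a minimal standalone sketch, not the kernel code itself. The `sketch_*` names and the plain `uint32_t`/`uint16_t` stand-ins for `__be32`/`__be16` are assumptions made so the fragment compiles outside the kernel tree, and the sibling-pointer union that follows the header in the real structure is left out. It shows the header fields nested under `bb_h` next to the same fields declared directly in the block, and checks at compile time that the layout is unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's __be32/__be16 types. */
typedef uint32_t sketch_be32;
typedef uint16_t sketch_be16;

/* Old shape: the header is a separate type embedded as bb_h. */
struct sketch_old_hdr {
	sketch_be32	bb_magic;	/* magic number for block type */
	sketch_be16	bb_level;	/* 0 is a leaf */
	sketch_be16	bb_numrecs;	/* current # of data records */
};

struct sketch_old_block {
	struct sketch_old_hdr	bb_h;	/* header */
};

/* New shape: the same fields declared directly in the block. */
struct sketch_new_block {
	sketch_be32	bb_magic;
	sketch_be16	bb_level;
	sketch_be16	bb_numrecs;
};

/* Flattening the header must not change size or field offsets. */
static_assert(sizeof(struct sketch_old_block) ==
	      sizeof(struct sketch_new_block),
	      "flattening must not change the block header size");
static_assert(offsetof(struct sketch_old_block, bb_h) +
	      offsetof(struct sketch_old_hdr, bb_level) ==
	      offsetof(struct sketch_new_block, bb_level),
	      "bb_level must stay at the same offset");

/* Old access path goes through bb_h. */
static inline sketch_be16
sketch_old_level(const struct sketch_old_block *block)
{
	return block->bb_h.bb_level;
}

/* New access path reads the field directly. */
static inline sketch_be16
sketch_new_level(const struct sketch_new_block *block)
{
	return block->bb_level;
}
```

Only the spelling of the access changes (`block->bb_h.bb_level` becomes `block->bb_level`); the field layout in this sketch is identical in both shapes.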
By moving the fields directly into struct xfs_btree_block a lot of the macros for struct xfs_btree_sblock and struct xfs_btree_lblock can be used for struct xfs_btree_block too now which helps greatly with some of the migrations during implementing the generic btree code. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-02 04:01:21.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-02 04:04:22.000000000 +0200 @@ -62,13 +62,13 @@ xfs_btree_maxrecs( case XFS_BTNUM_BNO: case XFS_BTNUM_CNT: return (int)XFS_ALLOC_BLOCK_MAXRECS( - be16_to_cpu(block->bb_h.bb_level), cur); + be16_to_cpu(block->bb_level), cur); case XFS_BTNUM_BMAP: return (int)XFS_BMAP_BLOCK_IMAXRECS( - be16_to_cpu(block->bb_h.bb_level), cur); + be16_to_cpu(block->bb_level), cur); case XFS_BTNUM_INO: return (int)XFS_INOBT_BLOCK_MAXRECS( - be16_to_cpu(block->bb_h.bb_level), cur); + be16_to_cpu(block->bb_level), cur); default: ASSERT(0); return 0; @@ -634,7 +634,7 @@ xfs_btree_firstrec( /* * It's empty, there is no such record. */ - if (!block->bb_h.bb_numrecs) + if (!block->bb_numrecs) return 0; /* * Set the ptr value to 1, that's the first record/key. @@ -663,12 +663,12 @@ xfs_btree_lastrec( /* * It's empty, there is no such record. */ - if (!block->bb_h.bb_numrecs) + if (!block->bb_numrecs) return 0; /* * Set the ptr value to numrecs, that's the last record/key. */ - cur->bc_ptrs[level] = be16_to_cpu(block->bb_h.bb_numrecs); + cur->bc_ptrs[level] = be16_to_cpu(block->bb_numrecs); return 1; } Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-02 04:00:28.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-02 04:05:00.000000000 +0200 @@ -63,15 +63,10 @@ typedef struct xfs_btree_lblock { /* * Combined header and structure, used by common code. 
*/ -typedef struct xfs_btree_hdr -{ +typedef struct xfs_btree_block { __be32 bb_magic; /* magic number for block type */ __be16 bb_level; /* 0 is a leaf */ __be16 bb_numrecs; /* current # of data records */ -} xfs_btree_hdr_t; - -typedef struct xfs_btree_block { - xfs_btree_hdr_t bb_h; /* header */ union { struct { __be32 bb_leftsib; -- From owner-xfs@oss.sgi.com Sun Aug 3 18:32:03 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:32:07 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741W2a8030847 for ; Sun, 3 Aug 2008 18:32:03 -0700 X-ASG-Debug-ID: 1217813595-40d501b50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 228D7EF6A27 for ; Sun, 3 Aug 2008 18:33:15 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id wRWTlKmdRjaYqMBY for ; Sun, 03 Aug 2008 18:33:15 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741XHIF009000 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:33:17 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741XHuo008998 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:33:17 +0200 Date: Mon, 4 Aug 2008 03:33:17 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 07/26] refactor btree validation helpers Subject: [PATCH 07/26] refactor btree validation helpers Message-ID: <20080804013317.GH8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; 
filename=xfs-btree-check-refactor User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813597 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.82 X-Barracuda-Spam-Status: No, SCORE=-1.82 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_MJ615 X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.20 BSF_SC0_MJ615 Custom Rule MJ615 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17332 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Move the various btree validation helpers around in xfs_btree.c so that they are close to each other and in common #ifdef DEBUG sections. Also add a new xfs_btree_check_ptr helper to check a btree ptr that can be either long or short form. Split out from a bigger patch from Dave Chinner with various small changes applied by me. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-07-29 21:35:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-01 02:34:35.000000000 +0200 @@ -81,24 +81,6 @@ xfs_btree_maxrecs( #ifdef DEBUG /* - * Debug routine: check that block header is ok. 
- */ -void -xfs_btree_check_block( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_block_t *block, /* generic btree block pointer */ - int level, /* level of the btree block */ - xfs_buf_t *bp) /* buffer containing block, if any */ -{ - if (cur->bc_flags & XFS_BTREE_LONG_PTRS) - xfs_btree_check_lblock(cur, (xfs_btree_lblock_t *)block, level, - bp); - else - xfs_btree_check_sblock(cur, (xfs_btree_sblock_t *)block, level, - bp); -} - -/* * Debug routine: check that keys are in the right order. */ void @@ -150,66 +132,8 @@ xfs_btree_check_key( ASSERT(0); } } -#endif /* DEBUG */ /* - * Checking routine: check that long form block header is ok. - */ -/* ARGSUSED */ -int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_lblock( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_lblock_t *block, /* btree long form block pointer */ - int level, /* level of the btree block */ - xfs_buf_t *bp) /* buffer for block, if any */ -{ - int lblock_ok; /* block passes checks */ - xfs_mount_t *mp; /* file system mount point */ - - mp = cur->bc_mp; - lblock_ok = - be32_to_cpu(block->bb_magic) == xfs_magics[cur->bc_btnum] && - be16_to_cpu(block->bb_level) == level && - be16_to_cpu(block->bb_numrecs) <= - xfs_btree_maxrecs(cur, (xfs_btree_block_t *)block) && - block->bb_leftsib && - (be64_to_cpu(block->bb_leftsib) == NULLDFSBNO || - XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(block->bb_leftsib))) && - block->bb_rightsib && - (be64_to_cpu(block->bb_rightsib) == NULLDFSBNO || - XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(block->bb_rightsib))); - if (unlikely(XFS_TEST_ERROR(!lblock_ok, mp, XFS_ERRTAG_BTREE_CHECK_LBLOCK, - XFS_RANDOM_BTREE_CHECK_LBLOCK))) { - if (bp) - xfs_buftrace("LBTREE ERROR", bp); - XFS_ERROR_REPORT("xfs_btree_check_lblock", XFS_ERRLEVEL_LOW, - mp); - return XFS_ERROR(EFSCORRUPTED); - } - return 0; -} - -/* - * Checking routine: check that (long) pointer is ok. 
- */ -int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_lptr( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_dfsbno_t ptr, /* btree block disk address */ - int level) /* btree block level */ -{ - xfs_mount_t *mp; /* file system mount point */ - - mp = cur->bc_mp; - XFS_WANT_CORRUPTED_RETURN( - level > 0 && - ptr != NULLDFSBNO && - XFS_FSB_SANITY_CHECK(mp, ptr)); - return 0; -} - -#ifdef DEBUG -/* * Debug routine: check that records are in the right order. */ void @@ -268,10 +192,39 @@ xfs_btree_check_rec( } #endif /* DEBUG */ -/* - * Checking routine: check that block header is ok. - */ -/* ARGSUSED */ +int /* error (0 or EFSCORRUPTED) */ +xfs_btree_check_lblock( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_btree_lblock_t *block, /* btree long form block pointer */ + int level, /* level of the btree block */ + xfs_buf_t *bp) /* buffer for block, if any */ +{ + int lblock_ok; /* block passes checks */ + xfs_mount_t *mp; /* file system mount point */ + + mp = cur->bc_mp; + lblock_ok = + be32_to_cpu(block->bb_magic) == xfs_magics[cur->bc_btnum] && + be16_to_cpu(block->bb_level) == level && + be16_to_cpu(block->bb_numrecs) <= + xfs_btree_maxrecs(cur, (xfs_btree_block_t *)block) && + block->bb_leftsib && + (be64_to_cpu(block->bb_leftsib) == NULLDFSBNO || + XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(block->bb_leftsib))) && + block->bb_rightsib && + (be64_to_cpu(block->bb_rightsib) == NULLDFSBNO || + XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(block->bb_rightsib))); + if (unlikely(XFS_TEST_ERROR(!lblock_ok, mp, XFS_ERRTAG_BTREE_CHECK_LBLOCK, + XFS_RANDOM_BTREE_CHECK_LBLOCK))) { + if (bp) + xfs_buftrace("LBTREE ERROR", bp); + XFS_ERROR_REPORT("xfs_btree_check_lblock", XFS_ERRLEVEL_LOW, + mp); + return XFS_ERROR(EFSCORRUPTED); + } + return 0; +} + int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_sblock( xfs_btree_cur_t *cur, /* btree cursor */ @@ -311,27 +264,81 @@ xfs_btree_check_sblock( } /* - * Checking routine: check that (short) pointer is ok. 
+ * Debug routine: check that block header is ok. + */ +int +xfs_btree_check_block( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_btree_block_t *block, /* generic btree block pointer */ + int level, /* level of the btree block */ + xfs_buf_t *bp) /* buffer containing block, if any */ +{ + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + return xfs_btree_check_lblock(cur, (xfs_btree_lblock_t *)block, + level, bp); + } + + return xfs_btree_check_sblock(cur, (xfs_btree_sblock_t *)block, + level, bp); +} + +/* + * Check that (long) pointer is ok. */ int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_sptr( +xfs_btree_check_lptr( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t ptr, /* btree block disk address */ + xfs_dfsbno_t bno, /* btree block disk address */ int level) /* btree block level */ { - xfs_buf_t *agbp; /* buffer for ag. freespace struct */ - xfs_agf_t *agf; /* ag. freespace structure */ + xfs_mount_t *mp; /* file system mount point */ + + mp = cur->bc_mp; + XFS_WANT_CORRUPTED_RETURN( + level > 0 && + bno != NULLDFSBNO && + XFS_FSB_SANITY_CHECK(mp, bno)); + return 0; +} + +/* + * Check that (short) pointer is ok. + */ +int /* error (0 or EFSCORRUPTED) */ +xfs_btree_check_sptr( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_agblock_t bno, /* btree block disk address */ + int level) /* btree block level */ +{ + xfs_agblock_t agblocks = cur->bc_mp->m_sb.sb_agblocks; - agbp = cur->bc_private.a.agbp; - agf = XFS_BUF_TO_AGF(agbp); XFS_WANT_CORRUPTED_RETURN( level > 0 && - ptr != NULLAGBLOCK && ptr != 0 && - ptr < be32_to_cpu(agf->agf_length)); + bno != NULLAGBLOCK && + bno != 0 && + bno < agblocks); return 0; } /* + * Check that block ptr is ok. 
+ */ +int /* error (0 or EFSCORRUPTED) */ +xfs_btree_check_ptr( + struct xfs_btree_cur *cur, /* btree cursor */ + union xfs_btree_ptr *ptr, /* btree block disk address */ + int index, /* offset from ptr to check */ + int level) /* btree block level */ +{ + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { + return xfs_btree_check_lptr(cur, + be64_to_cpu((&ptr->l)[index]), + level); + } + return xfs_btree_check_sptr(cur, be32_to_cpu((&ptr->s)[index]), level); +} + +/* * Delete the btree cursor. */ void Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-07-29 21:35:34.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-01 02:34:35.000000000 +0200 @@ -223,52 +251,38 @@ typedef struct xfs_btree_cur #ifdef __KERNEL__ -#ifdef DEBUG /* - * Debug routine: check that block header is ok. + * Check that long form block header is ok. */ -void -xfs_btree_check_block( +int /* error (0 or EFSCORRUPTED) */ +xfs_btree_check_lblock( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_block_t *block, /* generic btree block pointer */ + xfs_btree_lblock_t *block, /* btree long form block pointer */ int level, /* level of the btree block */ struct xfs_buf *bp); /* buffer containing block, if any */ /* - * Debug routine: check that keys are in the right order. - */ -void -xfs_btree_check_key( - xfs_btnum_t btnum, /* btree identifier */ - void *ak1, /* pointer to left (lower) key */ - void *ak2); /* pointer to right (higher) key */ - -/* - * Debug routine: check that records are in the right order. + * Check that short form block header is ok. 
*/ -void -xfs_btree_check_rec( - xfs_btnum_t btnum, /* btree identifier */ - void *ar1, /* pointer to left (lower) record */ - void *ar2); /* pointer to right (higher) record */ -#else -#define xfs_btree_check_block(a,b,c,d) -#define xfs_btree_check_key(a,b,c) -#define xfs_btree_check_rec(a,b,c) -#endif /* DEBUG */ +int /* error (0 or EFSCORRUPTED) */ +xfs_btree_check_sblock( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_btree_sblock_t *block, /* btree short form block pointer */ + int level, /* level of the btree block */ + struct xfs_buf *bp); /* buffer containing block */ /* - * Checking routine: check that long form block header is ok. + * Check that block header is ok. */ -int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_lblock( +int +xfs_btree_check_block( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_lblock_t *block, /* btree long form block pointer */ + xfs_btree_block_t *block, /* generic btree block pointer */ int level, /* level of the btree block */ struct xfs_buf *bp); /* buffer containing block, if any */ /* - * Checking routine: check that (long) pointer is ok. + * Check that (long) pointer is ok. */ int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_lptr( @@ -279,25 +293,50 @@ xfs_btree_check_lptr( #define xfs_btree_check_lptr_disk(cur, ptr, level) \ xfs_btree_check_lptr(cur, be64_to_cpu(ptr), level) + /* - * Checking routine: check that short form block header is ok. + * Check that (short) pointer is ok. */ int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_sblock( +xfs_btree_check_sptr( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_sblock_t *block, /* btree short form block pointer */ - int level, /* level of the btree block */ - struct xfs_buf *bp); /* buffer containing block */ + xfs_agblock_t ptr, /* btree block disk address */ + int level); /* btree block level */ /* - * Checking routine: check that (short) pointer is ok. + * Check that (short) pointer is ok. 
*/ int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_sptr( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t ptr, /* btree block disk address */ +xfs_btree_check_ptr( + struct xfs_btree_cur *cur, /* btree cursor */ + union xfs_btree_ptr *ptr, /* btree block disk address */ + int index, /* offset from ptr to check */ int level); /* btree block level */ +#ifdef DEBUG + +/* + * Debug routine: check that keys are in the right order. + */ +void +xfs_btree_check_key( + xfs_btnum_t btnum, /* btree identifier */ + void *ak1, /* pointer to left (lower) key */ + void *ak2); /* pointer to right (higher) key */ + +/* + * Debug routine: check that records are in the right order. + */ +void +xfs_btree_check_rec( + xfs_btnum_t btnum, /* btree identifier */ + void *ar1, /* pointer to left (lower) record */ + void *ar2); /* pointer to right (higher) record */ +#else +#define xfs_btree_check_key(a,b,c) +#define xfs_btree_check_rec(a,b,c) +#endif /* DEBUG */ + /* * Delete the btree cursor. */ -- From owner-xfs@oss.sgi.com Sun Aug 3 18:34:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:34:29 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_42, J_CHICKENPOX_43,J_CHICKENPOX_44,J_CHICKENPOX_45,J_CHICKENPOX_46, J_CHICKENPOX_47,J_CHICKENPOX_48,J_CHICKENPOX_63,J_CHICKENPOX_66 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741YHKQ001058 for ; Sun, 3 Aug 2008 18:34:17 -0700 X-ASG-Debug-ID: 1217813728-40cc01ad0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 21391EF6ABA for ; Sun, 3 Aug 2008 18:35:28 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with 
ESMTP id X86ZcbO7M0l71UhA for ; Sun, 03 Aug 2008 18:35:28 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741ZUIF009261 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:35:30 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741ZU7i009259 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:35:30 +0200 Date: Mon, 4 Aug 2008 03:35:30 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 20/26] move xfs_bmbt_newroot to common code Subject: [PATCH 20/26] move xfs_bmbt_newroot to common code Message-ID: <20080804013530.GU8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-iroot_to_root User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813731 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1689 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17343 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs xfs_bmbt_newroot is a mostly generic implementation of moving from an inode root to a real block based root. 
So move it to xfs_btree.c where it can use all the nice infrastructure there and make it pointer size agnostic The new name for it is xfs_btree_iroot_to_root which is not very nice but at least slightly more descriptive than the old name. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-03 15:34:20.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-03 22:42:26.000000000 +0200 @@ -2250,6 +2250,106 @@ error0: } /* + * Give the btree a real root block. Copy the old broot contents + * down into a real block and make the broot point to it. + */ +int /* error */ +xfs_btree_iroot_to_root( + struct xfs_btree_cur *cur, /* btree cursor */ + int *logflags, /* logging flags for inode */ + int *stat) /* return status - 0 fail */ +{ + struct xfs_buf *bp; /* buffer for block */ + struct xfs_buf *cbp; /* buffer for cblock */ + struct xfs_btree_block *block; /* btree block */ + struct xfs_btree_block *cblock; /* child btree block */ + union xfs_btree_key *ckp; /* child key pointer */ + union xfs_btree_ptr *cpp; /* child ptr pointer */ + union xfs_btree_key *kp; /* pointer to btree key */ + union xfs_btree_ptr *pp; /* pointer to block addr */ + union xfs_btree_ptr nptr; /* new block addr */ + int level; /* btree level */ + int error; /* error return code */ +#ifdef DEBUG + int i; /* loop counter */ +#endif + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_STATS_INC(cur, newroot); + + level = cur->bc_nlevels - 1; + /* XXX(hch): this should be an get_inode_from_root, right? */ + block = xfs_btree_get_block(cur, level, &bp); + pp = cur->bc_ops->ptr_addr(cur, 1, block); + + /* Allocate the new block. If we can't do it, we're toast. Give up. 
*/ + error = cur->bc_ops->alloc_block(cur, pp, &nptr, 1, stat); + if (error) + goto error0; + if (*stat == 0) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; + } + XFS_BTREE_STATS_INC(cur, alloc); + + /* Copy the root into a real block. */ + error = xfs_btree_get_buf_block(cur, &nptr, 0, &cblock, &cbp); + if (error) + goto error0; + + if (cur->bc_flags & XFS_BTREE_LONG_PTRS) + memcpy(cblock, block, sizeof(struct xfs_btree_lblock)); + else + memcpy(cblock, block, sizeof(struct xfs_btree_sblock)); + + be16_add(&block->bb_level, 1); + xfs_btree_set_numrecs(block, 1); + cur->bc_nlevels++; + cur->bc_ptrs[level + 1] = 1; + + kp = cur->bc_ops->key_addr(cur, 1, block); + ckp = cur->bc_ops->key_addr(cur, 1, cblock); + cur->bc_ops->copy_keys(cur, kp, ckp, xfs_btree_get_numrecs(cblock)); + + cpp = cur->bc_ops->ptr_addr(cur, 1, cblock); +#ifdef DEBUG + for (i = 0; i < be16_to_cpu(cblock->bb_numrecs); i++) { + error = xfs_btree_check_ptr(cur, pp, i, level); + if (error) + goto error0; + } +#endif + xfs_btree_copy_ptrs(cur, pp, cpp, xfs_btree_get_numrecs(cblock)); + +#ifdef DEBUG + error = xfs_btree_check_ptr(cur, &nptr, 0, level); + if (error) + goto error0; +#endif + xfs_btree_set_ptr(cur, pp, 0, &nptr); + + cur->bc_ops->realloc_root(cur, 1 - xfs_btree_get_numrecs(cblock)); + xfs_btree_setbuf(cur, level, cbp); + + /* + * Do all this logging at the end so that + * the root is at the right level. + */ + xfs_btree_log_block(cur, cbp, XFS_BB_ALL_BITS); + cur->bc_ops->log_keys(cur, cbp, 1, be16_to_cpu(cblock->bb_numrecs)); + xfs_btree_log_ptrs(cur, cbp, 1, be16_to_cpu(cblock->bb_numrecs)); + + *logflags |= + XFS_ILOG_CORE | XFS_ILOG_FBROOT(cur->bc_private.b.whichfork); + *stat = 1; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* * Allocate a new root block, fill it in. 
*/ int /* error */ Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c 2008-08-03 16:02:56.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c 2008-08-03 22:42:26.000000000 +0200 @@ -476,7 +476,7 @@ xfs_bmap_add_attrfork_btree( goto error0; /* must be at least one entry */ XFS_WANT_CORRUPTED_GOTO(stat == 1, error0); - if ((error = xfs_bmbt_newroot(cur, flags, &stat))) + if ((error = xfs_btree_iroot_to_root(cur, flags, &stat))) goto error0; if (stat == 0) { xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR); Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-03 15:34:19.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-03 22:42:25.000000000 +0200 @@ -525,7 +525,7 @@ xfs_bmbt_insrec( cur->bc_private.b.whichfork); block = xfs_bmbt_get_block(cur, level, &bp); } else if (level == cur->bc_nlevels - 1) { - if ((error = xfs_bmbt_newroot(cur, &logflags, stat)) || + if ((error = xfs_btree_iroot_to_root(cur, &logflags, stat)) || *stat == 0) { XFS_BMBT_TRACE_CURSOR(cur, ERROR); return error; @@ -1119,117 +1119,6 @@ xfs_bmbt_log_block( } /* - * Give the bmap btree a new root block. Copy the old broot contents - * down into a real block and make the broot point to it. 
- */ -int /* error */ -xfs_bmbt_newroot( - xfs_btree_cur_t *cur, /* btree cursor */ - int *logflags, /* logging flags for inode */ - int *stat) /* return status - 0 fail */ -{ - xfs_alloc_arg_t args; /* allocation arguments */ - xfs_bmbt_block_t *block; /* bmap btree block */ - xfs_buf_t *bp; /* buffer for block */ - xfs_bmbt_block_t *cblock; /* child btree block */ - xfs_bmbt_key_t *ckp; /* child key pointer */ - xfs_bmbt_ptr_t *cpp; /* child ptr pointer */ - int error; /* error return code */ -#ifdef DEBUG - int i; /* loop counter */ -#endif - xfs_bmbt_key_t *kp; /* pointer to bmap btree key */ - int level; /* btree level */ - xfs_bmbt_ptr_t *pp; /* pointer to bmap block addr */ - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - level = cur->bc_nlevels - 1; - block = xfs_bmbt_get_block(cur, level, &bp); - /* - * Copy the root into a real block. - */ - args.mp = cur->bc_mp; - pp = XFS_BMAP_PTR_IADDR(block, 1, cur); - args.tp = cur->bc_tp; - args.fsbno = cur->bc_private.b.firstblock; - args.mod = args.minleft = args.alignment = args.total = args.isfl = - args.userdata = args.minalignslop = 0; - args.minlen = args.maxlen = args.prod = 1; - args.wasdel = cur->bc_private.b.flags & XFS_BTCUR_BPRV_WASDEL; - args.firstblock = args.fsbno; - if (args.fsbno == NULLFSBLOCK) { -#ifdef DEBUG - if ((error = xfs_btree_check_lptr_disk(cur, *pp, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - args.fsbno = be64_to_cpu(*pp); - args.type = XFS_ALLOCTYPE_START_BNO; - } else if (cur->bc_private.b.flist->xbf_low) - args.type = XFS_ALLOCTYPE_START_BNO; - else - args.type = XFS_ALLOCTYPE_NEAR_BNO; - if ((error = xfs_alloc_vextent(&args))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (args.fsbno == NULLFSBLOCK) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - ASSERT(args.len == 1); - cur->bc_private.b.firstblock = args.fsbno; - cur->bc_private.b.allocated++; - cur->bc_private.b.ip->i_d.di_nblocks++; - 
XFS_TRANS_MOD_DQUOT_BYINO(args.mp, args.tp, cur->bc_private.b.ip, - XFS_TRANS_DQ_BCOUNT, 1L); - bp = xfs_btree_get_bufl(args.mp, cur->bc_tp, args.fsbno, 0); - cblock = XFS_BUF_TO_BMBT_BLOCK(bp); - *cblock = *block; - be16_add(&block->bb_level, 1); - block->bb_numrecs = cpu_to_be16(1); - cur->bc_nlevels++; - cur->bc_ptrs[level + 1] = 1; - kp = XFS_BMAP_KEY_IADDR(block, 1, cur); - ckp = XFS_BMAP_KEY_IADDR(cblock, 1, cur); - memcpy(ckp, kp, be16_to_cpu(cblock->bb_numrecs) * sizeof(*kp)); - cpp = XFS_BMAP_PTR_IADDR(cblock, 1, cur); -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(cblock->bb_numrecs); i++) { - if ((error = xfs_btree_check_lptr_disk(cur, pp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } -#endif - memcpy(cpp, pp, be16_to_cpu(cblock->bb_numrecs) * sizeof(*pp)); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr(cur, args.fsbno, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - *pp = cpu_to_be64(args.fsbno); - xfs_iroot_realloc(cur->bc_private.b.ip, 1 - be16_to_cpu(cblock->bb_numrecs), - cur->bc_private.b.whichfork); - xfs_btree_setbuf(cur, level, bp); - /* - * Do all this logging at the end so that - * the root is at the right level. - */ - xfs_bmbt_log_block(cur, bp, XFS_BB_ALL_BITS); - xfs_bmbt_log_keys(cur, bp, 1, be16_to_cpu(cblock->bb_numrecs)); - xfs_bmbt_log_ptrs(cur, bp, 1, be16_to_cpu(cblock->bb_numrecs)); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *logflags |= - XFS_ILOG_CORE | XFS_ILOG_FBROOT(cur->bc_private.b.whichfork); - *stat = 1; - return 0; -} - -/* * Set all the fields in a bmap extent record from the arguments. 
*/ void @@ -1593,6 +1482,15 @@ xfs_bmbt_get_root_from_inode( return (struct xfs_btree_block *)ifp->if_broot; } +STATIC void +xfs_bmbt_realloc_root( + struct xfs_btree_cur *cur, + int index) +{ + xfs_iroot_realloc(cur->bc_private.b.ip, index, + cur->bc_private.b.whichfork); +} + STATIC int xfs_bmbt_get_maxrecs( struct xfs_btree_cur *cur, @@ -1881,6 +1779,7 @@ xfs_bmbt_trace_record( static const struct xfs_btree_ops xfs_bmbt_ops = { .dup_cursor = xfs_bmbt_dup_cursor, .get_root_from_inode = xfs_bmbt_get_root_from_inode, + .realloc_root = xfs_bmbt_realloc_root, .alloc_block = xfs_bmbt_alloc_block, .init_key_from_rec = xfs_bmbt_init_key_from_rec, .init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-08-03 16:01:23.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-08-03 22:42:26.000000000 +0200 @@ -255,12 +255,6 @@ extern void xfs_bmbt_log_block(struct xf extern void xfs_bmbt_log_recs(struct xfs_btree_cur *, struct xfs_buf *, int, int); -/* - * Give the bmap btree a new root block. Copy the old broot contents - * down into a real block and make the broot point to it. 
- */ -extern int xfs_bmbt_newroot(struct xfs_btree_cur *cur, int *lflags, int *stat); - extern void xfs_bmbt_set_all(xfs_bmbt_rec_host_t *r, xfs_bmbt_irec_t *s); extern void xfs_bmbt_set_allf(xfs_bmbt_rec_host_t *r, xfs_fileoff_t o, xfs_fsblock_t b, xfs_filblks_t c, xfs_exntst_t v); Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-03 15:59:03.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-03 22:42:25.000000000 +0200 @@ -183,8 +183,9 @@ struct xfs_btree_ops { /* cursor operations */ struct xfs_btree_cur *(*dup_cursor)(struct xfs_btree_cur *); - /* get inode rooted btree root */ + /* btree root in inode support */ struct xfs_btree_block *(*get_root_from_inode)(struct xfs_btree_cur *); + void (*realloc_root)(struct xfs_btree_cur *cur, int index); /* btree root operations */ void (*set_root)(struct xfs_btree_cur *cur, @@ -586,6 +587,7 @@ int xfs_btree_rshift(struct xfs_btree_cu int xfs_btree_split(struct xfs_btree_cur *, int, union xfs_btree_ptr *, union xfs_btree_key *, struct xfs_btree_cur **, int *); int xfs_btree_new_root(struct xfs_btree_cur *, int *); +int xfs_btree_iroot_to_root(struct xfs_btree_cur *, int *, int *); /* -- From owner-xfs@oss.sgi.com Sun Aug 3 18:45:01 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:45:03 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741j0ii004526 for ; Sun, 3 Aug 2008 18:45:01 -0700 X-ASG-Debug-ID: 1217814373-16ac03850000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 
B925C197648C for ; Sun, 3 Aug 2008 18:46:13 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id 0G9RyZxwt2Hedihn for ; Sun, 03 Aug 2008 18:46:13 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AkwMAJP6lUh5LBjh/2dsb2JhbACKZKMZ X-IronPort-AV: E=Sophos;i="4.31,302,1215354600"; d="scan'208";a="163497816" Received: from ppp121-44-24-225.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.24.225]) by ipmail01.adl6.internode.on.net with ESMTP; 04 Aug 2008 11:16:11 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KPp9G-0001Xz-I0; Mon, 04 Aug 2008 11:46:10 +1000 Date: Mon, 4 Aug 2008 11:46:10 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 01/26] kill struct xfs_btree_hdr Subject: Re: [PATCH 01/26] kill struct xfs_btree_hdr Message-ID: <20080804014610.GD6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013211.GB8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013211.GB8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217814374 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1688 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17351 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 03:32:11AM +0200, Christoph Hellwig wrote: > This type is only embedded in struct xfs_btree_block and never used > directly. By moving the fields directly into struct xfs_btree_block > a lot of the macros for struct xfs_btree_sblock and struct xfs_btree_lblock > can be used for struct xfs_btree_block too now which helps greatly > with some of the migrations during implementing the generic btree code. Looks fine. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Sun Aug 3 18:50:49 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:50:55 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_63, J_CHICKENPOX_64,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741YeFP001455 for ; Sun, 3 Aug 2008 18:34:40 -0700 X-ASG-Debug-ID: 1217813750-739d01d20000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 06C513566D5 for ; Sun, 3 Aug 2008 18:35:50 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id W3LPQ4D8y3yUKQQH for ; Sun, 03 Aug 2008 18:35:50 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m741ZoIF009311 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Mon, 4 Aug 2008 03:35:50 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m741ZoEA009309 for xfs@oss.sgi.com; Mon, 4 Aug 2008 03:35:50 +0200 Date: Mon, 4 Aug 2008 03:35:50 +0200 From: Christoph Hellwig To: 
xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 23/26] implement generic xfs_btree_delete/delrec Subject: [PATCH 23/26] implement generic xfs_btree_delete/delrec Message-ID: <20080804013550.GX8819@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline; filename=xfs-common-btree-delete User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217813753 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.22 X-Barracuda-Spam-Status: No, SCORE=-1.22 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_MJ615, MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1687 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words 0.20 BSF_SC0_MJ615 Custom Rule MJ615 X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17352 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Make the btree delete code generic. Based on a patch from David Chinner with lots of changes to follow the original btree implementations more closely. While this loses some of the generic helper routines for inserting/moving/removing records, it also solves some of the off-by-one bugs in the original code and makes it easier to verify. 
Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-04 01:18:45.000000000 +0200 @@ -2943,3 +2943,639 @@ out0: XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); return 0; } + +STATIC int +xfs_btree_dec_cursor( + struct xfs_btree_cur *cur, + int level, + int *stat) +{ + int error; + int i; + + if (level > 0) { + error = xfs_btree_decrement(cur, level, &i); + if (error) + return error; + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; +} + +STATIC int +xfs_btree_update_root( + struct xfs_btree_cur *cur, + struct xfs_buf *bp, + int level, + union xfs_btree_ptr *newroot) +{ + int error; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_STATS_INC(cur, killroot); + +#ifdef DEBUG + if (cur->bc_btnum == XFS_BTNUM_BNO || cur->bc_btnum == XFS_BTNUM_CNT) { + struct xfs_buf *agbp = cur->bc_private.a.agbp; + struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp); + + ASSERT(be32_to_cpu(agf->agf_roots[cur->bc_btnum]) == + XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(bp))); + } else if (cur->bc_btnum == XFS_BTNUM_INO) { + struct xfs_buf *agbp = cur->bc_private.a.agbp; + struct xfs_agi *agi = XFS_BUF_TO_AGI(agbp); + xfs_fsblock_t fsbno; + + fsbno = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno, + be32_to_cpu(agi->agi_root)); + + ASSERT(fsbno == XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp))); + } +#endif + + /* + * Update the root pointer, decreasing the level by 1 and then + * free the old root. + */ + cur->bc_ops->set_root(cur, newroot, -1); + error = cur->bc_ops->free_block(cur, bp, 1); + if (error) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; + } + + XFS_BTREE_STATS_INC(cur, free); + + /* Update the cursor so there's one fewer level. 
*/ + xfs_btree_setbuf(cur, level, NULL); + cur->bc_nlevels--; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; +} + +/* + * Single level of the btree record deletion routine. + * Delete record pointed to by cur/level. + * Remove the record from its block then rebalance the tree. + * Return 0 for error, 1 for done, 2 to go on to the next level. + */ +STATIC int /* error */ +xfs_btree_delrec( + struct xfs_btree_cur *cur, /* btree cursor */ + int level, /* level removing record from */ + int *stat) /* fail/done/go-on */ +{ + struct xfs_btree_block *block; /* bmap btree block */ + union xfs_btree_ptr cptr; /* current block ptr */ + struct xfs_buf *bp; /* buffer for block */ + int error; /* error return value */ + int i; /* loop counter */ + union xfs_btree_key key; /* bmap btree key */ + union xfs_btree_ptr lptr; /* left sibling block ptr */ + struct xfs_buf *lbp; /* left buffer pointer */ + struct xfs_btree_block *left; /* left btree block */ + int lrecs = 0; /* left record count */ + int ptr; /* key/record index */ + union xfs_btree_ptr rptr; /* right sibling block ptr */ + struct xfs_buf *rbp; /* right buffer pointer */ + struct xfs_btree_block *right; /* right btree block */ + struct xfs_btree_block *rrblock; /* right-right btree block */ + struct xfs_buf *rrbp; /* right-right buffer pointer */ + int rrecs = 0; /* right record count */ + struct xfs_btree_cur *tcur; /* temporary btree cursor */ + int numrecs; /* temporary numrec count */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + + tcur = NULL; + + /* Get the index of the entry being deleted, check for nothing there. */ + ptr = cur->bc_ptrs[level]; + if (ptr == 0) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + } + + /* Get the buffer & block containing the record or key/ptr. 
*/ + block = xfs_btree_get_block(cur, level, &bp); + numrecs = xfs_btree_get_numrecs(block); + +#ifdef DEBUG + error = xfs_btree_check_block(cur, block, level, bp); + if (error) + goto error0; +#endif + + /* Fail if we're off the end of the block. */ + if (ptr > numrecs) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + } + + XFS_BTREE_STATS_INC(cur, delrec); + XFS_BTREE_STATS_ADD(cur, moves, numrecs - ptr); + + /* Excise the entries being deleted. */ + if (level > 0) { + /* It's a nonleaf. operate on keys and ptrs */ + union xfs_btree_key *lkp; /* pointer to bmap btree key */ + union xfs_btree_ptr *lpp; /* pointer to bmap block addr */ + + lkp = cur->bc_ops->key_addr(cur, 1, block); + lpp = cur->bc_ops->ptr_addr(cur, 1, block); + +#ifdef DEBUG + for (i = ptr; i < numrecs; i++) { + error = xfs_btree_check_ptr(cur, lpp, i, level); + if (error) + goto error0; + } +#endif + + if (ptr < numrecs) { + cur->bc_ops->move_keys(cur, lkp, ptr, ptr - 1, + numrecs - ptr); + xfs_btree_move_ptrs(cur, lpp, ptr, ptr - 1, + numrecs - ptr); + cur->bc_ops->log_keys(cur, bp, ptr, numrecs - 1); + xfs_btree_log_ptrs(cur, bp, ptr, numrecs - 1); + } + } else { + /* It's a leaf. operate on records */ + union xfs_btree_rec *lrp; /* pointer to bmap btree rec */ + + lrp = cur->bc_ops->rec_addr(cur, 1, block); + if (ptr < numrecs) { + cur->bc_ops->move_recs(cur, lrp, ptr, ptr - 1, + numrecs - ptr); + cur->bc_ops->log_recs(cur, bp, ptr, numrecs - 1); + } + + /* + * If it's the first record in the block, we'll need a key + * structure to pass up to the next level (updkey). + */ + if (ptr == 1) + cur->bc_ops->init_key_from_rec(cur, &key, lrp); + } + + /* + * Decrement and log the number of entries in the block. + */ + xfs_btree_set_numrecs(block, --numrecs); + xfs_btree_log_block(cur, bp, XFS_BB_NUMRECS); + + /* + * If we are tracking the last record in the tree and + * we are at the far right edge of the tree, update it. 
+ */ + if (xfs_btree_is_lastrec(cur, block, level)) { + cur->bc_ops->update_lastrec(cur, block, NULL, + ptr, LASTREC_DELREC); + } + + /* + * We're at the root level. First, shrink the root block in-memory. + * Try to get rid of the next level down. If we can't then there's + * nothing left to do. + */ + if (level == cur->bc_nlevels - 1) { + if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) { + cur->bc_ops->realloc_root(cur, -1); + + error = xfs_btree_root_to_iroot(cur); + if (error) + goto error0; + + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + *stat = 1; + return 0; + } + + /* + * If this is the root level, and there's only one entry left, + * and it's NOT the leaf level, then we can get rid of this + * level. + */ + if (numrecs == 1 && level > 0) { + union xfs_btree_ptr *pp; + /* + * pp is still set to the first pointer in the block. + * Make it the new root of the btree. + */ + pp = cur->bc_ops->ptr_addr(cur, 1, block); + error = xfs_btree_update_root(cur, bp, level, pp); + if (error) + goto error0; + } else if (level > 0) { + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + } + *stat = 1; + return 0; + } + + /* + * If we deleted the leftmost entry in the block, update the + * key values above us in the tree. + */ + if (ptr == 1) { + error = xfs_btree_updkey(cur, &key, level + 1); + if (error) + goto error0; + } + + /* + * If the number of records remaining in the block is at least + * the minimum, we're done. + */ + if (numrecs >= cur->bc_ops->get_minrecs(cur, level)) { + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + + /* + * Otherwise, we have to move some records around to keep the + * tree balanced. Look at the left and right sibling blocks to + * see if we can re-balance by moving only one record. 
+ */ + xfs_btree_get_sibling(cur, block, &rptr, XFS_BB_RIGHTSIB); + xfs_btree_get_sibling(cur, block, &lptr, XFS_BB_LEFTSIB); + + if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) { + /* + * One child of root, need to get a chance to copy its contents + * into the root and delete it. Can't go up to next level, + * there's nothing to delete there. + */ + if (xfs_btree_ptr_is_null(cur, &rptr) && + xfs_btree_ptr_is_null(cur, &lptr) && + level == cur->bc_nlevels - 2) { + error = xfs_btree_root_to_iroot(cur); + if (!error) + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + } + + ASSERT(!xfs_btree_ptr_is_null(cur, &rptr) || + !xfs_btree_ptr_is_null(cur, &lptr)); + + /* + * Duplicate the cursor so our btree manipulations here won't + * disrupt the next level up. + */ + error = xfs_btree_dup_cursor(cur, &tcur); + if (error) + goto error0; + + /* + * If there's a right sibling, see if it's ok to shift an entry + * out of it. + */ + if (!xfs_btree_ptr_is_null(cur, &rptr)) { + /* + * Move the temp cursor to the last entry in the next block. + * Actually any entry but the first would suffice. + */ + i = xfs_btree_lastrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + error = xfs_btree_increment(tcur, level, &i); + if (error) + goto error0; + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + i = xfs_btree_lastrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + /* Grab a pointer to the block. */ + right = xfs_btree_get_block(tcur, level, &rbp); +#ifdef DEBUG + error = xfs_btree_check_block(tcur, right, level, rbp); + if (error) + goto error0; +#endif + /* Grab the current block number, for future use. */ + xfs_btree_get_sibling(tcur, right, &cptr, XFS_BB_LEFTSIB); + + /* + * If right block is full enough so that removing one entry + * won't make it too empty, and left-shifting an entry out + * of right to us works, we're done. 
+ */ + if (xfs_btree_get_numrecs(right) - 1 >= + cur->bc_ops->get_minrecs(tcur, level)) { + error = xfs_btree_lshift(tcur, level, &i); + if (error) + goto error0; + if (i) { + ASSERT(xfs_btree_get_numrecs(block) >= + cur->bc_ops->get_minrecs(tcur, level)); + + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + tcur = NULL; + + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + } + + /* + * Otherwise, grab the number of records in right for + * future reference, and fix up the temp cursor to point + * to our block again (last record). + */ + rrecs = xfs_btree_get_numrecs(right); + if (!xfs_btree_ptr_is_null(cur, &lptr)) { + i = xfs_btree_firstrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + error = xfs_btree_decrement(tcur, level, &i); + if (error) + goto error0; + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + } + } + + /* + * If there's a left sibling, see if it's ok to shift an entry + * out of it. + */ + if (!xfs_btree_ptr_is_null(cur, &lptr)) { + /* + * Move the temp cursor to the first entry in the + * previous block. + */ + i = xfs_btree_firstrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + error = xfs_btree_decrement(tcur, level, &i); + if (error) + goto error0; + i = xfs_btree_firstrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + /* Grab a pointer to the block. */ + left = xfs_btree_get_block(tcur, level, &lbp); +#ifdef DEBUG + error = xfs_btree_check_block(cur, left, level, lbp); + if (error) + goto error0; +#endif + /* Grab the current block number, for future use. */ + xfs_btree_get_sibling(tcur, left, &cptr, XFS_BB_RIGHTSIB); + + /* + * If left block is full enough so that removing one entry + * won't make it too empty, and right-shifting an entry out + * of left to us works, we're done. 
+ */ + if (xfs_btree_get_numrecs(left) - 1 >= + cur->bc_ops->get_minrecs(tcur, level)) { + error = xfs_btree_rshift(tcur, level, &i); + if (error) + goto error0; + if (i) { + ASSERT(xfs_btree_get_numrecs(block) >= + cur->bc_ops->get_minrecs(tcur, level)); + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + tcur = NULL; + if (level == 0) + cur->bc_ptrs[0]++; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + } + } + + /* + * Otherwise, grab the number of records in left for + * future reference. + */ + lrecs = xfs_btree_get_numrecs(left); + } + + /* Delete the temp cursor, we're done with it. */ + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + tcur = NULL; + + /* If here, we need to do a join to keep the tree balanced. */ + ASSERT(!xfs_btree_ptr_is_null(cur, &cptr)); + + if (!xfs_btree_ptr_is_null(cur, &lptr) && + lrecs + xfs_btree_get_numrecs(block) <= + cur->bc_ops->get_maxrecs(cur, level)) { + /* + * Set "right" to be the starting block, + * "left" to be the left neighbor. + */ + rptr = cptr; + right = block; + rbp = bp; + error = xfs_btree_read_buf_block(cur, &lptr, level, + 0, &left, &lbp); + if (error) + goto error0; + + /* + * If that won't work, see if we can join with the right neighbor block. + */ + } else if (!xfs_btree_ptr_is_null(cur, &rptr) && + rrecs + xfs_btree_get_numrecs(block) <= + cur->bc_ops->get_maxrecs(cur, level)) { + /* + * Set "left" to be the starting block, + * "right" to be the right neighbor. + */ + lptr = cptr; + left = block; + lbp = bp; + error = xfs_btree_read_buf_block(cur, &rptr, level, + 0, &right, &rbp); + if (error) + goto error0; + + /* + * Otherwise, we can't fix the imbalance. + * Just return. This is probably a logic error, but it's not fatal.
+ */ + } else { + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + + rrecs = xfs_btree_get_numrecs(right); + lrecs = xfs_btree_get_numrecs(left); + + /* + * We're now going to join "left" and "right" by moving all the stuff + * in "right" to "left" and deleting "right". + */ + XFS_BTREE_STATS_ADD(cur, moves, rrecs); + if (level > 0) { + /* It's a non-leaf. Move keys and pointers. */ + union xfs_btree_key *lkp; /* left btree key */ + union xfs_btree_ptr *lpp; /* left address pointer */ + union xfs_btree_key *rkp; /* right btree key */ + union xfs_btree_ptr *rpp; /* right address pointer */ + + lkp = cur->bc_ops->key_addr(cur, lrecs + 1, left); + lpp = cur->bc_ops->ptr_addr(cur, lrecs + 1, left); + rkp = cur->bc_ops->key_addr(cur, 1, right); + rpp = cur->bc_ops->ptr_addr(cur, 1, right); +#ifdef DEBUG + for (i = 1; i < rrecs; i++) { + error = xfs_btree_check_ptr(cur, rpp, i, level); + if (error) + goto error0; + } +#endif + cur->bc_ops->copy_keys(cur, rkp, lkp, rrecs); + xfs_btree_copy_ptrs(cur, rpp, lpp, rrecs); + + cur->bc_ops->log_keys(cur, lbp, lrecs + 1, lrecs + rrecs); + xfs_btree_log_ptrs(cur, lbp, lrecs + 1, lrecs + rrecs); + } else { + /* It's a leaf. Move records. */ + union xfs_btree_rec *lrp; /* left record pointer */ + union xfs_btree_rec *rrp; /* right record pointer */ + + lrp = cur->bc_ops->rec_addr(cur, lrecs + 1, left); + rrp = cur->bc_ops->rec_addr(cur, 1, right); + + cur->bc_ops->copy_recs(cur, rrp, lrp, rrecs); + cur->bc_ops->log_recs(cur, lbp, lrecs + 1, lrecs + rrecs); + } + + XFS_BTREE_STATS_INC(cur, join); + + /* + * Fix up the number of records and right block pointer in the + * surviving block, and log it.
+ */ + xfs_btree_set_numrecs(left, lrecs + rrecs); + xfs_btree_get_sibling(cur, right, &cptr, XFS_BB_RIGHTSIB); + xfs_btree_set_sibling(cur, left, &cptr, XFS_BB_RIGHTSIB); + xfs_btree_log_block(cur, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); + + /* If there is a right sibling, point it to the remaining block. */ + xfs_btree_get_sibling(cur, left, &cptr, XFS_BB_RIGHTSIB); + if (!xfs_btree_ptr_is_null(cur, &cptr)) { + error = xfs_btree_read_buf_block(cur, &cptr, level, + 0, &rrblock, &rrbp); + if (error) + goto error0; + xfs_btree_set_sibling(cur, rrblock, &lptr, XFS_BB_LEFTSIB); + xfs_btree_log_block(cur, rrbp, XFS_BB_LEFTSIB); + } + /* Free the deleted block. */ + error = cur->bc_ops->free_block(cur, rbp, 1); + if (error) + goto error0; + XFS_BTREE_STATS_INC(cur, free); + + /* + * If we joined with the left neighbor, set the buffer in the + * cursor to the left block, and fix up the index. + */ + if (bp != lbp) { + cur->bc_bufs[level] = lbp; + cur->bc_ptrs[level] += lrecs; + cur->bc_ra[level] = 0; + } + /* + * If we joined with the right neighbor and there's a level above + * us, increment the cursor at that level. + */ + else if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) || + (level + 1 < cur->bc_nlevels)) { + error = xfs_btree_increment(cur, level + 1, &i); + if (error) + goto error0; + } + + /* + * Readjust the ptr at this level if it's not a leaf, since it's + * still pointing at the deletion point, which makes the cursor + * inconsistent. If this makes the ptr 0, the caller fixes it up. + * We can't use decrement because it would change the next level up. + */ + if (level > 0) + cur->bc_ptrs[level]--; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + /* Return value means the next level up has something to do. */ + *stat = 2; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + if (tcur) + xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); + return error; +} + +/* + * Delete the record pointed to by cur.
+ * The cursor refers to the place where the record was (could be inserted) + * when the operation returns. + */ +int /* error */ +xfs_btree_delete( + struct xfs_btree_cur *cur, + int *stat) /* success/failure */ +{ + int error; /* error return value */ + int level; + int i; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + + /* + * Go up the tree, starting at leaf level. + * + * If 2 is returned then a join was done; go to the next level. + * Otherwise we are done. + */ + for (level = 0, i = 2; i == 2; level++) { + error = xfs_btree_delrec(cur, level, &i); + if (error) + goto error0; + } + + if (i == 0) { + for (level = 1; level < cur->bc_nlevels; level++) { + if (cur->bc_ptrs[level] == 0) { + error = xfs_btree_decrement(cur, level, &i); + if (error) + goto error0; + break; + } + } + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = i; + return 0; +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.c 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.c 2008-08-04 01:18:45.000000000 +0200 @@ -40,638 +40,6 @@ #include "xfs_alloc.h" #include "xfs_error.h" -/* - * Prototypes for internal functions. - */ - -STATIC void xfs_alloc_log_block(xfs_trans_t *, xfs_buf_t *, int); -STATIC void xfs_allocbt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); -#define xfs_alloc_log_keys(c, b, i, j) \ - xfs_allocbt_log_keys(c, b, i, j) -STATIC void xfs_alloc_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_allocbt_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -#define xfs_alloc_log_recs(c, b, i, j) \ - xfs_allocbt_log_recs(c, b, i, j) - -/* - * Internal functions. - */ - -/* - * Single level of the xfs_alloc_delete record deletion routine. - * Delete record pointed to by cur/level. - * Remove the record from its block then rebalance the tree. 
- * Return 0 for error, 1 for done, 2 to go on to the next level. - */ -STATIC int /* error */ -xfs_alloc_delrec( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level removing record from */ - int *stat) /* fail/done/go-on */ -{ - xfs_agf_t *agf; /* allocation group freelist header */ - xfs_alloc_block_t *block; /* btree block record/key lives in */ - xfs_agblock_t bno; /* btree block number */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop index */ - xfs_alloc_key_t key; /* kp points here if block is level 0 */ - xfs_agblock_t lbno; /* left block's block number */ - xfs_buf_t *lbp; /* left block's buffer pointer */ - xfs_alloc_block_t *left; /* left btree block */ - xfs_alloc_key_t *lkp=NULL; /* left block key pointer */ - xfs_alloc_ptr_t *lpp=NULL; /* left block address pointer */ - int lrecs=0; /* number of records in left block */ - xfs_alloc_rec_t *lrp; /* left block record pointer */ - xfs_mount_t *mp; /* mount structure */ - int ptr; /* index in btree block for this rec */ - xfs_agblock_t rbno; /* right block's block number */ - xfs_buf_t *rbp; /* right block's buffer pointer */ - xfs_alloc_block_t *right; /* right btree block */ - xfs_alloc_key_t *rkp; /* right block key pointer */ - xfs_alloc_ptr_t *rpp; /* right block address pointer */ - int rrecs=0; /* number of records in right block */ - int numrecs; - xfs_alloc_rec_t *rrp; /* right block record pointer */ - xfs_btree_cur_t *tcur; /* temporary btree cursor */ - - /* - * Get the index of the entry being deleted, check for nothing there. - */ - ptr = cur->bc_ptrs[level]; - if (ptr == 0) { - *stat = 0; - return 0; - } - /* - * Get the buffer & block containing the record or key/ptr. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - /* - * Fail if we're off the end of the block. 
- */ - numrecs = be16_to_cpu(block->bb_numrecs); - if (ptr > numrecs) { - *stat = 0; - return 0; - } - XFS_STATS_INC(xs_abt_delrec); - /* - * It's a nonleaf. Excise the key and ptr being deleted, by - * sliding the entries past them down one. - * Log the changed areas of the block. - */ - if (level > 0) { - lkp = XFS_ALLOC_KEY_ADDR(block, 1, cur); - lpp = XFS_ALLOC_PTR_ADDR(block, 1, cur); -#ifdef DEBUG - for (i = ptr; i < numrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(lpp[i]), level))) - return error; - } -#endif - if (ptr < numrecs) { - memmove(&lkp[ptr - 1], &lkp[ptr], - (numrecs - ptr) * sizeof(*lkp)); - memmove(&lpp[ptr - 1], &lpp[ptr], - (numrecs - ptr) * sizeof(*lpp)); - xfs_alloc_log_ptrs(cur, bp, ptr, numrecs - 1); - xfs_alloc_log_keys(cur, bp, ptr, numrecs - 1); - } - } - /* - * It's a leaf. Excise the record being deleted, by sliding the - * entries past it down one. Log the changed areas of the block. - */ - else { - lrp = XFS_ALLOC_REC_ADDR(block, 1, cur); - if (ptr < numrecs) { - memmove(&lrp[ptr - 1], &lrp[ptr], - (numrecs - ptr) * sizeof(*lrp)); - xfs_alloc_log_recs(cur, bp, ptr, numrecs - 1); - } - /* - * If it's the first record in the block, we'll need a key - * structure to pass up to the next level (updkey). - */ - if (ptr == 1) { - key.ar_startblock = lrp->ar_startblock; - key.ar_blockcount = lrp->ar_blockcount; - lkp = &key; - } - } - /* - * Decrement and log the number of entries in the block. - */ - numrecs--; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_alloc_log_block(cur->bc_tp, bp, XFS_BB_NUMRECS); - /* - * See if the longest free extent in the allocation group was - * changed by this operation. True if it's the by-size btree, and - * this is the leaf level, and there is no right sibling block, - * and this was the last record. 
- */ - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - mp = cur->bc_mp; - - if (level == 0 && - cur->bc_btnum == XFS_BTNUM_CNT && - be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK && - ptr > numrecs) { - ASSERT(ptr == numrecs + 1); - /* - * There are still records in the block. Grab the size - * from the last one. - */ - if (numrecs) { - rrp = XFS_ALLOC_REC_ADDR(block, numrecs, cur); - agf->agf_longest = rrp->ar_blockcount; - } - /* - * No free extents left. - */ - else - agf->agf_longest = 0; - mp->m_perag[be32_to_cpu(agf->agf_seqno)].pagf_longest = - be32_to_cpu(agf->agf_longest); - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_LONGEST); - } - /* - * Is this the root level? If so, we're almost done. - */ - if (level == cur->bc_nlevels - 1) { - /* - * If this is the root level, - * and there's only one entry left, - * and it's NOT the leaf level, - * then we can get rid of this level. - */ - if (numrecs == 1 && level > 0) { - /* - * lpp is still set to the first pointer in the block. - * Make it the new root of the btree. - */ - bno = be32_to_cpu(agf->agf_roots[cur->bc_btnum]); - agf->agf_roots[cur->bc_btnum] = *lpp; - be32_add(&agf->agf_levels[cur->bc_btnum], -1); - mp->m_perag[be32_to_cpu(agf->agf_seqno)].pagf_levels[cur->bc_btnum]--; - /* - * Put this buffer/block on the ag's freelist. - */ - error = xfs_alloc_put_freelist(cur->bc_tp, - cur->bc_private.a.agbp, NULL, bno, 1); - if (error) - return error; - /* - * Since blocks move to the free list without the - * coordination used in xfs_bmap_finish, we can't allow - * block to be available for reallocation and - * non-transaction writing (user data) until we know - * that the transaction that moved it to the free list - * is permanently on disk. We track the blocks by - * declaring these blocks as "busy"; the busy list is - * maintained on a per-ag basis and each transaction - * records which entries should be removed when the - * iclog commits to disk. 
If a busy block is - * allocated, the iclog is pushed up to the LSN - * that freed the block. - */ - xfs_alloc_mark_busy(cur->bc_tp, - be32_to_cpu(agf->agf_seqno), bno, 1); - - xfs_trans_agbtree_delta(cur->bc_tp, -1); - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_ROOTS | XFS_AGF_LEVELS); - /* - * Update the cursor so there's one fewer level. - */ - xfs_btree_setbuf(cur, level, NULL); - cur->bc_nlevels--; - } else if (level > 0 && - (error = xfs_btree_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * If we deleted the leftmost entry in the block, update the - * key values above us in the tree. - */ - if (ptr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)lkp, level + 1))) - return error; - /* - * If the number of records remaining in the block is at least - * the minimum, we're done. - */ - if (numrecs >= XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * Otherwise, we have to move some records around to keep the - * tree balanced. Look at the left and right sibling blocks to - * see if we can re-balance by moving only one record. - */ - rbno = be32_to_cpu(block->bb_rightsib); - lbno = be32_to_cpu(block->bb_leftsib); - bno = NULLAGBLOCK; - ASSERT(rbno != NULLAGBLOCK || lbno != NULLAGBLOCK); - /* - * Duplicate the cursor so our btree manipulations here won't - * disrupt the next level up. - */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - /* - * If there's a right sibling, see if it's ok to shift an entry - * out of it. - */ - if (rbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the last entry in the next block. - * Actually any entry but the first would suffice. 
- */ - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_increment(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - /* - * Grab a pointer to the block. - */ - rbp = tcur->bc_bufs[level]; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. - */ - bno = be32_to_cpu(right->bb_leftsib); - /* - * If right block is full enough so that removing one entry - * won't make it too empty, and left-shifting an entry out - * of right to us works, we're done. - */ - if (be16_to_cpu(right->bb_numrecs) - 1 >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_btree_lshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level > 0 && - (error = xfs_btree_decrement(cur, level, - &i))) - return error; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference, and fix up the temp cursor to point - * to our block again (last record). - */ - rrecs = be16_to_cpu(right->bb_numrecs); - if (lbno != NULLAGBLOCK) { - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_decrement(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - } - } - /* - * If there's a left sibling, see if it's ok to shift an entry - * out of it. - */ - if (lbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the first entry in the - * previous block. 
- */ - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_decrement(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - xfs_btree_firstrec(tcur, level); - /* - * Grab a pointer to the block. - */ - lbp = tcur->bc_bufs[level]; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. - */ - bno = be32_to_cpu(left->bb_rightsib); - /* - * If left block is full enough so that removing one entry - * won't make it too empty, and right-shifting an entry out - * of left to us works, we're done. - */ - if (be16_to_cpu(left->bb_numrecs) - 1 >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_btree_rshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level == 0) - cur->bc_ptrs[0]++; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference. - */ - lrecs = be16_to_cpu(left->bb_numrecs); - } - /* - * Delete the temp cursor, we're done with it. - */ - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - /* - * If here, we need to do a join to keep the tree balanced. - */ - ASSERT(bno != NULLAGBLOCK); - /* - * See if we can join with the left neighbor block. - */ - if (lbno != NULLAGBLOCK && - lrecs + numrecs <= XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - /* - * Set "right" to be the starting block, - * "left" to be the left neighbor. 
- */ - rbno = bno; - right = block; - rrecs = be16_to_cpu(right->bb_numrecs); - rbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, lbno, 0, &lbp, - XFS_ALLOC_BTREE_REF))) - return error; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); - lrecs = be16_to_cpu(left->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - } - /* - * If that won't work, see if we can join with the right neighbor block. - */ - else if (rbno != NULLAGBLOCK && - rrecs + numrecs <= XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - /* - * Set "left" to be the starting block, - * "right" to be the right neighbor. - */ - lbno = bno; - left = block; - lrecs = be16_to_cpu(left->bb_numrecs); - lbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, rbno, 0, &rbp, - XFS_ALLOC_BTREE_REF))) - return error; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); - rrecs = be16_to_cpu(right->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - } - /* - * Otherwise, we can't fix the imbalance. - * Just return. This is probably a logic error, but it's not fatal. - */ - else { - if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * We're now going to join "left" and "right" by moving all the stuff - * in "right" to "left" and deleting "right". - */ - if (level > 0) { - /* - * It's a non-leaf. Move keys and pointers. 
- */ - lkp = XFS_ALLOC_KEY_ADDR(left, lrecs + 1, cur); - lpp = XFS_ALLOC_PTR_ADDR(left, lrecs + 1, cur); - rkp = XFS_ALLOC_KEY_ADDR(right, 1, cur); - rpp = XFS_ALLOC_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < rrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memcpy(lkp, rkp, rrecs * sizeof(*lkp)); - memcpy(lpp, rpp, rrecs * sizeof(*lpp)); - xfs_alloc_log_keys(cur, lbp, lrecs + 1, lrecs + rrecs); - xfs_alloc_log_ptrs(cur, lbp, lrecs + 1, lrecs + rrecs); - } else { - /* - * It's a leaf. Move records. - */ - lrp = XFS_ALLOC_REC_ADDR(left, lrecs + 1, cur); - rrp = XFS_ALLOC_REC_ADDR(right, 1, cur); - memcpy(lrp, rrp, rrecs * sizeof(*lrp)); - xfs_alloc_log_recs(cur, lbp, lrecs + 1, lrecs + rrecs); - } - /* - * If we joined with the left neighbor, set the buffer in the - * cursor to the left block, and fix up the index. - */ - if (bp != lbp) { - xfs_btree_setbuf(cur, level, lbp); - cur->bc_ptrs[level] += lrecs; - } - /* - * If we joined with the right neighbor and there's a level above - * us, increment the cursor at that level. - */ - else if (level + 1 < cur->bc_nlevels && - (error = xfs_btree_increment(cur, level + 1, &i))) - return error; - /* - * Fix up the number of records in the surviving block. - */ - lrecs += rrecs; - left->bb_numrecs = cpu_to_be16(lrecs); - /* - * Fix up the right block pointer in the surviving block, and log it. - */ - left->bb_rightsib = right->bb_rightsib; - xfs_alloc_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); - /* - * If there is a right sibling now, make it point to the - * remaining block. 
- */ - if (be32_to_cpu(left->bb_rightsib) != NULLAGBLOCK) { - xfs_alloc_block_t *rrblock; - xfs_buf_t *rrbp; - - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(left->bb_rightsib), 0, - &rrbp, XFS_ALLOC_BTREE_REF))) - return error; - rrblock = XFS_BUF_TO_ALLOC_BLOCK(rrbp); - if ((error = xfs_btree_check_sblock(cur, rrblock, level, rrbp))) - return error; - rrblock->bb_leftsib = cpu_to_be32(lbno); - xfs_alloc_log_block(cur->bc_tp, rrbp, XFS_BB_LEFTSIB); - } - /* - * Free the deleting block by putting it on the freelist. - */ - error = xfs_alloc_put_freelist(cur->bc_tp, - cur->bc_private.a.agbp, NULL, rbno, 1); - if (error) - return error; - /* - * Since blocks move to the free list without the coordination - * used in xfs_bmap_finish, we can't allow block to be available - * for reallocation and non-transaction writing (user data) - * until we know that the transaction that moved it to the free - * list is permanently on disk. We track the blocks by declaring - * these blocks as "busy"; the busy list is maintained on a - * per-ag basis and each transaction records which entries - * should be removed when the iclog commits to disk. If a - * busy block is allocated, the iclog is pushed up to the - * LSN that freed the block. - */ - xfs_alloc_mark_busy(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1); - xfs_trans_agbtree_delta(cur->bc_tp, -1); - - /* - * Adjust the current level's cursor so that we're left referring - * to the right node, after we're done. - * If this leaves the ptr value 0 our caller will fix it up. - */ - if (level > 0) - cur->bc_ptrs[level]--; - /* - * Return value means the next level up has something to do. - */ - *stat = 2; - return 0; - -error0: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; -} - -/* - * Log header fields from a btree block. 
- */ -STATIC void -xfs_alloc_log_block( - xfs_trans_t *tp, /* transaction pointer */ - xfs_buf_t *bp, /* buffer containing btree block */ - int fields) /* mask of fields: XFS_BB_... */ -{ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - static const short offsets[] = { /* table of offsets */ - offsetof(xfs_alloc_block_t, bb_magic), - offsetof(xfs_alloc_block_t, bb_level), - offsetof(xfs_alloc_block_t, bb_numrecs), - offsetof(xfs_alloc_block_t, bb_leftsib), - offsetof(xfs_alloc_block_t, bb_rightsib), - sizeof(xfs_alloc_block_t) - }; - - xfs_btree_offsets(fields, offsets, XFS_BB_NUM_BITS, &first, &last); - xfs_trans_log_buf(tp, bp, first, last); -} - -/* - * Log block pointer fields from a btree block (nonleaf). - */ -STATIC void -xfs_alloc_log_ptrs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int pfirst, /* index of first pointer to log */ - int plast) /* index of last pointer to log */ -{ - xfs_alloc_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_alloc_ptr_t *pp; /* block-pointer pointer in btree blk */ - - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - pp = XFS_ALLOC_PTR_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); -} - - -/* - * Externally visible routines. - */ - -/* - * Delete the record pointed to by cur. - * The cursor refers to the place where the record was (could be inserted) - * when the operation returns. - */ -int /* error */ -xfs_alloc_delete( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ -{ - int error; /* error return value */ - int i; /* result code */ - int level; /* btree level */ - - /* - * Go up the tree, starting at leaf level. 
- * If 2 is returned then a join was done; go to the next level. - * Otherwise we are done. - */ - for (level = 0, i = 2; i == 2; level++) { - if ((error = xfs_alloc_delrec(cur, level, &i))) - return error; - } - if (i == 0) { - for (level = 1; level < cur->bc_nlevels; level++) { - if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_btree_decrement(cur, level, &i))) - return error; - break; - } - } - } - *stat = i; - return 0; -} /* * Get the data from the pointed-to record. @@ -828,6 +196,7 @@ xfs_allocbt_update_lastrec( struct xfs_agf *agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); xfs_agnumber_t seqno = be32_to_cpu(agf->agf_seqno); __be32 len; + int numrecs; ASSERT(cur->bc_btnum == XFS_BTNUM_CNT); @@ -847,6 +216,22 @@ xfs_allocbt_update_lastrec( return; len = rec->alloc.ar_blockcount; break; + case LASTREC_DELREC: + numrecs = xfs_btree_get_numrecs(block); + if (ptr <= numrecs) + return; + ASSERT(ptr == numrecs + 1); + + if (numrecs) { + xfs_alloc_rec_t *rrp; + + rrp = XFS_ALLOC_REC_ADDR(block, numrecs, cur); + len = rrp->ar_blockcount; + } else { + len = 0; + } + + break; default: ASSERT(0); return; @@ -858,6 +243,14 @@ xfs_allocbt_update_lastrec( } STATIC int +xfs_allocbt_get_minrecs( + struct xfs_btree_cur *cur, + int level) +{ + return cur->bc_mp->m_alloc_mnr[level != 0]; +} + +STATIC int xfs_allocbt_get_maxrecs( struct xfs_btree_cur *cur, int level) @@ -1091,7 +484,7 @@ xfs_alloc_log_recs_check( * Log records from a btree block (leaf). 
*/ STATIC void -xfs_alloc_log_recs( +xfs_allocbt_log_recs( struct xfs_btree_cur *cur, /* btree cursor */ struct xfs_buf *bp, /* buffer containing btree block */ int first, /* index of first record to log */ @@ -1188,6 +581,7 @@ static const struct xfs_btree_ops xfs_al .alloc_block = xfs_allocbt_alloc_block, .free_block = xfs_allocbt_free_block, .update_lastrec = xfs_allocbt_update_lastrec, + .get_minrecs = xfs_allocbt_get_minrecs, .get_maxrecs = xfs_allocbt_get_maxrecs, .init_key_from_rec = xfs_allocbt_init_key_from_rec, .init_ptr_from_cur = xfs_allocbt_init_ptr_from_cur, Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.c 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.c 2008-08-04 01:19:05.000000000 +0200 @@ -44,14 +44,6 @@ #include "xfs_error.h" #include "xfs_quota.h" -/* - * Prototypes for internal btree functions. - */ - - -STATIC void xfs_bmbt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_bmbt_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); - #undef EXIT #define ENTRY XBT_ENTRY @@ -82,416 +74,6 @@ STATIC void xfs_bmbt_log_ptrs(xfs_btree_ /* - * Internal functions. - */ - -/* - * Delete record pointed to by cur/level. 
- */ -STATIC int /* error */ -xfs_bmbt_delrec( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ -{ - xfs_bmbt_block_t *block; /* bmap btree block */ - xfs_fsblock_t bno; /* fs-relative block number */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop counter */ - int j; /* temp state */ - xfs_bmbt_key_t key; /* bmap btree key */ - xfs_bmbt_key_t *kp=NULL; /* pointer to bmap btree key */ - xfs_fsblock_t lbno; /* left sibling block number */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_bmbt_block_t *left; /* left btree block */ - xfs_bmbt_key_t *lkp; /* left btree key */ - xfs_bmbt_ptr_t *lpp; /* left address pointer */ - int lrecs=0; /* left record count */ - xfs_bmbt_rec_t *lrp; /* left record pointer */ - xfs_mount_t *mp; /* file system mount point */ - xfs_bmbt_ptr_t *pp; /* pointer to bmap block addr */ - int ptr; /* key/record index */ - xfs_fsblock_t rbno; /* right sibling block number */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_bmbt_block_t *right; /* right btree block */ - xfs_bmbt_key_t *rkp; /* right btree key */ - xfs_bmbt_rec_t *rp; /* pointer to bmap btree rec */ - xfs_bmbt_ptr_t *rpp; /* right address pointer */ - xfs_bmbt_block_t *rrblock; /* right-right btree block */ - xfs_buf_t *rrbp; /* right-right buffer pointer */ - int rrecs=0; /* right record count */ - xfs_bmbt_rec_t *rrp; /* right record pointer */ - xfs_btree_cur_t *tcur; /* temporary btree cursor */ - int numrecs; /* temporary numrec count */ - int numlrecs, numrrecs; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - ptr = cur->bc_ptrs[level]; - tcur = NULL; - if (ptr == 0) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - block = xfs_bmbt_get_block(cur, level, &bp); - numrecs = be16_to_cpu(block->bb_numrecs); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } -#endif - if (ptr > 
numrecs) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - XFS_STATS_INC(xs_bmbt_delrec); - if (level > 0) { - kp = XFS_BMAP_KEY_IADDR(block, 1, cur); - pp = XFS_BMAP_PTR_IADDR(block, 1, cur); -#ifdef DEBUG - for (i = ptr; i < numrecs; i++) { - if ((error = xfs_btree_check_lptr_disk(cur, pp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - } -#endif - if (ptr < numrecs) { - memmove(&kp[ptr - 1], &kp[ptr], - (numrecs - ptr) * sizeof(*kp)); - memmove(&pp[ptr - 1], &pp[ptr], - (numrecs - ptr) * sizeof(*pp)); - xfs_bmbt_log_ptrs(cur, bp, ptr, numrecs - 1); - xfs_bmbt_log_keys(cur, bp, ptr, numrecs - 1); - } - } else { - rp = XFS_BMAP_REC_IADDR(block, 1, cur); - if (ptr < numrecs) { - memmove(&rp[ptr - 1], &rp[ptr], - (numrecs - ptr) * sizeof(*rp)); - xfs_bmbt_log_recs(cur, bp, ptr, numrecs - 1); - } - if (ptr == 1) { - key.br_startoff = - cpu_to_be64(xfs_bmbt_disk_get_startoff(rp)); - kp = &key; - } - } - numrecs--; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_bmbt_log_block(cur, bp, XFS_BB_NUMRECS); - /* - * We're at the root level. - * First, shrink the root block in-memory. - * Try to get rid of the next level down. - * If we can't then there's nothing left to do. 
- */ - if (level == cur->bc_nlevels - 1) { - xfs_iroot_realloc(cur->bc_private.b.ip, -1, - cur->bc_private.b.whichfork); - if ((error = xfs_btree_root_to_iroot(cur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (level > 0 && (error = xfs_btree_decrement(cur, level, &j))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - if (ptr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)kp, level + 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (numrecs >= XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if (level > 0 && (error = xfs_btree_decrement(cur, level, &j))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - rbno = be64_to_cpu(block->bb_rightsib); - lbno = be64_to_cpu(block->bb_leftsib); - /* - * One child of root, need to get a chance to copy its contents - * into the root and delete it. Can't go up to next level, - * there's nothing to delete there. 
- */ - if (lbno == NULLFSBLOCK && rbno == NULLFSBLOCK && - level == cur->bc_nlevels - 2) { - if ((error = xfs_btree_root_to_iroot(cur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - ASSERT(rbno != NULLFSBLOCK || lbno != NULLFSBLOCK); - if ((error = xfs_btree_dup_cursor(cur, &tcur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - bno = NULLFSBLOCK; - if (rbno != NULLFSBLOCK) { - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_increment(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - rbp = tcur->bc_bufs[level]; - right = XFS_BUF_TO_BMBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } -#endif - bno = be64_to_cpu(right->bb_leftsib); - if (be16_to_cpu(right->bb_numrecs) - 1 >= - XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if ((error = xfs_btree_lshift(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_BMAP_BLOCK_IMINRECS(level, tcur)); - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - tcur = NULL; - if (level > 0) { - if ((error = xfs_btree_decrement(cur, - level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, - ERROR); - goto error0; - } - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - } - rrecs = be16_to_cpu(right->bb_numrecs); - if (lbno != NULLFSBLOCK) { - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_decrement(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - 
XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - } - } - if (lbno != NULLFSBLOCK) { - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - /* - * decrement to last in block - */ - if ((error = xfs_btree_decrement(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - lbp = tcur->bc_bufs[level]; - left = XFS_BUF_TO_BMBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } -#endif - bno = be64_to_cpu(left->bb_rightsib); - if (be16_to_cpu(left->bb_numrecs) - 1 >= - XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if ((error = xfs_btree_rshift(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_BMAP_BLOCK_IMINRECS(level, tcur)); - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - tcur = NULL; - if (level == 0) - cur->bc_ptrs[0]++; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - } - lrecs = be16_to_cpu(left->bb_numrecs); - } - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - tcur = NULL; - mp = cur->bc_mp; - ASSERT(bno != NULLFSBLOCK); - if (lbno != NULLFSBLOCK && - lrecs + be16_to_cpu(block->bb_numrecs) <= XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - rbno = bno; - right = block; - rbp = bp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, lbno, 0, &lbp, - XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - left = XFS_BUF_TO_BMBT_BLOCK(lbp); - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - } else if (rbno != NULLFSBLOCK && - rrecs + be16_to_cpu(block->bb_numrecs) <= - XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - lbno = bno; - left = block; - lbp = bp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, rbno, 0, &rbp, - XFS_BMAP_BTREE_REF))) { - 
XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - right = XFS_BUF_TO_BMBT_BLOCK(rbp); - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - lrecs = be16_to_cpu(left->bb_numrecs); - } else { - if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - numlrecs = be16_to_cpu(left->bb_numrecs); - numrrecs = be16_to_cpu(right->bb_numrecs); - if (level > 0) { - lkp = XFS_BMAP_KEY_IADDR(left, numlrecs + 1, cur); - lpp = XFS_BMAP_PTR_IADDR(left, numlrecs + 1, cur); - rkp = XFS_BMAP_KEY_IADDR(right, 1, cur); - rpp = XFS_BMAP_PTR_IADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < numrrecs; i++) { - if ((error = xfs_btree_check_lptr_disk(cur, rpp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - } -#endif - memcpy(lkp, rkp, numrrecs * sizeof(*lkp)); - memcpy(lpp, rpp, numrrecs * sizeof(*lpp)); - xfs_bmbt_log_keys(cur, lbp, numlrecs + 1, numlrecs + numrrecs); - xfs_bmbt_log_ptrs(cur, lbp, numlrecs + 1, numlrecs + numrrecs); - } else { - lrp = XFS_BMAP_REC_IADDR(left, numlrecs + 1, cur); - rrp = XFS_BMAP_REC_IADDR(right, 1, cur); - memcpy(lrp, rrp, numrrecs * sizeof(*lrp)); - xfs_bmbt_log_recs(cur, lbp, numlrecs + 1, numlrecs + numrrecs); - } - be16_add(&left->bb_numrecs, numrrecs); - left->bb_rightsib = right->bb_rightsib; - xfs_bmbt_log_block(cur, lbp, XFS_BB_RIGHTSIB | XFS_BB_NUMRECS); - if (be64_to_cpu(left->bb_rightsib) != NULLDFSBNO) { - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, - be64_to_cpu(left->bb_rightsib), - 0, &rrbp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - rrblock = XFS_BUF_TO_BMBT_BLOCK(rrbp); - if ((error = xfs_btree_check_lblock(cur, rrblock, level, rrbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - rrblock->bb_leftsib = cpu_to_be64(lbno); - xfs_bmbt_log_block(cur, 
rrbp, XFS_BB_LEFTSIB); - } - xfs_bmap_add_free(XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(rbp)), 1, - cur->bc_private.b.flist, mp); - cur->bc_private.b.ip->i_d.di_nblocks--; - xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, XFS_ILOG_CORE); - XFS_TRANS_MOD_DQUOT_BYINO(mp, cur->bc_tp, cur->bc_private.b.ip, - XFS_TRANS_DQ_BCOUNT, -1L); - xfs_trans_binval(cur->bc_tp, rbp); - if (bp != lbp) { - cur->bc_bufs[level] = lbp; - cur->bc_ptrs[level] += lrecs; - cur->bc_ra[level] = 0; - } else if ((error = xfs_btree_increment(cur, level + 1, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (level > 0) - cur->bc_ptrs[level]--; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 2; - return 0; - -error0: - if (tcur) - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; -} - -/* - * Log pointer values from the btree block. - */ -STATIC void -xfs_bmbt_log_ptrs( - xfs_btree_cur_t *cur, - xfs_buf_t *bp, - int pfirst, - int plast) -{ - xfs_trans_t *tp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGBII(cur, bp, pfirst, plast); - tp = cur->bc_tp; - if (bp) { - xfs_bmbt_block_t *block; - int first; - int last; - xfs_bmbt_ptr_t *pp; - - block = XFS_BUF_TO_BMBT_BLOCK(bp); - pp = XFS_BMAP_PTR_DADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(tp, bp, first, last); - } else { - xfs_inode_t *ip; - - ip = cur->bc_private.b.ip; - xfs_trans_log_inode(tp, ip, - XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); -} - -/* * Determine the extent state. */ /* ARGSUSED */ @@ -540,42 +122,6 @@ xfs_bmdr_to_bmbt( } /* - * Delete the record pointed to by cur. 
- */ -int /* error */ -xfs_bmbt_delete( - xfs_btree_cur_t *cur, - int *stat) /* success/failure */ -{ - int error; /* error return value */ - int i; - int level; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - for (level = 0, i = 2; i == 2; level++) { - if ((error = xfs_bmbt_delrec(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } - if (i == 0) { - for (level = 1; level < cur->bc_nlevels; level++) { - if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_btree_decrement(cur, level, - &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - break; - } - } - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = i; - return 0; -} - -/* * Convert a compressed bmap extent record to an uncompressed form. * This code must be in sync with the routines xfs_bmbt_get_startoff, * xfs_bmbt_get_startblock, xfs_bmbt_get_blockcount and xfs_bmbt_get_state. @@ -629,31 +175,6 @@ xfs_bmbt_get_all( } /* - * Get the block pointer for the given level of the cursor. - * Fill in the buffer pointer, if applicable. - */ -xfs_bmbt_block_t * -xfs_bmbt_get_block( - xfs_btree_cur_t *cur, - int level, - xfs_buf_t **bpp) -{ - xfs_ifork_t *ifp; - xfs_bmbt_block_t *rval; - - if (level < cur->bc_nlevels - 1) { - *bpp = cur->bc_bufs[level]; - rval = XFS_BUF_TO_BMBT_BLOCK(*bpp); - } else { - *bpp = NULL; - ifp = XFS_IFORK_PTR(cur->bc_private.b.ip, - cur->bc_private.b.whichfork); - rval = ifp->if_broot; - } - return rval; -} - -/* * Extract the blockcount field from an in memory bmap extent record. 
*/ xfs_filblks_t @@ -1180,6 +701,14 @@ xfs_bmbt_realloc_root( } STATIC int +xfs_bmbt_get_minrecs( + struct xfs_btree_cur *cur, + int level) +{ + return XFS_BMAP_BLOCK_IMINRECS(level, cur); +} + +STATIC int xfs_bmbt_get_maxrecs( struct xfs_btree_cur *cur, int level) @@ -1502,6 +1031,7 @@ static const struct xfs_btree_ops xfs_bm .init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur, .init_rec_from_key = xfs_bmbt_init_rec_from_key, .init_rec_from_cur = xfs_bmbt_init_rec_from_cur, + .get_minrecs = xfs_bmbt_get_minrecs, .get_maxrecs = xfs_bmbt_get_maxrecs, .get_dmaxrecs = xfs_bmbt_get_dmaxrecs, .ptr_addr = xfs_bmbt_ptr_addr, Index: linux-2.6-xfs/fs/xfs/xfs_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.h 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_btree.h 2008-08-04 01:18:45.000000000 +0200 @@ -216,6 +216,7 @@ struct xfs_btree_ops { struct xfs_btree_block *block); /* records in block/level */ + int (*get_minrecs)(struct xfs_btree_cur *cur, int level); int (*get_maxrecs)(struct xfs_btree_cur *cur, int level); int (*get_dmaxrecs)(struct xfs_btree_cur *cur, int level); @@ -289,6 +290,7 @@ struct xfs_btree_ops { */ #define LASTREC_UPDATE 0 #define LASTREC_INSREC 1 +#define LASTREC_DELREC 2 /* @@ -600,6 +602,7 @@ int xfs_btree_new_root(struct xfs_btree_ int xfs_btree_iroot_to_root(struct xfs_btree_cur *, int *, int *); int xfs_btree_root_to_iroot(struct xfs_btree_cur *); int xfs_btree_insert(struct xfs_btree_cur *, int *); +int xfs_btree_delete(struct xfs_btree_cur *, int *); /* Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.c 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.c 2008-08-04 01:18:45.000000000 +0200 @@ -40,567 +40,6 @@ #include "xfs_alloc.h" #include "xfs_error.h" -STATIC void xfs_inobt_log_block(xfs_trans_t *, xfs_buf_t *, int); -STATIC 
void xfs_inobt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_inobt_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_inobt_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); - -/* - * Single level of the xfs_inobt_delete record deletion routine. - * Delete record pointed to by cur/level. - * Remove the record from its block then rebalance the tree. - * Return 0 for error, 1 for done, 2 to go on to the next level. - */ -STATIC int /* error */ -xfs_inobt_delrec( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level removing record from */ - int *stat) /* fail/done/go-on */ -{ - xfs_buf_t *agbp; /* buffer for a.g. inode header */ - xfs_mount_t *mp; /* mount structure */ - xfs_agi_t *agi; /* allocation group inode header */ - xfs_inobt_block_t *block; /* btree block record/key lives in */ - xfs_agblock_t bno; /* btree block number */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop index */ - xfs_inobt_key_t key; /* kp points here if block is level 0 */ - xfs_inobt_key_t *kp = NULL; /* pointer to btree keys */ - xfs_agblock_t lbno; /* left block's block number */ - xfs_buf_t *lbp; /* left block's buffer pointer */ - xfs_inobt_block_t *left; /* left btree block */ - xfs_inobt_key_t *lkp; /* left block key pointer */ - xfs_inobt_ptr_t *lpp; /* left block address pointer */ - int lrecs = 0; /* number of records in left block */ - xfs_inobt_rec_t *lrp; /* left block record pointer */ - xfs_inobt_ptr_t *pp = NULL; /* pointer to btree addresses */ - int ptr; /* index in btree block for this rec */ - xfs_agblock_t rbno; /* right block's block number */ - xfs_buf_t *rbp; /* right block's buffer pointer */ - xfs_inobt_block_t *right; /* right btree block */ - xfs_inobt_key_t *rkp; /* right block key pointer */ - xfs_inobt_rec_t *rp; /* pointer to btree records */ - xfs_inobt_ptr_t *rpp; /* right block address pointer */ - int rrecs = 0; /* number of records in right block */ - 
int numrecs; - xfs_inobt_rec_t *rrp; /* right block record pointer */ - xfs_btree_cur_t *tcur; /* temporary btree cursor */ - - mp = cur->bc_mp; - - /* - * Get the index of the entry being deleted, check for nothing there. - */ - ptr = cur->bc_ptrs[level]; - if (ptr == 0) { - *stat = 0; - return 0; - } - - /* - * Get the buffer & block containing the record or key/ptr. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - /* - * Fail if we're off the end of the block. - */ - - numrecs = be16_to_cpu(block->bb_numrecs); - if (ptr > numrecs) { - *stat = 0; - return 0; - } - /* - * It's a nonleaf. Excise the key and ptr being deleted, by - * sliding the entries past them down one. - * Log the changed areas of the block. - */ - if (level > 0) { - kp = XFS_INOBT_KEY_ADDR(block, 1, cur); - pp = XFS_INOBT_PTR_ADDR(block, 1, cur); -#ifdef DEBUG - for (i = ptr; i < numrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(pp[i]), level))) - return error; - } -#endif - if (ptr < numrecs) { - memmove(&kp[ptr - 1], &kp[ptr], - (numrecs - ptr) * sizeof(*kp)); - memmove(&pp[ptr - 1], &pp[ptr], - (numrecs - ptr) * sizeof(*kp)); - xfs_inobt_log_keys(cur, bp, ptr, numrecs - 1); - xfs_inobt_log_ptrs(cur, bp, ptr, numrecs - 1); - } - } - /* - * It's a leaf. Excise the record being deleted, by sliding the - * entries past it down one. Log the changed areas of the block. - */ - else { - rp = XFS_INOBT_REC_ADDR(block, 1, cur); - if (ptr < numrecs) { - memmove(&rp[ptr - 1], &rp[ptr], - (numrecs - ptr) * sizeof(*rp)); - xfs_inobt_log_recs(cur, bp, ptr, numrecs - 1); - } - /* - * If it's the first record in the block, we'll need a key - * structure to pass up to the next level (updkey). - */ - if (ptr == 1) { - key.ir_startino = rp->ir_startino; - kp = &key; - } - } - /* - * Decrement and log the number of entries in the block. 
- */ - numrecs--; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_inobt_log_block(cur->bc_tp, bp, XFS_BB_NUMRECS); - /* - * Is this the root level? If so, we're almost done. - */ - if (level == cur->bc_nlevels - 1) { - /* - * If this is the root level, - * and there's only one entry left, - * and it's NOT the leaf level, - * then we can get rid of this level. - */ - if (numrecs == 1 && level > 0) { - agbp = cur->bc_private.a.agbp; - agi = XFS_BUF_TO_AGI(agbp); - /* - * pp is still set to the first pointer in the block. - * Make it the new root of the btree. - */ - bno = be32_to_cpu(agi->agi_root); - agi->agi_root = *pp; - be32_add(&agi->agi_level, -1); - /* - * Free the block. - */ - if ((error = xfs_free_extent(cur->bc_tp, - XFS_AGB_TO_FSB(mp, cur->bc_private.a.agno, bno), 1))) - return error; - xfs_trans_binval(cur->bc_tp, bp); - xfs_ialloc_log_agi(cur->bc_tp, agbp, - XFS_AGI_ROOT | XFS_AGI_LEVEL); - /* - * Update the cursor so there's one fewer level. - */ - cur->bc_bufs[level] = NULL; - cur->bc_nlevels--; - } else if (level > 0 && - (error = xfs_btree_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * If we deleted the leftmost entry in the block, update the - * key values above us in the tree. - */ - if (ptr == 1 && (error = xfs_btree_updkey(cur, (union xfs_btree_key *)kp, level + 1))) - return error; - /* - * If the number of records remaining in the block is at least - * the minimum, we're done. - */ - if (numrecs >= XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if (level > 0 && - (error = xfs_btree_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * Otherwise, we have to move some records around to keep the - * tree balanced. Look at the left and right sibling blocks to - * see if we can re-balance by moving only one record. 
- */ - rbno = be32_to_cpu(block->bb_rightsib); - lbno = be32_to_cpu(block->bb_leftsib); - bno = NULLAGBLOCK; - ASSERT(rbno != NULLAGBLOCK || lbno != NULLAGBLOCK); - /* - * Duplicate the cursor so our btree manipulations here won't - * disrupt the next level up. - */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - /* - * If there's a right sibling, see if it's ok to shift an entry - * out of it. - */ - if (rbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the last entry in the next block. - * Actually any entry but the first would suffice. - */ - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_btree_increment(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - /* - * Grab a pointer to the block. - */ - rbp = tcur->bc_bufs[level]; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. - */ - bno = be32_to_cpu(right->bb_leftsib); - /* - * If right block is full enough so that removing one entry - * won't make it too empty, and left-shifting an entry out - * of right to us works, we're done. - */ - if (be16_to_cpu(right->bb_numrecs) - 1 >= - XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_btree_lshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_INOBT_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level > 0 && - (error = xfs_btree_decrement(cur, level, - &i))) - return error; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference, and fix up the temp cursor to point - * to our block again (last record). 
- */ - rrecs = be16_to_cpu(right->bb_numrecs); - if (lbno != NULLAGBLOCK) { - xfs_btree_firstrec(tcur, level); - if ((error = xfs_btree_decrement(tcur, level, &i))) - goto error0; - } - } - /* - * If there's a left sibling, see if it's ok to shift an entry - * out of it. - */ - if (lbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the first entry in the - * previous block. - */ - xfs_btree_firstrec(tcur, level); - if ((error = xfs_btree_decrement(tcur, level, &i))) - goto error0; - xfs_btree_firstrec(tcur, level); - /* - * Grab a pointer to the block. - */ - lbp = tcur->bc_bufs[level]; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. - */ - bno = be32_to_cpu(left->bb_rightsib); - /* - * If left block is full enough so that removing one entry - * won't make it too empty, and right-shifting an entry out - * of left to us works, we're done. - */ - if (be16_to_cpu(left->bb_numrecs) - 1 >= - XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_btree_rshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_INOBT_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level == 0) - cur->bc_ptrs[0]++; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference. - */ - lrecs = be16_to_cpu(left->bb_numrecs); - } - /* - * Delete the temp cursor, we're done with it. - */ - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - /* - * If here, we need to do a join to keep the tree balanced. - */ - ASSERT(bno != NULLAGBLOCK); - /* - * See if we can join with the left neighbor block. - */ - if (lbno != NULLAGBLOCK && - lrecs + numrecs <= XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - /* - * Set "right" to be the starting block, - * "left" to be the left neighbor. 
- */ - rbno = bno; - right = block; - rrecs = be16_to_cpu(right->bb_numrecs); - rbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, lbno, 0, &lbp, - XFS_INO_BTREE_REF))) - return error; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); - lrecs = be16_to_cpu(left->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - } - /* - * If that won't work, see if we can join with the right neighbor block. - */ - else if (rbno != NULLAGBLOCK && - rrecs + numrecs <= XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - /* - * Set "left" to be the starting block, - * "right" to be the right neighbor. - */ - lbno = bno; - left = block; - lrecs = be16_to_cpu(left->bb_numrecs); - lbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, rbno, 0, &rbp, - XFS_INO_BTREE_REF))) - return error; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); - rrecs = be16_to_cpu(right->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - } - /* - * Otherwise, we can't fix the imbalance. - * Just return. This is probably a logic error, but it's not fatal. - */ - else { - if (level > 0 && (error = xfs_btree_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * We're now going to join "left" and "right" by moving all the stuff - * in "right" to "left" and deleting "right". - */ - if (level > 0) { - /* - * It's a non-leaf. Move keys and pointers. 
- */ - lkp = XFS_INOBT_KEY_ADDR(left, lrecs + 1, cur); - lpp = XFS_INOBT_PTR_ADDR(left, lrecs + 1, cur); - rkp = XFS_INOBT_KEY_ADDR(right, 1, cur); - rpp = XFS_INOBT_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < rrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memcpy(lkp, rkp, rrecs * sizeof(*lkp)); - memcpy(lpp, rpp, rrecs * sizeof(*lpp)); - xfs_inobt_log_keys(cur, lbp, lrecs + 1, lrecs + rrecs); - xfs_inobt_log_ptrs(cur, lbp, lrecs + 1, lrecs + rrecs); - } else { - /* - * It's a leaf. Move records. - */ - lrp = XFS_INOBT_REC_ADDR(left, lrecs + 1, cur); - rrp = XFS_INOBT_REC_ADDR(right, 1, cur); - memcpy(lrp, rrp, rrecs * sizeof(*lrp)); - xfs_inobt_log_recs(cur, lbp, lrecs + 1, lrecs + rrecs); - } - /* - * If we joined with the left neighbor, set the buffer in the - * cursor to the left block, and fix up the index. - */ - if (bp != lbp) { - xfs_btree_setbuf(cur, level, lbp); - cur->bc_ptrs[level] += lrecs; - } - /* - * If we joined with the right neighbor and there's a level above - * us, increment the cursor at that level. - */ - else if (level + 1 < cur->bc_nlevels && - (error = xfs_btree_increment(cur, level + 1, &i))) - return error; - /* - * Fix up the number of records in the surviving block. - */ - lrecs += rrecs; - left->bb_numrecs = cpu_to_be16(lrecs); - /* - * Fix up the right block pointer in the surviving block, and log it. - */ - left->bb_rightsib = right->bb_rightsib; - xfs_inobt_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); - /* - * If there is a right sibling now, make it point to the - * remaining block. 
- */ - if (be32_to_cpu(left->bb_rightsib) != NULLAGBLOCK) { - xfs_inobt_block_t *rrblock; - xfs_buf_t *rrbp; - - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(left->bb_rightsib), 0, - &rrbp, XFS_INO_BTREE_REF))) - return error; - rrblock = XFS_BUF_TO_INOBT_BLOCK(rrbp); - if ((error = xfs_btree_check_sblock(cur, rrblock, level, rrbp))) - return error; - rrblock->bb_leftsib = cpu_to_be32(lbno); - xfs_inobt_log_block(cur->bc_tp, rrbp, XFS_BB_LEFTSIB); - } - /* - * Free the deleting block. - */ - if ((error = xfs_free_extent(cur->bc_tp, XFS_AGB_TO_FSB(mp, - cur->bc_private.a.agno, rbno), 1))) - return error; - xfs_trans_binval(cur->bc_tp, rbp); - /* - * Readjust the ptr at this level if it's not a leaf, since it's - * still pointing at the deletion point, which makes the cursor - * inconsistent. If this makes the ptr 0, the caller fixes it up. - * We can't use decrement because it would change the next level up. - */ - if (level > 0) - cur->bc_ptrs[level]--; - /* - * Return value means the next level up has something to do. - */ - *stat = 2; - return 0; - -error0: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; -} - -/* - * Log header fields from a btree block. - */ -STATIC void -xfs_inobt_log_block( - xfs_trans_t *tp, /* transaction pointer */ - xfs_buf_t *bp, /* buffer containing btree block */ - int fields) /* mask of fields: XFS_BB_... */ -{ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - static const short offsets[] = { /* table of offsets */ - offsetof(xfs_inobt_block_t, bb_magic), - offsetof(xfs_inobt_block_t, bb_level), - offsetof(xfs_inobt_block_t, bb_numrecs), - offsetof(xfs_inobt_block_t, bb_leftsib), - offsetof(xfs_inobt_block_t, bb_rightsib), - sizeof(xfs_inobt_block_t) - }; - - xfs_btree_offsets(fields, offsets, XFS_BB_NUM_BITS, &first, &last); - xfs_trans_log_buf(tp, bp, first, last); -} - -/* - * Log block pointer fields from a btree block (nonleaf). 
- */ -STATIC void -xfs_inobt_log_ptrs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int pfirst, /* index of first pointer to log */ - int plast) /* index of last pointer to log */ -{ - xfs_inobt_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_inobt_ptr_t *pp; /* block-pointer pointer in btree blk */ - - block = XFS_BUF_TO_INOBT_BLOCK(bp); - pp = XFS_INOBT_PTR_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); -} - - -/* - * Externally visible routines. - */ - -/* - * Delete the record pointed to by cur. - * The cursor refers to the place where the record was (could be inserted) - * when the operation returns. - */ -int /* error */ -xfs_inobt_delete( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ -{ - int error; - int i; /* result code */ - int level; /* btree level */ - - /* - * Go up the tree, starting at leaf level. - * If 2 is returned then a join was done; go to the next level. - * Otherwise we are done. - */ - for (level = 0, i = 2; i == 2; level++) { - if ((error = xfs_inobt_delrec(cur, level, &i))) - return error; - } - if (i == 0) { - for (level = 1; level < cur->bc_nlevels; level++) { - if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_btree_decrement(cur, level, &i))) - return error; - break; - } - } - } - *stat = i; - return 0; -} - /* * Get the data from the pointed-to record. 
@@ -945,6 +384,13 @@ xfs_inobt_log_recs( XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } +STATIC int +xfs_inobt_get_minrecs( + struct xfs_btree_cur *cur, + int level) +{ + return cur->bc_mp->m_inobt_mnr[level != 0]; +} #ifdef XFS_BTREE_TRACE @@ -1018,6 +464,7 @@ static const struct xfs_btree_ops xfs_in .set_root = xfs_inobt_set_root, .alloc_block = xfs_inobt_alloc_block, .free_block = xfs_inobt_free_block, + .get_minrecs = xfs_inobt_get_minrecs, .get_maxrecs = xfs_inobt_get_maxrecs, .init_key_from_rec = xfs_inobt_init_key_from_rec, .init_ptr_from_cur = xfs_inobt_init_ptr_from_cur, Index: linux-2.6-xfs/fs/xfs/xfs_alloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc.c 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc.c 2008-08-04 01:18:45.000000000 +0200 @@ -398,7 +398,7 @@ xfs_alloc_fixup_trees( /* * Delete the entry from the by-size btree. */ - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); /* @@ -427,7 +427,7 @@ xfs_alloc_fixup_trees( /* * No remaining freespace, just delete the by-block tree entry. 
*/ - if ((error = xfs_alloc_delete(bno_cur, &i))) + if ((error = xfs_btree_delete(bno_cur, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); } else { @@ -1651,7 +1651,7 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, ltbno, ltlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* @@ -1660,13 +1660,13 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, gtbno, gtlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* * Delete the old by-block entry for the right block. */ - if ((error = xfs_alloc_delete(bno_cur, &i))) + if ((error = xfs_btree_delete(bno_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* @@ -1711,7 +1711,7 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, ltbno, ltlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* @@ -1737,7 +1737,7 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, gtbno, gtlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* Index: linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_alloc_btree.h 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_alloc_btree.h 2008-08-04 01:18:45.000000000 +0200 @@ -95,13 +95,6 @@ typedef struct xfs_btree_sblock xfs_allo XFS_BTREE_PTR_ADDR(xfs_alloc, bb, i, 
XFS_ALLOC_BLOCK_MAXRECS(1, cur)) /* - * Delete the record pointed to by cur. - * The cursor refers to the place where the record was (could be inserted) - * when the operation returns. - */ -extern int xfs_alloc_delete(struct xfs_btree_cur *cur, int *stat); - -/* * Get the data from the pointed-to record. */ extern int xfs_alloc_get_rec(struct xfs_btree_cur *cur, xfs_agblock_t *bno, Index: linux-2.6-xfs/fs/xfs/xfs_bmap.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap.c 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap.c 2008-08-04 01:18:45.000000000 +0200 @@ -864,7 +864,7 @@ xfs_bmap_add_extent_delay_real( RIGHT.br_blockcount, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_btree_decrement(cur, 0, &i))) @@ -1425,13 +1425,13 @@ xfs_bmap_add_extent_unwritten_real( RIGHT.br_blockcount, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_btree_decrement(cur, 0, &i))) @@ -1474,7 +1474,7 @@ xfs_bmap_add_extent_unwritten_real( &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_btree_decrement(cur, 0, &i))) @@ -1517,7 +1517,7 @@ xfs_bmap_add_extent_unwritten_real( RIGHT.br_blockcount, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = 
xfs_btree_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_btree_decrement(cur, 0, &i))) @@ -2152,7 +2152,7 @@ xfs_bmap_add_extent_hole_real( right.br_blockcount, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); if ((error = xfs_btree_decrement(cur, 0, &i))) @@ -3216,7 +3216,7 @@ xfs_bmap_del_extent( flags |= XFS_ILOG_FEXT(whichfork); break; } - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; XFS_WANT_CORRUPTED_GOTO(i == 1, done); break; Index: linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_bmap_btree.h 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_bmap_btree.h 2008-08-04 01:18:45.000000000 +0200 @@ -237,10 +237,7 @@ typedef struct xfs_btree_lblock xfs_bmbt * Prototypes for xfs_bmap.c to call. 
*/ extern void xfs_bmdr_to_bmbt(xfs_bmdr_block_t *, int, xfs_bmbt_block_t *, int); -extern int xfs_bmbt_delete(struct xfs_btree_cur *, int *); extern void xfs_bmbt_get_all(xfs_bmbt_rec_host_t *r, xfs_bmbt_irec_t *s); -extern xfs_bmbt_block_t *xfs_bmbt_get_block(struct xfs_btree_cur *cur, - int, struct xfs_buf **bpp); extern xfs_filblks_t xfs_bmbt_get_blockcount(xfs_bmbt_rec_host_t *r); extern xfs_fsblock_t xfs_bmbt_get_startblock(xfs_bmbt_rec_host_t *r); extern xfs_fileoff_t xfs_bmbt_get_startoff(xfs_bmbt_rec_host_t *r); Index: linux-2.6-xfs/fs/xfs/xfs_ialloc.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc.c 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc.c 2008-08-04 01:18:45.000000000 +0200 @@ -1168,8 +1168,8 @@ xfs_difree( xfs_trans_mod_sb(tp, XFS_TRANS_SB_ICOUNT, -ilen); xfs_trans_mod_sb(tp, XFS_TRANS_SB_IFREE, -(ilen - 1)); - if ((error = xfs_inobt_delete(cur, &i))) { - cmn_err(CE_WARN, "xfs_difree: xfs_inobt_delete returned an error %d on %s.\n", + if ((error = xfs_btree_delete(cur, &i))) { + cmn_err(CE_WARN, "xfs_difree: xfs_btree_delete returned an error %d on %s.\n", error, mp->m_fsname); goto error0; } Index: linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_ialloc_btree.h 2008-08-04 01:18:41.000000000 +0200 +++ linux-2.6-xfs/fs/xfs/xfs_ialloc_btree.h 2008-08-04 01:18:45.000000000 +0200 @@ -117,13 +117,6 @@ typedef struct xfs_btree_sblock xfs_inob i, XFS_INOBT_BLOCK_MAXRECS(1, cur))) /* - * Delete the record pointed to by cur. - * The cursor refers to the place where the record was (could be inserted) - * when the operation returns. - */ -extern int xfs_inobt_delete(struct xfs_btree_cur *cur, int *stat); - -/* * Get the data from the pointed-to record. 
*/ extern int xfs_inobt_get_rec(struct xfs_btree_cur *cur, xfs_agino_t *ino, -- From owner-xfs@oss.sgi.com Sun Aug 3 18:53:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 18:53:20 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m741rFiE005734 for ; Sun, 3 Aug 2008 18:53:16 -0700 X-ASG-Debug-ID: 1217814868-40cc021b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail01.adl6.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0A708EF69DE for ; Sun, 3 Aug 2008 18:54:29 -0700 (PDT) Received: from ipmail01.adl6.internode.on.net (ipmail01.adl6.internode.on.net [203.16.214.146]) by cuda.sgi.com with ESMTP id 1xttSRFi8hnNX5Vw for ; Sun, 03 Aug 2008 18:54:29 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AkwMABf+lUh5LBjh/2dsb2JhbACKZKMp X-IronPort-AV: E=Sophos;i="4.31,302,1215354600"; d="scan'208";a="163504180" Received: from ppp121-44-24-225.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.24.225]) by ipmail01.adl6.internode.on.net with ESMTP; 04 Aug 2008 11:24:27 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KPpHF-0001ip-Rr; Mon, 04 Aug 2008 11:54:25 +1000 Date: Mon, 4 Aug 2008 11:54:25 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/26] generic btree implementation, version 3 Subject: Re: [PATCH 00/26] generic btree implementation, version 3 Message-ID: <20080804015425.GE6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013158.GA8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline 
In-Reply-To: <20080804013158.GA8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail01.adl6.internode.on.net[203.16.214.146] X-Barracuda-Start-Time: 1217814870 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1689 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17353 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs

On Mon, Aug 04, 2008 at 03:31:58AM +0200, Christoph Hellwig wrote:
> Firstly all but one of Dave's review comments are addressed. There is also
> a new first patch to clean up the definition of struct xfs_btree_block which
> I would consider an ASAP merge candidate.
>
> But there are also some bigger changes. I've reimplemented the newroot and
> killroot methods. In both cases one method was used for block rooted and
> inode rooted btree but doing something quite different. This patchset
> gets rid of these two methods completely and implements both cases in
> xfs_btree.c. For new_root that was as easy as just calling
> xfs_btree_new_root directly for the !XFS_BTREE_ROOT_IN_INODE and moving
> xfs_bmbt_newroot into the common code and calling it for the other case.
> The XFS_BTREE_ROOT_IN_INODE case for kill_root was similarly easily solved
> by just moving the code around and making it block/rec/key/ptr size
> agnostic, but the !XFS_BTREE_ROOT_IN_INODE needed a little care.
>
> The alloc and inobt trees were in fact using the almost exact sequence
> built from ->set_root and ->free_block but the odd way to pass the
> block to be freed made this non-obvious. By changing the calling
> convention to the normal one this one could be unified. I've left
> in some nasty asserts to ensure the assumptions about these blocks
> weren't wrong.

Ok. I'll comment on it when the patches come through....

> The other big news is the last patch in the series, which adds rec_len
> and key_len members to the btree_ops structure and uses this to make the
> ptr_addr, key_addr, rec_addr, set_key, set_rec, move_keys, move_recs,
> copy_keys, copy_recs, log_keys and log_recs methods generic. This
> patch by itself saves a few hundred lines of code and more than two
> kilobytes in the binary image. With this the possibly worse generated
> code that needs to multiply by variables instead of constants should
> easily be offset by the reduced instruction cache footprint. The patch
> is so far not really commented and more a proof of concept - if everyone
> agrees with this approach the changes will be merged into the earlier
> patches.

I think this is a good approach to take. It certainly makes the
move/copy/logging interfaces simpler and easier for future
btrees to implement....

> Now even with this move the set_*, move_* and copy_* interfaces are
> left as-is. All the old discussion still applies except that things
> might be a little more clear now that there's just one implementation
> each and everything is contained in one file. I think keeping the
> copy_* interfaces is a good idea, they are now basically just typed
> memcpy wrappers - although we should switch the order of the src
> and dst arguments to be the same as memcpy.

Agreed.

> The set_* interfaces > can probably go away - over copy_ they just add the index based > addressing which most callers don't use anyway. Yes, they could call the copy interface directly. > I'm not sure > what to do with move_* - these are the most ugly helpers, so maybe > we should just make them memmove wrappers in the style of copy_ > and leave all addressing to the callers. Yes, it would be nice to have them use the same interface. If we do that, then there's no real point for having a copy vs move distinction - we could just make everything use the memmove version and drop one of the interfaces altogether.... > With all these changes the stats for this series are now: > > 37 files changed, 6309 insertions(+), 7873 deletions(-) > > text data bss dec hex filename > 631577 4227 3092 638896 9bfb0 fs/xfs/xfs.ko.base > 614298 4435 3124 621857 97d21 fs/xfs/xfs.ko.btree FWIW, this aggregate doesn't show the real picture of the savings. What is really interesting is how small the individual implementations become.... Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Sun Aug 3 19:01:14 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 19:01:16 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m7421EJL006564 for ; Sun, 3 Aug 2008 19:01:14 -0700 X-ASG-Debug-ID: 1217815344-739f02970000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 73F3C355A86 for ; Sun, 3 Aug 2008 19:02:25 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id MAvYlMTpe608C2hh for ; Sun, 03 Aug 2008 19:02:25 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m7422QIF010238 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Mon, 4 Aug 2008 04:02:26 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m7422QIT010236; Mon, 4 Aug 2008 04:02:26 +0200 Date: Mon, 4 Aug 2008 04:02:26 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/26] generic btree implementation, version 3 Subject: Re: [PATCH 00/26] generic btree implementation, version 3 Message-ID: <20080804020226.GA9995@lst.de> References: <20080804013158.GA8819@lst.de> <20080804015425.GE6119@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804015425.GE6119@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217815347 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1690 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17354 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 11:54:25AM +1000, Dave Chinner wrote: > > With all these changes the stats for this series are now: > > > > 37 files changed, 6309 insertions(+), 7873 deletions(-) > > > > text data bss dec hex filename > > 631577 4227 3092 638896 9bfb0 fs/xfs/xfs.ko.base > > 614298 4435 3124 621857 97d21 fs/xfs/xfs.ko.btree > > FWIW, this aggregate doesn't show the real picture of the savings. > What is really interesting is how small the individual > implementations become.... 
hch@bigmac:~/work/linux-2.6-xfs$ wc -l fs/xfs/xfs{_alloc,_ialloc,_bmap,}_btree.c
   409 fs/xfs/xfs_alloc_btree.c
   309 fs/xfs/xfs_ialloc_btree.c
   858 fs/xfs/xfs_bmap_btree.c
  3813 fs/xfs/xfs_btree.c
  5389 total
hch@bigmac:~/work/linux-2.6-xfs$ size fs/xfs/xfs{_alloc,_ialloc,_bmap,}_btree.o
   text    data     bss     dec     hex filename
   2335       0       4    2339     923 fs/xfs/xfs_alloc_btree.o
   1589       0       4    1593     639 fs/xfs/xfs_ialloc_btree.o
   5827       0       4    5831    16c7 fs/xfs/xfs_bmap_btree.o
  32785       0       4   32789    8015 fs/xfs/xfs_btree.o

In comparison, for the old code:

hch@bigmac:~/work/linux-2.6-xfs$ wc -l fs/xfs/xfs{_alloc,_ialloc,_bmap,}_btree.c
  2211 fs/xfs/xfs_alloc_btree.c
  2078 fs/xfs/xfs_ialloc_btree.c
  2610 fs/xfs/xfs_bmap_btree.c
   914 fs/xfs/xfs_btree.c
  7813 total
hch@bigmac:~/work/linux-2.6-xfs$ size fs/xfs/xfs{_alloc,_ialloc,_bmap,}_btree.o
   text    data     bss     dec     hex filename
  13383       0       0   13383    3447 fs/xfs/xfs_alloc_btree.o
  12923       0       0   12923    327b fs/xfs/xfs_ialloc_btree.o
  28870      22       4   28896    70e0 fs/xfs/xfs_bmap_btree.o
   7062       0       4    7066    1b9a fs/xfs/xfs_btree.o

From owner-xfs@oss.sgi.com Sun Aug 3 19:59:57 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 20:00:02 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m742xuwH010286 for ; Sun, 3 Aug 2008 19:59:56 -0700 Received: from cxfsmac10.melbourne.sgi.com (cxfsmac10.melbourne.sgi.com [134.14.55.100]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA02331; Mon, 4 Aug 2008 13:01:07 +1000 Message-ID: <489670F3.7070706@sgi.com> Date: Mon, 04 Aug 2008 13:01:07 +1000 From: Donald Douwsma User-Agent: Thunderbird 2.0.0.16 (Macintosh/20080707) MIME-Version: 1.0 To: Ruben Porras CC: xfs@oss.sgi.com Subject: Re: [PING] do not test return value of
xfs_bmap_*_count_leaves References: <1217281149.5345.39.camel@marzo> In-Reply-To: <1217281149.5345.39.camel@marzo> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17355 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs

Ruben Porras wrote:
> http://oss.sgi.com/archives/xfs/2008-04/msg00087.html
>

Got it, sorry we dropped it for so long.

Your second patch missed the int/void declaration changes, so I've added
them to the patch. I'm also assuming the following Signed-off-by and
description; is that OK with you?

Make xfs_bmap_*_count_leaves void.

xfs_bmap_count_leaves and xfs_bmap_disk_count_leaves always return 0,
make them void.

Signed-off-by: "Ruben Porras "

--- fs/xfs/xfs_bmap.c.orig +++ fs/xfs/xfs_bmap.c @@ -384,14 +384,14 @@ xfs_bmap_count_tree( int levelin, int *count); -STATIC int +STATIC void xfs_bmap_count_leaves( xfs_ifork_t *ifp, xfs_extnum_t idx, int numrecs, int *count); -STATIC int +STATIC void xfs_bmap_disk_count_leaves( xfs_extnum_t idx, xfs_bmbt_block_t *block, @@ -6367,13 +6367,9 @@ xfs_bmap_count_blocks( mp = ip->i_mount; ifp = XFS_IFORK_PTR(ip, whichfork); if ( XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_EXTENTS ) { - if (unlikely(xfs_bmap_count_leaves(ifp, 0, + xfs_bmap_count_leaves(ifp, 0, ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t), - count) < 0)) { - XFS_ERROR_REPORT("xfs_bmap_count_blocks(1)", - XFS_ERRLEVEL_LOW, mp); - return XFS_ERROR(EFSCORRUPTED); - } + count); return 0; } @@ -6454,13 +6450,7 @@ xfs_bmap_count_tree( for (;;) { nextbno = be64_to_cpu(block->bb_rightsib); numrecs = be16_to_cpu(block->bb_numrecs); - if (unlikely(xfs_bmap_disk_count_leaves(0, - block, numrecs, count) < 0)) { - xfs_trans_brelse(tp, bp); - XFS_ERROR_REPORT("xfs_bmap_count_tree(2)", -
XFS_ERRLEVEL_LOW, mp); - return XFS_ERROR(EFSCORRUPTED); - } + xfs_bmap_disk_count_leaves(0, block, numrecs, count); xfs_trans_brelse(tp, bp); if (nextbno == NULLFSBLOCK) break; @@ -6478,7 +6468,7 @@ xfs_bmap_count_tree( /* * Count leaf blocks given a range of extent records. */ -STATIC int +STATIC void xfs_bmap_count_leaves( xfs_ifork_t *ifp, xfs_extnum_t idx, @@ -6491,14 +6481,13 @@ xfs_bmap_count_leaves( xfs_bmbt_rec_host_t *frp = xfs_iext_get_ext(ifp, idx + b); *count += xfs_bmbt_get_blockcount(frp); } - return 0; } /* * Count leaf blocks given a range of extent records originally * in btree format. */ -STATIC int +STATIC void xfs_bmap_disk_count_leaves( xfs_extnum_t idx, xfs_bmbt_block_t *block, @@ -6512,5 +6501,4 @@ xfs_bmap_disk_count_leaves( frp = XFS_BTREE_REC_ADDR(xfs_bmbt, block, idx + b); *count += xfs_bmbt_disk_get_blockcount(frp); } - return 0; } From owner-xfs@oss.sgi.com Sun Aug 3 22:12:51 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 22:12:58 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m745CmYr022852 for ; Sun, 3 Aug 2008 22:12:50 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA04565; Mon, 4 Aug 2008 15:13:57 +1000 Message-ID: <48969015.6020205@sgi.com> Date: Mon, 04 Aug 2008 15:13:57 +1000 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.14 (Macintosh/20080421) MIME-Version: 1.0 To: Dave Chinner , "Linda A. Walsh" CC: xfs-oss Subject: Re: Question about extended attributes... 
References: <48925495.7040804@tlinx.org> <4892B361.9030900@sgi.com> <20080801123253.GG6201@disturbed> In-Reply-To: <20080801123253.GG6201@disturbed> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17356 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Dave Chinner wrote: > On Fri, Aug 01, 2008 at 04:55:29PM +1000, Timothy Shimmin wrote: >> Hi Linda, >> >> Linda A. Walsh wrote: >>> my man page says extended xfs attributes can have 256-byte names >>> with up to 64K of data. >>> >>> Is there a limit on the number of extended attributes max data size or >>> name size? >>> >>> I.e. could I have 1000 attributes with 64K of data each? >>> >> Yep. >> >>> Is there a strong reason why the file and data sizes were limited to >>> 256/64K? > > .... > >> I'm not sure why 64K was chosen for a value size limit. > > Because changes to EAs are journalled. Hence there must be a bound > size limit because log space is limited. > Yeah, good point. Which I guess also reflects how we consider extended attributes to be more for metadata. 
--Tim From owner-xfs@oss.sgi.com Sun Aug 3 23:16:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 23:16:45 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m746GW64026859 for ; Sun, 3 Aug 2008 23:16:35 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA05706; Mon, 4 Aug 2008 16:17:37 +1000 Message-ID: <4896A070.2010805@sgi.com> Date: Mon, 04 Aug 2008 16:23:44 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Nick Piggin CC: Linux Memory Management List , xfs@oss.sgi.com, xen-devel@lists.xensource.com, Linux Kernel Mailing List , Andrew Morton , dri-devel@lists.sourceforge.net Subject: Re: [rfc][patch 2/3] xfs: remove vmap cache References: <20080728123438.GA13926@wotan.suse.de> <20080728123621.GB13926@wotan.suse.de> In-Reply-To: <20080728123621.GB13926@wotan.suse.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17357 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Nick Piggin wrote: > XFS's vmap batching simply defers a number (up to 64) of vunmaps, and keeps > track of them in a list. To purge the batch, it just goes through the list and > calls vunmap on each one.
This is pretty poor: a global TLB flush is still > performed on each vunmap, with the most expensive parts of the operation > being the broadcast IPIs and locking involved in the SMP callouts, and the > locking involved in the vmap management -- none of these are avoided by just > batching up the calls. I'm actually surprised it ever made much difference > at all. So am I. > > Rip all this logic out of XFS completely. I improve vmap performance > and scalability directly in the previous and subsequent patch. Sounds good to me. > > Signed-off-by: Nick Piggin > --- > > Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c > =================================================================== > --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c > +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c > @@ -166,75 +166,6 @@ test_page_region( > } > > /* > - * Mapping of multi-page buffers into contiguous virtual space > - */ > - > -typedef struct a_list { > - void *vm_addr; > - struct a_list *next; > -} a_list_t; > - > -static a_list_t *as_free_head; > -static int as_list_len; > -static DEFINE_SPINLOCK(as_lock); > - > -/* > - * Try to batch vunmaps because they are costly. > - */ > -STATIC void > -free_address( > - void *addr) > -{ > - a_list_t *aentry; > - > -#ifdef CONFIG_XEN > - /* > - * Xen needs to be able to make sure it can get an exclusive > - * RO mapping of pages it wants to turn into a pagetable. If > - * a newly allocated page is also still being vmap()ed by xfs, > - * it will cause pagetable construction to fail. This is a > - * quick workaround to always eagerly unmap pages so that Xen > - * is happy. 
> - */ > - vunmap(addr); > - return; > -#endif > - > - aentry = kmalloc(sizeof(a_list_t), GFP_NOWAIT); > - if (likely(aentry)) { > - spin_lock(&as_lock); > - aentry->next = as_free_head; > - aentry->vm_addr = addr; > - as_free_head = aentry; > - as_list_len++; > - spin_unlock(&as_lock); > - } else { > - vunmap(addr); > - } > -} > - > -STATIC void > -purge_addresses(void) > -{ > - a_list_t *aentry, *old; > - > - if (as_free_head == NULL) > - return; > - > - spin_lock(&as_lock); > - aentry = as_free_head; > - as_free_head = NULL; > - as_list_len = 0; > - spin_unlock(&as_lock); > - > - while ((old = aentry) != NULL) { > - vunmap(aentry->vm_addr); > - aentry = aentry->next; > - kfree(old); > - } > -} > - > -/* > * Internal xfs_buf_t object manipulation > */ > > @@ -334,7 +265,7 @@ xfs_buf_free( > uint i; > > if ((bp->b_flags & XBF_MAPPED) && (bp->b_page_count > 1)) > - free_address(bp->b_addr - bp->b_offset); > + vunmap(bp->b_addr - bp->b_offset); > > for (i = 0; i < bp->b_page_count; i++) { > struct page *page = bp->b_pages[i]; > @@ -456,8 +387,6 @@ _xfs_buf_map_pages( > bp->b_addr = page_address(bp->b_pages[0]) + bp->b_offset; > bp->b_flags |= XBF_MAPPED; > } else if (flags & XBF_MAPPED) { > - if (as_list_len > 64) > - purge_addresses(); > bp->b_addr = vmap(bp->b_pages, bp->b_page_count, > VM_MAP, PAGE_KERNEL); > if (unlikely(bp->b_addr == NULL)) > @@ -1739,8 +1668,6 @@ xfsbufd( > count++; > } > > - if (as_list_len > 0) > - purge_addresses(); > if (count) > blk_run_address_space(target->bt_mapping); > > > > From owner-xfs@oss.sgi.com Sun Aug 3 23:21:22 2008 Received: with ECARTIS (v1.0.0; list xfs); Sun, 03 Aug 2008 23:21:24 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with 
SMTP id m746LKQE027398 for ; Sun, 3 Aug 2008 23:21:21 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA05860; Mon, 4 Aug 2008 16:22:32 +1000 Message-ID: <4896A197.3090004@sgi.com> Date: Mon, 04 Aug 2008 16:28:39 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Nick Piggin CC: Linux Memory Management List , xfs@oss.sgi.com, xen-devel@lists.xensource.com, Linux Kernel Mailing List , Andrew Morton , dri-devel@lists.sourceforge.net Subject: Re: [rfc][patch 3/3] xfs: use new vmap API References: <20080728123438.GA13926@wotan.suse.de> <20080728123703.GC13926@wotan.suse.de> In-Reply-To: <20080728123703.GC13926@wotan.suse.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17358 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Looks good to me. Nick Piggin wrote: > Implement XFS's large buffer support with the new vmap APIs. See the vmap > rewrite patch for some numbers. 
> > Signed-off-by: Nick Piggin > --- > > Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c > =================================================================== > --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c > +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c > @@ -265,7 +265,7 @@ xfs_buf_free( > uint i; > > if ((bp->b_flags & XBF_MAPPED) && (bp->b_page_count > 1)) > - vunmap(bp->b_addr - bp->b_offset); > + vm_unmap_ram(bp->b_addr - bp->b_offset, bp->b_page_count); > > for (i = 0; i < bp->b_page_count; i++) { > struct page *page = bp->b_pages[i]; > @@ -387,8 +387,8 @@ _xfs_buf_map_pages( > bp->b_addr = page_address(bp->b_pages[0]) + bp->b_offset; > bp->b_flags |= XBF_MAPPED; > } else if (flags & XBF_MAPPED) { > - bp->b_addr = vmap(bp->b_pages, bp->b_page_count, > - VM_MAP, PAGE_KERNEL); > + bp->b_addr = vm_map_ram(bp->b_pages, bp->b_page_count, > + -1, PAGE_KERNEL); > if (unlikely(bp->b_addr == NULL)) > return -ENOMEM; > bp->b_addr += bp->b_offset; > > > From owner-xfs@oss.sgi.com Mon Aug 4 07:20:21 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 07:20:48 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_63 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m74EKK7A002839 for ; Mon, 4 Aug 2008 07:20:21 -0700 X-ASG-Debug-ID: 1217859693-46e603210000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7B340EFE952 for ; Mon, 4 Aug 2008 07:21:33 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id auB4GioeNUJRKnMV for ; Mon, 04 Aug 2008 07:21:33 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id 
m74ELWIF010136 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Mon, 4 Aug 2008 16:21:32 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m74ELWsS010134; Mon, 4 Aug 2008 16:21:32 +0200 Date: Mon, 4 Aug 2008 16:21:32 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/26] generic btree implementation, version 3 Subject: Re: [PATCH 00/26] generic btree implementation, version 3 Message-ID: <20080804142131.GA9892@lst.de> References: <20080804013158.GA8819@lst.de> <20080804015425.GE6119@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804015425.GE6119@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217859695 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1736 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17359 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 11:54:25AM +1000, Dave Chinner wrote: > > I'm not sure > > what to do with move_* - these are the most ugly helpers, so maybe > > we should just make them memmove wrappers in the style of copy_ > > and leave all addressing to the callers. 
>
> Yes, it would be nice to have them use the same interface. If
> we do that, then there's no real point for having a copy vs move
> distinction - we could just make everything use the
> memmove version and drop one of the interfaces altogether....

I've actually come up with another variant. Since what we do in the
memmove case is to move a number of entries in a single block up or down
one position I've added the following helper:

STATIC void
xfs_btree_shift_keys(
	struct xfs_btree_cur	*cur,
	union xfs_btree_key	*key,
	int			dir,
	int			numkeys)
{
	char			*dst_key;

	ASSERT(numkeys >= 0);
	ASSERT(dir == 1 || dir == -1);

	dst_key = (char *)key + (dir * cur->bc_ops->key_len);
	memmove(dst_key, key, numkeys * cur->bc_ops->key_len);
}

and the same for ptrs and recs. This follows the original code in
spirit and is quite readable.

From owner-xfs@oss.sgi.com Mon Aug 4 09:46:41 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 09:46:43 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m74GkdFB016120 for ; Mon, 4 Aug 2008 09:46:40 -0700 X-ASG-Debug-ID: 1217868471-6a1c02a10000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from smtp.stepping-stone.ch (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 58DCAEFF711 for ; Mon, 4 Aug 2008 09:47:52 -0700 (PDT) Received: from smtp.stepping-stone.ch (smtp.stepping-stone.ch [193.58.255.135]) by cuda.sgi.com with ESMTP id P6PndgRMBADI6l0i for ; Mon, 04 Aug 2008 09:47:52 -0700 (PDT) Received: from localhost (mail-scanner-01.int.stepping-stone.ch [10.59.255.136]) by smtp.stepping-stone.ch (Postfix) with ESMTP id 3497688F26C for ; Mon, 4 Aug 2008 18:47:50 +0200 (CEST) Received: from smtp.stepping-stone.ch
([10.59.255.135]) by localhost (mail-scanner-01.int.stepping-stone.ch [10.59.255.136]) (amavisd-new, port 10024) with LMTP id 29903-02-23 for ; Mon, 4 Aug 2008 18:47:47 +0200 (CEST) Received: from [192.168.1.201] (unknown [212.103.65.198]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by smtp.stepping-stone.ch (Postfix) with ESMTP id 78F3889076D for ; Mon, 4 Aug 2008 18:47:47 +0200 (CEST) Message-ID: <489732B2.7000201@stepping-stone.ch> Date: Mon, 04 Aug 2008 18:47:46 +0200 From: Christian Affolter User-Agent: Thunderbird 2.0.0.14 (X11/20080505) MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Corruption of in-memory data detected - on heavy hard linking Subject: Re: Corruption of in-memory data detected - on heavy hard linking References: <48876D03.8010804@stepping-stone.ch> <20080725052051.GA26367@infradead.org> In-Reply-To: <20080725052051.GA26367@infradead.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Scanned: amavisd-new at stepping-stone.ch X-Barracuda-Connect: smtp.stepping-stone.ch[193.58.255.135] X-Barracuda-Start-Time: 1217868474 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1748 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Status: Clean X-archive-position: 17360 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: c.affolter@stepping-stone.ch Precedence: bulk X-list: xfs Hi > On Wed, Jul 23, 2008 at 07:40:19PM +0200, Christian 
Affolter wrote: >> Kernel-Error: >> Filesystem "sdc1": XFS internal error xfs_trans_cancel at line 1163 of >> file fs/xfs/xfs_trans.c. Caller 0xffffffff803a4fcf >> Pid: 22816, comm: cp Not tainted 2.6.24-gentoo-r8 #1 > > 2.6.24 is pretty old. Did you try with a recent kernel? We had some > fixes for in-core memory corruption although I don't remember one in > this area. I finally found the time to update the kernel to a recent 2.6.26 version. Unfortunately the problem still exists: Filesystem "dm-3": XFS internal error xfs_trans_cancel at line 1163 of file fs/xfs/xfs_trans.c. Caller 0xffffffff803a6672 Pid: 12584, comm: cp Not tainted 2.6.26-gentoo #1 Call Trace: [] xfs_create+0x1c2/0x4c0 [] xfs_trans_cancel+0x126/0x150 [] xfs_create+0x1c2/0x4c0 [] xfs_vn_mknod+0x16d/0x2c0 [] vfs_create+0xcc/0x130 [] do_filp_open+0x77f/0x860 [] do_sys_open+0x5a/0xf0 [] system_call_after_swapgs+0x7b/0x80 xfs_force_shutdown(dm-3,0x8) called from line 1164 of file fs/xfs/xfs_trans.c. Return address = 0xffffffff8039fd2f Filesystem "dm-3": Corruption of in-memory data detected. Shutting down filesystem: dm-3 Please umount the filesystem, and rectify the problem(s) Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. xfs_force_shutdown(dm-3,0x1) called from line 420 of file fs/xfs/xfs_rw.c. Return address = 0xffffffff803a9529 Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. 
xfs_force_shutdown(dm-3,0x1) called from line 420 of file fs/xfs/xfs_rw.c. Return address = 0xffffffff803a9529 Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Filesystem "dm-3": xfs_log_force: error 5 returned. Before the shutdown happens the copy command receives a "No space left on device" error: cp: cannot create regular file `[file name snipped': No space left on device cp: cannot create regular file `[file name snipped]': Input/output error Although the device has more than 50% free space as well as free inodes. The affected device was initialized with old xfsprogs (2.8.11): meta-data=/dev/evms/vol1 isize=256 agcount=3207, agsize=4096 blks = sectsz=512 attr=0 data = bsize=4096 blocks=13132799, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 log =internal bsize=4096 blocks=1024, version=1 = sectsz=512 sunit=0 blks, lazy-count=0 realtime =none extsz=65536 blocks=0, rtextents=0 Creating a new device with xfsprogs (2.9.7) leads to the following layout: meta-data=/dev/sdc1 isize=256 agcount=5, agsize=3662818 blks = sectsz=512 attr=2 data = bsize=4096 blocks=17750000, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 log =internal bsize=4096 blocks=7153, version=2 = sectsz=512 sunit=0 blks, lazy-count=0 realtime =none extsz=4096 blocks=0, rtextents=0 On the newly created device, the problem is much harder to reproduce, however it happens nonetheless after around a day of heavy copying and deleting. Any further hints? 
Many thanks Chris From owner-xfs@oss.sgi.com Mon Aug 4 17:18:52 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 17:18:55 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m750Iph3018189 for ; Mon, 4 Aug 2008 17:18:51 -0700 X-ASG-Debug-ID: 1217895603-246003b20000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 158F535AEE4 for ; Mon, 4 Aug 2008 17:20:04 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id MaLyzP4k8JB52PxN for ; Mon, 04 Aug 2008 17:20:04 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvwLAJU5l0h5LAiF/2dsb2JhbACKaKYG X-IronPort-AV: E=Sophos;i="4.31,306,1215354600"; d="scan'208";a="174248469" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 09:49:53 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQAHI-0000LQ-J0; Tue, 05 Aug 2008 10:19:52 +1000 Date: Tue, 5 Aug 2008 10:19:52 +1000 From: Dave Chinner To: Christian Affolter Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Corruption of in-memory data detected - on heavy hard linking Subject: Re: Corruption of in-memory data detected - on heavy hard linking Message-ID: <20080805001952.GI6119@disturbed> Mail-Followup-To: Christian Affolter , xfs@oss.sgi.com References: <48876D03.8010804@stepping-stone.ch> <20080725052051.GA26367@infradead.org> <489732B2.7000201@stepping-stone.ch> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: <489732B2.7000201@stepping-stone.ch> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217895606 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1778 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17361 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 06:47:46PM +0200, Christian Affolter wrote: > Hi > >> On Wed, Jul 23, 2008 at 07:40:19PM +0200, Christian Affolter wrote: >>> Kernel-Error: >>> Filesystem "sdc1": XFS internal error xfs_trans_cancel at line 1163 >>> of file fs/xfs/xfs_trans.c. Caller 0xffffffff803a4fcf >>> Pid: 22816, comm: cp Not tainted 2.6.24-gentoo-r8 #1 >> >> 2.6.24 is pretty old. Did you try with a recent kernel? We had some >> fixes for in-core memory corruption although I don't remember one in >> this area. > > I finally found the time to update the kernel to a recent 2.6.26 version. > > Unfortunately the problem still exists: > Filesystem "dm-3": XFS internal error xfs_trans_cancel at line 1163 of > file fs/xfs/xfs_trans.c. Caller 0xffffffff803a6672 > Pid: 12584, comm: cp Not tainted 2.6.26-gentoo #1 Ok, what we need is the following. First, try to reproduce the problem on a small filesystem (say a few GB). 
Once you've reproduced the problem, unmount and remount the filesystem to get the log replayed, then take a xfs_metadump image of the filesystem. Put the metadump image somewhere that can be downloaded (ftp/web site) and let us know where it is. If this is anything like the previous problem I found and fixed, then it will be a corner-case bug that is only triggered by a specific layout of free space and we need the filesystem image to be able to work out exactly what corner case is broken.... > Before the shutdown happens the copy command receives a > "No space left on device" error: > cp: cannot create regular file `[file name snipped': No space left on device > cp: cannot create regular file `[file name snipped]': Input/output error > > Although the device has more than 50% free space as well as free inodes. It will be an AG that is out of space, not the entire filesystem. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 17:24:59 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 17:25:01 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_63 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m750Owv0018801 for ; Mon, 4 Aug 2008 17:24:59 -0700 X-ASG-Debug-ID: 1217895972-194b02030000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 307B5F027BF for ; Mon, 4 Aug 2008 17:26:12 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id TijMNu21BMH2UwXg for ; Mon, 04 Aug 2008 17:26:12 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: 
AvwLAJU5l0h5LAiF/2dsb2JhbACKaKYG X-IronPort-AV: E=Sophos;i="4.31,306,1215354600"; d="scan'208";a="174254005" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 09:56:10 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQANL-0000U1-8s; Tue, 05 Aug 2008 10:26:07 +1000 Date: Tue, 5 Aug 2008 10:26:07 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/26] generic btree implementation, version 3 Subject: Re: [PATCH 00/26] generic btree implementation, version 3 Message-ID: <20080805002607.GJ6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013158.GA8819@lst.de> <20080804015425.GE6119@disturbed> <20080804142131.GA9892@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804142131.GA9892@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217895973 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1778 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17362 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 04:21:32PM +0200, 
Christoph Hellwig wrote:
> On Mon, Aug 04, 2008 at 11:54:25AM +1000, Dave Chinner wrote:
> > > I'm not sure
> > > what to do with move_* - these are the most ugly helpers, so maybe
> > > we should just make them memmove wrappers in the style of copy_
> > > and leave all addressing to the callers.
> >
> > Yes, it would be nice to have them use the same interface. If
> > we do that, then there's no real point for having a copy vs move
> > distinction - we could just make everything use the
> > memmove version and drop one of the interfaces altogether....
>
> I've actually come up with another variant. Since what we do in the
> memmove case is to move a number of entries in a single block up or down
> one position I've added the following helper:
>
> STATIC void
> xfs_btree_shift_keys(
> 	struct xfs_btree_cur	*cur,
> 	union xfs_btree_key	*key,
> 	int			dir,
> 	int			numkeys)
> {
> 	char			*dst_key;
>
> 	ASSERT(numkeys >= 0);
> 	ASSERT(dir == 1 || dir == -1);
>
> 	dst_key = (char *)key + (dir * cur->bc_ops->key_len);
> 	memmove(dst_key, key, numkeys * cur->bc_ops->key_len);
> }
>
> and the same for ptrs and recs. This follows the original code in
> spirit and is quite readable.

Nice. That fits nicely into the 'make a hole' or 'fill a hole'
parts of various functions which will remove a lot of magic from
them. The only thing I'd do is add an enum for the direction so the
callers are self-documenting as to the direction of the shift....

Cheers,

Dave.
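
For illustration only, a minimal standalone sketch of what a shift helper
with a direction enum could look like. The names shift_dir, SHIFT_UP,
SHIFT_DOWN and shift_keys, and the explicit key_len parameter, are
assumptions made for this sketch; they are not taken from the patch
series, where the helper takes a struct xfs_btree_cur and reads
cur->bc_ops->key_len.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical direction enum; names are illustrative, not from the patches. */
enum shift_dir {
	SHIFT_DOWN = -1,	/* move entries one slot towards index 0 (fill a hole) */
	SHIFT_UP   = 1		/* move entries one slot towards the end (make a hole) */
};

/*
 * Move numkeys entries, starting at key, by one slot in direction dir.
 * key_len stands in for cur->bc_ops->key_len in the quoted helper.
 * memmove is used because source and destination overlap.
 */
static void
shift_keys(void *key, size_t key_len, enum shift_dir dir, int numkeys)
{
	char	*dst_key;

	assert(numkeys >= 0);
	assert(dir == SHIFT_UP || dir == SHIFT_DOWN);

	dst_key = (char *)key + (int)dir * (ptrdiff_t)key_len;
	memmove(dst_key, key, (size_t)numkeys * key_len);
}
```

A caller making room at slot 1 of a four-entry array would then write
shift_keys(&a[1], sizeof(a[0]), SHIFT_UP, 2), which reads as "shift up"
at the call site instead of a bare 1 or -1.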
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 17:48:11 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 17:48:13 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m750mAT8020213 for ; Mon, 4 Aug 2008 17:48:11 -0700 X-ASG-Debug-ID: 1217897362-194d03600000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3DF8CF02A19 for ; Mon, 4 Aug 2008 17:49:23 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id F1VephwCmPrndohj for ; Mon, 04 Aug 2008 17:49:23 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m750nOIF005069 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Tue, 5 Aug 2008 02:49:24 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m750nOp1005067; Tue, 5 Aug 2008 02:49:24 +0200 Date: Tue, 5 Aug 2008 02:49:24 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/26] generic btree implementation, version 3 Subject: Re: [PATCH 00/26] generic btree implementation, version 3 Message-ID: <20080805004924.GA5002@lst.de> References: <20080804013158.GA8819@lst.de> <20080804015425.GE6119@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804015425.GE6119@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217897365 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1778 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17363 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 11:54:25AM +1000, Dave Chinner wrote: > > The alloc and inobt trees were in fact using the almost exact sequence > > built from ->set_root and ->free_block but the odd way to pass the > > block to be freed made this non-obvious. By changing the calling > > convention to the normal one this one could be unified. I've left > > in some nasty asserts to ensure the assumptions about these blocks > > wasn't wrong. > > Ok. I'll comment on it when the patches come through.... The killroot unification turned out to be buggy, I'll fix it in the next round of patches. 
From owner-xfs@oss.sgi.com Mon Aug 4 17:49:19 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 17:49:21 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m750nJ0q020593 for ; Mon, 4 Aug 2008 17:49:19 -0700 X-ASG-Debug-ID: 1217897430-3f4a02900000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 853C519844B3 for ; Mon, 4 Aug 2008 17:50:31 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id ggd2JBrNNTOblqyt for ; Mon, 04 Aug 2008 17:50:31 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvwLAJdAl0h5LAiF/2dsb2JhbACKaKV+ X-IronPort-AV: E=Sophos;i="4.31,306,1215354600"; d="scan'208";a="174274106" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 10:20:19 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQAkk-00019R-ES; Tue, 05 Aug 2008 10:50:18 +1000 Date: Tue, 5 Aug 2008 10:50:13 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 20/26] move xfs_bmbt_newroot to common code Subject: Re: [PATCH 20/26] move xfs_bmbt_newroot to common code Message-ID: <20080805005013.GK6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013530.GU8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013530.GU8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: 
ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217897433 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1780 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17364 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs

On Mon, Aug 04, 2008 at 03:35:30AM +0200, Christoph Hellwig wrote:
> xfs_bmbt_newroot is a mostly generic implementation of moving from
> an inode root to a real block based root. So move it to xfs_btree.c
> where it can use all the nice infrastructure there and make it pointer
> size agnostic
>
> The new name for it is xfs_btree_iroot_to_root which is not very
> nice but at least slightly more descriptive than the old name.

.....

> +
> +	XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY);
> +	XFS_BTREE_STATS_INC(cur, newroot);
> +
> +	level = cur->bc_nlevels - 1;
> +	/* XXX(hch): this should be an get_inode_from_root, right? */
> +	block = xfs_btree_get_block(cur, level, &bp);

Yes, probably should be. Also an assert to ensure the btree
does have the root in inode flag set might be appropriate....

Otherwise looks fine.

Cheers,

Dave.
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 18:07:17 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:07:18 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,MIME_8BIT_HEADER, RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m7517G3d021799 for ; Mon, 4 Aug 2008 18:07:16 -0700 X-ASG-Debug-ID: 1217898509-3f8e03bb0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3502A1984588 for ; Mon, 4 Aug 2008 18:08:30 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id APzhMdqwHf0AUDk9 for ; Mon, 04 Aug 2008 18:08:30 -0700 (PDT) Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 10:35:58 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQAzt-0001Vi-5e; Tue, 05 Aug 2008 11:05:57 +1000 Date: Tue, 5 Aug 2008 11:05:57 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 21/26] implement =?utf-8?Q?gene?= =?utf-8?Q?ric_xfs=5Fbtree=5Fin=D1=95ert=2Finsrec?= Subject: Re: [PATCH 21/26] implement =?utf-8?Q?gene?= =?utf-8?Q?ric_xfs=5Fbtree=5Fin=D1=95ert=2Finsrec?= Message-ID: <20080805010557.GL6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013535.GV8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013535.GV8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: 
ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217898511 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1780 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17365 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs

On Mon, Aug 04, 2008 at 03:35:35AM +0200, Christoph Hellwig wrote:
> Make the btree insert code generic. Based on a patch from David Chinner
> with lots of changes to follow the original btree implementations more
> closely. While this loses some of the generic helper routines for
> inserting/moving/removing records it also solves some of the one off
> bugs in the original code and makes it easier to verify.

....

> +	if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) {
> +		if (numrecs < cur->bc_ops->get_dmaxrecs(cur, level)) {
> +			/* A resizeable root block that can be made bigger. */
> +			cur->bc_ops->realloc_root(cur, 1);
> +			return 0;
> +		}

I think that ->get_dmaxrecs is probably badly named. I called it
that originally because it matched the macro name it was wrapping.
Realistically it should be ->get_root_maxrecs....

> +	/* Make a key out of the record data to be inserted, and save it. */
> +	cur->bc_ops->init_key_from_rec(cur, &key, recp);
> +
> +	/* If we're off the left edge, return failure.
*/
> +	ptr = cur->bc_ptrs[level];
> +	if (ptr == 0) {
> +		XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
> +		*stat = 0;
> +		return 0;
> +	}

You can probably move the key initialisation till after the
initial 'in this block' checks.

> +	/*
> +	 * If the block is full, we can't insert the new entry until we
> +	 * make the block un-full.
> +	 */
> +	xfs_btree_set_ptr_null(cur, &nptr);
> +	if (numrecs == cur->bc_ops->get_maxrecs(cur, level)) {
> +		error = xfs_btree_make_block_unfull(cur, level, numrecs,
> +					&optr, &ptr, &nptr, &ncur, &nrec, stat);
> +		if (error || *stat == 0)
> +			goto error0;
> +	}
> +
> +	/* The current block may have changed during the split. */
> +	block = xfs_btree_get_block(cur, level, &bp);
> +	numrecs = xfs_btree_get_numrecs(block);

The comment here should probably refer to the unfull call, not
a 'split'. i.e.:

	/*
	 * the current block may have changed if the block was
	 * previously full and we have just made space in it.
	 */

Otherwise looks good.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

From owner-xfs@oss.sgi.com Mon Aug 4 18:13:53 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:13:55 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65, RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751Dr31022427 for ; Mon, 4 Aug 2008 18:13:53 -0700 X-ASG-Debug-ID: 1217898906-6d0100b50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B00F919845CF for ; Mon, 4 Aug 2008 18:15:06 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id LX5t736CKaSJDPct for ; Mon, 04 Aug 2008 18:15:06 -0700 (PDT)
X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvwLAOBDl0h5LAiF/2dsb2JhbACKaKVq X-IronPort-AV: E=Sophos;i="4.31,306,1215354600"; d="scan'208";a="174293186" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 10:44:38 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQB8H-0001gg-SG; Tue, 05 Aug 2008 11:14:37 +1000 Date: Tue, 5 Aug 2008 11:14:37 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 22/26] move xfs_bmbt_killroot to common code Subject: Re: [PATCH 22/26] move xfs_bmbt_killroot to common code Message-ID: <20080805011437.GM6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013542.GW8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013542.GW8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217898907 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1780 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17366 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 03:35:42AM +0200, Christoph Hellwig wrote: > xfs_bmbt_killroot is a mostly generic implementation of 
moving from > a real block based root to an inode based root. So move it to xfs_btree.c > where it can use all the nice infrastructure there and make it pointer > size agnostic > > The new name for it is xfs_btree_root_to_iroot which is not very > nice but at least slightly more descriptive than the old name. ..... > + > + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); > + level = cur->bc_nlevels - 1; > + ASSERT(level >= 1); probably should assert root in inode is set here. > + cblock = xfs_btree_get_block(cur, level - 1, &cbp); > + numrecs = xfs_btree_get_numrecs(cblock); > + > + if (numrecs > cur->bc_ops->get_dmaxrecs(cur, level)) > + goto out0; ^^^^ Stray whitespace. > + > + XFS_BTREE_STATS_INC(cur, killroot); > + > +#ifdef DEBUG > + xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_LEFTSIB); > + ASSERT(xfs_btree_ptr_is_null(cur, &ptr)); > + xfs_btree_get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB); > + ASSERT(xfs_btree_ptr_is_null(cur, &ptr)); > + > + // XXX(hch): this assert is bmap btree specific > + ASSERT(cur->bc_ops->get_maxrecs(cur, level) == ^ > + XFS_BMAP_BROOT_MAXRECS(ifp->if_broot_bytes)); Stray whitespace. As to the assert - what is it really trying to check? That the btree root space in the inode is large enough to fit the max number of records? If so, does it really need to be checked here (i.e. could the caller do it?) Otherwise looks ok. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 18:25:19 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:25:23 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65, RDNS_NONE,URIBL_BLACK autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751PGU0023270 for ; Mon, 4 Aug 2008 18:25:18 -0700 X-ASG-Debug-ID: 1217899590-6cd4019e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C275B19848A1 for ; Mon, 4 Aug 2008 18:26:30 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id rVFx9uulCixCSbsr for ; Mon, 04 Aug 2008 18:26:30 -0700 (PDT) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1KQBJm-0006A0-CO; Tue, 05 Aug 2008 01:26:30 +0000 Date: Mon, 4 Aug 2008 21:26:30 -0400 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 22/26] move xfs_bmbt_killroot to common code Subject: Re: [PATCH 22/26] move xfs_bmbt_killroot to common code Message-ID: <20080805012630.GA22147@infradead.org> References: <20080804013542.GW8819@lst.de> <20080805011437.GM6119@disturbed> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080805011437.GM6119@disturbed> User-Agent: Mutt/1.5.18 (2008-05-17) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1217899590 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com 
X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1780 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17367 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 11:14:37AM +1000, Dave Chinner wrote: > On Mon, Aug 04, 2008 at 03:35:42AM +0200, Christoph Hellwig wrote: > > xfs_bmbt_killroot is a mostly generic implementation of moving from > > a real block based root to an inode based root. So move it to xfs_btree.c > > where it can use all the nice infrastructure there and make it pointer > > size agnostic > > > > The new name for it is xfs_btree_root_to_iroot which is not very > > nice but at least slightly more descriptive than the old name. > ..... > > + > > + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); > > + level = cur->bc_nlevels - 1; > > + ASSERT(level >= 1); > > probably should assert root in inode is set here. We have that check just before the call, so this might be a little overkill.. > > > + cblock = xfs_btree_get_block(cur, level - 1, &cbp); > > + numrecs = xfs_btree_get_numrecs(cblock); > > + > > + if (numrecs > cur->bc_ops->get_dmaxrecs(cur, level)) > > + goto out0; > ^^^^ > Stray whitespace. Already fixed before your mail after I ran checkpatch.pl over all patches - there were a few spread over the various patches. > > + // XXX(hch): this assert is bmap btree specific > > + ASSERT(cur->bc_ops->get_maxrecs(cur, level) == > ^ > > + XFS_BMAP_BROOT_MAXRECS(ifp->if_broot_bytes)); > > Stray whitespace. As to the assert - what is it really trying to > check? 
That the btree root space in the inode is large enough to > fit the max number of records? If so, does it really need to be > checked here (i.e. could the caller do it?) I don't think it makes much sense at all. My preference would be to simply kill it. From owner-xfs@oss.sgi.com Mon Aug 4 18:28:29 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:28:34 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751ST9v023720 for ; Mon, 4 Aug 2008 18:28:29 -0700 X-ASG-Debug-ID: 1217899783-3d6402320000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5222C12432CA for ; Mon, 4 Aug 2008 18:29:44 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id puDLVgivkGY3NzNw for ; Mon, 04 Aug 2008 18:29:44 -0700 (PDT) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1KQBMt-0006FN-L8; Tue, 05 Aug 2008 01:29:43 +0000 Date: Mon, 4 Aug 2008 21:29:43 -0400 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 21/26] implement generic xfs_btree_insert/insrec Subject: Re: [PATCH 21/26] implement generic xfs_btree_insert/insrec Message-ID: <20080805012943.GB22147@infradead.org> References: <20080804013535.GV8819@lst.de> <20080805010557.GL6119@disturbed> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080805010557.GL6119@disturbed> User-Agent: Mutt/1.5.18 (2008-05-17) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html 
X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1217899784 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1781 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17368 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 11:05:57AM +1000, Dave Chinner wrote: > I think that ->get_dmaxrecs is probably badly named. I called it > that originally because it matched the macro name it was wrapping. > Realistically it should be ->get_root_maxrecs.... Yeah. The name giving macro is already gone in my current tree. > > > + /* Make a key out of the record data to be inserted, and save it. */ > > + cur->bc_ops->init_key_from_rec(cur, &key, recp); > > + > > + /* If we're off the left edge, return failure. */ > > + ptr = cur->bc_ptrs[level]; > > + if (ptr == 0) { > > + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); > > + *stat = 0; > > + return 0; > > + } > > You can probably move the key initialisation till after the initial > 'in this block' checks. Done. > > + /* > > + * If the block is full, we can't insert the new entry until we > > + * make the block un-full. 
> > + */ > > + xfs_btree_set_ptr_null(cur, &nptr); > > + if (numrecs == cur->bc_ops->get_maxrecs(cur, level)) { > > + error = xfs_btree_make_block_unfull(cur, level, numrecs, > > + &optr, &ptr, &nptr, &ncur, &nrec, stat); > > + if (error || *stat == 0) > > + goto error0; > > + } > > + > > + /* The current block may have changed during the split. */ > > + block = xfs_btree_get_block(cur, level, &bp); > > + numrecs = xfs_btree_get_numrecs(block); > > The comment here should probably refer to the unfull call, not a > 'split'. i.e.: > > /* > * the current block may have changed if the block was > * previously full and we have just made space in it. > */ Updated. From owner-xfs@oss.sgi.com Mon Aug 4 18:35:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:35:37 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751ZZnn024372 for ; Mon, 4 Aug 2008 18:35:36 -0700 X-ASG-Debug-ID: 1217900209-3cf0029e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7005612434AA for ; Mon, 4 Aug 2008 18:36:49 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id Serp44RR43fZmMHz for ; Mon, 04 Aug 2008 18:36:49 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqcOACRLl0h5LAiFXWdsb2JhbACKaIZNHp5o X-IronPort-AV: E=Sophos;i="4.31,306,1215354600"; d="scan'208";a="174312851" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 11:06:26 +0930 
Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQBTN-0002A2-52; Tue, 05 Aug 2008 11:36:25 +1000 Date: Tue, 5 Aug 2008 11:36:25 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 23/26] implement generic xfs_btree_delete/delrec Subject: Re: [PATCH 23/26] implement generic xfs_btree_delete/delrec Message-ID: <20080805013625.GN6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013550.GX8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013550.GX8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217900210 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1781 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17369 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 03:35:50AM +0200, Christoph Hellwig wrote: > Make the btree delete code generic. Based on a patch from David Chinner > with lots of changes to follow the original btree implementations more > closely. 
While this loses some of the generic helper routines for > inserting/moving/removing records it also solves some of the one off > bugs in the original code and makes it easier to verify. ..... > +STATIC int > +xfs_btree_update_root( > + struct xfs_btree_cur *cur, > + struct xfs_buf *bp, > + int level, > + union xfs_btree_ptr *newroot) > +{ > + int error; > + > + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); > + XFS_BTREE_STATS_INC(cur, killroot); > + > +#ifdef DEBUG > + if (cur->bc_btnum == XFS_BTNUM_BNO || cur->bc_btnum == XFS_BTNUM_CNT) { > + struct xfs_buf *agbp = cur->bc_private.a.agbp; > + struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp); > + > + ASSERT(be32_to_cpu(agf->agf_roots[cur->bc_btnum]) == > + XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(bp))); > + } else if (cur->bc_btnum == XFS_BTNUM_INO) { > + struct xfs_buf *agbp = cur->bc_private.a.agbp; > + struct xfs_agi *agi = XFS_BUF_TO_AGI(agbp); > + xfs_fsblock_t fsbno; > + > + fsbno = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno, > + be32_to_cpu(agi->agi_root)); > + > + ASSERT(fsbno == XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp))); > + } > +#endif That's kind of messy - can it be pushed out to a separate debug only function? Also, why do we check the inobt against a long pointer, and the allocbt against a short pointer? both are short pointer btrees, so it doesn't really make sense to me to have different checks on them... Otherwise looks pretty good. Cheers, Dave. 
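A minimal sketch of the "separate debug only function" idea raised above, with the same short-pointer comparison applied to every btree type. All names here (struct cursor, struct ag_header, root_matches, check_root, update_root) are hypothetical stand-ins for illustration, not the XFS structures or the code that was eventually merged:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real XFS types. */
enum btnum { BTNUM_BNO, BTNUM_CNT, BTNUM_INO };

struct ag_header {
	uint32_t roots[3];		/* root block number per btree type */
};

struct cursor {
	enum btnum btnum;		/* which btree this cursor walks */
	struct ag_header *ag;		/* per-AG header holding the roots */
};

/* Pure predicate: does the recorded root match the buffer's block? */
int root_matches(const struct cursor *cur, uint32_t bp_agbno)
{
	return cur->ag->roots[cur->btnum] == bp_agbno;
}

#ifdef DEBUG
/* Debug builds: one check, identical for all btree types. */
static inline void check_root(const struct cursor *cur, uint32_t bp_agbno)
{
	assert(root_matches(cur, bp_agbno));
}
#else
/* Non-debug builds: the check compiles away to nothing. */
static inline void check_root(const struct cursor *cur, uint32_t bp_agbno)
{
	(void)cur;
	(void)bp_agbno;
}
#endif

/* The caller stays free of #ifdef clutter. */
void update_root(struct cursor *cur, uint32_t bp_agbno, uint32_t newroot)
{
	check_root(cur, bp_agbno);
	cur->ag->roots[cur->btnum] = newroot;
}
```

The point of the split is only that the #ifdef lives in one place and the update path reads linearly; whether the check is worth keeping at all is a separate question.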
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 18:36:42 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:36:47 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m751aeLk024659 for ; Mon, 4 Aug 2008 18:36:41 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA29805; Tue, 5 Aug 2008 11:37:45 +1000 Message-ID: <4897B05A.7040002@sgi.com> Date: Tue, 05 Aug 2008 11:43:54 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Nick Piggin CC: Nick Piggin , Linux Memory Management List , xfs@oss.sgi.com, xen-devel@lists.xensource.com, Linux Kernel Mailing List , Andrew Morton , dri-devel@lists.sourceforge.net Subject: Re: [rfc][patch 3/3] xfs: use new vmap API References: <20080728123438.GA13926@wotan.suse.de> <20080728123703.GC13926@wotan.suse.de> <4896A197.3090004@sgi.com> <200808042057.20607.nickpiggin@yahoo.com.au> In-Reply-To: <200808042057.20607.nickpiggin@yahoo.com.au> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17370 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Okay. When the time comes will you push the XFS changes to mainline or would you like us to? Nick Piggin wrote: > Thanks for taking a look. I'll send them over to -mm with patch 1, > then, for some testing. 
> > On Monday 04 August 2008 16:28, Lachlan McIlroy wrote: >> Looks good to me. >> >> Nick Piggin wrote: >>> Implement XFS's large buffer support with the new vmap APIs. See the vmap >>> rewrite patch for some numbers. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > From owner-xfs@oss.sgi.com Mon Aug 4 18:42:20 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:42:22 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751gIqv025277 for ; Mon, 4 Aug 2008 18:42:20 -0700 X-ASG-Debug-ID: 1217900611-705802e90000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1E3FD35BEDB for ; Mon, 4 Aug 2008 18:43:31 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id DaNr5ezNyXDmdCIG for ; Mon, 04 Aug 2008 18:43:31 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqcOACRLl0h5LAiFXWdsb2JhbACKaIZNHp5o X-IronPort-AV: E=Sophos;i="4.31,306,1215354600"; d="scan'208";a="174318447" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 11:13:08 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQBZq-0002Ie-51; Tue, 05 Aug 2008 11:43:06 +1000 Date: Tue, 5 Aug 2008 11:43:06 +1000 From: Dave Chinner To: Christoph Hellwig Cc: 
xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 24/26] implement generic xfs_btree_getrec Subject: Re: [PATCH 24/26] implement generic xfs_btree_getrec Message-ID: <20080805014306.GO6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013556.GY8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013556.GY8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217900613 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1781 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17371 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 03:35:56AM +0200, Christoph Hellwig wrote: > Not really much reason to make it generic given that it's so small, > but this is the last non-method in xfs_alloc_btree.c and xfs_ialloc_btree.c, > so it makes the whole btree implementation more structured. .... 
> +int /* error */ > +xfs_btree_getrec( > + struct xfs_btree_cur *cur, /* btree cursor */ > + union xfs_btree_rec **recp, /* output: btree record */ > + int *stat) /* output: success/failure */ > +{ > + struct xfs_btree_block *block; /* btree block */ > + int ptr; /* record number */ > +#ifdef DEBUG > + int error; /* error return value */ > +#endif > + > + ptr = cur->bc_ptrs[0]; > + block = XFS_BUF_TO_BLOCK(cur->bc_bufs[0]); Would it make more sense to use: block = xfs_btree_get_block(cur, 0, &bp); > + > +#ifdef DEBUG > + error = xfs_btree_check_block(cur, block, 0, cur->bc_bufs[0]); and then pass bp here? I'd rather use the helpers to do this than open code it like everything else does.... Otherwise it looks good. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 18:43:06 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:43:09 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751h6Xf025594 for ; Mon, 4 Aug 2008 18:43:06 -0700 X-ASG-Debug-ID: 1217900659-0b3d00ce0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id F0BA235BEEC for ; Mon, 4 Aug 2008 18:44:19 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id 0e2F5YC3H6nP2svr for ; Mon, 04 Aug 2008 18:44:19 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqcOACRLl0h5LAiFXWdsb2JhbACKaIZNHp5o X-IronPort-AV: E=Sophos;i="4.31,306,1215354600"; d="scan'208";a="174318983" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) 
([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 11:13:52 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQBaW-0002Jd-GA; Tue, 05 Aug 2008 11:43:48 +1000 Date: Tue, 5 Aug 2008 11:43:48 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 25/26] kill xfs_bmbt_log_block Subject: Re: [PATCH 25/26] kill xfs_bmbt_log_block Message-ID: <20080805014348.GP6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013602.GZ8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013602.GZ8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217900660 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1781 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17372 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 03:36:02AM +0200, Christoph Hellwig wrote: > Only one caller left, which can just use xfs_btree_log_block if we export it. looks fine. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 18:44:11 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:44:13 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751iBVl025974 for ; Mon, 4 Aug 2008 18:44:11 -0700 X-ASG-Debug-ID: 1217900724-3cc803310000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 67744124399A for ; Mon, 4 Aug 2008 18:45:24 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id d1Gx5UiBIn5v0nX5 for ; Mon, 04 Aug 2008 18:45:24 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m751jOIF006853 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Tue, 5 Aug 2008 03:45:25 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m751jOHo006850; Tue, 5 Aug 2008 03:45:24 +0200 Date: Tue, 5 Aug 2008 03:45:24 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 23/26] implement generic xfs_btree_delete/delrec Subject: Re: [PATCH 23/26] implement generic xfs_btree_delete/delrec Message-ID: <20080805014524.GA6465@lst.de> References: <20080804013550.GX8819@lst.de> <20080805013625.GN6119@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080805013625.GN6119@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217900726 X-Barracuda-Bayes: INNOCENT GLOBAL 
0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1781 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17373 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 11:36:25AM +1000, Dave Chinner wrote: > > +#ifdef DEBUG > > + if (cur->bc_btnum == XFS_BTNUM_BNO || cur->bc_btnum == XFS_BTNUM_CNT) { > > + struct xfs_buf *agbp = cur->bc_private.a.agbp; > > + struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp); > > + > > + ASSERT(be32_to_cpu(agf->agf_roots[cur->bc_btnum]) == > > + XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(bp))); > > + } else if (cur->bc_btnum == XFS_BTNUM_INO) { > > + struct xfs_buf *agbp = cur->bc_private.a.agbp; > > + struct xfs_agi *agi = XFS_BUF_TO_AGI(agbp); > > + xfs_fsblock_t fsbno; > > + > > + fsbno = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno, > > + be32_to_cpu(agi->agi_root)); > > + > > + ASSERT(fsbno == XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp))); > > + } > > +#endif > > That's kind of messy - can it be pushed out to a separate debug only > function? Also, why do we check the inobt against a long pointer, > and the allocbt against a short pointer? both are short pointer > btrees, so it doesn't really make sense to me to have different > checks on them... The old inobt code built a long pointer to pass it to xfs_free_extent, and this assert validates we get the same block. 
The right thing to do here is to just remove this crap, it was just there to make sure I didn't add a really bad thinko. > Otherwise looks pretty good. Unfortunately it's buggy, though. The cur->bc_bufs[level] = NULL; vs xfs_btree_setbuf(cur, level, NULL); does make an enormous difference for 512-byte block size filesystems. I'll have to find out what's going on in this area, and to make things worse I remember spotting a similar difference between the ialloc and alloc btrees somewhere earlier in the series. From owner-xfs@oss.sgi.com Mon Aug 4 18:45:58 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 18:45:59 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m751jvJT026383 for ; Mon, 4 Aug 2008 18:45:57 -0700 X-ASG-Debug-ID: 1217900831-7afa02230000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 28FEF1984A45 for ; Mon, 4 Aug 2008 18:47:11 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id aETbBvkt52A5KW5u for ; Mon, 04 Aug 2008 18:47:11 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m751lDIF007087 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Tue, 5 Aug 2008 03:47:13 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m751lDoa007085; Tue, 5 Aug 2008 03:47:13 +0200 Date: Tue, 5 Aug 2008 03:47:13 +0200 From: Christoph Hellwig To: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 24/26] implement generic xfs_btree_getrec Subject: Re: [PATCH 24/26] implement generic 
xfs_btree_getrec Message-ID: <20080805014713.GB6465@lst.de> References: <20080804013556.GY8819@lst.de> <20080805014306.GO6119@disturbed> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080805014306.GO6119@disturbed> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1217900832 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1780 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17374 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 11:43:06AM +1000, Dave Chinner wrote: > On Mon, Aug 04, 2008 at 03:35:56AM +0200, Christoph Hellwig wrote: > > Not really much reason to make it generic given that it's so small, > > but this is the last non-method in xfs_alloc_btree.c and xfs_ialloc_btree.c, > > so it makes the whole btree implementation more structured. > .... 
> > +int /* error */ > > +xfs_btree_getrec( > > + struct xfs_btree_cur *cur, /* btree cursor */ > > + union xfs_btree_rec **recp, /* output: btree record */ > > + int *stat) /* output: success/failure */ > > +{ > > + struct xfs_btree_block *block; /* btree block */ > > + int ptr; /* record number */ > > +#ifdef DEBUG > > + int error; /* error return value */ > > +#endif > > + > > + ptr = cur->bc_ptrs[0]; > > + block = XFS_BUF_TO_BLOCK(cur->bc_bufs[0]); > > Would it make more sense to use: > > block = xfs_btree_get_block(cur, 0, &bp); > > > + > > +#ifdef DEBUG > > + error = xfs_btree_check_block(cur, block, 0, cur->bc_bufs[0]); > > and then pass bp here? I'd rather use the helpers to do this than > open code it like everything else does.... Thanks, updated. From owner-xfs@oss.sgi.com Mon Aug 4 19:04:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 19:04:35 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m7524T7n027619 for ; Mon, 4 Aug 2008 19:04:32 -0700 X-ASG-Debug-ID: 1217901943-0b3d01cf0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.suse.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A18D935C33C for ; Mon, 4 Aug 2008 19:05:44 -0700 (PDT) Received: from mx1.suse.de (ns1.suse.de [195.135.220.2]) by cuda.sgi.com with ESMTP id IbKIDGtBoMHrglxv for ; Mon, 04 Aug 2008 19:05:44 -0700 (PDT) X-ASG-Whitelist: Client X-ASG-Whitelist: Barracuda Reputation Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 071CD415E2; Tue, 5 Aug 2008 04:05:40 +0200 (CEST) Date: Tue, 5 
Aug 2008 04:05:39 +0200 From: Nick Piggin To: Lachlan McIlroy Cc: Nick Piggin , Linux Memory Management List , xfs@oss.sgi.com, xen-devel@lists.xensource.com, Linux Kernel Mailing List , Andrew Morton , dri-devel@lists.sourceforge.net X-ASG-Orig-Subj: Re: [rfc][patch 3/3] xfs: use new vmap API Subject: Re: [rfc][patch 3/3] xfs: use new vmap API Message-ID: <20080805020539.GA15075@wotan.suse.de> References: <20080728123438.GA13926@wotan.suse.de> <20080728123703.GC13926@wotan.suse.de> <4896A197.3090004@sgi.com> <200808042057.20607.nickpiggin@yahoo.com.au> <4897B05A.7040002@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4897B05A.7040002@sgi.com> User-Agent: Mutt/1.5.9i X-Barracuda-Connect: ns1.suse.de[195.135.220.2] X-Barracuda-Start-Time: 1217901944 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17375 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: npiggin@suse.de Precedence: bulk X-list: xfs Assuming patch 1 gets merged upstream, I think Andrew would normally send off 2 and 3 to the XFS maintainers at that point (ie. when its prerequisites are upstream) for you to merge. On Tue, Aug 05, 2008 at 11:43:54AM +1000, Lachlan McIlroy wrote: > Okay. When the time comes will you push the XFS changes to mainline > or would you like us to? > > Nick Piggin wrote: > >Thanks for taking a look. I'll send them over to -mm with patch 1, > >then, for some testing. > > > >On Monday 04 August 2008 16:28, Lachlan McIlroy wrote: > >>Looks good to me. > >> > >>Nick Piggin wrote: > >>>Implement XFS's large buffer support with the new vmap APIs. See the vmap > >>>rewrite patch for some numbers. 
From owner-xfs@oss.sgi.com Mon Aug 4 19:06:52 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 19:06:54 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_64,J_CHICKENPOX_66,J_CHICKENPOX_74 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m7526qdK028074 for ; Mon, 4 Aug 2008 19:06:52 -0700 X-ASG-Debug-ID: 1217902085-0c5201c40000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id CBE6735C369 for ; Mon, 4 Aug 2008 19:08:05 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id hyoqI3gnJGSx1xQS for ; Mon, 04 Aug 2008 19:08:05 -0700 (PDT) Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 11:36:57 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQBwu-0002ol-UU; Tue, 05 Aug 2008 12:06:56 +1000 Date: Tue, 5 Aug 2008 12:06:51 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 26/26] add rec_len and key_len fields to struct xfs_btree_ops Subject: Re: [PATCH 26/26] add rec_len and key_len fields to struct xfs_btree_ops Message-ID: <20080805020651.GQ6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013608.GA8819@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080804013608.GA8819@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 
1217902086 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1781 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17376 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 03:36:08AM +0200, Christoph Hellwig wrote: > XXX: add detailed patch explanation here heh ;) ..... > Index: linux-2.6-xfs/fs/xfs/xfs_btree.c > =================================================================== > --- linux-2.6-xfs.orig/fs/xfs/xfs_btree.c 2008-08-04 01:48:35.000000000 +0200 > +++ linux-2.6-xfs/fs/xfs/xfs_btree.c 2008-08-04 01:49:13.000000000 +0200 > @@ -1037,6 +1037,82 @@ xfs_btree_read_buf_block( > return error; > } > > +static inline size_t xfs_btree_block_len(struct xfs_btree_cur *cur) > +{ > + return (cur->bc_flags & XFS_BTREE_LONG_PTRS) ? > + sizeof(struct xfs_btree_lblock) : > + sizeof(struct xfs_btree_sblock); > +} That's really the block header length, not the block length. Can you change the name to reflect that? > + > +static inline size_t xfs_btree_ptr_len(struct xfs_btree_cur *cur) > +{ > + return (cur->bc_flags & XFS_BTREE_LONG_PTRS) ? 
> +		sizeof(__be64) : sizeof(__be32);
> +}
> +
> +static size_t
> +xfs_btree_ptr_offset(
> +	struct xfs_btree_cur *cur,
> +	int index,
> +	int level)
> +{
> +	return xfs_btree_block_len(cur) +
> +		cur->bc_ops->get_maxrecs(cur, level) * cur->bc_ops->key_len +
> +		(index - 1) * xfs_btree_ptr_len(cur);
> +}

I'd suggest a comment here just reminding the reader about the
key/ptr block structure, i.e. key space is at the start of the
block, ptr space is after the key space. Also a comment here
reminding that btree indexes are 1-numbered, not 0-numbered, and
hence the need for 'index - 1' in the offset calculations.

>  STATIC void
> +xfs_btree_set_key(
> +	struct xfs_btree_cur *cur,
> +	union xfs_btree_key *key_addr,
> +	int index,
> +	union xfs_btree_key *newkey)
> +{
> +	char *kp;
> +
> +	kp = (char *)key_addr + (index * cur->bc_ops->key_len);
> +
> +	memcpy(kp, newkey, cur->bc_ops->key_len);
> +}

And then for these set functions, the index has already been
converted from 1-numbered to 0-numbered by the caller, hence the
direct use of 'index' without correction.

> @@ -1079,6 +1183,44 @@ xfs_btree_move_ptrs(
>  }
>
>  STATIC void
> +xfs_btree_move_keys(
> +	struct xfs_btree_cur *cur,
> +	union xfs_btree_key *key,
> +	int from,
> +	int to,
> +	int numkeys)
> +{
> +	char *base = (char *)key;
> +
> +	ASSERT(from >= 0);
> +	ASSERT(to >= 0);
> +	ASSERT(numkeys >= 0);
> +
> +	memmove(base + (to * cur->bc_ops->key_len),
> +		base + (from * cur->bc_ops->key_len),
> +		numkeys * cur->bc_ops->key_len);
> +}

Comment about from and to being logical offsets from the base, not
byte counts.

Hmmm - seeing how frequently the functions like xfs_btree_key_addr()
are called, should we prevent them from being inlined automatically
by the compiler (i.e. make them STATIC or explicitly noinline)?

Otherwise seems sane to me.

Cheers,

Dave.
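The key/ptr layout and 1-based indexing described in the review comments above can be illustrated with a small standalone sketch. This is not the kernel code: the toy_* names and the struct are hypothetical stand-ins, and the sizes used are arbitrary. It only assumes what the review states, namely that key space follows the block header, ptr space follows the full key space, and indexes start at 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the geometry the offset math needs. */
struct toy_btree_geom {
	size_t hdr_len;		/* block header length */
	size_t key_len;		/* size of one key */
	size_t ptr_len;		/* size of one pointer (short or long form) */
	int maxrecs;		/* max keys/ptrs a node block can hold */
};

/*
 * Node block layout assumed here:
 *   [ header | key 1 .. key maxrecs | ptr 1 .. ptr maxrecs ]
 * Indexes are 1-based, hence the 'index - 1' below.
 */
static size_t toy_key_offset(const struct toy_btree_geom *g, int index)
{
	assert(index >= 1 && index <= g->maxrecs);
	return g->hdr_len + (size_t)(index - 1) * g->key_len;
}

static size_t toy_ptr_offset(const struct toy_btree_geom *g, int index)
{
	assert(index >= 1 && index <= g->maxrecs);
	/* Skip the header and the whole key space, then index into ptrs. */
	return g->hdr_len + (size_t)g->maxrecs * g->key_len +
	       (size_t)(index - 1) * g->ptr_len;
}
```

With a 16-byte header, 8-byte keys, 4-byte pointers and maxrecs of 10, the first pointer sits at byte 96 (16 + 10 * 8) regardless of how many keys are actually in use, which is why the offset depends on get_maxrecs() rather than on the current record count.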
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 19:08:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 19:08:29 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65, RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m7528QjO028483 for ; Mon, 4 Aug 2008 19:08:27 -0700 X-ASG-Debug-ID: 1217902175-097e025e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id F1CE2198487A for ; Mon, 4 Aug 2008 19:09:40 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id r4anVa9FG52aojV6 for ; Mon, 04 Aug 2008 19:09:40 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqcOAO9Rl0h5LAiFXWdsb2JhbACKaIZNHp46 X-IronPort-AV: E=Sophos;i="4.31,307,1215354600"; d="scan'208";a="174334550" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 11:38:58 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQByq-0002rP-8t; Tue, 05 Aug 2008 12:08:56 +1000 Date: Tue, 5 Aug 2008 12:08:56 +1000 From: Dave Chinner To: Christoph Hellwig Cc: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 22/26] move xfs_bmbt_killroot to common code Subject: Re: [PATCH 22/26] move xfs_bmbt_killroot to common code Message-ID: <20080805020856.GR6119@disturbed> Mail-Followup-To: Christoph Hellwig , Christoph Hellwig , xfs@oss.sgi.com References: <20080804013542.GW8819@lst.de> <20080805011437.GM6119@disturbed> <20080805012630.GA22147@infradead.org> MIME-Version: 1.0 
Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080805012630.GA22147@infradead.org> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217902180 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1784 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17377 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Mon, Aug 04, 2008 at 09:26:30PM -0400, Christoph Hellwig wrote: > On Tue, Aug 05, 2008 at 11:14:37AM +1000, Dave Chinner wrote: > > On Mon, Aug 04, 2008 at 03:35:42AM +0200, Christoph Hellwig wrote: > > > xfs_bmbt_killroot is a mostly generic implementation of moving from > > > a real block based root to an inode based root. So move it to xfs_btree.c > > > where it can use all the nice infrastructure there and make it pointer > > > size agnostic > > > > > > The new name for it is xfs_btree_root_to_iroot which is not very > > > nice but at least slightly more descriptive than the old name. > > ..... > > > + > > > + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); > > > + level = cur->bc_nlevels - 1; > > > + ASSERT(level >= 1); > > > > probably should assert root in inode is set here. > > We have that check just before the call, so this might be a little > overkill.. Fair enough. 
> > > + // XXX(hch): this assert is bmap btree specific > > > + ASSERT(cur->bc_ops->get_maxrecs(cur, level) == > > ^ > > > + XFS_BMAP_BROOT_MAXRECS(ifp->if_broot_bytes)); > > > > Stray whitespace. As to the assert - what is it really trying to > > check? That the btree root space in the inode is large enough to > > fit the max number of records? If so, does it really need to be > > checked here (i.e. could the caller do it?) > > I don't think it makes much sense at all. My preference would > be to simply kill it. Agreed. Kill it - I don't think it adds any value. Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 19:19:50 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 19:19:52 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_65 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m752Jorh029537 for ; Mon, 4 Aug 2008 19:19:50 -0700 X-ASG-Debug-ID: 1217902864-0c1f028f0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 89C5635C1A4 for ; Mon, 4 Aug 2008 19:21:04 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id XyWtGGADmrJG4jkG for ; Mon, 04 Aug 2008 19:21:04 -0700 (PDT) Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 11:48:45 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQC8A-00034G-V3; Tue, 05 Aug 2008 12:18:34 +1000 Date: Tue, 5 Aug 2008 12:18:34 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com 
X-ASG-Orig-Subj: Re: [PATCH 23/26] implement generic xfs_btree_delete/delrec Subject: Re: [PATCH 23/26] implement generic xfs_btree_delete/delrec Message-ID: <20080805021834.GS6119@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080804013550.GX8819@lst.de> <20080805013625.GN6119@disturbed> <20080805014524.GA6465@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080805014524.GA6465@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217902865 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0012 1.0000 -2.0135 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.41 X-Barracuda-Spam-Status: No, SCORE=-1.41 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1785 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17378 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 03:45:24AM +0200, Christoph Hellwig wrote: > On Tue, Aug 05, 2008 at 11:36:25AM +1000, Dave Chinner wrote: > > > +#ifdef DEBUG > > > + if (cur->bc_btnum == XFS_BTNUM_BNO || cur->bc_btnum == XFS_BTNUM_CNT) { > > > + struct xfs_buf *agbp = cur->bc_private.a.agbp; > > > + struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp); > > > + > > > + ASSERT(be32_to_cpu(agf->agf_roots[cur->bc_btnum]) == > > > + XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(bp))); > > > + } else if (cur->bc_btnum == XFS_BTNUM_INO) { > > > + struct 
xfs_buf *agbp = cur->bc_private.a.agbp; > > > + struct xfs_agi *agi = XFS_BUF_TO_AGI(agbp); > > > + xfs_fsblock_t fsbno; > > > + > > > + fsbno = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno, > > > + be32_to_cpu(agi->agi_root)); > > > + > > > + ASSERT(fsbno == XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp))); > > > + } > > > +#endif > > > > That's kind of messy - can it be pushed out to a separate debug only > > function? Also, why do we check the inobt against a long pointer, > > and the allocbt against a short pointer? both are short pointer > > btrees, so it doesn't really make sense to me to have different > > checks on them... > > The old inobt code built a long pointer to pass it to xfs_free_extent, > and this assert validates we get the same block. The right thing to > do here is to just remove this crap, it was just there to make sure > I didn't add a really bad thinko. > > > Otherwise looks pretty good. > > Unfortunately it's buggy, though. The cur->bc_bufs[level] = NULL; vs > xfs_btree_setbuf(cur, level, NULL); does make an enormous difference > for 512byte block size filesystems. I was going to come back to that - I'd noticed that difference but hadn't looked deeply into the change that it would cause. It looks like: a) it releases the old buffer b) clears the readahead status in the cursor a) will increase CPU overhead, but if the buffer is dirty won't affect behaviour at all as it will be held till transaction commit. b) will have a significant impact on the performance of btree traversals. That will show up most in small block size filesystems... So it shouldn't be a corruption causing change, only a performance degrading change. Cheers, Dave. 
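The difference between the two forms discussed above, points (a) and (b), can be sketched with a toy model. Everything here is hypothetical: toy_buf, toy_cursor and the two helpers are made-up stand-ins, and a bare refcount decrement stands in for whatever buffer release the real helper performs. The sketch only shows why a plain NULL assignment and a setbuf-style call leave the cursor in different states.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAXLEVELS 8

/* Hypothetical buffer with a bare reference count. */
struct toy_buf {
	int refcount;
};

/* Hypothetical cursor: one buffer and one readahead flag set per level. */
struct toy_cursor {
	struct toy_buf *bufs[TOY_MAXLEVELS];
	unsigned char ra[TOY_MAXLEVELS];
};

/* Form 1: plain assignment. The old buffer and readahead state are untouched. */
static void toy_clear_buf(struct toy_cursor *cur, int level)
{
	assert(level >= 0 && level < TOY_MAXLEVELS);
	cur->bufs[level] = NULL;
}

/* Form 2: setbuf-style helper, modelling points (a) and (b) above. */
static void toy_setbuf(struct toy_cursor *cur, int level, struct toy_buf *bp)
{
	struct toy_buf *old;

	assert(level >= 0 && level < TOY_MAXLEVELS);
	old = cur->bufs[level];
	if (old)
		old->refcount--;	/* (a) release the old buffer */
	cur->bufs[level] = bp;
	cur->ra[level] = 0;		/* (b) forget readahead already issued */
}
```

In this model, losing the readahead flags means a later traversal at that level would issue readahead again, which is the performance effect described in (b).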
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Mon Aug 4 23:26:01 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 23:26:28 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m756Q0Fu015064 for ; Mon, 4 Aug 2008 23:26:00 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA05547; Tue, 5 Aug 2008 16:27:14 +1000 Message-ID: <4897F434.2010309@sgi.com> Date: Tue, 05 Aug 2008 16:33:24 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: xfs@oss.sgi.com, xfs-dev Subject: [PATCH] Don't release root inode until finished using it Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17379 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs During unmount we're releasing the last reference on the root inode before we've finished using it so move it after the XFS_SEND_UNMOUNT hook. 
--- a/fs/xfs/linux-2.6/xfs_super.c 2008-08-05 13:12:39.000000000 +1000 +++ b/fs/xfs/linux-2.6/xfs_super.c 2008-08-04 14:34:56.000000000 +1000 @@ -1132,8 +1132,6 @@ xfs_fs_put_super( error = xfs_unmount_flush(mp, 0); WARN_ON(error); - IRELE(rip); - /* * If we're forcing a shutdown, typically because of a media error, * we want to make sure we invalidate dirty pages that belong to @@ -1149,6 +1147,8 @@ xfs_fs_put_super( unmount_event_flags); } + IRELE(rip); + xfs_unmountfs(mp); xfs_icsb_destroy_counters(mp); xfs_close_devices(mp); From owner-xfs@oss.sgi.com Mon Aug 4 23:36:06 2008 Received: with ECARTIS (v1.0.0; list xfs); Mon, 04 Aug 2008 23:36:11 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m756a5Rl015846 for ; Mon, 4 Aug 2008 23:36:05 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA05799; Tue, 5 Aug 2008 16:37:19 +1000 Message-ID: <4897F691.6010806@sgi.com> Date: Tue, 05 Aug 2008 16:43:29 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: xfs@oss.sgi.com, xfs-dev Subject: [PATCH] Move vn_iowait() earlier in the reclaim path Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17380 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Currently by the time we get to vn_iowait() in xfs_reclaim() we have already gone 
through xfs_inactive()/xfs_free() and recycled the inode. Any I/O completions still running (file size updates and unwritten extent conversions) may be working on an inode that is no longer valid. --- a/fs/xfs/linux-2.6/xfs_super.c 2008-08-05 16:27:57.000000000 +1000 +++ b/fs/xfs/linux-2.6/xfs_super.c 2008-08-05 13:54:41.000000000 +1000 @@ -935,6 +935,7 @@ xfs_fs_clear_inode( XFS_STATS_INC(vn_reclaim); XFS_STATS_DEC(vn_active); + vn_iowait(ip); xfs_inactive(ip); xfs_iflags_clear(ip, XFS_IMODIFIED); if (xfs_reclaim(ip)) --- a/fs/xfs/xfs_vnodeops.c 2008-08-05 16:27:57.000000000 +1000 +++ b/fs/xfs/xfs_vnodeops.c 2008-08-05 13:54:43.000000000 +1000 @@ -2802,8 +2802,6 @@ xfs_reclaim( return 0; } - vn_iowait(ip); - ASSERT(XFS_FORCED_SHUTDOWN(ip->i_mount) || ip->i_delayed_blks == 0); /* From owner-xfs@oss.sgi.com Tue Aug 5 00:02:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:02:47 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m7572YPm019666 for ; Tue, 5 Aug 2008 00:02:35 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA06321; Tue, 5 Aug 2008 17:03:49 +1000 Message-ID: <4897FCC7.400@sgi.com> Date: Tue, 05 Aug 2008 17:09:59 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: xfs-oss , xfs-dev Subject: [PATCH] Use KM_NOFS for debug trace buffers Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17381 
X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Use KM_NOFS to prevent recursion back into the filesystem which can cause deadlocks. In the case of xfs_iread() we hold the lock on the inode cluster buffer while allocating memory for the trace buffers. If we recurse back into XFS to flush data that may require a transaction to allocate extents which needs log space. This can deadlock with the xfsaild thread which can't push the tail of the log because it is trying to get the inode cluster buffer lock. --- a/fs/xfs/linux-2.6/xfs_buf.c 2008-08-05 16:38:48.000000000 +1000 +++ b/fs/xfs/linux-2.6/xfs_buf.c 2008-08-05 16:38:43.000000000 +1000 @@ -1795,7 +1795,7 @@ int __init xfs_buf_init(void) { #ifdef XFS_BUF_TRACE - xfs_buf_trace_buf = ktrace_alloc(XFS_BUF_TRACE_SIZE, KM_SLEEP); + xfs_buf_trace_buf = ktrace_alloc(XFS_BUF_TRACE_SIZE, KM_NOFS); #endif xfs_buf_zone = kmem_zone_init_flags(sizeof(xfs_buf_t), "xfs_buf", --- a/fs/xfs/quota/xfs_dquot.c 2008-08-05 16:38:48.000000000 +1000 +++ b/fs/xfs/quota/xfs_dquot.c 2008-08-05 13:54:41.000000000 +1000 @@ -105,7 +105,7 @@ xfs_qm_dqinit( sv_init(&dqp->q_pinwait, SV_DEFAULT, "pdq"); #ifdef XFS_DQUOT_TRACE - dqp->q_trace = ktrace_alloc(DQUOT_TRACE_SIZE, KM_SLEEP); + dqp->q_trace = ktrace_alloc(DQUOT_TRACE_SIZE, KM_NOFS); xfs_dqtrace_entry(dqp, "DQINIT"); #endif } else { --- a/fs/xfs/xfs_buf_item.c 2008-08-05 16:38:48.000000000 +1000 +++ b/fs/xfs/xfs_buf_item.c 2008-08-05 13:54:42.000000000 +1000 @@ -737,7 +737,7 @@ xfs_buf_item_init( bip->bli_format.blf_len = (ushort)BTOBB(XFS_BUF_COUNT(bp)); bip->bli_format.blf_map_size = map_size; #ifdef XFS_BLI_TRACE - bip->bli_trace = ktrace_alloc(XFS_BLI_TRACE_SIZE, KM_SLEEP); + bip->bli_trace = ktrace_alloc(XFS_BLI_TRACE_SIZE, KM_NOFS); #endif #ifdef XFS_TRANS_DEBUG --- a/fs/xfs/xfs_filestream.c 2008-08-05 16:38:48.000000000 +1000 +++ b/fs/xfs/xfs_filestream.c 2008-08-05 
13:54:42.000000000 +1000 @@ -400,7 +400,7 @@ xfs_filestream_init(void) if (!item_zone) return -ENOMEM; #ifdef XFS_FILESTREAMS_TRACE - xfs_filestreams_trace_buf = ktrace_alloc(XFS_FSTRM_KTRACE_SIZE, KM_SLEEP); + xfs_filestreams_trace_buf = ktrace_alloc(XFS_FSTRM_KTRACE_SIZE, KM_NOFS); #endif return 0; } --- a/fs/xfs/xfs_inode.c 2008-08-05 16:38:48.000000000 +1000 +++ b/fs/xfs/xfs_inode.c 2008-08-05 13:54:42.000000000 +1000 @@ -835,22 +835,22 @@ xfs_iread( * Do this before xfs_iformat in case it adds entries. */ #ifdef XFS_INODE_TRACE - ip->i_trace = ktrace_alloc(INODE_TRACE_SIZE, KM_SLEEP); + ip->i_trace = ktrace_alloc(INODE_TRACE_SIZE, KM_NOFS); #endif #ifdef XFS_BMAP_TRACE - ip->i_xtrace = ktrace_alloc(XFS_BMAP_KTRACE_SIZE, KM_SLEEP); + ip->i_xtrace = ktrace_alloc(XFS_BMAP_KTRACE_SIZE, KM_NOFS); #endif #ifdef XFS_BMBT_TRACE - ip->i_btrace = ktrace_alloc(XFS_BMBT_KTRACE_SIZE, KM_SLEEP); + ip->i_btrace = ktrace_alloc(XFS_BMBT_KTRACE_SIZE, KM_NOFS); #endif #ifdef XFS_RW_TRACE - ip->i_rwtrace = ktrace_alloc(XFS_RW_KTRACE_SIZE, KM_SLEEP); + ip->i_rwtrace = ktrace_alloc(XFS_RW_KTRACE_SIZE, KM_NOFS); #endif #ifdef XFS_ILOCK_TRACE - ip->i_lock_trace = ktrace_alloc(XFS_ILOCK_KTRACE_SIZE, KM_SLEEP); + ip->i_lock_trace = ktrace_alloc(XFS_ILOCK_KTRACE_SIZE, KM_NOFS); #endif #ifdef XFS_DIR2_TRACE - ip->i_dir_trace = ktrace_alloc(XFS_DIR2_KTRACE_SIZE, KM_SLEEP); + ip->i_dir_trace = ktrace_alloc(XFS_DIR2_KTRACE_SIZE, KM_NOFS); #endif /* --- a/fs/xfs/xfs_log.c 2008-08-05 16:38:48.000000000 +1000 +++ b/fs/xfs/xfs_log.c 2008-08-05 13:54:42.000000000 +1000 @@ -160,7 +160,7 @@ void xlog_trace_iclog(xlog_in_core_t *iclog, uint state) { if (!iclog->ic_trace) - iclog->ic_trace = ktrace_alloc(256, KM_SLEEP); + iclog->ic_trace = ktrace_alloc(256, KM_NOFS); ktrace_enter(iclog->ic_trace, (void *)((unsigned long)state), (void *)((unsigned long)current_pid()), From owner-xfs@oss.sgi.com Tue Aug 5 00:12:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:12:34 -0700 (PDT) 
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m757CPia024085 for ; Tue, 5 Aug 2008 00:12:26 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA06637; Tue, 5 Aug 2008 17:13:37 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 3576B58C52A4; Tue, 5 Aug 2008 17:13:37 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - use get_unaligned_* helpers Message-Id: <20080805071337.3576B58C52A4@chook.melbourne.sgi.com> Date: Tue, 5 Aug 2008 17:13:37 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17382 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs use get_unaligned_* helpers Signed-off-by: Harvey Harrison Date: Tue Aug 5 17:12:48 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: harvey.harrison@gmail.com lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31813a fs/xfs/xfs_inode.c - 1.515 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.515&r2=text&tr2=1.514&f=h - use get_unaligned_* helpers From owner-xfs@oss.sgi.com Tue Aug 5 00:20:28 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:20:34 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: 
X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m757KR9m003760 for ; Tue, 5 Aug 2008 00:20:27 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA06914; Tue, 5 Aug 2008 17:21:40 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id EBE0D58C52A4; Tue, 5 Aug 2008 17:21:39 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - clean up stale references to semaphores Message-Id: <20080805072139.EBE0D58C52A4@chook.melbourne.sgi.com> Date: Tue, 5 Aug 2008 17:21:39 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17383 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs clean up stale references to semaphores A lot of code has been converted away from semaphores, but there are still comments that reference semaphore behaviour. The log code is the worst offender. Update the comments to reflect what the code really does now. 
Signed-off-by: Dave Chinner Date: Tue Aug 5 17:20:54 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: david@fromorbit.com lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31814a fs/xfs/xfs_log.c - 1.360 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.c.diff?r1=text&tr1=1.360&r2=text&tr2=1.359&f=h fs/xfs/xfs_log_priv.h - 1.132 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_priv.h.diff?r1=text&tr1=1.132&r2=text&tr2=1.131&f=h fs/xfs/linux-2.6/xfs_vnode.c - 1.162 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_vnode.c.diff?r1=text&tr1=1.162&r2=text&tr2=1.161&f=h - clean up stale references to semaphores From owner-xfs@oss.sgi.com Tue Aug 5 00:25:13 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:25:23 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m757PBlb007810 for ; Tue, 5 Aug 2008 00:25:12 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA07004; Tue, 5 Aug 2008 17:26:24 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 9FFAA58C52A4; Tue, 5 Aug 2008 17:26:24 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - replace the XFS buf iodone semaphore with a completion Message-Id: <20080805072624.9FFAA58C52A4@chook.melbourne.sgi.com> Date: Tue, 5 Aug 2008 17:26:24 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on 
oss.sgi.com X-Virus-Status: Clean X-archive-position: 17384 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs replace the XFS buf iodone semaphore with a completion The xfs_buf_t b_iodonesema is really just a semaphore that wants to be a completion. Change it to a completion and remove the last user of the sema_t from XFS. Signed-off-by: Dave Chinner Date: Tue Aug 5 17:25:44 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: david@fromorbit.com lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31815a fs/xfs/xfs_rw.c - 1.401 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rw.c.diff?r1=text&tr1=1.401&r2=text&tr2=1.400&f=h fs/xfs/xfs_buf_item.c - 1.167 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_buf_item.c.diff?r1=text&tr1=1.167&r2=text&tr2=1.166&f=h fs/xfs/linux-2.6/xfs_buf.h - 1.127 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.h.diff?r1=text&tr1=1.127&r2=text&tr2=1.126&f=h fs/xfs/linux-2.6/xfs_buf.c - 1.261 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.c.diff?r1=text&tr1=1.261&r2=text&tr2=1.260&f=h - replace the XFS buf iodone semaphore with a completion From owner-xfs@oss.sgi.com Tue Aug 5 00:34:26 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:34:43 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m757YOmc012985 for ; Tue, 5 Aug 2008 00:34:25 -0700 Received: from chook.melbourne.sgi.com 
(chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA07259; Tue, 5 Aug 2008 17:35:36 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id A587B58C52A4; Tue, 5 Aug 2008 17:35:36 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - extend completions to provide XFS object flush requirements Message-Id: <20080805073536.A587B58C52A4@chook.melbourne.sgi.com> Date: Tue, 5 Aug 2008 17:35:36 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17385 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Date: Tue Aug 5 17:34:21 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: david@fromorbit.com lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: 2.6.x-xfs-melb:linux:31816a include/linux/completion.h - 1.6 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.6-xfs/include/linux/completion.h.diff?r1=text&tr1=1.6&r2=text&tr2=1.5&f=h - extend completions to provide XFS object flush requirements XFS object flushing doesn't quite match existing completion semantics. It mixes exclusive access with completion. That is, we need to mark an object as being flushed before flushing it to disk, and then block any other attempt to flush it until the completion occurs. We do this by adding an extra count to the completion before we start using them. However, we still need to determine if there is a completion in progress, and allow non-blocking attempts for completions to decrement the count.
To do this we introduce: int try_wait_for_completion(struct completion *x) returns a failure status if done == 0, otherwise decrements done to zero and returns a "started" status. This is provided to allow counted completions to begin safely while holding object locks in inverted order. int completion_done(struct completion *x) returns 1 if there is no waiter, 0 if there is a waiter (i.e. a completion in progress). This replaces the use of semaphores for providing this exclusion and completion mechanism. Signed-off-by: Dave Chinner From owner-xfs@oss.sgi.com Tue Aug 5 00:36:08 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:36:18 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m757a8RH013247 for ; Tue, 5 Aug 2008 00:36:08 -0700 X-ASG-Debug-ID: 1217921841-75e702c20000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A607E35CE61 for ; Tue, 5 Aug 2008 00:37:21 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id 7i03jcvdICXypBot for ; Tue, 05 Aug 2008 00:37:21 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgUMAN+dl0h5LAiFXWdsb2JhbACKc4Y7Hpxn X-IronPort-AV: E=Sophos;i="4.31,309,1215354600"; d="scan'208";a="174560785" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 17:07:18 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQH6V-0001Up-F2; Tue, 05 Aug 2008 17:37:11 +1000 Date: Tue, 5 Aug 2008 
17:37:11 +1000 From: Dave Chinner To: Lachlan McIlroy Cc: xfs@oss.sgi.com, xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Message-ID: <20080805073711.GA21635@disturbed> Mail-Followup-To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev References: <4897F691.6010806@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4897F691.6010806@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217921842 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1803 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17386 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: > Currently by the time we get to vn_iowait() in xfs_reclaim() we have already > gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O xfs_free()? What's that? > completions still running (file size updates and unwritten extent conversions) > may be working on an inode that is no longer valid. The linux inode does not get freed until after ->clear_inode completes, hence it is perfectly valid to reference it anywhere in the ->clear_inode path. 
My bet is that you are seeing I/O completion mark an inode dirty that is being freed. ie. Calling mark_inode_dirty_sync() in the I/O completion blindly assumes that the linux inode is still valid, when it may be in the 'being freed' path. e.g. we can put it back on the superblock dirty list just before it gets freed for real... I came across this about a week ago when tracking down a QA failure with a combined linux/XFS inode patch. The fix is to make I/O completion call xfs_mark_inode_dirty_sync() so we check that this linux inode is not in the process of being freed before we try to mark it dirty. Cheers, Dave. -- Dave Chinner david@fromorbit.com XFS: Never call mark_inode_dirty_sync() directly Once the Linux inode and the XFS inode are combined, we cannot rely on just checking if the linux inode exists as a method of determining if it is valid or not. Hence we should always call xfs_mark_inode_dirty_sync() as it does the correct checks to determine if the linux inode is in a valid state or not before marking it dirty.
---
 fs/xfs/linux-2.6/xfs_aops.c  |    2 +-
 fs/xfs/linux-2.6/xfs_iops.c  |    4 ++--
 fs/xfs/linux-2.6/xfs_super.c |    2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index 0b211cb..45c53a7 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -192,7 +192,7 @@ xfs_setfilesize(
 		ip->i_d.di_size = isize;
 		ip->i_update_core = 1;
 		ip->i_update_size = 1;
-		mark_inode_dirty_sync(ioend->io_inode);
+		xfs_mark_inode_dirty_sync(ip);
 	}
 
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index 1240b73..379f19b 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -133,7 +133,7 @@ xfs_ichgtime(
 	SYNCHRONIZE();
 	ip->i_update_core = 1;
 	if (!(inode->i_state & I_NEW))
-		mark_inode_dirty_sync(inode);
+		xfs_mark_inode_dirty_sync(ip);
 }
 
 /*
@@ -178,7 +178,7 @@ xfs_ichgtime_fast(
 	SYNCHRONIZE();
 	ip->i_update_core = 1;
 	if (!(inode->i_state & I_NEW))
-		mark_inode_dirty_sync(inode);
+		xfs_mark_inode_dirty_sync(ip);
 }
 
 /*
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index fc00499..e9cb0e9 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -963,7 +963,7 @@ xfs_fs_write_inode(
 	 * it dirty again so we'll try again later.
 	 */
 	if (error)
-		mark_inode_dirty_sync(inode);
+		xfs_mark_inode_dirty_sync(XFS_I(inode));
 	return -error;
 }
 

From owner-xfs@oss.sgi.com Tue Aug 5 00:37:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:38:10 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m757bGR6013656 for ; Tue, 5 Aug 2008 00:37:16 -0700 X-ASG-Debug-ID: 1217921910-5564030d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 26DBF1246C5E for ; Tue, 5 Aug 2008 00:38:30 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id ttc85ZGPhUMG9Qeu for ; Tue, 05 Aug 2008 00:38:30 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgUMAN+dl0h5LAiFXWdsb2JhbACKc4Y7Hpxn X-IronPort-AV: E=Sophos;i="4.31,309,1215354600"; d="scan'208";a="174561504" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 17:08:27 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQH7j-0001WX-4f; Tue, 05 Aug 2008 17:38:27 +1000 Date: Tue, 5 Aug 2008 17:38:27 +1000 From: Dave Chinner To: Lachlan McIlroy Cc: xfs@oss.sgi.com, xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Don't release root inode until finished using it Subject: Re: [PATCH] Don't release root inode until finished using it Message-ID: <20080805073827.GB21635@disturbed> Mail-Followup-To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev References: <4897F434.2010309@sgi.com> MIME-Version: 1.0 Content-Type: text/plain;
charset=us-ascii Content-Disposition: inline In-Reply-To: <4897F434.2010309@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217921911 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1805 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17387 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 04:33:24PM +1000, Lachlan McIlroy wrote: > During unmount we're releasing the last reference on the root inode > before we've finished using it so move it after the XFS_SEND_UNMOUNT > hook. Looks ok. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Tue Aug 5 00:38:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:38:44 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m757cO22013889 for ; Tue, 5 Aug 2008 00:38:24 -0700 X-ASG-Debug-ID: 1217921978-591902900000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 563B11246DB8 for ; Tue, 5 Aug 2008 00:39:38 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id ZbgDJJKgTugaTSgH for ; Tue, 05 Aug 2008 00:39:38 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgUMAN+dl0h5LAiFXWdsb2JhbACKc4Y7Hpxn X-IronPort-AV: E=Sophos;i="4.31,309,1215354600"; d="scan'208";a="174562178" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 17:09:37 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQH8r-0001Y4-AH; Tue, 05 Aug 2008 17:39:37 +1000 Date: Tue, 5 Aug 2008 17:39:37 +1000 From: Dave Chinner To: Lachlan McIlroy Cc: xfs-oss , xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Use KM_NOFS for debug trace buffers Subject: Re: [PATCH] Use KM_NOFS for debug trace buffers Message-ID: <20080805073937.GC21635@disturbed> Mail-Followup-To: Lachlan McIlroy , xfs-oss , xfs-dev References: <4897FCC7.400@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4897FCC7.400@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) 
X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217921979 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1805 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17388 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 05:09:59PM +1000, Lachlan McIlroy wrote: > Use KM_NOFS to prevent recursion back into the filesystem which can > cause deadlocks. > > In the case of xfs_iread() we hold the lock on the inode cluster buffer > while allocating memory for the trace buffers. If we recurse back into > XFS to flush data that may require a transaction to allocate extents > which needs log space. This can deadlock with the xfsaild thread which > can't push the tail of the log because it is trying to get the inode > cluster buffer lock. Looks OK. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Tue Aug 5 00:40:15 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:40:26 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m757eDYt014531 for ; Tue, 5 Aug 2008 00:40:14 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA07454; Tue, 5 Aug 2008 17:41:26 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 7B66358C52A4; Tue, 5 Aug 2008 17:41:26 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - replace inode flush semaphore with a completion Message-Id: <20080805074126.7B66358C52A4@chook.melbourne.sgi.com> Date: Tue, 5 Aug 2008 17:41:26 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17389 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs replace inode flush semaphore with a completion Use the new completion flush code to implement the inode flush lock. Removes one of the final users of semaphores in the XFS code base. 
Signed-off-by: Dave Chinner Date: Tue Aug 5 17:40:34 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: david@fromorbit.com lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31817a fs/xfs/xfs_inode_item.c - 1.139 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode_item.c.diff?r1=text&tr1=1.139&r2=text&tr2=1.138&f=h fs/xfs/xfs_iget.c - 1.244 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_iget.c.diff?r1=text&tr1=1.244&r2=text&tr2=1.243&f=h fs/xfs/xfs_inode.c - 1.516 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.516&r2=text&tr2=1.515&f=h fs/xfs/xfs_inode.h - 1.255 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.h.diff?r1=text&tr1=1.255&r2=text&tr2=1.254&f=h - replace inode flush semaphore with a completion From owner-xfs@oss.sgi.com Tue Aug 5 00:43:16 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:43:22 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m757hGdj014975 for ; Tue, 5 Aug 2008 00:43:16 -0700 X-ASG-Debug-ID: 1217922269-19da038e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0F19E1985425 for ; Tue, 5 Aug 2008 00:44:30 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id wTMs54y6no6aJyvx for ; Tue, 05 Aug 2008 00:44:30 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: 
AgUMAGOhl0h5LAiFXWdsb2JhbACKc4Y7HpxV X-IronPort-AV: E=Sophos;i="4.31,309,1215354600"; d="scan'208";a="174565112" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 17:14:27 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQHDV-0001ee-KG; Tue, 05 Aug 2008 17:44:25 +1000 Date: Tue, 5 Aug 2008 17:44:25 +1000 From: Dave Chinner To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Message-ID: <20080805074425.GD21635@disturbed> Mail-Followup-To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080805073711.GA21635@disturbed> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217922271 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1804 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17390 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 05:37:11PM +1000, Dave Chinner wrote: > On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: > > 
Currently by the time we get to vn_iowait() in xfs_reclaim() we have already > > gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O > > xfs_free()? What's that? > > > completions still running (file size updates and unwritten extent conversions) > > may be working on an inode that is no longer valid. > > The linux inode does not get freed until after ->clear_inode > completes, hence it is perfectly valid to reference it anywhere > in the ->clear_inode path. > > My bet is that you are seeing I/O completion mark an inode dirty > that is being freed. ie. Calling mark_inode_dirty_sync() in the I/O > completion blindly assumes that the linux inode is still valid, > when it may be in the 'being freed' path. e.g. we can put it back on the > superblock dirty list just before it gets freed for real... > > I came across this about a week ago when tracking down a QA failure > with a combined linux/XFS inode patch. The fix is to make I/O > completion call xfs_mark_inode_dirty_sync() so we check that this > linux inode not in the process of being freed before we try to > mark it dirty.

Oh, I should point out that xfs_mark_inode_dirty_sync() looked like this prior to the patch I posted:

/*
 * If the linux inode is valid, mark it dirty.
 * Used when commiting a dirty inode into a transaction so that
 * the inode will get written back by the linux code
 */
void
xfs_mark_inode_dirty_sync(
	xfs_inode_t	*ip)
{
	struct inode	*inode = VFS_I(ip);

	if (!(inode->i_state & (I_WILL_FREE|I_FREEING|I_CLEAR)))
		mark_inode_dirty_sync(inode);
}

Cheers,

Dave.
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Tue Aug 5 00:45:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 00:45:24 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m757jBaV015389 for ; Tue, 5 Aug 2008 00:45:11 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA07675; Tue, 5 Aug 2008 17:46:23 +1000 Message-ID: <489806C2.7020200@sgi.com> Date: Tue, 05 Aug 2008 17:52:34 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> In-Reply-To: <20080805073711.GA21635@disturbed> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17391 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Dave Chinner wrote: > On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: >> Currently by the time we get to vn_iowait() in xfs_reclaim() we have already >> gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O > > xfs_free()? What's that? Sorry that should have been xfs_ifree() (we set the inode's mode to zero in there). 
> >> completions still running (file size updates and unwritten extent conversions) >> may be working on an inode that is no longer valid. > > The linux inode does not get freed until after ->clear_inode > completes, hence it is perfectly valid to reference it anywhere > in the ->clear_inode path. The problem I see is an assert in xfs_setfilesize() fail: ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFREG); The mode of the XFS inode is zero at this time. > > My bet is that you are seeing I/O completion mark an inode dirty > that is being freed. ie. Calling mark_inode_dirty_sync() in the I/O > completion blindly assumes that the linux inode is still valid, > when it may be in the 'being freed' path. e.g. we can put it back on the > superblock dirty list just before it gets freed for real... > > I came across this about a week ago when tracking down a QA failure > with a combined linux/XFS inode patch. The fix is to make I/O > completion call xfs_mark_inode_dirty_sync() so we check that this > linux inode not in the process of being freed before we try to > mark it dirty. > > Cheers, > > Dave. 
From owner-xfs@oss.sgi.com Tue Aug 5 01:41:10 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 01:41:13 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m758fAfb018441 for ; Tue, 5 Aug 2008 01:41:10 -0700 X-ASG-Debug-ID: 1217925744-30cf01010000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7150AB1600B for ; Tue, 5 Aug 2008 01:42:24 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id lt9sZDubs3CwMxOg for ; Tue, 05 Aug 2008 01:42:24 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgUMAHivl0h5LAiFXWdsb2JhbACKc4Y7Hpxl X-IronPort-AV: E=Sophos;i="4.31,309,1215354600"; d="scan'208";a="174598368" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 05 Aug 2008 18:12:22 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQI7Z-0002tQ-1I; Tue, 05 Aug 2008 18:42:21 +1000 Date: Tue, 5 Aug 2008 18:42:20 +1000 From: Dave Chinner To: Lachlan McIlroy Cc: xfs@oss.sgi.com, xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Message-ID: <20080805084220.GF21635@disturbed> Mail-Followup-To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> <489806C2.7020200@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<489806C2.7020200@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217925745 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1809 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17392 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Tue, Aug 05, 2008 at 05:52:34PM +1000, Lachlan McIlroy wrote: > Dave Chinner wrote: >> On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: >>> Currently by the time we get to vn_iowait() in xfs_reclaim() we have already >>> gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O >> >> xfs_free()? What's that? > Sorry that should have been xfs_ifree() (we set the inode's mode to > zero in there). > >> >>> completions still running (file size updates and unwritten extent conversions) >>> may be working on an inode that is no longer valid. >> >> The linux inode does not get freed until after ->clear_inode >> completes, hence it is perfectly valid to reference it anywhere >> in the ->clear_inode path. > The problem I see is an assert in xfs_setfilesize() fail: > > ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFREG); > > The mode of the XFS inode is zero at this time. Ok, so the question has to be why is there I/O still in progress after the truncate is supposed to have already occurred and the vn_iowait() in xfs_itruncate_start() been executed. 
Something doesn't add up here - you can't be doing I/O on a file with no extents or delalloc blocks, hence that means we should be passing through the truncate path in xfs_inactive() before we call xfs_ifree() and therefore doing the vn_iowait().. Hmmmm - the vn_iowait() is conditional based on: /* wait for the completion of any pending DIOs */ if (new_size < ip->i_size) vn_iowait(ip); We are truncating to zero (new_size == 0), so the only case where this would not wait is if ip->i_size == 0. Still - I can't see how we'd be doing I/O on an inode with a zero i_size. I suspect ensuring we call vn_iowait() if newsize == 0 as well would fix the problem. If not, there's something much more subtle going on here that we should understand.... Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Tue Aug 5 04:02:56 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 04:03:01 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_00,J_CHICKENPOX_12, J_CHICKENPOX_72 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m75B2tI9025355 for ; Tue, 5 Aug 2008 04:02:56 -0700 X-ASG-Debug-ID: 1217934250-1b9f03070000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id E4BF135D91F for ; Tue, 5 Aug 2008 04:04:10 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com with ESMTP id JLeWBBSp8GOQ73EC for ; Tue, 05 Aug 2008 04:04:10 -0700 (PDT) X-ASG-Whitelist: Barracuda Reputation Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m75B422I027040; Tue, 5 Aug 2008 07:04:02 -0400 Received: from file.rdu.redhat.com 
(file.rdu.redhat.com [10.11.255.147]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m75B42HR029116; Tue, 5 Aug 2008 07:04:02 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [10.10.36.72]) by file.rdu.redhat.com (8.13.1/8.13.1) with ESMTP id m75B416e009204; Tue, 5 Aug 2008 07:04:01 -0400 Received: from nb (vpn-4-109.str.redhat.com [10.32.4.109]) by devserv.devel.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id m75B3w4P012731; Tue, 5 Aug 2008 07:03:58 -0400 Date: Tue, 5 Aug 2008 13:03:57 +0200 From: Karel Zak To: Christoph Hellwig Cc: Jasper Bryant-Greene , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Message-ID: <20080805110357.GL21873@nb.net.home> References: <1217553598.3860.16.camel@luna.unix.geek.nz> <20080801073033.GF6201@disturbed> <20080801193133.GA838@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20080801193133.GA838@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Barracuda-Connect: mx1.redhat.com[66.187.233.31] X-Barracuda-Start-Time: 1217934250 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17393 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: kzak@redhat.com Precedence: bulk X-list: xfs On Fri, Aug 01, 2008 at 09:31:33PM +0200, Christoph Hellwig wrote: > It's most likely a fallout, but I wonder why. To get this behaviour > mount would have to add all the options it finds in /proc/self/mounts > to the command line. mount(8) does not read and use /proc/self/mounts at all.
Karel Man mount: remount Attempt to remount an already-mounted file system. This is commonly used to change the mount flags for a file system, especially to make a readonly file system writeable. It does not change device or mount point. The remount functionality follows the standard way how the mount command works with options from fstab. It means the mount command doesn’t read fstab (or mtab) only when a device and dir are fully specified. mount -o remount,rw /dev/foo /dir After this call all old mount options are replaced and arbitrary stuff from fstab is ignored, except the loop= option which is internally gener- ated and maintained by the mount command. mount -o remount,rw /dir After this call mount reads fstab (or mtab) and merges these options with options from command line ( -o ). -- Karel Zak From owner-xfs@oss.sgi.com Tue Aug 5 05:02:53 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 05:02:59 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_24, J_CHICKENPOX_53 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m75C2qm3001147 for ; Tue, 5 Aug 2008 05:02:53 -0700 X-ASG-Debug-ID: 1217937846-5eb702f60000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from tyo201.gate.nec.co.jp (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3806835DC29 for ; Tue, 5 Aug 2008 05:04:06 -0700 (PDT) Received: from tyo201.gate.nec.co.jp (TYO201.gate.nec.co.jp [202.32.8.193]) by cuda.sgi.com with ESMTP id 2RLgjAjj2GSebsZ5 for ; Tue, 05 Aug 2008 05:04:06 -0700 (PDT) Received: from mailgate3.nec.co.jp (mailgate54A.nec.co.jp [10.7.69.193]) by tyo201.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id m75C3Ml1000651; Tue, 5 Aug 2008 21:03:22 +0900 (JST) Received: (from root@localhost) by 
mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id m75C3Me26984; Tue, 5 Aug 2008 21:03:22 +0900 (JST) Received: from yonosuke.jp.nec.com (yonosuke.jp.nec.com [10.26.220.15]) by mailsv.nec.co.jp (8.13.8/8.13.4) with ESMTP id m75C3MNd024232; Tue, 5 Aug 2008 21:03:22 +0900 (JST) Received: from TNESB07336 ([10.64.168.65] [10.64.168.65]) by mail.jp.nec.com with ESMTP; Tue, 5 Aug 2008 21:03:21 +0900 To: Takashi Sato Cc: "linux-fsdevel@vger.kernel.org" , "dm-devel@redhat.com" , Andrew Morton , "viro@ZenIV.linux.org.uk" , "linux-ext4@vger.kernel.org" , "xfs@oss.sgi.com" , Christoph Hellwig , "axboe@kernel.dk" , "mtk.manpages@googlemail.com" , "linux-kernel@vger.kernel.org" X-ASG-Orig-Subj: Re: [PATCH 0/3] freeze feature ver 1.9 Subject: Re: [PATCH 0/3] freeze feature ver 1.9 In-reply-to: <20080805113411t-sato@mail.jp.nec.com> References: <20080805113411t-sato@mail.jp.nec.com> Message-Id: <20080805210321t-sato@mail.jp.nec.com> Mime-Version: 1.0 X-Mailer: WeMail32[2.51] ID:1K0086 From: Takashi Sato Date: Tue, 5 Aug 2008 21:03:21 +0900 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Barracuda-Connect: TYO201.gate.nec.co.jp[202.32.8.193] X-Barracuda-Start-Time: 1217937847 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1822 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17394 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: t-sato@yk.jp.nec.com Precedence: bulk X-list: xfs Hi, I sent the latest patch-set of the 
freeze feature (ver 1.9) in the following mail.
The points are:
- When multiple freeze requests arrive simultaneously, only the last
  unfreeze process should actually unfreeze the frozen filesystem.
  So I added a reference counter to the freeze feature.
- I left the timeout feature in my patch-set because we need it for
  the case where someone dirties so much data that the freeze process
  is swapped out.

Any comments?

>When multiple freeze requests arrive simultaneously, only the last
>unfreeze process should actually unfreeze the frozen filesystem
>(as Dave Chinner, Eric Sandeen and Alasdair G Kergon commented).
>So I've added a reference counter to the freeze feature.
>It counts up in freeze_bdev() and counts down in thaw_bdev().
>When it becomes "0", thaw_bdev() will actually unfreeze.
>
>The following regular cases have worked correctly.
>
>A)
>1. dmsetup suspend
>2. FIFREEZE
>3. FITHAW
>4. dmsetup resume
>B)
>1. FIFREEZE
>2. dmsetup suspend
>3. dmsetup resume
>4. FITHAW
>
>But in the following case, the last FITHAW has been frozen
>for writing the super block because the device-mapper layer is still frozen.
>It's an irregular case (app's bug) and the next "dmsetup resume"
>can solve it. So I don't think it is a problem.
>C)
>1. dmsetup suspend
>2. FIFREEZE
>3. FITHAW
>4. FITHAW<- The thaw process was frozen.
>
>In my previous mail, I mentioned that I was considering removing the timeout
>feature. But I leave it in my patch-set because we need it for the case
>where someone dirties so much data that the freeze process is swapped out
>(as some people said).
>
>Currently, ext3 in mainline Linux doesn't have the freeze feature which
>suspends write requests. So, we cannot take a backup which keeps
>the filesystem's consistency with the storage device's features
>(snapshot and replication) while it is mounted.
>In many cases, a commercial filesystem (e.g. VxFS) has
>the freeze feature and it would be used to get the consistent backup.
>If Linux's standard filesystem ext3 has the freeze feature, we can do it
>without a commercial filesystem.
>
>So I have implemented the ioctls of the freeze feature.
>I think we can take the consistent backup with the following steps.
>1. Freeze the filesystem with the freeze ioctl.
>2. Separate the replication volume or create the snapshot
>   with the storage device's feature.
>3. Unfreeze the filesystem with the unfreeze ioctl.
>4. Take the backup from the separated replication volume
>   or the snapshot.
>
>[PATCH 1/3] Implement generic freeze feature
>  I have modified it to set a suitable error number (EOPNOTSUPP)
>  in case the filesystem doesn't support the freeze feature.
>
>  The ioctls for the generic freeze feature are below.
>  o Freeze the filesystem
>    int ioctl(int fd, int FIFREEZE, arg)
>      fd: The file descriptor of the mountpoint
>      FIFREEZE: request code for the freeze
>      arg: Ignored
>      Return value: 0 if the operation succeeds. Otherwise, -1
>
>  o Unfreeze the filesystem
>    int ioctl(int fd, int FITHAW, arg)
>      fd: The file descriptor of the mountpoint
>      FITHAW: request code for unfreeze
>      arg: Ignored
>      Return value: 0 if the operation succeeds. Otherwise, -1
>
>[PATCH 2/3] Remove XFS specific ioctl interfaces for freeze feature
>  It removes XFS specific ioctl interfaces and request codes
>  for freeze feature.
>  This patch has been supplied by David Chinner.
>
>[PATCH 3/3] Add timeout feature
>  The timeout feature is added to the freeze ioctl. And a new ioctl
>  to reset the timeout period is added.
>  o Freeze the filesystem
>    int ioctl(int fd, int FIFREEZE, long *timeout_sec)
>      fd: The file descriptor of the mountpoint
>      FIFREEZE: request code for the freeze
>      timeout_sec: the timeout period in seconds
>                   If it's 0 or 1, the timeout isn't set.
>                   This special case of "1" is implemented to keep
>                   the compatibility with XFS applications.
>      Return value: 0 if the operation succeeds.
Otherwise, -1 > > o Reset the timeout period > This is useful for the application to set the timeout_sec more accurately. > For example, the freezer resets the timeout_sec to 10 seconds every 5 > seconds. In this approach, even if the freezer causes a deadlock > by accessing the frozen filesystem, it will be solved by the timeout > in 10 seconds and the freezer can recognize that at the next reset > of timeout_sec. > int ioctl(int fd, int FIFREEZE_RESET_TIMEOUT, long *timeout_sec) > fd:file descriptor of mountpoint > FIFREEZE_RESET_TIMEOUT: request code for reset of timeout period > timeout_sec: new timeout period in seconds > Return value: 0 if the operation succeeds. Otherwise, -1 > Error number: If the filesystem has already been unfrozen, > errno is set to EINVAL. > >Any comments are very welcome. > >Cheers, Takashi From owner-xfs@oss.sgi.com Tue Aug 5 16:39:07 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 16:39:13 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_12, J_CHICKENPOX_72,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m75Nd7BF025032 for ; Tue, 5 Aug 2008 16:39:07 -0700 X-ASG-Debug-ID: 1217979619-26e1039a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8EE87198FBC4 for ; Tue, 5 Aug 2008 16:40:20 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id PxQ9WBXbMtzV6jda for ; Tue, 05 Aug 2008 16:40:20 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Aj0FAN9+mEh5LAiFXWdsb2JhbACKcoY6Hp06 X-IronPort-AV: E=Sophos;i="4.31,312,1215354600"; 
d="scan'208";a="175042409" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 06 Aug 2008 09:09:58 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQW8D-0001uJ-2f; Wed, 06 Aug 2008 09:39:57 +1000 Date: Wed, 6 Aug 2008 09:39:57 +1000 From: Dave Chinner To: Karel Zak Cc: Christoph Hellwig , Jasper Bryant-Greene , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Message-ID: <20080805233956.GI21635@disturbed> Mail-Followup-To: Karel Zak , Christoph Hellwig , Jasper Bryant-Greene , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org References: <1217553598.3860.16.camel@luna.unix.geek.nz> <20080801073033.GF6201@disturbed> <20080801193133.GA838@lst.de> <20080805110357.GL21873@nb.net.home> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20080805110357.GL21873@nb.net.home> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217979621 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0017 1.0000 -2.0096 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.01 X-Barracuda-Spam-Status: No, SCORE=-2.01 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1865 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17395 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com 
X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs

On Tue, Aug 05, 2008 at 01:03:57PM +0200, Karel Zak wrote:
> On Fri, Aug 01, 2008 at 09:31:33PM +0200, Christoph Hellwig wrote:
> > I'ts most likely a fallout, but I wonder why. To get this behaviour
> > moutn would have to add all the options it finds in /proc/self/mounts
> > to the command line.
>
> mount(8) does not read and use /proc/self/mounts at all.
>
> Karel
>
>
> Man mount:
>
> remount
>
> Attempt to remount an already-mounted file system. This is commonly used
> to change the mount flags for a file system, especially to make a readonly
> file system writeable. It does not change device or mount point.
>
> The remount functionality follows the standard way how the mount command
> works with options from fstab. It means the mount command doesn’t read
> fstab (or mtab) only when a device and dir are fully specified.
>
> mount -o remount,rw /dev/foo /dir
>
> After this call all old mount options are replaced and arbitrary stuff
> from fstab is ignored, except the loop= option which is internally gener-
> ated and maintained by the mount command.
>
> mount -o remount,rw /dir
>
> After this call mount reads fstab (or mtab) and merges these options with
> options from command line ( -o ).

So, given the command at issue was:

luna ~ # mount -o remount,rw /usr

We're seeing the second case where mount is merging all the options in
/etc/fstab into the options passed into the remount command. How is
the filesystem expected to behave in these different cases? The
first simply changes the ro/rw status, the second potentially
asks for the filesystem to change a bunch of other mount options
as well, which it may not be able to do.

So what is the correct behaviour? Should the filesystem *silently
ignore* unchangeable options in the remount command, or should it
fail the remount and warn the user that certain options are not
allowed in remount?

Cheers,

Dave.
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Tue Aug 5 16:43:01 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 16:43:04 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m75Nh1YC025562 for ; Tue, 5 Aug 2008 16:43:01 -0700 X-ASG-Debug-ID: 1217979855-01c601ce0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mars.unix.geek.nz (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 532D4F05BE5 for ; Tue, 5 Aug 2008 16:44:15 -0700 (PDT) Received: from mars.unix.geek.nz ([116.90.133.99]) by cuda.sgi.com with ESMTP id Sq5uSYRJrbOGKSgV for ; Tue, 05 Aug 2008 16:44:15 -0700 (PDT) Received: from [202.6.116.165] (unknown [202.6.116.165]) (Authenticated sender: jasper) by mars.unix.geek.nz (Postfix) with ESMTPSA id 6BD6F159B21; Wed, 6 Aug 2008 11:44:13 +1200 (NZST) X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 From: Jasper Bryant-Greene To: Dave Chinner Cc: Karel Zak , Christoph Hellwig , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org In-Reply-To: <20080805233956.GI21635@disturbed> References: <1217553598.3860.16.camel@luna.unix.geek.nz> <20080801073033.GF6201@disturbed> <20080801193133.GA838@lst.de> <20080805110357.GL21873@nb.net.home> <20080805233956.GI21635@disturbed> Content-Type: text/plain Organization: Amiton Ltd Date: Wed, 06 Aug 2008 11:44:22 +1200 Message-Id: <1217979862.2887.39.camel@luna.unix.geek.nz> Mime-Version: 1.0 X-Mailer: Evolution 2.22.3.1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: UNKNOWN[116.90.133.99] X-Barracuda-Start-Time: 1217979856 X-Barracuda-Bayes: INNOCENT GLOBAL 
0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.92 X-Barracuda-Spam-Status: No, SCORE=-1.92 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=RDNS_NONE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1864 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.10 RDNS_NONE Delivered to trusted network by a host with no rDNS X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17396 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jasper@amiton.co.nz Precedence: bulk X-list: xfs

On Wed, 2008-08-06 at 09:39 +1000, Dave Chinner wrote:
> We're seeing the second case where mount is merging all the options in
> /etc/fstab into the options passed into the remount command. How is
> the filesystem expected to behave in these different cases? The
> first simply changes the ro/rw status, the second potentially
> asks for the filesystem to change a bunch of other mount options
> as well, which it may not be able to do.
>
> So what is the correct behaviour? Should the filesystem *silently
> ignore* unchangeable options in the remount command, or should it
> fail the remount and warn the user that certain options are not
> allowed in remount?

(forgive me, I'm an XFS user, not an XFS developer, so this might be
ignorant)

The filesystem presumably knows what options it was originally mounted
with.

Thus if you take the difference of the set of options you were mounted
with, and the set of options you are now being asked to remount with,
you have the options which are being asked to change.
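
The set difference described above could be sketched in user space roughly as follows. This is only an illustration, assuming both option sets are available as plain comma-separated strings; `has_option` and `changed_options` are hypothetical helper names, not functions from XFS or mount(8), and the real kernel option parsing works differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the option opt (of length len) appears as a whole
 * entry in the comma-separated list, 0 otherwise. */
static int has_option(const char *list, const char *opt, size_t len)
{
    const char *p = list;

    while (*p) {
        size_t n = strcspn(p, ",");   /* length of the current entry */

        if (n == len && strncmp(p, opt, len) == 0)
            return 1;
        p += n;
        if (*p == ',')
            p++;
    }
    return 0;
}

/* Hypothetical helper: collect into out every option that is in
 * requested but not in current, comma-separated.  Returns the number
 * of such options, or -1 if out is too small. */
static int changed_options(const char *current, const char *requested,
                           char *out, size_t outsz)
{
    const char *p = requested;
    size_t used = 0;
    int count = 0;

    assert(current && requested && out && outsz > 0);
    out[0] = '\0';

    while (*p) {
        size_t n = strcspn(p, ",");

        /* Skip empty entries and options that are already in effect. */
        if (n > 0 && !has_option(current, p, n)) {
            size_t need = n + (used ? 1 : 0);   /* entry plus separator */

            if (used + need + 1 > outsz)
                return -1;
            if (used)
                out[used++] = ',';
            memcpy(out + used, p, n);
            used += n;
            out[used] = '\0';
            count++;
        }
        p += n;
        if (*p == ',')
            p++;
    }
    return count;
}
```

With the current options "ro,noikeep,attr2,noatime" and the requested options "rw,noikeep,attr2,noatime", this reports a single changed option, "rw". Options are compared as whole strings, so a key=value option whose value differs would also be reported as changed.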
If changing any of them is unsupported I would expect an error, but in this case the result of taking the above set difference would be merely replacing ro with rw, and thus the filesystem is presumably capable of doing the remount. -jasper From owner-xfs@oss.sgi.com Tue Aug 5 17:00:03 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 17:00:06 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m760026G027058 for ; Tue, 5 Aug 2008 17:00:03 -0700 X-ASG-Debug-ID: 1217980876-1ae0023b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C002B361C99; Tue, 5 Aug 2008 17:01:17 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id lC5fh9w4TGPfjRCk; Tue, 05 Aug 2008 17:01:17 -0700 (PDT) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1KQWSq-0007ie-MK; Wed, 06 Aug 2008 00:01:16 +0000 Date: Tue, 5 Aug 2008 20:01:16 -0400 From: Christoph Hellwig To: Lachlan McIlroy Cc: xfs@oss.sgi.com, xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Don't release root inode until finished using it Subject: Re: [PATCH] Don't release root inode until finished using it Message-ID: <20080806000116.GA15930@infradead.org> References: <4897F434.2010309@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4897F434.2010309@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] 
X-Barracuda-Start-Time: 1217980877 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1865 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17397 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Looks good, but I've already sent a nicer version on the 26th: [PATCH 1/6] move root inode IRELE into xfs_unmountfs Sometimes I wish people would at least glance over my patches when they hit the list.. From owner-xfs@oss.sgi.com Tue Aug 5 17:52:45 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 17:52:51 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_72, RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m760qiAZ032043 for ; Tue, 5 Aug 2008 17:52:45 -0700 X-ASG-Debug-ID: 1217984037-202a01520000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A935D198F977 for ; Tue, 5 Aug 2008 17:53:57 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id LCSDpqHvoprOn7t0 for ; Tue, 05 Aug 2008 17:53:57 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: 
true X-IronPort-Anti-Spam-Result: Aj0FAHSQmEh5LAiFXWdsb2JhbACKcoY5Hp0O X-IronPort-AV: E=Sophos;i="4.31,313,1215354600"; d="scan'208";a="175099886" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 06 Aug 2008 10:23:56 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQXHm-0003Ua-CA; Wed, 06 Aug 2008 10:53:54 +1000 Date: Wed, 6 Aug 2008 10:53:54 +1000 From: Dave Chinner To: Jasper Bryant-Greene Cc: Karel Zak , Christoph Hellwig , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Message-ID: <20080806005354.GL21635@disturbed> Mail-Followup-To: Jasper Bryant-Greene , Karel Zak , Christoph Hellwig , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org References: <1217553598.3860.16.camel@luna.unix.geek.nz> <20080801073033.GF6201@disturbed> <20080801193133.GA838@lst.de> <20080805110357.GL21873@nb.net.home> <20080805233956.GI21635@disturbed> <1217979862.2887.39.camel@luna.unix.geek.nz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1217979862.2887.39.camel@luna.unix.geek.nz> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1217984038 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1873 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 
15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17398 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs

On Wed, Aug 06, 2008 at 11:44:22AM +1200, Jasper Bryant-Greene wrote:
> On Wed, 2008-08-06 at 09:39 +1000, Dave Chinner wrote:
> > We're seeing the second case where mount is merging all the options in
> > /etc/fstab into the options passed into the remount command. How is
> > the filesystem expected to behave in these different cases? The
> > first simply changes the ro/rw status, the second potentially
> > asks for the filesystem to change a bunch of other mount options
> > as well, which it may not be able to do.
> >
> > So what is the correct behaviour? Should the filesystem *silently
> > ignore* unchangeable options in the remount command, or should it
> > fail the remount and warn the user that certain options are not
> > allowed in remount?
>
> (forgive me, I'm an XFS user, not an XFS developer, so this might be
> ignorant)
>
> The filesystem presumably knows what options it was originally mounted
> with.
>
> Thus if you take the difference of the set of options you were mounted
> with, and the set of options you are now being asked to remount with,
> you have the options which are being asked to change.

Sure. But that does not answer my question about what to do with
options that can't be changed. Options can come from more than
just /etc/fstab - they can come from the mount command line itself
as entered by the admin. What do we do if an option is specified
that we do not support in a remount?

The problem is the way mount combines command line options with
options in fstab. I'm not questioning what you did - I'm asking
what the expected behaviour is supposed to be so we can make it
behave the same way as all the other filesystems.
> If changing any of them is unsupported I would expect an error, but in > this case the result of taking the above set difference would be merely > replacing ro with rw, and thus the filesystem is presumably capable of > doing the remount. Use the full device/directory syntax for the remount command and it will do just that. The command you issued was not a "pure" remount,rw, it was silently changed by mount.... Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Tue Aug 5 19:01:06 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 19:01:09 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: * X-Spam-Status: No, score=1.9 required=5.0 tests=BAYES_05,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76214T1003680 for ; Tue, 5 Aug 2008 19:01:06 -0700 X-ASG-Debug-ID: 1217988138-0f9b02fa0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from dolby.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 153163621D1 for ; Tue, 5 Aug 2008 19:02:18 -0700 (PDT) Received: from dolby.com (silver.dolby.com [12.144.165.24]) by cuda.sgi.com with ESMTP id 2z6Hie4Q64teM2PO for ; Tue, 05 Aug 2008 19:02:18 -0700 (PDT) Received: from ([172.16.17.29]) by silver.dolby.com with ESMTP id 5202052.97509635; Tue, 05 Aug 2008 19:01:29 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 MIME-Version: 1.0 X-ASG-Orig-Subj: xfs_force_shutdown - file fs/xfs/xfs_rw.c Subject: xfs_force_shutdown - file fs/xfs/xfs_rw.c Date: Tue, 5 Aug 2008 19:02:05 -0700 Message-ID: <05BB196AB3DA6C4BBE11AB6C957581FE0E61DA47@sfo-exch-01.dolby.net> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: xfs_force_shutdown - file fs/xfs/xfs_rw.c Thread-Index: Acj3aGz8s0PqbVTiS/W6d2jV2mf5OQ== From: "Martinez, Sergio" To: X-Barracuda-Connect: 
silver.dolby.com[12.144.165.24] X-Barracuda-Start-Time: 1217988139 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1877 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 943 X-archive-position: 17399 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sergio.martinez@dolby.com Precedence: bulk X-list: xfs I'm running into a force shutdown issue on a system running linux-2.6.25. The xfs partition is 2 GB and we're using a hardware RAID-5 for storage. dmesg shows this error: xfs_force_shutdown(sda2,0x1) called from line 420 of file fs/xfs/xfs_rw.c. Return address = 0xc02ddebc An XFS_bwrite() calls is failing. Any suggestions on how to debug this? I'm suspecting that there's a problem with the RAID controller firmware, but I'd like to eliminate the possibility of a file system bug. Thanks, Sergio ----------------------------------------- This message (including any attachments) may contain confidential information intended for a specific individual and purpose. If you are not the intended recipient, delete this message. If you are not the intended recipient, disclosing, copying, distributing, or taking any action based on this message is strictly prohibited. 
[[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Tue Aug 5 19:21:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 19:21:15 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m762LAmv005245 for ; Tue, 5 Aug 2008 19:21:11 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA00299; Wed, 6 Aug 2008 12:22:18 +1000 Message-ID: <48990C4E.9070102@sgi.com> Date: Wed, 06 Aug 2008 12:28:30 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> <489806C2.7020200@sgi.com> <20080805084220.GF21635@disturbed> In-Reply-To: <20080805084220.GF21635@disturbed> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17400 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Dave Chinner wrote: > On Tue, Aug 05, 2008 at 05:52:34PM +1000, Lachlan McIlroy wrote: >> Dave Chinner wrote: >>> On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: >>>> Currently by the time we get to vn_iowait() in xfs_reclaim() we have already >>>> gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O >>> xfs_free()? What's that? 
>> Sorry that should have been xfs_ifree() (we set the inode's mode to
>> zero in there).
>>
>>>> completions still running (file size updates and unwritten extent conversions)
>>>> may be working on an inode that is no longer valid.
>>> The linux inode does not get freed until after ->clear_inode
>>> completes, hence it is perfectly valid to reference it anywhere
>>> in the ->clear_inode path.
>> The problem I see is an assert in xfs_setfilesize() failing:
>>
>> ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFREG);
>>
>> The mode of the XFS inode is zero at this time.
>
> Ok, so the question has to be why is there I/O still in progress
> after the truncate is supposed to have already occurred and the
> vn_iowait() in xfs_itruncate_start() been executed.
>
> Something doesn't add up here - you can't be doing I/O on a file
> with no extents or delalloc blocks, hence that means we should be
> passing through the truncate path in xfs_inactive() before we
> call xfs_ifree() and therefore doing the vn_iowait()..
>
> Hmmmm - the vn_iowait() is conditional based on:
>
>	/* wait for the completion of any pending DIOs */
>	if (new_size < ip->i_size)
>		vn_iowait(ip);
>
> We are truncating to zero (new_size == 0), so the only case where
> this would not wait is if ip->i_size == 0. Still - I can't see
> how we'd be doing I/O on an inode with a zero i_size. I suspect
> ensuring we call vn_iowait() if newsize == 0 as well would fix
> the problem. If not, there's something much more subtle going
> on here that we should understand....

If we make the vn_iowait() unconditional we might re-introduce the
NFS exclusivity bug that killed performance. That was through
xfs_release()->xfs_free_eofblocks()->xfs_itruncate_start().

So if we leave the above code as is then we need another vn_iowait() in
xfs_inactive() to catch any remaining workqueue items that we didn't wait
for in xfs_itruncate_start().
In that case the last call to vn_iowait() should be inside xfs_inactive() after the truncate but before the call to xfs_ifree(). From owner-xfs@oss.sgi.com Tue Aug 5 19:25:29 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 19:25:32 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m762PRbs005899 for ; Tue, 5 Aug 2008 19:25:28 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA00375; Wed, 6 Aug 2008 12:26:38 +1000 Message-ID: <48990D53.3070305@sgi.com> Date: Wed, 06 Aug 2008 12:32:51 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com, xfs-dev Subject: Re: [PATCH] Don't release root inode until finished using it References: <4897F434.2010309@sgi.com> <20080806000116.GA15930@infradead.org> In-Reply-To: <20080806000116.GA15930@infradead.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17401 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Christoph Hellwig wrote: > Looks good, but I've already sent a nicer version on the 26th: > > [PATCH 1/6] move root inode IRELE into xfs_unmountfs > > > Sometimes I wish people would at least glance over my patches when they > hit the list.. 
Some of us are trying to fix bugs and don't have time to look at every cleanup patch that hits the list. From owner-xfs@oss.sgi.com Tue Aug 5 21:11:50 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:12:38 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m764Bk1J018098 for ; Tue, 5 Aug 2008 21:11:48 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA02290; Wed, 6 Aug 2008 14:12:57 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id B9C6758C52A4; Wed, 6 Aug 2008 14:12:57 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - replace dquot flush semaphore with a completion Message-Id: <20080806041257.B9C6758C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 14:12:57 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17402 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs replace dquot flush semaphore with a completion Use the new completion flush code to implement the dquot flush lock. Removes one of the final users of semaphores in the XFS code base. 
Signed-off-by: Dave Chinner Date: Wed Aug 6 14:12:02 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: david@fromorbit.com lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31822a fs/xfs/quota/xfs_dquot_item.c - 1.23 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_dquot_item.c.diff?r1=text&tr1=1.23&r2=text&tr2=1.22&f=h fs/xfs/quota/xfs_dquot.h - 1.13 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_dquot.h.diff?r1=text&tr1=1.13&r2=text&tr2=1.12&f=h fs/xfs/quota/xfs_dquot.c - 1.37 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_dquot.c.diff?r1=text&tr1=1.37&r2=text&tr2=1.36&f=h fs/xfs/quota/xfs_qm.c - 1.72 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm.c.diff?r1=text&tr1=1.72&r2=text&tr2=1.71&f=h - replace dquot flush semaphore with a completion From owner-xfs@oss.sgi.com Tue Aug 5 21:30:27 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:30:33 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m764UPv4019947 for ; Tue, 5 Aug 2008 21:30:26 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA02746; Wed, 6 Aug 2008 14:31:38 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 1E66D58C52A4; Wed, 6 Aug 2008 14:31:38 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - remove the sema_t from XFS. 
Message-Id: <20080806043138.1E66D58C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 14:31:38 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17403 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs remove the sema_t from XFS. Now that all users of the sema_t are gone from XFS we can finally kill it. Signed-off-by: Dave Chinner Date: Wed Aug 6 14:30:56 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: david@fromorbit.com lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31823a fs/xfs/linux-2.6/xfs_linux.h - 1.170 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_linux.h.diff?r1=text&tr1=1.170&r2=text&tr2=1.169&f=h fs/xfs/linux-2.6/sema.h - 1.16 - deleted http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/sema.h.diff?r1=text&tr1=1.16&r2=text&tr2=1.15&f=h - remove the sema_t from XFS. 
From owner-xfs@oss.sgi.com Tue Aug 5 21:32:38 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:32:49 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m764Wc04020419 for ; Tue, 5 Aug 2008 21:32:38 -0700 X-ASG-Debug-ID: 1217997232-218900b70000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from web34502.mail.mud.yahoo.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with SMTP id 59E6E362848 for ; Tue, 5 Aug 2008 21:33:52 -0700 (PDT) Received: from web34502.mail.mud.yahoo.com (web34502.mail.mud.yahoo.com [66.163.178.168]) by cuda.sgi.com with SMTP id XJSVaaALqU54sPWi for ; Tue, 05 Aug 2008 21:33:52 -0700 (PDT) Received: (qmail 22407 invoked by uid 60001); 6 Aug 2008 04:33:52 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Message-ID; b=NCAN3LvYp4nWQiuipjCc47UDDgbzsY+b+SZQRHsDG52mc7JMybvAJ6+wXQTsREnVfVxOQx7+4MyGVVKsVHzToTpjZjlqIvDdS3573ccI5fJKEddWT1q84+NwJ7Z6zZhqEwO0YvB8m0Oobt5biHLi+90Xdvv4vCvGnjsJ8x+Pufo=; Received: from [96.14.187.213] by web34502.mail.mud.yahoo.com via HTTP; Tue, 05 Aug 2008 21:33:52 PDT X-Mailer: YahooMailWebService/0.7.218 Date: Tue, 5 Aug 2008 21:33:52 -0700 (PDT) From: gus3 Reply-To: MusicMan529@yahoo.com X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 To: Karel Zak , Dave Chinner Cc: Christoph Hellwig , Jasper Bryant-Greene , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org In-Reply-To: <20080805233956.GI21635@disturbed> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Message-ID: <510965.21937.qm@web34502.mail.mud.yahoo.com> X-Barracuda-Connect: web34502.mail.mud.yahoo.com[66.163.178.168] X-Barracuda-Start-Time: 1217997233 X-Barracuda-Bayes: INNOCENT GLOBAL 0.3412 1.0000 -0.1845 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.18 X-Barracuda-Spam-Status: No, SCORE=-0.18 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1886 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17404 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: musicman529@yahoo.com Precedence: bulk X-list: xfs --- On Tue, 8/5/08, Dave Chinner wrote: > So what is the correct behaviour? Should the filesystem > *silently > ignore* unchangable options in the remount command, or > should it > fail the remount and warn the user that certain options are > not > allowed in remount? How about a middle ground: ignore, but not silently? Report an error, or send it to the syslog, or both, but ultimately ignore unchangeable options, change what can be changed, and give the user/admin as much as possible. This can be particularly pertinent for XFS root. If it's mounted RO at first, it may (will?) need to become RW at some later point. Failing the remount could result in a system that requires a rescue CD (or lots of headaches for remote administration). 
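A rough sketch of the "ignore, but not silently" middle ground, written as plain userspace C rather than kernel code. Everything named here is hypothetical: `parse_remount_opts`, `struct remount_result` and the `fixed_opts` table are made up for illustration and are not taken from the XFS sources, and the set of options treated as fixed at mount time is an assumption. `stderr` stands in for the syslog.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result of parsing a remount option string. */
struct remount_result {
    int read_only;   /* 1 = ro, 0 = rw, -1 = not specified */
    int ignored;     /* number of options reported and skipped */
};

/* Assumed list of options that cannot change after the initial mount. */
static const char *const fixed_opts[] = { "noikeep", "ikeep", "attr2", "noattr2" };

static int is_fixed_opt(const char *opt)
{
    size_t i;

    for (i = 0; i < sizeof(fixed_opts) / sizeof(fixed_opts[0]); i++)
        if (strcmp(opt, fixed_opts[i]) == 0)
            return 1;
    return 0;
}

/*
 * Walk a comma-separated option string. Changeable options (ro/rw) are
 * applied; anything else is reported and skipped, so the remount itself
 * never fails because of an option that cannot be changed.
 */
static struct remount_result parse_remount_opts(const char *opts)
{
    struct remount_result res = { -1, 0 };
    char buf[256];
    char *tok;

    /* strtok() modifies its input, so work on a bounded local copy. */
    strncpy(buf, opts, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (strcmp(tok, "ro") == 0) {
            res.read_only = 1;
        } else if (strcmp(tok, "rw") == 0) {
            res.read_only = 0;
        } else {
            fprintf(stderr, "remount: option \"%s\" %s, ignoring\n", tok,
                    is_fixed_opt(tok) ? "cannot be changed on remount"
                                      : "is not recognised");
            res.ignored++;
        }
    }
    return res;
}
```

With an option string such as `rw,noikeep,attr2`, this sketch would switch to read-write and log two ignored options instead of rejecting the whole remount.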
From owner-xfs@oss.sgi.com Tue Aug 5 21:35:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:35:39 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m764ZBis020859 for ; Tue, 5 Aug 2008 21:35:12 -0700 X-ASG-Debug-ID: 1217997385-1e8f01100000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mars.unix.geek.nz (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A416635BE23 for ; Tue, 5 Aug 2008 21:36:25 -0700 (PDT) Received: from mars.unix.geek.nz ([116.90.133.99]) by cuda.sgi.com with ESMTP id yBFWFKohUfS4rcSY for ; Tue, 05 Aug 2008 21:36:25 -0700 (PDT) Received: from [192.168.21.2] (ip-58-28-152-52.static-xdsl.xnet.co.nz [58.28.152.52]) (Authenticated sender: jasper) by mars.unix.geek.nz (Postfix) with ESMTPSA id 3E7D3124F3F; Wed, 6 Aug 2008 16:36:24 +1200 (NZST) X-ASG-Orig-Subj: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 Subject: Re: XFS noikeep remount in 2.6.27-rc1-next-20080730 From: Jasper Bryant-Greene To: MusicMan529@yahoo.com Cc: Karel Zak , Dave Chinner , Christoph Hellwig , linux-kernel@vger.kernel.org, xfs@oss.sgi.com, util-linux-ng@vger.kernel.org In-Reply-To: <510965.21937.qm@web34502.mail.mud.yahoo.com> References: <510965.21937.qm@web34502.mail.mud.yahoo.com> Content-Type: text/plain Organization: Amiton Ltd Date: Wed, 06 Aug 2008 16:36:35 +1200 Message-Id: <1217997395.3456.1.camel@luna.unix.geek.nz> Mime-Version: 1.0 X-Mailer: Evolution 2.22.3.1 Content-Transfer-Encoding: 7bit X-Barracuda-Connect: UNKNOWN[116.90.133.99] X-Barracuda-Start-Time: 1217997386 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0001 1.0000 -2.0204 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.92 
X-Barracuda-Spam-Status: No, SCORE=-1.92 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=RDNS_NONE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1886 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.10 RDNS_NONE Delivered to trusted network by a host with no rDNS X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17405 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jasper@amiton.co.nz Precedence: bulk X-list: xfs On Tue, 2008-08-05 at 21:33 -0700, gus3 wrote: > --- On Tue, 8/5/08, Dave Chinner wrote: > > > So what is the correct behaviour? Should the filesystem > > *silently > > ignore* unchangable options in the remount command, or > > should it > > fail the remount and warn the user that certain options are > > not > > allowed in remount? > > How about a middle ground: ignore, but not silently? Report an error, or send it to the syslog, or both, but ultimately ignore unchangeable options, change what can be changed, and give the user/admin as much as possible. I think the idea is to behave the same as other FS do, not to innovate. > This can be particularly pertinent for XFS root. If it's mounted RO at first, it may (will?) need to become RW at some later point. Failing the remount could result in a system that requires a rescue CD (or lots of headaches for remote administration). FWIW, your root filesystem does not need to be rw. I keep mine ro nearly all the time on my laptop, only mounting rw if I need to install software that puts files outside /usr or if I need to modify a config file in /etc. 
-jasper From owner-xfs@oss.sgi.com Tue Aug 5 21:45:33 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:45:37 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m764jVsK022249 for ; Tue, 5 Aug 2008 21:45:32 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA03070; Wed, 6 Aug 2008 14:46:43 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 6DC7858C52A4; Wed, 6 Aug 2008 14:46:43 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - update timestamp in xfs_ialloc manually Message-Id: <20080806044643.6DC7858C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 14:46:43 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17406 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs update timestamp in xfs_ialloc manually In xfs_ialloc we just want to set all timestamps to the current time. We don't need to mark the inode dirty like xfs_ichgtime does, and we neither need nor want the optimizations in xfs_ichgtime that I will introduce in the next patch. So just opencode the timestamp update in xfs_ialloc, and remove the now unused XFS_ICHGTIME_ACC case in xfs_ichgtime.
Signed-off-by: Christoph Hellwig Date: Wed Aug 6 14:44:59 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31825a fs/xfs/xfs_vnodeops.c - 1.772 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.772&r2=text&tr2=1.771&f=h fs/xfs/xfs_inode.c - 1.517 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.517&r2=text&tr2=1.516&f=h fs/xfs/xfs_inode.h - 1.256 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.h.diff?r1=text&tr1=1.256&r2=text&tr2=1.255&f=h fs/xfs/linux-2.6/xfs_iops.c - 1.297 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.297&r2=text&tr2=1.296&f=h - update timestamp in xfs_ialloc manually From owner-xfs@oss.sgi.com Tue Aug 5 21:50:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:50:28 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m764nx1U023028 for ; Tue, 5 Aug 2008 21:50:24 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA03188; Wed, 6 Aug 2008 14:51:11 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 7E17458C52A4; Wed, 6 Aug 2008 14:51:11 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - optimize xfs_ichgtime Message-Id: <20080806045111.7E17458C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 14:51:11 +1000 (EST) From: 
lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17407 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs optimize xfs_ichgtime Port a little optimization from file_update_time to xfs_ichgtime, and only update the timestamp and mark the inode dirty if the timestamp actually changes in the timer tick resolution supported by the running kernel. Signed-off-by: Christoph Hellwig Date: Wed Aug 6 14:50:29 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31827a fs/xfs/linux-2.6/xfs_iops.c - 1.298 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.298&r2=text&tr2=1.297&f=h - optimize xfs_ichgtime From owner-xfs@oss.sgi.com Tue Aug 5 21:51:41 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:51:50 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m764pdlM023520 for ; Tue, 5 Aug 2008 21:51:41 -0700 X-ASG-Debug-ID: 1217998371-1d0c007b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wr-out-0506.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4C8BDF06DC4 for ; Tue, 5 Aug 2008 21:52:51 -0700 (PDT) Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.230]) by cuda.sgi.com with ESMTP id oxDAuK1IIEXj7dIP for ; Tue, 05 Aug 2008 21:52:51 -0700
(PDT) Received: by wr-out-0506.google.com with SMTP id 57so2248136wri.12 for ; Tue, 05 Aug 2008 21:52:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type:references; bh=E1KBD+1lYURkbuple1y7uzoahW8JBHTX5LD4BmuEMOY=; b=KwZDs3hAUT7QkLQ8TU9cbiohkZlfxu3CnjAK+3qnSLDbWhcvcMhe2Y2zy5OIXhycC8 59AX99XT7/H6Nv7Ro1dPcg00ZgUbt8vCYli7LHJIYUvEuEmM0Qt+BnKabCOAO88QjXX6 4OO+DpBz2FFBBpDIUYkutO6txlfBLyZBeu5Mw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:references; b=GIj8m+f2k/bHqXIrMDyi8UNy7UHnQ2nBAXG7YSIbe3vuZ2lrrh8I6uHrYjEh2M9Fp7 zgvJ5gCMYkLct6oUkdS/iG49/fnf03g3y+IvX16pNhdDSdJPkSlkvaXQ9K8AoDRx94n6 g1JDOvCeRUL37hFKAH+DRz0s0vPCcc/4y1hB0= Received: by 10.90.31.8 with SMTP id e8mr2562081age.68.1217998370868; Tue, 05 Aug 2008 21:52:50 -0700 (PDT) Received: by 10.90.115.11 with HTTP; Tue, 5 Aug 2008 21:52:49 -0700 (PDT) Message-ID: Date: Wed, 6 Aug 2008 10:22:49 +0530 From: "Bhagi rathi" To: "Martinez, Sergio" X-ASG-Orig-Subj: Re: xfs_force_shutdown - file fs/xfs/xfs_rw.c Subject: Re: xfs_force_shutdown - file fs/xfs/xfs_rw.c Cc: xfs@oss.sgi.com In-Reply-To: <05BB196AB3DA6C4BBE11AB6C957581FE0E61DA47@sfo-exch-01.dolby.net> MIME-Version: 1.0 References: <05BB196AB3DA6C4BBE11AB6C957581FE0E61DA47@sfo-exch-01.dolby.net> X-Barracuda-Connect: wr-out-0506.google.com[64.233.184.230] X-Barracuda-Start-Time: 1217998374 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0532 1.0000 -1.6796 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.68 X-Barracuda-Spam-Status: No, SCORE=-1.68 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1888 Rule breakdown below pts rule name description ---- ---------------------- 
-------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1367 X-archive-position: 17408 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs You can just run without raid controller and rule out XFS issue. If the file-system meta-data/log-flush update meets an error, the file-system will be marked for shutdown. The reason for this seems to be xfs_bwrite failure. Investigate on xfs_bwrite failure. -Bhagi. On Wed, Aug 6, 2008 at 7:32 AM, Martinez, Sergio wrote: > > I'm running into a force shutdown issue on a system running > linux-2.6.25. The xfs partition is 2 GB and we're using a hardware > RAID-5 for storage. > > dmesg shows this error: > > xfs_force_shutdown(sda2,0x1) called from line 420 of file > fs/xfs/xfs_rw.c. Return address = 0xc02ddebc > > An XFS_bwrite() calls is failing. > > Any suggestions on how to debug this? I'm suspecting that there's a > problem with the RAID controller firmware, but I'd like to eliminate the > possibility of a file system bug. > > Thanks, > Sergio > > > > > > ----------------------------------------- > This message (including any attachments) may contain confidential > information intended for a specific individual and purpose. If you > are not the intended recipient, delete this message. If you are > not the intended recipient, disclosing, copying, distributing, or > taking any action based on this message is strictly prohibited. 
> > [[HTML alternate version deleted]] > > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Tue Aug 5 21:55:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 21:55:57 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m764th2a024217 for ; Tue, 5 Aug 2008 21:55:45 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA03285; Wed, 6 Aug 2008 14:56:55 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 3489958C52A4; Wed, 6 Aug 2008 14:56:55 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - stop using file_update_time Message-Id: <20080806045655.3489958C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 14:56:55 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17409 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs stop using file_update_time xfs_ichgtime updates the xfs_inode and Linux inode timestamps just fine, no need to call file_update_time and then copy the values over to the XFS inode. The only additional things in file_update_time are checks not applicable to the write path.
Signed-off-by: Christoph Hellwig Date: Wed Aug 6 14:56:16 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan david@fromorbit.com Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31829a fs/xfs/linux-2.6/xfs_lrw.c - 1.282 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_lrw.c.diff?r1=text&tr1=1.282&r2=text&tr2=1.281&f=h fs/xfs/linux-2.6/xfs_iops.c - 1.299 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.299&r2=text&tr2=1.298&f=h fs/xfs/linux-2.6/xfs_iops.h - 1.37 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.h.diff?r1=text&tr1=1.37&r2=text&tr2=1.36&f=h fs/xfs/linux-2.6/xfs_ksyms.c - 1.89 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_ksyms.c.diff?r1=text&tr1=1.89&r2=text&tr2=1.88&f=h - stop using file_update_time From owner-xfs@oss.sgi.com Tue Aug 5 22:10:20 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 22:10:23 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m765AJkL006976 for ; Tue, 5 Aug 2008 22:10:20 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA03617; Wed, 6 Aug 2008 15:11:32 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 0E58A58C52A4; Wed, 6 Aug 2008 15:11:32 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - move root inode IRELE into xfs_unmountfs Message-Id: 
<20080806051132.0E58A58C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 15:11:32 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17410 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs move root inode IRELE into xfs_unmountfs The root inode is allocated in xfs_mountfs so it should be released in xfs_unmountfs. For the unmount case that means we do it after the xfs_sync(mp, SYNC_WAIT | SYNC_CLOSE) in the forced shutdown case and the dmapi unmount event. Note that both reference the rip variable which might be freed by that time in case inode flushing has kicked in, so strictly speaking this might count as a bug fix. Signed-off-by: Christoph Hellwig Date: Wed Aug 6 15:10:49 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31830a fs/xfs/xfs_mount.c - 1.440 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.440&r2=text&tr2=1.439&f=h fs/xfs/linux-2.6/xfs_super.c - 1.444 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_super.c.diff?r1=text&tr1=1.444&r2=text&tr2=1.443&f=h - move root inode IRELE into xfs_unmountfs From owner-xfs@oss.sgi.com Tue Aug 5 22:19:44 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 22:19:47 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m765JhSt008561 for ; Tue, 5
Aug 2008 22:19:44 -0700 X-ASG-Debug-ID: 1218000056-4cab00830000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 2C31D362967 for ; Tue, 5 Aug 2008 22:20:57 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id TvYSxN30Ik3Yah9p for ; Tue, 05 Aug 2008 22:20:57 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApsEALzPmEh5LAiF/2dsb2JhbACKcqIs X-IronPort-AV: E=Sophos;i="4.31,314,1215354600"; d="scan'208";a="175297063" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 06 Aug 2008 14:50:54 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQbS9-0002lc-AF; Wed, 06 Aug 2008 15:20:53 +1000 Date: Wed, 6 Aug 2008 15:20:53 +1000 From: Dave Chinner To: Lachlan McIlroy Cc: xfs@oss.sgi.com, xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Message-ID: <20080806052053.GU6119@disturbed> Mail-Followup-To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> <489806C2.7020200@sgi.com> <20080805084220.GF21635@disturbed> <48990C4E.9070102@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <48990C4E.9070102@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1218000058 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0003 1.0000 -2.0189 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 
tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1890 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17412 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Wed, Aug 06, 2008 at 12:28:30PM +1000, Lachlan McIlroy wrote: > Dave Chinner wrote: >> On Tue, Aug 05, 2008 at 05:52:34PM +1000, Lachlan McIlroy wrote: >>> Dave Chinner wrote: >>>> On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: >>>>> Currently by the time we get to vn_iowait() in xfs_reclaim() we have already >>>>> gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O >>>> xfs_free()? What's that? >>> Sorry that should have been xfs_ifree() (we set the inode's mode to >>> zero in there). >>> >>>>> completions still running (file size updates and unwritten extent conversions) >>>>> may be working on an inode that is no longer valid. >>>> The linux inode does not get freed until after ->clear_inode >>>> completes, hence it is perfectly valid to reference it anywhere >>>> in the ->clear_inode path. >>> The problem I see is an assert in xfs_setfilesize() fail: >>> >>> ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFREG); >>> >>> The mode of the XFS inode is zero at this time. >> >> Ok, so the question has to be why is there I/O still in progress >> after the truncate is supposed to have already occurred and the >> vn_iowait() in xfs_itruncate_start() been executed. >> >> Something doesn't add up here - you can't be doing I/O on a file >> with no extents or delalloc blocks, hence that means we should be >> passing through the truncate path in xfs_inactive() before we >> call xfs_ifree() and therefore doing the vn_iowait().. 
>> >> Hmmmm - the vn_iowait() is conditional based on: >> >> /* wait for the completion of any pending DIOs */ >> if (new_size < ip->i_size) >> vn_iowait(ip); >> >> We are truncating to zero (new_size == 0), so the only case where >> this would not wait is if ip->i_size == 0. Still - I can't see >> how we'd be doing I/O on an inode with a zero i_size. I suspect >> ensuring we call vn_iowait() if newsize == 0 as well would fix >> the problem. If not, there's something much more subtle going >> on here that we should understand.... > > If we make the vn_iowait() unconditional we might re-introduce the > NFS exclusivity bug that killed performance. That was through > xfs_release()->xfs_free_eofblocks()->xfs_itruncate_start(). It won't reintroduce that problem because ->clear_inode() is not called on every NFS write operation. > So if we leave the above code as is then we need another > vn_iowait() in xfs_inactive() to catch any remaining workqueue > items that we didn't wait for in xfs_itruncate_start(). How do we have any new *data* I/O at all in progress at this point? That does not explain why we need an additional vn_iowait() call. All I see from this is a truncate race that has something to do with the vn_iowait() call being conditional. That is, if we truncate to zero, then the current code in xfs_itruncate_start() should wait unconditionally for *all* I/O to complete because, by definition, all that I/O is beyond the new EOF and we have to wait for it to complete before truncating the file. Seeing as we are in ->clear_inode(), no new data I/O can start while we are deep in this code, hence we should not be seeing I/O completions after the truncate starts and vn_iowait() has completed. Hence we need to know, firstly, if the truncate code has been called; secondly, what the values of i_size and i_new_size were when the truncate was started and, finally, whether ip->i_iocount was non-zero when the truncate was run. 
That is, we need to gather enough data to determine whether we should have waited in the truncate but didn't. If either the vn_iowait() in the truncate path is not sufficient, or the truncate code is not being called, there is *some other bug* that we don't yet understand. Adding an unconditional vn_iowait() appears to me to be fixing a symptom, not the underlying cause of the problem.... Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Tue Aug 5 22:18:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 22:18:50 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m765Ij91008237 for ; Tue, 5 Aug 2008 22:18:46 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA03898; Wed, 6 Aug 2008 15:19:57 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 3F05758C52A4; Wed, 6 Aug 2008 15:19:57 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - cleanup xfs_mountfs Message-Id: <20080806051957.3F05758C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 15:19:57 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17411 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs cleanup xfs_mountfs Remove all the useless flags and code keyed off them in xfs_mountfs. 
Signed-off-by: Christoph Hellwig Date: Wed Aug 6 15:19:16 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31831a fs/xfs/xfs_log.h - 1.82 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.h.diff?r1=text&tr1=1.82&r2=text&tr2=1.81&f=h fs/xfs/xfs_log.c - 1.361 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.c.diff?r1=text&tr1=1.361&r2=text&tr2=1.360&f=h fs/xfs/xfs_log_priv.h - 1.133 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_priv.h.diff?r1=text&tr1=1.133&r2=text&tr2=1.132&f=h fs/xfs/xfs_log_recover.c - 1.347 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.347&r2=text&tr2=1.346&f=h fs/xfs/xfs_mount.h - 1.273 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.h.diff?r1=text&tr1=1.273&r2=text&tr2=1.272&f=h fs/xfs/xfs_mount.c - 1.441 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.441&r2=text&tr2=1.440&f=h fs/xfs/quota/xfs_qm_bhv.c - 1.29 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm_bhv.c.diff?r1=text&tr1=1.29&r2=text&tr2=1.28&f=h fs/xfs/quota/xfs_qm.h - 1.20 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm.h.diff?r1=text&tr1=1.20&r2=text&tr2=1.19&f=h fs/xfs/quota/xfs_qm.c - 1.73 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm.c.diff?r1=text&tr1=1.73&r2=text&tr2=1.72&f=h fs/xfs/linux-2.6/xfs_super.c - 1.445 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_super.c.diff?r1=text&tr1=1.445&r2=text&tr2=1.444&f=h - cleanup xfs_mountfs From owner-xfs@oss.sgi.com Tue Aug 5 22:22:31 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 22:22:34 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on 
oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m765MT2H009378 for ; Tue, 5 Aug 2008 22:22:30 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA04010; Wed, 6 Aug 2008 15:23:41 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id D436A58C52A4; Wed, 6 Aug 2008 15:23:41 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - xfs_unmountfs should return void Message-Id: <20080806052341.D436A58C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 15:23:41 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17413 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs xfs_unmountfs should return void xfs_unmountfs can't and shouldn't return errors, so declare it as returning void. 
Signed-off-by: Christoph Hellwig Date: Wed Aug 6 15:23:03 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31833a fs/xfs/xfs_mount.h - 1.274 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.h.diff?r1=text&tr1=1.274&r2=text&tr2=1.273&f=h fs/xfs/xfs_mount.c - 1.442 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.442&r2=text&tr2=1.441&f=h - xfs_unmountfs should return void From owner-xfs@oss.sgi.com Tue Aug 5 22:31:20 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 22:31:23 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m765VI9U010699 for ; Tue, 5 Aug 2008 22:31:19 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA04277; Wed, 6 Aug 2008 15:32:30 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 7444A58C52A4; Wed, 6 Aug 2008 15:32:30 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - don't call xfs_freesb from xfs_unmountfs Message-Id: <20080806053230.7444A58C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 15:32:30 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17414 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com 
Precedence: bulk X-list: xfs don't call xfs_freesb from xfs_unmountfs xfs_readsb is called before xfs_mount so xfs_freesb should be called after xfs_unmountfs, too. This means it now happens after a few things during the of xfs_unmount which all have nothing to do with the superblock. Signed-off-by: Christoph Hellwig Date: Wed Aug 6 15:31:50 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31835a fs/xfs/xfs_mount.c - 1.443 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.443&r2=text&tr2=1.442&f=h fs/xfs/linux-2.6/xfs_super.c - 1.446 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_super.c.diff?r1=text&tr1=1.446&r2=text&tr2=1.445&f=h - don't call xfs_freesb from xfs_unmountfs From owner-xfs@oss.sgi.com Tue Aug 5 22:36:57 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 22:37:01 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m765aubW011896 for ; Tue, 5 Aug 2008 22:36:57 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA04392; Wed, 6 Aug 2008 15:38:08 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 6790658C52A4; Wed, 6 Aug 2008 15:38:08 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - refactor xfs_mount_free Message-Id: <20080806053808.6790658C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 15:38:08 +1000 (EST) From: 
lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17415 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs refactor xfs_mount_free xfs_mount_free mostly frees the perag data, which is something that is duplicated in the mount error path. Move the XFS_QM_DONE call to the caller and remove the useless mutex_destroy/spinlock_destroy calls so that we can re-use it for the mount error path. Also rename it to xfs_free_perag to reflect what it does. Signed-off-by: Christoph Hellwig Date: Wed Aug 6 15:37:28 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31836a fs/xfs/xfs_mount.c - 1.444 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.444&r2=text&tr2=1.443&f=h - refactor xfs_mount_free From owner-xfs@oss.sgi.com Tue Aug 5 22:40:11 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 22:40:13 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m765e9kx012776 for ; Tue, 5 Aug 2008 22:40:10 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA04464; Wed, 6 Aug 2008 15:41:21 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id CB2F258C52A4; Wed, 6 Aug 2008 15:41:21 +1000 (EST) To: 
sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Message-Id: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 15:41:21 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17416 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs use KM_MAYFAIL in xfs_mountfs Use KM_MAYFAIL for the m_perag allocation, we can deal with the error easily and blocking forever during mount is not a good idea either. Signed-off-by: Christoph Hellwig Date: Wed Aug 6 15:40:44 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: hch lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31837a fs/xfs/xfs_mount.c - 1.445 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.445&r2=text&tr2=1.444&f=h - use KM_MAYFAIL in xfs_mountfs From owner-xfs@oss.sgi.com Tue Aug 5 23:03:30 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 23:03:39 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m7663ScW015594 for ; Tue, 5 Aug 2008 23:03:29 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA04934; Wed, 6 Aug 2008 16:04:40 +1000 Message-ID: <4899406D.5020802@sgi.com> Date: Wed, 06 Aug 2008 16:10:53 +1000 From: Lachlan McIlroy 
Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> <489806C2.7020200@sgi.com> <20080805084220.GF21635@disturbed> <48990C4E.9070102@sgi.com> <20080806052053.GU6119@disturbed> In-Reply-To: <20080806052053.GU6119@disturbed> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17417 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Dave Chinner wrote: > On Wed, Aug 06, 2008 at 12:28:30PM +1000, Lachlan McIlroy wrote: >> Dave Chinner wrote: >>> On Tue, Aug 05, 2008 at 05:52:34PM +1000, Lachlan McIlroy wrote: >>>> Dave Chinner wrote: >>>>> On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: >>>>>> Currently by the time we get to vn_iowait() in xfs_reclaim() we have already >>>>>> gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O >>>>> xfs_free()? What's that? >>>> Sorry that should have been xfs_ifree() (we set the inode's mode to >>>> zero in there). >>>> >>>>>> completions still running (file size updates and unwritten extent conversions) >>>>>> may be working on an inode that is no longer valid. >>>>> The linux inode does not get freed until after ->clear_inode >>>>> completes, hence it is perfectly valid to reference it anywhere >>>>> in the ->clear_inode path. >>>> The problem I see is an assert in xfs_setfilesize() fail: >>>> >>>> ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFREG); >>>> >>>> The mode of the XFS inode is zero at this time. 
>>> Ok, so the question has to be why is there I/O still in progress >>> after the truncate is supposed to have already occurred and the >>> vn_iowait() in xfs_itruncate_start() been executed. >>> >>> Something doesn't add up here - you can't be doing I/O on a file >>> with no extents or delalloc blocks, hence that means we should be >>> passing through the truncate path in xfs_inactive() before we >>> call xfs_ifree() and therefore doing the vn_iowait().. >>> >>> Hmmmm - the vn_iowait() is conditional based on: >>> >>> /* wait for the completion of any pending DIOs */ >>> if (new_size < ip->i_size) >>> vn_iowait(ip); >>> >>> We are truncating to zero (new_size == 0), so the only case where >>> this would not wait is if ip->i_size == 0. Still - I can't see >>> how we'd be doing I/O on an inode with a zero i_size. I suspect >>> ensuring we call vn_iowait() if newsize == 0 as well would fix >>> the problem. If not, there's something much more subtle going >>> on here that we should understand.... >> If we make the vn_iowait() unconditional we might re-introduce the >> NFS exclusivity bug that killed performance. That was through >> xfs_release()->xfs_free_eofblocks()->xfs_itruncate_start(). > > It won't reintroduce that problem because ->clear_inode() > is not called on every NFS write operation. Yes, but xfs_itruncate_start() can be called from every NFS write, so modifying the above code will re-introduce the problem. > >> So if we leave the above code as is then we need another >> vn_iowait() in xfs_inactive() to catch any remaining workqueue >> items that we didn't wait for in xfs_itruncate_start(). > > How do we have any new *data* I/O at all in progress at this point? It's not new data I/O. It's workqueue items that have been queued from previous I/Os that are still outstanding. > That does not explain why we need an additional vn_iowait() call. > All I see from this is a truncate race that has something to do with > the vn_iowait() call being conditional. 
> > That is, if we truncate to zero, then the current code in > xfs_itruncate_start() should wait unconditionally for *all* I/O to > complete because, by definition, all that I/O is beyond the new EOF > and we have to wait for it to complete before truncating the file. That makes sense. If new_size is zero and ip->i_size is not then we will wait. If ip->i_size is also zero we will not wait, but if the file size is already zero there should not be any I/Os in progress and therefore no workqueue items outstanding. > Seeing as we are in ->clear_inode(), no new data I/O can start > while we are deep in this code, hence we should not be seeing > I/O completions after the truncate starts and vn_iowait() has > completed. > > Hence we need to know, firstly, if the truncate code has been > called; secondly, what the values of i_size and i_new_size were when > the truncate was started and, finally, whether ip->i_iocount was > non-zero when the truncate was run. That is, we need to gather > enough data to determine whether we should have waited in the > truncate but didn't. > > If either the vn_iowait() in the truncate path is not sufficient, or > the truncate code is not being called, there is *some other bug* > that we don't yet understand. Adding an unconditional vn_iowait() > appears to me to be fixing a symptom, not the underlying cause of > the problem.... I'm not adding a new call to vn_iowait(). I'm just moving the existing one from xfs_ireclaim() so that we wait for all I/O to complete before we tear the inode down. 
Here's some info from the bug: Stack traceback for pid 272 0xffff81007f3ea600 272 2 1 3 R 0xffff81007f3ea950 *xfsdatad/3 rsp rip Function (args) 0xffff81007e275e28 0xffffffff803b8cd0 assfail+0x1a (invalid, invalid, invalid) 0xffff81007e275e58 0xffffffff803adaad xfs_setfilesize+0x3d (0xffff8100383de7f8) 0xffff81007e275e78 0xffffffff803adc28 xfs_end_bio_written+0x10 (invalid) 0xffff81007e275e88 0xffffffff8023cf9a run_workqueue+0xdf (0xffff81007e7d0070) 0xffff81007e275ed8 0xffffffff8023da8f worker_thread+0xd8 (0xffff81007e7d0070) 0xffff81007e275f28 0xffffffff80240314 kthread+0x47 (invalid) 0xffff81007e275f48 0xffffffff8020bd08 child_rip+0xa (invalid, invalid) <5>Filesystem "sdb1": Disabling barriers, not supported with external log device <5>XFS mounting filesystem sdb1 <7>Ending clean XFS mount for filesystem: sdb1 <4>Assertion failed: (ip->i_d.di_mode & S_IFMT) == S_IFREG, file: fs/xfs/linux-2.6/xfs_aops.c, line: 178 <0>------------[ cut here ]------------ <2>kernel BUG at fs/xfs/support/debug.c:81! <0>invalid opcode: 0000 [1] SMP [3]kdb> md8c20 0xffff8100383de7f8 0xffff8100383de7f8 0000000000000000 0000000000000020 ........ ....... 0xffff8100383de808 5a5a5a5a00000000 ffff810054062048 ....ZZZZH .T.... 0xffff8100383de818 0000000000000000 0000000000000000 ................ 0xffff8100383de828 0000000000003400 00000000000fe200 .4.............. 0xffff8100383de838 ffff81007e7d0070 ffff8100383de840 p.}~....@.=8.... 0xffff8100383de848 ffff8100383de840 ffffffff803adc18 @.=8......:..... 0xffff8100383de858 ffffffff80b162a0 0000000000000000 .b.............. 0xffff8100383de868 ffffffff80784c51 d84156c5635688c0 QLx.......Vc.VA. 0xffff8100383de878 ffffffff8025d1f6 09f911029d74e35b ..%.....[.t..... 0xffff8100383de888 6b6b6b6b6b6b6b6b 6b6b6b6b6b6b6b6b kkkkkkkkkkkkkkkk I think io_type here is 0x20 which is IOMAP_UNWRITTEN. Since we've come through xfs_end_bio_written() (not xfs_end_bio_unwritten()) this must be a direct I/O write to a written extent. 
[3]kdb> inode ffff810054062048 struct inode at 0xffff810054062048 i_ino = 80848951 i_count = 0 i_size 0 i_mode = 0100666 i_nlink = 0 i_rdev = 0x0 i_hash.nxt = 0x0000000000000000 i_hash.pprev = 0xffffc20000208518 i_list.nxt = 0xffff810054062048 i_list.prv = 0xffff810054062048 i_dentry.nxt = 0xffff810054061fe0 i_dentry.prv = 0xffff810054061fe0 i_sb = 0xffff81006d5b0508 i_op = 0xffffffff806647a0 i_data = 0xffff810054062230 nrpages = 0 i_fop= 0xffffffff806644e0 i_flock = 0x0000000000000000 i_mapping = 0xffff810054062230 i_flags 0x0 i_state 0x21 [I_DIRTY_SYNC I_FREEING] fs specific info @ 0xffff810054062418 Mode is 0100666. S_IFREG is 0100000 so linux inode would not have failed assert. So why did XFS inode fail it? From owner-xfs@oss.sgi.com Tue Aug 5 23:14:43 2008 Received: with ECARTIS (v1.0.0; list xfs); Tue, 05 Aug 2008 23:14:46 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m766EfZ4016659 for ; Tue, 5 Aug 2008 23:14:42 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA05025; Wed, 6 Aug 2008 16:15:53 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id A8D8958C52A4; Wed, 6 Aug 2008 16:15:53 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - Use KM_NOFS for debug trace buffers Message-Id: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 16:15:53 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17418 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Use KM_NOFS for debug trace buffers Use KM_NOFS to prevent recursion back into the filesystem which can cause deadlocks. In the case of xfs_iread() we hold the lock on the inode cluster buffer while allocating memory for the trace buffers. If we recurse back into XFS to flush data that may require a transaction to allocate extents which needs log space. This can deadlock with the xfsaild thread which can't push the tail of the log because it is trying to get the inode cluster buffer lock. Date: Wed Aug 6 16:15:14 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm Inspected by: david@fromorbit.com Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31838a fs/xfs/xfs_log.c - 1.362 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.c.diff?r1=text&tr1=1.362&r2=text&tr2=1.361&f=h fs/xfs/xfs_buf_item.c - 1.168 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_buf_item.c.diff?r1=text&tr1=1.168&r2=text&tr2=1.167&f=h fs/xfs/xfs_inode.c - 1.518 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.518&r2=text&tr2=1.517&f=h fs/xfs/quota/xfs_dquot.c - 1.38 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_dquot.c.diff?r1=text&tr1=1.38&r2=text&tr2=1.37&f=h fs/xfs/linux-2.6/xfs_buf.c - 1.262 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.c.diff?r1=text&tr1=1.262&r2=text&tr2=1.261&f=h fs/xfs/xfs_filestream.c - 1.9 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_filestream.c.diff?r1=text&tr1=1.9&r2=text&tr2=1.8&f=h - Use KM_NOFS for debug trace buffers From owner-xfs@oss.sgi.com Wed Aug 6 00:26:24 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 00:26:30 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 
3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m767QMmV032052 for ; Wed, 6 Aug 2008 00:26:23 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA06430; Wed, 6 Aug 2008 17:27:35 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 44625) id 1DD3858C52A4; Wed, 6 Aug 2008 17:27:35 +1000 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 981498 - Fix fallout from semaphore -> completion changes. Message-Id: <20080806072735.1DD3858C52A4@chook.melbourne.sgi.com> Date: Wed, 6 Aug 2008 17:27:35 +1000 (EST) From: lachlan@sgi.com (Lachlan McIlroy) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17419 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Fix fallout from semaphore -> completion changes. Date: Wed Aug 6 17:26:51 AEST 2008 Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-tot Inspected by: lachlan Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31839a fs/xfs/xfsidbg.c - 1.360 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfsidbg.c.diff?r1=text&tr1=1.360&r2=text&tr2=1.359&f=h - Fix fallout from semaphore -> completion changes. 
From owner-xfs@oss.sgi.com Wed Aug 6 01:50:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 01:50:15 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m768oAll008900 for ; Wed, 6 Aug 2008 01:50:11 -0700 Received: from chapter11.melbourne.sgi.com (chapter11.melbourne.sgi.com [134.14.54.96]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA08363; Wed, 6 Aug 2008 18:51:22 +1000 Received: by chapter11.melbourne.sgi.com (Postfix, from userid 16365) id E146D3D2B6DB; Wed, 6 Aug 2008 18:51:21 +1000 (EST) To: xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com Subject: TAKE 981498 - Make xfs_bmap_*_count_leaves void. Message-Id: <20080806085121.E146D3D2B6DB@chapter11.melbourne.sgi.com> Date: Wed, 6 Aug 2008 18:51:21 +1000 (EST) From: donaldd@sgi.com (Donald Douwsma) X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17420 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Make xfs_bmap_*_count_leaves void. xfs_bmap_count_leaves and xfs_bmap_disk_count_leaves always return 0, so make them void. 
Signed-off-by: Ruben Porras Date: Wed Aug 6 18:50:30 AEST 2008 Workarea: chapter11.melbourne.sgi.com:/scratch/donaldd/isms/2.6.x-xfs Inspected by: ruben.porras@linworks.de The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:31844a fs/xfs/xfs_bmap.c - 1.401 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap.c.diff?r1=text&tr1=1.401&r2=text&tr2=1.400&f=h - Make xfs_bmap_*_count_leaves void. From owner-xfs@oss.sgi.com Wed Aug 6 02:37:35 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 02:37:38 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_44 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m769bZ4k009502 for ; Wed, 6 Aug 2008 02:37:35 -0700 X-ASG-Debug-ID: 1218015527-319c00a40000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail04.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6555B363505 for ; Wed, 6 Aug 2008 02:38:48 -0700 (PDT) Received: from ipmail04.adl2.internode.on.net (ipmail04.adl2.internode.on.net [203.16.214.57]) by cuda.sgi.com with ESMTP id GXOm9dUEhU25NVpI for ; Wed, 06 Aug 2008 02:38:48 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAIALmUh5LAiF/2dsb2JhbACtUEQ X-IronPort-AV: E=Sophos;i="4.31,314,1215354600"; d="scan'208";a="175512847" Received: from ppp121-44-8-133.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.8.133]) by ipmail04.adl2.internode.on.net with ESMTP; 06 Aug 2008 19:08:45 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQfTg-0008LA-Hr; Wed, 06 Aug 2008 19:38:44 +1000 Date: Wed, 6 Aug 2008 19:38:44 +1000 From: Dave Chinner To: Lachlan 
McIlroy Cc: xfs@oss.sgi.com, xfs-dev X-ASG-Orig-Subj: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path Message-ID: <20080806093844.GZ6119@disturbed> Mail-Followup-To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> <489806C2.7020200@sgi.com> <20080805084220.GF21635@disturbed> <48990C4E.9070102@sgi.com> <20080806052053.GU6119@disturbed> <4899406D.5020802@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4899406D.5020802@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail04.adl2.internode.on.net[203.16.214.57] X-Barracuda-Start-Time: 1218015530 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1906 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17421 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Wed, Aug 06, 2008 at 04:10:53PM +1000, Lachlan McIlroy wrote: > Dave Chinner wrote: >> On Wed, Aug 06, 2008 at 12:28:30PM +1000, Lachlan McIlroy wrote: >>> Dave Chinner wrote: >>>> On Tue, Aug 05, 2008 at 05:52:34PM +1000, Lachlan McIlroy wrote: >>>>> Dave Chinner wrote: >>>>>> On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: >>>>>>> Currently by the time we get to vn_iowait() in xfs_reclaim() we have already >>>>>>> gone through 
xfs_inactive()/xfs_free() and recycled the inode. Any I/O >>>>>> xfs_free()? What's that? >>>>> Sorry that should have been xfs_ifree() (we set the inode's mode to >>>>> zero in there). >>>>> >>>>>>> completions still running (file size updates and unwritten extent conversions) >>>>>>> may be working on an inode that is no longer valid. >>>>>> The linux inode does not get freed until after ->clear_inode >>>>>> completes, hence it is perfectly valid to reference it anywhere >>>>>> in the ->clear_inode path. >>>>> The problem I see is an assert in xfs_setfilesize() fail: >>>>> >>>>> ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFREG); >>>>> >>>>> The mode of the XFS inode is zero at this time. >>>> Ok, so the question has to be why is there I/O still in progress >>>> after the truncate is supposed to have already occurred and the >>>> vn_iowait() in xfs_itruncate_start() been executed. >>>> >>>> Something doesn't add up here - you can't be doing I/O on a file >>>> with no extents or delalloc blocks, hence that means we should be >>>> passing through the truncate path in xfs_inactive() before we >>>> call xfs_ifree() and therefore doing the vn_iowait().. >>>> >>>> Hmmmm - the vn_iowait() is conditional based on: >>>> >>>> /* wait for the completion of any pending DIOs */ >>>> if (new_size < ip->i_size) >>>> vn_iowait(ip); >>>> >>>> We are truncating to zero (new_size == 0), so the only case where >>>> this would not wait is if ip->i_size == 0. Still - I can't see >>>> how we'd be doing I/O on an inode with a zero i_size. I suspect >>>> ensuring we call vn_iowait() if newsize == 0 as well would fix >>>> the problem. If not, there's something much more subtle going >>>> on here that we should understand.... >>> If we make the vn_iowait() unconditional we might re-introduce the >>> NFS exclusivity bug that killed performance. That was through >>> xfs_release()->xfs_free_eofblocks()->xfs_itruncate_start(). 
>> >> It won't reintroduce that problem because ->clear_inode() >> is not called on every NFS write operation. > Yes but xfs_itruncate_start() can be called from every NFS write so > modifying the above code will re-introduce the problem. Ah, no. The case here is new_size == 0, which will almost never be the case in the ->release call... > >> >>> So if we leave the above code as is then we need another >>> vn_iowait() in xfs_inactive() to catch any remaining workqueue >>> items that we didn't wait for in xfs_itruncate_start(). >> >> How do we have any new *data* I/O at all in progress at this point? > It's not new data I/O. Then why isn't it being caught by the vn_iowait() in the truncate code????? > It's workqueue items that have been queued > from previous I/Os that are still outstanding. The iocount is decremented when the completion is finished, not when it is queued. Hence vn_iowait() should be taking this case into account. >> That does not explain why we need an additional vn_iowait() call. >> All I see from this is a truncate race that has something to do with >> the vn_iowait() call being conditional. >> >> That is, if we truncate to zero, then the current code in >> xfs_itruncate_start() should wait unconditionally for *all* I/O to >> complete because, by definition, all that I/O is beyond the new EOF >> and we have to wait for it to complete before truncating the file. > That makes sense. If new_size is zero and ip->i_size is not then we > will wait. If ip->i_size is also zero we will not wait but if the > file size is already zero there should not be any I/Os in progress > and therefore no workqueue items outstanding. I note from the debug below that the Linux inode size is zero, but you didn't include the dump of the xfs inode so we can't see what the other variables are. Also, i_size is updated after write I/O is dispatched. If we are doing synchronous I/O, then i_size is not updated until after the I/O completes (in xfs_write()).
Hence we could have the situation of I/O being run while i_size = 0. This is why I wanted to know what i_new_size is, because that gets set before the I/O is issued. If i_new_size is non-zero and i_size is zero, that tends to imply the conditional vn_iowait() in the truncate path needs to take MAX(ip->i_size, ip->i_new_size) for the check, not just ip->i_size... FWIW, from the dump below, we have: typedef struct xfs_ioend { struct xfs_ioend *io_list = NULL unsigned int io_type = 0x20 = IOMAP_UNWRITTEN int io_error = 0 atomic_t io_remaining = 0; struct inode *io_inode = 0xffff810054062048 struct buffer_head *io_buffer_head = NULL struct buffer_head *io_buffer_tail = NULL size_t io_size = 0x3400 xfs_off_t io_offset = 0xfe200 struct work_struct io_work; /* xfsdatad work queue */ } xfs_ioend_t; So the I/O was not even close to offset zero. Also, the stack trace says it came through the written path but the io_type says unwritten, which says that there's something fishy here. Either the stack trace is wrong, or there's been a memory corruption.... >> If either the vn_iowait() in the truncate path is not sufficient, or >> the truncate code is not being called, there is *some other bug* >> that we don't yet understand. Adding an unconditional vn_iowait() >> appears to me to be fixing a symptom, not the underlying cause of >> the problem.... > > I'm not adding a new call to vn_iowait(). I'm just moving the > existing one from xfs_ireclaim() so that we wait for all I/O to > complete before we tear the inode down. Yes, but that is there to catch inodes with non-zero link counts because we are not doing a truncate in that case. We still need to get to the bottom of why the truncate is not waiting for all I/O. Cheers, Dave.
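To illustrate the MAX(ip->i_size, ip->i_new_size) suggestion above, here is a minimal userspace sketch in C. It is not the XFS code: struct sketch_inode, must_wait_current() and must_wait_proposed() are hypothetical names made up for this example, and only the two comparisons themselves are taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two inode fields discussed in the thread. */
struct sketch_inode {
	int64_t i_size;     /* current in-core file size */
	int64_t i_new_size; /* size an in-flight write will extend to, 0 if none */
};

/* Helper for the example only; mirrors the MAX() mentioned in the thread. */
int64_t sketch_max(int64_t a, int64_t b)
{
	return a > b ? a : b;
}

/*
 * The check as quoted from xfs_itruncate_start():
 *     if (new_size < ip->i_size) vn_iowait(ip);
 * Returns true when the caller would wait for outstanding I/O.
 */
bool must_wait_current(const struct sketch_inode *ip, int64_t new_size)
{
	assert(new_size >= 0);
	return new_size < ip->i_size;
}

/*
 * The variant suggested in the thread (an assumption about how it would
 * be written): also consider i_new_size, which is set before the I/O is
 * issued, so a write in flight is seen even while i_size is still zero.
 */
bool must_wait_proposed(const struct sketch_inode *ip, int64_t new_size)
{
	assert(new_size >= 0);
	return new_size < sketch_max(ip->i_size, ip->i_new_size);
}
```

With i_size == 0 and a write in flight that has already set i_new_size, a truncate to zero skips the wait under the quoted check but waits under the MAX() variant. Whether that window is the real cause of the assert is still open at this point in the thread.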
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Wed Aug 6 10:11:06 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 10:11:32 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76HB5Qc029677 for ; Wed, 6 Aug 2008 10:11:06 -0700 X-ASG-Debug-ID: 1218042739-10f0039e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wr-out-0506.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8917EF0BAE7 for ; Wed, 6 Aug 2008 10:12:19 -0700 (PDT) Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.228]) by cuda.sgi.com with ESMTP id 2LUX5fjDZvbYLq8E for ; Wed, 06 Aug 2008 10:12:19 -0700 (PDT) Received: by wr-out-0506.google.com with SMTP id 57so16231wri.12 for ; Wed, 06 Aug 2008 10:12:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type:references; bh=oXhd3wRalpHIU3gV2lrKE0VOtPMq08+mQxNUn5iKik0=; b=RPW1zgLmp7rJ9bdM6FdaqKkAeeIhY0kyNrSTimBBXHDy07Y2GN15tuUD2hgNM4x3ts H6fQt0jY34nryo9k0fq6fS7WKJr+/bCLKvgjYVRmoU9vjRnHxZqjXYEFGmfR3L2ekhWD jbn0kxphNT4uAKyAZP5zLSaM0CDwCeB7lm7vg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:references; b=Hi1mCCslWXTj4ru5c/2IUeQcUsE3HOHJkVWzFm+U+Av1oLhinhV8FGMZo4dUEifahO ge6vRBWN/XRiGjo5MXNz+uiRKuUbdDvjk9lm/vJofhh3TrmVtBTVLOcM7vuMQ6e9IGGA 8DgQdVtHsiskisYBQke2SCyBAaafJ8mhxZ8E4= Received: by 10.90.33.5 with SMTP id g5mr3441010agg.113.1218042738227; Wed, 06 Aug 2008 10:12:18 -0700 (PDT) Received: by 10.90.115.11 with 
HTTP; Wed, 6 Aug 2008 10:12:15 -0700 (PDT) Message-ID: Date: Wed, 6 Aug 2008 22:42:15 +0530 From: "Bhagi rathi" To: "Lachlan McIlroy" X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Cc: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com In-Reply-To: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> MIME-Version: 1.0 References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> X-Barracuda-Connect: wr-out-0506.google.com[64.233.184.228] X-Barracuda-Start-Time: 1218042740 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1938 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 3263 X-archive-position: 17422 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs I couldn't get a chance to read the diff's completely. If I click on Lachlan's url for diff's, I couldn't access them. It looks to me that the issue is not just with trace buffers. It can extend to xfs_iformat as well. The same dead-lock can spring via xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> xfs_iext_inline_to_direct -> which can do kmem_alloc with KM_SLEEP flag. The source of the problem is that holding a lock and entering into file-system once again. 
This can lead to dead-lock on the same clustered buffer during cleaning of log space. Cheers, Bhagi. On Wed, Aug 6, 2008 at 11:45 AM, Lachlan McIlroy wrote: > Use KM_NOFS for debug trace buffers > > Use KM_NOFS to prevent recursion back into the filesystem which can > cause deadlocks. > > In the case of xfs_iread() we hold the lock on the inode cluster buffer > while allocating memory for the trace buffers. If we recurse back into > XFS to flush data, that may require a transaction to allocate extents, > which needs log space. This can deadlock with the xfsaild thread which > can't push the tail of the log because it is trying to get the inode > cluster buffer lock. > > Date: Wed Aug 6 16:15:14 AEST 2008 > Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm > Inspected by: david@fromorbit.com > Author: lachlan > > The following file(s) were checked into: > longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb > > > Modid: xfs-linux-melb:xfs-kern:31838a > fs/xfs/xfs_log.c - 1.362 - changed > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.c.diff?r1=text&tr1=1.362&r2=text&tr2=1.361&f=h > fs/xfs/xfs_buf_item.c - 1.168 - changed > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_buf_item.c.diff?r1=text&tr1=1.168&r2=text&tr2=1.167&f=h > fs/xfs/xfs_inode.c - 1.518 - changed > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.518&r2=text&tr2=1.517&f=h > fs/xfs/quota/xfs_dquot.c - 1.38 - changed > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_dquot.c.diff?r1=text&tr1=1.38&r2=text&tr2=1.37&f=h > fs/xfs/linux-2.6/xfs_buf.c - 1.262 - changed > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.c.diff?r1=text&tr1=1.262&r2=text&tr2=1.261&f=h > fs/xfs/xfs_filestream.c - 1.9 - changed > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_filestream.c.diff?r1=text&tr1=1.9&r2=text&tr2=1.8&f=h > - 
Use KM_NOFS for debug trace buffers > > > > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Wed Aug 6 10:21:25 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 10:21:28 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76HLPuH002542 for ; Wed, 6 Aug 2008 10:21:25 -0700 X-ASG-Debug-ID: 1218043359-7b3b00830000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wr-out-0506.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6D3F2365BCF for ; Wed, 6 Aug 2008 10:22:39 -0700 (PDT) Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.234]) by cuda.sgi.com with ESMTP id 5I2eVLR70zy4dREz for ; Wed, 06 Aug 2008 10:22:39 -0700 (PDT) Received: by wr-out-0506.google.com with SMTP id 57so20820wri.12 for ; Wed, 06 Aug 2008 10:22:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type:references; bh=24q5A6b9S0FYMxRmAAnqG8biWTe0WdsP2IgVJGF0wDk=; b=ELTVHHU4WTxpuel3InfBaAZIU4hvcIkxWxEkKxxD5ymv7n5dCt+bZEAJrzKUx5ONc7 7R/sFdD5sVldBroBWj8TEBcu/4WvzWfkXeZ2q7xOEDxpDPW7XBeGv7yvvLeycXic5Fsl fz9fyvgWam6LPO6oHaKemqS2Bw/b1jwS1Gn1Q= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:references; b=spncBunkRR/OMS3i54AoXry1UUghrA4Fs/38C9Nfhj+dLxIDJ/F8MQVZFpPks1HkQ/ g5dOkh9ZuGAyMJwhGl/SSMIByZFFYTGdg7ezYAP9zZfEsjyO8gApd+1i1OT5/1Z3pB17 uGqVv4BVVDpGloc/j2orJad3Z8+1U4PxcyZfE= Received: by 10.90.118.19 with SMTP id q19mr3488663agc.62.1218043359008; Wed, 06 Aug 2008 
10:22:39 -0700 (PDT) Received: by 10.90.115.11 with HTTP; Wed, 6 Aug 2008 10:22:38 -0700 (PDT) Message-ID: Date: Wed, 6 Aug 2008 22:52:38 +0530 From: "Bhagi rathi" To: "Lachlan McIlroy" X-ASG-Orig-Subj: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Subject: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Cc: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com In-Reply-To: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> MIME-Version: 1.0 References: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> X-Barracuda-Connect: wr-out-0506.google.com[64.233.184.234] X-Barracuda-Start-Time: 1218043360 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0098 1.0000 -1.9570 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.96 X-Barracuda-Spam-Status: No, SCORE=-1.96 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1938 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1332 X-archive-position: 17423 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs Why are we going to block for ever? Mounting a file-system requires in-core log space buffers, reading of other buffers which needs allocation of memory greater than per ag structures. I am trying to understand why xfs_perag_t? Mount/Unmount are not frequent activities, it is better for them to succeed if operating system can allocate memory and take them forward. -Saradhi. 
On Wed, Aug 6, 2008 at 11:11 AM, Lachlan McIlroy wrote: > use KM_MAYFAIL in xfs_mountfs > > Use KM_MAYFAIL for the m_perag allocation, we can deal with the error > easily and blocking forever during mount is not a good idea either. > > > Signed-off-by: Christoph Hellwig > > Date: Wed Aug 6 15:40:44 AEST 2008 > Workarea: redback.melbourne.sgi.com:/home/lachlan/isms/2.6.x-mm > Inspected by: > hch > lachlan > Author: lachlan > > The following file(s) were checked into: > longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb > > > Modid: xfs-linux-melb:xfs-kern:31837a > fs/xfs/xfs_mount.c - 1.445 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/> xfs_mount.c.diff?r1=text&tr1=1.445&r2=text&tr2=1.444&f=h > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.445&r2=text&tr2=1.444&f=h > - use KM_MAYFAIL in xfs_mountfs > > > > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Wed Aug 6 12:55:12 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 12:55:18 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76JtCOq007239 for ; Wed, 6 Aug 2008 12:55:12 -0700 X-ASG-Debug-ID: 1218052579-4ce103210002-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id ABBED366F43; Wed, 6 Aug 2008 12:56:26 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com with ESMTP id oxZmx4OiL4lAgh0q; Wed, 06 Aug 2008 12:56:26 -0700 (PDT) X-ASG-Whitelist: Barracuda Reputation Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m76JuIOp011750; 
Wed, 6 Aug 2008 15:56:18 -0400 Received: from file.rdu.redhat.com (file.rdu.redhat.com [10.11.255.147]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m76JuIJf007152; Wed, 6 Aug 2008 15:56:18 -0400 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by file.rdu.redhat.com (8.13.1/8.13.1) with ESMTP id m76JuHJP005982; Wed, 6 Aug 2008 15:56:18 -0400 Message-ID: <489A01E1.1040309@sandeen.net> Date: Wed, 06 Aug 2008 14:56:17 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.14 (X11/20080501) MIME-Version: 1.0 To: Bhagi rathi CC: Lachlan McIlroy , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Barracuda-Connect: mx1.redhat.com[66.187.233.31] X-Barracuda-Start-Time: 1218052587 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17425 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Bhagi rathi wrote: > I couldn't get a chance to read the diff's completely. If I click on > Lachlan's url for diff's, I couldn't access them. Try again, it takes a while for cvs to catch up. -Eric > It looks to me that > the issue is not just with trace buffers. It can extend to xfs_iformat > as well. The same dead-lock can spring via > > xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> > xfs_iext_inline_to_direct -> which can do kmem_alloc with > KM_SLEEP flag. > > > The source of the problem is that holding a lock and entering into > file-system once again. 
This can lead to dead-lock on the same > clustered buffer during cleaning of log space. > > Cheers, > Bhagi. From owner-xfs@oss.sgi.com Wed Aug 6 12:54:22 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 12:54:24 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.7 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76JsM14006847 for ; Wed, 6 Aug 2008 12:54:22 -0700 X-ASG-Debug-ID: 1218052530-2feb03010002-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9425B18307C9; Wed, 6 Aug 2008 12:55:35 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com with ESMTP id yqFZRiMUGUICuoTq; Wed, 06 Aug 2008 12:55:35 -0700 (PDT) X-ASG-Whitelist: Barracuda Reputation Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m76JtUcF011425; Wed, 6 Aug 2008 15:55:30 -0400 Received: from file.rdu.redhat.com (file.rdu.redhat.com [10.11.255.147]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m76JtTEE006722; Wed, 6 Aug 2008 15:55:29 -0400 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by file.rdu.redhat.com (8.13.1/8.13.1) with ESMTP id m76JtSeF005571; Wed, 6 Aug 2008 15:55:29 -0400 Message-ID: <489A01B0.5050606@sandeen.net> Date: Wed, 06 Aug 2008 14:55:28 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.14 (X11/20080501) MIME-Version: 1.0 To: Bhagi rathi CC: Lachlan McIlroy , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Subject: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs References: 
<20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Barracuda-Connect: mx1.redhat.com[66.187.233.31] X-Barracuda-Start-Time: 1218052537 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17424 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Bhagi rathi wrote: > Why are we going to block for ever? Mounting a file-system > requires in-core log space buffers, reading of other buffers > which needs allocation of memory greater than per ag > structures. > > I am trying to understand why xfs_perag_t? Mount/Unmount > are not frequent activities, it is better for them to succeed > if operating system can allocate memory and take them > forward. But that's the big if, right? If the system is so starved that you can't get this memory to even start the mount process, I'm sure it's better to fail the mount with -ENOMEM than to add to the current system memory stress. In general KM_MAYFAIL sounds like a good plan when you can handle the failure gracefully, I think. 
-Eric From owner-xfs@oss.sgi.com Wed Aug 6 13:18:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 13:18:50 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76KIjn6016195 for ; Wed, 6 Aug 2008 13:18:47 -0700 X-ASG-Debug-ID: 1218053999-5b9402e70000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail05.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A33D6367340 for ; Wed, 6 Aug 2008 13:19:59 -0700 (PDT) Received: from ipmail05.adl2.internode.on.net (ipmail05.adl2.internode.on.net [203.16.214.145]) by cuda.sgi.com with ESMTP id 8eR20UH91HzjWsQ4 for ; Wed, 06 Aug 2008 13:19:59 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAKyimUh5LAMb/2dsb2JhbACuEw X-IronPort-AV: E=Sophos;i="4.31,316,1215354600"; d="scan'208";a="175864780" Received: from ppp121-44-3-27.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.3.27]) by ipmail05.adl2.internode.on.net with ESMTP; 07 Aug 2008 05:49:57 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQpUD-0007k2-8q; Thu, 07 Aug 2008 06:19:57 +1000 Date: Thu, 7 Aug 2008 06:19:57 +1000 From: Dave Chinner To: Bhagi rathi Cc: Lachlan McIlroy , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Message-ID: <20080806201957.GQ21635@disturbed> Mail-Followup-To: Bhagi rathi , Lachlan McIlroy , xfs@oss.sgi.com References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.18 (2008-05-17) 
X-Barracuda-Connect: ipmail05.adl2.internode.on.net[203.16.214.145] X-Barracuda-Start-Time: 1218054000 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0069 1.0000 -1.9756 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.98 X-Barracuda-Spam-Status: No, SCORE=-1.98 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1950 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17426 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Wed, Aug 06, 2008 at 10:42:15PM +0530, Bhagi rathi wrote: > I couldn't get a chance to read the diff's completely. If I click on > Lachlan's url for diff's, I couldn't access them. It looks to me that > the issue is not just with trace buffers. It can extend to xfs_iformat > as well. The same dead-lock can spring via > > xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> > xfs_iext_inline_to_direct -> which can do kmem_alloc with > KM_SLEEP flag. Fixed already: Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Wed Aug 6 13:21:44 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 13:21:46 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76KLiew017598 for ; Wed, 6 Aug 2008 13:21:44 -0700 X-ASG-Debug-ID: 1218054178-5b9603140000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail05.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id D3EEA36736D for ; Wed, 6 Aug 2008 13:22:59 -0700 (PDT) Received: from ipmail05.adl2.internode.on.net (ipmail05.adl2.internode.on.net [203.16.214.145]) by cuda.sgi.com with ESMTP id Vp9D30rlruD0RTgm for ; Wed, 06 Aug 2008 13:22:59 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAKyimUh5LAMb/2dsb2JhbACuEw X-IronPort-AV: E=Sophos;i="4.31,316,1215354600"; d="scan'208";a="175866215" Received: from ppp121-44-3-27.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.3.27]) by ipmail05.adl2.internode.on.net with ESMTP; 07 Aug 2008 05:52:57 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQpX6-0007oN-5G; Thu, 07 Aug 2008 06:22:56 +1000 Date: Thu, 7 Aug 2008 06:22:56 +1000 From: Dave Chinner To: Eric Sandeen Cc: Bhagi rathi , Lachlan McIlroy , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Subject: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Message-ID: <20080806202256.GR21635@disturbed> Mail-Followup-To: Eric Sandeen , Bhagi rathi , Lachlan McIlroy , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com References: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> <489A01B0.5050606@sandeen.net> 
MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <489A01B0.5050606@sandeen.net> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail05.adl2.internode.on.net[203.16.214.145] X-Barracuda-Start-Time: 1218054179 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1950 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17427 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Wed, Aug 06, 2008 at 02:55:28PM -0500, Eric Sandeen wrote: > Bhagi rathi wrote: > > Why are we going to block for ever? Mounting a file-system > > requires in-core log space buffers, reading of other buffers > > which needs allocation of memory greater than per ag > > structures. ..... > In general KM_MAYFAIL sounds like a good plan when you can handle the > failure gracefully, I think. Yes, and that is the long term plan - to remove all KM_SLEEP allocations from XFS and allow them to fail gracefully. There's lots of work needed before we get there, though. e.g. right now we cannot survive an ENOMEM error in a transaction.... Cheers, Dave. 
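The "fail gracefully" pattern discussed above can be sketched in userspace C as follows. This is an illustration only, not the xfs_mountfs() code: struct sketch_mount, sketch_init_perag(), the allocator callback and the 64-byte per-AG placeholder size are all assumptions made for the example. The only point taken from the thread is that a failed allocation is turned into -ENOMEM for the caller instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the mount structure; not the real xfs_mount_t. */
struct sketch_mount {
	size_t agcount; /* number of allocation groups */
	void  *perag;   /* per-AG array, NULL until allocated */
};

/* Allocator callback so the failure path can be exercised in a test. */
typedef void *(*sketch_alloc_fn)(size_t nmemb, size_t size);

/* Test allocator that always fails, modelling memory exhaustion. */
void *sketch_alloc_fail(size_t nmemb, size_t size)
{
	(void)nmemb;
	(void)size;
	return NULL;
}

/*
 * Allocate the per-AG array. On failure, report -ENOMEM to the caller,
 * which can then fail the mount cleanly rather than sleeping forever.
 */
int sketch_init_perag(struct sketch_mount *mp, sketch_alloc_fn alloc)
{
	assert(mp->agcount > 0);

	mp->perag = alloc(mp->agcount, 64 /* placeholder per-AG size */);
	if (mp->perag == NULL)
		return -ENOMEM;
	return 0;
}
```

Passing calloc as the allocator gives the normal path; passing sketch_alloc_fail shows the error being handed back to the caller. As noted in the thread, this only works where the caller can actually unwind, which is not yet the case inside a transaction.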
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Wed Aug 6 13:26:34 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 13:26:36 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76KQXOZ019479 for ; Wed, 6 Aug 2008 13:26:34 -0700 X-ASG-Debug-ID: 1218054467-23f4003b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail05.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9344F1994535 for ; Wed, 6 Aug 2008 13:27:47 -0700 (PDT) Received: from ipmail05.adl2.internode.on.net (ipmail05.adl2.internode.on.net [203.16.214.145]) by cuda.sgi.com with ESMTP id 5PaRPuTHzR6V0Zz5 for ; Wed, 06 Aug 2008 13:27:47 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAHqmmUh5LAMb/2dsb2JhbACtfw X-IronPort-AV: E=Sophos;i="4.31,316,1215354600"; d="scan'208";a="175868346" Received: from ppp121-44-3-27.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.3.27]) by ipmail05.adl2.internode.on.net with ESMTP; 07 Aug 2008 05:57:46 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQpbm-00082F-CG; Thu, 07 Aug 2008 06:27:46 +1000 Date: Thu, 7 Aug 2008 06:27:46 +1000 From: Dave Chinner To: Bhagi rathi , Lachlan McIlroy , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Message-ID: <20080806202746.GC6119@disturbed> Mail-Followup-To: Bhagi rathi , Lachlan McIlroy , xfs@oss.sgi.com References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> <20080806201957.GQ21635@disturbed> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: 
inline In-Reply-To: <20080806201957.GQ21635@disturbed> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail05.adl2.internode.on.net[203.16.214.145] X-Barracuda-Start-Time: 1218054468 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0010 1.0000 -2.0143 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.01 X-Barracuda-Spam-Status: No, SCORE=-2.01 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1951 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17428 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Thu, Aug 07, 2008 at 06:19:57AM +1000, Dave Chinner wrote: > On Wed, Aug 06, 2008 at 10:42:15PM +0530, Bhagi rathi wrote: > > I couldn't get a chance to read the diff's completely. If I click on > > Lachlan's url for diff's, I couldn't access them. It looks to me that > > the issue is not just with trace buffers. It can extend to xfs_iformat > > as well. The same dead-lock can spring via > > > > xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> > > xfs_iext_inline_to_direct -> which can do kmem_alloc with > > KM_SLEEP flag. > > Fixed already: > > Hmmm. where did that url go? Try again: Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Wed Aug 6 14:02:45 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 14:02:47 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m76L2iwN032358 for ; Wed, 6 Aug 2008 14:02:44 -0700 X-ASG-Debug-ID: 1218056638-1f1a02bc0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail05.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A0E931994CC8 for ; Wed, 6 Aug 2008 14:03:58 -0700 (PDT) Received: from ipmail05.adl2.internode.on.net (ipmail05.adl2.internode.on.net [203.16.214.145]) by cuda.sgi.com with ESMTP id SEv9z2kmaYru9Ula for ; Wed, 06 Aug 2008 14:03:58 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEAHqtmUh5LAMb/2dsb2JhbACuCQ X-IronPort-AV: E=Sophos;i="4.31,316,1215354600"; d="scan'208";a="175881309" Received: from ppp121-44-3-27.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.3.27]) by ipmail05.adl2.internode.on.net with ESMTP; 07 Aug 2008 06:33:56 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KQqAl-0000d4-Kp; Thu, 07 Aug 2008 07:03:55 +1000 Date: Thu, 7 Aug 2008 07:03:55 +1000 From: Dave Chinner To: Bhagi rathi , Lachlan McIlroy , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Message-ID: <20080806210355.GS21635@disturbed> Mail-Followup-To: Bhagi rathi , Lachlan McIlroy , xfs@oss.sgi.com References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> <20080806201957.GQ21635@disturbed> <20080806202746.GC6119@disturbed> MIME-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline In-Reply-To: <20080806202746.GC6119@disturbed> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail05.adl2.internode.on.net[203.16.214.145] X-Barracuda-Start-Time: 1218056639 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0012 1.0000 -2.0131 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.01 X-Barracuda-Spam-Status: No, SCORE=-2.01 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1953 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17429 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Thu, Aug 07, 2008 at 06:27:46AM +1000, Dave Chinner wrote: > On Thu, Aug 07, 2008 at 06:19:57AM +1000, Dave Chinner wrote: > > On Wed, Aug 06, 2008 at 10:42:15PM +0530, Bhagi rathi wrote: > > > I couldn't get a chance to read the diff's completely. If I click on > > > Lachlan's url for diff's, I couldn't access them. It looks to me that > > > the issue is not just with trace buffers. It can extend to xfs_iformat > > > as well. The same dead-lock can spring via > > > > > > xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> > > > xfs_iext_inline_to_direct -> which can do kmem_alloc with > > > KM_SLEEP flag. > > > > Fixed already: > > > > > > Hmmm. where did that url go? Try again: Ok, something is stripping URLs out of emails. I just sent that URL to myself and it wasn't stripped so it's not my mail infrastructure that is doing it. Did someone "upgrade" the spam filters on oss.sgi.com or the barracuda overnight? 
The link - minus the "http://" bit is: oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-2.6.git;a=commit;h=8c6266658cb76e282c14cb92f8ba5a1c674f4928 let's see if that gets stripped.... Cheers, Dave. -- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Wed Aug 6 19:18:18 2008 Received: with ECARTIS (v1.0.0; list xfs); Wed, 06 Aug 2008 19:18:23 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m772II7p030212 for ; Wed, 6 Aug 2008 19:18:18 -0700 X-ASG-Debug-ID: 1218075572-275a03b20000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from slurp.thebarn.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 99682F10B87; Wed, 6 Aug 2008 19:19:32 -0700 (PDT) Received: from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by cuda.sgi.com with ESMTP id OV2n97hGGxIfofPf; Wed, 06 Aug 2008 19:19:32 -0700 (PDT) Received: from funky.thebarn.com (slurp.thebarn.com [208.42.117.201]) (authenticated bits=0) by slurp.thebarn.com (8.14.0/8.13.8) with ESMTP id m772JSw2073892; Wed, 6 Aug 2008 21:19:29 -0500 (CDT) (envelope-from cattelan@thebarn.com) Message-ID: <489A5BB0.9020900@thebarn.com> Date: Wed, 06 Aug 2008 21:19:28 -0500 From: Russell Cattelan User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Bhagi rathi , Lachlan McIlroy , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> <20080806201957.GQ21635@disturbed> <20080806202746.GC6119@disturbed> <20080806210355.GS21635@disturbed> In-Reply-To: <20080806210355.GS21635@disturbed> Content-Type: text/plain; 
charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Scanned: ClamAV 0.91.2/7966/Wed Aug 6 21:05:25 2008 on slurp.thebarn.com X-Virus-Status: Clean X-Barracuda-Connect: cattelan-host202.dsl.visi.com[208.42.117.202] X-Barracuda-Start-Time: 1218075573 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0126 1.0000 -1.9390 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.94 X-Barracuda-Spam-Status: No, SCORE=-1.94 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.1974 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 17430 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cattelan@thebarn.com Precedence: bulk X-list: xfs Dave Chinner wrote: > On Thu, Aug 07, 2008 at 06:27:46AM +1000, Dave Chinner wrote: > >> On Thu, Aug 07, 2008 at 06:19:57AM +1000, Dave Chinner wrote: >> >>> On Wed, Aug 06, 2008 at 10:42:15PM +0530, Bhagi rathi wrote: >>> >>>> I couldn't get a chance to read the diff's completely. If I click on >>>> Lachlan's url for diff's, I couldn't access them. It looks to me that >>>> the issue is not just with trace buffers. It can extend to xfs_iformat >>>> as well. The same dead-lock can spring via >>>> >>>> xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> >>>> xfs_iext_inline_to_direct -> which can do kmem_alloc with >>>> KM_SLEEP flag. >>>> >>> Fixed already: >>> >>> >>> >> Hmmm. where did that url go? Try again: >> > > Ok, something is stripping URLs out of emails. I just sent that URL > to myself and it wasn't stripped so it's not my mail infrastructure > that is doing it. > > Did someone "upgrade" the spam filters on oss.sgi.com or the > barracuda overnight? 
> not oss I'm seeing the correct url's BTW > The link - minus the "http://" bit is: > > oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-2.6.git;a=commit;h=8c6266658cb76e282c14cb92f8ba5a1c674f4928 > > let's see if that gets stripped.... > > Cheers, > > Dave. > From owner-xfs@oss.sgi.com Thu Aug 7 01:35:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 01:36:03 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_44 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with SMTP id m778ZWYf002292 for ; Thu, 7 Aug 2008 01:35:33 -0700 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA07224; Thu, 7 Aug 2008 18:36:44 +1000 Message-ID: <489AB596.1010505@sgi.com> Date: Thu, 07 Aug 2008 18:43:02 +1000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.16 (X11/20080707) MIME-Version: 1.0 To: Lachlan McIlroy , xfs@oss.sgi.com, xfs-dev Subject: Re: [PATCH] Move vn_iowait() earlier in the reclaim path References: <4897F691.6010806@sgi.com> <20080805073711.GA21635@disturbed> <489806C2.7020200@sgi.com> <20080805084220.GF21635@disturbed> <48990C4E.9070102@sgi.com> <20080806052053.GU6119@disturbed> <4899406D.5020802@sgi.com> <20080806093844.GZ6119@disturbed> In-Reply-To: <20080806093844.GZ6119@disturbed> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17431 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Dave Chinner wrote: > On 
Wed, Aug 06, 2008 at 04:10:53PM +1000, Lachlan McIlroy wrote: >> Dave Chinner wrote: >>> On Wed, Aug 06, 2008 at 12:28:30PM +1000, Lachlan McIlroy wrote: >>>> Dave Chinner wrote: >>>>> On Tue, Aug 05, 2008 at 05:52:34PM +1000, Lachlan McIlroy wrote: >>>>>> Dave Chinner wrote: >>>>>>> On Tue, Aug 05, 2008 at 04:43:29PM +1000, Lachlan McIlroy wrote: >>>>>>>> Currently by the time we get to vn_iowait() in xfs_reclaim() we have already >>>>>>>> gone through xfs_inactive()/xfs_free() and recycled the inode. Any I/O >>>>>>> xfs_free()? What's that? >>>>>> Sorry that should have been xfs_ifree() (we set the inode's mode to >>>>>> zero in there). >>>>>> >>>>>>>> completions still running (file size updates and unwritten extent conversions) >>>>>>>> may be working on an inode that is no longer valid. >>>>>>> The linux inode does not get freed until after ->clear_inode >>>>>>> completes, hence it is perfectly valid to reference it anywhere >>>>>>> in the ->clear_inode path. >>>>>> The problem I see is an assert in xfs_setfilesize() fail: >>>>>> >>>>>> ASSERT((ip->i_d.di_mode & S_IFMT) == S_IFREG); >>>>>> >>>>>> The mode of the XFS inode is zero at this time. >>>>> Ok, so the question has to be why is there I/O still in progress >>>>> after the truncate is supposed to have already occurred and the >>>>> vn_iowait() in xfs_itruncate_start() been executed. >>>>> >>>>> Something doesn't add up here - you can't be doing I/O on a file >>>>> with no extents or delalloc blocks, hence that means we should be >>>>> passing through the truncate path in xfs_inactive() before we >>>>> call xfs_ifree() and therefore doing the vn_iowait().. >>>>> >>>>> Hmmmm - the vn_iowait() is conditional based on: >>>>> >>>>> /* wait for the completion of any pending DIOs */ >>>>> if (new_size < ip->i_size) >>>>> vn_iowait(ip); >>>>> >>>>> We are truncating to zero (new_size == 0), so the only case where >>>>> this would not wait is if ip->i_size == 0. 
Still - I can't see >>>>> how we'd be doing I/O on an inode with a zero i_size. I suspect >>>>> ensuring we call vn_iowait() if newsize == 0 as well would fix >>>>> the problem. If not, there's something much more subtle going >>>>> on here that we should understand.... >>>> If we make the vn_iowait() unconditional we might re-introduce the >>>> NFS exclusivity bug that killed performance. That was through >>>> xfs_release()->xfs_free_eofblocks()->xfs_itruncate_start(). >>> It won't reintroduce that problem because ->clear_inode() >>> is not called on every NFS write operation. >> Yes but xfs_itruncate_start() can be called from every NFS write so >> modifying the above code will re-introduce the problem. > > Ah, no. The case here is new_size == 0, which will almost never be > the case in the ->release call... True. > >>>> So if we leave the above code as is then we need another >>>> vn_iowait() in xfs_inactive() to catch any remaining workqueue >>>> items that we didn't wait for in xfs_itruncate_start(). >>> How do we have any new *data* I/O at all in progress at this point? >> It's not new data I/O. > > Then why isn't is being caught by the vn_iowait() in the truncate > code????? No idea. > >> It's workqueue items that have been queued >> from previous I/Os that are still outstanding. > > The iocount is decremented when the completion is finished, not when it > is queued. Hence vn_iowait() should be taking into account this case. Hmmm. It should. > >>> That does not explain why we need an additional vn_iowait() call. >>> All I see from this is a truncate race that has somethign to do with >>> the vn_iowait() call being conditional. >>> >>> That is, if we truncate to zero, then the current code in >>> xfs_itruncate_start() should wait unconditinally for *all* I/O to >>> complete because, by definition, all that I/O is beyond the new EOF >>> and we have to wait for it to complete before truncating the file. >> That makes sense. 
If new_size is zero and ip->i_size is not then we >> will wait. If ip->i_size is also zero we will not wait but if the >> file size is already zero there should not be any I/Os in progress >> and therefore no workqueue items outstanding. > > I note from the debug below that the linux inode size is zero, > but you didn't include the dump of the xfs inode so we can't see > what the other variables are. I tried to dump the xfs inode at the time but kdb encountered a bad address. I'm pretty sure I was able to get a dump of the XFS inode on another crash - that's how I found the mode to be zero. I can't be sure now - I'll have to reproduce the problem again. > > Also, i_size is updated after write I/O is dispatched. If we are > doing synchronous I/O, then i_size is not updated until after the > I/O completes (in xfs_write()). Hence we could have the situation of > I/O being run while i_size = 0. This is why I wanted to know what > i_new_size is, because that gets set before the I/O is issued. > > if i_new_size is non-zero and i_size is zero,that tends to imply > the conditional vn_iowait() in the truncate path needs to take > MAX(ip->i_size, ip->i_new_size) for the check, not just ip->i_size... > > FWIW, from the dump below, we have: > > typedef struct xfs_ioend { > struct xfs_ioend *io_list = NULL > unsigned int io_type = 0x20 = IOMAP_UNWRITTEN > int io_error = 0 > atomic_t io_remaining = 0; > struct inode *io_inode = 0xffff810054062048 > struct buffer_head *io_buffer_head = NULL > struct buffer_head *io_buffer_tail = NULL > size_t io_size = 0x3400 > xfs_off_t io_offset = 0xfe200 > struct work_struct io_work; /* xfsdatad work queue */ > } xfs_ioend_t; > > So the I/O was not even close to offset zero. > > Also, the fact that the stack trace says it came through the > written path, but the io_type says unwritten which says that there's > something fishy here. Either the stack trace is wrong, or there's > been a memory corruption.... 
I'm pretty sure that's because it was a direct I/O write to a written extent. The I/O starts out as IOMAP_UNWRITTEN and if we didn't map to an unwritten extent its completion handler is switched to the written one. Looking at the direct I/O write path I can see where the direct I/O write would wait for the bios to complete but it's not waiting for the workqueue items to be flushed. Not sure if that's part of the problem though. > >>> If either the vn_iowait() in the truncate path is not sufficient, or >>> the truncate code is not being called, there is *some other bug* >>> that we don't yet understand. Adding an unconditional vn_iowait() >>> appear to me to be fixing a symptom, not the underlying cause of >>> the problem.... >> I'm not adding a new call to vn_iowait(). I'm just moving the >> existing one from xfs_ireclaim() so that we wait for all I/O to >> complete before we tear the inode down. > > Yes, but that is there to catch inodes with non-zero link counts because > we are not doing a truncate in that case. We still need to get to the > bottom of why the truncate is not waiting for all I/O. I'm wondering if we have an extra decrement on the i_iocount atomic counter that tricked vn_iowait() into thinking that all I/Os have completed. Should this code in xfs_vm_direct_IO(): if (unlikely(ret != -EIOCBQUEUED && iocb->private)) xfs_destroy_ioend(iocb->private); be: if (unlikely(ret < 0 && ret != -EIOCBQUEUED && iocb->private)) xfs_destroy_ioend(iocb->private); Ordinarily we'd drop the i_iocount reference by calling xfs_end_io_direct().
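For readers following the thread, the two conditions being debated can be compared in a small user-space sketch. This is not kernel code and does not claim to reproduce the bug: the function names below are hypothetical, EIOCBQUEUED is defined locally on the assumption that it matches the kernel-internal value, and the predicates only restate the checks quoted in the messages above (the xfs_vm_direct_IO() cleanup test and the conditional vn_iowait() in xfs_itruncate_start(), plus the MAX(ip->i_size, ip->i_new_size) variant suggested earlier).

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: kernel-internal errno, not exported by userspace <errno.h>. */
#define EIOCBQUEUED 529

/* Check as quoted from xfs_vm_direct_IO(): destroy the ioend for any
 * return value other than -EIOCBQUEUED, provided iocb->private is set. */
static bool destroys_ioend_current(long ret, const void *iocb_private)
{
    return ret != -EIOCBQUEUED && iocb_private != NULL;
}

/* Check as proposed in the message: destroy the ioend only when the
 * call returned an error other than -EIOCBQUEUED. */
static bool destroys_ioend_proposed(long ret, const void *iocb_private)
{
    return ret < 0 && ret != -EIOCBQUEUED && iocb_private != NULL;
}

/* Truncate-path wait as quoted: vn_iowait() runs only if the new size
 * is below the current in-core size. */
static bool truncate_waits_current(long long new_size, long long i_size)
{
    return new_size < i_size;
}

/* Variant suggested in the thread: compare against the larger of
 * i_size and i_new_size, since i_new_size is set before I/O is issued. */
static bool truncate_waits_max(long long new_size, long long i_size,
                               long long i_new_size)
{
    long long eof = i_size > i_new_size ? i_size : i_new_size;
    return new_size < eof;
}
```

The two cleanup predicates differ only for a non-negative return value with iocb->private still set: the quoted check destroys the ioend there, the proposed one leaves it alone. Likewise, the two wait predicates differ only when i_size is zero but i_new_size is not, which is the case raised in the discussion of synchronous writes.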
From owner-xfs@oss.sgi.com Thu Aug 7 10:17:33 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 10:17:35 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.5 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77HHWPG017728 for ; Thu, 7 Aug 2008 10:17:33 -0700 X-ASG-Debug-ID: 1218129527-595302080000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wr-out-0506.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5209C124CA89 for ; Thu, 7 Aug 2008 10:18:47 -0700 (PDT) Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.230]) by cuda.sgi.com with ESMTP id b4dkVsMXfVTi3BhD for ; Thu, 07 Aug 2008 10:18:47 -0700 (PDT) Received: by wr-out-0506.google.com with SMTP id 57so438565wri.12 for ; Thu, 07 Aug 2008 10:18:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type:references; bh=PXZ4XiFcR0ho2hDiSGvoK6BFKOOvfu+rk2SMGuFwZoE=; b=BkmkoLomBhVmjuXEPVaF/Koek0lNYiZZeu3TDrmngCv4sjL+s+P2fxVU828oHxIumA wSsQelO5jmhr1Z8SqyiUOoTBD9XcUjSktXfbMEr81fVLRuAbLUHItqyS/7FKYd6SdMHl CCJOgoE9d05S4ipvBx9LqIaqAABmo2zrffmaQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:references; b=VlCUFEokbSYeqY3grGb9ghqj4okckE5Ki3IeEml7oO9jPHqTENyTSp9wpfDNQYXIdv MDkzUM6BYyvzUM65PwHpcxvPPdwiqfni9IQtyS6Tk4rBy2wbOIaiAIj+M6wGCpo1/SOS Ul3tI7rrUqr37O7bH4TKNLBdBuC5wz8KifJMA= Received: by 10.90.98.13 with SMTP id v13mr5250345agb.86.1218129527158; Thu, 07 Aug 2008 10:18:47 -0700 (PDT) Received: by 10.90.115.11 with HTTP; Thu, 7 Aug 2008 10:18:46 
-0700 (PDT) Message-ID: Date: Thu, 7 Aug 2008 22:48:46 +0530 From: "Bhagi rathi" To: "Eric Sandeen" X-ASG-Orig-Subj: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Subject: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Cc: "Lachlan McIlroy" , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com In-Reply-To: <489A01B0.5050606@sandeen.net> MIME-Version: 1.0 References: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> <489A01B0.5050606@sandeen.net> X-Barracuda-Connect: wr-out-0506.google.com[64.233.184.230] X-Barracuda-Start-Time: 1218129528 X-Barracuda-Bayes: INNOCENT GLOBAL 0.4969 1.0000 0.0000 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2034 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1493 X-archive-position: 17432 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs On Thu, Aug 7, 2008 at 1:25 AM, Eric Sandeen wrote: > Bhagi rathi wrote: > > Why are we going to block for ever? Mounting a file-system > > requires in-core log space buffers, reading of other buffers > > which needs allocation of memory greater than per ag > > structures. > > > > I am trying to understand why xfs_perag_t? Mount/Unmount > > are not frequent activities, it is better for them to succeed > > if operating system can allocate memory and take them > > forward. > > But that's the big if, right? 
> > If the system is so starved that you can't get this memory to even start > the mount process, I'm sure it's better to fail the mount with -ENOMEM > than to add to the current system memory stress. Not really. It is going to fail many automated scripts. We are designing for a problem that system is starved with memory. It points to a bug in memory & system dirty state cleaning, they are the ideal problems to be solved this instead of this. As long as system recovers, it is good not to disturb automated scripts by introducing these kind of unnecessary failures of the mount command. > > > In general KM_MAYFAIL sounds like a good plan when you can handle the > failure gracefully, I think. Not really. It fails mount gracefully, however, it needs administrative action. It is expected that operating system will recover and functional without admin intervention. Cheers, Bhagi. > > > -Eric > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Aug 7 10:22:43 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 10:22:45 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.5 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77HMhGj018461 for ; Thu, 7 Aug 2008 10:22:43 -0700 X-ASG-Debug-ID: 1218129836-67d302080000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wr-out-0506.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 39D6E373493 for ; Thu, 7 Aug 2008 10:23:56 -0700 (PDT) Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.229]) by cuda.sgi.com with ESMTP id FFYvfDrtrxYhcHMG for ; Thu, 07 Aug 2008 10:23:56 -0700 (PDT) Received: by wr-out-0506.google.com with SMTP id 57so440709wri.12 for ; Thu, 07 Aug 2008 10:23:56 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:in-reply-to:mime-version:content-type:references; bh=uvdD4/E6v59xphbXmHr7Hk4uU1tFe4JJVmqnHAFVKJE=; b=iOTH0vSzo9Exgt0V/KppxRgJsNHluyiaJLHKjPXzs5U1vT03SkRiQ1+F9LqupouNpP /bYposkzbfOArZKPh/NehlsKEpoNRqqDrJjhQj1dkPagn/bq8H1U/YExS1NzcQwaIK1/ cjQFBnOU7jbIb4ggPthJ8wu+0liELpwmRhBrg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version :content-type:references; b=LrWlz3X0FkCJmtYqaS0pWTsKACwU/fHxt2j0LWHekiwNFiI2uPTpPxlosDKwXOQ/Zw dBjtYvekjN+VnNtrFULcnf41U6gPKEndhYOxEp0rQjefI5ybO75+XrIyvM8+osSTcWH0 13AMQSO+elvsXZRkIXnSy5IO+4WK7IOSMdix8= Received: by 10.90.118.19 with SMTP id q19mr5272715agc.62.1218129836195; Thu, 07 Aug 2008 10:23:56 -0700 (PDT) Received: by 10.90.115.11 with HTTP; Thu, 7 Aug 2008 10:23:55 -0700 (PDT) Message-ID: Date: Thu, 7 Aug 2008 22:53:55 +0530 From: "Bhagi rathi" To: "Eric Sandeen" , "Bhagi rathi" , "Lachlan McIlroy" , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Subject: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs In-Reply-To: <20080806202256.GR21635@disturbed> MIME-Version: 1.0 References: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> <489A01B0.5050606@sandeen.net> <20080806202256.GR21635@disturbed> X-Barracuda-Connect: wr-out-0506.google.com[64.233.184.229] X-Barracuda-Start-Time: 1218129839 X-Barracuda-Bayes: INNOCENT GLOBAL 0.4725 1.0000 0.0000 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2034 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 
HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1074 X-archive-position: 17433 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs On Thu, Aug 7, 2008 at 1:52 AM, Dave Chinner wrote: > On Wed, Aug 06, 2008 at 02:55:28PM -0500, Eric Sandeen wrote: > > Bhagi rathi wrote: > > > Why are we going to block for ever? Mounting a file-system > > > requires in-core log space buffers, reading of other buffers > > > which needs allocation of memory greater than per ag > > > structures. > ..... > > In general KM_MAYFAIL sounds like a good plan when you can handle the > > failure gracefully, I think. > > Yes, and that is the long term plan - to remove all KM_SLEEP > allocations from XFS and allow them to fail gracefully. There's > lots of work needed before we get there, though. e.g. > right now we cannot survive an ENOMEM error in a transaction.... I am not sure that we are solving right problem. Isn't the above is fall-out of XFS needing memory to clean dirty memory? That is of good priority to engineer than this handling of ENOMEM in transactions. Cheers, Bhagi. > > > Cheers, > > Dave. 
> -- > Dave Chinner > david@fromorbit.com > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Aug 7 10:42:47 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 10:42:49 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: * X-Spam-Status: No, score=1.5 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77HgkM0020196 for ; Thu, 7 Aug 2008 10:42:47 -0700 X-ASG-Debug-ID: 1218131040-596b03590000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wf-out-1314.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id D84CE124C9CF for ; Thu, 7 Aug 2008 10:44:00 -0700 (PDT) Received: from wf-out-1314.google.com (wf-out-1314.google.com [209.85.200.173]) by cuda.sgi.com with ESMTP id d8cXTcvhG1L73thw for ; Thu, 07 Aug 2008 10:44:00 -0700 (PDT) Received: by wf-out-1314.google.com with SMTP id 26so480661wfd.32 for ; Thu, 07 Aug 2008 10:44:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:in-reply-to:mime-version:content-type:references; bh=7Du/9PYAokNN1B0xzaiQSj2YTI/oMwp+E8V+S15Dhe0=; b=KN9kxexnKlGOQpKJgRY8arsu6saJLVCWuRy6oQskG4AzasQh/WtJDeTAULRsvLB8fm HA6P3aD5ZjfoLjeHwx6K12w8+FNRGJZdTluP63O3iTCAQpGX/UBTiMPLukGbx7xmLf5p +6aB4O387OrptRkF/rmz6vlRYQex7nLDvGQL8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version :content-type:references; b=veFtohS2mkY7XV7dvmyLK6p+KldaMYqPHxeO/UeaUdAyl7PaF1kAY30tVwShjlFlmL G8X7emm6SsBGQG6+mCxtYuK0NQKE80IVxupFxw+5hbNTG4JppmFe8aIMsyINAtXUH9Te BJlXBNHEmszBtOVTjrJKobWVGgbaMAj4dyQFQ= Received: by 10.142.105.13 with SMTP id d13mr558287wfc.275.1218131040096; Thu, 07 Aug 2008 10:44:00 
-0700 (PDT) Received: by 10.142.164.8 with HTTP; Thu, 7 Aug 2008 10:43:59 -0700 (PDT) Message-ID: Date: Thu, 7 Aug 2008 23:13:59 +0530 From: "Bhagi rathi" To: "Bhagi rathi" , "Lachlan McIlroy" , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers In-Reply-To: <20080806201957.GQ21635@disturbed> MIME-Version: 1.0 References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> <20080806201957.GQ21635@disturbed> X-Barracuda-Connect: wf-out-1314.google.com[209.85.200.173] X-Barracuda-Start-Time: 1218131040 X-Barracuda-Bayes: INNOCENT GLOBAL 0.4987 1.0000 0.0000 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2036 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1490 X-archive-position: 17434 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs On Thu, Aug 7, 2008 at 1:49 AM, Dave Chinner wrote: > On Wed, Aug 06, 2008 at 10:42:15PM +0530, Bhagi rathi wrote: > > I couldn't get a chance to read the diff's completely. If I click on > > Lachlan's url for diff's, I couldn't access them. It looks to me that > > the issue is not just with trace buffers. It can extend to xfs_iformat > > as well. 
The same dead-lock can spring via > > > > xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> > > xfs_iext_inline_to_direct -> which can do kmem_alloc with > > KM_SLEEP flag. > > Fixed already: > > > http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-2.6.git;a=commit;h=8c6266658cb76e282c14cb92f8ba5a1c674f4928 > Thanks Dave. However, My concern is just not one allocation. We need to clean all allocations that can re-enter to file-system. I see that this issue exists in attributes format for i_afp allocations. It may exist with local format of data and attributes too. xfs_iread->xfs_iformat->xfs_iformat_local. Are we safe that we fixed all these real problems by looking into possible allocations that will enter into file-system? The problem with trace buffers is telling us to clean this code path. By the way, I browse the source code from lxr.linux.no. If I have to browse the latest xfs source code with linux kernel that is used at SGI, how can I do that? Cheers, Bhagi. > > Cheers, > > Dave. 
> -- > Dave Chinner > david@fromorbit.com > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Aug 7 10:44:17 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 10:44:19 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: * X-Spam-Status: No, score=1.4 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77HiHxb020607 for ; Thu, 7 Aug 2008 10:44:17 -0700 X-ASG-Debug-ID: 1218131132-67c003310000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wf-out-1314.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id AAA4937390E for ; Thu, 7 Aug 2008 10:45:32 -0700 (PDT) Received: from wf-out-1314.google.com (wf-out-1314.google.com [209.85.200.171]) by cuda.sgi.com with ESMTP id 47c72dCY1VKms56T for ; Thu, 07 Aug 2008 10:45:32 -0700 (PDT) Received: by wf-out-1314.google.com with SMTP id 26so481152wfd.32 for ; Thu, 07 Aug 2008 10:45:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:in-reply-to:mime-version:content-type:references; bh=DBQyiIneRRlhG5dwt2J9yP7ycq/MSXDrZmBzby/PTkk=; b=ln7PzuE/U+2EyafwZTPfOjz8sotMVEveHVX26EDAEr9FTaqP5nce7lk5SdjqTdG5eU tPGwP9+viTbMKFiffWDJ8/ZnV1rw18vKCCxrpnDIucXvw8VgZp4zx9ThWg27VPcR2aP7 Vedg/uPCQ50FdXEmhV+UqsPcNek8GgzDD6uRs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version :content-type:references; b=qPTr7w9b0Pp9D+zg63eRkA5a4IE4BBt1DeIEy786FqO6koVpTTpgNvMPQcRJWB+NUO 69hqMZh2ClFPOrtvSAmSExkMY7KcdRsqTB22LXiYDlMtUjM4EGGksnUufzdoL25pvgkk Io13XROMV6W3CqjGmw66lln+gjG2w6TaVrhxs= Received: by 10.142.180.19 with SMTP id c19mr347777wff.263.1218131131988; Thu, 07 Aug 2008 10:45:31 
-0700 (PDT) Received: by 10.142.164.8 with HTTP; Thu, 7 Aug 2008 10:45:31 -0700 (PDT) Message-ID: Date: Thu, 7 Aug 2008 23:15:31 +0530 From: "Bhagi rathi" To: "Bhagi rathi" , "Lachlan McIlroy" , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers In-Reply-To: <20080806210355.GS21635@disturbed> MIME-Version: 1.0 References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> <20080806201957.GQ21635@disturbed> <20080806202746.GC6119@disturbed> <20080806210355.GS21635@disturbed> X-Barracuda-Connect: wf-out-1314.google.com[209.85.200.171] X-Barracuda-Start-Time: 1218131132 X-Barracuda-Bayes: INNOCENT GLOBAL 0.4774 1.0000 0.0000 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2034 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1575 X-archive-position: 17435 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs On Thu, Aug 7, 2008 at 2:33 AM, Dave Chinner wrote: > On Thu, Aug 07, 2008 at 06:27:46AM +1000, Dave Chinner wrote: > > On Thu, Aug 07, 2008 at 06:19:57AM +1000, Dave Chinner wrote: > > > On Wed, Aug 06, 2008 at 10:42:15PM +0530, Bhagi rathi wrote: > > > > I couldn't get a chance to read the diff's completely. If I click on > > > > Lachlan's url for diff's, I couldn't access them. 
It looks to me that > > > > the issue is not just with trace buffers. It can extend to > xfs_iformat > > > > as well. The same dead-lock can spring via > > > > > > > > xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> > > > > xfs_iext_inline_to_direct -> which can do kmem_alloc with > > > > KM_SLEEP flag. > > > > > > Fixed already: > > > > > > > > > > Hmmm. where did that url go? Try again: > > Ok, something is stripping URLs out of emails. I just sent that URL > to myself and it wasn't stripped so it's not my mail infrastructure > that is doing it. > > Did someone "upgrade" the spam filters on oss.sgi.com or the > barracuda overnight? > > The link - minus the "http://" bit is: > > > oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-2.6.git;a=commit;h=8c6266658cb76e282c14cb92f8ba5a1c674f4928 > The above just worked fine for me. Lachlan's URL still have the same issue. Cheers, Bhagi. > > > > let's see if that gets stripped.... > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Aug 7 14:53:55 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 14:54:23 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.5 required=5.0 tests=BAYES_00,RCVD_NUMERIC_HELO autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77LrsYw028297 for ; Thu, 7 Aug 2008 14:53:55 -0700 X-ASG-Debug-ID: 1218146108-1171024e0000-w1Z2WR X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ciao.gmane.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 185CA374C72 for ; Thu, 7 Aug 2008 14:55:08 -0700 (PDT) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by cuda.sgi.com with ESMTP id 1wnCkpXmMwgxSXjP for ; Thu, 07 Aug 2008 14:55:08 -0700 (PDT) Received: from root 
by ciao.gmane.org with local (Exim 4.43) id 1KRDRm-0001Jj-Te for linux-xfs@oss.sgi.com; Thu, 07 Aug 2008 21:55:03 +0000 Received: from 155.91.45.232 ([155.91.45.232]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 07 Aug 2008 21:55:02 +0000 Received: from freemanrich by 155.91.45.232 with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 07 Aug 2008 21:55:02 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: linux-xfs@oss.sgi.com From: Richard Freeman X-ASG-Orig-Subj: Re: =?utf-8?b?eGZzX2ZvcmNlX3NodXRkb3du?= called from file =?utf-8?b?ZnMveGZzL3hmc190cmFuc19idWYuYw==?= Subject: Re: =?utf-8?b?eGZzX2ZvcmNlX3NodXRkb3du?= called from file =?utf-8?b?ZnMveGZzL3hmc190cmFuc19idWYuYw==?= Date: Mon, 4 Aug 2008 16:55:41 +0000 (UTC) Lines: 26 Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: main.gmane.org User-Agent: Loom/3.14 (http://gmane.org/) X-Loom-IP: 155.91.45.232 (Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; InfoPath.1; .NET CLR 1.1.4322; .NET CLR 1.0.3705; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)) X-Barracuda-Connect: main.gmane.org[80.91.229.2] X-Barracuda-Start-Time: 1218146110 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.77 X-Barracuda-Spam-Status: No, SCORE=-0.77 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=RCVD_NUMERIC_HELO X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2053 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 1.25 RCVD_NUMERIC_HELO Received: contains an IP address used for HELO X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17436 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: freemanrich@gmail.com Precedence: bulk X-list: xfs Jay Sullivan rit.edu> writes: > > Today I upgraded to the latest stable kernel in Gentoo (2.6.23-r3) and > I'm still on xfsprogs 2.9.4, also the latest stable release. A few > hours after rebooting to load the new kernel, I saw the following in > dmesg: > > #################### > attempt to access beyond end of device > dm-0: rw=0, want=68609558288793608, limit=8178892800 I just started getting these on an ext3 filesystem also on Gentoo, with the latest stable kernel. I suspect there is an lvm bug of some kind that is responsible. I ran an e2fsck on the filesystem and managed to corrupt not only that filesystem, but also several others on the same RAID. I'm probably going to have to try to salvage what I can from the no-longer-booting system and rebuild from scratch/backups. Either lvm has some major bug, or somehow e2fsck is bypassing the lvm layer and writing directly to the drives. It shouldn't be possible to write to one logical volume and modify data stored in a different logical volume on the same md raid-5 device. A check of the underlying RAID turns up no issues - I suspect the problem is in the lvm layer. Googling around for "access beyond end of device" turns up other reports of similar issues. Obviously the problem is rare. 
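For anyone reading the numbers in that dmesg line, here is a minimal sketch of the arithmetic. It assumes that "want" and "limit" are both counted in 512-byte sectors, which is how I understand the block layer reports them; the helper names below are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: "want" and "limit" in the dmesg line are 512-byte sectors. */
#define SECTOR_SIZE 512ULL

/*
 * Hypothetical helper: returns 1 if a request that ends at sector `want`
 * runs past the end of a device that holds `limit` sectors, else 0.
 */
int beyond_end_of_device(uint64_t want, uint64_t limit)
{
    return want > limit;
}

/*
 * Hypothetical helper: converts a sector count to whole GiB, rounding down.
 * Dividing by sectors-per-GiB avoids overflowing 64 bits, which a naive
 * "sectors * 512" would do for the bogus "want" value quoted above.
 */
uint64_t sectors_to_gib(uint64_t sectors)
{
    const uint64_t sectors_per_gib = (1024ULL * 1024ULL * 1024ULL) / SECTOR_SIZE;

    return sectors / sectors_per_gib;
}
```

Under that assumption, limit=8178892800 works out to a 3900 GiB device, while the requested end sector is many orders of magnitude past it. That looks more like a garbage block number reaching the block layer than a request that is slightly off the end, although the numbers alone do not say which layer produced it.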
From owner-xfs@oss.sgi.com Thu Aug 7 14:54:37 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 14:54:40 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com ([192.48.176.15]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77LsaQZ028419 for ; Thu, 7 Aug 2008 14:54:37 -0700 X-ASG-Debug-ID: 1218146149-1690021b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8776A199CC1F for ; Thu, 7 Aug 2008 14:55:50 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id gjZL54lIRql83IHh for ; Thu, 07 Aug 2008 14:55:50 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m77LtpIF011301 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Thu, 7 Aug 2008 23:55:52 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m77LtpxC011299 for xfs@oss.sgi.com; Thu, 7 Aug 2008 23:55:51 +0200 Date: Thu, 7 Aug 2008 23:55:51 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: generic btree implementation, version 4 Subject: generic btree implementation, version 4 Message-ID: <20080807215551.GA11084@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1218146151 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.42 X-Barracuda-Spam-Status: No, SCORE=-1.42 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 
KILL_LEVEL=2.1 tests=MARKETING_SUBJECT X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2053 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.60 MARKETING_SUBJECT Subject contains popular marketing words X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17437 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs As the mailing list seems to have some problems with the large number of patches and I don't want to monopolize it anyway there is just this announcement with a link to the patches: http://verein.lst.de/~hch/xfs/patches.btree.tgz The biggest change this time around is that the generic record, key and pointer addressing has been merged into the main patches. There is a new patch early in the series introducing the helpers for this, and a big comment describing how all this addressing works. The smallest and most important change is a fix for xfs_btree_delrec which used to pass an uninitialized xfs_btree_key structure to xfs_btree_updkey for the non-leaf case since the very first version of the generic btree patches. On the method side ->get_root_from_inode and ->resize_root have been dropped because we hardcode the way the btree is rooted in the inode in so many other places. If we ever get a differently inode rooted btree we can add the proper abstractions for it. The ->kill_root is back for now, I hope we can sort out the tiny difference between the alloc and inobt btrees later. There also are two new methods to move xfs_btree_check_key / xfs_btree_check_rec into the btree implementations. 
In addition to that I've addressed Dave's review comments, except that I've kept the name for the get_dmaxrecs and instead documented why it's called that way (hint the d stands for disk) and what the callers try to do with it. I think the code is in shape now, and ready for some more QA. I will post a test harness based on Dave's WIP code to help with that shortly. From owner-xfs@oss.sgi.com Thu Aug 7 14:57:35 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 14:57:37 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77LvYg4029251 for ; Thu, 7 Aug 2008 14:57:35 -0700 X-ASG-Debug-ID: 1218146327-12e202a80000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from verein.lst.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id CF6C0374CD7 for ; Thu, 7 Aug 2008 14:58:47 -0700 (PDT) Received: from verein.lst.de (verein.lst.de [213.95.11.210]) by cuda.sgi.com with ESMTP id rjPN9md18B74KNBv for ; Thu, 07 Aug 2008 14:58:47 -0700 (PDT) Received: from verein.lst.de (localhost [127.0.0.1]) by verein.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id m77LwnIF011426 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Thu, 7 Aug 2008 23:58:49 +0200 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id m77Lwnht011424 for xfs@oss.sgi.com; Thu, 7 Aug 2008 23:58:49 +0200 Date: Thu, 7 Aug 2008 23:58:49 +0200 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: PATCH] trivial xfs_remove comment fixup Subject: PATCH] trivial xfs_remove comment fixup Message-ID: <20080807215849.GA11345@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: 
Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-Barracuda-Connect: verein.lst.de[213.95.11.210] X-Barracuda-Start-Time: 1218146329 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2053 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17438 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs The dp to ip comment should be for the unconditional xfs_droplink call, and the "." link obviously only exists for directories, so it should be in the is_dir conditional. Signed-off-by: Christoph Hellwig Index: linux-2.6-xfs/fs/xfs/xfs_vnodeops.c =================================================================== --- linux-2.6-xfs.orig/fs/xfs/xfs_vnodeops.c 2008-08-06 17:26:37.000000000 -0300 +++ linux-2.6-xfs/fs/xfs/xfs_vnodeops.c 2008-08-06 17:36:44.000000000 -0300 @@ -2014,7 +2014,7 @@ xfs_remove( goto out_bmap_cancel; /* - * Drop the link from dp to ip. + * Drop the "." link from ip to self. */ error = xfs_droplink(tp, ip); if (error) @@ -2029,7 +2029,7 @@ xfs_remove( } /* - * Drop the "." link from ip to self. + * Drop the link from dp to ip. 
*/ error = xfs_droplink(tp, ip); if (error) From owner-xfs@oss.sgi.com Thu Aug 7 15:03:48 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 15:03:50 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_22 autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m77M3m3j030262 for ; Thu, 7 Aug 2008 15:03:48 -0700 X-ASG-Debug-ID: 1218146699-1eb402990000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id E742F374D12; Thu, 7 Aug 2008 15:05:00 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id vWxr7ZIRGdSYQrGN; Thu, 07 Aug 2008 15:05:00 -0700 (PDT) Received: from hch by bombadil.infradead.org with local (Exim 4.68 #1 (Red Hat Linux)) id 1KRDbP-00043R-O4; Thu, 07 Aug 2008 22:04:59 +0000 Date: Thu, 7 Aug 2008 18:04:59 -0400 From: Christoph Hellwig To: Lachlan McIlroy Cc: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Message-ID: <20080807220459.GA15199@infradead.org> References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1218146703 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by 
cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2053 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17439 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Sorry for the late review, but the xfs_super.c changes don't make sense. These allocations are done during module_init time and thus it's fine to call back into any fs to reclaim pages for it. Fortunately the effect is harmless, but I'd suggest reverting that part of the change. From owner-xfs@oss.sgi.com Thu Aug 7 17:30:38 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 17:30:41 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m780Uab4012622 for ; Thu, 7 Aug 2008 17:30:38 -0700 X-ASG-Debug-ID: 1218155510-290b028d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail05.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 32E5EF193C2 for ; Thu, 7 Aug 2008 17:31:51 -0700 (PDT) Received: from ipmail05.adl2.internode.on.net (ipmail05.adl2.internode.on.net [203.16.214.145]) by cuda.sgi.com with ESMTP id 7RXFd4EHHSqtbjt0 for ; Thu, 07 Aug 2008 17:31:51 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: 
ApoEADYwm0h5LAMb/2dsb2JhbACsbQ X-IronPort-AV: E=Sophos;i="4.31,323,1215354600"; d="scan'208";a="176784491" Received: from ppp121-44-3-27.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.3.27]) by ipmail05.adl2.internode.on.net with ESMTP; 08 Aug 2008 10:01:48 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KRFtT-0007aT-6H; Fri, 08 Aug 2008 10:31:47 +1000 Date: Fri, 8 Aug 2008 10:31:47 +1000 From: Dave Chinner To: Bhagi rathi Cc: Eric Sandeen , Lachlan McIlroy , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Subject: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Message-ID: <20080808003147.GA24344@disturbed> Mail-Followup-To: Bhagi rathi , Eric Sandeen , Lachlan McIlroy , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com References: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> <489A01B0.5050606@sandeen.net> <20080806202256.GR21635@disturbed> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: ipmail05.adl2.internode.on.net[203.16.214.145] X-Barracuda-Start-Time: 1218155512 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2061 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17440 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Thu, Aug 07, 2008 at 10:53:55PM +0530, Bhagi 
rathi wrote: > On Thu, Aug 7, 2008 at 1:52 AM, Dave Chinner wrote: > > > On Wed, Aug 06, 2008 at 02:55:28PM -0500, Eric Sandeen wrote: > > > Bhagi rathi wrote: > > > > Why are we going to block for ever? Mounting a file-system > > > > requires in-core log space buffers, reading of other buffers > > > > which needs allocation of memory greater than per ag > > > > structures. > > ..... > > > In general KM_MAYFAIL sounds like a good plan when you can handle the > > > failure gracefully, I think. > > > > Yes, and that is the long term plan - to remove all KM_SLEEP > > allocations from XFS and allow them to fail gracefully. There's > > lots of work needed before we get there, though. e.g. > > right now we cannot survive an ENOMEM error in a transaction.... > > > I am not sure that we are solving right problem. Isn't the above is > fall-out > of XFS needing memory to clean dirty memory? We can't avoid that. It is inherent in the design of XFS. And the amount of memory is not easily bounded so existing solutions like wrapping slabs in mempools don't work, either. > That is of good priority > to engineer than this handling of ENOMEM in transactions. That's just one example of an extremely non-trivial piece of work that needs to be done to allow all memory allocations to fail. This work will be done for other reasons (like transaction rollback to prevent shutdowns on errors inside a dirty transaction), not specifically to allow memory allocations to fail. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Thu Aug 7 17:32:32 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 17:32:35 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m780WWeC012976 for ; Thu, 7 Aug 2008 17:32:32 -0700 X-ASG-Debug-ID: 1218155626-70da00bb0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ipmail05.adl2.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 32D04375830 for ; Thu, 7 Aug 2008 17:33:46 -0700 (PDT) Received: from ipmail05.adl2.internode.on.net (ipmail05.adl2.internode.on.net [203.16.214.145]) by cuda.sgi.com with ESMTP id LsKblLLKHZIpZQjG for ; Thu, 07 Aug 2008 17:33:46 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoEADYwm0h5LAMb/2dsb2JhbACsbQ X-IronPort-AV: E=Sophos;i="4.31,323,1215354600"; d="scan'208";a="176785781" Received: from ppp121-44-3-27.lns10.syd7.internode.on.net (HELO disturbed) ([121.44.3.27]) by ipmail05.adl2.internode.on.net with ESMTP; 08 Aug 2008 10:03:45 +0930 Received: from dave by disturbed with local (Exim 4.69) (envelope-from ) id 1KRFvM-0007cy-FM; Fri, 08 Aug 2008 10:33:44 +1000 Date: Fri, 8 Aug 2008 10:33:44 +1000 From: Dave Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: PATCH] trivial xfs_remove comment fixup Subject: Re: PATCH] trivial xfs_remove comment fixup Message-ID: <20080808003344.GB24344@disturbed> Mail-Followup-To: Christoph Hellwig , xfs@oss.sgi.com References: <20080807215849.GA11345@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080807215849.GA11345@lst.de> User-Agent: Mutt/1.5.18 (2008-05-17) 
X-Barracuda-Connect: ipmail05.adl2.internode.on.net[203.16.214.145] X-Barracuda-Start-Time: 1218155628 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2054 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 17441 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@fromorbit.com Precedence: bulk X-list: xfs On Thu, Aug 07, 2008 at 11:58:49PM +0200, Christoph Hellwig wrote: > The dp to ip comment should be for the unconditional xfs_droplink > call, and the "." link obviously only exists for directories, > so it should be in the is_dir conditional. Ack. Cheers, Dave. 
-- Dave Chinner david@fromorbit.com From owner-xfs@oss.sgi.com Thu Aug 7 22:15:46 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 22:16:24 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: * X-Spam-Status: No, score=1.4 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m785FkjU017046 for ; Thu, 7 Aug 2008 22:15:46 -0700 X-ASG-Debug-ID: 1218172617-1d4102f30000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wf-out-1314.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id F196A375CA0 for ; Thu, 7 Aug 2008 22:16:57 -0700 (PDT) Received: from wf-out-1314.google.com (wf-out-1314.google.com [209.85.200.172]) by cuda.sgi.com with ESMTP id SwtYChjJ6ijNjAY3 for ; Thu, 07 Aug 2008 22:16:57 -0700 (PDT) Received: by wf-out-1314.google.com with SMTP id 26so696615wfd.32 for ; Thu, 07 Aug 2008 22:16:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:in-reply-to:mime-version:content-type:references; bh=uS881Ku/WMmszhFp85i6Ktk/N/G7l+V5jMlPzvPD30A=; b=ZzJEvfN9PMl8hfvuaTuDlKOdui90pDWhSkPNx1wg3oOmLGFIMyQBXoOYConL/6/9ik Ycpq050bfiYo4DBn6CWXNzNxkKzCB169UE7o9oJ4sCD9UNs2+YjOZta1IM8tJmjAyxbw kJ9eB8wpxOMzlwzHupZ5RYcZMqlZH1GhUMKfQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version :content-type:references; b=h36oE/wFEL3A0g7f32fxDgj3hDUb8I84EbKKuOJ7/bZgAdQ39m/ywK5dpwpzKXSg1+ yCWwqMnIW7ho7BjgZQD97oqPZ/bZc8UpirusxRUkWsbin3v7q+oCBzcmnjK3hFDb9axD G3Sz99p1xliQE5Es4q4rm9yi/Fji5yPr2F910= Received: by 10.142.162.9 with SMTP id k9mr782749wfe.211.1218172616945; Thu, 07 Aug 2008 22:16:56 -0700 (PDT) Received: by 10.142.164.8 with 
HTTP; Thu, 7 Aug 2008 22:16:56 -0700 (PDT) Message-ID: Date: Fri, 8 Aug 2008 10:46:56 +0530 From: "Bhagi rathi" To: "Bhagi rathi" , "Lachlan McIlroy" , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers Subject: Re: TAKE 981498 - Use KM_NOFS for debug trace buffers In-Reply-To: MIME-Version: 1.0 References: <20080806061553.A8D8958C52A4@chook.melbourne.sgi.com> <20080806201957.GQ21635@disturbed> X-Barracuda-Connect: wf-out-1314.google.com[209.85.200.172] X-Barracuda-Start-Time: 1218172617 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0029 1.0000 -2.0021 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.00 X-Barracuda-Spam-Status: No, SCORE=-2.00 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE, MAILTO_TO_SPAM_ADDR X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2081 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 MAILTO_TO_SPAM_ADDR URI: Includes a link to a likely spammer email 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1849 X-archive-position: 17442 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs On Thu, Aug 7, 2008 at 11:13 PM, Bhagi rathi wrote: > > > On Thu, Aug 7, 2008 at 1:49 AM, Dave Chinner wrote: > >> On Wed, Aug 06, 2008 at 10:42:15PM +0530, Bhagi rathi wrote: >> > I couldn't get a chance to read the diff's completely. If I click on >> > Lachlan's url for diff's, I couldn't access them. It looks to me that >> > the issue is not just with trace buffers. It can extend to xfs_iformat >> > as well. 
The same dead-lock can spring via >> > >> > xfs_iread -> xfs_iformat -> xfs_iformat_extents -> xfs_iext_add -> >> > xfs_iext_inline_to_direct -> which can do kmem_alloc with >> > KM_SLEEP flag. >> >> Fixed already: >> >> >> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/xfs-2.6.git;a=commit;h=8c6266658cb76e282c14cb92f8ba5a1c674f4928 >> > > Thanks Dave. However, My concern is just not one allocation. We > need to clean all allocations that can re-enter to file-system. > I see that this issue exists in attributes format for i_afp allocations. > It may exist with local format of data and attributes too. > > xfs_iread->xfs_iformat->xfs_iformat_local. > Is the above issue fixed already? I still see it in the latest Linux kernel. > > Are we safe that we fixed all these real problems by looking > into possible allocations that will enter into file-system? The > problem with trace buffers is telling us to clean this code path. > > By the way, I browse the source code from lxr.linux.no. > If I have to browse the latest xfs source code with linux > kernel that is used at SGI, how can I do that? > Any pointers on this? I wish to set up my own XFS development environment, and details on this would be helpful. Cheers, Bhagi. > > > Cheers, > Bhagi. > >> >> Cheers, >> >> Dave. 
>> -- >> Dave Chinner >> david@fromorbit.com >> > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Aug 7 22:21:36 2008 Received: with ECARTIS (v1.0.0; list xfs); Thu, 07 Aug 2008 22:21:37 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: * X-Spam-Status: No, score=1.3 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.11.20060308/8.12.11/SuSE Linux 0.7) with ESMTP id m785LZLA017654 for ; Thu, 7 Aug 2008 22:21:36 -0700 X-ASG-Debug-ID: 1218172970-12a602100000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wf-out-1314.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 243F211F5CEF for ; Thu, 7 Aug 2008 22:22:50 -0700 (PDT) Received: from wf-out-1314.google.com (wf-out-1314.google.com [209.85.200.169]) by cuda.sgi.com with ESMTP id Ai7uBcbotS4Ptxyz for ; Thu, 07 Aug 2008 22:22:50 -0700 (PDT) Received: by wf-out-1314.google.com with SMTP id 26so698268wfd.32 for ; Thu, 07 Aug 2008 22:22:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:in-reply-to:mime-version:content-type:references; bh=9AyXsxYkVeKTsS5V32yoUzvKx9jm4q1s9ujgdDh2Y+M=; b=Hjrzy/60jU5DBAFAd4vDB+tg6dlvfo3jzPtNaukg6vuieNTHwyavgI6h4pnRBSGPF3 GBfvpx1Gpu6itI6RHdKTnzsMNxsspQemCQTsAQwVgEqVwVm1LmKqjrQBcTfkfRi0InGw OkmVP//hjIKydlJ94NAk5vIr86rJys4Tg8/Ys= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version :content-type:references; b=HyvTcxE4pI4/JWEWEEPP/OgEZaeMdhgy3AHPHxyf9BUY0SMH4buPdTZFkfRBHE47ef UweV18GiWdIw+FM8KpDPTl/p9lkLuL2Y745HQS5LV3trmCfbRJvg5FvPJzgaX/CrZomI zuSfLcCXPXD10jR+Y38RGp5AuyAEmmBg8NOIQ= Received: by 10.143.29.17 with SMTP id g17mr783963wfj.239.1218172970092; Thu, 07 Aug 2008 
22:22:50 -0700 (PDT) Received: by 10.142.164.8 with HTTP; Thu, 7 Aug 2008 22:22:50 -0700 (PDT) Message-ID: Date: Fri, 8 Aug 2008 10:52:50 +0530 From: "Bhagi rathi" To: "Bhagi rathi" , "Eric Sandeen" , "Lachlan McIlroy" , sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs Subject: Re: TAKE 981498 - use KM_MAYFAIL in xfs_mountfs In-Reply-To: <20080808003147.GA24344@disturbed> MIME-Version: 1.0 References: <20080806054121.CB2F258C52A4@chook.melbourne.sgi.com> <489A01B0.5050606@sandeen.net> <20080806202256.GR21635@disturbed> <20080808003147.GA24344@disturbed> X-Barracuda-Connect: wf-out-1314.google.com[209.85.200.169] X-Barracuda-Start-Time: 1218172971 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0082 1.0000 -1.9673 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.97 X-Barracuda-Spam-Status: No, SCORE=-1.97 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=HTML_MESSAGE X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.1.2081 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTML_MESSAGE BODY: HTML included in message X-Virus-Scanned: ClamAV 0.91.2/6021/Wed Feb 27 15:55:48 2008 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 2093 X-archive-position: 17443 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs On Fri, Aug 8, 2008 at 6:01 AM, Dave Chinner wrote: > On Thu, Aug 07, 2008 at 10:53:55PM +0530, Bhagi rathi wrote: > > On Thu, Aug 7, 2008 at 1:52 AM, Dave Chinner > wrote: > > > > > On Wed, Aug 06, 2008 at 02:55:28PM -0500, Eric Sandeen wrote: > > > > Bhagi rathi wrote: > > > > > Why are we going to block for ever? 
Mounting a file-system > > > > > requires in-core log space buffers, reading of other buffers > > > > > which needs allocation of memory greater than per ag > > > > > structures. > > > ..... > > > > In general KM_MAYFAIL sounds like a good plan when you can handle the > > > > failure gracefully, I think. > > > > > > Yes, and that is the long term plan - to remove all KM_SLEEP > > > allocations from XFS and allow them to fail gracefully. There's > > > lots of work needed before we get there, though. e.g. > > > right now we cannot survive an ENOMEM error in a transaction.... > > > > > > I am not sure that we are solving right problem. Isn't the above is > > fall-out > > of XFS needing memory to clean dirty memory? > > We can't avoid that. It is inherent in the design of XFS. And the > amount of memory is not easily bounded so existing solutions like > wrapping slabs in mempools don't work, either. Interesting. The only dirty items are XFS inodes, quotas and meta-data buffers. We can fix the number of these items and ensure pushing of data one after the other in the case of a crunch. What are the objects of dirty data where the memory is not bounded? > > > That is of good priority > > to engineer than this handling of ENOMEM in transactions. > > That's just one e