From michael.monnerie@is.it-management.at Tue Sep 1 02:19:48 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n817JSRe159400 for ; Tue, 1 Sep 2009 02:19:38 -0500 X-ASG-Debug-ID: 1251789592-136a02020000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailsrv5.zmi.at (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 29DC6159ADB9 for ; Tue, 1 Sep 2009 00:19:55 -0700 (PDT) Received: from mailsrv5.zmi.at (mailsrv5.zmi.at [212.69.164.54]) by cuda.sgi.com with ESMTP id 4ve91BzIwDGKCtyz for ; Tue, 01 Sep 2009 00:19:55 -0700 (PDT) Received: from mailsrv.i.zmi.at (h081217106033.dyn.cm.kabsi.at [81.217.106.33]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailsrv2.i.zmi.at", Issuer "power4u.zmi.at" (not verified)) by mailsrv5.zmi.at (Postfix) with ESMTP id 7F92D689 for ; Tue, 1 Sep 2009 09:19:18 +0200 (CEST) Received: from saturn.localnet (saturn.i.zmi.at [10.72.27.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mailsrv.i.zmi.at (Postfix) with ESMTPSA id A33C940015E for ; Tue, 1 Sep 2009 09:19:20 +0200 (CEST) From: Michael Monnerie Organization: it-management http://it-management.at To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: zero size file after power failure with kernel 2.6.30.5 Subject: Re: zero size file after power failure with kernel 2.6.30.5 Date: Tue, 1 Sep 2009 09:18:30 +0200 User-Agent: KMail/1.10.3 (Linux/2.6.30.5-ZMI; KDE/4.1.3; x86_64; ; ) References: <200908292102.21710@zmi.at> <4A99A80C.9010307@sandeen.net> <19100.22644.149019.555685@tree.ty.sabi.co.uk> In-Reply-To: <19100.22644.149019.555685@tree.ty.sabi.co.uk> MIME-Version: 1.0 Content-Type: 
multipart/signed; boundary="nextPart9961981.iA24zTgTkV"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200909010918.37886@zmi.at> X-Barracuda-Connect: mailsrv5.zmi.at[212.69.164.54] X-Barracuda-Start-Time: 1251789618 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7799 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean --nextPart9961981.iA24zTgTkV Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

On Tuesday 01 September 2009 Peter Grandi wrote:
> Then 'mount' with '-o sync' [snip]

Yes. I could also simply switch back to reiserfs, where I never had this kind of issue, despite lots of crashes etc. I'm not here to blame the devs; I just wanted to report that this kind of problem still exists, and maybe someone can dig into the problem and improve it.

There was a similar problem with the change from ext3 to ext4, with a big discussion. Ext4 has been improved; I don't know how good it is now. And I know of lots of discussions about whether the app or the kernel is wrong, and whether you should fsync() after rename(). In ext4 they reorganized the way metadata updates are done; maybe that can help xfs too.

It seems kmail writes its config every 7 minutes, so it is vulnerable for 3 seconds then.

I've set vm.dirty_expire_centisecs = 1000 now to improve the situation a bit.

Regards, zmi
-- 
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 --nextPart9961981.iA24zTgTkV Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) iEYEABECAAYFAkqcys0ACgkQzhSR9xwSCbR90gCg79ZmRZA9/cM81r8aiofBvnCR gV8AmwXkwcpvitkkmsnBt4bPRh0jOioc =eA5E -----END PGP SIGNATURE----- --nextPart9961981.iA24zTgTkV-- From michael.monnerie@is.it-management.at Tue Sep 1 03:45:35 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n818jF6n163075 for ; Tue, 1 Sep 2009 03:45:25 -0500 X-ASG-Debug-ID: 1251794739-74e503110000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailsrv5.zmi.at (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A9FF4414FCA for ; Tue, 1 Sep 2009 01:45:40 -0700 (PDT) Received: from mailsrv5.zmi.at (mailsrv5.zmi.at [212.69.164.54]) by cuda.sgi.com with ESMTP id HTYwyfvovIqwbhE6 for ; Tue, 01 Sep 2009 01:45:40 -0700 (PDT) Received: from mailsrv.i.zmi.at (h081217106033.dyn.cm.kabsi.at [81.217.106.33]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailsrv2.i.zmi.at", Issuer "power4u.zmi.at" (not verified)) by mailsrv5.zmi.at (Postfix) with ESMTP id BF0B96AD for ; Tue, 1 Sep 2009 10:45:30 +0200 (CEST) Received: from saturn.localnet (saturn.i.zmi.at [10.72.27.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mailsrv.i.zmi.at (Postfix) with ESMTPSA id 909F340015E for ; Tue, 1 Sep 2009 10:45:30 +0200 (CEST) From: 
Michael Monnerie Organization: it-management http://it-management.at To: xfs@oss.sgi.com X-ASG-Orig-Subj: minor bug in xfsprogs-3.0.3 Subject: minor bug in xfsprogs-3.0.3 Date: Tue, 1 Sep 2009 10:44:44 +0200 User-Agent: KMail/1.10.3 (Linux/2.6.30.5-ZMI; KDE/4.1.3; x86_64; ; ) MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200909011044.44938@zmi.at> X-Barracuda-Connect: mailsrv5.zmi.at[212.69.164.54] X-Barracuda-Start-Time: 1251794765 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7803 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean # xfs_info -V Usage: xfs_info [-V] [-t mtab] mountpoint It should print the version, right? mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0660 / 415 65 31 .network.your.ideas. 
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 From pg_mh@sabi.co.UK Tue Sep 1 07:48:39 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81Cm9FY175068 for ; Tue, 1 Sep 2009 07:48:29 -0500 X-ASG-Debug-ID: 1251809340-6e6a00c90000-ps1ADW X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ty.sabi.co.UK (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4F26E415FDF for ; Tue, 1 Sep 2009 05:49:00 -0700 (PDT) Received: from ty.sabi.co.UK (82-69-39-138.dsl.in-addr.zen.co.uk [82.69.39.138]) by cuda.sgi.com with ESMTP id wacgQQjMk1OfsaZY for ; Tue, 01 Sep 2009 05:49:00 -0700 (PDT) Received: from from [127.0.0.1] (helo=tree.ty.sabi.co.uk) by ty.sabi.co.UK with esmtp(Exim 4.63 #1) id 1MiSjY-0003eT-W4 for ; Tue, 01 Sep 2009 12:45:13 +0000 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <19101.5976.387292.614270@tree.ty.sabi.co.uk> Date: Tue, 1 Sep 2009 12:45:12 +0000 X-Face: SMJE]JPYVBO-9UR%/8d'mG.F!@.,l@c[f'[%S8'BZIcbQc3/">GrXDwb#;fTRGNmHr^JFb SAptvwWc,0+z+~p~"Gdr4H$(|N(yF(wwCM2bW0~U?HPEE^fkPGx^u[*[yV.gyB!hDOli}EF[\cW*S H&spRGFL}{`bj1TaD^l/"[ msn( /TH#THs{Hpj>)]f> X-ASG-Orig-Subj: Re: xfs data loss Subject: Re: xfs data loss In-Reply-To: References: <4A975A35.3060809@sandeen.net> <4A981133.6060009@sandeen.net> X-Mailer: VM 8.0.12-devo-585 under 21.5 (beta27) XEmacs Lucid (i686-redhat-linux) From: pg_xf2@xf2.to.sabi.co.UK (Peter Grandi) X-Disclaimer: This message contains only personal opinions X-Barracuda-Connect: 82-69-39-138.dsl.in-addr.zen.co.uk[82.69.39.138] X-Barracuda-Start-Time: 
1251809344 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.92 X-Barracuda-Spam-Status: No, SCORE=-1.92 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=RDNS_DYNAMIC X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7815 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.10 RDNS_DYNAMIC Delivered to trusted network by host with dynamic-looking rDNS X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

> [ ... ] such a harsh way.

Harsh? That sounds way too harsh. :-)

When you write to a mailing list asking for free help and support, it is rather rude to not have done some preliminary work, such as figuring out the characteristics of RAID5 in case of failure.

It is also somewhat rude (but amazingly common) to make confused and partial reports, such as not checking and reporting what has actually failed.

> Is this the habit of this mailing list?

Depends -- some people here are XFS salesmen, in that their career and employability depend at least in part on widespread adoption of XFS, and on support from other kernel subsystem guys, who may be one day on an interview panel (the guild of Linux kernel hackers is a pretty small and closed world in practice). These are sell-side engineers, and they will be smooth and emollient even in the face of outrageously ridiculous stuff. Sell-side engineers, just like sell-side stock analysts, never issue anything as harsh as a "sell" recommendation.

That's what I do myself when I am on the sell-side, to my coworkers and customers; they pay me to solve their problems, not to tell them they are idiots for creating those problems, and suffering fools gladly is part of what I get paid for.

But here I am on the buy-side; I am buying XFS (and the Linux block layer), not selling it.
Not only that, I am providing unpaid opinions. Since I am here buying, and actually paying with my time, I can comment more openly than someone with a sell-side POV, but still in a relatively soft way, about the merit of the issues I comment upon.

> Apart from that, thank you for you help.

But a soft but more open assessment of how outrageous some queries are is help too, as it makes it easier to assess the gravity of the situation. The smooth, emollient sell-side people will let you dig your own grave.

Just consider your statement below about "assume clean", which to me sounds very dangerous (big euphemism), and which did not elicit any warning from the sell-side:

> Moreover, when a raid loses 2 devices, and the devices are still
> ok, it is possible to reassemble the raid by assuming the
> devices clean.

Sure you can reassemble the RAID, but what do you mean by "still ok"? Have you read-tested those 2 drives? Have you tested the *other* 18 drives? How do you know none of the other 18 drives got damaged? Have you verified that only the host adapter electronics failed, or whatever it was that made those 2 drives drop out?

Why do you *need* to assume clean? If the 2 "lost" drives are really ok, you just resync the array. If you *need* to assume clean, it is likely that you have lost something like 5% of data in (every stripe and thus) most files and directories (and internal metadata) and will be replacing it with random bytes. That will very likely cause XFS problems (the least of the problems of course).

> I understand that RAID5 is not the ideal solution for that
> system, [ ... ]

That we don't know for sure; I personally very much dislike RAID5, but for throw-away mostly read-only data I have to concede that it seems appropriate. It is rather better than RAID6 in almost every reasonable situation.

Still, a 19+1 array sounds rather bizarre to say the least. Especially in a place where part of the everyday activity is earthquake simulation...
> But apart from that, it is not as easy to backup 20 TB, Or to 'fsck' several TB as you also discovered. Anyhow my opinion is that the best way to backup large storage servers is another large storage server (or more than one). When I buy a hard drive I buy 3 backup drives for each "live" drive I use -- at *home*. > so we decided to set it as data storage leaving the > responsibilty of the backup to our users. I do not consider it > completely absurd. Not at all absurd -- if those users *really* accept that. But you are trying to recover the arrays instead of scratching them and restarting. That suggests to me that the users did not actually accept that. If the real agreement with the users is "you have to keep backups, but if something happens you will behave as if you cannot or don't want to restore them" it is quite different. > This is not the case for /Raid/md4, where apparently all devices > are there. That's not so clear. One problem with trying to provide some opinions on your issue and whether the filesystems are recoverable is that you haven't made clear what failed and how you tested each component of each array to make sure that what is still working is known (and talk of "assume clean" is very suspicious). I'd check *everything* because until then you don't know how much has been damaged where, as a major power issue may have affected *everything* even if only partially. When you wrote: > one half (5 TB) of the user directories on /dev/md4 have > disappeared. that seems to indicate some major filesystem metadata and data loss, and the idea of "assume clean" seems to me extremely dangerous. Also '/dev/md5' seems to have reported serious drive issues, so perhaps something bad happened to the '/dev/md4' drives too. That you have tried to run repair tools on a filesystem with an incomplete storage layer may have made things rather worse, so knowing *exactly* what has failed may help you a lot. 
From pg_mh@sabi.co.UK Tue Sep 1 07:48:48 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81CmI4r175077 for ; Tue, 1 Sep 2009 07:48:38 -0500 X-ASG-Debug-ID: 1251809344-6e6a00ca0000-ps1ADW X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ty.sabi.co.UK (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id AB388415FE9 for ; Tue, 1 Sep 2009 05:49:04 -0700 (PDT) Received: from ty.sabi.co.UK (82-69-39-138.dsl.in-addr.zen.co.uk [82.69.39.138]) by cuda.sgi.com with ESMTP id ZZuDEwsSwuArvu31 for ; Tue, 01 Sep 2009 05:49:04 -0700 (PDT) Received: from from [127.0.0.1] (helo=tree.ty.sabi.co.uk) by ty.sabi.co.UK with esmtp(Exim 4.63 #1) id 1MiQfO-000376-N7 for ; Tue, 01 Sep 2009 10:32:46 +0000 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <19100.63566.98250.185404@tree.ty.sabi.co.uk> Date: Tue, 1 Sep 2009 10:32:46 +0000 X-Face: SMJE]JPYVBO-9UR%/8d'mG.F!@.,l@c[f'[%S8'BZIcbQc3/">GrXDwb#;fTRGNmHr^JFb SAptvwWc,0+z+~p~"Gdr4H$(|N(yF(wwCM2bW0~U?HPEE^fkPGx^u[*[yV.gyB!hDOli}EF[\cW*S H&spRGFL}{`bj1TaD^l/"[ msn( /TH#THs{Hpj>)]f> X-ASG-Orig-Subj: Re: zero size file after power failure with kernel 2.6.30.5 Subject: Re: zero size file after power failure with kernel 2.6.30.5 In-Reply-To: <200909010918.37886@zmi.at> References: <200908292102.21710@zmi.at> <4A99A80C.9010307@sandeen.net> <19100.22644.149019.555685@tree.ty.sabi.co.uk> <200909010918.37886@zmi.at> X-Mailer: VM 8.0.12-devo-585 under 21.5 (beta27) XEmacs Lucid (i686-redhat-linux) From: pg_xf2@xf2.sabi.co.UK (Peter Grandi) X-Disclaimer: This message contains only personal opinions X-Barracuda-Connect: 82-69-39-138.dsl.in-addr.zen.co.uk[82.69.39.138] X-Barracuda-Start-Time: 
1251809350 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.92 X-Barracuda-Spam-Status: No, SCORE=-1.92 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=RDNS_DYNAMIC X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7815 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.10 RDNS_DYNAMIC Delivered to trusted network by host with dynamic-looking rDNS X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

[ ... ]

>> Then 'mount' with '-o sync' [ ... ]

> Yes. I could also simply switch back to reiserfs, where I
> never had this kind of issue, despite lots of crashes etc.

Other people have a very different impression. Like 'ext3', ReiserFS does ordered writes, but those don't necessarily help because of the colossal amount of buffering that happens anyhow nowadays.

> [ ... ] maybe someone taps into the problem and can improve
> it.

It is foremost an application problem, and then a block layer problem. The first is unsolvable ("user space sucks") in our lifetimes, and the second depends on the goodwill of the proprietors of the relevant kernel subsystem.

As to application design, XFS is targeted at heavily parallel workloads on large storage arrays; its design takes advantage of what API semantics permit to improve that use case, and relies on applications making use of those API semantics properly.

If that and having good scalable performance at the same time requires having dual power supplies, redundant storage paths, and battery backup, that is the typical platform on which XFS is deployed.

> There was a similar problem with the change from ext3 to ext4,
> with a big discussion. Ext4 has been improved,

Actually it has been made worse, to compensate for bad application and block layer behaviour.
Red Hat with 'ext4' have been trying to imply that an in-place upgrade to an 'ext3' compatible filesystem can support every possible point on the spectrum. Well, it turned out that they cannot. So there have been motions towards supporting XFS in 5.4, to have a dual-filesystem strategy, which is what a large number of their important enterprise customers do anyhow.

> [ ... ] In ext4 they reorganized the way metaupdates are done,
> maybe that can help xfs too.

But that makes performance worse in the large/parallel case.

> [ ... ] It seems kmail writes its config every 7 minutes, so
> it is vulnerable for 3 seconds then.

That won't help that much. Apps and the block layer are really designed for older, gentler times. And never mind the clueless, moronic "optimization" of Linux block layer plugging/unplugging.

Currently a single disk can write 100MB/s, memory sizes on many _laptops_ are 4GB with potentially 1-2GB or 10-20s of writes cached. On a server one can have RAIDs that can write at/s.

If applications and the block layer are misbehaving, and '-o sync' is not used, even if one flushes cache every second, there can still be dozens of MB (on a laptop) to some GB (on a server) that get lost in that one second.

The filesystem can try hard to ensure that metadata gets written nearly immediately, ensuring 'fsck'-consistency, but it cannot do that for data in any sensible way unless the application and the block layer do the right thing, so data persistency is at best elusive.

> I've set vm.dirty_expire_centisecs = 1000 now to improve the
> situation a bit.

It does not help that not only are the applications and the block layer misdesigned, but they are also misdesigned for a time when data rates were a lot lower, so outstanding updates were bounded a lot lower.
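
A back-of-the-envelope sketch of the arithmetic above, assuming dirty data simply accumulates at the device's write rate until it expires. The function name is hypothetical, and the bound ignores the further cap that 'dirty_ratio' and memory size impose:

```python
def dirty_window_mb(write_rate_mb_s: float, expire_centisecs: int) -> float:
    """Rough upper bound, in MB, on data written but not yet flushed.

    Assumes the writer keeps the device busy at write_rate_mb_s for the
    whole expiry interval; real systems are also capped by dirty_ratio.
    """
    return write_rate_mb_s * expire_centisecs / 100


# One disk at about 100 MB/s, as in the message above, for a few
# candidate values of vm.dirty_expire_centisecs.
for centisecs in (1000, 400, 200):
    at_risk = dirty_window_mb(100, centisecs)
    print(f"expire={centisecs} centisecs: up to {at_risk:.0f} MB unflushed")
```

With the 1000 centisecond setting quoted above and a single 100MB/s disk, this rough bound comes to about 1GB, which is in line with the 1-2GB figure mentioned for laptops.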
There are workarounds and by careful patching and changing default settings one can palliate the worst situations; but for example 10 seconds of 'dirty_expire_centisecs' seems way too long (IIRC you have a fairly large memory and RAID) and other settings matter more.

I have written quite a bit in my blog about these issues, and you may find this particular entry rather relevant:

  http://www.sabi.co.uk/blog/0707jul.html#070701

In general on a fast machine I would use:

  vm/dirty_ratio                =4
  vm/dirty_background_ratio     =2
  vm/dirty_expire_centisecs     =400
  vm/dirty_writeback_centisecs  =200

or half of every one.

Short flushing times also ensure more continuous flushing (without huge periodic gulps), which can significantly improve *write* performance for streaming applications (XFS etc. delayed allocation is designed to improve read performance despite the lack of preallocation).

This cannot be done on laptops, where short flushing times are bad for power consumption, but at least they are battery backed, and hopefully SSDs will save us anyhow.
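
On the application side of this thread, the pattern usually suggested for files such as a mail client's config is: write a temporary file, fsync() it, then rename() it over the old one. The following is a minimal sketch of that pattern, assuming a POSIX system; 'atomic_write' is a hypothetical helper, not something kmail or XFS provides, and whether fsync() truly reaches stable storage still depends on the drive write cache and the block layer discussed above.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the rename
    # stays within one filesystem. Note that mkstemp creates it with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to write the data out
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Best-effort cleanup of the temporary file on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory as well, so the rename itself is written out.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The point of the ordering is that the new name never refers to a file whose data has not been flushed, which is the zero-size-file case this thread started from.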
From eflorac@intellique.com Tue Sep 1 09:20:16 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,RCVD_IN_BRBL autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81EJuKv179894 for ; Tue, 1 Sep 2009 09:20:06 -0500 X-ASG-Debug-ID: 1251814782-322a013a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from smtp2-g21.free.fr (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7E7504164FA for ; Tue, 1 Sep 2009 07:20:01 -0700 (PDT) Received: from smtp2-g21.free.fr (smtp2-g21.free.fr [212.27.42.2]) by cuda.sgi.com with ESMTP id 9I1GYpknkffHXbmB for ; Tue, 01 Sep 2009 07:20:01 -0700 (PDT) Received: from smtp2-g21.free.fr (localhost [127.0.0.1]) by smtp2-g21.free.fr (Postfix) with ESMTP id 1C8264B01D8 for ; Tue, 1 Sep 2009 16:19:05 +0200 (CEST) Received: from harpe.intellique.com (labo.djinux.com [82.225.196.72]) by smtp2-g21.free.fr (Postfix) with ESMTP id 2C0B24B0143 for ; Tue, 1 Sep 2009 16:19:02 +0200 (CEST) Date: Tue, 1 Sep 2009 16:19:06 +0200 From: Emmanuel Florac To: Linux XFS X-ASG-Orig-Subj: Re: zero size file after power failure with kernel 2.6.30.5 Subject: Re: zero size file after power failure with kernel 2.6.30.5 Message-ID: <20090901161906.0a4ca1d1@harpe.intellique.com> In-Reply-To: <19100.63566.98250.185404@tree.ty.sabi.co.uk> References: <200908292102.21710@zmi.at> <4A99A80C.9010307@sandeen.net> <19100.22644.149019.555685@tree.ty.sabi.co.uk> <200909010918.37886@zmi.at> <19100.63566.98250.185404@tree.ty.sabi.co.uk> Organization: Intellique X-Mailer: Claws Mail 3.7.1 (GTK+ 2.16.4; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Barracuda-Connect: smtp2-g21.free.fr[212.27.42.2] X-Barracuda-Start-Time: 1251814852 
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7822 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

On Tue, 1 Sep 2009 10:32:46 +0000 pg_xf2@xf2.sabi.co.UK (Peter Grandi) wrote:

> If that and having good scalable performance at the same time
> requires having dual power supplies, redundant storage paths,
> and battery backup, that is the typical platform on which XFS is
> deployed.

To temper this, I have used systems with XFS daily for the last 13 years (including IRIX workstations and PCs with only one drive) and only once had a problem clearly related to XFS (a well-known bug, long since fixed).
--=20 ---------------------------------------- Emmanuel Florac | Intellique ---------------------------------------- From just.for.lkml@googlemail.com Tue Sep 1 11:59:45 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.1 required=5.0 tests=AWL,BAYES_00,HEADER_ESQ autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81GxP2G188445 for ; Tue, 1 Sep 2009 11:59:35 -0500 X-ASG-Debug-ID: 1251824410-4a3703820000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail-bw0-f216.google.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id CC317159CCAE for ; Tue, 1 Sep 2009 10:00:11 -0700 (PDT) Received: from mail-bw0-f216.google.com (mail-bw0-f216.google.com [209.85.218.216]) by cuda.sgi.com with ESMTP id nR8z18Vrr54xdQlp for ; Tue, 01 Sep 2009 10:00:11 -0700 (PDT) Received: by bwz12 with SMTP id 12so134975bwz.20 for ; Tue, 01 Sep 2009 10:00:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=X4iTiJ/9yf/RnmXcsxWuABJK+RG2qQ6uUDeCOt8n98A=; b=VtBQWLYkOQOIauG80pyb/Qs1mARjFnjKPn0W44vXG/oOOOHTMFd4W3Tdf/B0CKwmVG N3mAn6beSQ19wwFPeUd8qXigSDPblxDNu+//Fzl+PcU2oYBFEs0pqp8Yaw0LxnWtv/q9 S2hZACgEu7NvJoZMqGS9eHu6oZaPa8sWBUhYk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=KCes2ylM0V6xTWczlkfnnu43yuNhaZPolJE1iazHBZcqqNTDRYA/ByCbag4QfdK7i/ Xn6qADoWjbQYGb28ogkOxPaKuk+GxMfEhPqBmriNMbiYQCnwbLNFjyNPuoAuHxNBtMo/ 8r2BY7jsfs/q3We9FGVn/0vSMWfHJtvDZ9aDc= MIME-Version: 1.0 Received: by 10.223.6.23 with SMTP id 23mr2849976fax.89.1251824406221; Tue, 01 Sep 2009 10:00:06 -0700 (PDT) In-Reply-To: 
<20090831182754.GA3620@infradead.org> References: <4A9B759B.7020401@msgid.tls.msk.ru> <20090831123010.GA2368@infradead.org> <64bb37e0908311114t4a3cefc3v8ea5092e1558c578@mail.gmail.com> <20090831182754.GA3620@infradead.org> Date: Tue, 1 Sep 2009 19:00:05 +0200 Message-ID: <64bb37e0909011000l78a0aef0wf4c53252c14af75e@mail.gmail.com> X-ASG-Orig-Subj: Re: xfs compat_ioctl? Subject: Re: xfs compat_ioctl? From: Torsten Kaiser To: Christoph Hellwig Cc: Michael Tokarev , Linux-kernel , linux-fsdevel , xfs@oss.sgi.com Content-Type: text/plain; charset=ISO-8859-1 X-Barracuda-Connect: mail-bw0-f216.google.com[209.85.218.216] X-Barracuda-Start-Time: 1251824417 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7831 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Mon, Aug 31, 2009 at 8:27 PM, Christoph Hellwig wrote: > I think you are right, the constant used is incorrect. Does the small > patch below fix it for you? Yes, after adding this patch, xfs_fsr works. 
> Index: linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c > =================================================================== > --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:06.093044591 -0300 > +++ linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:10.856544216 -0300 > @@ -619,7 +619,7 @@ xfs_file_compat_ioctl( > case XFS_IOC_GETVERSION_32: > cmd = _NATIVE_IOC(cmd, long); > return xfs_file_ioctl(filp, cmd, p); > - case XFS_IOC_SWAPEXT: { > + case XFS_IOC_SWAPEXT_32: { > struct xfs_swapext sxp; > struct compat_xfs_swapext __user *sxu = arg; > > From sandeen@sandeen.net Tue Sep 1 12:55:30 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81HtAQA197497 for ; Tue, 1 Sep 2009 12:55:20 -0500 X-ASG-Debug-ID: 1251827739-729301110000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 317F315A10C5 for ; Tue, 1 Sep 2009 10:55:39 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by cuda.sgi.com with ESMTP id lsfdAjSuNmlgl5F0 for ; Tue, 01 Sep 2009 10:55:39 -0700 (PDT) Received: from int-mx04.intmail.prod.int.phx2.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.17]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n81HtPTk003063; Tue, 1 Sep 2009 13:55:26 -0400 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by int-mx04.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n81HtNE2015956; Tue, 1 Sep 2009 13:55:25 -0400 Message-ID: <4A9D600B.4020405@sandeen.net> Date: Tue, 01 Sep 2009 12:55:23 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.21 (X11/20090320) MIME-Version: 1.0 To: Torsten Kaiser CC: Christoph 
Hellwig , linux-fsdevel , Michael Tokarev , Linux-kernel , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: xfs compat_ioctl? Subject: Re: xfs compat_ioctl? References: <4A9B759B.7020401@msgid.tls.msk.ru> <20090831123010.GA2368@infradead.org> <64bb37e0908311114t4a3cefc3v8ea5092e1558c578@mail.gmail.com> <20090831182754.GA3620@infradead.org> <64bb37e0909011000l78a0aef0wf4c53252c14af75e@mail.gmail.com> In-Reply-To: <64bb37e0909011000l78a0aef0wf4c53252c14af75e@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 10.5.11.17 X-Barracuda-Connect: mx1.redhat.com[209.132.183.28] X-Barracuda-Start-Time: 1251827763 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7834 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Torsten Kaiser wrote: > On Mon, Aug 31, 2009 at 8:27 PM, Christoph Hellwig wrote: >> I think you are right, the constant used is incorrect. Does the small >> patch below fix it for you? > > Yes, after adding this patch, xfs_fsr works. Crud, sorry about that. I swear I ran 32-bit xfstests under a 64-bit kernel, but I think we were lacking in fsr coverage.... 
-Eric >> Index: linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c >> =================================================================== >> --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:06.093044591 -0300 >> +++ linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:10.856544216 -0300 >> @@ -619,7 +619,7 @@ xfs_file_compat_ioctl( >> case XFS_IOC_GETVERSION_32: >> cmd = _NATIVE_IOC(cmd, long); >> return xfs_file_ioctl(filp, cmd, p); >> - case XFS_IOC_SWAPEXT: { >> + case XFS_IOC_SWAPEXT_32: { >> struct xfs_swapext sxp; >> struct compat_xfs_swapext __user *sxu = arg; >> >> > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs > From BATV+88945cdf03a0dcaa8c4e+2200+infradead.org+hch@bombadil.srs.infradead.org Tue Sep 1 13:05:47 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81I5LsR198025 for ; Tue, 1 Sep 2009 13:05:37 -0500 X-ASG-Debug-ID: 1251828377-7294015e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 252C014EE726 for ; Tue, 1 Sep 2009 11:06:17 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id IJJVSY57Ers7AhbA for ; Tue, 01 Sep 2009 11:06:17 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MiXhE-0007C5-9r for xfs@oss.sgi.com; Tue, 01 Sep 2009 18:03:08 +0000 Date: Tue, 1 Sep 2009 14:03:08 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] xfs: actually enable the swapext compat handler Subject: [PATCH] xfs: actually 
enable the swapext compat handler Message-ID: <20090901180308.GA26071@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251828378 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Fix a small typo in the compat ioctl handler that cause the swapext compat handler to never be called. Signed-off-by: Christoph Hellwig Reviewed-by: Torsten Kaiser Tested-by: Torsten Kaiser Index: linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:06.093044591 -0300 +++ linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:10.856544216 -0300 @@ -619,7 +619,7 @@ xfs_file_compat_ioctl( case XFS_IOC_GETVERSION_32: cmd = _NATIVE_IOC(cmd, long); return xfs_file_ioctl(filp, cmd, p); - case XFS_IOC_SWAPEXT: { + case XFS_IOC_SWAPEXT_32: { struct xfs_swapext sxp; struct compat_xfs_swapext __user *sxu = arg; From BATV+88945cdf03a0dcaa8c4e+2200+infradead.org+hch@bombadil.srs.infradead.org Tue Sep 1 13:07:49 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81I7O9P198127 for ; Tue, 1 Sep 2009 13:07:39 -0500 X-ASG-Debug-ID: 1251828500-726c01770000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 
F0BA1159E3B8 for ; Tue, 1 Sep 2009 11:08:20 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id nenOO104O2S2QQF5 for ; Tue, 01 Sep 2009 11:08:20 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MiXmG-0008DM-L0 for xfs@oss.sgi.com; Tue, 01 Sep 2009 18:08:20 +0000 Date: Tue, 1 Sep 2009 14:08:20 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: PATCH] xfs: implement .dirty_inode to fix timestamp handling Subject: Re: PATCH] xfs: implement .dirty_inode to fix timestamp handling Message-ID: <20090901180820.GB26071@infradead.org> References: <20090827031242.GB6147@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090827031242.GB6147@infradead.org> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251828500 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean I managed to trigger the ASSERT in the reclaim path, so it looks like both this version and our previous code are buggy. It's back to the drawing board for now until I figure out what's going on.
From BATV+88945cdf03a0dcaa8c4e+2200+infradead.org+hch@bombadil.srs.infradead.org Tue Sep 1 13:13:10 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81ICjVF198483 for ; Tue, 1 Sep 2009 13:13:00 -0500 X-ASG-Debug-ID: 1251828821-721901db0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 101C3417D89 for ; Tue, 1 Sep 2009 11:13:41 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id QmwgHQfsI8SY7ai6 for ; Tue, 01 Sep 2009 11:13:41 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MiXrR-0001Cl-EL; Tue, 01 Sep 2009 18:13:41 +0000 Date: Tue, 1 Sep 2009 14:13:41 -0400 From: Christoph Hellwig To: John Quigley Cc: XFS Development X-ASG-Orig-Subj: Re: XFS corruption with power failure Subject: Re: XFS corruption with power failure Message-ID: <20090901181341.GC26071@infradead.org> References: <606994882.2142291250648292843.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> <4A8C1E6E.8020405@jquigley.com> <4A9187C7.9010206@jquigley.com> <4A9576F0.2060304@jquigley.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4A9576F0.2060304@jquigley.com> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251828822 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter 
version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Wed, Aug 26, 2009 at 12:54:56PM -0500, John Quigley wrote: > John Quigley wrote: >> John Quigley wrote: >>> We've distilled this into a reproducible environment with a stack of >>> NFS + XFS to a local disk + automated sysrq 'b' reboots. We're >>> working on getting this bundled up into a nice little package as a >>> VirtualBox vm for your consumption. Please tell me if this is not >>> desirable. >> >> The self-contained and reproducible environment can be downloaded from >> the following location: >> >> http://www.jquigley.com/tmp/xfsVM.tar.bz2 > > Has anyone by chance had an opportunity to utilize this? Any corruption reports? Looked at it, but it turns out VirtualBox is a real big pile of junk, including its own huge kernel module. Qemu/kvm now has support for the VirtualBox disk images and I will give it a try next. From richardc@efilmgroup.com Tue Sep 1 13:27:00 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: *** X-Spam-Status: No, score=3.3 required=5.0 tests=BAYES_50,HTML_MESSAGE, URIBL_GREY autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81IQdJN199224 for ; Tue, 1 Sep 2009 13:26:49 -0500 X-ASG-Debug-ID: 1251829647-2da900860000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from goff1.goffgrafix.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9A9CA15A0EF9 for ; Tue, 1 Sep 2009 11:27:27 -0700 (PDT) Received: from goff1.goffgrafix.com (goff1.goffgrafix.com [208.43.246.232]) by cuda.sgi.com with ESMTP id qdec0hBv116tg1sZ for ; Tue, 01 Sep 2009 11:27:27 -0700 (PDT) Received: from c-76-118-59-6.hsd1.ma.comcast.net ([76.118.59.6] helo=Distrobution) by goff1.goffgrafix.com with esmtpa (Exim 4.69) (envelope-from ) id 1MiXph-0000pp-IK; Tue, 01 Sep 2009 14:11:53 -0400 From: "Richard Cohen" To: X-ASG-Orig-Subj:
Active Shooters - Response Training Subject: Active Shooters - Response Training Date: Tue, 1 Sep 2009 14:11:40 -0400 Message-ID: MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_00C0_01CA2B0E.27A87B50" X-Mailer: Microsoft Office Outlook 11 Thread-Index: AcorFgn3Y82VyBjdQ164b/W5NvEbdgAABijAAADL3cAAADO2gAAEfCkgAABCW1AAAC9+QAAAFOOgAABK8yAAABHZcA== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579 X-Antivirus: avast! (VPS 090831-0, 08/31/2009), Outbound message X-Antivirus-Status: Clean X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - goff1.goffgrafix.com X-AntiAbuse: Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - efilmgroup.com X-Barracuda-Connect: goff1.goffgrafix.com[208.43.246.232] X-Barracuda-Start-Time: 1251829649 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.01 X-Barracuda-Spam-Status: No, SCORE=-2.01 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_SA_TO_FROM_DOMAIN_MATCH, HTML_MESSAGE, HTTP_ESCAPED_HOST X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7836 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 HTTP_ESCAPED_HOST URI: Uses %-escapes inside a URL's hostname 0.00 HTML_MESSAGE BODY: HTML included in message 0.01 BSF_SC0_SA_TO_FROM_DOMAIN_MATCH Sender Domain Matches Recipient Domain X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean This is a multi-part message in MIME format. 
------=_NextPart_000_00C0_01CA2B0E.27A87B50 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Tactics and strategy for rapid response to an active shooter incident In mass shootings, the usual rules do not apply Waiting for support teams is not an option Mass killings at Columbine High School and Virginia Tech are only two of the many notorious incidents that have been widely reported. The 2008 terrorist attack in Mumbai may represent a new level of sophistication in active shooter incidents. Pre-incident planning and immediate action by response personnel are required to control the outcome of a mass killing and minimize the number of dead and injured. Emergency Film Group has produced "Active Shooter: Rapid Response," a DVD-based training package for law enforcement, emergency management, and other emergency personnel who may respond to a mass shooting. Included with the package is a 35-minute training film, a bonus segment on "Tactics," and an Instructor's CD-Rom with customizable PowerPoint presentation, Post-Seminar Quiz, and other resources helpful in presenting a seminar. Package price is just $425. Volume discounts apply. Please call or write richardc@efilmgroup.com. For a free preview clip, visit http://www.efilmgroup.com/Law-Enforcement/Active-Shooter-Rapid-Response-Vide o.html. This message was sent from: Emergency Film Group, 140 Cooke St., Edgartown, MA 02539. You may unsubscribe by replying with "Unsubscribe" in the subject line.
------=_NextPart_000_00C0_01CA2B0E.27A87B50-- From jquigley@jquigley.com Tue Sep 1 14:16:43 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81JGMS9201750 for ; Tue, 1 Sep 2009 14:16:33 -0500 X-ASG-Debug-ID: 1251832634-470100540000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.jquigley.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 45C07418225 for ; Tue, 1 Sep 2009 12:17:14 -0700 (PDT) Received: from mail.jquigley.com (main.jquigley.com [67.23.32.156]) by cuda.sgi.com with ESMTP id bG3IWaspvvqtWdfm for ; Tue, 01 Sep 2009 12:17:14 -0700 (PDT) Received: from [10.1.1.10] (OSH-NAT-213-67.onshore.net [66.146.213.67]) (Authenticated sender: jquigley@mail.jquigley.com) by mail.jquigley.com (Postfix) with ESMTPSA id 6F926204052 for ; Tue, 1 Sep 2009 19:17:11 +0000 (UTC) Message-ID: <4A9D7334.2040500@jquigley.com> Date: Tue, 01 Sep 2009 14:17:08 -0500 From: John Quigley User-Agent: Thunderbird 2.0.0.23 (Windows/20090812) MIME-Version: 1.0 To: XFS Development X-ASG-Orig-Subj: Re: XFS corruption with power failure Subject: Re: XFS corruption with power failure References: <606994882.2142291250648292843.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> <4A8C1E6E.8020405@jquigley.com> <4A9187C7.9010206@jquigley.com> In-Reply-To: <4A9187C7.9010206@jquigley.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: main.jquigley.com[67.23.32.156] X-Barracuda-Start-Time: 1251832639 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of 
TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7838 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean John Quigley wrote: > John Quigley wrote: >> We've distilled this into a reproducible environment with a stack of >> NFS + XFS to a local disk + automated sysrq 'b' reboots. We're >> working on getting this bundled up into a nice little package as a >> VirtualBox vm for your consumption. Please tell me if this is not >> desirable. By way of an update, the corruption is definitely specific to Linux nfsd access to XFS at the time of power failure. We've been unable to reproduce the problem in any context other than running IO through NFS to the underlying XFS mount. - John Quigley From sandeen@sandeen.net Tue Sep 1 14:46:00 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81JjebJ203182 for ; Tue, 1 Sep 2009 14:45:50 -0500 X-ASG-Debug-ID: 1251834366-467500f60000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 201254183FA for ; Tue, 1 Sep 2009 12:46:06 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by cuda.sgi.com with ESMTP id wl3WKzDN7YItWKP0 for ; Tue, 01 Sep 2009 12:46:06 -0700 (PDT) Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n81Jk0qt021149; Tue, 1 Sep 2009 15:46:00 -0400 Received: from
neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n81JjxYh007765; Tue, 1 Sep 2009 15:45:59 -0400 Message-ID: <4A9D79F7.9080702@sandeen.net> Date: Tue, 01 Sep 2009 14:45:59 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.21 (X11/20090320) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] xfs: actually enable the swapext compat handler Subject: Re: [PATCH] xfs: actually enable the swapext compat handler References: <20090901180308.GA26071@infradead.org> In-Reply-To: <20090901180308.GA26071@infradead.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 X-Barracuda-Connect: mx1.redhat.com[209.132.183.28] X-Barracuda-Start-Time: 1251834393 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7840 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Christoph Hellwig wrote: > Fix a small typo in the compat ioctl handler that cause the swapext > compat handler to never be called. 
> > Signed-off-by: Christoph Hellwig > Reviewed-by: Torsten Kaiser > Tested-by: Torsten Kaiser Guilty-party: Eric Sandeen Reviewed-by: Eric Sandeen > Index: linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c > =================================================================== > --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:06.093044591 -0300 > +++ linux-2.6/fs/xfs/linux-2.6/xfs_ioctl32.c 2009-08-31 15:25:10.856544216 -0300 > @@ -619,7 +619,7 @@ xfs_file_compat_ioctl( > case XFS_IOC_GETVERSION_32: > cmd = _NATIVE_IOC(cmd, long); > return xfs_file_ioctl(filp, cmd, p); > - case XFS_IOC_SWAPEXT: { > + case XFS_IOC_SWAPEXT_32: { > struct xfs_swapext sxp; > struct compat_xfs_swapext __user *sxu = arg; > > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs > From BATV+88945cdf03a0dcaa8c4e+2200+infradead.org+hch@bombadil.srs.infradead.org Tue Sep 1 15:13:18 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81KCrik204599 for ; Tue, 1 Sep 2009 15:13:08 -0500 X-ASG-Debug-ID: 1251836009-575201a90000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id EC75615A20E3 for ; Tue, 1 Sep 2009 13:13:29 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id POHuqa8KStNqGEv3 for ; Tue, 01 Sep 2009 13:13:29 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MiZjN-0003o2-E3; Tue, 01 Sep 2009 20:13:29 +0000 Date: Tue, 1 Sep 2009 16:13:29 -0400 From: Christoph Hellwig To: Eric Sandeen 
Cc: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] xfs: actually enable the swapext compat handler Subject: Re: [PATCH] xfs: actually enable the swapext compat handler Message-ID: <20090901201329.GA14400@infradead.org> References: <20090901180308.GA26071@infradead.org> <4A9D79F7.9080702@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4A9D79F7.9080702@sandeen.net> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251836029 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Tue, Sep 01, 2009 at 02:45:59PM -0500, Eric Sandeen wrote: > Christoph Hellwig wrote: > > Fix a small typo in the compat ioctl handler that cause the swapext > > compat handler to never be called. > > > > Signed-off-by: Christoph Hellwig > > Reviewed-by: Torsten Kaiser > > Tested-by: Torsten Kaiser > > Guilty-party: Eric Sandeen > Reviewed-by: Eric Sandeen Haha, thanks. Felix, can you push this one to Linus for 2.6.31? 
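As background for readers of the archive, here is a minimal sketch of why a one-word typo like the one fixed in this patch can disable a compat handler entirely. It assumes the usual Linux convention that an ioctl command number encodes the size of its argument structure, so a 32-bit caller whose structure has a different size sends a different number than the native one. All names below (DEMO_IOWR, the demo_* structs and functions) are hypothetical stand-ins, not the real XFS definitions, and the struct layouts are invented purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative encoding only: direction, argument size, type and number
 * packed into one 32-bit command word, in the style of the common
 * Linux ioctl layout. DEMO_* names are hypothetical.
 */
#define DEMO_IOC_READ_WRITE 3u
#define DEMO_IOWR(type, nr, arg_type) \
	((DEMO_IOC_READ_WRITE << 30) | ((uint32_t)sizeof(arg_type) << 16) | \
	 ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* Hypothetical argument layouts; NOT the real xfs_swapext structures. */
struct demo_native_arg { int64_t version; int64_t offset; };	/* 16 bytes */
struct demo_compat_arg { int32_t version; int32_t offset; };	/*  8 bytes */

/* Same type and number, different argument size, hence different values. */
#define DEMO_IOC_SWAP    DEMO_IOWR('X', 109, struct demo_native_arg)
#define DEMO_IOC_SWAP_32 DEMO_IOWR('X', 109, struct demo_compat_arg)

static_assert(DEMO_IOC_SWAP != DEMO_IOC_SWAP_32,
	      "native and compat command numbers must differ");

#define DEMO_ENOTTY (-25)	/* stand-in for "command not handled" */

/* Buggy dispatcher: the compat path tests for the native number. */
static int demo_compat_dispatch_buggy(uint32_t cmd)
{
	switch (cmd) {
	case DEMO_IOC_SWAP:	/* a 32-bit caller never sends this value */
		return 0;
	default:
		return DEMO_ENOTTY;
	}
}

/* Fixed dispatcher: the compat path tests for the compat number. */
static int demo_compat_dispatch_fixed(uint32_t cmd)
{
	switch (cmd) {
	case DEMO_IOC_SWAP_32:
		return 0;
	default:
		return DEMO_ENOTTY;
	}
}
```

With these definitions, passing DEMO_IOC_SWAP_32 to the buggy dispatcher falls through to the default branch, while the fixed dispatcher handles it. That mirrors the symptom reported in this thread, where the compat handler was never reached for 32-bit callers.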
From felixb@sgi.com Tue Sep 1 16:37:32 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81LbC6p209103 for ; Tue, 1 Sep 2009 16:37:22 -0500 Received: from estes.americas.sgi.com (estes.americas.sgi.com [128.162.236.10]) by relay2.corp.sgi.com (Postfix) with ESMTP id D89193040C0 for ; Tue, 1 Sep 2009 14:38:08 -0700 (PDT) Received: from eagdhcp-232-185.americas.sgi.com (eagdhcp-232-185.americas.sgi.com [128.162.232.185]) by estes.americas.sgi.com (Postfix) with ESMTP id AA302700074B; Tue, 1 Sep 2009 16:22:54 -0500 (CDT) Cc: Eric Sandeen , xfs@oss.sgi.com Message-Id: From: Felix Blyakher To: Christoph Hellwig In-Reply-To: <20090901201329.GA14400@infradead.org> Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v926) Subject: Re: [PATCH] xfs: actually enable the swapext compat handler Date: Tue, 1 Sep 2009 16:22:54 -0500 References: <20090901180308.GA26071@infradead.org> <4A9D79F7.9080702@sandeen.net> <20090901201329.GA14400@infradead.org> X-Mailer: Apple Mail (2.926) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Sep 1, 2009, at 3:13 PM, Christoph Hellwig wrote: > On Tue, Sep 01, 2009 at 02:45:59PM -0500, Eric Sandeen wrote: >> Christoph Hellwig wrote: >>> Fix a small typo in the compat ioctl handler that cause the swapext >>> compat handler to never be called. >>> >>> Signed-off-by: Christoph Hellwig >>> Reviewed-by: Torsten Kaiser >>> Tested-by: Torsten Kaiser >> >> Guilty-party: Eric Sandeen >> Reviewed-by: Eric Sandeen Yep, trivial fix. Looks good. Reviewed-by: Felix Blyakher >> > > Haha, thanks. Felix, can you push this one to Linus for 2.6.31? 
Sure, doing it now. Felix > > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs From felixb@oss.sgi.com Tue Sep 1 16:41:27 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from oss.sgi.com (localhost [127.0.0.1]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81LfM0x209411 for ; Tue, 1 Sep 2009 16:41:27 -0500 Received: (from felixb@localhost) by oss.sgi.com (8.14.3/8.14.3/Submit) id n81LfMOq209325; Tue, 1 Sep 2009 16:41:22 -0500 Date: Tue, 1 Sep 2009 16:41:22 -0500 Message-Id: <200909012141.n81LfMOq209325@oss.sgi.com> From: xfs@oss.sgi.com To: xfs@oss.sgi.com Subject: [XFS updates] XFS development tree branch, master, updated. v2.6.30-rc4-12474-gaa72a5c X-Git-Refname: refs/heads/master X-Git-Reftype: branch X-Git-Oldrev: 1da1daed813c534263a87ffc36d5b775e65231ad X-Git-Newrev: aa72a5cf00001d0b952c7c755be404b9118ceb2e This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "XFS development tree". The branch, master has been updated aa72a5c xfs: simplify xfs_trans_iget 13e6d5c xfs: merge fsync and O_SYNC handling bd16956 xfs: speed up free inode search 2187550 xfs: rationalize xfs_inobt_lookup* 4254b0b xfs: untangle xfs_dialloc 0b48db8 xfs: factor out debug checks from xfs_dialloc and xfs_difree afabc24 xfs: improve xfs_inobt_update prototype 2e287a7 xfs: improve xfs_inobt_get_rec prototype 85c0b2a xfs: factor out inode initialisation from 1da1daed813c534263a87ffc36d5b775e65231ad (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. 
- Log -----------------------------------------------------------------
commit aa72a5cf00001d0b952c7c755be404b9118ceb2e
Author: Christoph Hellwig
Date:   Mon Aug 31 21:51:52 2009 -0300

    xfs: simplify xfs_trans_iget

    xfs_trans_iget is a wrapper for xfs_iget that adds the inode to the
    transaction after it is read, except when the inode is already in
    the inode cache, in which case it returns the existing locked inode
    with incremented lock recursion counts.

    Now, no one in the tree ever decrements these lock recursion counts,
    so any user of this gets a potential double unlock when both the
    original owner of the inode and the xfs_trans_iget caller unlock it.
    When looking back in a git bisect in the historic XFS tree there was
    only one place that decremented these counts, xfs_trans_iput. It was
    introduced in commit ca25df7a840f426eb566d52667b6950b92bb84b5 by
    Adam Sweeney in 1993, and removed in commit
    19f899a3ab155ff6a49c0c79b06f2f61059afaf3 by Steve Lord in 2003. And
    unless it slipped through the cracks of git bisect, it was never
    actually used in that time frame.

    A quick audit of the callers of xfs_trans_iget shows that,
    fortunately, no caller really relies on this behaviour: xfs_ialloc
    allocates this inode from disk so it must not be there before, and
    all the RT allocator routines only ever add each RT bitmap inode
    once.

    In addition to removing lots of code and reducing the size of the
    inode item, this patch also avoids the double inode cache lookup in
    each create/mkdir/mknod transaction.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

commit 13e6d5cdde0e785aa943810f08b801cadd0935df
Author: Christoph Hellwig
Date:   Mon Aug 31 21:00:31 2009 -0300

    xfs: merge fsync and O_SYNC handling

    The guarantees for O_SYNC are exactly the same as the ones we need
    to make for an fsync call (and given that Linux O_SYNC is O_DSYNC
    the equivalent is fdatasync, but we treat both the same in XFS),
    except with a range data writeout.
    Jan Kara has started unifying these two paths for filesystems using
    the generic helpers, and I've started to look at XFS.

    The actual transaction committed by xfs_fsync and
    xfs_write_sync_logforce has a different transaction number, but
    actually is exactly the same. We'll only use the fsync transaction
    going forward. One major difference is that xfs_write_sync_logforce
    never issues a cache flush unless we commit a transaction causing
    that as a side-effect, which is an obvious bug in the O_SYNC
    handling. Second, all the locking and i_update_size vs
    i_update_core changes from 978b7237123d007b9fa983af6e0e2fa8f97f9934
    never made it to xfs_write_sync_logforce, so we add them back.

    To make xfs_fsync easily usable from the O_SYNC path, the
    filemap_fdatawait call is moved up to xfs_file_fsync, so that we
    don't wait on the whole file after we already waited for our portion
    in xfs_write. We'll also use a plain call to
    filemap_write_and_wait_range instead of the previous
    sync_page_range, which did it in two steps including a half-hearted
    inode writeout that doesn't help us.

    Once we're done with this, also remove the now useless i_update_size
    tracking.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Felix Blyakher
    Signed-off-by: Felix Blyakher

commit bd169565993b39b9b4b102cdac8b13e0a259ce2f
Author: Dave Chinner
Date:   Mon Aug 31 20:58:28 2009 -0300

    xfs: speed up free inode search

    Don't search too far - abort if it is outside a certain radius and
    simply do a linear search for the first free inode. In AGs with a
    million inodes this can speed up allocation by 3-4x.
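The search strategy described in this commit message might be sketched roughly as follows. This is only an illustration under simplifying assumptions: the real code walks inode btree records, whereas this sketch uses a flat array of per-record free counts, and the names demo_find_free, free_count and radius are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Look for a record with free inodes near `start`, giving up once the
 * distance exceeds `radius`, then fall back to a plain linear scan for
 * the first record that has any free inode.
 *
 * Returns the index of a record with free inodes, or -1 if none exists.
 * Hypothetical helper operating on a flat array, not on a btree.
 */
static long demo_find_free(const unsigned *free_count, size_t nrecs,
			   size_t start, size_t radius)
{
	assert(free_count != NULL || nrecs == 0);

	if (nrecs == 0)
		return -1;
	if (start >= nrecs)
		start = nrecs - 1;

	/* Bounded search: widen outwards from start, left then right. */
	for (size_t dist = 0; dist <= radius; dist++) {
		if (dist <= start && free_count[start - dist] > 0)
			return (long)(start - dist);
		if (start + dist < nrecs && free_count[start + dist] > 0)
			return (long)(start + dist);
		/* Both directions have run off the array: stop early. */
		if (dist > start && start + dist >= nrecs)
			break;
	}

	/* Outside the radius: linear search for the first free record. */
	for (size_t i = 0; i < nrecs; i++)
		if (free_count[i] > 0)
			return (long)i;

	return -1;
}
```

The point of the bound is that the nearby search costs at most a fixed number of steps regardless of how large the allocation group is. The fallback gives up locality, which is the trade-off the commit message describes.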
    [hch: ported to the new xfs_ialloc.c world order]

    Signed-off-by: Dave Chinner
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

commit 2187550525d7bcb8c87689e4eca41b1955bf9ac3
Author: Christoph Hellwig
Date:   Mon Aug 31 20:58:21 2009 -0300

    xfs: rationalize xfs_inobt_lookup*

    Currently we have an xfs_inobt_lookup* variant for each comparison
    direction, and all these get all three fields of the inobt records
    passed, while the common case is just looking for the inode number
    and we have only marginally more callers than xfs_inobt_lookup*
    variants.

    So open-code a direct call to xfs_btree_lookup for the single case
    where we need all fields, and replace xfs_inobt_lookup* with an
    xfs_inobt_lookup that just takes the inode number and the direction
    for all other callers.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

commit 4254b0bbb1c0826b7443ffa593576696bc591aa2
Author: Christoph Hellwig
Date:   Mon Aug 31 20:57:14 2009 -0300

    xfs: untangle xfs_dialloc

    Clarify the control flow in xfs_dialloc. Factor out a helper to go
    to the next node from the current one and improve the control flow
    by expanding composite if statements and using gotos.

    The xfs_ialloc_next_rec helper is borrowed from Dave Chinner's
    dynamic allocation policy patches.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Alex Elder
    Signed-off-by: Felix Blyakher

commit 0b48db80ba689edfd96ed06c3124d6cf1146de3f
Author: Dave Chinner
Date:   Mon Aug 31 20:57:09 2009 -0300

    xfs: factor out debug checks from xfs_dialloc and xfs_difree

    Factor out a common helper from repeated debug checks in xfs_dialloc
    and xfs_difree.
[hch: split out from Dave's dynamic allocation policy patches]

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Felix Blyakher

commit afabc24a73bfee2656724b0a70395f1693eaa62b
Author: Christoph Hellwig
Date: Mon Aug 31 20:57:03 2009 -0300

xfs: improve xfs_inobt_update prototype

Both callers of xfs_inobt_update have the record in the form of a xfs_inobt_rec_incore_t, so just pass a pointer to it instead of the individual variables.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Felix Blyakher

commit 2e287a731e0607e0371dc6165b7dd3ebc67fa8e1
Author: Christoph Hellwig
Date: Mon Aug 31 20:56:58 2009 -0300

xfs: improve xfs_inobt_get_rec prototype

Most callers of xfs_inobt_get_rec need to fill a xfs_inobt_rec_incore_t, and those that don't yet are fine with a xfs_inobt_rec_incore_t instead of the three individual variables, too. So just change xfs_inobt_get_rec to write the output into a xfs_inobt_rec_incore_t directly.

Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Felix Blyakher

commit 85c0b2ab5e69ca6133380ead1c50e0840d136b39
Author: Dave Chinner
Date: Mon Aug 31 20:56:51 2009 -0300

xfs: factor out inode initialisation

Factor out code to initialize new inode clusters into a function of its own. This keeps xfs_ialloc_ag_alloc smaller and better structured and enables a future inode cluster initialization transaction. Also initialize the agno variable earlier in xfs_ialloc_ag_alloc to avoid repeated byte swaps.
[hch: The original patch is from Dave, from his unpublished inode create transaction patch series, with some modifications by me to apply stand-alone]

Signed-off-by: Dave Chinner
Signed-off-by: Christoph Hellwig
Reviewed-by: Alex Elder
Signed-off-by: Felix Blyakher

-----------------------------------------------------------------------

Summary of changes:
 fs/xfs/linux-2.6/xfs_aops.c |    1 -
 fs/xfs/linux-2.6/xfs_file.c |   19 +-
 fs/xfs/linux-2.6/xfs_lrw.c  |    7 +-
 fs/xfs/xfs_ag.h             |    9 +
 fs/xfs/xfs_ialloc.c         |  805 ++++++++++++++++++++++-------------------
 fs/xfs/xfs_ialloc.h         |   18 +-
 fs/xfs/xfs_iget.c           |   27 --
 fs/xfs/xfs_inode.h          |    3 -
 fs/xfs/xfs_inode_item.c     |   10 -
 fs/xfs/xfs_inode_item.h     |    2 -
 fs/xfs/xfs_itable.c         |   96 +++---
 fs/xfs/xfs_rw.c             |   84 -----
 fs/xfs/xfs_rw.h             |    1 -
 fs/xfs/xfs_trans.h          |    2 +-
 fs/xfs/xfs_trans_inode.c    |   86 +-----
 fs/xfs/xfs_vnodeops.c       |   11 +-
 16 files changed, 499 insertions(+), 682 deletions(-)

hooks/post-receive
--
XFS development tree

From felixb@oss.sgi.com Tue Sep 1 16:56:30 2009
X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.6 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.3.0-rupdated
Received: from oss.sgi.com (localhost [127.0.0.1]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81LuPiT210267 for ; Tue, 1 Sep 2009 16:56:30 -0500
Received: (from felixb@localhost) by oss.sgi.com (8.14.3/8.14.3/Submit) id n81LuOO5210222; Tue, 1 Sep 2009 16:56:24 -0500
Date: Tue, 1 Sep 2009 16:56:24 -0500
Message-Id: <200909012156.n81LuOO5210222@oss.sgi.com>
From: xfs@oss.sgi.com
To: xfs@oss.sgi.com
Subject: [XFS updates] XFS development tree branch, master, updated. v2.6.30-rc4-12475-gf4378b6
X-Git-Refname: refs/heads/master
X-Git-Reftype: branch
X-Git-Oldrev: aa72a5cf00001d0b952c7c755be404b9118ceb2e
X-Git-Newrev: f4378b6eaf63492c0f9a397d52813e0ae6b49e7b

This is an automated email from the git hooks/post-receive script.
It was generated because a ref change was pushed to the repository containing the project "XFS development tree". The branch, master has been updated f4378b6 xfs: actually enable the swapext compat handler from aa72a5cf00001d0b952c7c755be404b9118ceb2e (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit f4378b6eaf63492c0f9a397d52813e0ae6b49e7b Author: Christoph Hellwig Date: Tue Sep 1 14:03:08 2009 -0400 xfs: actually enable the swapext compat handler Fix a small typo in the compat ioctl handler that cause the swapext compat handler to never be called. Signed-off-by: Christoph Hellwig Reviewed-by: Torsten Kaiser Tested-by: Torsten Kaiser Reviewed-by: Eric Sandeen Reviewed-by: Felix Blyakher Signed-off-by: Felix Blyakher ----------------------------------------------------------------------- Summary of changes: fs/xfs/linux-2.6/xfs_ioctl32.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) hooks/post-receive -- XFS development tree From felixb@oss.sgi.com Tue Sep 1 17:00:52 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from oss.sgi.com (localhost [127.0.0.1]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81M0lhF210619 for ; Tue, 1 Sep 2009 17:00:52 -0500 Received: (from felixb@localhost) by oss.sgi.com (8.14.3/8.14.3/Submit) id n81M0lxe210585; Tue, 1 Sep 2009 17:00:47 -0500 Date: Tue, 1 Sep 2009 17:00:47 -0500 Message-Id: <200909012200.n81M0lxe210585@oss.sgi.com> From: xfs@oss.sgi.com To: xfs@oss.sgi.com Subject: [XFS updates] XFS development tree branch, for-linus, updated. 
v2.6.30-rc4-12442-g3725867 X-Git-Refname: refs/heads/for-linus X-Git-Reftype: branch X-Git-Oldrev: bc990f5cb424cdca9dda866785d088e2c2110ecc X-Git-Newrev: 3725867dccfb83e4b0cff64e916a04258f300591 This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "XFS development tree". The branch, for-linus has been updated 3725867 xfs: actually enable the swapext compat handler from bc990f5cb424cdca9dda866785d088e2c2110ecc (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 3725867dccfb83e4b0cff64e916a04258f300591 Author: Christoph Hellwig Date: Tue Sep 1 14:03:08 2009 -0400 xfs: actually enable the swapext compat handler Fix a small typo in the compat ioctl handler that cause the swapext compat handler to never be called. 
Signed-off-by: Christoph Hellwig Reviewed-by: Torsten Kaiser Tested-by: Torsten Kaiser Reviewed-by: Eric Sandeen Reviewed-by: Felix Blyakher Signed-off-by: Felix Blyakher ----------------------------------------------------------------------- Summary of changes: fs/xfs/linux-2.6/xfs_ioctl32.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) hooks/post-receive -- XFS development tree From felixb@sgi.com Tue Sep 1 17:11:06 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81MAjVH211390 for ; Tue, 1 Sep 2009 17:10:56 -0500 Received: from attica.americas.sgi.com (attica.americas.sgi.com [128.162.236.44]) by relay2.corp.sgi.com (Postfix) with ESMTP id B9A3F3040BC for ; Tue, 1 Sep 2009 15:11:42 -0700 (PDT) Received: by attica.americas.sgi.com (Postfix, from userid 29043) id BED9AA23CA70; Tue, 1 Sep 2009 17:06:12 -0500 (CDT) Date: Tue, 01 Sep 2009 17:06:12 -0500 To: torvalds@linux-foundation.org Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com, akpm@linux-foundation.org Subject: [GIT PULL] XFS update for 2.6.31 User-Agent: Heirloom mailx 12.2 01/07/07 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20090901220612.BED9AA23CA70@attica.americas.sgi.com> From: felixb@sgi.com (Felix Blyakher) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean The following changes since commit 37d0892c5a94e208cf863e3b7bac014edee4346d: Ian Kent (1): autofs4 - fix missed case when changing to use struct path are available in the git repository at: git://oss.sgi.com/xfs/xfs for-linus Christoph Hellwig (1): xfs: actually enable the swapext compat handler fs/xfs/linux-2.6/xfs_ioctl32.c | 2 +- 1 files changed, 1 
insertions(+), 1 deletions(-) From michael.monnerie@is.it-management.at Tue Sep 1 17:17:08 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81MGmWR211868 for ; Tue, 1 Sep 2009 17:16:58 -0500 X-ASG-Debug-ID: 1251843424-45f601740000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailsrv5.zmi.at (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id E0D0A418DE6 for ; Tue, 1 Sep 2009 15:17:04 -0700 (PDT) Received: from mailsrv5.zmi.at (mailsrv5.zmi.at [212.69.164.54]) by cuda.sgi.com with ESMTP id QC683xtlAYPewMcj for ; Tue, 01 Sep 2009 15:17:04 -0700 (PDT) Received: from mailsrv.i.zmi.at (unknown [81.217.106.33]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailsrv2.i.zmi.at", Issuer "power4u.zmi.at" (not verified)) by mailsrv5.zmi.at (Postfix) with ESMTP id 5552069A for ; Wed, 2 Sep 2009 00:16:55 +0200 (CEST) Received: from saturn.localnet (saturn.i.zmi.at [10.72.27.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mailsrv.i.zmi.at (Postfix) with ESMTPSA id 26477400161 for ; Wed, 2 Sep 2009 00:16:58 +0200 (CEST) From: Michael Monnerie Organization: it-management http://it-management.at To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: xfs data loss Subject: Re: xfs data loss Date: Wed, 2 Sep 2009 00:16:07 +0200 User-Agent: KMail/1.10.3 (Linux/2.6.30.5-ZMI; KDE/4.1.3; x86_64; ; ) References: <19101.5976.387292.614270@tree.ty.sabi.co.uk> In-Reply-To: <19101.5976.387292.614270@tree.ty.sabi.co.uk> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200909020016.07984@zmi.at> 
X-Barracuda-Connect: mailsrv5.zmi.at[212.69.164.54] X-Barracuda-Start-Time: 1251843455 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7849 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Dienstag 01 September 2009 Peter Grandi wrote: > knowing *exactly* what has failed may help you a lot. Thank you for your very verbose posting, it was fun to read. And the last line should be answered by the OP. mfg zmi -- // Michael Monnerie, Ing.BSc ----- http://it-management.at // Tel: 0660 / 415 65 31 .network.your.ideas. // PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 From michael.monnerie@is.it-management.at Tue Sep 1 17:53:40 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33, J_CHICKENPOX_53 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81MrK3B213884 for ; Tue, 1 Sep 2009 17:53:30 -0500 X-ASG-Debug-ID: 1251845621-51aa02990000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailsrv5.zmi.at (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1D69315A468D for ; Tue, 1 Sep 2009 15:53:41 -0700 (PDT) Received: from mailsrv5.zmi.at (mailsrv5.zmi.at [212.69.164.54]) by cuda.sgi.com with ESMTP id 7iPRul3tJfchNnBb for ; 
Tue, 01 Sep 2009 15:53:41 -0700 (PDT) Received: from mailsrv.i.zmi.at (unknown [81.217.106.33]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailsrv2.i.zmi.at", Issuer "power4u.zmi.at" (not verified)) by mailsrv5.zmi.at (Postfix) with ESMTP id 476516C0 for ; Wed, 2 Sep 2009 00:53:30 +0200 (CEST) Received: from saturn.localnet (saturn.i.zmi.at [10.72.27.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mailsrv.i.zmi.at (Postfix) with ESMTPSA id 1DCED400161 for ; Wed, 2 Sep 2009 00:53:33 +0200 (CEST) From: Michael Monnerie Organization: it-management http://it-management.at To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: zero size file after power failure with kernel 2.6.30.5 Subject: Re: zero size file after power failure with kernel 2.6.30.5 Date: Wed, 2 Sep 2009 00:52:41 +0200 User-Agent: KMail/1.10.3 (Linux/2.6.30.5-ZMI; KDE/4.1.3; x86_64; ; ) References: <200908292102.21710@zmi.at> <200909010918.37886@zmi.at> <19100.63566.98250.185404@tree.ty.sabi.co.uk> In-Reply-To: <19100.63566.98250.185404@tree.ty.sabi.co.uk> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200909020052.42421@zmi.at> X-Barracuda-Connect: mailsrv5.zmi.at[212.69.164.54] X-Barracuda-Start-Time: 1251845647 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7853 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Dienstag 01 September 2009 Peter Grandi wrote: > Other people 
have a very different impression. Like 'ext3'
> ReiserFS does ordered writes, but those don't necessarily help
> because of the colossal amount of buffering that happens anyhow
> nowadays.

Maybe. I had reiserfs on this system, a quad-core 8GB desktop, until two weeks ago. I had power failures, crashes, and so on, and can't remember a situation where a KDE app lost its config.

But I had a server with the OSS XEN, running a single VM which is my internal mailserver, using PostgreSQL as its store on XFS. My daughter managed to switch the server off (yeah, redundant power supplies and a UPS are still not enough). After reboot, the PostgreSQL database was *damaged*, so much that I had to restore. This should never have happened, and to this day I don't know what was to blame: XFS? XEN? The RAID controller with BBU and hard disk cache=off?

That's why I'm very sensitive to even a small data loss (I had a backup of my kmail config), and I think the filesystem has to do everything it can to keep my data. XFS seems to be optimized for speed rather than safety; would you agree? I've often heard "enterprise hardware", which sounds like "if anything crashes, it's your problem" ;-)

> http://www.sabi.co.uk/blog/0707jul.html#070701

I like your blog, and http://www.myri.com/scs/READMES/README.myri10ge-linux gave me a good hint for optimizing TCP settings a long time ago.

> In general on a fast machine I would use:
> vm/dirty_ratio =4
> vm/dirty_background_ratio =2
> vm/dirty_expire_centisecs =400
> vm/dirty_writeback_centisecs =200

Since May I have been using these new settings with kernel 2.6.(29|30):

vm.dirty_background_bytes = 16123456
vm.dirty_bytes = 250123456
vm.dirty_expire_centisecs = 1000
vm.dirty_writeback_centisecs = 100

(the expire was at 3000 until the crash).

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 From BATV+88945cdf03a0dcaa8c4e+2200+infradead.org+hch@bombadil.srs.infradead.org Tue Sep 1 18:56:28 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n81Nu2A5217223 for ; Tue, 1 Sep 2009 18:56:18 -0500 X-ASG-Debug-ID: 1251849419-3d6c018b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3A36415A5E95 for ; Tue, 1 Sep 2009 16:56:59 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id GPasp8SC58vVDZAv for ; Tue, 01 Sep 2009 16:56:59 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MidDb-00044x-Ju for xfs@oss.sgi.com; Tue, 01 Sep 2009 23:56:55 +0000 Date: Tue, 1 Sep 2009 19:56:55 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] xfs: un-static xfs_inobt_lookup Subject: [PATCH] xfs: un-static xfs_inobt_lookup Message-ID: <20090901235655.GA15321@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251849419 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean 
xfs_inobt_lookup is also used in xfs_itable.c, so remove the STATIC modifier from its declaration to fix non-debug builds.

This was already fixed in my git tree vs the version last posted to the list.

Signed-off-by: Christoph Hellwig

Index: xfs/fs/xfs/xfs_ialloc.c
===================================================================
--- xfs.orig/fs/xfs/xfs_ialloc.c 2009-09-01 20:47:28.515468366 -0300
+++ xfs/fs/xfs/xfs_ialloc.c 2009-09-01 20:47:33.867913011 -0300
@@ -59,7 +59,7 @@ xfs_ialloc_cluster_alignment(
 /*
  * Lookup a record by ino in the btree given by cur.
  */
-STATIC int /* error */
+int /* error */
 xfs_inobt_lookup(
 	struct xfs_btree_cur *cur, /* btree cursor */
 	xfs_agino_t ino, /* starting inode of chunk */

From edvx1@systemanalysen.net Tue Sep 1 20:19:37 2009
X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43, J_CHICKENPOX_45,J_CHICKENPOX_65 autolearn=no version=3.3.0-rupdated
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n821JHgJ222472 for ; Tue, 1 Sep 2009 20:19:27 -0500
X-ASG-Debug-ID: 1251854395-045903400000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from ngcobalt07.manitu.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 76284419461 for ; Tue, 1 Sep 2009 18:19:55 -0700 (PDT)
Received: from ngcobalt07.manitu.net (ngcobalt07.manitu.net [217.11.48.107]) by cuda.sgi.com with ESMTP id r09e6zbC9kuoV71R for ; Tue, 01 Sep 2009 18:19:55 -0700 (PDT)
Received: from mobil.systemanalysen.net (localhost [127.0.0.1]) (authenticated as r.mail with PLAIN) by localhost (8.10.2/8.10.2) with ESMTP id n821K1F06106; Wed, 2 Sep 2009 03:20:01 +0200
X-manitu-Original-Sender-IP: 127.0.0.1
X-manitu-Original-Receiver-Name: localhost
From: Roland Eggner
Reply-To: "Roland Eggner"
To: SGI Project XFS mailing list
X-ASG-Orig-Subj: free space
of root partition decreases unaccountably by some 1024 blocks on every umount+linux shutdown : additional informations
Subject: free space of root partition decreases unaccountably by some 1024 blocks on every umount+linux shutdown : additional informations
Date: Wed, 2 Sep 2009 03:16:36 +0200
User-Agent: KMail/1.11.2 (Linux/2.6.29.6.roland.0; KDE/4.2.2; i686; ; )
References: <200908121955.07682.edvx1@systemanalysen.net> <4A838598.4000608@sandeen.net>
In-Reply-To: <4A838598.4000608@sandeen.net>
MIME-Version: 1.0
Message-Id: <200909020316.48300.edvx1@systemanalysen.net>
Content-Type: multipart/signed; boundary="nextPart4021922.nUrLzVR9CD"; protocol="application/pgp-signature"; micalg=pgp-sha1
Content-Transfer-Encoding: 7bit
X-Barracuda-Connect: ngcobalt07.manitu.net[217.11.48.107]
X-Barracuda-Start-Time: 1251854402
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com
X-Barracuda-Spam-Score: -1.01
X-Barracuda-Spam-Status: No, SCORE=-1.01 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MARKETING_SUBJECT, SUBJECT_FUZZY_TION
X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7863
 Rule breakdown below
 pts  rule name           description
 ---- ------------------- --------------------------------------------------
 0.60 MARKETING_SUBJECT   Subject contains popular marketing words
 0.41 SUBJECT_FUZZY_TION  Attempt to obfuscate words in Subject:
X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com
X-Virus-Status: Clean

--nextPart4021922.nUrLzVR9CD
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

On Thursday August 13th 05:16:40 2009 Eric Sandeen wrote:
> Roland Eggner wrote:
> > On July 18th I noticed the first time this unaccountable decrease of free space of my root partition /dev/hda7:
> > For at least several boot-shutdown-cycles it has decreased on every cycle by some 1020
… 1030 blocks from originally above 100 MB to 96 MB.
> > Expected change at most ±1 block. Neither xfs_check nor xfs_repair -dn could detect any flaws.
> Maybe I missed it in the email, but how have you ruled out the
> possibility that files are simply growing, thereby using the space?

Look at the last but one paragraph in my mail from August 12th
http://oss.sgi.com/archives/xfs/2009-08/msg00113.html
(obviously the find argument “-mount” got lost on copy+paste, sorry).

Facts summarized:
----------------
(a) Free space decreases unaccountably even if I circumvent all shutdown scripts and shut down via SysRq R S U B. Note: no SIGTERM, no SIGKILL.
(b) The unaccountable free space decrease is triggered EXACTLY, ALWAYS and EXCLUSIVELY by this “remount,ro”. Note: NOT by any other write activity.
(c) Binary comparison of images written with dd showed NO unexpected writes as far as I can tell (I do not have deeper knowledge of XFS internals): in one particular case the free space difference reported by df was 1026 blocks = 1050624 bytes, whereas the count of differing bytes reported by “cmp -b -l” was only 135189. “cmp -b -l” DID show expected writes, e.g. to /etc/mtab. (Eric Sandeen got details via private mail and kindly offered to have a look at xfs_metadump extractions from these images; thanks!)
(d) So far I could NOT detect this problem on any other partition; it seems that ONLY the root partition is affected.
(e) I performed xfsdump | mkfs.xfs | xfsrestore and got back some 200 Mbyte of free space, but only temporarily: at subsequent linux shutdowns free space CONTINUES to decrease, just starting from a new offset.
(f) I booted a sidux image and reset the lazy-counter attribute with “xfs_admin -c 0” ➔ Apart from an “XFS: correcting sb_features alignment problem” message at the next mount, EXACTLY the same result as after measure (e). (WARNING: Never use “xfs_admin -c 0” unless you have a current backup of your valuable data!!)
(g) Kernel 2.6.30.4 shows the problem too. (And it introduces some other flaws that are “show stopping” for me, therefore I stay at kernel 2.6.29.6.)
(h) Apart from the following single incident, I never got any error messages from this filesystem, neither from xfs_check nor from xfs_repair:

On July 17th a run of xfs_repair yielded the following report. Being busy on that day, I ignored the message apart from saving it for later analysis:

# xfs_repair -dn /dev/hda7
bad nblocks 1 for free inode 9722
bad nlink 1 for free inode 9722
bad mode 0100644 for free inode 9722
link count mismatch for inode 9722 (name ?), nlink 0, counted 1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

If I can provide any additional debugging info, let me know.
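
For reference, the arithmetic behind fact (c) can be sketched in shell as below. The three figures are the ones quoted in the mail; the 1 KiB block size is an assumption inferred from "1026 blocks = 1050624", and the variable names are only illustrative. Note that df counts allocated blocks while "cmp -l" counts individual bytes that differ, so the two numbers measure different things and the gap is only a rough indication.

```shell
# Figures quoted in fact (c). The 1 KiB block size is an assumption
# inferred from "1026 blocks = 1050624 bytes".
blocks=1026
block_size=1024
differing=135189

# Free-space change as seen by df, converted to bytes.
df_bytes=$((blocks * block_size))
# Part of that change not matched by byte differences seen by cmp.
unexplained=$((df_bytes - differing))
# Integer percentage (rounded down) of the df figure that cmp accounts for.
percent=$((differing * 100 / df_bytes))

echo "df difference:  $df_bytes bytes"
echo "cmp difference: $differing bytes (${percent}% of the df figure)"
echo "not accounted:  $unexplained bytes"
```

With these inputs roughly an eighth of the reported decrease shows up as changed bytes in the image, which fits the observation that the space is not consumed by ordinary file writes.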
-- 
Roland Eggner

--nextPart4021922.nUrLzVR9CD
Content-Type: application/pgp-signature; name=signature.asc
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEABECAAYFAkqdx3QACgkQdN/hKfT7G/KnMQCgtfQhfym9YIHDQn1W4UyKK1yw
DisAnj+6ZYFKYbygHHePkEl6wxw9zKxR
=bNcT
-----END PGP SIGNATURE-----

--nextPart4021922.nUrLzVR9CD--

From felixb@oss.sgi.com Tue Sep 1 20:44:00 2009
X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.6 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.3.0-rupdated
Received: from oss.sgi.com (localhost [127.0.0.1]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n821htpS223641 for ; Tue, 1 Sep 2009 20:44:00 -0500
Received: (from felixb@localhost) by oss.sgi.com (8.14.3/8.14.3/Submit) id n821htNu223600; Tue, 1 Sep 2009 20:43:55 -0500
Date: Tue, 1 Sep 2009 20:43:55 -0500
Message-Id: <200909020143.n821htNu223600@oss.sgi.com>
From: xfs@oss.sgi.com
To: xfs@oss.sgi.com
Subject: [XFS updates] XFS development tree branch, master, updated. v2.6.30-rc4-12476-g81e2517
X-Git-Refname: refs/heads/master
X-Git-Reftype: branch
X-Git-Oldrev: f4378b6eaf63492c0f9a397d52813e0ae6b49e7b
X-Git-Newrev: 81e251766e8f8c9d7abb5db784e58c5c45f82797

This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "XFS development tree".

The branch, master has been updated
  81e2517 xfs: un-static xfs_inobt_lookup
  from f4378b6eaf63492c0f9a397d52813e0ae6b49e7b (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.
- Log -----------------------------------------------------------------
commit 81e251766e8f8c9d7abb5db784e58c5c45f82797
Author: Christoph Hellwig
Date: Tue Sep 1 19:56:55 2009 -0400

xfs: un-static xfs_inobt_lookup

xfs_inobt_lookup is also used in xfs_itable.c, so remove the STATIC modifier from its declaration to fix non-debug builds.

Signed-off-by: Christoph Hellwig
Reviewed-by: Felix Blyakher
Signed-off-by: Felix Blyakher

-----------------------------------------------------------------------

Summary of changes:
 fs/xfs/xfs_ialloc.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

hooks/post-receive
--
XFS development tree

From felixb@sgi.com Tue Sep 1 21:00:24 2009
X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated
Received: from relay.sgi.com (relay1.corp.sgi.com [137.38.102.111]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82203FJ224547 for ; Tue, 1 Sep 2009 21:00:13 -0500
Received: from estes.americas.sgi.com (estes.americas.sgi.com [128.162.236.10]) by relay1.corp.sgi.com (Postfix) with ESMTP id 961438F80D9 for ; Tue, 1 Sep 2009 19:01:00 -0700 (PDT)
Received: from [IPv6???1] (sshgate.corp.sgi.com [198.149.20.12]) by estes.americas.sgi.com (Postfix) with ESMTP id 37C98700075C; Tue, 1 Sep 2009 20:37:57 -0500 (CDT)
Cc: xfs@oss.sgi.com
Message-Id:
From: Felix Blyakher
To: Christoph Hellwig
In-Reply-To: <20090901235655.GA15321@infradead.org>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v930.3)
Subject: Re: [PATCH] xfs: un-static xfs_inobt_lookup
Date: Tue, 1 Sep 2009 20:37:56 -0500
References: <20090901235655.GA15321@infradead.org>
X-Mailer: Apple Mail (2.930.3)
X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com
X-Virus-Status: Clean

On Sep 1, 2009, at 6:56 PM, Christoph
Hellwig wrote:

>
> xfs_inobt_lookup is also used in xfs_itable.c, remove the STATIC modifier
> from it's declaration to fix non-debug builds.

Argh! I think this or a similar thing bit us once in the past.

>
> This was already fixed in my git tree vs the version last posted to the
> list.

Applying it quickly to the oss tree as well.

> Signed-off-by: Christoph Hellwig

Reviewed-by: Felix Blyakher

>
> Index: xfs/fs/xfs/xfs_ialloc.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_ialloc.c 2009-09-01 20:47:28.515468366 -0300
> +++ xfs/fs/xfs/xfs_ialloc.c 2009-09-01 20:47:33.867913011 -0300
> @@ -59,7 +59,7 @@ xfs_ialloc_cluster_alignment(
>  /*
>   * Lookup a record by ino in the btree given by cur.
>   */
> -STATIC int /* error */
> +int /* error */
>  xfs_inobt_lookup(
>  	struct xfs_btree_cur *cur, /* btree cursor */
>  	xfs_agino_t ino, /* starting inode of chunk */
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 06:30:42 2009
X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated
Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82BUERg260070 for ; Wed, 2 Sep 2009 06:30:32 -0500
X-ASG-Debug-ID: 1251891071-2fa102e40000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi
Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 653801B8760F for ; Wed, 2 Sep 2009 04:31:11 -0700 (PDT)
Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id oCrFkYvcgjDSTIfI for ; Wed, 02 Sep 2009 04:31:11 -0700 (PDT)
X-ASG-Whitelist: Client
Received: from hch by
bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Mio3P-0003p7-M6; Wed, 02 Sep 2009 11:31:07 +0000 Date: Wed, 2 Sep 2009 07:31:07 -0400 From: Christoph Hellwig To: Michael Monnerie Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: minor bug in xfsprogs-3.0.3 Subject: Re: minor bug in xfsprogs-3.0.3 Message-ID: <20090902113107.GA6908@infradead.org> References: <200909011044.44938@zmi.at> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200909011044.44938@zmi.at> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251891071 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Tue, Sep 01, 2009 at 10:44:44AM +0200, Michael Monnerie wrote: > # xfs_info -V > Usage: xfs_info [-V] [-t mtab] mountpoint > > It should print the version, right? Yes, it should. But from looking at git history it looks like it never did. Same for various other shell scripts (xfs_check/xfs_ncheck/xfs_admin). I'll fix it up. 
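
As a rough sketch of the behaviour being asked for above, a wrapper script could answer -V inside its getopts loop, before it insists on a mountpoint. Everything below is hypothetical: demo_info, its version string and its output are placeholders chosen for illustration, and this is not the xfsprogs fix itself.

```shell
# Hypothetical stand-in for a wrapper script such as xfs_info; not the real tool.
USAGE="Usage: demo_info [-V] [-t mtab] mountpoint"
VERSION="demo_info version 0.0.0-example"   # placeholder, not a real release

demo_info() {
    OPTIND=1        # reset so the function can be called more than once
    mtab=""
    while getopts "t:V" opt; do
        case "$opt" in
            t) mtab="$OPTARG" ;;
            # Handle -V here, before the mountpoint check further down.
            V) echo "$VERSION"; return 0 ;;
            *) echo "$USAGE" >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    # Anything other than exactly one remaining argument is a usage error.
    if [ $# -ne 1 ]; then
        echo "$USAGE" >&2
        return 2
    fi
    echo "would query $1${mtab:+ using mtab $mtab}"
}
```

Called as "demo_info -V" this prints the placeholder version and returns 0, while calling it with no arguments still prints the usage line and returns 2. A script that only checks the argument count and never handles V in the loop would show the symptom from the report.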
From jpiszcz@lucidpixels.com Wed Sep 2 06:45:09 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82BinKe260970 for ; Wed, 2 Sep 2009 06:44:59 -0500 X-ASG-Debug-ID: 1251891938-0e7100770000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from lucidpixels.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9208C15A9E1D for ; Wed, 2 Sep 2009 04:45:38 -0700 (PDT) Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by cuda.sgi.com with ESMTP id nSZfWfV4wA93pamH for ; Wed, 02 Sep 2009 04:45:38 -0700 (PDT) Received: by lucidpixels.com (Postfix, from userid 1001) id 1747337EC; Wed, 2 Sep 2009 07:45:36 -0400 (EDT) Date: Wed, 2 Sep 2009 07:45:36 -0400 (EDT) From: Justin Piszcz To: Christoph Hellwig cc: Nikanth Karthikesan , Jens Axboe , linux-kernel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Kernel 2.6.30.4 loop(..?) regression (& with/2.6.31-rc6) Subject: Re: Kernel 2.6.30.4 loop(..?) 
regression (& with/2.6.31-rc6) In-Reply-To: Message-ID: References: <20090822201558.GA17955@infradead.org> <20090822205502.GA18904@infradead.org> <20090823224504.GA19942@infradead.org> <20090826180234.GC14019@infradead.org> <20090826212732.GA18124@infradead.org> User-Agent: Alpine 2.00 (DEB 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Barracuda-Connect: lucidpixels.com[75.144.35.66] X-Barracuda-Start-Time: 1251891945 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7898 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Fri, 28 Aug 2009, Justin Piszcz wrote: > > On Wed, 26 Aug 2009, Justin Piszcz wrote: > >> On Wed, 26 Aug 2009, Christoph Hellwig wrote: >> >>> On Wed, Aug 26, 2009 at 05:19:11PM -0400, Justin Piszcz wrote: Christoph, Now 6 days without any problems using -o nobarrier. $ uptime 07:43:55 up 6 days, 14:12, 1 user, load average: 0.00, 0.00, 0.00 So -o nobarrier is a workaround for the issue. How / what debug settings should be enabled to catch the bug/problem when -o nobarrier is used? Justin. 
From twalberg@comcast.net Wed Sep 2 06:50:50 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82BoUUx261420 for ; Wed, 2 Sep 2009 06:50:40 -0500 X-ASG-Debug-ID: 1251892285-53ac01690000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from QMTA03.westchester.pa.mail.comcast.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5ABB241AE9B for ; Wed, 2 Sep 2009 04:51:26 -0700 (PDT) Received: from QMTA03.westchester.pa.mail.comcast.net (qmta03.westchester.pa.mail.comcast.net [76.96.62.32]) by cuda.sgi.com with ESMTP id s7rpZksdLmc0Dx6n for ; Wed, 02 Sep 2009 04:51:26 -0700 (PDT) Received: from OMTA19.westchester.pa.mail.comcast.net ([76.96.62.98]) by QMTA03.westchester.pa.mail.comcast.net with comcast id bms01c00327AodY53nqZze; Wed, 02 Sep 2009 11:50:33 +0000 Received: from beta.localdomain ([24.14.6.228]) by OMTA19.westchester.pa.mail.comcast.net with comcast id bnva1c00A4vB7EY3fnvaCv; Wed, 02 Sep 2009 11:55:35 +0000 Received: from calvin.localdomain ([10.0.0.8]) by beta.localdomain with esmtp (Exim 4.69) (envelope-from ) id 1MioN1-00051R-Vz; Wed, 02 Sep 2009 06:51:24 -0500 Received: from tew by calvin.localdomain with local (Exim 4.69) (envelope-from ) id 1MioN1-00029V-Ox; Wed, 02 Sep 2009 06:51:23 -0500 Date: Wed, 2 Sep 2009 06:51:23 -0500 From: Tim Walberg To: Tim Walberg , Christoph Hellwig , Linux-kernel , linux-fsdevel , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: xfs compat_ioctl? Subject: Re: xfs compat_ioctl? 
Message-ID: <20090902115123.GA13355@comcast.net> Reply-To: Tim Walberg Mail-Followup-To: Tim Walberg , Christoph Hellwig , Linux-kernel , linux-fsdevel , xfs@oss.sgi.com References: <4A9B759B.7020401@msgid.tls.msk.ru> <20090831123010.GA2368@infradead.org> <20090831183751.GC19343@comcast.net> <20090831184822.GA10393@infradead.org> <20090831184918.GB10393@infradead.org> <20090831190209.GD19343@comcast.net> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/9DWx/yDrRhgMJTb" Content-Disposition: inline In-Reply-To: <20090831190209.GD19343@comcast.net> Errors-To: Tim Walberg User-Agent: Mutt/1.5.16 (2007-06-09) X-Barracuda-Connect: qmta03.westchester.pa.mail.comcast.net[76.96.62.32] X-Barracuda-Start-Time: 1251892286 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.52 X-Barracuda-Spam-Status: No, SCORE=-1.52 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC0_SA_TO_FROM_ADDR_MATCH X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7900 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_SC0_SA_TO_FROM_ADDR_MATCH Sender Address Matches Recipient Address X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean --/9DWx/yDrRhgMJTb Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Ok, that patch has fixed my issues with xfs_fsr. Now I just need to pull a new copy of xfs_progs to get the xfs_db alignment issue fix... 
Thanks, tw On 08/31/2009 14:02 -0500, Walberg, Tim wrote: >> >> >> On 08/31/2009 14:49 -0400, Christoph Hellwig wrote: >> >> On Mon, Aug 31, 2009 at 02:48:22PM -0400, Christoph Hellwig wrote: >> >> > On Mon, Aug 31, 2009 at 01:37:51PM -0500, Tim Walberg wrote: >> >> > > Linux sparcy 2.6.30.4-sparcy #2 Sat Aug 1 21:14:46 CDT 2009 sparc64 GNU/Linux >> >> > > sparcy:~# file $(which xfs_fsr) >> >> > > /usr/sbin/xfs_fsr: ELF 32-bit MSB executable, SPARC32PLUS, V8+ Required, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped >> >> > >> >> > I'll take a look at that, thanks. >> >> >> >> Err, sorry - is that with the patch I posted in this thread or without? >> End of included message >> >> >> No, that's the generic 2.6.30.4... I can attempt with that patch as well, but >> it might be a day or two... >> >> tw >> End of included message -- twalberg@comcast.net --/9DWx/yDrRhgMJTb Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkqeXDsACgkQw+Wcj22rJWaHTgCcDXj5l5OCxgtqjI0yX3/iqjEz I9gAnA1wnOpQh6vGhmNxMMvJOphas5tM =mAQt -----END PGP SIGNATURE----- --/9DWx/yDrRhgMJTb-- From rumi_ml@rtfm.hu Wed Sep 2 08:17:51 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82DHVJW004099 for ; Wed, 2 Sep 2009 08:17:41 -0500 X-ASG-Debug-ID: 1251897482-6bbc03340000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from nexus.dynaweb.hu (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id C001041BCA0 for ; Wed, 2 Sep 2009 06:18:02 -0700 (PDT) Received: from nexus.dynaweb.hu (nexus.dynaweb.hu [195.70.37.87]) by cuda.sgi.com with ESMTP
id QA0JSOLW3FFoLMK7 for ; Wed, 02 Sep 2009 06:18:02 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by nexus.dynaweb.hu (Postfix) with ESMTP id 5515243B0E for ; Wed, 2 Sep 2009 15:17:31 +0200 (CEST) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Scanned: by amavisd-new using ClamAV at dynaweb.hu Received: from nexus.dynaweb.hu ([127.0.0.1]) by localhost (nexus.dynaweb.hu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OQDweFb6RIiU for ; Wed, 2 Sep 2009 15:17:30 +0200 (CEST) Received: from raketa.ipn.dynaweb.hu (catv-80-99-36-176.catv.broadband.hu [80.99.36.176]) by nexus.dynaweb.hu (Postfix) with ESMTPSA id 0FC5440131 for ; Wed, 2 Sep 2009 15:17:30 +0200 (CEST) Date: Wed, 2 Sep 2009 15:17:29 +0200 From: RUMI Szabolcs To: xfs@oss.sgi.com X-ASG-Orig-Subj: Structure needs cleaning? Subject: Structure needs cleaning? Message-Id: <20090902151729.32701dd7.rumi_ml@rtfm.hu> X-Mailer: Sylpheed 2.6.0 (GTK+ 2.16.5; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Barracuda-Connect: nexus.dynaweb.hu[195.70.37.87] X-Barracuda-Start-Time: 1251897507 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7904 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Status: Clean Hi! I'm experiencing errors like this: ls: cannot access 11090000.xhp: Structure needs cleaning These files were part of an OpenOffice 3.1.0 source tree, and could not be removed by rm -rf, which was also reporting the above error. 
In the dmesg there are errors like this, apparently the same one for each access attempt: Pid: 29510, comm: mc Tainted: P 2.6.29-gentoo-r5-PAE #1 Call Trace: [] xfs_da_do_buf+0x8c4/0x900 [] xfs_da_read_buf+0x30/0x40 [] xfs_da_read_buf+0x30/0x40 [] pollwake+0x0/0x50 [] pollwake+0x0/0x50 [] xfs_da_read_buf+0x30/0x40 [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 [] xfs_dir2_leaf_lookup+0x27/0xc0 [] xfs_dir2_isleaf+0x1f/0x60 [] xfs_dir_lookup+0xd8/0x180 [] xfs_lookup+0x6b/0xf0 [] xfs_vn_lookup+0x55/0xa0 [] do_lookup+0x1ba/0x1e0 [] __link_path_walk+0x6cd/0xd60 [] xfs_dir2_leaf_getdents+0x5ff/0xad0 [] path_walk+0x54/0xc0 [] do_path_lookup+0x83/0x170 [] getname+0x9b/0xe0 [] user_path_at+0x5a/0x90 [] vfs_lstat_fd+0x1f/0x50 [] sys_lstat64+0xf/0x30 [] touch_atime+0x14/0x130 [] vfs_readdir+0x78/0xb0 [] sys_getdents64+0xa1/0xd0 [] sysenter_do_call+0x12/0x25 Is this a known one? Thanks, Sab From rumi_ml@rtfm.hu Wed Sep 2 08:23:12 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82DMpYp004377 for ; Wed, 2 Sep 2009 08:23:01 -0500 X-ASG-Debug-ID: 1251897801-0e6f02d70000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from nexus.dynaweb.hu (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7C7151B886C3 for ; Wed, 2 Sep 2009 06:23:21 -0700 (PDT) Received: from nexus.dynaweb.hu (nexus.dynaweb.hu [195.70.37.87]) by cuda.sgi.com with ESMTP id 7iZN6mzd0f2dO8P8 for ; Wed, 02 Sep 2009 06:23:21 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by nexus.dynaweb.hu (Postfix) with ESMTP id A5B304D454 for ; Wed, 2 Sep 2009 15:22:47 +0200 (CEST) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Scanned: by 
amavisd-new using ClamAV at dynaweb.hu Received: from nexus.dynaweb.hu ([127.0.0.1]) by localhost (nexus.dynaweb.hu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id F5OKfyZODlOD for ; Wed, 2 Sep 2009 15:22:46 +0200 (CEST) Received: from raketa.ipn.dynaweb.hu (catv-80-99-36-176.catv.broadband.hu [80.99.36.176]) by nexus.dynaweb.hu (Postfix) with ESMTPSA id 2A22C43B15 for ; Wed, 2 Sep 2009 15:22:46 +0200 (CEST) Date: Wed, 2 Sep 2009 15:22:45 +0200 From: RUMI Szabolcs To: xfs@oss.sgi.com X-ASG-Orig-Subj: Structure needs cleaning? (take #2) Subject: Structure needs cleaning? (take #2) Message-Id: <20090902152245.b2969883.rumi_ml@rtfm.hu> X-Mailer: Sylpheed 2.6.0 (GTK+ 2.16.5; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Barracuda-Connect: nexus.dynaweb.hu[195.70.37.87] X-Barracuda-Start-Time: 1251897825 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7906 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Status: Clean Hi! Sorry but my previous post was missing the important first two lines: d62c8000: 2c 30 78 41 41 41 41 30 30 30 30 2c 36 2c 30 78 ,0xAAAA0000,6,0x Filesystem "sda10": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c. 
Caller 0xc02cc790 Pid: 29510, comm: mc Tainted: P 2.6.29-gentoo-r5-PAE #1 Call Trace: [] xfs_da_do_buf+0x8c4/0x900 [] xfs_da_read_buf+0x30/0x40 [] xfs_da_read_buf+0x30/0x40 [] pollwake+0x0/0x50 [] pollwake+0x0/0x50 [] xfs_da_read_buf+0x30/0x40 [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 [] xfs_dir2_leaf_lookup+0x27/0xc0 [] xfs_dir2_isleaf+0x1f/0x60 [] xfs_dir_lookup+0xd8/0x180 [] xfs_lookup+0x6b/0xf0 [] xfs_vn_lookup+0x55/0xa0 [] do_lookup+0x1ba/0x1e0 [] __link_path_walk+0x6cd/0xd60 [] xfs_dir2_leaf_getdents+0x5ff/0xad0 [] path_walk+0x54/0xc0 [] do_path_lookup+0x83/0x170 [] getname+0x9b/0xe0 [] user_path_at+0x5a/0x90 [] vfs_lstat_fd+0x1f/0x50 [] sys_lstat64+0xf/0x30 [] touch_atime+0x14/0x130 [] vfs_readdir+0x78/0xb0 [] sys_getdents64+0xa1/0xd0 [] sysenter_do_call+0x12/0x25 Thanks, Sab From jack@suse.cz Wed Sep 2 08:58:58 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82DwbDn006810 for ; Wed, 2 Sep 2009 08:58:48 -0500 X-ASG-Debug-ID: 1251899971-2f1b031a0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.suse.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 825E11B88204 for ; Wed, 2 Sep 2009 06:59:31 -0700 (PDT) Received: from mx1.suse.de (cantor.suse.de [195.135.220.2]) by cuda.sgi.com with ESMTP id EQZPGFnubfYiDz9d for ; Wed, 02 Sep 2009 06:59:31 -0700 (PDT) X-ASG-Whitelist: Barracuda Reputation Received: from relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id DB7279337C; Wed, 2 Sep 2009 15:59:28 +0200 (CEST) Received: by duck.suse.cz (Postfix, from userid 10005) id D311A6844D; Wed, 
2 Sep 2009 15:59:27 +0200 (CEST) From: Jan Kara To: linux-fsdevel@vger.kernel.org Cc: LKML , hch@lst.de, Jan Kara , ocfs2-devel@oss.oracle.com, Joel Becker , Felix Blyakher , xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 03/16] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write() Subject: [PATCH 03/16] vfs: Remove syncing from generic_file_direct_write() and generic_file_buffered_write() Date: Wed, 2 Sep 2009 15:59:13 +0200 Message-Id: <1251899966-7316-4-git-send-email-jack@suse.cz> X-Mailer: git-send-email 1.6.0.2 In-Reply-To: <1251899966-7316-1-git-send-email-jack@suse.cz> References: <1251899966-7316-1-git-send-email-jack@suse.cz> X-Barracuda-Connect: cantor.suse.de[195.135.220.2] X-Barracuda-Start-Time: 1251899973 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean generic_file_direct_write() and generic_file_buffered_write() called generic_osync_inode() if it was called on O_SYNC file or IS_SYNC inode. But this is superfluous since generic_file_aio_write() does the syncing as well. Also XFS and OCFS2 which call these functions directly handle syncing themselves. So let's have a single place where syncing happens: generic_file_aio_write(). We slightly change the behavior by syncing only the range of file to which the write happened for buffered writes but that should be all that is required. 
CC: ocfs2-devel@oss.oracle.com CC: Joel Becker CC: Felix Blyakher CC: xfs@oss.sgi.com Signed-off-by: Jan Kara --- mm/filemap.c | 35 ++++++----------------------------- 1 files changed, 6 insertions(+), 29 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 554a396..f863e1d 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2187,20 +2187,7 @@ generic_file_direct_write(struct kiocb *iocb, const struct iovec *iov, } *ppos = end; } - - /* - * Sync the fs metadata but not the minor inode changes and - * of course not the data as we did direct DMA for the IO. - * i_mutex is held, which protects generic_osync_inode() from - * livelocking. AIO O_DIRECT ops attempt to sync metadata here. - */ out: - if ((written >= 0 || written == -EIOCBQUEUED) && - ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - int err = generic_osync_inode(inode, mapping, OSYNC_METADATA); - if (err < 0) - written = err; - } return written; } EXPORT_SYMBOL(generic_file_direct_write); @@ -2332,8 +2319,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, { struct file *file = iocb->ki_filp; struct address_space *mapping = file->f_mapping; - const struct address_space_operations *a_ops = mapping->a_ops; - struct inode *inode = mapping->host; ssize_t status; struct iov_iter i; @@ -2343,16 +2328,6 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov, if (likely(status >= 0)) { written += status; *ppos = pos + status; - - /* - * For now, when the user asks for O_SYNC, we'll actually give - * O_DSYNC - */ - if (unlikely((file->f_flags & O_SYNC) || IS_SYNC(inode))) { - if (!a_ops->writepage || !is_sync_kiocb(iocb)) - status = generic_osync_inode(inode, mapping, - OSYNC_METADATA|OSYNC_DATA); - } } /* @@ -2514,11 +2489,12 @@ ssize_t generic_file_aio_write_nolock(struct kiocb *iocb, ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); - if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if ((ret > 0 || ret == -EIOCBQUEUED) && + 
((file->f_flags & O_SYNC) || IS_SYNC(inode))) { ssize_t err; err = sync_page_range_nolock(inode, mapping, pos, ret); - if (err < 0) + if (err < 0 && ret > 0) ret = err; } return ret; @@ -2550,11 +2526,12 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov, ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); mutex_unlock(&inode->i_mutex); - if (ret > 0 && ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if ((ret > 0 || ret == -EIOCBQUEUED) && + ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { ssize_t err; err = sync_page_range(inode, mapping, pos, ret); - if (err < 0) + if (err < 0 && ret > 0) ret = err; } return ret; -- 1.6.0.2 From jack@suse.cz Wed Sep 2 08:59:58 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82DxcAj006909 for ; Wed, 2 Sep 2009 08:59:48 -0500 X-ASG-Debug-ID: 1251900032-3e4301c80000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.suse.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 12BD241B8DD for ; Wed, 2 Sep 2009 07:00:32 -0700 (PDT) Received: from mx1.suse.de (cantor.suse.de [195.135.220.2]) by cuda.sgi.com with ESMTP id VZ1hc3ecGvlZJZ36 for ; Wed, 02 Sep 2009 07:00:32 -0700 (PDT) X-ASG-Whitelist: Barracuda Reputation Received: from relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id E55F993B18; Wed, 2 Sep 2009 15:59:29 +0200 (CEST) Received: by duck.suse.cz (Postfix, from userid 10005) id 5B472168B85; Wed, 2 Sep 2009 15:59:29 +0200 (CEST) From: Jan Kara To: linux-fsdevel@vger.kernel.org Cc: LKML , hch@lst.de, Jan Kara , Felix Blyakher , xfs@oss.sgi.com 
X-ASG-Orig-Subj: [PATCH 13/16] xfs: Convert sync_page_range() to simple filemap_write_and_wait_range() Subject: [PATCH 13/16] xfs: Convert sync_page_range() to simple filemap_write_and_wait_range() Date: Wed, 2 Sep 2009 15:59:23 +0200 Message-Id: <1251899966-7316-14-git-send-email-jack@suse.cz> X-Mailer: git-send-email 1.6.0.2 In-Reply-To: <1251899966-7316-1-git-send-email-jack@suse.cz> References: <1251899966-7316-1-git-send-email-jack@suse.cz> X-Barracuda-Connect: cantor.suse.de[195.135.220.2] X-Barracuda-Start-Time: 1251900035 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Christoph Hellwig says that it is enough for XFS to call filemap_write_and_wait_range() instead of sync_page_range() because we do all the metadata syncing when forcing the log. CC: Felix Blyakher CC: xfs@oss.sgi.com CC: Christoph Hellwig Signed-off-by: Jan Kara --- fs/xfs/linux-2.6/xfs_lrw.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/fs/xfs/linux-2.6/xfs_lrw.c b/fs/xfs/linux-2.6/xfs_lrw.c index 7078974..fde63a3 100644 --- a/fs/xfs/linux-2.6/xfs_lrw.c +++ b/fs/xfs/linux-2.6/xfs_lrw.c @@ -817,7 +817,8 @@ write_retry: xfs_iunlock(xip, iolock); if (need_i_mutex) mutex_unlock(&inode->i_mutex); - error2 = sync_page_range(inode, mapping, pos, ret); + error2 = filemap_write_and_wait_range(mapping, pos, + pos + ret - 1); if (!error) error = error2; if (need_i_mutex) -- 1.6.0.2 From jack@suse.cz Wed Sep 2 08:59:58 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82DxcBR006910 for ; Wed, 2 Sep 2009 08:59:48 -0500 X-ASG-Debug-ID: 1251900030-74dc01670000-NocioJ X-Barracuda-URL: 
http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.suse.de (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9558C1B88224 for ; Wed, 2 Sep 2009 07:00:30 -0700 (PDT) Received: from mx1.suse.de (cantor.suse.de [195.135.220.2]) by cuda.sgi.com with ESMTP id rfxc1ZR2JZrwNkCh for ; Wed, 02 Sep 2009 07:00:30 -0700 (PDT) X-ASG-Whitelist: Barracuda Reputation Received: from relay2.suse.de (relay-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id C220893A00; Wed, 2 Sep 2009 15:59:29 +0200 (CEST) Received: by duck.suse.cz (Postfix, from userid 10005) id 871E42297EC; Wed, 2 Sep 2009 15:59:28 +0200 (CEST) From: Jan Kara To: linux-fsdevel@vger.kernel.org Cc: LKML , hch@lst.de, Jan Kara , Evgeniy Polyakov , ocfs2-devel@oss.oracle.com, Joel Becker , Felix Blyakher , xfs@oss.sgi.com, Anton Altaparmakov , linux-ntfs-dev@lists.sourceforge.net, OGAWA Hirofumi , linux-ext4@vger.kernel.org, tytso@mit.edu X-ASG-Orig-Subj: [PATCH 07/16] vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode Subject: [PATCH 07/16] vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode Date: Wed, 2 Sep 2009 15:59:17 +0200 Message-Id: <1251899966-7316-8-git-send-email-jack@suse.cz> X-Mailer: git-send-email 1.6.0.2 In-Reply-To: <1251899966-7316-1-git-send-email-jack@suse.cz> References: <1251899966-7316-1-git-send-email-jack@suse.cz> X-Barracuda-Connect: cantor.suse.de[195.135.220.2] X-Barracuda-Start-Time: 1251900035 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Introduce new function for generic inode syncing (vfs_fsync_range) and use it from fsync() path. Introduce also new helper for syncing after a sync write (generic_write_sync) using the generic function. 
Use these new helpers for syncing from generic VFS functions. This makes O_SYNC writes to block devices acquire i_mutex for syncing. If we really care about this, we can make block_fsync() drop the i_mutex and reacquire it before it returns. CC: Evgeniy Polyakov CC: ocfs2-devel@oss.oracle.com CC: Joel Becker CC: Felix Blyakher CC: xfs@oss.sgi.com CC: Anton Altaparmakov CC: linux-ntfs-dev@lists.sourceforge.net CC: OGAWA Hirofumi CC: linux-ext4@vger.kernel.org CC: tytso@mit.edu Acked-by: Christoph Hellwig Signed-off-by: Jan Kara --- fs/splice.c | 22 +++++--------------- fs/sync.c | 55 +++++++++++++++++++++++++++++++++++++++++++++------ include/linux/fs.h | 3 ++ mm/filemap.c | 18 +++++----------- 4 files changed, 63 insertions(+), 35 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index 73766d2..8190237 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -976,25 +976,15 @@ generic_file_splice_write(struct pipe_inode_info *pipe, struct file *out, if (ret > 0) { unsigned long nr_pages; + int err; - *ppos += ret; nr_pages = (ret + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - /* - * If file or inode is SYNC and we actually wrote some data, - * sync it. 
- */ - if (unlikely((out->f_flags & O_SYNC) || IS_SYNC(inode))) { - int err; - - mutex_lock(&inode->i_mutex); - err = generic_osync_inode(inode, mapping, - OSYNC_METADATA|OSYNC_DATA); - mutex_unlock(&inode->i_mutex); - - if (err) - ret = err; - } + err = generic_write_sync(out, *ppos, ret); + if (err) + ret = err; + else + *ppos += ret; balance_dirty_pages_ratelimited_nr(mapping, nr_pages); } diff --git a/fs/sync.c b/fs/sync.c index 3422ba6..6fe72e6 100644 --- a/fs/sync.c +++ b/fs/sync.c @@ -176,19 +176,23 @@ int file_fsync(struct file *filp, struct dentry *dentry, int datasync) } /** - * vfs_fsync - perform a fsync or fdatasync on a file + * vfs_fsync_range - helper to sync a range of data & metadata to disk * @file: file to sync * @dentry: dentry of @file - * @data: only perform a fdatasync operation + * @start: offset in bytes of the beginning of data range to sync + * @end: offset in bytes of the end of data range (inclusive) + * @datasync: perform only datasync * - * Write back data and metadata for @file to disk. If @datasync is - * set only metadata needed to access modified file data is written. + * Write back data in range @start..@end and metadata for @file to disk. If + * @datasync is set only metadata needed to access modified file data is + * written. * * In case this function is called from nfsd @file may be %NULL and * only @dentry is set. This can only happen when the filesystem * implements the export_operations API. 
*/ -int vfs_fsync(struct file *file, struct dentry *dentry, int datasync) +int vfs_fsync_range(struct file *file, struct dentry *dentry, loff_t start, + loff_t end, int datasync) { const struct file_operations *fop; struct address_space *mapping; @@ -212,7 +216,7 @@ int vfs_fsync(struct file *file, struct dentry *dentry, int datasync) goto out; } - ret = filemap_fdatawrite(mapping); + ret = filemap_fdatawrite_range(mapping, start, end); /* * We need to protect against concurrent writers, which could cause @@ -223,12 +227,32 @@ int vfs_fsync(struct file *file, struct dentry *dentry, int datasync) if (!ret) ret = err; mutex_unlock(&mapping->host->i_mutex); - err = filemap_fdatawait(mapping); + + err = filemap_fdatawait_range(mapping, start, end); if (!ret) ret = err; out: return ret; } +EXPORT_SYMBOL(vfs_fsync_range); + +/** + * vfs_fsync - perform a fsync or fdatasync on a file + * @file: file to sync + * @dentry: dentry of @file + * @datasync: only perform a fdatasync operation + * + * Write back data and metadata for @file to disk. If @datasync is + * set only metadata needed to access modified file data is written. + * + * In case this function is called from nfsd @file may be %NULL and + * only @dentry is set. This can only happen when the filesystem + * implements the export_operations API. + */ +int vfs_fsync(struct file *file, struct dentry *dentry, int datasync) +{ + return vfs_fsync_range(file, dentry, 0, LLONG_MAX, datasync); +} EXPORT_SYMBOL(vfs_fsync); static int do_fsync(unsigned int fd, int datasync) @@ -254,6 +278,23 @@ SYSCALL_DEFINE1(fdatasync, unsigned int, fd) return do_fsync(fd, 1); } +/** + * generic_write_sync - perform syncing after a write if file / inode is sync + * @file: file to which the write happened + * @pos: offset where the write started + * @count: length of the write + * + * This is just a simple wrapper about our general syncing function. 
+ */ +int generic_write_sync(struct file *file, loff_t pos, loff_t count) +{ + if (!(file->f_flags & O_SYNC) && !IS_SYNC(file->f_mapping->host)) + return 0; + return vfs_fsync_range(file, file->f_path.dentry, pos, + pos + count - 1, 1); +} +EXPORT_SYMBOL(generic_write_sync); + /* * sys_sync_file_range() permits finely controlled syncing over a segment of * a file in the range offset .. (offset+nbytes-1) inclusive. If nbytes is diff --git a/include/linux/fs.h b/include/linux/fs.h index bc7f0f1..18acaec 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2088,7 +2088,10 @@ extern int __filemap_fdatawrite_range(struct address_space *mapping, extern int filemap_fdatawrite_range(struct address_space *mapping, loff_t start, loff_t end); +extern int vfs_fsync_range(struct file *file, struct dentry *dentry, + loff_t start, loff_t end, int datasync); extern int vfs_fsync(struct file *file, struct dentry *dentry, int datasync); +extern int generic_write_sync(struct file *file, loff_t pos, loff_t count); extern void sync_supers(void); extern void emergency_sync(void); extern void emergency_remount(void); diff --git a/mm/filemap.c b/mm/filemap.c index 3955f7e..70988a1 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -39,11 +39,10 @@ /* * FIXME: remove all knowledge of the buffer layer from the core VM */ -#include /* for generic_osync_inode */ +#include /* for try_to_free_buffers */ #include - /* * Shared mappings implemented 30.11.1994. It's not fully working yet, * though. 
@@ -2480,19 +2479,16 @@ ssize_t device_aio_write(struct kiocb *iocb, const struct iovec *iov, unsigned long nr_segs, loff_t pos) { struct file *file = iocb->ki_filp; - struct address_space *mapping = file->f_mapping; - struct inode *inode = mapping->host; ssize_t ret; BUG_ON(iocb->ki_pos != pos); ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); - if ((ret > 0 || ret == -EIOCBQUEUED) && - ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if (ret > 0 || ret == -EIOCBQUEUED) { ssize_t err; - err = sync_page_range_nolock(inode, mapping, pos, ret); + err = generic_write_sync(file, pos, ret); if (err < 0 && ret > 0) ret = err; } @@ -2515,8 +2511,7 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov, unsigned long nr_segs, loff_t pos) { struct file *file = iocb->ki_filp; - struct address_space *mapping = file->f_mapping; - struct inode *inode = mapping->host; + struct inode *inode = file->f_mapping->host; ssize_t ret; BUG_ON(iocb->ki_pos != pos); @@ -2525,11 +2520,10 @@ ssize_t generic_file_aio_write(struct kiocb *iocb, const struct iovec *iov, ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos); mutex_unlock(&inode->i_mutex); - if ((ret > 0 || ret == -EIOCBQUEUED) && - ((file->f_flags & O_SYNC) || IS_SYNC(inode))) { + if (ret > 0 || ret == -EIOCBQUEUED) { ssize_t err; - err = sync_page_range(inode, mapping, pos, ret); + err = generic_write_sync(file, pos, ret); if (err < 0 && ret > 0) ret = err; } -- 1.6.0.2 From sandeen@sandeen.net Wed Sep 2 09:30:47 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82EUQB2008658 for ; Wed, 2 Sep 2009 09:30:36 -0500 X-ASG-Debug-ID: 1251901872-74dd02e70000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi 
Received: from mail.sandeen.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5573C15A250F for ; Wed, 2 Sep 2009 07:31:15 -0700 (PDT) Received: from mail.sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com with ESMTP id a3qYz1Voi2eCIiTT for ; Wed, 02 Sep 2009 07:31:15 -0700 (PDT) Received: from Liberator.local (unknown [10.0.0.176]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.sandeen.net (Postfix) with ESMTP id 430CAAA60C4; Wed, 2 Sep 2009 09:31:10 -0500 (CDT) Message-ID: <4A9E81AD.70003@sandeen.net> Date: Wed, 02 Sep 2009 09:31:09 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.23 (Macintosh/20090812) MIME-Version: 1.0 To: RUMI Szabolcs CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Structure needs cleaning? (take #2) Subject: Re: Structure needs cleaning? (take #2) References: <20090902152245.b2969883.rumi_ml@rtfm.hu> In-Reply-To: <20090902152245.b2969883.rumi_ml@rtfm.hu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Connect: sandeen.net[209.173.210.139] X-Barracuda-Start-Time: 1251901880 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7910 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean RUMI Szabolcs wrote: > Hi! > > Sorry but my previous post was missing the important first two lines: Yes, thanks. 
:) > d62c8000: 2c 30 78 41 41 41 41 30 30 30 30 2c 36 2c 30 78 ,0xAAAA0000,6,0x > Filesystem "sda10": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c. Caller 0xc02cc790 This is on-disk corruption: it found a bad magic number in something it expected to be metadata. You should run xfs_repair. Run it with -n, or on a restored xfs_metadump image, as a dry run first if you prefer. -Eric > Pid: 29510, comm: mc Tainted: P 2.6.29-gentoo-r5-PAE #1 > Call Trace: > [] xfs_da_do_buf+0x8c4/0x900 > [] xfs_da_read_buf+0x30/0x40 > [] xfs_da_read_buf+0x30/0x40 > [] pollwake+0x0/0x50 > [] pollwake+0x0/0x50 > [] xfs_da_read_buf+0x30/0x40 > [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 > [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 > [] xfs_dir2_leaf_lookup+0x27/0xc0 > [] xfs_dir2_isleaf+0x1f/0x60 > [] xfs_dir_lookup+0xd8/0x180 > [] xfs_lookup+0x6b/0xf0 > [] xfs_vn_lookup+0x55/0xa0 > [] do_lookup+0x1ba/0x1e0 > [] __link_path_walk+0x6cd/0xd60 > [] xfs_dir2_leaf_getdents+0x5ff/0xad0 > [] path_walk+0x54/0xc0 > [] do_path_lookup+0x83/0x170 > [] getname+0x9b/0xe0 > [] user_path_at+0x5a/0x90 > [] vfs_lstat_fd+0x1f/0x50 > [] sys_lstat64+0xf/0x30 > [] touch_atime+0x14/0x130 > [] vfs_readdir+0x78/0xb0 > [] sys_getdents64+0xa1/0xd0 > [] sysenter_do_call+0x12/0x25 > > Thanks, > Sab > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs > From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:43:58 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HhWAf033626 for ; Wed, 2 Sep 2009 12:43:48 -0500 X-ASG-Debug-ID: 1251913449-4f34024d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from
bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4EDDF15B1F79 for ; Wed, 2 Sep 2009 10:44:09 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id gtPnfZPFYPYSFkjM for ; Wed, 02 Sep 2009 10:44:09 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MitsN-0002nG-T1 for xfs@oss.sgi.com; Wed, 02 Sep 2009 17:44:07 +0000 Date: Wed, 2 Sep 2009 13:44:07 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: xfsprogs: remove unused scripts Subject: xfsprogs: remove unused scripts Message-ID: <20090902174407.GA9759@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251913469 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean xfs_check64.sh and xfs_ncheck64.sh are outdated copies of xfs_check.sh and xfs_ncheck.sh that call a non-existent xfs_db64 binary. They are never installed or otherwise used, so remove them. They are probably a leftover from IRIX. Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/db/Makefile =================================================================== --- xfsprogs-dev.orig/db/Makefile 2009-09-02 14:35:19.413268749 -0300 +++ xfsprogs-dev/db/Makefile 2009-09-02 14:35:29.081614935 -0300 @@ -15,7 +15,6 @@ HFILES = addr.h agf.h agfl.h agi.h attr. 
text.h type.h write.h attrset.h CFILES = $(HFILES:.h=.c) LSRCFILES = xfs_admin.sh xfs_check.sh xfs_ncheck.sh xfs_metadump.sh -LSRCFILES += xfs_check64.sh xfs_ncheck64.sh LLDLIBS = $(LIBXFS) $(LIBXLOG) $(LIBUUID) $(LIBRT) $(LIBPTHREAD) LTDEPENDENCIES = $(LIBXFS) $(LIBXLOG) Index: xfsprogs-dev/db/xfs_check64.sh =================================================================== --- xfsprogs-dev.orig/db/xfs_check64.sh 2009-09-02 14:35:39.477268924 -0300 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,36 +0,0 @@ -#!/bin/sh -f -# -# Copyright (c) 2000-2003 Silicon Graphics, Inc. All Rights Reserved. -# - -OPTS=" " -DBOPTS=" " -USAGE="Usage: xfs_check64 [-fsvV] [-l logdev] [-i ino]... [-b bno]... special" - -while getopts "b:fi:l:stvV" c -do - case $c in - s) OPTS=$OPTS"-s ";; - t) OPTS=$OPTS"-t ";; - v) OPTS=$OPTS"-v ";; - V) OPTS=$OPTS"-V ";; - i) OPTS=$OPTS"-i "$OPTARG" ";; - b) OPTS=$OPTS"-b "$OPTARG" ";; - f) DBOPTS=" -f";; - l) DBOPTS=$DBOPTS" -l "$OPTARG" ";; - \?) echo $USAGE 1>&2 - exit 2 - ;; - esac -done -set -- extra $@ -shift $OPTIND -case $# in - 1) xfs_db64$DBOPTS -F -i -p xfs_check64 -c "check$OPTS" $1 - status=$? - ;; - *) echo $USAGE 1>&2 - exit 2 - ;; -esac -exit $status Index: xfsprogs-dev/db/xfs_ncheck64.sh =================================================================== --- xfsprogs-dev.orig/db/xfs_ncheck64.sh 2009-09-02 14:35:33.717268763 -0300 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,34 +0,0 @@ -#!/bin/sh -f -# -# Copyright (c) 2000-2001 Silicon Graphics, Inc. All Rights Reserved. -# - -OPTS=" " -DBOPTS=" " -USAGE="usage: xfs_ncheck64 [-sfvV] [-l logdev] [-i ino]... special" - -while getopts "b:fi:l:svV" c -do - case $c in - s) OPTS=$OPTS"-s ";; - i) OPTS=$OPTS"-i "$OPTARG" ";; - v) OPTS=$OPTS"-v ";; - V) OPTS=$OPTS"-V ";; - f) DBOPTS=" -f";; - l) DBOPTS=$DBOPTS" -l "$OPTARG" ";; - \?) 
echo $USAGE 1>&2 - exit 2 - ;; - esac -done -set -- extra $@ -shift $OPTIND -case $# in - 1) xfs_db64$DBOPTS -r -p xfs_ncheck64 -c "blockget -ns" -c "ncheck$OPTS" $1 - status=$? - ;; - *) echo $USAGE 1>&2 - exit 2 - ;; -esac -exit $status From sandeen@sandeen.net Wed Sep 2 12:45:08 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HilJK033685 for ; Wed, 2 Sep 2009 12:44:58 -0500 X-ASG-Debug-ID: 1251913523-70a203220000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5407141E181 for ; Wed, 2 Sep 2009 10:45:23 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by cuda.sgi.com with ESMTP id 7XDAGu5KGlkSi4ex for ; Wed, 02 Sep 2009 10:45:23 -0700 (PDT) Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n82HjCcs003479; Wed, 2 Sep 2009 13:45:12 -0400 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n82HjB3Q003113; Wed, 2 Sep 2009 13:45:12 -0400 Message-ID: <4A9EAF27.6090109@sandeen.net> Date: Wed, 02 Sep 2009 12:45:11 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.21 (X11/20090320) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] xfstests: fix 192 for external logs and enable it by default Subject: Re: [PATCH] xfstests: fix 192 for external logs and enable it by default References: <20090826220836.GA18119@infradead.org> In-Reply-To: <20090826220836.GA18119@infradead.org> Content-Type: text/plain; 
charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 X-Barracuda-Connect: mx1.redhat.com[209.132.183.28] X-Barracuda-Start-Time: 1251913544 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7922 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Christoph Hellwig wrote: > Use _test_mount instead of plain mount to make it work with external logs. > > Enable it by default now that it runs everywhere. > > Signed-off-by: Christoph Hellwig Reviewed-by: Eric Sandeen > Index: xfstests-dev/group > =================================================================== > --- xfstests-dev.orig/group 2009-08-26 21:57:59.000000000 +0000 > +++ xfstests-dev/group 2009-08-26 21:58:06.000000000 +0000 > @@ -301,7 +301,7 @@ prealloc > 189 mount auto quick > 190 rw auto quick > 191 nfs4acl auto > -192 atime > +192 atime auto > 193 metadata auto quick > 194 rw auto > 195 ioctl dump auto quick > Index: xfstests-dev/192 > =================================================================== > --- xfstests-dev.orig/192 2009-08-26 21:57:51.000000000 +0000 > +++ xfstests-dev/192 2009-08-26 22:04:16.000000000 +0000 > @@ -67,7 +67,7 @@ time2=`_access_time $testfile | tee -a $ > > cd / > umount $TEST_DIR > -mount $TEST_DIR > +_test_mount > time3=`_access_time $testfile | tee -a $seq.full` > > delta1=`expr $time2 - $time1` > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs > From 
BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:47:30 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82Hl5fC033823 for ; Wed, 2 Sep 2009 12:47:20 -0500 X-ASG-Debug-ID: 1251913682-4b1d00ca0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id BC3F641E203 for ; Wed, 2 Sep 2009 10:48:02 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id B6Cfn3wiTbClRBks for ; Wed, 02 Sep 2009 10:48:02 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Mitt6-00035M-01; Wed, 02 Sep 2009 17:44:52 +0000 Date: Wed, 2 Sep 2009 13:44:51 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Michael Monnerie X-ASG-Orig-Subj: [PATCH] xfsprogs: fix the -V option for various shell scripts Subject: [PATCH] xfsprogs: fix the -V option for various shell scripts Message-ID: <20090902174451.GB9759@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251913682 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean For most of the shell scripts wrapping xfs_db or xfs_growfs, the -V option is not implemented correctly. 
If we just add -V to the options for the underlying binaries we will never actually call it because the mandatory device / mount point argument is missing. Instead just call the underlying command with -V directly and exit early. This is what xfs_bmap.sh, xfs_mkfile and xfs_metadump.sh are already doing. Signed-off-by: Christoph Hellwig Reported-by: Michael Monnerie Index: xfsprogs-dev/db/xfs_admin.sh =================================================================== --- xfsprogs-dev.orig/db/xfs_admin.sh 2009-09-02 14:36:22.045272960 -0300 +++ xfsprogs-dev/db/xfs_admin.sh 2009-09-02 14:38:28.497305969 -0300 @@ -19,7 +19,10 @@ do L) DB_OPTS=$DB_OPTS" -c 'label "$OPTARG"'";; u) DB_OPTS=$DB_OPTS" -r -c uuid";; U) DB_OPTS=$DB_OPTS" -c 'uuid "$OPTARG"'";; - V) DB_OPTS=$DB_OPTS" -V";; + V) xfs_db -p xfs_admin -V + status=$? + exit $status + ;; \?) echo $USAGE 1>&2 exit 2 ;; Index: xfsprogs-dev/db/xfs_check.sh =================================================================== --- xfsprogs-dev.orig/db/xfs_check.sh 2009-09-02 14:36:22.069271261 -0300 +++ xfsprogs-dev/db/xfs_check.sh 2009-09-02 14:38:28.501271696 -0300 @@ -13,11 +13,14 @@ do s) OPTS=$OPTS"-s ";; t) OPTS=$OPTS"-t ";; v) OPTS=$OPTS"-v ";; - V) OPTS=$OPTS"-V ";; i) OPTS=$OPTS"-i "$OPTARG" ";; b) OPTS=$OPTS"-b "$OPTARG" ";; f) DBOPTS=$DBOPTS" -f";; l) DBOPTS=$DBOPTS" -l "$OPTARG" ";; + V) xfs_db -p xfs_check -V + status=$? + exit $status + ;; \?) echo $USAGE 1>&2 exit 2 ;; Index: xfsprogs-dev/db/xfs_ncheck.sh =================================================================== --- xfsprogs-dev.orig/db/xfs_ncheck.sh 2009-09-02 14:36:22.089271988 -0300 +++ xfsprogs-dev/db/xfs_ncheck.sh 2009-09-02 14:38:28.501271696 -0300 @@ -14,9 +14,12 @@ do s) OPTS=$OPTS"-s ";; i) OPTS=$OPTS"-i "$OPTARG" ";; v) OPTS=$OPTS"-v ";; - V) OPTS=$OPTS"-V ";; f) DBOPTS=$DBOPTS" -f";; l) DBOPTS=$DBOPTS" -l "$OPTARG" ";; + V) xfs_db -p xfs_ncheck -V + status=$? + exit $status + ;; \?) 
echo $USAGE 1>&2 exit 2 ;; Index: xfsprogs-dev/growfs/xfs_info.sh =================================================================== --- xfsprogs-dev.orig/growfs/xfs_info.sh 2009-09-02 14:36:22.101270370 -0300 +++ xfsprogs-dev/growfs/xfs_info.sh 2009-09-02 14:38:28.505312364 -0300 @@ -10,7 +10,10 @@ while getopts "t:V" c do case $c in t) OPTS="-t $OPTARG" ;; - V) OPTS="-V $OPTARG" ;; + V) xfs_growfs -p xfs_info -V + status=$? + exit $status + ;; *) echo $USAGE 1>&2 exit 2 ;; From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:08 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82Hvhqg034365 for ; Wed, 2 Sep 2009 12:57:58 -0500 X-ASG-Debug-ID: 1251914319-4b1f01250000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 30CC841E03E for ; Wed, 2 Sep 2009 10:58:40 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id vVvhnHtGq9HZq35q for ; Wed, 02 Sep 2009 10:58:40 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6R-0006SB-N3 for xfs@oss.sgi.com; Wed, 02 Sep 2009 17:58:39 +0000 Message-Id: <20090902175531.469184575@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:31 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 00/14] repair memory usage reductions Subject: [PATCH 00/14] repair memory usage reductions X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: 
bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914320 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean This is a respin of the patches Barry Naujok wrote at SGI for reducing the memory usage in repair. I've split it up, fixed a few small bugs and added two preparatory cleanups - but all the real work is Barry's. There has been lots of heavy testing on large filesystems by Barry on the original patches, and quite a lot of testing on slightly smaller filesystems by me. These were all ad-hoc tests as XFSQA coverage is rather low on repair. My plan is to add various additional test cases to XFSQA, both for intentional corruptions and for reproducing past reported bugs, before we release these patches in xfsprogs. But I think it would be good if we could get them into the development git tree to get wider coverage already. From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:08 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_21 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HvhpT034366 for ; Wed, 2 Sep 2009 12:57:58 -0500 X-ASG-Debug-ID: 1251914320-70f603da0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 698B341E03E for ; Wed, 2 Sep 2009 10:58:40 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id yHJhGpswi06gMZKs for ; Wed, 02 Sep 2009 10:58:40 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 
1Miu6R-0006Sm-W1 for xfs@oss.sgi.com; Wed, 02 Sep 2009 17:58:40 +0000 Message-Id: <20090902175839.915684396@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:32 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 01/14] repair: merge scanfunc_bno and scanfunc_cnt Subject: [PATCH 01/14] repair: merge scanfunc_bno and scanfunc_cnt References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-unify-scanfunc-bno-cnt X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914320 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Those two functions are almost identical. The big difference is that we only move blocks from XR_E_FREE1 to XR_E_FREE state when processing the cnt btree. Besides that we print bno vs cnt in the messages and obviously validate a slightly different magic number in the header. Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-08-21 18:24:26.000000000 +0000 +++ xfsprogs-dev/repair/scan.c 2009-08-21 18:40:59.000000000 +0000 @@ -439,15 +439,16 @@ _("out-of-order bmap key (file offset) i } void -scanfunc_bno( +scanfunc_allocbt( struct xfs_btree_block *block, int level, xfs_agblock_t bno, xfs_agnumber_t agno, int suspect, - int isroot - ) + int isroot, + __uint32_t magic) { + const char *name; xfs_agblock_t b, e; int i; xfs_alloc_ptr_t *pp; @@ -456,16 +457,18 @@ scanfunc_bno( int numrecs; int state; - if (be32_to_cpu(block->bb_magic) != XFS_ABTB_MAGIC) { - do_warn(_("bad magic # %#x in btbno block %d/%d\n"), - be32_to_cpu(block->bb_magic), agno, bno); + name = (magic == XFS_ABTB_MAGIC) ? 
"bno" : "cnt"; + + if (be32_to_cpu(block->bb_magic) != magic) { + do_warn(_("bad magic # %#x in bt%s block %d/%d\n"), + be32_to_cpu(block->bb_magic), name, agno, bno); hdr_errors++; if (suspect) return; } if (be16_to_cpu(block->bb_level) != level) { - do_warn(_("expected level %d got %d in btbno block %d/%d\n"), - level, be16_to_cpu(block->bb_level), agno, bno); + do_warn(_("expected level %d got %d in bt%s block %d/%d\n"), + level, be16_to_cpu(block->bb_level), name, agno, bno); hdr_errors++; if (suspect) return; @@ -483,8 +486,8 @@ scanfunc_bno( default: set_agbno_state(mp, agno, bno, XR_E_MULT); do_warn( -_("bno freespace btree block claimed (state %d), agno %d, bno %d, suspect %d\n"), - state, agno, bno, suspect); +_("%s freespace btree block claimed (state %d), agno %d, bno %d, suspect %d\n"), + name, state, agno, bno, suspect); return; } @@ -520,15 +523,27 @@ _("bno freespace btree block claimed (st continue; for (b = be32_to_cpu(rp[i].ar_startblock); b < e; b++) { - if (get_agbno_state(mp, agno, b) - == XR_E_UNKNOWN) + state = get_agbno_state(mp, agno, b); + switch (state) { + case XR_E_UNKNOWN: set_agbno_state(mp, agno, b, XR_E_FREE1); - else { + break; + case XR_E_FREE1: + /* + * no warning messages -- we'll catch + * FREE1 blocks later + */ + if (magic != XFS_ABTB_MAGIC) { + set_agbno_state(mp, agno, b, + XR_E_FREE); + break; + } + default: do_warn( - _("block (%d,%d) multiply claimed by bno space tree, state - %d\n"), - agno, b, - get_agbno_state(mp, agno, b)); + _("block (%d,%d) multiply claimed by %s space tree, state - %d\n"), + agno, b, name, state); + break; } } } @@ -575,12 +590,26 @@ _("bno freespace btree block claimed (st */ if (be32_to_cpu(pp[i]) != 0 && verify_agbno(mp, agno, be32_to_cpu(pp[i]))) - scan_sbtree(be32_to_cpu(pp[i]), level, agno, - suspect, scanfunc_bno, 0); + scan_sbtree(be32_to_cpu(pp[i]), level, agno, suspect, + (magic == XFS_ABTB_MAGIC) ? 
+ scanfunc_bno : scanfunc_cnt, 0); } } void +scanfunc_bno( + struct xfs_btree_block *block, + int level, + xfs_agblock_t bno, + xfs_agnumber_t agno, + int suspect, + int isroot) +{ + return scanfunc_allocbt(block, level, bno, agno, + suspect, isroot, XFS_ABTB_MAGIC); +} + +void scanfunc_cnt( struct xfs_btree_block *block, int level, @@ -590,136 +619,8 @@ scanfunc_cnt( int isroot ) { - xfs_alloc_ptr_t *pp; - xfs_alloc_rec_t *rp; - xfs_agblock_t b, e; - int i; - int hdr_errors; - int numrecs; - int state; - - hdr_errors = 0; - - if (be32_to_cpu(block->bb_magic) != XFS_ABTC_MAGIC) { - do_warn(_("bad magic # %#x in btcnt block %d/%d\n"), - be32_to_cpu(block->bb_magic), agno, bno); - hdr_errors++; - if (suspect) - return; - } - if (be16_to_cpu(block->bb_level) != level) { - do_warn(_("expected level %d got %d in btcnt block %d/%d\n"), - level, be16_to_cpu(block->bb_level), agno, bno); - hdr_errors++; - if (suspect) - return; - } - - /* - * check for btree blocks multiply claimed - */ - state = get_agbno_state(mp, agno, bno); - - switch (state) { - case XR_E_UNKNOWN: - set_agbno_state(mp, agno, bno, XR_E_FS_MAP); - break; - default: - set_agbno_state(mp, agno, bno, XR_E_MULT); - do_warn( -_("bcnt freespace btree block claimed (state %d), agno %d, bno %d, suspect %d\n"), - state, agno, bno, suspect); - return; - } - - numrecs = be16_to_cpu(block->bb_numrecs); - - if (level == 0) { - if (numrecs > mp->m_alloc_mxr[0]) { - numrecs = mp->m_alloc_mxr[0]; - hdr_errors++; - } - if (isroot == 0 && numrecs < mp->m_alloc_mnr[0]) { - numrecs = mp->m_alloc_mnr[0]; - hdr_errors++; - } - - if (hdr_errors) - suspect++; - - rp = XFS_ALLOC_REC_ADDR(mp, block, 1); - for (i = 0; i < numrecs; i++) { - if (be32_to_cpu(rp[i].ar_blockcount) == 0 || - be32_to_cpu(rp[i].ar_startblock) == 0 || - !verify_agbno(mp, agno, be32_to_cpu( - rp[i].ar_startblock)) || - be32_to_cpu(rp[i].ar_blockcount) > - MAXEXTLEN) - continue; - - e = be32_to_cpu(rp[i].ar_startblock) + - be32_to_cpu(rp[i].ar_blockcount); 
- if (!verify_agbno(mp, agno, e - 1)) - continue; - for (b = be32_to_cpu(rp[i].ar_startblock); b < e; b++) { - state = get_agbno_state(mp, agno, b); - /* - * no warning messages -- we'll catch - * FREE1 blocks later - */ - switch (state) { - case XR_E_FREE1: - set_agbno_state(mp, agno, b, XR_E_FREE); - break; - case XR_E_UNKNOWN: - set_agbno_state(mp, agno, b, - XR_E_FREE1); - break; - default: - do_warn( - _("block (%d,%d) already used, state %d\n"), - agno, b, state); - break; - } - } - } - return; - } - - /* - * interior record - */ - pp = XFS_ALLOC_PTR_ADDR(mp, block, 1, mp->m_alloc_mxr[1]); - - if (numrecs > mp->m_alloc_mxr[1]) { - numrecs = mp->m_alloc_mxr[1]; - hdr_errors++; - } - if (isroot == 0 && numrecs < mp->m_alloc_mnr[1]) { - numrecs = mp->m_alloc_mnr[1]; - hdr_errors++; - } - - /* - * don't pass bogus tree flag down further if this block - * looked ok. bail out if two levels in a row look bad. - */ - - if (suspect && !hdr_errors) - suspect = 0; - - if (hdr_errors) { - if (suspect) - return; - else suspect++; - } - - for (i = 0; i < numrecs; i++) { - if (be32_to_cpu(pp[i]) != 0 && verify_agbno(mp, agno, - be32_to_cpu(pp[i]))) - scan_sbtree(be32_to_cpu(pp[i]), level, agno, - suspect, scanfunc_cnt, 0); - } + return scanfunc_allocbt(block, level, bno, agno, + suspect, isroot, XFS_ABTC_MAGIC); } /* From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:09 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82Hvi2s034370 for ; Wed, 2 Sep 2009 12:57:59 -0500 X-ASG-Debug-ID: 1251914320-2afc03bc0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with 
ESMTP id D834215B1A29 for ; Wed, 2 Sep 2009 10:58:40 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id j5ECizDNz8VHycsP for ; Wed, 02 Sep 2009 10:58:40 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6S-0006UK-FV; Wed, 02 Sep 2009 17:58:40 +0000 Message-Id: <20090902175840.403232401@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:35 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 04/14] repair: split up scanfunc_ino Subject: [PATCH 04/14] repair: split up scanfunc_ino References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-split-scanfunc_ino X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914320 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Split out a helper to scan a single inode chunk for suspect inodes from scanfunc_ino to make it more readable. 
Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-08-21 19:00:15.000000000 +0000 +++ xfsprogs-dev/repair/scan.c 2009-08-21 19:03:26.000000000 +0000 @@ -625,6 +625,167 @@ scanfunc_cnt( suspect, isroot, XFS_ABTC_MAGIC); } +static int +scan_single_ino_chunk( + xfs_agnumber_t agno, + xfs_inobt_rec_t *rp, + int suspect) +{ + xfs_ino_t lino; + xfs_agino_t ino; + xfs_agblock_t agbno; + int j; + int nfree; + int off; + int state; + ino_tree_node_t *ino_rec, *first_rec, *last_rec; + + ino = be32_to_cpu(rp->ir_startino); + off = XFS_AGINO_TO_OFFSET(mp, ino); + agbno = XFS_AGINO_TO_AGBNO(mp, ino); + lino = XFS_AGINO_TO_INO(mp, agno, ino); + + /* + * on multi-block block chunks, all chunks start + * at the beginning of the block. with multi-chunk + * blocks, all chunks must start on 64-inode boundaries + * since each block can hold N complete chunks. if + * fs has aligned inodes, all chunks must start + * at a fs_ino_alignment*N'th agbno. skip recs + * with badly aligned starting inodes. + */ + if (ino == 0 || + (inodes_per_block <= XFS_INODES_PER_CHUNK && off != 0) || + (inodes_per_block > XFS_INODES_PER_CHUNK && + off % XFS_INODES_PER_CHUNK != 0) || + (fs_aligned_inodes && agbno % fs_ino_alignment != 0)) { + do_warn( + _("badly aligned inode rec (starting inode = %llu)\n"), + lino); + suspect++; + } + + /* + * verify numeric validity of inode chunk first + * before inserting into a tree. don't have to + * worry about the overflow case because the + * starting ino number of a chunk can only get + * within 255 inodes of max (NULLAGINO). if it + * gets closer, the agino number will be illegal + * as the agbno will be too large. 
+ */ + if (verify_aginum(mp, agno, ino)) { + do_warn( +_("bad starting inode # (%llu (0x%x 0x%x)) in ino rec, skipping rec\n"), + lino, agno, ino); + return ++suspect; + } + + if (verify_aginum(mp, agno, + ino + XFS_INODES_PER_CHUNK - 1)) { + do_warn( +_("bad ending inode # (%llu (0x%x 0x%x)) in ino rec, skipping rec\n"), + lino + XFS_INODES_PER_CHUNK - 1, + agno, ino + XFS_INODES_PER_CHUNK - 1); + return ++suspect; + } + + /* + * set state of each block containing inodes + */ + if (off == 0 && !suspect) { + for (j = 0; + j < XFS_INODES_PER_CHUNK; + j += mp->m_sb.sb_inopblock) { + agbno = XFS_AGINO_TO_AGBNO(mp, ino + j); + state = get_agbno_state(mp, agno, agbno); + if (state == XR_E_UNKNOWN) { + set_agbno_state(mp, agno, agbno, XR_E_INO); + } else if (state == XR_E_INUSE_FS && agno == 0 && + ino + j >= first_prealloc_ino && + ino + j < last_prealloc_ino) { + set_agbno_state(mp, agno, agbno, XR_E_INO); + } else { + do_warn( +_("inode chunk claims used block, inobt block - agno %d, bno %d, inopb %d\n"), + agno, agbno, + mp->m_sb.sb_inopblock); + /* + * XXX - maybe should mark + * block a duplicate + */ + return ++suspect; + } + } + } + + /* + * ensure only one avl entry per chunk + */ + find_inode_rec_range(agno, ino, ino + XFS_INODES_PER_CHUNK, + &first_rec, &last_rec); + if (first_rec != NULL) { + /* + * this chunk overlaps with one (or more) + * already in the tree + */ + do_warn( +_("inode rec for ino %llu (%d/%d) overlaps existing rec (start %d/%d)\n"), + lino, agno, ino, agno, first_rec->ino_startnum); + suspect++; + + /* + * if the 2 chunks start at the same place, + * then we don't have to put this one + * in the uncertain list. go to the next one. + */ + if (first_rec->ino_startnum == ino) + return suspect; + } + + nfree = 0; + + /* + * now mark all the inodes as existing and free or used. + * if the tree is suspect, put them into the uncertain + * inode tree. 
+ */ + if (!suspect) { + if (XFS_INOBT_IS_FREE_DISK(rp, 0)) { + nfree++; + ino_rec = set_inode_free_alloc(agno, ino); + } else { + ino_rec = set_inode_used_alloc(agno, ino); + } + for (j = 1; j < XFS_INODES_PER_CHUNK; j++) { + if (XFS_INOBT_IS_FREE_DISK(rp, j)) { + nfree++; + set_inode_free(ino_rec, j); + } else { + set_inode_used(ino_rec, j); + } + } + } else { + for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { + if (XFS_INOBT_IS_FREE_DISK(rp, j)) { + nfree++; + add_aginode_uncertain(agno, ino + j, 1); + } else { + add_aginode_uncertain(agno, ino + j, 0); + } + } + } + + if (nfree != be32_to_cpu(rp->ir_freecount)) { + do_warn(_("ir_freecount/free mismatch, inode " + "chunk %d/%d, freecount %d nfree %d\n"), + agno, ino, be32_to_cpu(rp->ir_freecount), nfree); + } + + return suspect; +} + + /* * this one walks the inode btrees sucking the info there into * the incore avl tree. We try and rescue corrupted btree records @@ -651,18 +812,11 @@ scanfunc_ino( int isroot ) { - xfs_ino_t lino; int i; - xfs_agino_t ino; - xfs_agblock_t agbno; - int j; - int nfree; - int off; int numrecs; int state; xfs_inobt_ptr_t *pp; xfs_inobt_rec_t *rp; - ino_tree_node_t *ino_rec, *first_rec, *last_rec; int hdr_errors; hdr_errors = 0; @@ -737,165 +891,8 @@ _("inode btree block claimed (state %d), * of INODES_PER_CHUNK (64) inodes. off is the offset into * the block. skip processing of bogus records. */ - for (i = 0; i < numrecs; i++) { - ino = be32_to_cpu(rp[i].ir_startino); - off = XFS_AGINO_TO_OFFSET(mp, ino); - agbno = XFS_AGINO_TO_AGBNO(mp, ino); - lino = XFS_AGINO_TO_INO(mp, agno, ino); - /* - * on multi-block block chunks, all chunks start - * at the beginning of the block. with multi-chunk - * blocks, all chunks must start on 64-inode boundaries - * since each block can hold N complete chunks. if - * fs has aligned inodes, all chunks must start - * at a fs_ino_alignment*N'th agbno. skip recs - * with badly aligned starting inodes. 
- */ - if (ino == 0 || - (inodes_per_block <= XFS_INODES_PER_CHUNK && - off != 0) || - (inodes_per_block > XFS_INODES_PER_CHUNK && - off % XFS_INODES_PER_CHUNK != 0) || - (fs_aligned_inodes && - agbno % fs_ino_alignment != 0)) { - do_warn( - _("badly aligned inode rec (starting inode = %llu)\n"), - lino); - suspect++; - } - - /* - * verify numeric validity of inode chunk first - * before inserting into a tree. don't have to - * worry about the overflow case because the - * starting ino number of a chunk can only get - * within 255 inodes of max (NULLAGINO). if it - * gets closer, the agino number will be illegal - * as the agbno will be too large. - */ - if (verify_aginum(mp, agno, ino)) { - do_warn( -_("bad starting inode # (%llu (0x%x 0x%x)) in ino rec, skipping rec\n"), - lino, agno, ino); - suspect++; - continue; - } - - if (verify_aginum(mp, agno, - ino + XFS_INODES_PER_CHUNK - 1)) { - do_warn( -_("bad ending inode # (%llu (0x%x 0x%x)) in ino rec, skipping rec\n"), - lino + XFS_INODES_PER_CHUNK - 1, - agno, ino + XFS_INODES_PER_CHUNK - 1); - suspect++; - continue; - } - - /* - * set state of each block containing inodes - */ - if (off == 0 && !suspect) { - for (j = 0; - j < XFS_INODES_PER_CHUNK; - j += mp->m_sb.sb_inopblock) { - agbno = XFS_AGINO_TO_AGBNO(mp, ino + j); - state = get_agbno_state(mp, - agno, agbno); - - if (state == XR_E_UNKNOWN) { - set_agbno_state(mp, agno, - agbno, XR_E_INO); - } else if (state == XR_E_INUSE_FS && - agno == 0 && - ino + j >= first_prealloc_ino && - ino + j < last_prealloc_ino) { - set_agbno_state(mp, agno, - agbno, XR_E_INO); - } else { - do_warn( -_("inode chunk claims used block, inobt block - agno %d, bno %d, inopb %d\n"), - agno, bno, - mp->m_sb.sb_inopblock); - suspect++; - /* - * XXX - maybe should mark - * block a duplicate - */ - continue; - } - } - } - /* - * ensure only one avl entry per chunk - */ - find_inode_rec_range(agno, ino, - ino + XFS_INODES_PER_CHUNK, - &first_rec, - &last_rec); - if (first_rec != NULL) { 
- /* - * this chunk overlaps with one (or more) - * already in the tree - */ - do_warn( -_("inode rec for ino %llu (%d/%d) overlaps existing rec (start %d/%d)\n"), - lino, agno, ino, - agno, first_rec->ino_startnum); - suspect++; - - /* - * if the 2 chunks start at the same place, - * then we don't have to put this one - * in the uncertain list. go to the next one. - */ - if (first_rec->ino_startnum == ino) - continue; - } - - nfree = 0; - - /* - * now mark all the inodes as existing and free or used. - * if the tree is suspect, put them into the uncertain - * inode tree. - */ - if (!suspect) { - if (XFS_INOBT_IS_FREE_DISK(&rp[i], 0)) { - nfree++; - ino_rec = set_inode_free_alloc(agno, - ino); - } else { - ino_rec = set_inode_used_alloc(agno, - ino); - } - for (j = 1; j < XFS_INODES_PER_CHUNK; j++) { - if (XFS_INOBT_IS_FREE_DISK(&rp[i], j)) { - nfree++; - set_inode_free(ino_rec, j); - } else { - set_inode_used(ino_rec, j); - } - } - } else { - for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { - if (XFS_INOBT_IS_FREE_DISK(&rp[i], j)) { - nfree++; - add_aginode_uncertain(agno, - ino + j, 1); - } else { - add_aginode_uncertain(agno, - ino + j, 0); - } - } - } - - if (nfree != be32_to_cpu(rp[i].ir_freecount)) { - do_warn(_("ir_freecount/free mismatch, inode " - "chunk %d/%d, freecount %d nfree %d\n"), - agno, ino, - be32_to_cpu(rp[i].ir_freecount), nfree); - } - } + for (i = 0; i < numrecs; i++) + suspect = scan_single_ino_chunk(agno, &rp[i], suspect); if (suspect) bad_ino_btree = 1; From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:10 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HviEZ034373 for ; Wed, 2 Sep 2009 12:58:00 -0500 X-ASG-Debug-ID: 
1251914320-2afd03d90000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9CCE615B1A26 for ; Wed, 2 Sep 2009 10:58:40 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id Wrw9h84qUWDPZ2qY for ; Wed, 02 Sep 2009 10:58:40 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6S-0006Tn-9v for xfs@oss.sgi.com; Wed, 02 Sep 2009 17:58:40 +0000 Message-Id: <20090902175840.224768080@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:34 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH 03/14] repair: kill B_IS_META flag Subject: [PATCH 03/14] repair: kill B_IS_META flag References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-kill-B_IS_META X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914320 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean B_IS_META is the inverse of B_IS_INODE, which is not really obvious from its use. So just use !B_IS_INODE to make this clearer. Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/prefetch.c =================================================================== --- xfsprogs-dev.orig/repair/prefetch.c 2009-08-20 00:02:25.000000000 +0000 +++ xfsprogs-dev/repair/prefetch.c 2009-08-20 00:05:36.000000000 +0000 @@ -64,7 +64,6 @@ * the buffer is for an inode or other metadata. 
*/ #define B_IS_INODE(f) (((f) & 5) == 0) -#define B_IS_META(f) (((f) & 5) != 0) #define DEF_BATCH_BYTES 0x10000 @@ -131,7 +130,7 @@ if (fsbno > args->last_bno_read) { radix_tree_insert(&args->primary_io_queue, fsbno, bp); - if (B_IS_META(flag)) + if (!B_IS_INODE(flag)) radix_tree_tag_set(&args->primary_io_queue, fsbno, 0); else { args->inode_bufs_queued++; @@ -153,7 +152,7 @@ (long long)XFS_BUF_ADDR(bp), args->agno, fsbno, args->last_bno_read); #endif - ASSERT(B_IS_META(flag)); + ASSERT(!B_IS_INODE(flag)); XFS_BUF_SET_PRIORITY(bp, B_DIR_META_2); radix_tree_insert(&args->secondary_io_queue, fsbno, bp); } From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:09 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_35 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HvixB034371 for ; Wed, 2 Sep 2009 12:57:59 -0500 X-ASG-Debug-ID: 1251914321-4f3902df0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6191715B1A28 for ; Wed, 2 Sep 2009 10:58:41 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id ByHbuwNH3VaytPYI for ; Wed, 02 Sep 2009 10:58:41 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6T-0006WA-1H; Wed, 02 Sep 2009 17:58:41 +0000 Message-Id: <20090902175840.934778714@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:38 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 07/14] repair: use single prefetch queue Subject: [PATCH 07/14] repair: use single prefetch 
queue References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-use-single-prefetch-queue X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914321 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean We don't need two prefetch queues as we guarantee execution in order anyway. XXX: description could use some more details. Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/prefetch.c =================================================================== --- xfsprogs-dev.orig/repair/prefetch.c 2009-08-20 00:14:08.000000000 +0000 +++ xfsprogs-dev/repair/prefetch.c 2009-08-20 00:16:01.000000000 +0000 @@ -128,8 +128,9 @@ pthread_mutex_lock(&args->lock); + btree_insert(args->io_queue, fsbno, bp); + if (fsbno > args->last_bno_read) { - btree_insert(args->primary_io_queue, fsbno, bp); if (B_IS_INODE(flag)) { args->inode_bufs_queued++; if (args->inode_bufs_queued == IO_THRESHOLD) @@ -152,7 +153,6 @@ #endif ASSERT(!B_IS_INODE(flag)); XFS_BUF_SET_PRIORITY(bp, B_DIR_META_2); - btree_insert(args->secondary_io_queue, fsbno, bp); } pf_start_processing(args); @@ -405,7 +405,6 @@ pf_which_t which, void *buf) { - struct btree_root *queue; xfs_buf_t *bplist[MAX_BUFS]; unsigned int num; off64_t first_off, last_off, next_off; @@ -416,19 +415,22 @@ unsigned long max_fsbno; char *pbuf; - queue = (which != PF_SECONDARY) ? 
args->primary_io_queue - : args->secondary_io_queue; - - while (btree_find(queue, 0, &fsbno) != NULL) { - max_fsbno = fsbno + pf_max_fsbs; + for (;;) { num = 0; - - bplist[0] = btree_lookup(queue, fsbno); + if (which == PF_SECONDARY) { + bplist[0] = btree_find(args->io_queue, 0, &fsbno); + max_fsbno = MIN(fsbno + pf_max_fsbs, + args->last_bno_read); + } else { + bplist[0] = btree_find(args->io_queue, + args->last_bno_read, &fsbno); + max_fsbno = fsbno + pf_max_fsbs; + } while (bplist[num] && num < MAX_BUFS && fsbno < max_fsbno) { if (which != PF_META_ONLY || !B_IS_INODE(XFS_BUF_PRIORITY(bplist[num]))) num++; - bplist[num] = btree_lookup_next(queue, &fsbno); + bplist[num] = btree_lookup_next(args->io_queue, &fsbno); } if (!num) return; @@ -440,21 +442,22 @@ */ first_off = LIBXFS_BBTOOFF64(XFS_BUF_ADDR(bplist[0])); last_off = LIBXFS_BBTOOFF64(XFS_BUF_ADDR(bplist[num-1])) + - XFS_BUF_SIZE(bplist[num-1]); + XFS_BUF_SIZE(bplist[num-1]); while (last_off - first_off > pf_max_bytes) { num--; - last_off = LIBXFS_BBTOOFF64(XFS_BUF_ADDR(bplist[num-1])) + - XFS_BUF_SIZE(bplist[num-1]); + last_off = LIBXFS_BBTOOFF64(XFS_BUF_ADDR( + bplist[num-1])) + XFS_BUF_SIZE(bplist[num-1]); } - if (num < ((last_off - first_off) >> (mp->m_sb.sb_blocklog + 3))) { + if (num < ((last_off - first_off) >> + (mp->m_sb.sb_blocklog + 3))) { /* * not enough blocks for one big read, so determine * the number of blocks that are close enough. 
*/ last_off = first_off + XFS_BUF_SIZE(bplist[0]); for (i = 1; i < num; i++) { - next_off = LIBXFS_BBTOOFF64(XFS_BUF_ADDR(bplist[i])) + - XFS_BUF_SIZE(bplist[i]); + next_off = LIBXFS_BBTOOFF64(XFS_BUF_ADDR( + bplist[i])) + XFS_BUF_SIZE(bplist[i]); if (next_off - last_off > pf_batch_bytes) break; last_off = next_off; @@ -463,7 +466,7 @@ } for (i = 0; i < num; i++) { - if (btree_delete(queue, XFS_DADDR_TO_FSB(mp, + if (btree_delete(args->io_queue, XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bplist[i]))) == NULL) do_error(_("prefetch corruption\n")); } @@ -566,7 +569,7 @@ return NULL; pthread_mutex_lock(&args->lock); - while (!args->queuing_done || btree_find(args->primary_io_queue, 0, NULL)) { + while (!args->queuing_done || btree_find(args->io_queue, 0, NULL)) { #ifdef XR_PF_TRACE pftrace("waiting to start prefetch I/O for AG %d", args->agno); @@ -692,8 +695,7 @@ #endif pthread_mutex_lock(&args->lock); - ASSERT(btree_find(args->primary_io_queue, 0, NULL) == NULL); - ASSERT(btree_find(args->secondary_io_queue, 0, NULL) == NULL); + ASSERT(btree_find(args->io_queue, 0, NULL) == NULL); args->prefetch_done = 1; if (args->next_args) @@ -751,8 +753,7 @@ args = calloc(1, sizeof(prefetch_args_t)); - btree_init(&args->primary_io_queue); - btree_init(&args->secondary_io_queue); + btree_init(&args->io_queue); if (pthread_mutex_init(&args->lock, NULL) != 0) do_error(_("failed to initialize prefetch mutex\n")); if (pthread_cond_init(&args->start_reading, NULL) != 0) @@ -831,8 +832,7 @@ pthread_cond_destroy(&args->start_reading); pthread_cond_destroy(&args->start_processing); sem_destroy(&args->ra_count); - btree_destroy(args->primary_io_queue); - btree_destroy(args->secondary_io_queue); + btree_destroy(args->io_queue); free(args); } Index: xfsprogs-dev/repair/prefetch.h =================================================================== --- xfsprogs-dev.orig/repair/prefetch.h 2009-08-20 00:06:44.000000000 +0000 +++ xfsprogs-dev/repair/prefetch.h 2009-08-20 00:16:01.000000000 +0000 @@ -13,8 
+13,7 @@ pthread_mutex_t lock; pthread_t queuing_thread; pthread_t io_threads[PF_THREAD_COUNT]; - struct btree_root *primary_io_queue; - struct btree_root *secondary_io_queue; + struct btree_root *io_queue; pthread_cond_t start_reading; pthread_cond_t start_processing; int agno; From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:09 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_66, J_CHICKENPOX_75 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82Hvi4k034372 for ; Wed, 2 Sep 2009 12:57:59 -0500 X-ASG-Debug-ID: 1251914321-70ad03b70000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A23A141E03E for ; Wed, 2 Sep 2009 10:58:41 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id bai6ziy09rFjZjzI for ; Wed, 02 Sep 2009 10:58:41 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6T-0006Wp-79; Wed, 02 Sep 2009 17:58:41 +0000 Message-Id: <20090902175841.137658776@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:39 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 08/14] repair: clean up prefetch tracing Subject: [PATCH 08/14] repair: clean up prefetch tracing References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-cleanup-prefetch-tracing X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] 
X-Barracuda-Start-Time: 1251914321 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Define a dummy pftrace macro for the non-tracing case to reduce the ifdef hell, clean up a few trace calls and add proper init/exit handlers for the tracing setup and teardown. Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/dino_chunks.c =================================================================== --- xfsprogs-dev.orig/repair/dino_chunks.c 2009-08-19 23:42:32.000000000 +0000 +++ xfsprogs-dev/repair/dino_chunks.c 2009-08-20 00:16:53.000000000 +0000 @@ -629,10 +629,9 @@ cluster_count * sizeof(xfs_buf_t*)); for (bp_index = 0; bp_index < cluster_count; bp_index++) { -#ifdef XR_PF_TRACE pftrace("about to read off %llu in AG %d", (long long)XFS_AGB_TO_DADDR(mp, agno, agbno), agno); -#endif + bplist[bp_index] = libxfs_readbuf(mp->m_dev, XFS_AGB_TO_DADDR(mp, agno, agbno), XFS_FSB_TO_BB(mp, blks_per_cluster), 0); @@ -650,11 +649,9 @@ } agbno += blks_per_cluster; -#ifdef XR_PF_TRACE pftrace("readbuf %p (%llu, %d) in AG %d", bplist[bp_index], (long long)XFS_BUF_ADDR(bplist[bp_index]), XFS_BUF_COUNT(bplist[bp_index]), agno); -#endif } agbno = XFS_AGINO_TO_AGBNO(mp, first_irec->ino_startnum); @@ -906,10 +903,10 @@ * done! 
- finished up irec and block simultaneously */ for (bp_index = 0; bp_index < cluster_count; bp_index++) { -#ifdef XR_PF_TRACE - pftrace("put/writebuf %p (%llu) in AG %d", bplist[bp_index], - (long long)XFS_BUF_ADDR(bplist[bp_index]), agno); -#endif + pftrace("put/writebuf %p (%llu) in AG %d", + bplist[bp_index], (long long) + XFS_BUF_ADDR(bplist[bp_index]), agno); + if (dirty && !no_modify) libxfs_writebuf(bplist[bp_index], 0); else Index: xfsprogs-dev/repair/dir2.c =================================================================== --- xfsprogs-dev.orig/repair/dir2.c 2009-08-19 23:42:32.000000000 +0000 +++ xfsprogs-dev/repair/dir2.c 2009-08-20 00:16:53.000000000 +0000 @@ -103,21 +103,19 @@ bplist = bparray; } for (i = 0; i < nex; i++) { -#ifdef XR_PF_TRACE pftrace("about to read off %llu (len = %d)", (long long)XFS_FSB_TO_DADDR(mp, bmp[i].startblock), XFS_FSB_TO_BB(mp, bmp[i].blockcount)); -#endif + bplist[i] = libxfs_readbuf(mp->m_dev, XFS_FSB_TO_DADDR(mp, bmp[i].startblock), XFS_FSB_TO_BB(mp, bmp[i].blockcount), 0); if (!bplist[i]) goto failed; -#ifdef XR_PF_TRACE + pftrace("readbuf %p (%llu, %d)", bplist[i], (long long)XFS_BUF_ADDR(bplist[i]), XFS_BUF_COUNT(bplist[i])); -#endif } dabuf = malloc(XFS_DA_BUF_SIZE(nex)); if (dabuf == NULL) { @@ -248,10 +246,8 @@ } da_buf_done(dabuf); for (i = 0; i < nbuf; i++) { -#ifdef XR_PF_TRACE pftrace("putbuf %p (%llu)", bplist[i], (long long)XFS_BUF_ADDR(bplist[i])); -#endif libxfs_putbuf(bplist[i]); } if (bplist != &bp) @@ -538,7 +534,7 @@ /* * bail out if this is the root block (top of tree) */ - if (this_level >= cursor->active) + if (this_level >= cursor->active) return(0); /* * set hashvalue to correctl reflect the now-validated @@ -1425,7 +1421,7 @@ * numbers. Do NOT touch the name until after we've computed * the hashvalue and done a namecheck() on the name. * - * Conditions must either set clearino to zero or set + * Conditions must either set clearino to zero or set * clearreason why it's being cleared. 
*/ if (!ino_discovery && ent_ino == BADFSINO) { @@ -1456,7 +1452,7 @@ if (ino_discovery) { add_inode_uncertain(mp, ent_ino, 0); clearino = 0; - } else + } else clearreason = _("non-existent"); } else { /* Index: xfsprogs-dev/repair/globals.h =================================================================== --- xfsprogs-dev.orig/repair/globals.h 2009-08-19 23:42:32.000000000 +0000 +++ xfsprogs-dev/repair/globals.h 2009-08-20 00:16:53.000000000 +0000 @@ -199,10 +199,6 @@ EXTERN int report_interval; EXTERN __uint64_t *prog_rpt_done; -#ifdef XR_PF_TRACE -EXTERN FILE *pf_trace_file; -#endif - EXTERN int ag_stride; EXTERN int thread_count; Index: xfsprogs-dev/repair/init.c =================================================================== --- xfsprogs-dev.orig/repair/init.c 2009-08-20 00:06:44.000000000 +0000 +++ xfsprogs-dev/repair/init.c 2009-08-20 00:16:53.000000000 +0000 @@ -150,4 +150,5 @@ ts_create(); ts_init(); increase_rlimit(); + pftrace_init(); } Index: xfsprogs-dev/repair/prefetch.c =================================================================== --- xfsprogs-dev.orig/repair/prefetch.c 2009-08-20 00:16:01.000000000 +0000 +++ xfsprogs-dev/repair/prefetch.c 2009-08-20 00:16:53.000000000 +0000 @@ -83,9 +83,8 @@ prefetch_args_t *args) { if (!args->can_start_processing) { -#ifdef XR_PF_TRACE pftrace("signalling processing for AG %d", args->agno); -#endif + args->can_start_processing = 1; pthread_cond_signal(&args->start_processing); } @@ -96,9 +95,8 @@ prefetch_args_t *args) { if (!args->can_start_reading) { -#ifdef XR_PF_TRACE pftrace("signalling reading for AG %d", args->agno); -#endif + args->can_start_reading = 1; pthread_cond_broadcast(&args->start_reading); } @@ -136,25 +134,16 @@ if (args->inode_bufs_queued == IO_THRESHOLD) pf_start_io_workers(args); } -#ifdef XR_PF_TRACE - pftrace("getbuf %c %p (%llu) in AG %d (fsbno = %lu) added to " - "primary queue (inode_bufs_queued = %d, last_bno = %lu)", - B_IS_INODE(flag) ? 
'I' : 'M', bp, - (long long)XFS_BUF_ADDR(bp), args->agno, fsbno, - args->inode_bufs_queued, args->last_bno_read); -#endif } else { -#ifdef XR_PF_TRACE - pftrace("getbuf %c %p (%llu) in AG %d (fsbno = %lu) added to " - "secondary queue (last_bno = %lu)", - B_IS_INODE(flag) ? 'I' : 'M', bp, - (long long)XFS_BUF_ADDR(bp), args->agno, fsbno, - args->last_bno_read); -#endif ASSERT(!B_IS_INODE(flag)); XFS_BUF_SET_PRIORITY(bp, B_DIR_META_2); } + pftrace("getbuf %c %p (%llu) in AG %d (fsbno = %lu) added to queue" + "(inode_bufs_queued = %d, last_bno = %lu)", B_IS_INODE(flag) ? + 'I' : 'M', bp, (long long)XFS_BUF_ADDR(bp), args->agno, fsbno, + args->inode_bufs_queued, args->last_bno_read); + pf_start_processing(args); pthread_mutex_unlock(&args->lock); @@ -192,9 +181,9 @@ while (irec.br_blockcount) { unsigned int len; -#ifdef XR_PF_TRACE + pftrace("queuing dir extent in AG %d", args->agno); -#endif + len = (irec.br_blockcount > mp->m_dirblkfsbs) ? mp->m_dirblkfsbs : irec.br_blockcount; pf_queue_io(args, irec.br_startblock, len, B_DIR_META); @@ -520,20 +509,16 @@ } } for (i = 0; i < num; i++) { -#ifdef XR_PF_TRACE pftrace("putbuf %c %p (%llu) in AG %d", B_IS_INODE(XFS_BUF_PRIORITY(bplist[i])) ? 
'I' : 'M', bplist[i], (long long)XFS_BUF_ADDR(bplist[i]), args->agno); -#endif libxfs_putbuf(bplist[i]); } pthread_mutex_lock(&args->lock); if (which != PF_SECONDARY) { -#ifdef XR_PF_TRACE pftrace("inode_bufs_queued for AG %d = %d", args->agno, args->inode_bufs_queued); -#endif /* * if primary inode queue running low, process metadata * in boths queues to avoid I/O starvation as the @@ -542,15 +527,14 @@ */ if (which == PF_PRIMARY && !args->queuing_done && args->inode_bufs_queued < IO_THRESHOLD) { -#ifdef XR_PF_TRACE pftrace("reading metadata bufs from primary queue for AG %d", args->agno); -#endif + pf_batch_read(args, PF_META_ONLY, buf); -#ifdef XR_PF_TRACE + pftrace("reading bufs from secondary queue for AG %d", args->agno); -#endif + pf_batch_read(args, PF_SECONDARY, buf); } } @@ -571,20 +555,18 @@ pthread_mutex_lock(&args->lock); while (!args->queuing_done || btree_find(args->io_queue, 0, NULL)) { -#ifdef XR_PF_TRACE pftrace("waiting to start prefetch I/O for AG %d", args->agno); -#endif + while (!args->can_start_reading && !args->queuing_done) pthread_cond_wait(&args->start_reading, &args->lock); -#ifdef XR_PF_TRACE + pftrace("starting prefetch I/O for AG %d", args->agno); -#endif + pf_batch_read(args, PF_PRIMARY, buf); pf_batch_read(args, PF_SECONDARY, buf); -#ifdef XR_PF_TRACE pftrace("ran out of bufs to prefetch for AG %d", args->agno); -#endif + if (!args->queuing_done) args->can_start_reading = 0; } @@ -592,9 +574,8 @@ free(buf); -#ifdef XR_PF_TRACE pftrace("finished prefetch I/O for AG %d", args->agno); -#endif + return NULL; } @@ -636,10 +617,7 @@ break; } } - -#ifdef XR_PF_TRACE pftrace("starting prefetch for AG %d", args->agno); -#endif for (irec = findfirst_inode_rec(args->agno); irec != NULL; irec = next_ino_rec(irec)) { @@ -676,10 +654,9 @@ pthread_mutex_lock(&args->lock); -#ifdef XR_PF_TRACE pftrace("finished queuing inodes for AG %d (inode_bufs_queued = %d)", args->agno, args->inode_bufs_queued); -#endif + args->queuing_done = 1; 
pf_start_io_workers(args); pf_start_processing(args); @@ -690,9 +667,8 @@ if (args->io_threads[i]) pthread_join(args->io_threads[i], NULL); -#ifdef XR_PF_TRACE pftrace("prefetch for AG %d finished", args->agno); -#endif + pthread_mutex_lock(&args->lock); ASSERT(btree_find(args->io_queue, 0, NULL) == NULL); @@ -712,9 +688,8 @@ { int err; -#ifdef XR_PF_TRACE pftrace("creating queue thread for AG %d", args->agno); -#endif + err = pthread_create(&args->queuing_thread, NULL, pf_queuing_worker, args); if (err != 0) { @@ -801,14 +776,12 @@ pthread_mutex_lock(&args->lock); while (!args->can_start_processing) { -#ifdef XR_PF_TRACE pftrace("waiting to start processing AG %d", args->agno); -#endif + pthread_cond_wait(&args->start_processing, &args->lock); } -#ifdef XR_PF_TRACE pftrace("can start processing AG %d", args->agno); -#endif + pthread_mutex_unlock(&args->lock); } @@ -819,15 +792,13 @@ if (args == NULL) return; -#ifdef XR_PF_TRACE pftrace("waiting AG %d prefetch to finish", args->agno); -#endif + if (args->queuing_thread) pthread_join(args->queuing_thread, NULL); -#ifdef XR_PF_TRACE pftrace("AG %d prefetch done", args->agno); -#endif + pthread_mutex_destroy(&args->lock); pthread_cond_destroy(&args->start_reading); pthread_cond_destroy(&args->start_processing); @@ -839,6 +810,21 @@ #ifdef XR_PF_TRACE +static FILE *pf_trace_file; + +void +pftrace_init(void) +{ + pf_trace_file = fopen("/tmp/xfs_repair_prefetch.trace", "w"); + setvbuf(pf_trace_file, NULL, _IOLBF, 1024); +} + +void +pftrace_done(void) +{ + fclose(pf_trace_file); +} + void _pftrace(const char *func, const char *msg, ...) 
{ @@ -853,7 +839,8 @@ buf[sizeof(buf)-1] = '\0'; va_end(args); - fprintf(pf_trace_file, "%lu.%06lu %s: %s\n", tv.tv_sec, tv.tv_usec, func, buf); + fprintf(pf_trace_file, "%lu.%06lu %s: %s\n", tv.tv_sec, tv.tv_usec, + func, buf); } #endif Index: xfsprogs-dev/repair/prefetch.h =================================================================== --- xfsprogs-dev.orig/repair/prefetch.h 2009-08-20 00:16:01.000000000 +0000 +++ xfsprogs-dev/repair/prefetch.h 2009-08-20 00:16:53.000000000 +0000 @@ -50,8 +50,15 @@ #ifdef XR_PF_TRACE +void pftrace_init(void); +void pftrace_done(void); + #define pftrace(msg...) _pftrace(__FUNCTION__, ## msg) void _pftrace(const char *, const char *, ...); +#else +static inline void pftrace_init(void) { }; +static inline void pftrace_done(void) { }; +static inline void pftrace(const char *msg, ...) { }; #endif #endif /* _XFS_REPAIR_PREFETCH_H */ Index: xfsprogs-dev/repair/xfs_repair.c =================================================================== --- xfsprogs-dev.orig/repair/xfs_repair.c 2009-08-19 23:42:32.000000000 +0000 +++ xfsprogs-dev/repair/xfs_repair.c 2009-08-20 00:16:53.000000000 +0000 @@ -542,11 +542,6 @@ bindtextdomain(PACKAGE, LOCALEDIR); textdomain(PACKAGE); -#ifdef XR_PF_TRACE - pf_trace_file = fopen("/tmp/xfs_repair_prefetch.trace", "w"); - setvbuf(pf_trace_file, NULL, _IOLBF, 1024); -#endif - temp_mp = &xfs_m; setbuf(stdout, NULL); @@ -850,8 +845,7 @@ if (verbose) summary_report(); do_log(_("done\n")); -#ifdef XR_PF_TRACE - fclose(pf_trace_file); -#endif + pftrace_done(); + return (0); } From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:10 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_14, J_CHICKENPOX_32,J_CHICKENPOX_33,J_CHICKENPOX_61,J_CHICKENPOX_62, J_CHICKENPOX_73 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com 
[192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HvjIm034374 for ; Wed, 2 Sep 2009 12:58:00 -0500 X-ASG-Debug-ID: 1251914321-2afe03e00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id D771915B1A28 for ; Wed, 2 Sep 2009 10:58:41 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id JVchCQzHAjpHurEl for ; Wed, 02 Sep 2009 10:58:41 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6T-0006XL-Ci; Wed, 02 Sep 2009 17:58:41 +0000 Message-Id: <20090902175841.284697389@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:40 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 09/14] repair: track logical to physical block mapping more effeciently Subject: [PATCH 09/14] repair: track logical to physical block mapping more effeciently References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-blkmap-opt X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914321 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Currently we track the logical to physical block mapping by a structure which contains an array of physical blocks. This is extremely inefficient and is replaced in this patch with the normal startblock storage we use in the kernel and on disk. 
In addition, use thread-local storage for the block map. This is possible because repair only processes one inode at a given time per thread, and the block map does not have to outlive the processing of a single inode. The combination of those factors means we can use pthread thread-local storage to store the block map, and we can re-use the allocation over and over again. This should be ported over to xfs_db eventually, or even better we could try to share the code. [hch: added a small fix in blkmap_set_ext to not call memmove unless needed] Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/bmap.c =================================================================== --- xfsprogs-dev.orig/repair/bmap.c 2009-08-20 02:32:34.000000000 +0000 +++ xfsprogs-dev/repair/bmap.c 2009-08-20 02:32:45.000000000 +0000 @@ -1,5 +1,5 @@ /* - * Copyright (c) 2000-2001,2005 Silicon Graphics, Inc. + * Copyright (c) 2000-2001,2005,2008 Silicon Graphics, Inc. * All Rights Reserved. * * This program is free software; you can redistribute it and/or @@ -21,106 +21,46 @@ #include "bmap.h" /* - * Block mapping code taken from xfs_db. - */ - -/* - * Append an extent to the block entry. - */ -void -blkent_append( - blkent_t **entp, - xfs_dfsbno_t b, - xfs_dfilblks_t c) -{ - blkent_t *ent; - size_t size; - int i; - - ent = *entp; - size = BLKENT_SIZE(c + ent->nblks); - if ((*entp = ent = realloc(ent, size)) == NULL) { - do_warn(_("realloc failed in blkent_append (%u bytes)\n"), - size); - return; - } - for (i = 0; i < c; i++) - ent->blks[ent->nblks + i] = b + i; - ent->nblks += c; -} - -/* - * Make a new block entry. 
- */ -blkent_t * -blkent_new( - xfs_dfiloff_t o, - xfs_dfsbno_t b, - xfs_dfilblks_t c) -{ - blkent_t *ent; - int i; - - if ((ent = malloc(BLKENT_SIZE(c))) == NULL) { - do_warn(_("malloc failed in blkent_new (%u bytes)\n"), - BLKENT_SIZE(c)); - return ent; - } - ent->nblks = c; - ent->startoff = o; - for (i = 0; i < c; i++) - ent->blks[i] = b + i; - return ent; -} - -/* - * Prepend an extent to the block entry. + * Track the logical to physical block mapping for inodes. + * + * Repair only processes one inode at a given time per thread, and the + * block map does not have to outlive the processing of a single inode. + * + * The combination of those factors means we can use pthreads thread-local + * storage to store the block map, and we can re-use the allocation over + * and over again. */ -void -blkent_prepend( - blkent_t **entp, - xfs_dfsbno_t b, - xfs_dfilblks_t c) -{ - int i; - blkent_t *newent; - blkent_t *oldent; - oldent = *entp; - if ((newent = malloc(BLKENT_SIZE(oldent->nblks + c))) == NULL) { - do_warn(_("malloc failed in blkent_prepend (%u bytes)\n"), - BLKENT_SIZE(oldent->nblks + c)); - *entp = newent; - return; - } - newent->nblks = oldent->nblks + c; - newent->startoff = oldent->startoff - c; - for (i = 0; i < c; i++) - newent->blks[i] = b + c; - for (; i < oldent->nblks + c; i++) - newent->blks[i] = oldent->blks[i - c]; - free(oldent); - *entp = newent; -} +pthread_key_t dblkmap_key; +pthread_key_t ablkmap_key; -/* - * Allocate a block map. - */ blkmap_t * blkmap_alloc( - xfs_extnum_t nex) + xfs_extnum_t nex, + int whichfork) { + pthread_key_t key; blkmap_t *blkmap; + ASSERT(whichfork == XFS_DATA_FORK || whichfork == XFS_ATTR_FORK); + if (nex < 1) nex = 1; - if ((blkmap = malloc(BLKMAP_SIZE(nex))) == NULL) { - do_warn(_("malloc failed in blkmap_alloc (%u bytes)\n"), - BLKMAP_SIZE(nex)); - return blkmap; + + key = whichfork ? 
ablkmap_key : dblkmap_key; + blkmap = pthread_getspecific(key); + if (!blkmap || blkmap->naexts < nex) { + blkmap = realloc(blkmap, BLKMAP_SIZE(nex)); + if (!blkmap) { + do_warn(_("malloc failed in blkmap_alloc (%u bytes)\n"), + BLKMAP_SIZE(nex)); + return NULL; + } + pthread_setspecific(key, blkmap); + blkmap->naexts = nex; } - blkmap->naents = nex; - blkmap->nents = 0; + + blkmap->nexts = 0; return blkmap; } @@ -131,14 +71,7 @@ void blkmap_free( blkmap_t *blkmap) { - blkent_t **entp; - xfs_extnum_t i; - - if (blkmap == NULL) - return; - for (i = 0, entp = blkmap->ents; i < blkmap->nents; i++, entp++) - free(*entp); - free(blkmap); + /* nothing to do! - keep the memory around for the next inode */ } /* @@ -149,20 +82,18 @@ blkmap_get( blkmap_t *blkmap, xfs_dfiloff_t o) { - blkent_t *ent; - blkent_t **entp; + bmap_ext_t *ext = blkmap->exts; int i; - for (i = 0, entp = blkmap->ents; i < blkmap->nents; i++, entp++) { - ent = *entp; - if (o >= ent->startoff && o < ent->startoff + ent->nblks) - return ent->blks[o - ent->startoff]; + for (i = 0; i < blkmap->nexts; i++, ext++) { + if (o >= ext->startoff && o < ext->startoff + ext->blockcount) + return ext->startblock + (o - ext->startoff); } return NULLDFSBNO; } /* - * Get a chunk of entries from a block map. + * Get a chunk of entries from a block map - only used for reading dirv2 blocks */ int blkmap_getn( @@ -172,93 +103,62 @@ blkmap_getn( bmap_ext_t **bmpp, bmap_ext_t *bmpp_single) { - bmap_ext_t *bmp; - blkent_t *ent; - xfs_dfiloff_t ento; - blkent_t **entp; + bmap_ext_t *bmp = NULL; + bmap_ext_t *ext; int i; int nex; if (nb == 1) { - /* + /* * in the common case, when mp->m_dirblkfsbs == 1, * avoid additional malloc/free overhead */ bmpp_single->startblock = blkmap_get(blkmap, o); - bmpp_single->blockcount = 1; - bmpp_single->startoff = 0; - bmpp_single->flag = 0; - *bmpp = bmpp_single; - return (bmpp_single->startblock != NULLDFSBNO) ? 
1 : 0; + goto single_ext; } - for (i = nex = 0, bmp = NULL, entp = blkmap->ents; - i < blkmap->nents; - i++, entp++) { - ent = *entp; - if (ent->startoff >= o + nb) + ext = blkmap->exts; + nex = 0; + for (i = 0; i < blkmap->nexts; i++, ext++) { + + if (ext->startoff >= o + nb) break; - if (ent->startoff + ent->nblks <= o) + if (ext->startoff + ext->blockcount <= o) continue; - for (ento = ent->startoff; - ento < ent->startoff + ent->nblks && ento < o + nb; - ento++) { - if (ento < o) - continue; - if (bmp && - bmp[nex - 1].startoff + bmp[nex - 1].blockcount == - ento && - bmp[nex - 1].startblock + bmp[nex - 1].blockcount == - ent->blks[ento - ent->startoff]) - bmp[nex - 1].blockcount++; - else { - bmp = realloc(bmp, ++nex * sizeof(*bmp)); - if (bmp == NULL) { - do_warn(_("blkmap_getn realloc failed" - " (%u bytes)\n"), - nex * sizeof(*bmp)); - continue; - } - bmp[nex - 1].startoff = ento; - bmp[nex - 1].startblock = - ent->blks[ento - ent->startoff]; - bmp[nex - 1].blockcount = 1; - bmp[nex - 1].flag = 0; - } + + /* + * if all the requested blocks are in one extent (also common), + * use the bmpp_single option as well + */ + if (!bmp && o >= ext->startoff && + o + nb <= ext->startoff + ext->blockcount) { + bmpp_single->startblock = + ext->startblock + (o - ext->startoff); + goto single_ext; } + + /* + * rare case - multiple extents for a single dir block + */ + bmp = malloc(nb * sizeof(bmap_ext_t)); + if (!bmp) + do_error(_("blkmap_getn malloc failed (%u bytes)\n"), + nb * sizeof(bmap_ext_t)); + + bmp[nex].startblock = ext->startblock + (o - ext->startoff); + bmp[nex].blockcount = MIN(nb, ext->blockcount - + (bmp[nex].startblock - ext->startblock)); + o += bmp[nex].blockcount; + nb -= bmp[nex].blockcount; + nex++; } *bmpp = bmp; return nex; -} - -/* - * Make a block map larger. 
- */ -void -blkmap_grow( - blkmap_t **blkmapp, - blkent_t **entp, - blkent_t *newent) -{ - blkmap_t *blkmap; - size_t size; - int i; - int idx; - blkmap = *blkmapp; - idx = (int)(entp - blkmap->ents); - if (blkmap->naents == blkmap->nents) { - size = BLKMAP_SIZE(blkmap->nents + 1); - if ((*blkmapp = blkmap = realloc(blkmap, size)) == NULL) { - do_warn(_("realloc failed in blkmap_grow (%u bytes)\n"), - size); - return; - } - blkmap->naents++; - } - for (i = blkmap->nents; i > idx; i--) - blkmap->ents[i] = blkmap->ents[i - 1]; - blkmap->ents[idx] = newent; - blkmap->nents++; +single_ext: + bmpp_single->blockcount = nb; + bmpp_single->startoff = 0; /* not even used by caller! */ + *bmpp = bmpp_single; + return (bmpp_single->startblock != NULLDFSBNO) ? 1 : 0; } /* @@ -268,12 +168,12 @@ xfs_dfiloff_t blkmap_last_off( blkmap_t *blkmap) { - blkent_t *ent; + bmap_ext_t *ext; - if (!blkmap->nents) + if (!blkmap->nexts) return NULLDFILOFF; - ent = blkmap->ents[blkmap->nents - 1]; - return ent->startoff + ent->nblks; + ext = blkmap->exts + blkmap->nexts - 1; + return ext->startoff + ext->blockcount; } /* @@ -285,73 +185,45 @@ blkmap_next_off( xfs_dfiloff_t o, int *t) { - blkent_t *ent; - blkent_t **entp; + bmap_ext_t *ext; - if (!blkmap->nents) + if (!blkmap->nexts) return NULLDFILOFF; if (o == NULLDFILOFF) { *t = 0; - ent = blkmap->ents[0]; - return ent->startoff; + return blkmap->exts[0].startoff; } - entp = &blkmap->ents[*t]; - ent = *entp; - if (o < ent->startoff + ent->nblks - 1) + ext = blkmap->exts + *t; + if (o < ext->startoff + ext->blockcount - 1) return o + 1; - entp++; - if (entp >= &blkmap->ents[blkmap->nents]) + if (*t >= blkmap->nexts - 1) return NULLDFILOFF; (*t)++; - ent = *entp; - return ent->startoff; + return ext[1].startoff; } /* - * Set a block value in a block map. + * Make a block map larger. 
*/ -void -blkmap_set_blk( - blkmap_t **blkmapp, - xfs_dfiloff_t o, - xfs_dfsbno_t b) +static blkmap_t * +blkmap_grow( + blkmap_t **blkmapp) { - blkmap_t *blkmap; - blkent_t *ent; - blkent_t **entp; - blkent_t *nextent; - - blkmap = *blkmapp; - for (entp = blkmap->ents; entp < &blkmap->ents[blkmap->nents]; entp++) { - ent = *entp; - if (o < ent->startoff - 1) { - ent = blkent_new(o, b, 1); - blkmap_grow(blkmapp, entp, ent); - return; - } - if (o == ent->startoff - 1) { - blkent_prepend(entp, b, 1); - return; - } - if (o >= ent->startoff && o < ent->startoff + ent->nblks) { - ent->blks[o - ent->startoff] = b; - return; - } - if (o > ent->startoff + ent->nblks) - continue; - blkent_append(entp, b, 1); - if (entp == &blkmap->ents[blkmap->nents - 1]) - return; - ent = *entp; - nextent = entp[1]; - if (ent->startoff + ent->nblks < nextent->startoff) - return; - blkent_append(entp, nextent->blks[0], nextent->nblks); - blkmap_shrink(blkmap, &entp[1]); - return; + pthread_key_t key = dblkmap_key; + blkmap_t *blkmap = *blkmapp; + + if (pthread_getspecific(key) != blkmap) { + key = ablkmap_key; + ASSERT(pthread_getspecific(key) == blkmap); } - ent = blkent_new(o, b, 1); - blkmap_grow(blkmapp, entp, ent); + + blkmap->naexts += 4; + blkmap = realloc(blkmap, BLKMAP_SIZE(blkmap->naexts)); + if (blkmap == NULL) + do_error(_("realloc failed in blkmap_grow\n")); + *blkmapp = blkmap; + pthread_setspecific(key, blkmap); + return blkmap; } /* @@ -364,46 +236,23 @@ blkmap_set_ext( xfs_dfsbno_t b, xfs_dfilblks_t c) { - blkmap_t *blkmap; - blkent_t *ent; - blkent_t **entp; + blkmap_t *blkmap = *blkmapp; xfs_extnum_t i; - blkmap = *blkmapp; - if (!blkmap->nents) { - blkmap->ents[0] = blkent_new(o, b, c); - blkmap->nents = 1; - return; - } - entp = &blkmap->ents[blkmap->nents - 1]; - ent = *entp; - if (ent->startoff + ent->nblks == o) { - blkent_append(entp, b, c); - return; - } - if (ent->startoff + ent->nblks < o) { - ent = blkent_new(o, b, c); - blkmap_grow(blkmapp, 
&blkmap->ents[blkmap->nents], ent); - return; - } - for (i = 0; i < c; i++) - blkmap_set_blk(blkmapp, o + i, b + i); -} + if (blkmap->nexts == blkmap->naexts) + blkmap = blkmap_grow(blkmapp); -/* - * Make a block map smaller. - */ -void -blkmap_shrink( - blkmap_t *blkmap, - blkent_t **entp) -{ - int i; - int idx; + for (i = 0; i < blkmap->nexts; i++) { + if (blkmap->exts[i].startoff > o) { + memmove(blkmap->exts + i + 1, + blkmap->exts + i, + sizeof(bmap_ext_t) * (blkmap->nexts - i)); + break; + } + } - free(*entp); - idx = (int)(entp - blkmap->ents); - for (i = idx + 1; i < blkmap->nents; i++) - blkmap->ents[i] = blkmap->ents[i - 1]; - blkmap->nents--; + blkmap->exts[i].startoff = o; + blkmap->exts[i].startblock = b; + blkmap->exts[i].blockcount = c; + blkmap->nexts++; } Index: xfsprogs-dev/repair/bmap.h =================================================================== --- xfsprogs-dev.orig/repair/bmap.h 2009-08-20 02:32:34.000000000 +0000 +++ xfsprogs-dev/repair/bmap.h 2009-08-20 02:32:45.000000000 +0000 @@ -16,59 +16,41 @@ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ -/* - * Block mapping code taken from xfs_db. - */ +#ifndef _XFS_REPAIR_BMAP_H +#define _XFS_REPAIR_BMAP_H /* - * Block map entry. + * Extent descriptor. */ -typedef struct blkent { +typedef struct bmap_ext { xfs_dfiloff_t startoff; - xfs_dfilblks_t nblks; - xfs_dfsbno_t blks[1]; -} blkent_t; -#define BLKENT_SIZE(n) \ - (offsetof(blkent_t, blks) + (sizeof(xfs_dfsbno_t) * (n))) + xfs_dfsbno_t startblock; + xfs_dfilblks_t blockcount; +} bmap_ext_t; /* * Block map. */ typedef struct blkmap { - int naents; - int nents; - blkent_t *ents[1]; + int naexts; + int nexts; + bmap_ext_t exts[1]; } blkmap_t; -#define BLKMAP_SIZE(n) \ - (offsetof(blkmap_t, ents) + (sizeof(blkent_t *) * (n))) -/* - * Extent descriptor. 
- */ -typedef struct bmap_ext { - xfs_dfiloff_t startoff; - xfs_dfsbno_t startblock; - xfs_dfilblks_t blockcount; - int flag; -} bmap_ext_t; +#define BLKMAP_SIZE(n) \ + (offsetof(blkmap_t, exts) + (sizeof(bmap_ext_t) * (n))) -void blkent_append(blkent_t **entp, xfs_dfsbno_t b, - xfs_dfilblks_t c); -blkent_t *blkent_new(xfs_dfiloff_t o, xfs_dfsbno_t b, xfs_dfilblks_t c); -void blkent_prepend(blkent_t **entp, xfs_dfsbno_t b, - xfs_dfilblks_t c); -blkmap_t *blkmap_alloc(xfs_extnum_t); +blkmap_t *blkmap_alloc(xfs_extnum_t nex, int whichfork); void blkmap_free(blkmap_t *blkmap); + +void blkmap_set_ext(blkmap_t **blkmapp, xfs_dfiloff_t o, + xfs_dfsbno_t b, xfs_dfilblks_t c); + xfs_dfsbno_t blkmap_get(blkmap_t *blkmap, xfs_dfiloff_t o); int blkmap_getn(blkmap_t *blkmap, xfs_dfiloff_t o, - xfs_dfilblks_t nb, bmap_ext_t **bmpp, + xfs_dfilblks_t nb, bmap_ext_t **bmpp, bmap_ext_t *bmpp_single); -void blkmap_grow(blkmap_t **blkmapp, blkent_t **entp, - blkent_t *newent); xfs_dfiloff_t blkmap_last_off(blkmap_t *blkmap); xfs_dfiloff_t blkmap_next_off(blkmap_t *blkmap, xfs_dfiloff_t o, int *t); -void blkmap_set_blk(blkmap_t **blkmapp, xfs_dfiloff_t o, - xfs_dfsbno_t b); -void blkmap_set_ext(blkmap_t **blkmapp, xfs_dfiloff_t o, - xfs_dfsbno_t b, xfs_dfilblks_t c); -void blkmap_shrink(blkmap_t *blkmap, blkent_t **entp); + +#endif /* _XFS_REPAIR_BMAP_H */ Index: xfsprogs-dev/repair/dinode.c =================================================================== --- xfsprogs-dev.orig/repair/dinode.c 2009-08-20 02:32:34.000000000 +0000 +++ xfsprogs-dev/repair/dinode.c 2009-08-21 01:23:34.000000000 +0000 @@ -2050,7 +2050,7 @@ process_inode_data_fork( *nextents = 1; if (dinoc->di_format != XFS_DINODE_FMT_LOCAL && type != XR_INO_RTDATA) - *dblkmap = blkmap_alloc(*nextents); + *dblkmap = blkmap_alloc(*nextents, XFS_DATA_FORK); *nextents = 0; switch (dinoc->di_format) { @@ -2172,14 +2172,14 @@ process_inode_attr_fork( err = process_lclinode(mp, agno, ino, dino, XFS_ATTR_FORK); break; case 
XFS_DINODE_FMT_EXTENTS: - ablkmap = blkmap_alloc(*anextents); + ablkmap = blkmap_alloc(*anextents, XFS_ATTR_FORK); *anextents = 0; err = process_exinode(mp, agno, ino, dino, type, dirty, atotblocks, anextents, &ablkmap, XFS_ATTR_FORK, check_dups); break; case XFS_DINODE_FMT_BTREE: - ablkmap = blkmap_alloc(*anextents); + ablkmap = blkmap_alloc(*anextents, XFS_ATTR_FORK); *anextents = 0; err = process_btinode(mp, agno, ino, dino, type, dirty, atotblocks, anextents, &ablkmap, Index: xfsprogs-dev/repair/init.c =================================================================== --- xfsprogs-dev.orig/repair/init.c 2009-08-20 02:32:34.000000000 +0000 +++ xfsprogs-dev/repair/init.c 2009-08-20 02:32:45.000000000 +0000 @@ -24,19 +24,24 @@ #include "pthread.h" #include "avl.h" #include "dir.h" +#include "bmap.h" #include "incore.h" #include "prefetch.h" #include +/* TODO: dirbuf/freemap key usage is completely b0rked - only used for dirv1 */ static pthread_key_t dirbuf_key; static pthread_key_t dir_freemap_key; static pthread_key_t attr_freemap_key; +extern pthread_key_t dblkmap_key; +extern pthread_key_t ablkmap_key; + static void ts_alloc(pthread_key_t key, unsigned n, size_t size) { void *voidp; - voidp = malloc((n)*(size)); + voidp = calloc(n, size); if (voidp == NULL) { do_error(_("ts_alloc: cannot allocate thread specific storage\n")); /* NO RETURN */ @@ -52,6 +57,9 @@ ts_create(void) pthread_key_create(&dirbuf_key, NULL); pthread_key_create(&dir_freemap_key, NULL); pthread_key_create(&attr_freemap_key, NULL); + + pthread_key_create(&dblkmap_key, NULL); + pthread_key_create(&ablkmap_key, NULL); } void From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:09 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com 
(8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HvhXC034367 for ; Wed, 2 Sep 2009 12:57:59 -0500 X-ASG-Debug-ID: 1251914320-4f3402bb0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A5A8715B1A28 for ; Wed, 2 Sep 2009 10:58:40 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id uC5M8niEd0nyJeAY for ; Wed, 02 Sep 2009 10:58:40 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6S-0006TI-4r; Wed, 02 Sep 2009 17:58:40 +0000 Message-Id: <20090902175840.048298899@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:33 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 02/14] repair: reduce byte swap operations in scanfunc_allocbt Subject: [PATCH 02/14] repair: reduce byte swap operations in scanfunc_allocbt References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-scan.c-reduce-byteswaps X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914320 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Store native endian version of the extent startblock and length in local variables instead of converting them over and over again. 
Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-08-21 18:48:01.000000000 +0000 +++ xfsprogs-dev/repair/scan.c 2009-08-21 18:54:29.000000000 +0000 @@ -449,7 +449,6 @@ scanfunc_allocbt( __uint32_t magic) { const char *name; - xfs_agblock_t b, e; int i; xfs_alloc_ptr_t *pp; xfs_alloc_rec_t *rp; @@ -509,20 +508,21 @@ _("%s freespace btree block claimed (sta rp = XFS_ALLOC_REC_ADDR(mp, block, 1); for (i = 0; i < numrecs; i++) { - if (be32_to_cpu(rp[i].ar_blockcount) == 0 || - be32_to_cpu(rp[i].ar_startblock) == 0 || - !verify_agbno(mp, agno, - be32_to_cpu(rp[i].ar_startblock)) || - be32_to_cpu(rp[i].ar_blockcount) > - MAXEXTLEN) - continue; + xfs_agblock_t b, end; + xfs_extlen_t len; + + b = be32_to_cpu(rp[i].ar_startblock); + len = be32_to_cpu(rp[i].ar_blockcount); + end = b + len; - e = be32_to_cpu(rp[i].ar_startblock) + - be32_to_cpu(rp[i].ar_blockcount); - if (!verify_agbno(mp, agno, e - 1)) + if (b == 0 || !verify_agbno(mp, agno, b)) + continue; + if (len == 0 || len > MAXEXTLEN) continue; - for (b = be32_to_cpu(rp[i].ar_startblock); - b < e; b++) { + if (!verify_agbno(mp, agno, end - 1)) + continue; + + for ( ; b < end; b++) { state = get_agbno_state(mp, agno, b); switch (state) { case XR_E_UNKNOWN: @@ -579,6 +579,8 @@ _("%s freespace btree block claimed (sta } for (i = 0; i < numrecs; i++) { + xfs_agblock_t bno = be32_to_cpu(pp[i]); + /* * XXX - put sibling detection right here. * we know our sibling chain is good. So as we go, @@ -588,11 +590,11 @@ _("%s freespace btree block claimed (sta * pointer mismatch, try and extract as much data * as possible. */ - if (be32_to_cpu(pp[i]) != 0 && verify_agbno(mp, agno, - be32_to_cpu(pp[i]))) - scan_sbtree(be32_to_cpu(pp[i]), level, agno, suspect, + if (bno != 0 && verify_agbno(mp, agno, bno)) { + scan_sbtree(bno, level, agno, suspect, (magic == XFS_ABTB_MAGIC) ? 
scanfunc_bno : scanfunc_cnt, 0); + } } } From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:09 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HvhJF034368 for ; Wed, 2 Sep 2009 12:57:58 -0500 X-ASG-Debug-ID: 1251914320-70a403810000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 0484341E03E for ; Wed, 2 Sep 2009 10:58:40 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id eeOl8MIUN6lS91Hr for ; Wed, 02 Sep 2009 10:58:40 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6S-0006Uq-Lf; Wed, 02 Sep 2009 17:58:40 +0000 Message-Id: <20090902175840.573208011@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:36 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 05/14] repair: reduce byte swapping in scan_freelist Subject: [PATCH 05/14] repair: reduce byte swapping in scan_freelist References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-cleanup-scan_freelist X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914321 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Store the ag number in a local native endian variable to avoid 
byteswapping it over and over again. Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-08-21 19:03:26.000000000 +0000 +++ xfsprogs-dev/repair/scan.c 2009-08-21 19:05:32.000000000 +0000 @@ -943,23 +943,26 @@ scan_freelist( { xfs_agfl_t *agfl; xfs_buf_t *agflbuf; + xfs_agnumber_t agno; xfs_agblock_t bno; int count; int i; + agno = be32_to_cpu(agf->agf_seqno); + if (XFS_SB_BLOCK(mp) != XFS_AGFL_BLOCK(mp) && - XFS_AGF_BLOCK(mp) != XFS_AGFL_BLOCK(mp) && - XFS_AGI_BLOCK(mp) != XFS_AGFL_BLOCK(mp)) - set_agbno_state(mp, be32_to_cpu(agf->agf_seqno), - XFS_AGFL_BLOCK(mp), XR_E_FS_MAP); + XFS_AGF_BLOCK(mp) != XFS_AGFL_BLOCK(mp) && + XFS_AGI_BLOCK(mp) != XFS_AGFL_BLOCK(mp)) + set_agbno_state(mp, agno, XFS_AGFL_BLOCK(mp), XR_E_FS_MAP); + if (be32_to_cpu(agf->agf_flcount) == 0) return; - agflbuf = libxfs_readbuf(mp->m_dev, XFS_AG_DADDR(mp, - be32_to_cpu(agf->agf_seqno), - XFS_AGFL_DADDR(mp)), XFS_FSS_TO_BB(mp, 1), 0); + + agflbuf = libxfs_readbuf(mp->m_dev, + XFS_AG_DADDR(mp, agno, XFS_AGFL_DADDR(mp)), + XFS_FSS_TO_BB(mp, 1), 0); if (!agflbuf) { - do_abort(_("can't read agfl block for ag %d\n"), - be32_to_cpu(agf->agf_seqno)); + do_abort(_("can't read agfl block for ag %d\n"), agno); return; } agfl = XFS_BUF_TO_AGFL(agflbuf); @@ -967,12 +970,11 @@ scan_freelist( count = 0; for (;;) { bno = be32_to_cpu(agfl->agfl_bno[i]); - if (verify_agbno(mp, be32_to_cpu(agf->agf_seqno), bno)) - set_agbno_state(mp, be32_to_cpu(agf->agf_seqno), - bno, XR_E_FREE); + if (verify_agbno(mp, agno, bno)) + set_agbno_state(mp, agno, bno, XR_E_FREE); else do_warn(_("bad agbno %u in agfl, agno %d\n"), - bno, be32_to_cpu(agf->agf_seqno)); + bno, agno); count++; if (i == be32_to_cpu(agf->agf_fllast)) break; @@ -981,8 +983,7 @@ scan_freelist( } if (count != be32_to_cpu(agf->agf_flcount)) { do_warn(_("freeblk count %d != flcount %d in ag %d\n"), count, - 
be32_to_cpu(agf->agf_flcount), - be32_to_cpu(agf->agf_seqno)); + be32_to_cpu(agf->agf_flcount), agno); } libxfs_putbuf(agflbuf); } From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:09 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_21, J_CHICKENPOX_61,J_CHICKENPOX_63,J_CHICKENPOX_64,J_CHICKENPOX_65, J_CHICKENPOX_66,J_CHICKENPOX_71 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82Hvi8H034369 for ; Wed, 2 Sep 2009 12:57:59 -0500 X-ASG-Debug-ID: 1251914320-4b1e012c0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5EBC841E03E for ; Wed, 2 Sep 2009 10:58:41 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id aPadCMhukC8Dav4Y for ; Wed, 02 Sep 2009 10:58:41 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6S-0006VV-RL; Wed, 02 Sep 2009 17:58:40 +0000 Message-Id: <20090902175840.740632507@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:37 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 06/14] repair: use a btree instead of a radix tree for the prefetch queue Subject: [PATCH 06/14] repair: use a btree instead of a radix tree for the prefetch queue References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-radix-to-btree X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 
1251914321 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Currently xfs_repair uses a radix tree implementation derived from the Linux kernel one to manage its prefetch queue. The radix tree implementation is not very memory efficient for sparse indices, so replace it with a btree implementation that is much more efficient. This is not that important for the prefetch queue, but it will be very important for the next memory optimization patches, which need a tree to store things like the block map, which are very sparse, and we do not want to deal with two tree implementations (or rather three, given that we still have avl.c around). Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/Makefile =================================================================== --- xfsprogs-dev.orig/repair/Makefile 2009-08-20 00:01:58.000000000 +0000 +++ xfsprogs-dev/repair/Makefile 2009-08-20 00:06:43.000000000 +0000 @@ -9,15 +9,15 @@ LTCOMMAND = xfs_repair -HFILES = agheader.h attr_repair.h avl.h avl64.h bmap.h dinode.h dir.h \ - dir2.h err_protos.h globals.h incore.h protos.h rt.h \ - progress.h scan.h versions.h prefetch.h radix-tree.h threads.h +HFILES = agheader.h attr_repair.h avl.h avl64.h bmap.h btree.h \ + dinode.h dir.h dir2.h err_protos.h globals.h incore.h protos.h rt.h \ + progress.h scan.h versions.h prefetch.h threads.h -CFILES = agheader.c attr_repair.c avl.c avl64.c bmap.c dino_chunks.c \ - dinode.c dir.c dir2.c globals.c incore.c \ +CFILES = agheader.c attr_repair.c avl.c avl64.c bmap.c btree.c \ + dino_chunks.c dinode.c dir.c dir2.c globals.c incore.c \ incore_bmc.c init.c incore_ext.c incore_ino.c phase1.c \ phase2.c phase3.c phase4.c phase5.c phase6.c phase7.c \ - progress.c prefetch.c radix-tree.c rt.c sb.c scan.c threads.c \ + progress.c prefetch.c rt.c sb.c scan.c threads.c \ versions.c xfs_repair.c
LLDLIBS = $(LIBXFS) $(LIBXLOG) $(LIBUUID) $(LIBRT) $(LIBPTHREAD) Index: xfsprogs-dev/repair/btree.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ xfsprogs-dev/repair/btree.c 2009-08-20 00:06:44.000000000 +0000 @@ -0,0 +1,1234 @@ +/* + * Copyright (c) 2007, Silicon Graphics, Inc. Barry Naujok + * All Rights Reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include +#include "btree.h" + + +#define BTREE_KEY_MAX 7 +#define BTREE_KEY_MIN (BTREE_KEY_MAX / 2) + +#define BTREE_PTR_MAX (BTREE_KEY_MAX + 1) + +struct btree_node { + unsigned long num_keys; + unsigned long keys[BTREE_KEY_MAX]; + struct btree_node * ptrs[BTREE_PTR_MAX]; +}; + +struct btree_cursor { + struct btree_node *node; + int index; +}; + +struct btree_root { + struct btree_node *root_node; + struct btree_cursor *cursor; /* track path to end leaf */ + int height; + /* lookup cache */ + int keys_valid; /* set if the cache is valid */ + unsigned long cur_key; + unsigned long next_key; + void *next_value; + unsigned long prev_key; + void *prev_value; +#ifdef BTREE_STATS + struct btree_stats { + unsigned long num_items; + unsigned long max_items; + int alloced; + int cache_hits; + int cache_misses; + int lookup; + int find; + int key_update; + int value_update; + int insert; + int delete; + int inc_height; + int dec_height; + 
int shift_prev; + int shift_next; + int split; + int merge_prev; + int merge_next; + int balance_prev; + int balance_next; + } stats; +#endif +}; + + +static struct btree_node * +btree_node_alloc(void) +{ + return calloc(1, sizeof(struct btree_node)); +} + +static void +btree_node_free( + struct btree_node *node) +{ + free(node); +} + +static void +btree_free_nodes( + struct btree_node *node, + int level) +{ + int i; + + if (level) + for (i = 0; i <= node->num_keys; i++) + btree_free_nodes(node->ptrs[i], level - 1); + btree_node_free(node); +} + +static void +__btree_init( + struct btree_root *root) +{ + memset(root, 0, sizeof(struct btree_root)); + root->height = 1; + root->cursor = calloc(1, sizeof(struct btree_cursor)); + root->root_node = btree_node_alloc(); + ASSERT(root->root_node); +#ifdef BTREE_STATS + root->stats.max_items = 1; + root->stats.alloced += 1; +#endif +} + +static void +__btree_free( + struct btree_root *root) +{ + btree_free_nodes(root->root_node, root->height - 1); + free(root->cursor); + root->height = 0; + root->cursor = NULL; + root->root_node = NULL; +} + +void +btree_init( + struct btree_root **root) +{ + *root = calloc(1, sizeof(struct btree_root)); + __btree_init(*root); +} + +void +btree_clear( + struct btree_root *root) +{ + __btree_free(root); + __btree_init(root); +} + +void +btree_destroy( + struct btree_root *root) +{ + __btree_free(root); + free(root); +} + +int +btree_is_empty( + struct btree_root *root) +{ + return root->root_node->num_keys == 0; +} + +static inline void +btree_invalidate_cursor( + struct btree_root *root) +{ + root->cursor[0].node = NULL; + root->keys_valid = 0; +} + +static inline unsigned long +btree_key_of_cursor( + struct btree_cursor *cursor, + int height) +{ + while (cursor->node->num_keys == cursor->index && --height > 0) + cursor++; + return cursor->node->keys[cursor->index]; +} + +static void * +btree_get_prev( + struct btree_root *root, + unsigned long *key) +{ + struct btree_cursor *cur = 
root->cursor; + int level = 0; + struct btree_node *node; + + if (cur->index > 0) { + if (key) + *key = cur->node->keys[cur->index - 1]; + return cur->node->ptrs[cur->index - 1]; + } + + /* else need to go up and back down the tree to find the previous */ + + while (cur->index == 0) { + if (++level == root->height) + return NULL; + cur++; + } + + /* the key is in the current level */ + if (key) + *key = cur->node->keys[cur->index - 1]; + + /* descend back down the right side to get the pointer */ + node = cur->node->ptrs[cur->index - 1]; + while (level--) + node = node->ptrs[node->num_keys]; + return node; +} + +static void * +btree_get_next( + struct btree_root *root, + unsigned long *key) +{ + struct btree_cursor *cur = root->cursor; + int level = 0; + struct btree_node *node; + + while (cur->index == cur->node->num_keys) { + if (++level == root->height) + return NULL; + cur++; + } + if (level == 0) { + if (key) { + cur->index++; + *key = btree_key_of_cursor(cur, root->height); + cur->index--; + } + return cur->node->ptrs[cur->index + 1]; + } + + node = cur->node->ptrs[cur->index + 1]; + while (--level > 0) + node = node->ptrs[0]; + if (key) + *key = node->keys[0]; + return node->ptrs[0]; +} + +/* + * Lookup/Search functions + */ + +static int +btree_do_search( + struct btree_root *root, + unsigned long key) +{ + unsigned long k = 0; + struct btree_cursor *cur = root->cursor + root->height; + struct btree_node *node = root->root_node; + int height = root->height; + int key_found = 0; + int i; + + while (--height >= 0) { + cur--; + for (i = 0; i < node->num_keys; i++) + if (node->keys[i] >= key) { + k = node->keys[i]; + key_found = 1; + break; + } + cur->node = node; + cur->index = i; + node = node->ptrs[i]; + } + root->keys_valid = key_found; + if (!key_found) + return 0; + + root->cur_key = k; + root->next_value = NULL; /* do on-demand next value lookup */ + root->prev_value = btree_get_prev(root, &root->prev_key); + return 1; +} + +static int +btree_search( + 
struct btree_root *root, + unsigned long key) +{ + if (root->keys_valid && key <= root->cur_key && + (!root->prev_value || key > root->prev_key)) { +#ifdef BTREE_STATS + root->stats.cache_hits++; +#endif + return 1; + } +#ifdef BTREE_STATS + root->stats.cache_misses++; +#endif + return btree_do_search(root, key); +} + +void * +btree_find( + struct btree_root *root, + unsigned long key, + unsigned long *actual_key) +{ +#ifdef BTREE_STATS + root->stats.find += 1; +#endif + if (!btree_search(root, key)) + return NULL; + + if (actual_key) + *actual_key = root->cur_key; + return root->cursor->node->ptrs[root->cursor->index]; +} + +void * +btree_lookup( + struct btree_root *root, + unsigned long key) +{ +#ifdef BTREE_STATS + root->stats.lookup += 1; +#endif + if (!btree_search(root, key) || root->cur_key != key) + return NULL; + return root->cursor->node->ptrs[root->cursor->index]; +} + +void * +btree_peek_prev( + struct btree_root *root, + unsigned long *key) +{ + if (!root->keys_valid) + return NULL; + if (key) + *key = root->prev_key; + return root->prev_value; +} + +void * +btree_peek_next( + struct btree_root *root, + unsigned long *key) +{ + if (!root->keys_valid) + return NULL; + if (!root->next_value) + root->next_value = btree_get_next(root, &root->next_key); + if (key) + *key = root->next_key; + return root->next_value; +} + +static void * +btree_move_cursor_to_next( + struct btree_root *root, + unsigned long *key) +{ + struct btree_cursor *cur = root->cursor; + int level = 0; + + while (cur->index == cur->node->num_keys) { + if (++level == root->height) + return NULL; + cur++; + } + cur->index++; + if (level == 0) { + if (key) + *key = btree_key_of_cursor(cur, root->height); + return cur->node->ptrs[cur->index]; + } + + while (--level >= 0) { + root->cursor[level].node = cur->node->ptrs[cur->index]; + root->cursor[level].index = 0; + cur--; + } + if (key) + *key = cur->node->keys[0]; + return cur->node->ptrs[0]; +} + +void * +btree_lookup_next( + struct 
btree_root *root, + unsigned long *key) +{ + void *value; + + if (!root->keys_valid) + return NULL; + + root->prev_key = root->cur_key; + root->prev_value = root->cursor->node->ptrs[root->cursor->index]; + + value = btree_move_cursor_to_next(root, &root->cur_key); + if (!value) { + btree_invalidate_cursor(root); + return NULL; + } + root->next_value = NULL; /* on-demand next value fetch */ + if (key) + *key = root->cur_key; + return value; +} + +static void * +btree_move_cursor_to_prev( + struct btree_root *root, + unsigned long *key) +{ + struct btree_cursor *cur = root->cursor; + int level = 0; + + while (cur->index == 0) { + if (++level == root->height) + return NULL; + cur++; + } + cur->index--; + if (key) /* the key is in the current level */ + *key = cur->node->keys[cur->index]; + while (level > 0) { + level--; + root->cursor[level].node = cur->node->ptrs[cur->index]; + root->cursor[level].index = root->cursor[level].node->num_keys; + cur--; + } + return cur->node->ptrs[cur->index]; +} + +void * +btree_lookup_prev( + struct btree_root *root, + unsigned long *key) +{ + void *value; + + if (!root->keys_valid) + return NULL; + + value = btree_move_cursor_to_prev(root, &root->cur_key); + if (!value) + return NULL; + root->prev_value = btree_get_prev(root, &root->prev_key); + root->next_value = NULL; /* on-demand next value fetch */ + if (key) + *key = root->cur_key; + return value; +} + +void * +btree_uncached_lookup( + struct btree_root *root, + unsigned long key) +{ + /* cursor-less (ie. uncached) lookup */ + int height = root->height - 1; + struct btree_node *node = root->root_node; + int i; + int key_found = 0; + + while (height >= 0) { + for (i = 0; i < node->num_keys; i++) + if (node->keys[i] >= key) { + key_found = node->keys[i] == key; + break; + } + node = node->ptrs[i]; + height--; + } + return key_found ? 
node : NULL; +} + +/* Update functions */ + +static inline void +btree_update_node_key( + struct btree_root *root, + struct btree_cursor *cursor, + int level, + unsigned long new_key) +{ + int i; + +#ifdef BTREE_STATS + root->stats.key_update += 1; +#endif + + cursor += level; + for (i = level; i < root->height; i++) { + if (cursor->index < cursor->node->num_keys) { + cursor->node->keys[cursor->index] = new_key; + break; + } + cursor++; + } +} + +int +btree_update_key( + struct btree_root *root, + unsigned long old_key, + unsigned long new_key) +{ + if (!btree_search(root, old_key) || root->cur_key != old_key) + return ENOENT; + + if (root->next_value && new_key >= root->next_key) + return EINVAL; + + if (root->prev_value && new_key <= root->prev_key) + return EINVAL; + + btree_update_node_key(root, root->cursor, 0, new_key); + + return 0; +} + +int +btree_update_value( + struct btree_root *root, + unsigned long key, + void *new_value) +{ + if (!new_value) + return EINVAL; + + if (!btree_search(root, key) || root->cur_key != key) + return ENOENT; + +#ifdef BTREE_STATS + root->stats.value_update += 1; +#endif + root->cursor->node->ptrs[root->cursor->index] = new_value; + + return 0; +} + +/* + * Cursor modification functions - used for inserting and deleting + */ + +static struct btree_cursor * +btree_copy_cursor_prev( + struct btree_root *root, + struct btree_cursor *dest_cursor, + int level) +{ + struct btree_cursor *src_cur = root->cursor + level; + struct btree_cursor *dst_cur; + int l = level; + int i; + + if (level >= root->height) + return NULL; + + while (src_cur->index == 0) { + if (++l >= root->height) + return NULL; + src_cur++; + } + for (i = l; i < root->height; i++) + dest_cursor[i] = *src_cur++; + + dst_cur = dest_cursor + l; + dst_cur->index--; + while (l-- >= level) { + dest_cursor[l].node = dst_cur->node->ptrs[dst_cur->index]; + dest_cursor[l].index = dest_cursor[l].node->num_keys; + dst_cur--; + } + return dest_cursor; +} + +static struct 
btree_cursor * +btree_copy_cursor_next( + struct btree_root *root, + struct btree_cursor *dest_cursor, + int level) +{ + struct btree_cursor *src_cur = root->cursor + level; + struct btree_cursor *dst_cur; + int l = level; + int i; + + if (level >= root->height) + return NULL; + + while (src_cur->index == src_cur->node->num_keys) { + if (++l >= root->height) + return NULL; + src_cur++; + } + for (i = l; i < root->height; i++) + dest_cursor[i] = *src_cur++; + + dst_cur = dest_cursor + l; + dst_cur->index++; + while (l-- >= level) { + dest_cursor[l].node = dst_cur->node->ptrs[dst_cur->index]; + dest_cursor[l].index = 0; + dst_cur--; + } + return dest_cursor; +} + +/* + * Shift functions + * + * Tries to move items in the current leaf to its sibling if it has space. + * Used in both insert and delete functions. + * Returns the number of items shifted. + */ + +static int +btree_shift_to_prev( + struct btree_root *root, + int level, + struct btree_cursor *prev_cursor, + int num_children) +{ + struct btree_node *node; + struct btree_node *prev_node; + int num_remain; /* # of keys left in "node" */ + unsigned long key; + int i; + + if (!prev_cursor || !num_children) + return 0; + + prev_node = prev_cursor[level].node; + node = root->cursor[level].node; + + ASSERT(num_children > 0 && num_children <= node->num_keys + 1); + + if ((prev_node->num_keys + num_children) > BTREE_KEY_MAX) + return 0; + +#ifdef BTREE_STATS + root->stats.shift_prev += 1; +#endif + + num_remain = node->num_keys - num_children; + ASSERT(num_remain == -1 || num_remain >= BTREE_KEY_MIN); + + /* shift parent keys around */ + level++; + if (num_remain > 0) + key = node->keys[num_children - 1]; + else + key = btree_key_of_cursor(root->cursor + level, + root->height - level); + while (prev_cursor[level].index == prev_cursor[level].node->num_keys) { + level++; + ASSERT(level < root->height); + } + prev_node->keys[prev_node->num_keys] = + prev_cursor[level].node->keys[prev_cursor[level].index]; + 
prev_cursor[level].node->keys[prev_cursor[level].index] = key; + + /* copy pointers and keys to the end of the prev node */ + for (i = 0; i < num_children - 1; i++) { + prev_node->keys[prev_node->num_keys + 1 + i] = node->keys[i]; + prev_node->ptrs[prev_node->num_keys + 1 + i] = node->ptrs[i]; + } + prev_node->ptrs[prev_node->num_keys + 1 + i] = node->ptrs[i]; + prev_node->num_keys += num_children; + + /* move remaining pointers/keys to start of node */ + if (num_remain >= 0) { + for (i = 0; i < num_remain; i++) { + node->keys[i] = node->keys[num_children + i]; + node->ptrs[i] = node->ptrs[num_children + i]; + } + node->ptrs[i] = node->ptrs[num_children + i]; + node->num_keys = num_remain; + } else + node->num_keys = 0; + + return num_children; +} + +static int +btree_shift_to_next( + struct btree_root *root, + int level, + struct btree_cursor *next_cursor, + int num_children) +{ + struct btree_node *node; + struct btree_node *next_node; + int num_remain; /* # of children left in node */ + int i; + + if (!next_cursor || !num_children) + return 0; + + node = root->cursor[level].node; + next_node = next_cursor[level].node; + + ASSERT(num_children > 0 && num_children <= node->num_keys + 1); + + if ((next_node->num_keys + num_children) > BTREE_KEY_MAX) + return 0; + + num_remain = node->num_keys + 1 - num_children; + ASSERT(num_remain == 0 || num_remain > BTREE_KEY_MIN); + +#ifdef BTREE_STATS + root->stats.shift_next += 1; +#endif + + /* make space for "num_children" items at beginning of next-leaf */ + i = next_node->num_keys; + next_node->ptrs[num_children + i] = next_node->ptrs[i]; + while (--i >= 0) { + next_node->keys[num_children + i] = next_node->keys[i]; + next_node->ptrs[num_children + i] = next_node->ptrs[i]; + } + + /* update keys in parent and next node from parent */ + do { + level++; + ASSERT(level < root->height); + } while (root->cursor[level].index == root->cursor[level].node->num_keys); + + next_node->keys[num_children - 1] = + 
root->cursor[level].node->keys[root->cursor[level].index]; + root->cursor[level].node->keys[root->cursor[level].index] = + node->keys[node->num_keys - num_children]; + + /* copy last "num_children" items from node into start of next-node */ + for (i = 0; i < num_children - 1; i++) { + next_node->keys[i] = node->keys[num_remain + i]; + next_node->ptrs[i] = node->ptrs[num_remain + i]; + } + next_node->ptrs[i] = node->ptrs[num_remain + i]; + next_node->num_keys += num_children; + + if (num_remain > 0) + node->num_keys -= num_children; + else + node->num_keys = 0; + + return num_children; +} + +/* + * Insertion functions + */ + +static struct btree_node * +btree_increase_height( + struct btree_root *root) +{ + struct btree_node *new_root; + struct btree_cursor *new_cursor; + + new_cursor = realloc(root->cursor, (root->height + 1) * + sizeof(struct btree_cursor)); + if (!new_cursor) + return NULL; + root->cursor = new_cursor; + + new_root = btree_node_alloc(); + if (!new_root) + return NULL; + +#ifdef BTREE_STATS + root->stats.alloced += 1; + root->stats.inc_height += 1; + root->stats.max_items *= BTREE_PTR_MAX; +#endif + + new_root->ptrs[0] = root->root_node; + root->root_node = new_root; + + root->cursor[root->height].node = new_root; + root->cursor[root->height].index = 0; + + root->height++; + + return new_root; +} + +static int +btree_insert_item( + struct btree_root *root, + int level, + unsigned long key, + void *value); + + +static struct btree_node * +btree_split( + struct btree_root *root, + int level, + unsigned long key, + int *index) +{ + struct btree_node *node = root->cursor[level].node; + struct btree_node *new_node; + int i; + + new_node = btree_node_alloc(); + if (!new_node) + return NULL; + + if (btree_insert_item(root, level + 1, node->keys[BTREE_KEY_MIN], + new_node) != 0) { + btree_node_free(new_node); + return NULL; + } + +#ifdef BTREE_STATS + root->stats.alloced += 1; + root->stats.split += 1; +#endif + + for (i = 0; i < BTREE_KEY_MAX - 
BTREE_KEY_MIN - 1; i++) { + new_node->keys[i] = node->keys[BTREE_KEY_MIN + 1 + i]; + new_node->ptrs[i] = node->ptrs[BTREE_KEY_MIN + 1 + i]; + } + new_node->ptrs[i] = node->ptrs[BTREE_KEY_MIN + 1 + i]; + new_node->num_keys = BTREE_KEY_MAX - BTREE_KEY_MIN - 1; + + node->num_keys = BTREE_KEY_MIN; + if (key < node->keys[BTREE_KEY_MIN]) + return node; /* index doesn't change */ + + /* insertion point is in new node... */ + *index -= BTREE_KEY_MIN + 1; + return new_node; +} + +static int +btree_insert_shift_to_prev( + struct btree_root *root, + int level, + int *index) +{ + struct btree_cursor tmp_cursor[root->height]; + int n; + + if (*index <= 0) + return -1; + + if (!btree_copy_cursor_prev(root, tmp_cursor, level + 1)) + return -1; + + n = MIN(*index, (BTREE_PTR_MAX - tmp_cursor[level].node->num_keys) / 2); + if (!n || !btree_shift_to_prev(root, level, tmp_cursor, n)) + return -1; + + *index -= n; + return 0; +} + +static int +btree_insert_shift_to_next( + struct btree_root *root, + int level, + int *index) +{ + struct btree_cursor tmp_cursor[root->height]; + int n; + + if (*index >= BTREE_KEY_MAX) + return -1; + + if (!btree_copy_cursor_next(root, tmp_cursor, level + 1)) + return -1; + + n = MIN(BTREE_KEY_MAX - *index, + (BTREE_PTR_MAX - tmp_cursor[level].node->num_keys) / 2); + if (!n || !btree_shift_to_next(root, level, tmp_cursor, n)) + return -1; + return 0; +} + +static int +btree_insert_item( + struct btree_root *root, + int level, + unsigned long key, + void *value) +{ + struct btree_node *node = root->cursor[level].node; + int index = root->cursor[level].index; + int i; + + if (node->num_keys == BTREE_KEY_MAX) { + if (btree_insert_shift_to_prev(root, level, &index) == 0) + goto insert; + if (btree_insert_shift_to_next(root, level, &index) == 0) + goto insert; + if (level == root->height - 1) { + if (!btree_increase_height(root)) + return ENOMEM; + } + node = btree_split(root, level, key, &index); + if (!node) + return ENOMEM; + } +insert: + ASSERT(index <= 
node->num_keys); + + i = node->num_keys; + node->ptrs[i + 1] = node->ptrs[i]; + while (--i >= index) { + node->keys[i + 1] = node->keys[i]; + node->ptrs[i + 1] = node->ptrs[i]; + } + + node->num_keys++; + node->keys[index] = key; + + if (level == 0) + node->ptrs[index] = value; + else + node->ptrs[index + 1] = value; + + return 0; +} + + + +int +btree_insert( + struct btree_root *root, + unsigned long key, + void *value) +{ + int result; + + if (!value) + return EINVAL; + + if (btree_search(root, key) && root->cur_key == key) + return EEXIST; + +#ifdef BTREE_STATS + root->stats.insert += 1; + root->stats.num_items += 1; +#endif + + result = btree_insert_item(root, 0, key, value); + + btree_invalidate_cursor(root); + + return result; +} + + +/* + * Deletion functions + * + * Rather more complicated, as a deletion has 4 ways to go once a node + * ends up with fewer than the minimum number of keys: + * - move remainder to previous node + * - move remainder to next node + * (both will involve a parent deletion which may recurse) + * - balance by moving some items from previous node + * - balance by moving some items from next node + */ + +static void +btree_decrease_height( + struct btree_root *root) +{ + struct btree_node *old_root = root->root_node; + + ASSERT(old_root->num_keys == 0); + +#ifdef BTREE_STATS + root->stats.alloced -= 1; + root->stats.dec_height += 1; + root->stats.max_items /= BTREE_PTR_MAX; +#endif + root->root_node = old_root->ptrs[0]; + btree_node_free(old_root); + root->height--; +} + +static int +btree_merge_with_prev( + struct btree_root *root, + int level, + struct btree_cursor *prev_cursor) +{ + if (!prev_cursor) + return 0; + + if (!btree_shift_to_prev(root, level, prev_cursor, + root->cursor[level].node->num_keys + 1)) + return 0; + +#ifdef BTREE_STATS + root->stats.merge_prev += 1; +#endif + return 1; +} + +static int +btree_merge_with_next( + struct btree_root *root, + int level, + struct btree_cursor *next_cursor) +{ + if (!next_cursor) + 
return 0; + + if (!btree_shift_to_next(root, level, next_cursor, + root->cursor[level].node->num_keys + 1)) + return 0; + +#ifdef BTREE_STATS + root->stats.merge_next += 1; +#endif + return 1; +} + +static int +btree_balance_with_prev( + struct btree_root *root, + int level, + struct btree_cursor *prev_cursor) +{ + struct btree_cursor *root_cursor = root->cursor; + + if (!prev_cursor) + return 0; + ASSERT(prev_cursor[level].node->num_keys > BTREE_KEY_MIN); + +#ifdef BTREE_STATS + root->stats.balance_prev += 1; +#endif + /* + * Move some nodes from the prev node into the current node. + * As the shift operation is a right shift and is relative to + * the root cursor, make the root cursor the prev cursor and + * pass in the root cursor as the next cursor. + */ + + root->cursor = prev_cursor; + if (!btree_shift_to_next(root, level, root_cursor, + (prev_cursor[level].node->num_keys + 1 - BTREE_KEY_MIN) / 2)) + abort(); + root->cursor = root_cursor; + + return 1; +} + +static int +btree_balance_with_next( + struct btree_root *root, + int level, + struct btree_cursor *next_cursor) +{ + struct btree_cursor *root_cursor = root->cursor; + + if (!next_cursor) + return 0; + ASSERT(next_cursor[level].node->num_keys > BTREE_KEY_MIN); + +#ifdef BTREE_STATS + root->stats.balance_next += 1; +#endif + /* + * Move some nodes from the next node into the current node. + * As the shift operation is a left shift and is relative to + * the root cursor, make the root cursor the next cursor and + * pass in the root cursor as the prev cursor. 
+ */ + + root->cursor = next_cursor; + if (!btree_shift_to_prev(root, level, root_cursor, + (next_cursor[level].node->num_keys + 1 - BTREE_KEY_MIN) / 2)) + abort(); + root->cursor = root_cursor; + + return 1; + +} + +static void +btree_delete_key( + struct btree_root *root, + int level); + +/* + * btree_delete_node: + * + * On a merge, free the node and delete its key from the parent level. + */ +static void +btree_delete_node( + struct btree_root *root, + int level) +{ + struct btree_cursor prev_cursor[root->height]; + struct btree_cursor next_cursor[root->height]; + struct btree_cursor *pc; + struct btree_cursor *nc; + + /* + * the node has underflowed, grab or merge keys/items from a + * neighbouring node. + */ + + if (level == root->height - 1) { + if (level > 0 && root->root_node->num_keys == 0) + btree_decrease_height(root); + return; + } + + pc = btree_copy_cursor_prev(root, prev_cursor, level + 1); + if (!btree_merge_with_prev(root, level, pc)) { + nc = btree_copy_cursor_next(root, next_cursor, level + 1); + if (!btree_merge_with_next(root, level, nc)) { + /* merging failed, try redistribution */ + if (!btree_balance_with_prev(root, level, pc) && + !btree_balance_with_next(root, level, nc)) + abort(); + return; /* when balancing, the node isn't freed */ + } + } + +#ifdef BTREE_STATS + root->stats.alloced -= 1; +#endif + btree_node_free(root->cursor[level].node); + + btree_delete_key(root, level + 1); +} + +static void +btree_delete_key( + struct btree_root *root, + int level) +{ + struct btree_node *node = root->cursor[level].node; + int index = root->cursor[level].index; + + node->num_keys--; + if (index <= node->num_keys) { + /* + * if not deleting the last item, shift higher items down + * to cover the item being deleted + */ + while (index < node->num_keys) { + node->keys[index] = node->keys[index + 1]; + node->ptrs[index] = node->ptrs[index + 1]; + index++; + } + node->ptrs[index] = node->ptrs[index + 1]; + } else { + /* + * else update the 
associated parent key as the last key + * in the leaf has changed + */ + btree_update_node_key(root, root->cursor, level + 1, + node->keys[node->num_keys]); + } + /* + * if node underflows, either merge with sibling or rebalance + * with sibling. + */ + if (node->num_keys < BTREE_KEY_MIN) + btree_delete_node(root, level); +} + +void * +btree_delete( + struct btree_root *root, + unsigned long key) +{ + void *value; + + value = btree_lookup(root, key); + if (!value) + return NULL; + +#ifdef BTREE_STATS + root->stats.delete += 1; + root->stats.num_items -= 1; +#endif + + btree_delete_key(root, 0); + + btree_invalidate_cursor(root); + + return value; +} + +#ifdef BTREE_STATS +void +btree_print_stats( + struct btree_root *root, + FILE *f) +{ + unsigned long max_items = root->stats.max_items * + (root->root_node->num_keys + 1); + + fprintf(f, "\tnum_items = %lu, max_items = %lu (%lu%%)\n", + root->stats.num_items, max_items, + root->stats.num_items * 100 / max_items); + fprintf(f, "\talloced = %d nodes, %lu bytes, %lu bytes per item\n", + root->stats.alloced, + root->stats.alloced * sizeof(struct btree_node), + root->stats.alloced * sizeof(struct btree_node) / + root->stats.num_items); + fprintf(f, "\tlookup = %d\n", root->stats.lookup); + fprintf(f, "\tfind = %d\n", root->stats.find); + fprintf(f, "\tcache_hits = %d\n", root->stats.cache_hits); + fprintf(f, "\tcache_misses = %d\n", root->stats.cache_misses); + fprintf(f, "\tkey_update = %d\n", root->stats.key_update); + fprintf(f, "\tvalue_update = %d\n", root->stats.value_update); + fprintf(f, "\tinsert = %d\n", root->stats.insert); + fprintf(f, "\tshift_prev = %d\n", root->stats.shift_prev); + fprintf(f, "\tshift_next = %d\n", root->stats.shift_next); + fprintf(f, "\tsplit = %d\n", root->stats.split); + fprintf(f, "\tinc_height = %d\n", root->stats.inc_height); + fprintf(f, "\tdelete = %d\n", root->stats.delete); + fprintf(f, "\tmerge_prev = %d\n", root->stats.merge_prev); + fprintf(f, "\tmerge_next = %d\n", 
root->stats.merge_next); + fprintf(f, "\tbalance_prev = %d\n", root->stats.balance_prev); + fprintf(f, "\tbalance_next = %d\n", root->stats.balance_next); + fprintf(f, "\tdec_height = %d\n", root->stats.dec_height); +} +#endif Index: xfsprogs-dev/repair/btree.h =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ xfsprogs-dev/repair/btree.h 2009-08-20 00:06:44.000000000 +0000 @@ -0,0 +1,102 @@ +/* + * Copyright (c) 2007 Silicon Graphics, Inc. + * All Rights Reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef _BTREE_H +#define _BTREE_H + + +struct btree_root; + +void +btree_init( + struct btree_root **root); + +void +btree_destroy( + struct btree_root *root); + +int +btree_is_empty( + struct btree_root *root); + +void * +btree_lookup( + struct btree_root *root, + unsigned long key); + +void * +btree_find( + struct btree_root *root, + unsigned long key, + unsigned long *actual_key); + +void * +btree_peek_prev( + struct btree_root *root, + unsigned long *key); + +void * +btree_peek_next( + struct btree_root *root, + unsigned long *key); + +void * +btree_lookup_next( + struct btree_root *root, + unsigned long *key); + +void * +btree_lookup_prev( + struct btree_root *root, + unsigned long *key); + +int +btree_insert( + struct btree_root *root, + unsigned long key, + void *value); + +void * 
+btree_delete( + struct btree_root *root, + unsigned long key); + +int +btree_update_key( + struct btree_root *root, + unsigned long old_key, + unsigned long new_key); + +int +btree_update_value( + struct btree_root *root, + unsigned long key, + void *new_value); + +void +btree_clear( + struct btree_root *root); + +#ifdef BTREE_STATS +void +btree_print_stats( + struct btree_root *root, + FILE *f); +#endif + +#endif /* _BTREE_H */ Index: xfsprogs-dev/repair/init.c =================================================================== --- xfsprogs-dev.orig/repair/init.c 2009-08-20 00:01:58.000000000 +0000 +++ xfsprogs-dev/repair/init.c 2009-08-20 00:06:44.000000000 +0000 @@ -26,7 +26,6 @@ #include "dir.h" #include "incore.h" #include "prefetch.h" -#include "radix-tree.h" #include static pthread_key_t dirbuf_key; @@ -151,5 +150,4 @@ ts_create(); ts_init(); increase_rlimit(); - radix_tree_init(); } Index: xfsprogs-dev/repair/prefetch.c =================================================================== --- xfsprogs-dev.orig/repair/prefetch.c 2009-08-20 00:05:36.000000000 +0000 +++ xfsprogs-dev/repair/prefetch.c 2009-08-20 00:14:08.000000000 +0000 @@ -1,6 +1,7 @@ #include #include #include "avl.h" +#include "btree.h" #include "globals.h" #include "agheader.h" #include "incore.h" @@ -14,7 +15,6 @@ #include "threads.h" #include "prefetch.h" #include "progress.h" -#include "radix-tree.h" int do_prefetch = 1; @@ -129,10 +129,8 @@ pthread_mutex_lock(&args->lock); if (fsbno > args->last_bno_read) { - radix_tree_insert(&args->primary_io_queue, fsbno, bp); - if (!B_IS_INODE(flag)) - radix_tree_tag_set(&args->primary_io_queue, fsbno, 0); - else { + btree_insert(args->primary_io_queue, fsbno, bp); + if (B_IS_INODE(flag)) { args->inode_bufs_queued++; if (args->inode_bufs_queued == IO_THRESHOLD) pf_start_io_workers(args); @@ -154,7 +152,7 @@ #endif ASSERT(!B_IS_INODE(flag)); XFS_BUF_SET_PRIORITY(bp, B_DIR_META_2); - radix_tree_insert(&args->secondary_io_queue, fsbno, bp); + 
btree_insert(args->secondary_io_queue, fsbno, bp); } pf_start_processing(args); @@ -407,7 +405,7 @@ pf_which_t which, void *buf) { - struct radix_tree_root *queue; + struct btree_root *queue; xfs_buf_t *bplist[MAX_BUFS]; unsigned int num; off64_t first_off, last_off, next_off; @@ -415,27 +413,25 @@ int i; int inode_bufs; unsigned long fsbno; + unsigned long max_fsbno; char *pbuf; - queue = (which != PF_SECONDARY) ? &args->primary_io_queue - : &args->secondary_io_queue; + queue = (which != PF_SECONDARY) ? args->primary_io_queue + : args->secondary_io_queue; - while (radix_tree_lookup_first(queue, &fsbno) != NULL) { - - if (which != PF_META_ONLY) { - num = radix_tree_gang_lookup_ex(queue, - (void**)&bplist[0], fsbno, - fsbno + pf_max_fsbs, MAX_BUFS); - ASSERT(num > 0); - ASSERT(XFS_FSB_TO_DADDR(mp, fsbno) == - XFS_BUF_ADDR(bplist[0])); - } else { - num = radix_tree_gang_lookup_tag(queue, - (void**)&bplist[0], fsbno, - MAX_BUFS / 4, 0); - if (num == 0) - return; + while (btree_find(queue, 0, &fsbno) != NULL) { + max_fsbno = fsbno + pf_max_fsbs; + num = 0; + + bplist[0] = btree_lookup(queue, fsbno); + while (bplist[num] && num < MAX_BUFS && fsbno < max_fsbno) { + if (which != PF_META_ONLY || + !B_IS_INODE(XFS_BUF_PRIORITY(bplist[num]))) + num++; + bplist[num] = btree_lookup_next(queue, &fsbno); } + if (!num) + return; /* * do a big read if 25% of the potential buffer is useful, @@ -467,7 +463,7 @@ } for (i = 0; i < num; i++) { - if (radix_tree_delete(queue, XFS_DADDR_TO_FSB(mp, + if (btree_delete(queue, XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bplist[i]))) == NULL) do_error(_("prefetch corruption\n")); } @@ -570,7 +566,7 @@ return NULL; pthread_mutex_lock(&args->lock); - while (!args->queuing_done || args->primary_io_queue.height) { + while (!args->queuing_done || btree_find(args->primary_io_queue, 0, NULL)) { #ifdef XR_PF_TRACE pftrace("waiting to start prefetch I/O for AG %d", args->agno); @@ -696,8 +692,8 @@ #endif pthread_mutex_lock(&args->lock); - 
ASSERT(args->primary_io_queue.height == 0); - ASSERT(args->secondary_io_queue.height == 0); + ASSERT(btree_find(args->primary_io_queue, 0, NULL) == NULL); + ASSERT(btree_find(args->secondary_io_queue, 0, NULL) == NULL); args->prefetch_done = 1; if (args->next_args) @@ -755,8 +751,8 @@ args = calloc(1, sizeof(prefetch_args_t)); - INIT_RADIX_TREE(&args->primary_io_queue, 0); - INIT_RADIX_TREE(&args->secondary_io_queue, 0); + btree_init(&args->primary_io_queue); + btree_init(&args->secondary_io_queue); if (pthread_mutex_init(&args->lock, NULL) != 0) do_error(_("failed to initialize prefetch mutex\n")); if (pthread_cond_init(&args->start_reading, NULL) != 0) @@ -835,6 +831,8 @@ pthread_cond_destroy(&args->start_reading); pthread_cond_destroy(&args->start_processing); sem_destroy(&args->ra_count); + btree_destroy(args->primary_io_queue); + btree_destroy(args->secondary_io_queue); free(args); } Index: xfsprogs-dev/repair/prefetch.h =================================================================== --- xfsprogs-dev.orig/repair/prefetch.h 2009-08-20 00:01:58.000000000 +0000 +++ xfsprogs-dev/repair/prefetch.h 2009-08-20 00:06:44.000000000 +0000 @@ -3,7 +3,6 @@ #include #include "incore.h" -#include "radix-tree.h" extern int do_prefetch; @@ -14,8 +13,8 @@ pthread_mutex_t lock; pthread_t queuing_thread; pthread_t io_threads[PF_THREAD_COUNT]; - struct radix_tree_root primary_io_queue; - struct radix_tree_root secondary_io_queue; + struct btree_root *primary_io_queue; + struct btree_root *secondary_io_queue; pthread_cond_t start_reading; pthread_cond_t start_processing; int agno; Index: xfsprogs-dev/repair/radix-tree.c =================================================================== --- xfsprogs-dev.orig/repair/radix-tree.c 2009-08-20 00:01:58.000000000 +0000 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,805 +0,0 @@ -/* - * Copyright (C) 2001 Momchil Velikov - * Portions Copyright (C) 2001 Christoph Hellwig - * Copyright (C) 2005 SGI, Christoph Lameter - * - * 
This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation; either version 2, or (at - * your option) any later version. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. - */ - -#include -#include "radix-tree.h" - -#ifndef ARRAY_SIZE -#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) -#endif - -#define RADIX_TREE_MAP_SHIFT 6 -#define RADIX_TREE_MAP_SIZE (1UL << RADIX_TREE_MAP_SHIFT) -#define RADIX_TREE_MAP_MASK (RADIX_TREE_MAP_SIZE-1) - -#ifdef RADIX_TREE_TAGS -#define RADIX_TREE_TAG_LONGS \ - ((RADIX_TREE_MAP_SIZE + BITS_PER_LONG - 1) / BITS_PER_LONG) -#endif - -struct radix_tree_node { - unsigned int count; - void *slots[RADIX_TREE_MAP_SIZE]; -#ifdef RADIX_TREE_TAGS - unsigned long tags[RADIX_TREE_MAX_TAGS][RADIX_TREE_TAG_LONGS]; -#endif -}; - -struct radix_tree_path { - struct radix_tree_node *node; - int offset; -}; - -#define RADIX_TREE_INDEX_BITS (8 /* CHAR_BIT */ * sizeof(unsigned long)) -#define RADIX_TREE_MAX_PATH (RADIX_TREE_INDEX_BITS/RADIX_TREE_MAP_SHIFT + 2) - -static unsigned long height_to_maxindex[RADIX_TREE_MAX_PATH]; - -/* - * Radix tree node cache. 
- */ - -#define radix_tree_node_alloc(r) ((struct radix_tree_node *) \ - calloc(1, sizeof(struct radix_tree_node))) -#define radix_tree_node_free(n) free(n) - -#ifdef RADIX_TREE_TAGS - -static inline void tag_set(struct radix_tree_node *node, unsigned int tag, - int offset) -{ - *((__uint32_t *)node->tags[tag] + (offset >> 5)) |= (1 << (offset & 31)); -} - -static inline void tag_clear(struct radix_tree_node *node, unsigned int tag, - int offset) -{ - __uint32_t *p = (__uint32_t*)node->tags[tag] + (offset >> 5); - __uint32_t m = 1 << (offset & 31); - *p &= ~m; -} - -static inline int tag_get(struct radix_tree_node *node, unsigned int tag, - int offset) -{ - return 1 & (((const __uint32_t *)node->tags[tag])[offset >> 5] >> (offset & 31)); -} - -/* - * Returns 1 if any slot in the node has this tag set. - * Otherwise returns 0. - */ -static inline int any_tag_set(struct radix_tree_node *node, unsigned int tag) -{ - int idx; - for (idx = 0; idx < RADIX_TREE_TAG_LONGS; idx++) { - if (node->tags[tag][idx]) - return 1; - } - return 0; -} - -#endif - -/* - * Return the maximum key which can be store into a - * radix tree with height HEIGHT. - */ -static inline unsigned long radix_tree_maxindex(unsigned int height) -{ - return height_to_maxindex[height]; -} - -/* - * Extend a radix tree so it can store key @index. - */ -static int radix_tree_extend(struct radix_tree_root *root, unsigned long index) -{ - struct radix_tree_node *node; - unsigned int height; -#ifdef RADIX_TREE_TAGS - char tags[RADIX_TREE_MAX_TAGS]; - int tag; -#endif - - /* Figure out what the height should be. 
*/ - height = root->height + 1; - while (index > radix_tree_maxindex(height)) - height++; - - if (root->rnode == NULL) { - root->height = height; - goto out; - } - -#ifdef RADIX_TREE_TAGS - /* - * Prepare the tag status of the top-level node for propagation - * into the newly-pushed top-level node(s) - */ - for (tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++) { - tags[tag] = 0; - if (any_tag_set(root->rnode, tag)) - tags[tag] = 1; - } -#endif - do { - if (!(node = radix_tree_node_alloc(root))) - return -ENOMEM; - - /* Increase the height. */ - node->slots[0] = root->rnode; - -#ifdef RADIX_TREE_TAGS - /* Propagate the aggregated tag info into the new root */ - for (tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++) { - if (tags[tag]) - tag_set(node, tag, 0); - } -#endif - node->count = 1; - root->rnode = node; - root->height++; - } while (height > root->height); -out: - return 0; -} - -/** - * radix_tree_insert - insert into a radix tree - * @root: radix tree root - * @index: index key - * @item: item to insert - * - * Insert an item into the radix tree at position @index. - */ -int radix_tree_insert(struct radix_tree_root *root, - unsigned long index, void *item) -{ - struct radix_tree_node *node = NULL, *slot; - unsigned int height, shift; - int offset; - int error; - - /* Make sure the tree is high enough. */ - if ((!index && !root->rnode) || - index > radix_tree_maxindex(root->height)) { - error = radix_tree_extend(root, index); - if (error) - return error; - } - - slot = root->rnode; - height = root->height; - shift = (height-1) * RADIX_TREE_MAP_SHIFT; - - offset = 0; /* uninitialised var warning */ - do { - if (slot == NULL) { - /* Have to add a child node. 
*/ - if (!(slot = radix_tree_node_alloc(root))) - return -ENOMEM; - if (node) { - node->slots[offset] = slot; - node->count++; - } else - root->rnode = slot; - } - - /* Go a level down */ - offset = (index >> shift) & RADIX_TREE_MAP_MASK; - node = slot; - slot = node->slots[offset]; - shift -= RADIX_TREE_MAP_SHIFT; - height--; - } while (height > 0); - - if (slot != NULL) - return -EEXIST; - - ASSERT(node); - node->count++; - node->slots[offset] = item; -#ifdef RADIX_TREE_TAGS - ASSERT(!tag_get(node, 0, offset)); - ASSERT(!tag_get(node, 1, offset)); -#endif - return 0; -} - -static inline void **__lookup_slot(struct radix_tree_root *root, - unsigned long index) -{ - unsigned int height, shift; - struct radix_tree_node **slot; - - height = root->height; - if (index > radix_tree_maxindex(height)) - return NULL; - - shift = (height-1) * RADIX_TREE_MAP_SHIFT; - slot = &root->rnode; - - while (height > 0) { - if (*slot == NULL) - return NULL; - - slot = (struct radix_tree_node **) - ((*slot)->slots + - ((index >> shift) & RADIX_TREE_MAP_MASK)); - shift -= RADIX_TREE_MAP_SHIFT; - height--; - } - - return (void **)slot; -} - -/** - * radix_tree_lookup_slot - lookup a slot in a radix tree - * @root: radix tree root - * @index: index key - * - * Lookup the slot corresponding to the position @index in the radix tree - * @root. This is useful for update-if-exists operations. - */ -void **radix_tree_lookup_slot(struct radix_tree_root *root, unsigned long index) -{ - return __lookup_slot(root, index); -} - -/** - * radix_tree_lookup - perform lookup operation on a radix tree - * @root: radix tree root - * @index: index key - * - * Lookup the item at the position @index in the radix tree @root. - */ -void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index) -{ - void **slot; - - slot = __lookup_slot(root, index); - return slot != NULL ? 
*slot : NULL; -} - -/** - * raid_tree_first_key - find the first index key in the radix tree - * @root: radix tree root - * @index: where the first index will be placed - * - * Returns the first entry and index key in the radix tree @root. - */ -void *radix_tree_lookup_first(struct radix_tree_root *root, unsigned long *index) -{ - unsigned int height, shift; - struct radix_tree_node *slot; - unsigned long i; - - height = root->height; - *index = 0; - if (height == 0) - return NULL; - - shift = (height-1) * RADIX_TREE_MAP_SHIFT; - slot = root->rnode; - - for (; height > 1; height--) { - for (i = 0; i < RADIX_TREE_MAP_SIZE; i++) { - if (slot->slots[i] != NULL) - break; - } - ASSERT(i < RADIX_TREE_MAP_SIZE); - - *index |= (i << shift); - shift -= RADIX_TREE_MAP_SHIFT; - slot = slot->slots[i]; - } - for (i = 0; i < RADIX_TREE_MAP_SIZE; i++) { - if (slot->slots[i] != NULL) { - *index |= i; - return slot->slots[i]; - } - } - return NULL; -} - -#ifdef RADIX_TREE_TAGS - -/** - * radix_tree_tag_set - set a tag on a radix tree node - * @root: radix tree root - * @index: index key - * @tag: tag index - * - * Set the search tag (which must be < RADIX_TREE_MAX_TAGS) - * corresponding to @index in the radix tree. From - * the root all the way down to the leaf node. - * - * Returns the address of the tagged item. Setting a tag on a not-present - * item is a bug. 
- */ -void *radix_tree_tag_set(struct radix_tree_root *root, - unsigned long index, unsigned int tag) -{ - unsigned int height, shift; - struct radix_tree_node *slot; - - height = root->height; - if (index > radix_tree_maxindex(height)) - return NULL; - - shift = (height - 1) * RADIX_TREE_MAP_SHIFT; - slot = root->rnode; - - while (height > 0) { - int offset; - - offset = (index >> shift) & RADIX_TREE_MAP_MASK; - if (!tag_get(slot, tag, offset)) - tag_set(slot, tag, offset); - slot = slot->slots[offset]; - ASSERT(slot != NULL); - shift -= RADIX_TREE_MAP_SHIFT; - height--; - } - - return slot; -} - -/** - * radix_tree_tag_clear - clear a tag on a radix tree node - * @root: radix tree root - * @index: index key - * @tag: tag index - * - * Clear the search tag (which must be < RADIX_TREE_MAX_TAGS) - * corresponding to @index in the radix tree. If - * this causes the leaf node to have no tags set then clear the tag in the - * next-to-leaf node, etc. - * - * Returns the address of the tagged item on success, else NULL. ie: - * has the same return value and semantics as radix_tree_lookup(). 
- */ -void *radix_tree_tag_clear(struct radix_tree_root *root, - unsigned long index, unsigned int tag) -{ - struct radix_tree_path path[RADIX_TREE_MAX_PATH], *pathp = path; - struct radix_tree_node *slot; - unsigned int height, shift; - void *ret = NULL; - - height = root->height; - if (index > radix_tree_maxindex(height)) - goto out; - - shift = (height - 1) * RADIX_TREE_MAP_SHIFT; - pathp->node = NULL; - slot = root->rnode; - - while (height > 0) { - int offset; - - if (slot == NULL) - goto out; - - offset = (index >> shift) & RADIX_TREE_MAP_MASK; - pathp[1].offset = offset; - pathp[1].node = slot; - slot = slot->slots[offset]; - pathp++; - shift -= RADIX_TREE_MAP_SHIFT; - height--; - } - - ret = slot; - if (ret == NULL) - goto out; - - do { - if (!tag_get(pathp->node, tag, pathp->offset)) - goto out; - tag_clear(pathp->node, tag, pathp->offset); - if (any_tag_set(pathp->node, tag)) - goto out; - pathp--; - } while (pathp->node); -out: - return ret; -} - -#endif - -static unsigned int -__lookup(struct radix_tree_root *root, void **results, unsigned long index, - unsigned int max_items, unsigned long *next_index) -{ - unsigned int nr_found = 0; - unsigned int shift, height; - struct radix_tree_node *slot; - unsigned long i; - - height = root->height; - if (height == 0) - goto out; - - shift = (height-1) * RADIX_TREE_MAP_SHIFT; - slot = root->rnode; - - for ( ; height > 1; height--) { - - for (i = (index >> shift) & RADIX_TREE_MAP_MASK ; - i < RADIX_TREE_MAP_SIZE; i++) { - if (slot->slots[i] != NULL) - break; - index &= ~((1UL << shift) - 1); - index += 1UL << shift; - if (index == 0) - goto out; /* 32-bit wraparound */ - } - if (i == RADIX_TREE_MAP_SIZE) - goto out; - - shift -= RADIX_TREE_MAP_SHIFT; - slot = slot->slots[i]; - } - - /* Bottom level: grab some items */ - for (i = index & RADIX_TREE_MAP_MASK; i < RADIX_TREE_MAP_SIZE; i++) { - index++; - if (slot->slots[i]) { - results[nr_found++] = slot->slots[i]; - if (nr_found == max_items) - goto out; - } - } 
-out: - *next_index = index; - return nr_found; -} - -/** - * radix_tree_gang_lookup - perform multiple lookup on a radix tree - * @root: radix tree root - * @results: where the results of the lookup are placed - * @first_index: start the lookup from this key - * @max_items: place up to this many items at *results - * - * Performs an index-ascending scan of the tree for present items. Places - * them at *@results and returns the number of items which were placed at - * *@results. - * - * The implementation is naive. - */ -unsigned int -radix_tree_gang_lookup(struct radix_tree_root *root, void **results, - unsigned long first_index, unsigned int max_items) -{ - const unsigned long max_index = radix_tree_maxindex(root->height); - unsigned long cur_index = first_index; - unsigned int ret = 0; - - while (ret < max_items) { - unsigned int nr_found; - unsigned long next_index; /* Index of next search */ - - if (cur_index > max_index) - break; - nr_found = __lookup(root, results + ret, cur_index, - max_items - ret, &next_index); - ret += nr_found; - if (next_index == 0) - break; - cur_index = next_index; - } - return ret; -} - -/** - * radix_tree_gang_lookup_ex - perform multiple lookup on a radix tree - * @root: radix tree root - * @results: where the results of the lookup are placed - * @first_index: start the lookup from this key - * @last_index: don't lookup past this key - * @max_items: place up to this many items at *results - * - * Performs an index-ascending scan of the tree for present items starting - * @first_index until @last_index up to as many as @max_items. Places - * them at *@results and returns the number of items which were placed - * at *@results. - * - * The implementation is naive. 
- */ -unsigned int -radix_tree_gang_lookup_ex(struct radix_tree_root *root, void **results, - unsigned long first_index, unsigned long last_index, - unsigned int max_items) -{ - const unsigned long max_index = radix_tree_maxindex(root->height); - unsigned long cur_index = first_index; - unsigned int ret = 0; - - while (ret < max_items && cur_index < last_index) { - unsigned int nr_found; - unsigned long next_index; /* Index of next search */ - - if (cur_index > max_index) - break; - nr_found = __lookup(root, results + ret, cur_index, - max_items - ret, &next_index); - ret += nr_found; - if (next_index == 0) - break; - cur_index = next_index; - } - return ret; -} - -#ifdef RADIX_TREE_TAGS - -static unsigned int -__lookup_tag(struct radix_tree_root *root, void **results, unsigned long index, - unsigned int max_items, unsigned long *next_index, unsigned int tag) -{ - unsigned int nr_found = 0; - unsigned int shift; - unsigned int height = root->height; - struct radix_tree_node *slot; - - shift = (height - 1) * RADIX_TREE_MAP_SHIFT; - slot = root->rnode; - - while (height > 0) { - unsigned long i = (index >> shift) & RADIX_TREE_MAP_MASK; - - for ( ; i < RADIX_TREE_MAP_SIZE; i++) { - if (tag_get(slot, tag, i)) { - ASSERT(slot->slots[i] != NULL); - break; - } - index &= ~((1UL << shift) - 1); - index += 1UL << shift; - if (index == 0) - goto out; /* 32-bit wraparound */ - } - if (i == RADIX_TREE_MAP_SIZE) - goto out; - height--; - if (height == 0) { /* Bottom level: grab some items */ - unsigned long j = index & RADIX_TREE_MAP_MASK; - - for ( ; j < RADIX_TREE_MAP_SIZE; j++) { - index++; - if (tag_get(slot, tag, j)) { - ASSERT(slot->slots[j] != NULL); - results[nr_found++] = slot->slots[j]; - if (nr_found == max_items) - goto out; - } - } - } - shift -= RADIX_TREE_MAP_SHIFT; - slot = slot->slots[i]; - } -out: - *next_index = index; - return nr_found; -} - -/** - * radix_tree_gang_lookup_tag - perform multiple lookup on a radix tree - * based on a tag - * @root: radix tree 
root - * @results: where the results of the lookup are placed - * @first_index: start the lookup from this key - * @max_items: place up to this many items at *results - * @tag: the tag index (< RADIX_TREE_MAX_TAGS) - * - * Performs an index-ascending scan of the tree for present items which - * have the tag indexed by @tag set. Places the items at *@results and - * returns the number of items which were placed at *@results. - */ -unsigned int -radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results, - unsigned long first_index, unsigned int max_items, - unsigned int tag) -{ - const unsigned long max_index = radix_tree_maxindex(root->height); - unsigned long cur_index = first_index; - unsigned int ret = 0; - - while (ret < max_items) { - unsigned int nr_found; - unsigned long next_index; /* Index of next search */ - - if (cur_index > max_index) - break; - nr_found = __lookup_tag(root, results + ret, cur_index, - max_items - ret, &next_index, tag); - ret += nr_found; - if (next_index == 0) - break; - cur_index = next_index; - } - return ret; -} - -#endif - -/** - * radix_tree_shrink - shrink height of a radix tree to minimal - * @root radix tree root - */ -static inline void radix_tree_shrink(struct radix_tree_root *root) -{ - /* try to shrink tree height */ - while (root->height > 1 && - root->rnode->count == 1 && - root->rnode->slots[0]) { - struct radix_tree_node *to_free = root->rnode; - - root->rnode = to_free->slots[0]; - root->height--; - /* must only free zeroed nodes into the slab */ -#ifdef RADIX_TREE_TAGS - tag_clear(to_free, 0, 0); - tag_clear(to_free, 1, 0); -#endif - to_free->slots[0] = NULL; - to_free->count = 0; - radix_tree_node_free(to_free); - } -} - -/** - * radix_tree_delete - delete an item from a radix tree - * @root: radix tree root - * @index: index key - * - * Remove the item at @index from the radix tree rooted at @root. - * - * Returns the address of the deleted item, or NULL if it was not present. 
- */ -void *radix_tree_delete(struct radix_tree_root *root, unsigned long index) -{ - struct radix_tree_path path[RADIX_TREE_MAX_PATH], *pathp = path; - struct radix_tree_path *orig_pathp; - struct radix_tree_node *slot; - unsigned int height, shift; - void *ret = NULL; -#ifdef RADIX_TREE_TAGS - char tags[RADIX_TREE_MAX_TAGS]; - int nr_cleared_tags; - int tag; -#endif - int offset; - - height = root->height; - if (index > radix_tree_maxindex(height)) - goto out; - - shift = (height - 1) * RADIX_TREE_MAP_SHIFT; - pathp->node = NULL; - slot = root->rnode; - - for ( ; height > 0; height--) { - if (slot == NULL) - goto out; - - pathp++; - offset = (index >> shift) & RADIX_TREE_MAP_MASK; - pathp->offset = offset; - pathp->node = slot; - slot = slot->slots[offset]; - shift -= RADIX_TREE_MAP_SHIFT; - } - - ret = slot; - if (ret == NULL) - goto out; - - orig_pathp = pathp; - -#ifdef RADIX_TREE_TAGS - /* - * Clear all tags associated with the just-deleted item - */ - nr_cleared_tags = 0; - for (tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++) { - tags[tag] = 1; - if (tag_get(pathp->node, tag, pathp->offset)) { - tag_clear(pathp->node, tag, pathp->offset); - if (!any_tag_set(pathp->node, tag)) { - tags[tag] = 0; - nr_cleared_tags++; - } - } - } - - for (pathp--; nr_cleared_tags && pathp->node; pathp--) { - for (tag = 0; tag < RADIX_TREE_MAX_TAGS; tag++) { - if (tags[tag]) - continue; - - tag_clear(pathp->node, tag, pathp->offset); - if (any_tag_set(pathp->node, tag)) { - tags[tag] = 1; - nr_cleared_tags--; - } - } - } -#endif - /* Now free the nodes we do not need anymore */ - for (pathp = orig_pathp; pathp->node; pathp--) { - pathp->node->slots[pathp->offset] = NULL; - pathp->node->count--; - - if (pathp->node->count) { - if (pathp->node == root->rnode) - radix_tree_shrink(root); - goto out; - } - - /* Node with zero slots in use so free it */ - radix_tree_node_free(pathp->node); - } - root->rnode = NULL; - root->height = 0; -out: - return ret; -} - -#ifdef RADIX_TREE_TAGS -/** - 
* radix_tree_tagged - test whether any items in the tree are tagged - * @root: radix tree root - * @tag: tag to test - */ -int radix_tree_tagged(struct radix_tree_root *root, unsigned int tag) -{ - struct radix_tree_node *rnode; - rnode = root->rnode; - if (!rnode) - return 0; - return any_tag_set(rnode, tag); -} -#endif - -static unsigned long __maxindex(unsigned int height) -{ - unsigned int tmp = height * RADIX_TREE_MAP_SHIFT; - unsigned long index = (~0UL >> (RADIX_TREE_INDEX_BITS - tmp - 1)) >> 1; - - if (tmp >= RADIX_TREE_INDEX_BITS) - index = ~0UL; - return index; -} - -static void radix_tree_init_maxindex(void) -{ - unsigned int i; - - for (i = 0; i < ARRAY_SIZE(height_to_maxindex); i++) - height_to_maxindex[i] = __maxindex(i); -} - -void radix_tree_init(void) -{ - radix_tree_init_maxindex(); -} Index: xfsprogs-dev/repair/radix-tree.h =================================================================== --- xfsprogs-dev.orig/repair/radix-tree.h 2009-08-20 00:01:58.000000000 +0000 +++ /dev/null 1970-01-01 00:00:00.000000000 +0000 @@ -1,76 +0,0 @@ -/* - * Copyright (C) 2001 Momchil Velikov - * Portions Copyright (C) 2001 Christoph Hellwig - * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation; either version 2, or (at - * your option) any later version. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
- */ -#ifndef __XFS_SUPPORT_RADIX_TREE_H__ -#define __XFS_SUPPORT_RADIX_TREE_H__ - -#define RADIX_TREE_TAGS - -struct radix_tree_root { - unsigned int height; - struct radix_tree_node *rnode; -}; - -#define RADIX_TREE_INIT(mask) { \ - .height = 0, \ - .rnode = NULL, \ -} - -#define RADIX_TREE(name, mask) \ - struct radix_tree_root name = RADIX_TREE_INIT(mask) - -#define INIT_RADIX_TREE(root, mask) \ -do { \ - (root)->height = 0; \ - (root)->rnode = NULL; \ -} while (0) - -#ifdef RADIX_TREE_TAGS -#define RADIX_TREE_MAX_TAGS 2 -#endif - -int radix_tree_insert(struct radix_tree_root *, unsigned long, void *); -void *radix_tree_lookup(struct radix_tree_root *, unsigned long); -void **radix_tree_lookup_slot(struct radix_tree_root *, unsigned long); -void *radix_tree_lookup_first(struct radix_tree_root *, unsigned long *); -void *radix_tree_delete(struct radix_tree_root *, unsigned long); -unsigned int -radix_tree_gang_lookup(struct radix_tree_root *root, void **results, - unsigned long first_index, unsigned int max_items); -unsigned int -radix_tree_gang_lookup_ex(struct radix_tree_root *root, void **results, - unsigned long first_index, unsigned long last_index, - unsigned int max_items); - -void radix_tree_init(void); - -#ifdef RADIX_TREE_TAGS -void *radix_tree_tag_set(struct radix_tree_root *root, - unsigned long index, unsigned int tag); -void *radix_tree_tag_clear(struct radix_tree_root *root, - unsigned long index, unsigned int tag); -int radix_tree_tag_get(struct radix_tree_root *root, - unsigned long index, unsigned int tag); -unsigned int -radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results, - unsigned long first_index, unsigned int max_items, - unsigned int tag); -int radix_tree_tagged(struct radix_tree_root *root, unsigned int tag); -#endif - -#endif /* __XFS_SUPPORT_RADIX_TREE_H__ */ From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:43 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated 
(updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_64, J_CHICKENPOX_65,J_CHICKENPOX_66 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HwIFf034412 for ; Wed, 2 Sep 2009 12:58:33 -0500 X-ASG-Debug-ID: 1251914322-70a203a50000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9640441E04A for ; Wed, 2 Sep 2009 10:58:42 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id 3XWEVUcA6owepWaI for ; Wed, 02 Sep 2009 10:58:42 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6U-0006a2-5D; Wed, 02 Sep 2009 17:58:42 +0000 Message-Id: <20090902175842.081792481@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:44 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 13/14] repair: optimize duplicate extent tracking Subject: [PATCH 13/14] repair: optimize duplicate extent tracking References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-dup_extents-btree X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914322 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Switch the duplicate extent tracking from an avl tree to our new btree implementation. 
Modify search_dup_extent to find overlapping extents with differing start blocks instead of having the caller walk every possible start block. Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/dinode.c =================================================================== --- xfsprogs-dev.orig/repair/dinode.c 2009-08-21 15:11:16.000000000 +0000 +++ xfsprogs-dev/repair/dinode.c 2009-08-21 15:11:29.000000000 +0000 @@ -735,18 +735,14 @@ process_bmbt_reclist_int( * checking each entry without setting the * block bitmap */ - for (b = irec.br_startblock; - agbno < ebno; - b++, agbno++) { - if (search_dup_extent(mp, agno, agbno)) { - do_warn(_("%s fork in ino %llu claims " - "dup extent, off - %llu, " - "start - %llu, cnt %llu\n"), - forkname, ino, irec.br_startoff, - irec.br_startblock, - irec.br_blockcount); - goto done; - } + if (search_dup_extent(agno, agbno, ebno)) { + do_warn(_("%s fork in ino %llu claims " + "dup extent, off - %llu, " + "start - %llu, cnt %llu\n"), + forkname, ino, irec.br_startoff, + irec.br_startblock, + irec.br_blockcount); + goto done; } *tot += irec.br_blockcount; continue; Index: xfsprogs-dev/repair/incore.h =================================================================== --- xfsprogs-dev.orig/repair/incore.h 2009-08-21 15:11:16.000000000 +0000 +++ xfsprogs-dev/repair/incore.h 2009-08-21 15:11:29.000000000 +0000 @@ -20,6 +20,8 @@ #define XFS_REPAIR_INCORE_H #include "avl.h" + + /* * contains definition information. implementation (code) * is spread out in separate files.
@@ -179,23 +181,11 @@ get_bcnt_extent(xfs_agnumber_t agno, xfs /* * duplicate extent tree functions */ -void add_dup_extent(xfs_agnumber_t agno, - xfs_agblock_t startblock, - xfs_extlen_t blockcount); - -extern avltree_desc_t **extent_tree_ptrs; -/* ARGSUSED */ -static inline int -search_dup_extent(xfs_mount_t *mp, xfs_agnumber_t agno, xfs_agblock_t agbno) -{ - ASSERT(agno < glob_agcount); - - if (avl_findrange(extent_tree_ptrs[agno], agbno) != NULL) - return(1); - - return(0); -} +int add_dup_extent(xfs_agnumber_t agno, xfs_agblock_t startblock, + xfs_extlen_t blockcount); +int search_dup_extent(xfs_agnumber_t agno, + xfs_agblock_t start_agbno, xfs_agblock_t end_agbno); void add_rt_dup_extent(xfs_drtbno_t startblock, xfs_extlen_t blockcount); Index: xfsprogs-dev/repair/incore_ext.c =================================================================== --- xfsprogs-dev.orig/repair/incore_ext.c 2009-08-21 15:11:16.000000000 +0000 +++ xfsprogs-dev/repair/incore_ext.c 2009-08-21 15:24:07.000000000 +0000 @@ -18,6 +18,7 @@ #include #include "avl.h" +#include "btree.h" #include "globals.h" #include "incore.h" #include "agheader.h" @@ -72,8 +73,8 @@ static rt_ext_flist_t rt_ext_flist; static avl64tree_desc_t *rt_ext_tree_ptr; /* dup extent tree for rt */ -avltree_desc_t **extent_tree_ptrs; /* array of extent tree ptrs */ - /* one per ag for dups */ +static struct btree_root **dup_extent_trees; /* per ag dup extent trees */ + static avltree_desc_t **extent_bno_ptrs; /* * array of extent tree ptrs * one per ag for free extents @@ -100,6 +101,48 @@ static pthread_mutex_t rt_ext_tree_lock; static pthread_mutex_t rt_ext_flist_lock; /* + * duplicate extent tree functions + */ + +void +release_dup_extent_tree( + xfs_agnumber_t agno) +{ + btree_clear(dup_extent_trees[agno]); +} + +int +add_dup_extent( + xfs_agnumber_t agno, + xfs_agblock_t startblock, + xfs_extlen_t blockcount) +{ +#ifdef XR_DUP_TRACE + fprintf(stderr, "Adding dup extent - %d/%d %d\n", agno, startblock, + 
blockcount); +#endif + return btree_insert(dup_extent_trees[agno], startblock, + (void *)(uintptr_t)(startblock + blockcount)); +} + +int +search_dup_extent( + xfs_agnumber_t agno, + xfs_agblock_t start_agbno, + xfs_agblock_t end_agbno) +{ + unsigned long bno; + + if (!btree_find(dup_extent_trees[agno], start_agbno, &bno)) + return 0; /* this really shouldn't happen */ + if (bno < end_agbno) + return 1; + return (uintptr_t)btree_peek_prev(dup_extent_trees[agno], NULL) > + start_agbno; +} + + +/* * extent tree stuff is avl trees of duplicate extents, * sorted in order by block number. there is one tree per ag. */ @@ -211,14 +254,6 @@ release_extent_tree(avltree_desc_t *tree * top-level (visible) routines */ void -release_dup_extent_tree(xfs_agnumber_t agno) -{ - release_extent_tree(extent_tree_ptrs[agno]); - - return; -} - -void release_agbno_extent_tree(xfs_agnumber_t agno) { release_extent_tree(extent_bno_ptrs[agno]); @@ -522,93 +557,6 @@ get_bcnt_extent(xfs_agnumber_t agno, xfs return(ext); } -/* - * the next 2 routines manage the trees of duplicate extents -- 1 tree - * per AG - */ -void -add_dup_extent(xfs_agnumber_t agno, xfs_agblock_t startblock, - xfs_extlen_t blockcount) -{ - extent_tree_node_t *first, *last, *ext, *next_ext; - xfs_agblock_t new_startblock; - xfs_extlen_t new_blockcount; - - ASSERT(agno < glob_agcount); - -#ifdef XR_DUP_TRACE - fprintf(stderr, "Adding dup extent - %d/%d %d\n", agno, startblock, blockcount); -#endif - avl_findranges(extent_tree_ptrs[agno], startblock - 1, - startblock + blockcount + 1, - (avlnode_t **) &first, (avlnode_t **) &last); - /* - * find adjacent and overlapping extent blocks - */ - if (first == NULL && last == NULL) { - /* nothing, just make and insert new extent */ - - ext = mk_extent_tree_nodes(startblock, blockcount, XR_E_MULT); - - if (avl_insert(extent_tree_ptrs[agno], - (avlnode_t *) ext) == NULL) { - do_error(_("duplicate extent range\n")); - } - - return; - } - - ASSERT(first != NULL && last != NULL); - - 
/* - * find the new composite range, delete old extent nodes - * as we go - */ - new_startblock = startblock; - new_blockcount = blockcount; - - for (ext = first; - ext != (extent_tree_node_t *) last->avl_node.avl_nextino; - ext = next_ext) { - /* - * preserve the next inorder node - */ - next_ext = (extent_tree_node_t *) ext->avl_node.avl_nextino; - /* - * just bail if the new extent is contained within an old one - */ - if (ext->ex_startblock <= startblock && - ext->ex_blockcount >= blockcount) - return; - /* - * now check for overlaps and adjacent extents - */ - if (ext->ex_startblock + ext->ex_blockcount >= startblock - || ext->ex_startblock <= startblock + blockcount) { - - if (ext->ex_startblock < new_startblock) - new_startblock = ext->ex_startblock; - - if (ext->ex_startblock + ext->ex_blockcount > - new_startblock + new_blockcount) - new_blockcount = ext->ex_startblock + - ext->ex_blockcount - - new_startblock; - - avl_delete(extent_tree_ptrs[agno], (avlnode_t *) ext); - continue; - } - } - - ext = mk_extent_tree_nodes(new_startblock, new_blockcount, XR_E_MULT); - - if (avl_insert(extent_tree_ptrs[agno], (avlnode_t *) ext) == NULL) { - do_error(_("duplicate extent range\n")); - } - - return; -} - static __psunsigned_t avl_ext_start(avlnode_t *node) { @@ -901,10 +849,9 @@ incore_ext_init(xfs_mount_t *mp) pthread_mutex_init(&rt_ext_tree_lock, NULL); pthread_mutex_init(&rt_ext_flist_lock, NULL); - if ((extent_tree_ptrs = malloc(agcount * - sizeof(avltree_desc_t *))) == NULL) - do_error( - _("couldn't malloc dup extent tree descriptor table\n")); + dup_extent_trees = calloc(agcount, sizeof(struct btree_root *)); + if (!dup_extent_trees) + do_error(_("couldn't malloc dup extent tree descriptor table\n")); if ((extent_bno_ptrs = malloc(agcount * sizeof(avltree_desc_t *))) == NULL) @@ -917,10 +864,6 @@ incore_ext_init(xfs_mount_t *mp) _("couldn't malloc free by-bcnt extent tree descriptor table\n")); for (i = 0; i < agcount; i++) { - if ((extent_tree_ptrs[i] = - 
malloc(sizeof(avltree_desc_t))) == NULL) - do_error( - _("couldn't malloc dup extent tree descriptor\n")); if ((extent_bno_ptrs[i] = malloc(sizeof(avltree_desc_t))) == NULL) do_error( @@ -932,7 +875,7 @@ incore_ext_init(xfs_mount_t *mp) } for (i = 0; i < agcount; i++) { - avl_init_tree(extent_tree_ptrs[i], &avl_extent_tree_ops); + btree_init(&dup_extent_trees[i]); avl_init_tree(extent_bno_ptrs[i], &avl_extent_tree_ops); avl_init_tree(extent_bcnt_ptrs[i], &avl_extent_bcnt_tree_ops); } @@ -959,18 +902,18 @@ incore_ext_teardown(xfs_mount_t *mp) free_allocations(ba_list); for (i = 0; i < mp->m_sb.sb_agcount; i++) { - free(extent_tree_ptrs[i]); + btree_destroy(dup_extent_trees[i]); free(extent_bno_ptrs[i]); free(extent_bcnt_ptrs[i]); } + free(dup_extent_trees); free(extent_bcnt_ptrs); free(extent_bno_ptrs); - free(extent_tree_ptrs); - extent_bcnt_ptrs = extent_bno_ptrs = extent_tree_ptrs = NULL; - - return; + dup_extent_trees = NULL; + extent_bcnt_ptrs = NULL; + extent_bno_ptrs = NULL; } int Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-08-21 15:11:16.000000000 +0000 +++ xfsprogs-dev/repair/scan.c 2009-08-21 15:23:51.000000000 +0000 @@ -286,8 +286,9 @@ _("bad back (left) sibling pointer (saw * filesystem */ if (type != XR_INO_RTDATA || whichfork != XFS_DATA_FORK) { - if (search_dup_extent(mp, XFS_FSB_TO_AGNO(mp, bno), - XFS_FSB_TO_AGBNO(mp, bno))) + if (search_dup_extent(XFS_FSB_TO_AGNO(mp, bno), + XFS_FSB_TO_AGBNO(mp, bno), + XFS_FSB_TO_AGBNO(mp, bno) + 1)) return(1); } else { if (search_rt_dup_extent(mp, bno)) From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:43 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_36, J_CHICKENPOX_66 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com 
[192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HwIhD034411 for ; Wed, 2 Sep 2009 12:58:33 -0500 X-ASG-Debug-ID: 1251914322-70a403820000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7320341E03E for ; Wed, 2 Sep 2009 10:58:42 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id HlfjgbUQXbxSolHH for ; Wed, 02 Sep 2009 10:58:42 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6T-0006ZM-Vq; Wed, 02 Sep 2009 17:58:42 +0000 Message-Id: <20090902175841.875447973@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:43 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 12/14] repair: switch block usage bitmap to a btree Subject: [PATCH 12/14] repair: switch block usage bitmap to a btree References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-bmap_extents-btree X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914322 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Using a btree to represent the extents is much more space efficient than using a bitmap tracking every single block. It also allows more efficient algorithms that check range overlaps instead of walking every block in various places. Also move the RT tracking bitmap into incore.c instead of leaving it as macros - this keeps the implementation contained.
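For illustration only, here is a minimal sketch of why an extent lookup lets callers skip whole runs of blocks, in the same style as the get_bmap_ext(agno, cur_agbno, chunk_stop_agbno, &blen) loops in the diff below. It is not the btree code from this series: a small sorted array stands in for the btree, and struct ext, enum blk_state, ext_state() and count_state() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block states, loosely modelled on the XR_E_* values. */
enum blk_state { ST_UNKNOWN = 0, ST_FREE, ST_INUSE, ST_INO };

/* One run of consecutive blocks sharing a state: [start, start + len). */
struct ext {
	unsigned int	start;
	unsigned int	len;
	enum blk_state	state;
};

/*
 * Return the state of block 'start' and store in *blen the length of the
 * run found, beginning at 'start' and stopping before 'end'.
 * 'tbl' must be sorted by start and non-overlapping; blocks not covered
 * by any entry are reported as ST_UNKNOWN.
 */
static enum blk_state
ext_state(const struct ext *tbl, size_t n, unsigned int start,
	  unsigned int end, unsigned int *blen)
{
	unsigned int	stop = end;
	size_t		i;

	assert(start < end);
	for (i = 0; i < n; i++) {
		unsigned int	e_end = tbl[i].start + tbl[i].len;

		if (e_end <= start)
			continue;	/* extent lies entirely before 'start' */
		if (tbl[i].start > start) {
			/* gap: unknown blocks up to the next extent */
			if (tbl[i].start < stop)
				stop = tbl[i].start;
			break;
		}
		/* this extent covers 'start' */
		if (e_end < stop)
			stop = e_end;
		*blen = stop - start;
		return tbl[i].state;
	}
	*blen = stop - start;
	return ST_UNKNOWN;
}

/*
 * Count the blocks in [start, end) whose state is 'want', advancing one
 * run at a time. *calls receives the number of lookups performed.
 */
static unsigned int
count_state(const struct ext *tbl, size_t n, unsigned int start,
	    unsigned int end, enum blk_state want, unsigned int *calls)
{
	unsigned int	cur, blen, total = 0;

	*calls = 0;
	for (cur = start; cur < end; cur += blen) {
		enum blk_state	st = ext_state(tbl, n, cur, end, &blen);

		(*calls)++;
		if (st == want)
			total += blen;
	}
	return total;
}
```

With three extents describing 75 blocks inside a 200 block range, count_state() needs 6 lookups where a per-block walk would need 200. The linear scan in ext_state() is only there to keep the sketch short; a real implementation would use an ordered tree lookup.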
Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/dino_chunks.c =================================================================== --- xfsprogs-dev.orig/repair/dino_chunks.c 2009-09-02 14:51:09.449268859 -0300 +++ xfsprogs-dev/repair/dino_chunks.c 2009-09-02 14:51:18.593298964 -0300 @@ -118,6 +118,7 @@ verify_inode_chunk(xfs_mount_t *mp, int i; int j; int state; + xfs_extlen_t blen; agno = XFS_INO_TO_AGNO(mp, ino); agino = XFS_INO_TO_AGINO(mp, ino); @@ -433,9 +434,10 @@ verify_inode_chunk(xfs_mount_t *mp, * entry or an iunlinked pointer */ pthread_mutex_lock(&ag_locks[agno]); - for (j = 0, cur_agbno = chunk_start_agbno; - cur_agbno < chunk_stop_agbno; cur_agbno++) { - state = get_bmap(agno, cur_agbno); + for (cur_agbno = chunk_start_agbno; + cur_agbno < chunk_stop_agbno; + cur_agbno += blen) { + state = get_bmap_ext(agno, cur_agbno, chunk_stop_agbno, &blen); switch (state) { case XR_E_MULT: case XR_E_INUSE: @@ -444,9 +446,9 @@ verify_inode_chunk(xfs_mount_t *mp, do_warn( _("inode block %d/%d multiply claimed, (state %d)\n"), agno, cur_agbno, state); - set_bmap(agno, cur_agbno, XR_E_MULT); - j = 1; - break; + set_bmap_ext(agno, cur_agbno, blen, XR_E_MULT); + pthread_mutex_unlock(&ag_locks[agno]); + return 0; case XR_E_INO: do_error( _("uncertain inode block overlap, agbno = %d, ino = %llu\n"), @@ -455,11 +457,6 @@ verify_inode_chunk(xfs_mount_t *mp, default: break; } - - if (j) { - pthread_mutex_unlock(&ag_locks[agno]); - return(0); - } } pthread_mutex_unlock(&ag_locks[agno]); @@ -487,8 +484,9 @@ verify_inode_chunk(xfs_mount_t *mp, pthread_mutex_lock(&ag_locks[agno]); for (cur_agbno = chunk_start_agbno; - cur_agbno < chunk_stop_agbno; cur_agbno++) { - state = get_bmap(agno, cur_agbno); + cur_agbno < chunk_stop_agbno; + cur_agbno += blen) { + state = get_bmap_ext(agno, cur_agbno, chunk_stop_agbno, &blen); switch (state) { case XR_E_INO: do_error( @@ -498,7 +496,7 @@ verify_inode_chunk(xfs_mount_t *mp, case XR_E_UNKNOWN: case 
XR_E_FREE1: case XR_E_FREE: - set_bmap(agno, cur_agbno, XR_E_INO); + set_bmap_ext(agno, cur_agbno, blen, XR_E_INO); break; case XR_E_MULT: case XR_E_INUSE: @@ -512,7 +510,7 @@ verify_inode_chunk(xfs_mount_t *mp, do_warn( _("inode block %d/%d bad state, (state %d)\n"), agno, cur_agbno, state); - set_bmap(agno, cur_agbno, XR_E_INO); + set_bmap_ext(agno, cur_agbno, blen, XR_E_INO); break; } } Index: xfsprogs-dev/repair/dinode.c =================================================================== --- xfsprogs-dev.orig/repair/dinode.c 2009-09-02 14:51:09.457268829 -0300 +++ xfsprogs-dev/repair/dinode.c 2009-09-02 14:51:18.593298964 -0300 @@ -524,6 +524,7 @@ process_rt_rec( /* * set the appropriate number of extents + * this iterates block by block, this can be optimised using extents */ for (b = irec->br_startblock; b < irec->br_startblock + irec->br_blockcount; b += mp->m_sb.sb_rextsize) { @@ -614,9 +615,10 @@ process_bmbt_reclist_int( char *forkname; int i; int state; - xfs_dfsbno_t e; xfs_agnumber_t agno; xfs_agblock_t agbno; + xfs_agblock_t ebno; + xfs_extlen_t blen; xfs_agnumber_t locked_agno = -1; int error = 1; @@ -718,7 +720,7 @@ process_bmbt_reclist_int( */ agno = XFS_FSB_TO_AGNO(mp, irec.br_startblock); agbno = XFS_FSB_TO_AGBNO(mp, irec.br_startblock); - e = irec.br_startblock + irec.br_blockcount; + ebno = agbno + irec.br_blockcount; if (agno != locked_agno) { if (locked_agno != -1) pthread_mutex_unlock(&ag_locks[locked_agno]); @@ -733,7 +735,9 @@ process_bmbt_reclist_int( * checking each entry without setting the * block bitmap */ - for (b = irec.br_startblock; b < e; b++, agbno++) { + for (b = irec.br_startblock; + agbno < ebno; + b++, agbno++) { if (search_dup_extent(mp, agno, agbno)) { do_warn(_("%s fork in ino %llu claims " "dup extent, off - %llu, " @@ -748,22 +752,10 @@ process_bmbt_reclist_int( continue; } - for (b = irec.br_startblock; b < e; b++, agbno++) { - /* - * Process in chunks of 16 (XR_BB_UNIT/XR_BB) - * for common XR_E_UNKNOWN to XR_E_INUSE 
transition - */ - if (((agbno & XR_BB_MASK) == 0) && ((irec.br_startblock + irec.br_blockcount - b) >= (XR_BB_UNIT/XR_BB))) { - if (ba_bmap[agno][agbno>>XR_BB] == XR_E_UNKNOWN_LL) { - ba_bmap[agno][agbno>>XR_BB] = XR_E_INUSE_LL; - agbno += (XR_BB_UNIT/XR_BB) - 1; - b += (XR_BB_UNIT/XR_BB) - 1; - continue; - } - - } - - state = get_bmap(agno, agbno); + for (b = irec.br_startblock; + agbno < ebno; + b += blen, agbno += blen) { + state = get_bmap_ext(agno, agbno, ebno, &blen); switch (state) { case XR_E_FREE: case XR_E_FREE1: @@ -772,7 +764,7 @@ process_bmbt_reclist_int( forkname, ino, (__uint64_t) b); /* fall through ... */ case XR_E_UNKNOWN: - set_bmap(agno, agbno, XR_E_INUSE); + set_bmap_ext(agno, agbno, blen, XR_E_INUSE); break; case XR_E_BAD_STATE: @@ -788,7 +780,7 @@ process_bmbt_reclist_int( case XR_E_INUSE: case XR_E_MULT: - set_bmap(agno, agbno, XR_E_MULT); + set_bmap_ext(agno, agbno, blen, XR_E_MULT); do_warn(_("%s fork in %s inode %llu claims " "used block %llu\n"), forkname, ftype, ino, (__uint64_t) b); Index: xfsprogs-dev/repair/globals.h =================================================================== --- xfsprogs-dev.orig/repair/globals.h 2009-09-02 14:51:09.461268919 -0300 +++ xfsprogs-dev/repair/globals.h 2009-09-02 14:51:18.597292070 -0300 @@ -156,11 +156,6 @@ EXTERN int chunks_pblock; /* # of 64-in EXTERN int max_symlink_blocks; EXTERN __int64_t fs_max_file_offset; -/* block allocation bitmaps */ - -EXTERN __uint64_t **ba_bmap; /* see incore.h */ -EXTERN __uint64_t *rt_ba_bmap; /* see incore.h */ - /* realtime info */ EXTERN xfs_rtword_t *btmcompute; Index: xfsprogs-dev/repair/phase2.c =================================================================== --- xfsprogs-dev.orig/repair/phase2.c 2009-09-02 14:51:09.465298621 -0300 +++ xfsprogs-dev/repair/phase2.c 2009-09-02 14:51:18.605297206 -0300 @@ -109,7 +109,6 @@ void phase2(xfs_mount_t *mp) { xfs_agnumber_t i; - xfs_agblock_t b; int j; ino_tree_node_t *ino_rec; @@ -169,11 +168,8 @@ 
phase2(xfs_mount_t *mp) /* * also mark blocks */ - for (b = 0; b < mp->m_ialloc_blks; b++) { - set_bmap(0, - b + XFS_INO_TO_AGBNO(mp, mp->m_sb.sb_rootino), - XR_E_INO); - } + set_bmap_ext(0, XFS_INO_TO_AGBNO(mp, mp->m_sb.sb_rootino), + mp->m_ialloc_blks, XR_E_INO); } else { do_log(_(" - found root inode chunk\n")); Index: xfsprogs-dev/repair/phase4.c =================================================================== --- xfsprogs-dev.orig/repair/phase4.c 2009-09-02 14:51:09.533268366 -0300 +++ xfsprogs-dev/repair/phase4.c 2009-09-02 14:51:18.609296598 -0300 @@ -192,8 +192,7 @@ phase4(xfs_mount_t *mp) xfs_agnumber_t i; xfs_agblock_t j; xfs_agblock_t ag_end; - xfs_agblock_t extent_start; - xfs_extlen_t extent_len; + xfs_extlen_t blen; int ag_hdr_len = 4 * mp->m_sb.sb_sectsize; int ag_hdr_block; int bstate; @@ -226,29 +225,13 @@ phase4(xfs_mount_t *mp) ag_end = (i < mp->m_sb.sb_agcount - 1) ? mp->m_sb.sb_agblocks : mp->m_sb.sb_dblocks - (xfs_drfsbno_t) mp->m_sb.sb_agblocks * i; - extent_start = extent_len = 0; + /* * set up duplicate extent list for this ag */ - for (j = ag_hdr_block; j < ag_end; j++) { - - /* Process in chunks of 16 (XR_BB_UNIT/XR_BB) */ - if ((extent_start == 0) && ((j & XR_BB_MASK) == 0)) { - switch(ba_bmap[i][j>>XR_BB]) { - case XR_E_UNKNOWN_LL: - case XR_E_FREE1_LL: - case XR_E_FREE_LL: - case XR_E_INUSE_LL: - case XR_E_INUSE_FS_LL: - case XR_E_INO_LL: - case XR_E_FS_MAP_LL: - j += (XR_BB_UNIT/XR_BB) - 1; - continue; - } - } - - bstate = get_bmap(i, j); - switch (bstate) { + for (j = ag_hdr_block; j < ag_end; j += blen) { + bstate = get_bmap_ext(i, j, ag_end, &blen); + switch (bstate) { case XR_E_BAD_STATE: default: do_warn( @@ -262,37 +245,13 @@ phase4(xfs_mount_t *mp) case XR_E_INUSE_FS: case XR_E_INO: case XR_E_FS_MAP: - if (extent_start == 0) - continue; - else { - /* - * add extent and reset extent state - */ - add_dup_extent(i, extent_start, - extent_len); - extent_start = 0; - extent_len = 0; - } break; case XR_E_MULT: - if (extent_start 
== 0) { - extent_start = j; - extent_len = 1; - } else if (extent_len == MAXEXTLEN) { - add_dup_extent(i, extent_start, - extent_len); - extent_start = j; - extent_len = 1; - } else - extent_len++; + add_dup_extent(i, j, blen); break; } } - /* - * catch tail-case, extent hitting the end of the ag - */ - if (extent_start != 0) - add_dup_extent(i, extent_start, extent_len); + PROG_RPT_INC(prog_rpt_done[i], 1); } print_final_rpt(); Index: xfsprogs-dev/repair/phase5.c =================================================================== --- xfsprogs-dev.orig/repair/phase5.c 2009-09-02 14:51:09.561269620 -0300 +++ xfsprogs-dev/repair/phase5.c 2009-09-02 14:51:18.613269588 -0300 @@ -88,10 +88,8 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_ag xfs_agblock_t agbno; xfs_agblock_t ag_end; uint free_blocks; -#ifdef XR_BLD_FREE_TRACE - int old_state; - int state = XR_E_BAD_STATE; -#endif + xfs_extlen_t blen; + int bstate; /* * scan the bitmap for the ag looking for continuous @@ -120,30 +118,10 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_ag * ok, now find the number of extents, keep track of the * largest extent. 
*/ - for (agbno = 0; agbno < ag_end; agbno++) { -#if 0 - old_state = state; - state = get_bmap(agno, agbno); - if (state != old_state) { - fprintf(stderr, "agbno %u - new state is %d\n", - agbno, state); - } -#endif - /* Process in chunks of 16 (XR_BB_UNIT/XR_BB) */ - if ((in_extent == 0) && ((agbno & XR_BB_MASK) == 0)) { - /* testing >= XR_E_INUSE */ - switch (ba_bmap[agno][agbno>>XR_BB]) { - case XR_E_INUSE_LL: - case XR_E_INUSE_FS_LL: - case XR_E_INO_LL: - case XR_E_FS_MAP_LL: - agbno += (XR_BB_UNIT/XR_BB) - 1; - continue; - } - - } - if (get_bmap(agno, agbno) < XR_E_INUSE) { - free_blocks++; + for (agbno = 0; agbno < ag_end; agbno += blen) { + bstate = get_bmap_ext(agno, agbno, ag_end, &blen); + if (bstate < XR_E_INUSE) { + free_blocks += blen; if (in_extent == 0) { /* * found the start of a free extent @@ -151,9 +129,9 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_ag in_extent = 1; num_extents++; extent_start = agbno; - extent_len = 1; + extent_len = blen; } else { - extent_len++; + extent_len += blen; } } else { if (in_extent) { Index: xfsprogs-dev/repair/incore.c =================================================================== --- xfsprogs-dev.orig/repair/incore.c 2009-09-02 14:51:09.565269570 -0300 +++ xfsprogs-dev/repair/incore.c 2009-09-02 14:51:29.072772399 -0300 @@ -18,6 +18,7 @@ #include #include "avl.h" +#include "btree.h" #include "globals.h" #include "incore.h" #include "agheader.h" @@ -52,14 +53,192 @@ free_allocations(ba_rec_t *list) return; } +/* + * The following manages the in-core bitmap of the entire filesystem + * using extents in a btree. + * + * The btree items will point to one of the state values below, + * rather than storing the value itself in the pointer. 
+ */ +static int states[16] = + {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}; + +static struct btree_root **ag_bmap; + +static void +update_bmap( + struct btree_root *bmap, + unsigned long offset, + xfs_extlen_t blen, + void *new_state) +{ + unsigned long end = offset + blen; + int *cur_state; + unsigned long cur_key; + int *next_state; + unsigned long next_key; + int *prev_state; + + cur_state = btree_find(bmap, offset, &cur_key); + if (!cur_state) + return; + + if (offset == cur_key) { + /* if the start is the same as the "item" extent */ + if (cur_state == new_state) + return; + + /* + * Note: this may be NULL if we are updating the map for + * the superblock. + */ + prev_state = btree_peek_prev(bmap, NULL); + + next_state = btree_peek_next(bmap, &next_key); + if (next_key > end) { + /* different end */ + if (new_state == prev_state) { + /* #1: prev has same state, move offset up */ + btree_update_key(bmap, offset, end); + return; + } + + /* #4: insert new extent after, update current value */ + btree_update_value(bmap, offset, new_state); + btree_insert(bmap, end, cur_state); + return; + } + + /* same end (and same start) */ + if (new_state == next_state) { + /* next has same state */ + if (new_state == prev_state) { + /* #3: merge prev & next */ + btree_delete(bmap, offset); + btree_delete(bmap, end); + return; + } + + /* #8: merge next */ + btree_update_value(bmap, offset, new_state); + btree_delete(bmap, end); + return; + } + + /* same start, same end, next has different state */ + if (new_state == prev_state) { + /* #5: prev has same state */ + btree_delete(bmap, offset); + return; + } + + /* #6: update value only */ + btree_update_value(bmap, offset, new_state); + return; + } + + /* different start, offset is in the middle of "cur" */ + prev_state = btree_peek_prev(bmap, NULL); + ASSERT(prev_state != NULL); + if (prev_state == new_state) + return; + + if (end == cur_key) { + /* end is at the same point as the current extent */ + if (new_state == 
cur_state) { + /* #7: move next extent down */ + btree_update_key(bmap, end, offset); + return; + } + + /* #9: different start, same end, add new extent */ + btree_insert(bmap, offset, new_state); + return; + } + + /* #2: insert an extent into the middle of another extent */ + btree_insert(bmap, offset, new_state); + btree_insert(bmap, end, prev_state); +} + +void +set_bmap_ext( + xfs_agnumber_t agno, + xfs_agblock_t agbno, + xfs_extlen_t blen, + int state) +{ + update_bmap(ag_bmap[agno], agbno, blen, &states[state]); +} + +int +get_bmap_ext( + xfs_agnumber_t agno, + xfs_agblock_t agbno, + xfs_agblock_t maxbno, + xfs_extlen_t *blen) +{ + int *statep; + unsigned long key; + + statep = btree_find(ag_bmap[agno], agbno, &key); + if (!statep) + return -1; + + if (key == agbno) { + if (blen) { + if (!btree_peek_next(ag_bmap[agno], &key)) + return -1; + *blen = MIN(maxbno, key) - agbno; + } + return *statep; + } + + statep = btree_peek_prev(ag_bmap[agno], NULL); + if (!statep) + return -1; + if (blen) + *blen = MIN(maxbno, key) - agbno; + + return *statep; +} +static uint64_t *rt_bmap; static size_t rt_bmap_size; +/* block records fit into __uint64_t's units */ +#define XR_BB_UNIT 64 /* number of bits/unit */ +#define XR_BB 4 /* bits per block record */ +#define XR_BB_NUM (XR_BB_UNIT/XR_BB) /* number of records per unit */ +#define XR_BB_MASK 0xF /* block record mask */ + +/* + * these work in real-time extents (e.g. 
fsbno == rt extent number) + */ +int +get_rtbmap( + xfs_drtbno_t bno) +{ + return (*(rt_bmap + bno / XR_BB_NUM) >> + ((bno % XR_BB_NUM) * XR_BB)) & XR_BB_MASK; +} + +void +set_rtbmap( + xfs_drtbno_t bno, + int state) +{ + *(rt_bmap + bno / XR_BB_NUM) = + ((*(rt_bmap + bno / XR_BB_NUM) & + (~((__uint64_t) XR_BB_MASK << ((bno % XR_BB_NUM) * XR_BB)))) | + (((__uint64_t) state) << ((bno % XR_BB_NUM) * XR_BB))); +} + static void reset_rt_bmap(void) { - if (rt_ba_bmap) - memset(rt_ba_bmap, 0x22, rt_bmap_size); /* XR_E_FREE */ + if (rt_bmap) + memset(rt_bmap, 0x22, rt_bmap_size); /* XR_E_FREE */ } static void @@ -72,8 +251,8 @@ init_rt_bmap( rt_bmap_size = roundup(mp->m_sb.sb_rextents / (NBBY / XR_BB), sizeof(__uint64_t)); - rt_ba_bmap = memalign(sizeof(__uint64_t), rt_bmap_size); - if (!rt_ba_bmap) { + rt_bmap = memalign(sizeof(__uint64_t), rt_bmap_size); + if (!rt_bmap) { do_error( _("couldn't allocate realtime block map, size = %llu\n"), mp->m_sb.sb_rextents); @@ -84,8 +263,8 @@ init_rt_bmap( static void free_rt_bmap(xfs_mount_t *mp) { - free(rt_ba_bmap); - rt_ba_bmap = NULL; + free(rt_bmap); + rt_bmap = NULL; } @@ -93,28 +272,41 @@ void reset_bmaps(xfs_mount_t *mp) { xfs_agnumber_t agno; + xfs_agblock_t ag_size; int ag_hdr_block; - int i; ag_hdr_block = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize); + ag_size = mp->m_sb.sb_agblocks; - for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) { - memset(ba_bmap[agno], 0, - roundup((mp->m_sb.sb_agblocks + (NBBY / XR_BB) - 1) / - (NBBY / XR_BB), sizeof(__uint64_t))); - for (i = 0; i < ag_hdr_block; i++) - set_bmap(agno, i, XR_E_INUSE_FS); + for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) { + if (agno == mp->m_sb.sb_agcount - 1) + ag_size = (xfs_extlen_t)(mp->m_sb.sb_dblocks - + (xfs_drfsbno_t)mp->m_sb.sb_agblocks * agno); +#ifdef BTREE_STATS + if (btree_find(ag_bmap[agno], 0, NULL)) { + printf("ag_bmap[%d] btree stats:\n", i); + btree_print_stats(ag_bmap[agno], stdout); + } +#endif + /* + * We always insert an 
item for the first block having a + * given state. So the code below means: + * + * block 0..ag_hdr_block-1: XR_E_INUSE_FS + * ag_hdr_block..ag_size: XR_E_UNKNOWN + * ag_size... XR_E_BAD_STATE + */ + btree_clear(ag_bmap[agno]); + btree_insert(ag_bmap[agno], 0, &states[XR_E_INUSE_FS]); + btree_insert(ag_bmap[agno], + ag_hdr_block, &states[XR_E_UNKNOWN]); + btree_insert(ag_bmap[agno], ag_size, &states[XR_E_BAD_STATE]); } if (mp->m_sb.sb_logstart != 0) { - xfs_dfsbno_t logend; - - logend = mp->m_sb.sb_logstart + mp->m_sb.sb_logblocks; - - for (i = mp->m_sb.sb_logstart; i < logend ; i++) { - set_bmap(XFS_FSB_TO_AGNO(mp, i), - XFS_FSB_TO_AGBNO(mp, i), XR_E_INUSE_FS); - } + set_bmap_ext(XFS_FSB_TO_AGNO(mp, mp->m_sb.sb_logstart), + XFS_FSB_TO_AGBNO(mp, mp->m_sb.sb_logstart), + mp->m_sb.sb_logblocks, XR_E_INUSE_FS); } reset_rt_bmap(); @@ -123,30 +315,18 @@ reset_bmaps(xfs_mount_t *mp) void init_bmaps(xfs_mount_t *mp) { - xfs_agblock_t numblocks = mp->m_sb.sb_agblocks; - int agcount = mp->m_sb.sb_agcount; - int i; - size_t size = 0; - - ba_bmap = calloc(agcount, sizeof(__uint64_t *)); - if (!ba_bmap) - do_error(_("couldn't allocate block map pointers\n")); + xfs_agnumber_t i; - ag_locks = calloc(agcount, sizeof(pthread_mutex_t)); + ag_bmap = calloc(mp->m_sb.sb_agcount, sizeof(struct btree_root *)); + if (!ag_bmap) + do_error(_("couldn't allocate block map btree roots\n")); + + ag_locks = calloc(mp->m_sb.sb_agcount, sizeof(pthread_mutex_t)); if (!ag_locks) do_error(_("couldn't allocate block map locks\n")); - for (i = 0; i < agcount; i++) { - size = roundup((numblocks+(NBBY/XR_BB)-1) / (NBBY/XR_BB), - sizeof(__uint64_t)); - - ba_bmap[i] = memalign(sizeof(__uint64_t), size); - if (!ba_bmap[i]) { - do_error(_("couldn't allocate block map, size = %d\n"), - numblocks); - return; - } - memset(ba_bmap[i], 0, size); + for (i = 0; i < mp->m_sb.sb_agcount; i++) { + btree_init(&ag_bmap[i]); pthread_mutex_init(&ag_locks[i], NULL); } @@ -160,9 +340,9 @@ free_bmaps(xfs_mount_t *mp) 
xfs_agnumber_t i; for (i = 0; i < mp->m_sb.sb_agcount; i++) - free(ba_bmap[i]); - free(ba_bmap); - ba_bmap = NULL; + btree_destroy(ag_bmap[i]); + free(ag_bmap); + ag_bmap = NULL; free_rt_bmap(mp); } Index: xfsprogs-dev/repair/incore.h =================================================================== --- xfsprogs-dev.orig/repair/incore.h 2009-09-02 14:51:09.573269190 -0300 +++ xfsprogs-dev/repair/incore.h 2009-09-02 14:51:18.621298890 -0300 @@ -37,59 +37,32 @@ void record_allocation(ba_rec_t *addr, void free_allocations(ba_rec_t *list); /* - * block bit map defs -- track state of each filesystem block. - * ba_bmap is an array of bitstrings declared in the globals.h file. - * the bitstrings are broken up into 64-bit chunks. one bitstring per AG. - */ -#define BA_BMAP_SIZE(x) (howmany(x, 4)) - -void init_bmaps(xfs_mount_t *mp); -void reset_bmaps(xfs_mount_t *mp); -void free_bmaps(xfs_mount_t *mp); - - -/* blocks are numbered from zero */ - -/* block records fit into __uint64_t's units */ - -#define XR_BB_UNIT 64 /* number of bits/unit */ -#define XR_BB 4 /* bits per block record */ -#define XR_BB_NUM (XR_BB_UNIT/XR_BB) /* number of records per unit */ -#define XR_BB_MASK 0xF /* block record mask */ - -/* - * bitstring ops -- set/get block states, either in filesystem - * bno's or in agbno's. turns out that fsbno addressing is - * more convenient when dealing with bmap extracted addresses - * and agbno addressing is more convenient when dealing with - * meta-data extracted addresses. So the fsbno versions use - * mtype (which can be one of the block map types above) to - * set the correct block map while the agbno versions assume - * you want to use the regular block map. 
- */ - -#define get_bmap(agno, ag_blockno) \ - ((int) (*(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM) \ - >> (((ag_blockno)%XR_BB_NUM)*XR_BB)) \ - & XR_BB_MASK) -#define set_bmap(agno, ag_blockno, state) \ - *(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM) = \ - ((*(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM) & \ - (~((__uint64_t) XR_BB_MASK << (((ag_blockno)%XR_BB_NUM)*XR_BB)))) | \ - (((__uint64_t) (state)) << (((ag_blockno)%XR_BB_NUM)*XR_BB))) - -/* - * these work in real-time extents (e.g. fsbno == rt extent number) - */ -#define get_rtbmap(fsbno) \ - ((*(rt_ba_bmap + (fsbno)/XR_BB_NUM) >> \ - (((fsbno)%XR_BB_NUM)*XR_BB)) & XR_BB_MASK) -#define set_rtbmap(fsbno, state) \ - *(rt_ba_bmap + (fsbno)/XR_BB_NUM) = \ - ((*(rt_ba_bmap + (fsbno)/XR_BB_NUM) & \ - (~((__uint64_t) XR_BB_MASK << (((fsbno)%XR_BB_NUM)*XR_BB)))) | \ - (((__uint64_t) (state)) << (((fsbno)%XR_BB_NUM)*XR_BB))) + * block map -- track state of each filesystem block. + */ + +void init_bmaps(xfs_mount_t *mp); +void reset_bmaps(xfs_mount_t *mp); +void free_bmaps(xfs_mount_t *mp); + +void set_bmap_ext(xfs_agnumber_t agno, xfs_agblock_t agbno, + xfs_extlen_t blen, int state); +int get_bmap_ext(xfs_agnumber_t agno, xfs_agblock_t agbno, + xfs_agblock_t maxbno, xfs_extlen_t *blen); +void set_rtbmap(xfs_drtbno_t bno, int state); +int get_rtbmap(xfs_drtbno_t bno); + +static inline void +set_bmap(xfs_agnumber_t agno, xfs_agblock_t agbno, int state) +{ + set_bmap_ext(agno, agbno, 1, state); +} + +static inline int +get_bmap(xfs_agnumber_t agno, xfs_agblock_t agbno) +{ + return get_bmap_ext(agno, agbno, agbno + 1, NULL); +} /* * extent tree definitions Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-09-02 14:51:09.577269000 -0300 +++ xfsprogs-dev/repair/scan.c 2009-09-02 14:51:18.629269735 -0300 @@ -509,7 +509,7 @@ _("%s freespace btree block claimed (sta rp = XFS_ALLOC_REC_ADDR(mp, block, 1); for (i = 0; i < numrecs; i++) { 
xfs_agblock_t b, end; - xfs_extlen_t len; + xfs_extlen_t len, blen; b = be32_to_cpu(rp[i].ar_startblock); len = be32_to_cpu(rp[i].ar_blockcount); @@ -522,8 +522,8 @@ _("%s freespace btree block claimed (sta if (!verify_agbno(mp, agno, end - 1)) continue; - for ( ; b < end; b++) { - state = get_bmap(agno, b); + for ( ; b < end; b += blen) { + state = get_bmap_ext(agno, b, end, &blen); switch (state) { case XR_E_UNKNOWN: set_bmap(agno, b, XR_E_FREE1); @@ -534,13 +534,15 @@ _("%s freespace btree block claimed (sta * FREE1 blocks later */ if (magic != XFS_ABTB_MAGIC) { - set_bmap(agno, b, XR_E_FREE); + set_bmap_ext(agno, b, blen, + XR_E_FREE); break; } default: do_warn( - _("block (%d,%d) multiply claimed by %s space tree, state - %d\n"), - agno, b, name, state); + _("block (%d,%d-%d) multiply claimed by %s space tree, state - %d\n"), + agno, b, b + blen - 1, + name, state); break; } } From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:43 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HwIEw034420 for ; Wed, 2 Sep 2009 12:58:33 -0500 X-ASG-Debug-ID: 1251914322-728a03d00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id B950A41E03E for ; Wed, 2 Sep 2009 10:58:42 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id 2DwF15ZUUZofuQ8P for ; Wed, 02 Sep 2009 10:58:42 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6U-0006bD-BA; Wed, 02 Sep 2009 17:58:42 +0000 Message-Id: 
<20090902175842.262611292@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:45 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 14/14] repair: add missing locking in scanfunc_bmap Subject: [PATCH 14/14] repair: add missing locking in scanfunc_bmap References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-scanfunc_bmap-locking X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914322 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Make sure to protect access to the block usage tracking btree with the ag_lock. Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-08-20 03:16:13.000000000 +0000 +++ xfsprogs-dev/repair/scan.c 2009-08-20 03:18:17.000000000 +0000 @@ -235,6 +235,7 @@ agno = XFS_FSB_TO_AGNO(mp, bno); agbno = XFS_FSB_TO_AGBNO(mp, bno); + pthread_mutex_lock(&ag_locks[agno]); state = get_bmap(agno, agbno); switch (state) { case XR_E_UNKNOWN: @@ -280,6 +281,7 @@ state, ino, (__uint64_t) bno); break; } + pthread_mutex_unlock(&ag_locks[agno]); } else { /* * attribute fork for realtime files is in the regular From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:44 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_43, J_CHICKENPOX_64,J_CHICKENPOX_73 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 
0.8) with ESMTP id n82HwJlr034430 for ; Wed, 2 Sep 2009 12:58:34 -0500 X-ASG-Debug-ID: 1251914321-4f3802b80000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 29E6915B1A32 for ; Wed, 2 Sep 2009 10:58:42 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id wuSfvCMm22hRDUfe for ; Wed, 02 Sep 2009 10:58:42 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6T-0006Ym-Po; Wed, 02 Sep 2009 17:58:41 +0000 Message-Id: <20090902175841.711310240@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:42 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 11/14] repair: cleanup alloc/free/reset of the block usage tracking Subject: [PATCH 11/14] repair: cleanup alloc/free/reset of the block usage tracking References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-cleanup-bmap-helpers-2 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914322 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Currently the code to allocate, free and reset the block usage bitmaps is a complete mess. This patch reorganizes it into logical helpers. Details: - the current incore_init code is called just before phase2 is called, which then marks the log and the AG headers used. 
- we get rid of incore_init, and replace it with direct calls to the unchanged incore_ino_init/incore_ext_init functions and our new init_bmaps, which does all the allocations for the block usage tracking as well as a call to reset_bmaps to initialize it to the default values. - reset_bmaps is also called from early phase4 code to reset all state instead of open-coding it. - there is a new free_bmaps helper which we call to free our block usage bitmaps when we don't need them anymore after phase5. The current code frees some of it a bit early in phase5, but needs to take care of it in phase6 in case we didn't call phase5 due to nomodify mode, and leaks it if we don't call phase 6, which might happen in case of a bad inode allocation btree. Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/phase4.c =================================================================== --- xfsprogs-dev.orig/repair/phase4.c 2009-08-21 01:59:26.000000000 +0000 +++ xfsprogs-dev/repair/phase4.c 2009-08-21 02:41:44.000000000 +0000 @@ -355,19 +355,7 @@ phase4(xfs_mount_t *mp) /* * initialize bitmaps for all AGs */ - for (i = 0; i < mp->m_sb.sb_agcount; i++) { - /* - * now reset the bitmap for all ags - */ - memset(ba_bmap[i], 0, - roundup((mp->m_sb.sb_agblocks+(NBBY/XR_BB)-1)/(NBBY/XR_BB), - sizeof(__uint64_t))); - for (j = 0; j < ag_hdr_block; j++) - set_bmap(i, j, XR_E_INUSE_FS); - } - set_bmap_rt(mp->m_sb.sb_rextents); - set_bmap_log(mp); - set_bmap_fs(mp); + reset_bmaps(mp); do_log(_(" - check for inodes claiming duplicate blocks...\n")); set_progress_msg(PROG_FMT_DUP_BLOCKS, (__uint64_t) mp->m_sb.sb_icount); Index: xfsprogs-dev/repair/incore.c =================================================================== --- xfsprogs-dev.orig/repair/incore.c 2009-08-21 01:59:26.000000000 +0000 +++ xfsprogs-dev/repair/incore.c 2009-08-21 03:02:28.000000000 +0000 @@ -52,205 +52,117 @@ free_allocations(ba_rec_t *list) return; } -/* ba bmap setupstuff.
setting/getting state is in incore.h */ -void -setup_bmap(xfs_agnumber_t agno, xfs_agblock_t numblocks, xfs_drtbno_t rtblocks) -{ - int i; - size_t size = 0; +static size_t rt_bmap_size; - ba_bmap = (__uint64_t**)malloc(agno*sizeof(__uint64_t *)); - if (!ba_bmap) - do_error(_("couldn't allocate block map pointers\n")); - ag_locks = malloc(agno * sizeof(pthread_mutex_t)); - if (!ag_locks) - do_error(_("couldn't allocate block map locks\n")); - - for (i = 0; i < agno; i++) { - size = roundup((numblocks+(NBBY/XR_BB)-1) / (NBBY/XR_BB), - sizeof(__uint64_t)); - - ba_bmap[i] = (__uint64_t*)memalign(sizeof(__uint64_t), size); - if (!ba_bmap[i]) { - do_error(_("couldn't allocate block map, size = %d\n"), - numblocks); - return; - } - memset(ba_bmap[i], 0, size); - pthread_mutex_init(&ag_locks[i], NULL); - } +static void +reset_rt_bmap(void) +{ + if (rt_ba_bmap) + memset(rt_ba_bmap, 0x22, rt_bmap_size); /* XR_E_FREE */ +} - if (rtblocks == 0) { - rt_ba_bmap = NULL; +static void +init_rt_bmap( + xfs_mount_t *mp) +{ + if (mp->m_sb.sb_rextents == 0) return; - } - size = roundup(rtblocks / (NBBY/XR_BB), sizeof(__uint64_t)); + rt_bmap_size = roundup(mp->m_sb.sb_rextents / (NBBY / XR_BB), + sizeof(__uint64_t)); - rt_ba_bmap=(__uint64_t*)memalign(sizeof(__uint64_t), size); + rt_ba_bmap = memalign(sizeof(__uint64_t), rt_bmap_size); if (!rt_ba_bmap) { - do_error( + do_error( _("couldn't allocate realtime block map, size = %llu\n"), - rtblocks); - return; + mp->m_sb.sb_rextents); + return; } - - /* - * start all real-time as free blocks - */ - set_bmap_rt(rtblocks); - - return; } -/* ARGSUSED */ -void -teardown_rt_bmap(xfs_mount_t *mp) +static void +free_rt_bmap(xfs_mount_t *mp) { - if (rt_ba_bmap != NULL) { - free(rt_ba_bmap); - rt_ba_bmap = NULL; - } - - return; + free(rt_ba_bmap); + rt_ba_bmap = NULL; } -/* ARGSUSED */ -void -teardown_ag_bmap(xfs_mount_t *mp, xfs_agnumber_t agno) -{ - ASSERT(ba_bmap[agno] != NULL); - - free(ba_bmap[agno]); - ba_bmap[agno] = NULL; - - return; -} 
-/* ARGSUSED */ void -teardown_bmap_finish(xfs_mount_t *mp) +reset_bmaps(xfs_mount_t *mp) { - free(ba_bmap); - ba_bmap = NULL; - - return; -} + xfs_agnumber_t agno; + int ag_hdr_block; + int i; -void -teardown_bmap(xfs_mount_t *mp) -{ - xfs_agnumber_t i; + ag_hdr_block = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize); - for (i = 0; i < mp->m_sb.sb_agcount; i++) { - teardown_ag_bmap(mp, i); + for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) { + memset(ba_bmap[agno], 0, + roundup((mp->m_sb.sb_agblocks + (NBBY / XR_BB) - 1) / + (NBBY / XR_BB), sizeof(__uint64_t))); + for (i = 0; i < ag_hdr_block; i++) + set_bmap(agno, i, XR_E_INUSE_FS); } - teardown_rt_bmap(mp); - teardown_bmap_finish(mp); + if (mp->m_sb.sb_logstart != 0) { + xfs_dfsbno_t logend; - return; -} + logend = mp->m_sb.sb_logstart + mp->m_sb.sb_logblocks; -/* - * block map initialization routines -- realtime, log, fs - */ -void -set_bmap_rt(xfs_drtbno_t num) -{ - xfs_drtbno_t j; - xfs_drtbno_t size; - - /* - * for now, initialize all realtime blocks to be free - * (state == XR_E_FREE) - */ - size = howmany(num / (NBBY/XR_BB), sizeof(__uint64_t)); - - for (j = 0; j < size; j++) - rt_ba_bmap[j] = 0x2222222222222222LL; - - return; -} - -void -set_bmap_log(xfs_mount_t *mp) -{ - xfs_dfsbno_t logend, i; - - if (mp->m_sb.sb_logstart == 0) - return; - - logend = mp->m_sb.sb_logstart + mp->m_sb.sb_logblocks; - - for (i = mp->m_sb.sb_logstart; i < logend ; i++) { - set_bmap(XFS_FSB_TO_AGNO(mp, i), - XFS_FSB_TO_AGBNO(mp, i), XR_E_INUSE_FS); + for (i = mp->m_sb.sb_logstart; i < logend ; i++) { + set_bmap(XFS_FSB_TO_AGNO(mp, i), + XFS_FSB_TO_AGBNO(mp, i), XR_E_INUSE_FS); + } } - return; + reset_rt_bmap(); } void -set_bmap_fs(xfs_mount_t *mp) +init_bmaps(xfs_mount_t *mp) { - xfs_agnumber_t i; - xfs_agblock_t j; - xfs_agblock_t end; - - /* - * AG header is 4 sectors - */ - end = howmany(4 * mp->m_sb.sb_sectsize, mp->m_sb.sb_blocksize); + xfs_agblock_t numblocks = mp->m_sb.sb_agblocks; + int agcount = 
mp->m_sb.sb_agcount; + int i; + size_t size = 0; - for (i = 0; i < mp->m_sb.sb_agcount; i++) - for (j = 0; j < end; j++) - set_bmap(i, j, XR_E_INUSE_FS); + ba_bmap = calloc(agcount, sizeof(__uint64_t *)); + if (!ba_bmap) + do_error(_("couldn't allocate block map pointers\n")); - return; -} + ag_locks = calloc(agcount, sizeof(pthread_mutex_t)); + if (!ag_locks) + do_error(_("couldn't allocate block map locks\n")); -#if 0 -void -set_bmap_fs_bt(xfs_mount_t *mp) -{ - xfs_agnumber_t i; - xfs_agblock_t j; - xfs_agblock_t begin; - xfs_agblock_t end; - - begin = bnobt_root; - end = inobt_root + 1; - - for (i = 0; i < mp->m_sb.sb_agcount; i++) { - /* - * account for btree roots - */ - for (j = begin; j < end; j++) - set_bmap(i, j, XR_E_INUSE_FS); + for (i = 0; i < agcount; i++) { + size = roundup((numblocks+(NBBY/XR_BB)-1) / (NBBY/XR_BB), + sizeof(__uint64_t)); + + ba_bmap[i] = memalign(sizeof(__uint64_t), size); + if (!ba_bmap[i]) { + do_error(_("couldn't allocate block map, size = %d\n"), + numblocks); + return; + } + memset(ba_bmap[i], 0, size); + pthread_mutex_init(&ag_locks[i], NULL); } - return; + init_rt_bmap(mp); + reset_bmaps(mp); } -#endif void -incore_init(xfs_mount_t *mp) +free_bmaps(xfs_mount_t *mp) { - int agcount = mp->m_sb.sb_agcount; - extern void incore_ino_init(xfs_mount_t *); - extern void incore_ext_init(xfs_mount_t *); - - /* init block alloc bmap */ - - setup_bmap(agcount, mp->m_sb.sb_agblocks, mp->m_sb.sb_rextents); - incore_ino_init(mp); - incore_ext_init(mp); - - /* initialize random globals now that we know the fs geometry */ + xfs_agnumber_t i; - inodes_per_block = mp->m_sb.sb_inopblock; + for (i = 0; i < mp->m_sb.sb_agcount; i++) + free(ba_bmap[i]); + free(ba_bmap); + ba_bmap = NULL; - return; + free_rt_bmap(mp); } Index: xfsprogs-dev/repair/incore.h =================================================================== --- xfsprogs-dev.orig/repair/incore.h 2009-08-21 01:59:26.000000000 +0000 +++ xfsprogs-dev/repair/incore.h 2009-08-21 
03:00:13.000000000 +0000 @@ -43,14 +43,10 @@ void free_allocations(ba_rec_t *list); */ #define BA_BMAP_SIZE(x) (howmany(x, 4)) -void set_bmap_rt(xfs_drfsbno_t numblocks); -void set_bmap_log(xfs_mount_t *mp); -void set_bmap_fs(xfs_mount_t *mp); -void teardown_bmap(xfs_mount_t *mp); - -void teardown_rt_bmap(xfs_mount_t *mp); -void teardown_ag_bmap(xfs_mount_t *mp, xfs_agnumber_t agno); -void teardown_bmap_finish(xfs_mount_t *mp); +void init_bmaps(xfs_mount_t *mp); +void reset_bmaps(xfs_mount_t *mp); +void free_bmaps(xfs_mount_t *mp); + /* blocks are numbered from zero */ @@ -254,6 +250,7 @@ void release_agbcnt_extent_tree(xfs_agn */ void free_rt_dup_extent_tree(xfs_mount_t *mp); +void incore_ext_init(xfs_mount_t *); /* * per-AG extent trees shutdown routine -- all (bno, bcnt and dup) * at once. this one actually frees the memory instead of just recyling @@ -261,6 +258,8 @@ void free_rt_dup_extent_tree(xfs_mount_ */ void incore_ext_teardown(xfs_mount_t *mp); +void incore_ino_init(xfs_mount_t *); + /* * inode definitions */ Index: xfsprogs-dev/repair/phase2.c =================================================================== --- xfsprogs-dev.orig/repair/phase2.c 2009-08-21 02:04:25.000000000 +0000 +++ xfsprogs-dev/repair/phase2.c 2009-08-21 02:41:43.000000000 +0000 @@ -134,12 +134,6 @@ phase2(xfs_mount_t *mp) do_log(_(" - scan filesystem freespace and inode maps...\n")); - /* - * account for space used by ag headers and log if internal - */ - set_bmap_log(mp); - set_bmap_fs(mp); - bad_ino_btree = 0; set_progress_msg(PROG_FMT_SCAN_AG, (__uint64_t) glob_agcount); Index: xfsprogs-dev/repair/xfs_repair.c =================================================================== --- xfsprogs-dev.orig/repair/xfs_repair.c 2009-08-21 02:47:02.000000000 +0000 +++ xfsprogs-dev/repair/xfs_repair.c 2009-08-21 03:03:51.000000000 +0000 @@ -39,7 +39,6 @@ extern void phase4(xfs_mount_t *); extern void phase5(xfs_mount_t *); extern void phase6(xfs_mount_t *); extern void phase7(xfs_mount_t 
*); -extern void incore_init(xfs_mount_t *); #define XR_MAX_SECT_SIZE (64 * 1024) @@ -694,9 +693,14 @@ main(int argc, char **argv) calc_mkfs(mp); /* - * check sb filesystem stats and initialize in-core data structures + * initialize block alloc map */ - incore_init(mp); + init_bmaps(mp); + incore_ino_init(mp); + incore_ext_init(mp); + + /* initialize random globals now that we know the fs geometry */ + inodes_per_block = mp->m_sb.sb_inopblock; if (parse_sb_version(&mp->m_sb)) { do_warn( @@ -724,6 +728,11 @@ main(int argc, char **argv) } timestamp(PHASE_END, 5, NULL); + /* + * Done with the block usage maps, toss them... + */ + free_bmaps(mp); + if (!bad_ino_btree) { phase6(mp); timestamp(PHASE_END, 6, NULL); Index: xfsprogs-dev/repair/phase6.c =================================================================== --- xfsprogs-dev.orig/repair/phase6.c 2009-08-21 02:44:58.000000000 +0000 +++ xfsprogs-dev/repair/phase6.c 2009-08-21 02:54:54.000000000 +0000 @@ -3661,11 +3661,6 @@ phase6(xfs_mount_t *mp) do_log(_("Phase 6 - check inode connectivity...\n")); - if (!no_modify) - teardown_bmap_finish(mp); - else - teardown_bmap(mp); - incore_ext_teardown(mp); add_ino_ex_data(mp); Index: xfsprogs-dev/repair/phase5.c =================================================================== --- xfsprogs-dev.orig/repair/phase5.c 2009-08-21 02:42:26.000000000 +0000 +++ xfsprogs-dev/repair/phase5.c 2009-08-21 03:00:07.000000000 +0000 @@ -1465,11 +1465,6 @@ phase5_func( } /* - * done with the AG bitmap, toss it... 
- */ - teardown_ag_bmap(mp, agno); - - /* * ok, now set up the btree cursors for the * on-disk btrees (includs pre-allocating all * required blocks for the trees themselves) @@ -1655,7 +1650,6 @@ phase5(xfs_mount_t *mp) _(" - generate realtime summary info and bitmap...\n")); rtinit(mp); generate_rtinfo(mp, btmcompute, sumcompute); - teardown_rt_bmap(mp); } do_log(_(" - reset superblock...\n")); From BATV+2d85c3858335d18f94a6+2201+infradead.org+hch@bombadil.srs.infradead.org Wed Sep 2 12:58:44 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-4.1 required=5.0 tests=AWL,BAYES_00,LOCAL_GNU_PATCH autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82HwJRd034429 for ; Wed, 2 Sep 2009 12:58:34 -0500 X-ASG-Debug-ID: 1251914321-4f3302be0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 068A615B1A2F for ; Wed, 2 Sep 2009 10:58:41 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id L2AEtt0bguNXEgzI for ; Wed, 02 Sep 2009 10:58:41 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Miu6T-0006Y6-Ja; Wed, 02 Sep 2009 17:58:41 +0000 Message-Id: <20090902175841.479553130@bombadil.infradead.org> User-Agent: quilt/0.47-1 Date: Wed, 02 Sep 2009 13:55:41 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com Cc: Barry Naujok X-ASG-Orig-Subj: [PATCH 10/14] repair: cleanup helpers for tracking block usage Subject: [PATCH 10/14] repair: cleanup helpers for tracking block usage References: <20090902175531.469184575@bombadil.infradead.org> Content-Disposition: inline; filename=repair-cleanup-bmap-helpers X-SRS-Rewrite: SMTP reverse-path rewritten from by 
bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251914322 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Rename get_agbno_state/set_agbno_state to get_bmap/set_bmap because those names are more self-descriptive. Remove the superfluous mount argument to them, as the current filesystem is a global in repair. Remove the fsbno-taking variants, as they just complicated the code. Bring all uses of them into the canonical form. Signed-off-by: Barry Naujok Signed-off-by: Christoph Hellwig Index: xfsprogs-dev/repair/dinode.c =================================================================== --- xfsprogs-dev.orig/repair/dinode.c 2009-08-21 19:05:41.000000000 +0000 +++ xfsprogs-dev/repair/dinode.c 2009-08-21 19:05:51.000000000 +0000 @@ -545,40 +545,33 @@ process_rt_rec( continue; } - state = get_rtbno_state(mp, ext); - + state = get_rtbmap(ext); switch (state) { - case XR_E_FREE: - case XR_E_UNKNOWN: - set_rtbno_state(mp, ext, XR_E_INUSE); + case XR_E_FREE: + case XR_E_UNKNOWN: + set_rtbmap(ext, XR_E_INUSE); + break; + case XR_E_BAD_STATE: + do_error(_("bad state in rt block map %llu\n"), ext); + case XR_E_FS_MAP: + case XR_E_INO: + case XR_E_INUSE_FS: + do_error(_("data fork in rt inode %llu found " + "metadata block %llu in rt bmap\n"), + ino, ext); + case XR_E_INUSE: + if (pwe) break; - - case XR_E_BAD_STATE: - do_error(_("bad state in rt block map %llu\n"), - ext); - - case XR_E_FS_MAP: - case XR_E_INO: - case XR_E_INUSE_FS: - do_error(_("data fork in rt inode %llu found " - "metadata block %llu in rt bmap\n"), + case XR_E_MULT: + set_rtbmap(ext, XR_E_MULT); + do_warn(_("data fork in rt inode %llu claims " + "used rt block %llu\n"), ino, ext); - - case XR_E_INUSE: - if (pwe) - break; - - case XR_E_MULT: - set_rtbno_state(mp, ext, XR_E_MULT); - do_warn(_("data fork in rt 
inode %llu claims " - "used rt block %llu\n"), - ino, ext); - return 1; - - case XR_E_FREE1: - default: - do_error(_("illegal state %d in rt block map " - "%llu\n"), state, b); + return 1; + case XR_E_FREE1: + default: + do_error(_("illegal state %d in rt block map " + "%llu\n"), state, b); } } @@ -770,8 +763,7 @@ process_bmbt_reclist_int( } - state = get_agbno_state(mp, agno, agbno); - + state = get_bmap(agno, agbno); switch (state) { case XR_E_FREE: case XR_E_FREE1: @@ -780,7 +772,7 @@ process_bmbt_reclist_int( forkname, ino, (__uint64_t) b); /* fall through ... */ case XR_E_UNKNOWN: - set_agbno_state(mp, agno, agbno, XR_E_INUSE); + set_bmap(agno, agbno, XR_E_INUSE); break; case XR_E_BAD_STATE: @@ -796,7 +788,7 @@ process_bmbt_reclist_int( case XR_E_INUSE: case XR_E_MULT: - set_agbno_state(mp, agno, agbno, XR_E_MULT); + set_bmap(agno, agbno, XR_E_MULT); do_warn(_("%s fork in %s inode %llu claims " "used block %llu\n"), forkname, ftype, ino, (__uint64_t) b); Index: xfsprogs-dev/repair/dino_chunks.c =================================================================== --- xfsprogs-dev.orig/repair/dino_chunks.c 2009-08-21 19:05:40.000000000 +0000 +++ xfsprogs-dev/repair/dino_chunks.c 2009-08-21 19:05:51.000000000 +0000 @@ -151,7 +151,8 @@ verify_inode_chunk(xfs_mount_t *mp, pthread_mutex_lock(&ag_locks[agno]); - switch (state = get_agbno_state(mp, agno, agbno)) { + state = get_bmap(agno, agbno); + switch (state) { case XR_E_INO: do_warn( _("uncertain inode block %d/%d already known\n"), @@ -160,7 +161,7 @@ verify_inode_chunk(xfs_mount_t *mp, case XR_E_UNKNOWN: case XR_E_FREE1: case XR_E_FREE: - set_agbno_state(mp, agno, agbno, XR_E_INO); + set_bmap(agno, agbno, XR_E_INO); break; case XR_E_MULT: case XR_E_INUSE: @@ -172,14 +173,14 @@ verify_inode_chunk(xfs_mount_t *mp, do_warn( _("inode block %d/%d multiply claimed, (state %d)\n"), agno, agbno, state); - set_agbno_state(mp, agno, agbno, XR_E_MULT); + set_bmap(agno, agbno, XR_E_MULT); 
pthread_mutex_unlock(&ag_locks[agno]); return(0); default: do_warn( _("inode block %d/%d bad state, (state %d)\n"), agno, agbno, state); - set_agbno_state(mp, agno, agbno, XR_E_INO); + set_bmap(agno, agbno, XR_E_INO); break; } @@ -434,7 +435,8 @@ verify_inode_chunk(xfs_mount_t *mp, pthread_mutex_lock(&ag_locks[agno]); for (j = 0, cur_agbno = chunk_start_agbno; cur_agbno < chunk_stop_agbno; cur_agbno++) { - switch (state = get_agbno_state(mp, agno, cur_agbno)) { + state = get_bmap(agno, cur_agbno); + switch (state) { case XR_E_MULT: case XR_E_INUSE: case XR_E_INUSE_FS: @@ -442,7 +444,7 @@ verify_inode_chunk(xfs_mount_t *mp, do_warn( _("inode block %d/%d multiply claimed, (state %d)\n"), agno, cur_agbno, state); - set_agbno_state(mp, agno, cur_agbno, XR_E_MULT); + set_bmap(agno, cur_agbno, XR_E_MULT); j = 1; break; case XR_E_INO: @@ -486,7 +488,8 @@ verify_inode_chunk(xfs_mount_t *mp, for (cur_agbno = chunk_start_agbno; cur_agbno < chunk_stop_agbno; cur_agbno++) { - switch (state = get_agbno_state(mp, agno, cur_agbno)) { + state = get_bmap(agno, cur_agbno); + switch (state) { case XR_E_INO: do_error( _("uncertain inode block %llu already known\n"), @@ -495,7 +498,7 @@ verify_inode_chunk(xfs_mount_t *mp, case XR_E_UNKNOWN: case XR_E_FREE1: case XR_E_FREE: - set_agbno_state(mp, agno, cur_agbno, XR_E_INO); + set_bmap(agno, cur_agbno, XR_E_INO); break; case XR_E_MULT: case XR_E_INUSE: @@ -509,7 +512,7 @@ verify_inode_chunk(xfs_mount_t *mp, do_warn( _("inode block %d/%d bad state, (state %d)\n"), agno, cur_agbno, state); - set_agbno_state(mp, agno, cur_agbno, XR_E_INO); + set_bmap(agno, cur_agbno, XR_E_INO); break; } } @@ -742,22 +745,23 @@ process_inode_chunk( * mark block as an inode block in the incore bitmap */ pthread_mutex_lock(&ag_locks[agno]); - switch (state = get_agbno_state(mp, agno, agbno)) { - case XR_E_INO: /* already marked */ - break; - case XR_E_UNKNOWN: - case XR_E_FREE: - case XR_E_FREE1: - set_agbno_state(mp, agno, agbno, XR_E_INO); - break; - case 
XR_E_BAD_STATE: - do_error(_("bad state in block map %d\n"), state); - break; - default: - set_agbno_state(mp, agno, agbno, XR_E_MULT); - do_warn(_("inode block %llu multiply claimed, state was %d\n"), - XFS_AGB_TO_FSB(mp, agno, agbno), state); - break; + state = get_bmap(agno, agbno); + switch (state) { + case XR_E_INO: /* already marked */ + break; + case XR_E_UNKNOWN: + case XR_E_FREE: + case XR_E_FREE1: + set_bmap(agno, agbno, XR_E_INO); + break; + case XR_E_BAD_STATE: + do_error(_("bad state in block map %d\n"), state); + break; + default: + set_bmap(agno, agbno, XR_E_MULT); + do_warn(_("inode block %llu multiply claimed, state was %d\n"), + XFS_AGB_TO_FSB(mp, agno, agbno), state); + break; } pthread_mutex_unlock(&ag_locks[agno]); @@ -923,20 +927,21 @@ process_inode_chunk( agbno++; pthread_mutex_lock(&ag_locks[agno]); - switch (state = get_agbno_state(mp, agno, agbno)) { + state = get_bmap(agno, agbno); + switch (state) { case XR_E_INO: /* already marked */ break; case XR_E_UNKNOWN: case XR_E_FREE: case XR_E_FREE1: - set_agbno_state(mp, agno, agbno, XR_E_INO); + set_bmap(agno, agbno, XR_E_INO); break; case XR_E_BAD_STATE: do_error(_("bad state in block map %d\n"), state); break; default: - set_agbno_state(mp, agno, agbno, XR_E_MULT); + set_bmap(agno, agbno, XR_E_MULT); do_warn(_("inode block %llu multiply claimed, " "state was %d\n"), XFS_AGB_TO_FSB(mp, agno, agbno), state); Index: xfsprogs-dev/repair/phase4.c =================================================================== --- xfsprogs-dev.orig/repair/phase4.c 2009-08-21 18:59:24.000000000 +0000 +++ xfsprogs-dev/repair/phase4.c 2009-08-21 19:05:51.000000000 +0000 @@ -247,8 +247,7 @@ phase4(xfs_mount_t *mp) } } - bstate = get_agbno_state(mp, i, j); - + bstate = get_bmap(i, j); switch (bstate) { case XR_E_BAD_STATE: default: @@ -305,9 +304,7 @@ phase4(xfs_mount_t *mp) rt_len = 0; for (bno = 0; bno < mp->m_sb.sb_rextents; bno++) { - - bstate = get_rtbno_state(mp, bno); - + bstate = get_rtbmap(bno); switch 
(bstate) { case XR_E_BAD_STATE: default: @@ -366,7 +363,7 @@ phase4(xfs_mount_t *mp) roundup((mp->m_sb.sb_agblocks+(NBBY/XR_BB)-1)/(NBBY/XR_BB), sizeof(__uint64_t))); for (j = 0; j < ag_hdr_block; j++) - set_agbno_state(mp, i, j, XR_E_INUSE_FS); + set_bmap(i, j, XR_E_INUSE_FS); } set_bmap_rt(mp->m_sb.sb_rextents); set_bmap_log(mp); Index: xfsprogs-dev/repair/phase5.c =================================================================== --- xfsprogs-dev.orig/repair/phase5.c 2009-08-21 18:59:24.000000000 +0000 +++ xfsprogs-dev/repair/phase5.c 2009-08-21 19:05:51.000000000 +0000 @@ -123,7 +123,7 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_ag for (agbno = 0; agbno < ag_end; agbno++) { #if 0 old_state = state; - state = get_agbno_state(mp, agno, agbno); + state = get_bmap(agno, agbno); if (state != old_state) { fprintf(stderr, "agbno %u - new state is %d\n", agbno, state); @@ -142,7 +142,7 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_ag } } - if (get_agbno_state(mp, agno, agbno) < XR_E_INUSE) { + if (get_bmap(agno, agbno) < XR_E_INUSE) { free_blocks++; if (in_extent == 0) { /* Index: xfsprogs-dev/repair/scan.c =================================================================== --- xfsprogs-dev.orig/repair/scan.c 2009-08-21 19:05:32.000000000 +0000 +++ xfsprogs-dev/repair/scan.c 2009-08-21 19:06:51.000000000 +0000 @@ -148,6 +148,9 @@ scanfunc_bmap( xfs_dfiloff_t last_key; char *forkname; int numrecs; + xfs_agnumber_t agno; + xfs_agblock_t agbno; + int state; if (whichfork == XFS_DATA_FORK) forkname = _("data"); @@ -229,11 +232,15 @@ _("bad back (left) sibling pointer (saw bm_cursor->level[level].right_fsbno = be64_to_cpu(block->bb_u.l.bb_rightsib); - switch (get_fsbno_state(mp, bno)) { + agno = XFS_FSB_TO_AGNO(mp, bno); + agbno = XFS_FSB_TO_AGBNO(mp, bno); + + state = get_bmap(agno, agbno); + switch (state) { case XR_E_UNKNOWN: case XR_E_FREE1: case XR_E_FREE: - set_fsbno_state(mp, bno, XR_E_INUSE); + set_bmap(agno, agbno, XR_E_INUSE); break; case XR_E_FS_MAP: case XR_E_INUSE: @@ 
-245,19 +252,17 @@ _("bad back (left) sibling pointer (saw * we made it here, the block probably * contains btree data. */ - set_fsbno_state(mp, bno, XR_E_MULT); + set_bmap(agno, agbno, XR_E_MULT); do_warn( _("inode 0x%llx bmap block 0x%llx claimed, state is %d\n"), - ino, (__uint64_t) bno, - get_fsbno_state(mp, bno)); + ino, (__uint64_t) bno, state); break; case XR_E_MULT: case XR_E_INUSE_FS: - set_fsbno_state(mp, bno, XR_E_MULT); + set_bmap(agno, agbno, XR_E_MULT); do_warn( _("inode 0x%llx bmap block 0x%llx claimed, state is %d\n"), - ino, (__uint64_t) bno, - get_fsbno_state(mp, bno)); + ino, (__uint64_t) bno, state); /* * if we made it to here, this is probably a bmap block * that is being used by *another* file as a bmap block @@ -272,8 +277,7 @@ _("bad back (left) sibling pointer (saw default: do_warn( _("bad state %d, inode 0x%llx bmap block 0x%llx\n"), - get_fsbno_state(mp, bno), - ino, (__uint64_t) bno); + state, ino, (__uint64_t) bno); break; } } else { @@ -476,19 +480,15 @@ scanfunc_allocbt( /* * check for btree blocks multiply claimed */ - state = get_agbno_state(mp, agno, bno); - - switch (state) { - case XR_E_UNKNOWN: - set_agbno_state(mp, agno, bno, XR_E_FS_MAP); - break; - default: - set_agbno_state(mp, agno, bno, XR_E_MULT); + state = get_bmap(agno, bno); + if (state != XR_E_UNKNOWN) { + set_bmap(agno, bno, XR_E_MULT); do_warn( _("%s freespace btree block claimed (state %d), agno %d, bno %d, suspect %d\n"), name, state, agno, bno, suspect); return; } + set_bmap(agno, bno, XR_E_FS_MAP); numrecs = be16_to_cpu(block->bb_numrecs); @@ -523,11 +523,10 @@ _("%s freespace btree block claimed (sta continue; for ( ; b < end; b++) { - state = get_agbno_state(mp, agno, b); + state = get_bmap(agno, b); switch (state) { case XR_E_UNKNOWN: - set_agbno_state(mp, agno, b, - XR_E_FREE1); + set_bmap(agno, b, XR_E_FREE1); break; case XR_E_FREE1: /* @@ -535,8 +534,7 @@ _("%s freespace btree block claimed (sta * FREE1 blocks later */ if (magic != XFS_ABTB_MAGIC) { - 
set_agbno_state(mp, agno, b, - XR_E_FREE); + set_bmap(agno, b, XR_E_FREE); break; } default: @@ -698,13 +696,14 @@ _("bad ending inode # (%llu (0x%x 0x%x)) j < XFS_INODES_PER_CHUNK; j += mp->m_sb.sb_inopblock) { agbno = XFS_AGINO_TO_AGBNO(mp, ino + j); - state = get_agbno_state(mp, agno, agbno); + + state = get_bmap(agno, agbno); if (state == XR_E_UNKNOWN) { - set_agbno_state(mp, agno, agbno, XR_E_INO); + set_bmap(agno, agbno, XR_E_INO); } else if (state == XR_E_INUSE_FS && agno == 0 && ino + j >= first_prealloc_ino && ino + j < last_prealloc_ino) { - set_agbno_state(mp, agno, agbno, XR_E_INO); + set_bmap(agno, agbno, XR_E_INO); } else { do_warn( _("inode chunk claims used block, inobt block - agno %d, bno %d, inopb %d\n"), @@ -842,16 +841,15 @@ scanfunc_ino( * check for btree blocks multiply claimed, any unknown/free state * is ok in the bitmap block. */ - state = get_agbno_state(mp, agno, bno); - + state = get_bmap(agno, bno); switch (state) { case XR_E_UNKNOWN: case XR_E_FREE1: case XR_E_FREE: - set_agbno_state(mp, agno, bno, XR_E_FS_MAP); + set_bmap(agno, bno, XR_E_FS_MAP); break; default: - set_agbno_state(mp, agno, bno, XR_E_MULT); + set_bmap(agno, bno, XR_E_MULT); do_warn( _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"), state, agno, bno, suspect); @@ -953,7 +951,7 @@ scan_freelist( if (XFS_SB_BLOCK(mp) != XFS_AGFL_BLOCK(mp) && XFS_AGF_BLOCK(mp) != XFS_AGFL_BLOCK(mp) && XFS_AGI_BLOCK(mp) != XFS_AGFL_BLOCK(mp)) - set_agbno_state(mp, agno, XFS_AGFL_BLOCK(mp), XR_E_FS_MAP); + set_bmap(agno, XFS_AGFL_BLOCK(mp), XR_E_FS_MAP); if (be32_to_cpu(agf->agf_flcount) == 0) return; @@ -971,7 +969,7 @@ scan_freelist( for (;;) { bno = be32_to_cpu(agfl->agfl_bno[i]); if (verify_agbno(mp, agno, bno)) - set_agbno_state(mp, agno, bno, XR_E_FREE); + set_bmap(agno, bno, XR_E_FREE); else do_warn(_("bad agbno %u in agfl, agno %d\n"), bno, agno); Index: xfsprogs-dev/repair/Makefile =================================================================== --- 
xfsprogs-dev.orig/repair/Makefile 2009-08-21 19:05:38.000000000 +0000 +++ xfsprogs-dev/repair/Makefile 2009-08-21 19:05:51.000000000 +0000 @@ -32,9 +32,7 @@ include $(BUILDRULES) # # Tracing flags: -# -DXR_BMAP_DBG incore block bitmap debugging # -DXR_INODE_TRACE inode processing -# -DXR_BMAP_TRACE bmap btree processing # -DXR_DIR_TRACE directory processing # -DXR_DUP_TRACE duplicate extent processing # -DXR_BCNT_TRACE incore bcnt freespace btree building Index: xfsprogs-dev/repair/incore.c =================================================================== --- xfsprogs-dev.orig/repair/incore.c 2009-08-21 18:59:24.000000000 +0000 +++ xfsprogs-dev/repair/incore.c 2009-08-21 19:05:51.000000000 +0000 @@ -185,7 +185,8 @@ set_bmap_log(xfs_mount_t *mp) logend = mp->m_sb.sb_logstart + mp->m_sb.sb_logblocks; for (i = mp->m_sb.sb_logstart; i < logend ; i++) { - set_fsbno_state(mp, i, XR_E_INUSE_FS); + set_bmap(XFS_FSB_TO_AGNO(mp, i), + XFS_FSB_TO_AGBNO(mp, i), XR_E_INUSE_FS); } return; @@ -205,7 +206,7 @@ set_bmap_fs(xfs_mount_t *mp) for (i = 0; i < mp->m_sb.sb_agcount; i++) for (j = 0; j < end; j++) - set_agbno_state(mp, i, j, XR_E_INUSE_FS); + set_bmap(i, j, XR_E_INUSE_FS); return; } @@ -227,7 +228,7 @@ set_bmap_fs_bt(xfs_mount_t *mp) * account for btree roots */ for (j = begin; j < end; j++) - set_agbno_state(mp, i, j, XR_E_INUSE_FS); + set_bmap(i, j, XR_E_INUSE_FS); } return; @@ -253,44 +254,3 @@ incore_init(xfs_mount_t *mp) return; } - -#if defined(XR_BMAP_TRACE) || defined(XR_BMAP_DBG) -int -get_agbno_state(xfs_mount_t *mp, xfs_agnumber_t agno, - xfs_agblock_t ag_blockno) -{ - __uint64_t *addr; - - addr = ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM; - - return((*addr >> (((ag_blockno)%XR_BB_NUM)*XR_BB)) & XR_BB_MASK); -} - -void set_agbno_state(xfs_mount_t *mp, xfs_agnumber_t agno, - xfs_agblock_t ag_blockno, int state) -{ - __uint64_t *addr; - - addr = ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM; - - *addr = (((*addr) & - (~((__uint64_t) XR_BB_MASK << 
(((ag_blockno)%XR_BB_NUM)*XR_BB)))) | - (((__uint64_t) (state)) << (((ag_blockno)%XR_BB_NUM)*XR_BB))); -} - -int -get_fsbno_state(xfs_mount_t *mp, xfs_dfsbno_t blockno) -{ - return(get_agbno_state(mp, XFS_FSB_TO_AGNO(mp, blockno), - XFS_FSB_TO_AGBNO(mp, blockno))); -} - -void -set_fsbno_state(xfs_mount_t *mp, xfs_dfsbno_t blockno, int state) -{ - set_agbno_state(mp, XFS_FSB_TO_AGNO(mp, blockno), - XFS_FSB_TO_AGBNO(mp, blockno), state); - - return; -} -#endif Index: xfsprogs-dev/repair/incore.h =================================================================== --- xfsprogs-dev.orig/repair/incore.h 2009-08-21 18:59:24.000000000 +0000 +++ xfsprogs-dev/repair/incore.h 2009-08-21 19:05:51.000000000 +0000 @@ -72,51 +72,23 @@ void teardown_bmap_finish(xfs_mount_t * you want to use the regular block map. */ -#if defined(XR_BMAP_TRACE) || defined(XR_BMAP_DBG) -/* - * implemented as functions for debugging purposes - */ -int get_agbno_state(xfs_mount_t *mp, xfs_agnumber_t agno, - xfs_agblock_t ag_blockno); -void set_agbno_state(xfs_mount_t *mp, xfs_agnumber_t agno, - xfs_agblock_t ag_blockno, int state); - -int get_fsbno_state(xfs_mount_t *mp, xfs_dfsbno_t blockno); -void set_fsbno_state(xfs_mount_t *mp, xfs_dfsbno_t blockno, int state); -#else -/* - * implemented as macros for performance purposes - */ - -#define get_agbno_state(mp, agno, ag_blockno) \ +#define get_bmap(agno, ag_blockno) \ ((int) (*(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM) \ >> (((ag_blockno)%XR_BB_NUM)*XR_BB)) \ & XR_BB_MASK) -#define set_agbno_state(mp, agno, ag_blockno, state) \ +#define set_bmap(agno, ag_blockno, state) \ *(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM) = \ ((*(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM) & \ (~((__uint64_t) XR_BB_MASK << (((ag_blockno)%XR_BB_NUM)*XR_BB)))) | \ (((__uint64_t) (state)) << (((ag_blockno)%XR_BB_NUM)*XR_BB))) -#define get_fsbno_state(mp, blockno) \ - get_agbno_state(mp, XFS_FSB_TO_AGNO(mp, (blockno)), \ - XFS_FSB_TO_AGBNO(mp, (blockno))) -#define 
set_fsbno_state(mp, blockno, state) \ - set_agbno_state(mp, XFS_FSB_TO_AGNO(mp, (blockno)), \ - XFS_FSB_TO_AGBNO(mp, (blockno)), (state)) - - -#define get_agbno_rec(mp, agno, ag_blockno) \ - (*(ba_bmap[(agno)] + (ag_blockno)/XR_BB_NUM)) -#endif /* XR_BMAP_TRACE */ - /* * these work in real-time extents (e.g. fsbno == rt extent number) */ -#define get_rtbno_state(mp, fsbno) \ +#define get_rtbmap(fsbno) \ ((*(rt_ba_bmap + (fsbno)/XR_BB_NUM) >> \ (((fsbno)%XR_BB_NUM)*XR_BB)) & XR_BB_MASK) -#define set_rtbno_state(mp, fsbno, state) \ +#define set_rtbmap(fsbno, state) \ *(rt_ba_bmap + (fsbno)/XR_BB_NUM) = \ ((*(rt_ba_bmap + (fsbno)/XR_BB_NUM) & \ (~((__uint64_t) XR_BB_MASK << (((fsbno)%XR_BB_NUM)*XR_BB)))) | \ Index: xfsprogs-dev/repair/phase2.c =================================================================== --- xfsprogs-dev.orig/repair/phase2.c 2009-08-21 18:59:24.000000000 +0000 +++ xfsprogs-dev/repair/phase2.c 2009-08-21 19:05:51.000000000 +0000 @@ -176,7 +176,7 @@ phase2(xfs_mount_t *mp) * also mark blocks */ for (b = 0; b < mp->m_ialloc_blks; b++) { - set_agbno_state(mp, 0, + set_bmap(0, b + XFS_INO_TO_AGBNO(mp, mp->m_sb.sb_rootino), XR_E_INO); } Index: xfsprogs-dev/repair/phase3.c =================================================================== --- xfsprogs-dev.orig/repair/phase3.c 2009-08-21 18:59:24.000000000 +0000 +++ xfsprogs-dev/repair/phase3.c 2009-08-21 19:05:51.000000000 +0000 @@ -61,14 +61,8 @@ walk_unlinked_list(xfs_mount_t *mp, xfs_ agbno = XFS_AGINO_TO_AGBNO(mp, current_ino); pthread_mutex_lock(&ag_locks[agno]); - switch (state = get_agbno_state(mp, - agno, agbno)) { - case XR_E_UNKNOWN: - case XR_E_FREE: - case XR_E_FREE1: - set_agbno_state(mp, agno, agbno, - XR_E_INO); - break; + state = get_bmap(agno, agbno); + switch (state) { case XR_E_BAD_STATE: do_error(_( "bad state in block map %d\n"), @@ -85,8 +79,7 @@ walk_unlinked_list(xfs_mount_t *mp, xfs_ * anyway, hopefully without * losing too much other data */ - set_agbno_state(mp, agno, agbno, 
- XR_E_INO); + set_bmap(agno, agbno, XR_E_INO); break; } pthread_mutex_unlock(&ag_locks[agno]); Index: xfsprogs-dev/repair/rt.c =================================================================== --- xfsprogs-dev.orig/repair/rt.c 2009-08-21 18:59:24.000000000 +0000 +++ xfsprogs-dev/repair/rt.c 2009-08-21 19:05:51.000000000 +0000 @@ -91,7 +91,7 @@ generate_rtinfo(xfs_mount_t *mp, bits = 0; for (i = 0; i < sizeof(xfs_rtword_t) * NBBY && extno < mp->m_sb.sb_rextents; i++, extno++) { - if (get_rtbno_state(mp, extno) == XR_E_FREE) { + if (get_rtbmap(extno) == XR_E_FREE) { sb_frextents++; bits |= freebit; @@ -218,7 +218,7 @@ process_rtbitmap(xfs_mount_t *mp, bit < bitsperblock && extno < mp->m_sb.sb_rextents; bit++, extno++) { if (xfs_isset(words, bit)) { - set_rtbno_state(mp, extno, XR_E_FREE); + set_rtbmap(extno, XR_E_FREE); sb_frextents++; if (prevbit == 0) { start_bmbno = bmbno; From rumi_ml@rtfm.hu Wed Sep 2 14:34:51 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_45 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82JYV4C040501 for ; Wed, 2 Sep 2009 14:34:41 -0500 X-ASG-Debug-ID: 1251920117-6a8903780000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from nexus.dynaweb.hu (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A9A2841E369 for ; Wed, 2 Sep 2009 12:35:17 -0700 (PDT) Received: from nexus.dynaweb.hu (nexus.dynaweb.hu [195.70.37.87]) by cuda.sgi.com with ESMTP id lE7xJlBgALz4shYf for ; Wed, 02 Sep 2009 12:35:17 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by nexus.dynaweb.hu (Postfix) with ESMTP id E9DC06CEEB; Wed, 2 Sep 2009 21:34:43 +0200 (CEST) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Scanned: by amavisd-new using 
ClamAV at dynaweb.hu Received: from nexus.dynaweb.hu ([127.0.0.1]) by localhost (nexus.dynaweb.hu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id p-NFC2ZeRGKc; Wed, 2 Sep 2009 21:34:42 +0200 (CEST) Received: from raketa.ipn.dynaweb.hu (catv-80-99-36-176.catv.broadband.hu [80.99.36.176]) by nexus.dynaweb.hu (Postfix) with ESMTPSA id 3BCD16CEE5; Wed, 2 Sep 2009 21:34:42 +0200 (CEST) Date: Wed, 2 Sep 2009 21:34:41 +0200 From: RUMI Szabolcs To: Eric Sandeen Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Structure needs cleaning? (take #2) Subject: Re: Structure needs cleaning? (take #2) Message-Id: <20090902213441.470b439c.rumi_ml@rtfm.hu> In-Reply-To: <4A9E81AD.70003@sandeen.net> References: <20090902152245.b2969883.rumi_ml@rtfm.hu> <4A9E81AD.70003@sandeen.net> X-Mailer: Sylpheed 2.6.0 (GTK+ 2.16.5; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Barracuda-Connect: nexus.dynaweb.hu[195.70.37.87] X-Barracuda-Start-Time: 1251920121 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7928 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Status: Clean Hi! Well, what could be the reason? I mean, there was no hardware failure, no crash, no reboot, no errors in the disk's SMART error log, no nothing. What I did was that I've extracted and deleted the rather huge OpenOffice source tree several times (sometimes with overwriting) and finally it ended up with these undeletable files and xfs errors. Is it considered normal for xfs to get messed up like that under such load? 
Thanks, Sab The xfs_repair output and xfs_info output is included below: # xfs_repair -v /dev/sda10 Phase 1 - find and verify superblock... - block cache size set to 255664 entries Phase 2 - using internal log - zero log... zero_log: head block 37688 tail block 37688 - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - agno = 16 - agno = 17 - agno = 18 - agno = 19 - agno = 20 - agno = 21 - agno = 22 - agno = 23 - agno = 24 - agno = 25 - agno = 26 - agno = 27 - agno = 28 - agno = 29 - agno = 30 - agno = 31 - agno = 32 - agno = 33 - agno = 34 - agno = 35 - agno = 36 - agno = 37 - agno = 38 - agno = 39 - agno = 40 - agno = 41 - agno = 42 - agno = 43 - agno = 44 - agno = 45 - agno = 46 - agno = 47 - agno = 48 - agno = 49 - agno = 50 - agno = 51 - agno = 52 - agno = 53 - agno = 54 - agno = 55 - agno = 56 - agno = 57 - agno = 58 - agno = 59 - agno = 60 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... 
- agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - agno = 16 - agno = 17 - agno = 18 - agno = 19 - agno = 20 - agno = 21 - agno = 22 - agno = 23 - agno = 24 - agno = 25 - agno = 26 - agno = 27 - agno = 28 - agno = 29 - agno = 30 - agno = 31 - agno = 32 - agno = 33 - agno = 34 - agno = 35 - agno = 36 - agno = 37 - agno = 38 - agno = 39 - agno = 40 - agno = 41 - agno = 42 - agno = 43 - agno = 44 - agno = 45 - agno = 46 - agno = 47 - agno = 48 - agno = 49 - agno = 50 - agno = 51 - agno = 52 - agno = 53 - agno = 54 - agno = 55 - agno = 56 - agno = 57 - agno = 58 - agno = 59 - agno = 60 Phase 5 - rebuild AG headers and trees... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - agno = 16 - agno = 17 - agno = 18 - agno = 19 - agno = 20 - agno = 21 - agno = 22 - agno = 23 - agno = 24 - agno = 25 - agno = 26 - agno = 27 - agno = 28 - agno = 29 - agno = 30 - agno = 31 - agno = 32 - agno = 33 - agno = 34 - agno = 35 - agno = 36 - agno = 37 - agno = 38 - agno = 39 - agno = 40 - agno = 41 - agno = 42 - agno = 43 - agno = 44 - agno = 45 - agno = 46 - agno = 47 - agno = 48 - agno = 49 - agno = 50 - agno = 51 - agno = 52 - agno = 53 - agno = 54 - agno = 55 - agno = 56 - agno = 57 - agno = 58 - agno = 59 - agno = 60 - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - traversing filesystem ... 
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
leaf block 8388608 for directory inode 4051737 bad header
rebuilding directory inode 4051737
leaf block 8388608 for directory inode 4053318 bad header
rebuilding directory inode 4053318
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
        - agno = 39
        - agno = 40
        - agno = 41
        - agno = 42
        - agno = 43
        - agno = 44
        - agno = 45
        - agno = 46
        - agno = 47
        - agno = 48
        - agno = 49
        - agno = 50
        - agno = 51
        - agno = 52
        - agno = 53
        - agno = 54
        - agno = 55
        - agno = 56
        - agno = 57
        - agno = 58
        - agno = 59
        - agno = 60
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Wed Sep 2 21:23:53 2009

Phase           Start           End             Duration
Phase 1:        09/02 21:23:33  09/02 21:23:33
Phase 2:        09/02 21:23:33  09/02 21:23:35  2 seconds
Phase 3:        09/02 21:23:35  09/02 21:23:49  14 seconds
Phase 4:        09/02 21:23:49  09/02 21:23:50  1 second
Phase 5:        09/02 21:23:50  09/02 21:23:50
Phase 6:        09/02 21:23:50  09/02 21:23:50
Phase 7:        09/02 21:23:50  09/02 21:23:50

Total run time: 17 seconds
done

# xfs_info /dev/sda10
meta-data=/dev/sda10             isize=256    agcount=61, agsize=32768 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=1998848, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=16384, version=2
         =                       sectsz=512   sunit=1 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

On Wed, 02 Sep 2009 09:31:09 -0500
Eric Sandeen wrote:

> RUMI Szabolcs wrote:
> > Hi!
> >
> > Sorry but my previous post was missing the important first two lines:
>
> Yes, thanks.
:) > > > d62c8000: 2c 30 78 41 41 41 41 30 30 30 30 2c 36 2c 30 78 ,0xAAAA0000,6,0x > > Filesystem "sda10": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c. Caller 0xc02cc790 > > This is on-disk corruption, it found bad magic on something it expected > to be metadata. You should run xfs_repair. run with -n, or on a > restored xfs_metadump image as a dry-run first, if you prefer. > > -Eric > > > Pid: 29510, comm: mc Tainted: P 2.6.29-gentoo-r5-PAE #1 > > Call Trace: > > [] xfs_da_do_buf+0x8c4/0x900 > > [] xfs_da_read_buf+0x30/0x40 > > [] xfs_da_read_buf+0x30/0x40 > > [] pollwake+0x0/0x50 > > [] pollwake+0x0/0x50 > > [] xfs_da_read_buf+0x30/0x40 > > [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 > > [] xfs_dir2_leaf_lookup_int+0x63/0x2f0 > > [] xfs_dir2_leaf_lookup+0x27/0xc0 > > [] xfs_dir2_isleaf+0x1f/0x60 > > [] xfs_dir_lookup+0xd8/0x180 > > [] xfs_lookup+0x6b/0xf0 > > [] xfs_vn_lookup+0x55/0xa0 > > [] do_lookup+0x1ba/0x1e0 > > [] __link_path_walk+0x6cd/0xd60 > > [] xfs_dir2_leaf_getdents+0x5ff/0xad0 > > [] path_walk+0x54/0xc0 > > [] do_path_lookup+0x83/0x170 > > [] getname+0x9b/0xe0 > > [] user_path_at+0x5a/0x90 > > [] vfs_lstat_fd+0x1f/0x50 > > [] sys_lstat64+0xf/0x30 > > [] touch_atime+0x14/0x130 > > [] vfs_readdir+0x78/0xb0 > > [] sys_getdents64+0xa1/0xd0 > > [] sysenter_do_call+0x12/0x25 > > > > Thanks, > > Sab > > > > _______________________________________________ > > xfs mailing list > > xfs@oss.sgi.com > > http://oss.sgi.com/mailman/listinfo/xfs > > > From sandeen@sandeen.net Wed Sep 2 14:52:53 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82JqX3m041639 for ; Wed, 2 Sep 2009 14:52:43 -0500 X-ASG-Debug-ID: 1251921188-4afb000b0000-NocioJ X-Barracuda-URL: 
http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 64BD841E799 for ; Wed, 2 Sep 2009 12:53:08 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by cuda.sgi.com with ESMTP id 9vl1NYZVjxhromt4 for ; Wed, 02 Sep 2009 12:53:08 -0700 (PDT) Received: from int-mx08.intmail.prod.int.phx2.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n82Jr5Qo031078; Wed, 2 Sep 2009 15:53:05 -0400 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by int-mx08.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n82Jr4Nq018954; Wed, 2 Sep 2009 15:53:05 -0400 Message-ID: <4A9ECD20.9090201@sandeen.net> Date: Wed, 02 Sep 2009 14:53:04 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.21 (X11/20090320) MIME-Version: 1.0 To: RUMI Szabolcs CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Structure needs cleaning? (take #2) Subject: Re: Structure needs cleaning? 
(take #2) References: <20090902152245.b2969883.rumi_ml@rtfm.hu> <4A9E81AD.70003@sandeen.net> <20090902213441.470b439c.rumi_ml@rtfm.hu> In-Reply-To: <20090902213441.470b439c.rumi_ml@rtfm.hu> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 10.5.11.21 X-Barracuda-Connect: mx1.redhat.com[209.132.183.28] X-Barracuda-Start-Time: 1251921210 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7930 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean RUMI Szabolcs wrote: > Hi! > > Well, what could be the reason? I mean, there was no hardware failure, > no crash, no reboot, no errors in the disk's SMART error log, no nothing. > What I did was that I've extracted and deleted the rather huge OpenOffice > source tree several times (sometimes with overwriting) and finally it > ended up with these undeletable files and xfs errors. Is it considered > normal for xfs to get messed up like that under such load? No, not normal. It found the text "0xAAAA0000,6,0x" on disk in an area where it expected to find valid filesystem metadata. Corruption could come from anywhere - an xfs bug, some other bug, bad memory, bad cables, neon death rays from space, writing directly to the disk, who knows. Awfully hard to track down a one-off occurrence like this, I'm afraid. > Thanks, > Sab > > > > The xfs_repair output and xfs_info output is included below: > > # xfs_repair -v /dev/sda10 ... > Phase 6 - check inode connectivity... 
> - resetting contents of realtime bitmap and summary inodes > - traversing filesystem ... > - agno = 0 > - agno = 1 > - agno = 2 > - agno = 3 > - agno = 4 > - agno = 5 > - agno = 6 > - agno = 7 > leaf block 8388608 for directory inode 4051737 bad header > rebuilding directory inode 4051737 > leaf block 8388608 for directory inode 4053318 bad header > rebuilding directory inode 4053318 ... above is the problem, properly found & repaired. -Eric From jpiszcz@lucidpixels.com Wed Sep 2 16:01:57 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_54 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82L1bnX045657 for ; Wed, 2 Sep 2009 16:01:47 -0500 X-ASG-Debug-ID: 1251925349-01d303ce0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from lucidpixels.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4FFAC15B7BC1 for ; Wed, 2 Sep 2009 14:02:29 -0700 (PDT) Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by cuda.sgi.com with ESMTP id SD8vEZJ326nRawpc for ; Wed, 02 Sep 2009 14:02:29 -0700 (PDT) Received: by lucidpixels.com (Postfix, from userid 1001) id AA9E84667; Wed, 2 Sep 2009 17:02:29 -0400 (EDT) Date: Wed, 2 Sep 2009 17:02:29 -0400 (EDT) From: Justin Piszcz To: Eric Sandeen cc: RUMI Szabolcs , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Structure needs cleaning? (take #2) Subject: Re: Structure needs cleaning? 
(take #2) In-Reply-To: <4A9ECD20.9090201@sandeen.net> Message-ID: References: <20090902152245.b2969883.rumi_ml@rtfm.hu> <4A9E81AD.70003@sandeen.net> <20090902213441.470b439c.rumi_ml@rtfm.hu> <4A9ECD20.9090201@sandeen.net> User-Agent: Alpine 2.00 (DEB 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Barracuda-Connect: lucidpixels.com[75.144.35.66] X-Barracuda-Start-Time: 1251925350 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.7934 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

Hi RUMI,

1. Have you run memtest86 for a few passes?
2. Have you run a short+long test against the drive?
   smartctl -t short /dev/sda   # wait 10min
   smartctl -t long /dev/sda    # wait until it's done
   smartctl -a /dev/sda         # show the output <- report this back
3. Is this the first time this has happened?
4. Is your system on a UPS?
5. How do you mount your XFS partition?
   a. Do you use any special parameters?

Justin.

On Wed, 2 Sep 2009, Eric Sandeen wrote:

> RUMI Szabolcs wrote:
>> Hi!
>>
>> Well, what could be the reason? I mean, there was no hardware failure,
>> no crash, no reboot, no errors in the disk's SMART error log, no nothing.
>> What I did was that I've extracted and deleted the rather huge OpenOffice
>> source tree several times (sometimes with overwriting) and finally it
>> ended up with these undeletable files and xfs errors. Is it considered
>> normal for xfs to get messed up like that under such load?
>
> No, not normal.
> > It found the text "0xAAAA0000,6,0x" on disk in an area where it expected > to find valid filesystem metadata. > > Corruption could come from anywhere - an xfs bug, some other bug, bad > memory, bad cables, neon death rays from space, writing directly to the > disk, who knows. Awfully hard to track down a one-off occurrence like > this, I'm afraid. > >> Thanks, >> Sab >> >> >> >> The xfs_repair output and xfs_info output is included below: >> >> # xfs_repair -v /dev/sda10 > > ... > >> Phase 6 - check inode connectivity... >> - resetting contents of realtime bitmap and summary inodes >> - traversing filesystem ... >> - agno = 0 >> - agno = 1 >> - agno = 2 >> - agno = 3 >> - agno = 4 >> - agno = 5 >> - agno = 6 >> - agno = 7 >> leaf block 8388608 for directory inode 4051737 bad header >> rebuilding directory inode 4051737 >> leaf block 8388608 for directory inode 4053318 bad header >> rebuilding directory inode 4053318 > > ... > > above is the problem, properly found & repaired. > > -Eric > > _______________________________________________ > xfs mailing list > xfs@oss.sgi.com > http://oss.sgi.com/mailman/listinfo/xfs > From felixb@sgi.com Wed Sep 2 16:43:43 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n82LhNgD048118 for ; Wed, 2 Sep 2009 16:43:33 -0500 Received: from estes.americas.sgi.com (estes.americas.sgi.com [128.162.236.10]) by relay2.corp.sgi.com (Postfix) with ESMTP id EA54C3040BB for ; Wed, 2 Sep 2009 14:44:20 -0700 (PDT) Received: from eagdhcp-232-185.americas.sgi.com (eagdhcp-232-185.americas.sgi.com [128.162.232.185]) by estes.americas.sgi.com (Postfix) with ESMTP id 7269A70016DD; Wed, 2 Sep 2009 16:14:16 -0500 (CDT) Message-Id: <6959C33E-61FC-4C10-A055-D61580F16E82@sgi.com> From: 
Felix Blyakher To: xfs mailing list Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v926) Subject: Change in maintainership Date: Wed, 2 Sep 2009 16:14:14 -0500 Cc: felix@xfs.org X-Mailer: Apple Mail (2.926) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

Hi all,

I have accepted a position at another storage company and so will no longer be serving as the maintainer for XFS from SGI.

Alex Elder has agreed to take on maintainer responsibilities in my place. I will be working with him initially to make sure the transition goes smoothly, but I don't expect there to be any major issues.

Alex has been working with XFS (more behind the scenes) for quite a while now and has a long history working with storage software as well as Linux. He is a very capable engineer, and I'm sure he'll be able to handle the maintainer duties. And SGI is supporting this decision by giving Alex all resources necessary for this work.

For my part, I've enjoyed working on xfs for so long that I can't just go completely away from it. I will continue to work on xfs in my spare time and stay in touch with the community. I can now be reached at felix@xfs.org.
Thanks,
Felix
From Daniele.Passerone@empa.ch Thu Sep 3 10:31:22 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n83FUu4l110900 for ; Thu, 3 Sep 2009 10:31:12 -0500 X-ASG-Debug-ID: 1251991899-221800260000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.empa.ch (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id AC2D44227B5 for ; Thu, 3 Sep 2009 08:31:40 -0700 (PDT) Received: from mx1.empa.ch (mx1.empa.ch [152.88.7.31]) by cuda.sgi.com with ESMTP id lszcSeTBK7xhJNh6 for ; Thu, 03 Sep 2009 08:31:40 -0700 (PDT) Received: from Du-Exc-Hub1.empa.emp-eaw.ch (localhost [127.0.0.1]) by mx1.empa.ch (Spam & Virus Firewall) with ESMTP id DE782D41D7 for ; Thu, 3 Sep 2009 17:31:36 +0200 (CEST) Received: from Du-Exc-Hub1.empa.emp-eaw.ch ([152.88.6.64]) by mx1.empa.ch with ESMTP id dopY9lSjzOCFrisv for ; Thu, 03 Sep 2009 17:31:36 +0200 (CEST) Received: from DU-Exc-Mail.empa.emp-eaw.ch ([fe80::bc9b:a2a9:e3fb:5e94]) by Du-Exc-Hub1.empa.emp-eaw.ch ([2002:9858:640::9858:640]) with mapi; Thu, 3 Sep 2009 17:31:36 +0200 From: "Passerone, Daniele" To: "xfs@oss.sgi.com" Date: Thu, 3 Sep 2009 17:31:36 +0200 X-ASG-Orig-Subj: RE: xfs data loss Subject: RE: xfs data loss Thread-Topic: RE: xfs data loss Thread-Index: Acosq599XJjOofGgRCSxA+/2swSBlQ== Message-ID: Accept-Language: it-IT, de-CH Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: it-IT, de-CH Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Barracuda-Connect: mx1.empa.ch[152.88.7.31] X-Barracuda-Start-Time: 1251991904 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0011 1.0000 -2.0139 X-Barracuda-Virus-Scanned: by
cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.01 X-Barracuda-Spam-Status: No, SCORE=-2.01 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8003 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

Dear Peter,

Thank you very much for the time spent in writing this long and interesting answer. Now I agree with you that harsh and useful is better than emollient and lying :-)

> When you write to a mailing list asking for free help and support,
> it is rather rude to not have done some preliminary work, such as
> figuring out the characteristics of RAID5 in case of failure. It
> is also somewhat rude (but amazingly common) to make confused and
> partial reports, such as not checking and reporting what has
> actually failed.

That is true. Unfortunately I am not the person who assembled the RAID5 and configured the machine, and I had to act mostly alone to figure out what to do. That is why I eventually preferred to make a partial report.

> But a soft but more open assessment of how outrageous some queries
> are is helpful too as it makes it easier to assess the gravity of the
> situation. The smooth, emollient sell-side people will let you dig
> your own grave. Just consider your statement below about "assume
> clean" that to me sounds very dangerous (big euphemism), and that
> did not elicit any warning from the sell-side:

At the beginning of this week I was confronted with the following situation:

1) /dev/md4, a 19+1 RAID 5, with the corresponding xfs /raidmd4 filesystem that had lost half of the directories on the 24th of August, for NO PARTICULAR APPARENT REASON (and this still makes me crazy).
No logs, nothing.

2) /dev/md5, a 19+1 RAID 5, that could not mount anymore...lost superblock.

3) /dev/md6, a 4+1 RAID5, that was not mounting anymore because 2 devices were lost.

My colleague zapped the filesystem (which was almost empty), and rebuilt the RAID5. Unfortunately I cannot say exactly what he did.

For 2) it was clear what happened: within a few days of each other, two devices of /dev/md5 died. The information about the death of one device is issued in /var/log/warn. We did not check it during the last days, so when the second device died, it was too late.

BUT: I followed the advice to make a read test on all devices (using dd) and all were ok. So it seemed to be a raid controller problem, of the same kind described here

http://maillists.uci.edu/mailman/public/uci-linux/2007-December/002225.html

where a solution is proposed including the reassembling of the raid using mdadm with the option "assume-clean". This is where this "assume-clean" comes from: from a read test, followed by the study of the above mailing list post.

The resync of /dev/md5 was performed, the raid again had 20 working devices, but at the end of the day the filesystem still was not able to mount.

So, I was eventually forced to do xfs_repair -L /dev/md5, which was a nightmare: incredible number of forking, inodes cleared... but eventually... successful.

In the meantime I had grown 10 years older, with all my hair suddenly greyed, but...

RESULT: /dev/md5 is again up and running, with all data.

BUT at the same time, /dev/md4 was not able to mount anymore: superblock error.

So, at that point we bought another big drive (7 TB), we performed a backup of /dev/md5, and then we ran the same procedure on /dev/md4.

RESULT: /dev/md4 is again up and running, but the data that disappeared on August 24 were still missing.

Since the structure included all devices, at this point I ran xfs_repair -L /dev/md4. But nothing happened.
No error, and half of the data still missing.

So at this point I don't understand.

THERE IS ONE IMPORTANT THING THAT I DID NOT MENTION, BECAUSE IT WAS NOT EVIDENT BY LOOKING AT /etc/raidtab, /proc/mdstat, etc., and it was done by my collaborator.

The whole structure of the raids, partitioning etc. was set up using Yast2 with LVM. The use of LVM is a mystery to me, even more than the basics of the RAID ( :-) )

The /etc/lvm/backup and archive directories are empty. In yast2 now the LVM panel is empty, and I have forbidden my collaborator to try to go through LVM now...

Coming to other specific questions:

>Sure you can reassemble the RAID, but what do you mean by "still
>ok"? Have you read-tested those 2 drives? Have you tested the
>*other* 18 drives? How do you know none of the other 18 drives got
>damaged? Have you verified that only the host adapter electronics
>failed or whatever it was that made those 2 drives drop out?

Tested all drives, but not the host adapter electronics.

>Why do you *need* to assume clean? If the 2 "lost" drives are
>really ok, you just resync the array.

Well, following the post above, after checking that the lost drives are ok, first I stop the raid, then I create the raid with 20 drives assuming them clean, then I stop it again, then assemble it with resyncing.

>If you *need* to assume
>clean, it is likely that you have lost something like 5% of data
>in (every stripe and thus) most files and directories (and
>internal metadata) and will be replacing it with random
>bytes. That will very likely cause XFS problems (the least of the
>problems of course).

On the /raidmd5 fortunately this was not the case.

>Especially in a place where part of the everyday
>activity is earthquake simulation...

LOL you are right.

> But apart from that, it is not as easy to backup 20 TB,
>Or to 'fsck' several TB as you also discovered.
>Anyhow my opinion
>is that the best way to backup large storage servers is another
>large storage server (or more than one). When I buy a hard drive I
>buy 3 backup drives for each "live" drive I use -- at *home*.

At least now we did that right.

>Not at all absurd -- if those users *really* accept that. But you
>are trying to recover the arrays instead of scratching them and
>restarting. That suggests to me that the users did not actually
>accept that. If the real agreement with the users is "you have to
>keep backups, but if something happens you will behave as if you
>cannot or don't want to restore them" it is quite different.

Well. You would be surprised to know how stupid scientists can be when they ignore the worst case scenario. Including myself.

I knew exactly the situation, but if I had not succeeded in recovering /raid/md5, it would have been a hard moment for me and my research group. And we ALL knew that there were no backups.

>That's not so clear. One problem with trying to provide some
>opinions on your issue and whether the filesystems are recoverable
>is that you haven't made clear what failed and how you tested each
>component of each array to make sure that what is still working is
>known (and talk of "assume clean" is very suspicious).

Just to clarify: assume-clean was an option to the mdadm --create command when I discovered that my 20 devices were there and running: I ran a dd command reading the first megabytes of each device. Was this wrong?

>That you have tried to run repair tools on a filesystem with an
>incomplete storage layer may have made things rather worse, so
>knowing *exactly* what has failed may help you a lot.

I will contact the Sun service and ask them to check the whole storage-controller part.

In the meantime I am almost convinced that the 4-5 TB lost on /dev/md4 are lost for good.

I sent the metadata one week ago to the mailing list. Do you think this could help in examining the famous 20 drives?
I hope I could catch up. I am trying to learn quickly. Thanks, Daniele From BATV+e647964f7e3a8370884a+2202+infradead.org+hch@bombadil.srs.infradead.org Thu Sep 3 10:45:20 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n83FitGk111768 for ; Thu, 3 Sep 2009 10:45:10 -0500 X-ASG-Debug-ID: 1251992752-428601590000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 001B915C5907; Thu, 3 Sep 2009 08:45:52 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id oninTJ3GlojV60Oy; Thu, 03 Sep 2009 08:45:52 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MjEVU-0004zh-3t; Thu, 03 Sep 2009 15:45:52 +0000 Date: Thu, 3 Sep 2009 11:45:52 -0400 From: Christoph Hellwig To: Alex Elder Cc: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 2/4] xfs: make sure xfs_sync_fsdata covers the log Subject: Re: [PATCH 2/4] xfs: make sure xfs_sync_fsdata covers the log Message-ID: <20090903154551.GA16715@infradead.org> References: <20090827231558.057467775@bombadil.infradead.org> <1AB9A794DBDDF54A8A81BE2296F7BDFE83ABF3@cf--amer001e--3.americas.sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1AB9A794DBDDF54A8A81BE2296F7BDFE83ABF3@cf--amer001e--3.americas.sgi.com> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251992753 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean FYI: I found some nasty deadlock in this on a large machine, please hold back until I've sorted it out. From BATV+e647964f7e3a8370884a+2202+infradead.org+hch@bombadil.srs.infradead.org Thu Sep 3 10:45:39 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n83FjDCf111807 for ; Thu, 3 Sep 2009 10:45:29 -0500 X-ASG-Debug-ID: 1251992771-73c7034d0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8561D4240DF for ; Thu, 3 Sep 2009 08:46:11 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id G4xMP3NhRTSWBdiJ for ; Thu, 03 Sep 2009 08:46:11 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MjEVn-000532-5N for xfs@oss.sgi.com; Thu, 03 Sep 2009 15:46:11 +0000 Date: Thu, 3 Sep 2009 11:46:11 -0400 From: Christoph Hellwig To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] xfs: use correct log reservation when handling ENOSPC in xfs_create Subject: Re: [PATCH] xfs: use correct log reservation when handling ENOSPC in xfs_create Message-ID: <20090903154611.GB16715@infradead.org> References: <20090826113736.GA15562@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090826113736.GA15562@infradead.org> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See 
http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251992771 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

ping?

On Wed, Aug 26, 2009 at 07:37:36AM -0400, Christoph Hellwig wrote:
> We added the ENOSPC handling patch in xfs_create just after it got merged
> with xfs_mkdir. Change the log reservation to the variable for either
> the create or mkdir value so it does the right thing if we get here for creating
> a directory.
>
> Signed-off-by: Christoph Hellwig
>
> Index: xfs/fs/xfs/xfs_vnodeops.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_vnodeops.c 2009-08-24 11:35:52.794261954 -0300
> +++ xfs/fs/xfs/xfs_vnodeops.c 2009-08-24 11:36:25.466636450 -0300
> @@ -1479,8 +1479,8 @@ xfs_create(
>          if (error == ENOSPC) {
>                  /* flush outstanding delalloc blocks and retry */
>                  xfs_flush_inodes(dp);
> -                error = xfs_trans_reserve(tp, resblks, XFS_CREATE_LOG_RES(mp), 0,
> -                                XFS_TRANS_PERM_LOG_RES, XFS_CREATE_LOG_COUNT);
> +                error = xfs_trans_reserve(tp, resblks, log_res, 0,
> +                                XFS_TRANS_PERM_LOG_RES, log_count);
>          }
>          if (error == ENOSPC) {
>                  /* No space at all so try a "no-allocation" reservation */
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
---end quoted text---

From BATV+e647964f7e3a8370884a+2202+infradead.org+hch@bombadil.srs.infradead.org Thu Sep 3 10:51:16 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n83FoptX112119 for ; Thu, 3 Sep 2009 10:51:06 -0500 X-ASG-Debug-ID: 1251993109-1877002d0000-NocioJ
X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 47DAF15C0EE2 for ; Thu, 3 Sep 2009 08:51:49 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id 2eRd6SaI86FzSQP2 for ; Thu, 03 Sep 2009 08:51:49 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MjEYB-0005Lr-CZ; Thu, 03 Sep 2009 15:48:39 +0000 Date: Thu, 3 Sep 2009 11:48:39 -0400 From: Christoph Hellwig To: Jan Engelhardt Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Mounted xfs slows down block device Subject: Re: Mounted xfs slows down block device Message-ID: <20090903154839.GC16715@infradead.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251993109 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

Strange, as XFS doesn't actually use the block device mapping at all, so all the caching doesn't interact with each other. Maybe some throttling code in the VM doesn't like these parallel accesses. But I need to add that reading the block device on a mounted filesystem is not a good idea anyway - you will not get any sort of consistency guarantee.
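As an illustration of the advice above, a script that reads a block device raw could first check whether that device is listed as a mount source. This is only a minimal sketch and not something proposed in the thread: the function names are hypothetical, it assumes the usual whitespace-separated mount-table layout of /proc/mounts (source device in the first field), and it matches names verbatim, so a device mounted under another name (a device-mapper or by-UUID path, for example) would be missed.

```python
def mounted_sources(mounts_text):
    """Return the first field (the mount source) of every non-empty line."""
    sources = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields:
            sources.add(fields[0])
    return sources


def is_mounted(device, mounts_text):
    """True if `device` appears verbatim as a mount source in the table text."""
    return device in mounted_sources(mounts_text)


def check_device(device, mounts_path="/proc/mounts"):
    """Read the mount table from `mounts_path` and report on `device`."""
    with open(mounts_path) as f:
        table = f.read()
    if is_mounted(device, table):
        return device + " is mounted; a raw read has no consistency guarantee"
    return device + " is not listed as a mount source in " + mounts_path
```

Calling `check_device("/dev/md5")` on a Linux machine would return one of the two messages; the device name here is just the one used elsewhere in this archive.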
From aelder@sgi.com Thu Sep 3 11:38:21 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n83Gc1pH114883 for ; Thu, 3 Sep 2009 11:38:11 -0500 Received: from cf--amer001e--3.americas.sgi.com (cf--amer001e--3.americas.sgi.com [137.38.100.5]) by relay2.corp.sgi.com (Postfix) with ESMTP id C3C37304093 for ; Thu, 3 Sep 2009 09:38:58 -0700 (PDT) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: [PATCH] xfs: use correct log reservation when handling ENOSPC inxfs_create Date: Thu, 3 Sep 2009 11:35:41 -0500 Message-ID: <1AB9A794DBDDF54A8A81BE2296F7BDFE83AC27@cf--amer001e--3.americas.sgi.com> In-Reply-To: <20090826113736.GA15562@infradead.org> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [PATCH] xfs: use correct log reservation when handling ENOSPC inxfs_create Thread-Index: AcomRnnZKIyKPu1vSRyNGJjitvdkDQGbf54A From: "Alex Elder" To: "Christoph Hellwig" Cc: X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

Christoph Hellwig wrote:
> We added the ENOSPC handling patch in xfs_create just after it got merged
> with xfs_mkdir. Change the log reservation to the variable for either
> the create or mkdir value so it does the right thing if we get here for creating
> a directory.
>
> Signed-off-by: Christoph Hellwig
>
> Index: xfs/fs/xfs/xfs_vnodeops.c
> ===================================================================
> --- xfs.orig/fs/xfs/xfs_vnodeops.c 2009-08-24 11:35:52.794261954 -0300
> +++ xfs/fs/xfs/xfs_vnodeops.c 2009-08-24 11:36:25.466636450 -0300
> @@ -1479,8 +1479,8 @@ xfs_create(
>          if (error == ENOSPC) {
>                  /* flush outstanding delalloc blocks and retry */
>                  xfs_flush_inodes(dp);
> -                error = xfs_trans_reserve(tp, resblks, XFS_CREATE_LOG_RES(mp), 0,
> -                                XFS_TRANS_PERM_LOG_RES, XFS_CREATE_LOG_COUNT);
> +                error = xfs_trans_reserve(tp, resblks, log_res, 0,
> +                                XFS_TRANS_PERM_LOG_RES, log_count);
>          }
>          if (error == ENOSPC) {
>                  /* No space at all so try a "no-allocation" reservation */
>

Looks good.

Reviewed-by: Alex Elder

From BATV+e647964f7e3a8370884a+2202+infradead.org+hch@bombadil.srs.infradead.org Thu Sep 3 11:41:44 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_35 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n83GfJAA115065 for ; Thu, 3 Sep 2009 11:41:34 -0500 X-ASG-Debug-ID: 1251996136-198f01830000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id AE6F615C65BE for ; Thu, 3 Sep 2009 09:42:16 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id dEKeAjXvAKBQ1GPU for ; Thu, 03 Sep 2009 09:42:16 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1MjFNx-0008Mg-WA; Thu, 03 Sep 2009
16:42:10 +0000 Date: Thu, 3 Sep 2009 12:42:09 -0400 From: Christoph Hellwig To: Dave Chinner Cc: Christoph Hellwig , Theodore Tso , Chris Mason , linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: ext4 writepages is making tiny bios? Subject: Re: ext4 writepages is making tiny bios? Message-ID: <20090903164209.GA28384@infradead.org> References: <20090901184450.GB7885@think> <20090901205744.GE6996@mit.edu> <20090901212740.GA9930@infradead.org> <20090903055201.GA7146@discord.disaster> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090903055201.GA7146@discord.disaster> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1251996136 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

On Thu, Sep 03, 2009 at 03:52:01PM +1000, Dave Chinner wrote:
> > XFS did the mistake of trusting the VM, while everyone more or less
> > overrode it. Removing all those checks and writing out much larger
> > data fixes it with a relatively small patch:
> >
> > http://verein.lst.de/~hch/xfs/xfs-writeback-scaling
>
> Careful:
>
> -        tloff = min(tlast, startpage->index + 64);
> +        tloff = min(tlast, startpage->index + 8192);
>
> That will cause 64k page machines to try to write back 512MB at a
> time. This will re-introduce similar to the behaviour in sles9 where
> writeback would only terminate at the end of an extent (because the
> mapping end wasn't capped like above).

Pretty good point, and it applies to all the different things we discussed recently. Ted, should we maybe introduce a max_writeback_mb instead of the max_writeback_pages in the VM, too?
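The arithmetic behind the warning quoted above can be sketched as follows. This is a minimal illustration, not code from any patch in the thread: the function names are hypothetical, page sizes are passed as shifts (12 for 4k pages, 16 for 64k pages), and the 32 MB cap is an arbitrary example value. It shows why a cap expressed in pages grows with the page size, while a cap expressed in megabytes stays fixed and only the derived page count changes.

```python
def cap_bytes_from_pages(pages, page_shift):
    """Bytes covered by a writeback cap that is given as a page count."""
    return pages << page_shift


def cap_pages_from_mb(mb, page_shift):
    """Pages covered by a cap given in megabytes (1 MB = 2**20 bytes).

    Assumes pages are no larger than 1 MB, i.e. page_shift <= 20.
    """
    if page_shift > 20:
        raise ValueError("page_shift above 20 is not handled by this sketch")
    return mb << (20 - page_shift)


if __name__ == "__main__":
    for shift in (12, 16):  # 4k and 64k pages
        page_kb = 1 << (shift - 10)
        mb_for_8192_pages = cap_bytes_from_pages(8192, shift) >> 20
        pages_for_32_mb = cap_pages_from_mb(32, shift)
        print(page_kb, "k pages: 8192 pages =", mb_for_8192_pages,
              "MB; 32 MB =", pages_for_32_mb, "pages")
```

With 4k pages an 8192-page cap covers 32 MB, while with 64k pages the same cap covers the 512 MB mentioned in the quoted text. A 32 MB cap, by contrast, is 8192 pages on the former and 512 pages on the latter.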
From gwehrman@sgi.com Thu Sep 3 15:56:31 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from relay.sgi.com (relay1.corp.sgi.com [137.38.102.111]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n83KuB4a131539 for ; Thu, 3 Sep 2009 15:56:21 -0500 Received: from goalpost.americas.sgi.com (goalpost.americas.sgi.com [128.162.232.54]) by relay1.corp.sgi.com (Postfix) with ESMTP id E36E58F8049 for ; Thu, 3 Sep 2009 13:57:08 -0700 (PDT) Received: by goalpost.americas.sgi.com (Postfix, from userid 14442) id 9127F2526DDC; Thu, 3 Sep 2009 15:49:40 -0500 (CDT) Date: Thu, 3 Sep 2009 15:49:40 -0500 From: Geoffrey Wehrman To: Christoph Hellwig Cc: xfs@oss.sgi.com Subject: Re: [PATCH 00/14] repair memory usage reductions Message-ID: <20090903204940.GB24510@sgi.com> References: <20090902175531.469184575@bombadil.infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090902175531.469184575@bombadil.infradead.org> User-Agent: Mutt/1.5.14 (2007-02-12) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Wed, Sep 02, 2009 at 01:55:31PM -0400, Christoph Hellwig wrote: | This is a respin of the patches Barry Naujok wrote at SGI for reducing | the memory usage in repair. I've split it up, fixed a few small bugs | and added two preparatory cleanups - but all the real work is Barry's. | There has been lots of heavy testing on large filesystems by Barry | on the original patches, and quite a lot of testing on slightly smaller | filesystems by me. These were all ad-hoc tests as XFSQA coverage is | rather low on repair. My plan is to add various additional testcase | for XFSQA both for intentional corruptions as well as reproducing past | reported bugs before we'll release these patches in xfsprogs. 
| But I think it would be good if we could get them into the development
| git tree to get wider coverage already.

How do these changes affect xfs_repair I/O performance? Barry's changes were previously withheld within SGI due to a regression in performance.

-- 
Geoffrey Wehrman 651-683-5496 gwehrman@sgi.com

From tytso@mit.edu Thu Sep 3 19:15:51 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_21, J_CHICKENPOX_35 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n840FVR9146860 for ; Thu, 3 Sep 2009 19:15:41 -0500 X-ASG-Debug-ID: 1252023360-325100330000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from thunker.thunk.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id A0EF015CF0C5 for ; Thu, 3 Sep 2009 17:16:00 -0700 (PDT) Received: from thunker.thunk.org (THUNK.ORG [69.25.196.29]) by cuda.sgi.com with ESMTP id rr96niqx4lMLm397 for ; Thu, 03 Sep 2009 17:16:00 -0700 (PDT) Received: from root (helo=closure.thunk.org) by thunker.thunk.org with local-esmtp (Exim 4.50 #1 (Debian)) id 1MjMSx-0000iR-0B; Thu, 03 Sep 2009 20:15:47 -0400 Received: from tytso by closure.thunk.org with local (Exim 4.69) (envelope-from ) id 1MjMSw-0001AH-0I; Thu, 03 Sep 2009 20:15:46 -0400 Date: Thu, 3 Sep 2009 20:15:45 -0400 From: Theodore Tso To: Christoph Hellwig Cc: Dave Chinner , Chris Mason , linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com, Jens Axboe X-ASG-Orig-Subj: Re: ext4 writepages is making tiny bios? Subject: Re: ext4 writepages is making tiny bios?
Message-ID: <20090904001545.GA30759@mit.edu> References: <20090901184450.GB7885@think> <20090901205744.GE6996@mit.edu> <20090901212740.GA9930@infradead.org> <20090903055201.GA7146@discord.disaster> <20090903164209.GA28384@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090903164209.GA28384@infradead.org> User-Agent: Mutt/1.5.18 (2008-05-17) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: tytso@mit.edu X-SA-Exim-Scanned: No (on thunker.thunk.org); SAEximRunCond expanded to false X-Barracuda-Connect: THUNK.ORG[69.25.196.29] X-Barracuda-Start-Time: 1252023382 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8033 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Thu, Sep 03, 2009 at 12:42:09PM -0400, Christoph Hellwig wrote: > > Careful: > > > > - tloff = min(tlast, startpage->index + 64); > > + tloff = min(tlast, startpage->index + 8192); > > > > That will cause 64k page machines to try to write back 512MB at a > > time. This will re-introduce similar to the behaviour in sles9 where > > writeback would only terminate at the end of an extent (because the > > mapping end wasn't capped like above). > > Pretty good point, any applies to all the different things we discussed > recently. Ted, should be maybe introduce a max_writeback_mb instead of > the max_writeback_pages in the VM, too? Good point. Jens, maybe we should replace my patch with this one, which makes the tunable in terms of megabytes instead of pages? 
- Ted

commit ed48d661394a6b22e9d376a7ad5327c2b9080a9c
Author: Theodore Ts'o
Date: Tue Sep 1 13:19:06 2009 +0200

    vm: Add a tuning knob for vm.max_writeback_mb

    Originally, MAX_WRITEBACK_PAGES was hard-coded to 1024 because of a
    concern of not holding I_SYNC for too long. (At least, that was the
    comment previously.) This doesn't make sense now because the only
    time we wait for I_SYNC is if we are calling sync or fsync, and in
    that case we need to write out all of the data anyway. Previously
    there may have been other code paths that waited on I_SYNC, but not
    any more.

    According to Christoph, the current writeback size is way too small,
    and XFS had a hack that bumped out nr_to_write to four times the value
    sent by the VM to be able to saturate medium-sized RAID arrays. This
    value was also problematic for ext4 as well, as it caused large files
    to become interleaved on disk in 8 megabyte chunks (we bumped up
    the nr_to_write by a factor of two).

    So, in this patch, we make the MAX_WRITEBACK_PAGES a tunable,
    max_writeback_mb, and set it to a default value of 128 megabytes.

    http://bugzilla.kernel.org/show_bug.cgi?id=13930

    Signed-off-by: "Theodore Ts'o"

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 38cb758..a9b230f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -585,14 +585,7 @@ void generic_sync_bdi_inodes(struct writeback_control *wbc)
         generic_sync_wb_inodes(&bdi->wb, wbc);
 }
 
-/*
- * The maximum number of pages to writeout in a single bdi flush/kupdate
- * operation. We do this so we don't hold I_SYNC against an inode for
- * enormous amounts of time, which would block a userspace task which has
- * been forced to throttle against that inode. Also, the code reevaluates
- * the dirty each time it has written this many pages.
- */
-#define MAX_WRITEBACK_PAGES 1024
+#define MAX_WRITEBACK_PAGES (max_writeback_mb << (20 - PAGE_SHIFT))
 
 static inline bool over_bground_thresh(void)
 {
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 34c59f9..57cd3b5 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -103,6 +103,7 @@ extern int vm_dirty_ratio;
 extern unsigned long vm_dirty_bytes;
 extern unsigned int dirty_writeback_interval;
 extern unsigned int dirty_expire_interval;
+extern unsigned int max_writeback_mb;
 extern int vm_highmem_is_dirtyable;
 extern int block_dump;
 extern int laptop_mode;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 58be760..315fc30 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1104,6 +1104,14 @@ static struct ctl_table vm_table[] = {
                 .proc_handler = &proc_dointvec,
         },
         {
+                .ctl_name = CTL_UNNUMBERED,
+                .procname = "max_writeback_mb",
+                .data = &max_writeback_mb,
+                .maxlen = sizeof(max_writeback_mb),
+                .mode = 0644,
+                .proc_handler = &proc_dointvec,
+        },
+        {
                 .ctl_name = VM_NR_PDFLUSH_THREADS,
                 .procname = "nr_pdflush_threads",
                 .data = &nr_pdflush_threads,
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0fce7df..77decaa 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -55,6 +55,12 @@ static inline long sync_writeback_pages(void)
 /* The following parameters are exported via /proc/sys/vm */
 
 /*
+ * The maximum amount of memory (in megabytes) to write out in a
+ * single bdflush/kupdate operation.
+ */
+unsigned int max_writeback_mb = 128;
+
+/*
  * Start background writeback (via pdflush) at this percentage
  */
 int dirty_background_ratio = 10;

From greg@kroah.com Thu Sep 3 19:58:07 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,J_CHICKENPOX_34 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n840vlM1150210 for ; Thu, 3 Sep 2009 19:57:57 -0500 X-ASG-Debug-ID: 1252025917-7b5d01940000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from coco.kroah.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5741F426D1F for ; Thu, 3 Sep 2009 17:58:37 -0700 (PDT) Received: from coco.kroah.org (kroah.org [198.145.64.141]) by cuda.sgi.com with ESMTP id uGaiTCt0zvcflikY for ; Thu, 03 Sep 2009 17:58:37 -0700 (PDT) Received: from localhost (c-98-246-45-209.hsd1.or.comcast.net [98.246.45.209]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by coco.kroah.org (Postfix) with ESMTPSA id 689704825E; Thu, 3 Sep 2009 15:27:33 -0700 (PDT) Date: Thu, 3 Sep 2009 15:19:58 -0700 From: Greg KH To: Christoph Hellwig Cc: stable@kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [stable] [PATCH 0/4] 2.6.30-stable backport of the XFS+NFSD inode cache race fix Subject: Re: [stable] [PATCH 0/4] 2.6.30-stable backport of the XFS+NFSD inode cache race fix Message-ID: <20090903221958.GC23517@kroah.com> References: <20090819184258.542698202@bombadil.infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090819184258.542698202@bombadil.infradead.org> User-Agent: Mutt/1.5.20 (2009-06-14) X-Barracuda-Connect: kroah.org[198.145.64.141] X-Barracuda-Start-Time: 1252025921 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8037 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Wed, Aug 19, 2009 at 02:42:58PM -0400, Christoph Hellwig wrote: > > This is a backport of the fix for the XFS inode cache races lots of people > have reported. The context for the two VFS patches changed quite a bit from > 2.6.30 to 2.6.31-rc so I consider these backports. The backport has been > tested by me with xfstest for XFS and extN, and by lots of users that have > been waiting for the fix for their nfs servers still running 2.6.30. All queued up, thanks. greg k-h From bnaujok@optusnet.com.au Thu Sep 3 21:24:02 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.8 required=5.0 tests=BAYES_00, MSGID_FROM_MTA_HEADER autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n842Nfws156062 for ; Thu, 3 Sep 2009 21:23:52 -0500 X-ASG-Debug-ID: 1252031072-77cf03720000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail04.syd.optusnet.com.au (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 18A02426F8C; Thu, 3 Sep 2009 19:24:32 -0700 (PDT) Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au [211.29.132.185]) by cuda.sgi.com with ESMTP id cZBVbpTnMLneUynq; Thu, 03 Sep 2009 19:24:32 -0700 (PDT) Received: from localhost.localdomain (webmail08.syd.optusnet.com.au [211.29.133.113]) by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with 
ESMTP id n842OHYo023675; Fri, 4 Sep 2009 12:24:17 +1000 Message-Id: <200909040224.n842OHYo023675@mail04.syd.optusnet.com.au> Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: binary Mime-Version: 1.0 X-Mailer: MIME-tools 5.420 (Entity 5.420) Received: from 203-206-165-193.perm.iinet.net.au ([203.206.165.193]) by webmail08.syd.optusnet.com.au with http (user=bnaujok@optusnet.com.au); Fri, 04 Sep 2009 12:24:17 +1000 From: Barry Naujok To: Geoffrey Wehrman Cc: Christoph Hellwig , xfs@oss.sgi.com Date: Fri, 04 Sep 2009 12:24:17 +1000 X-ASG-Orig-Subj: Re: Re: [PATCH 00/14] repair memory usage reductions Subject: Re: Re: [PATCH 00/14] repair memory usage reductions X-Barracuda-Connect: mail04.syd.optusnet.com.au[211.29.132.185] X-Barracuda-Start-Time: 1252031079 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -0.53 X-Barracuda-Spam-Status: No, SCORE=-0.53 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=MSGID_FROM_MTA_HEADER, MSGID_FROM_MTA_HEADER_2 X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8043 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 MSGID_FROM_MTA_HEADER Message-Id was added by a relay 1.50 MSGID_FROM_MTA_HEADER_2 Message-Id was added by a relay X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Geoffrey Wehrman wrote: > > On Wed, Sep 02, 2009 at 01:55:31PM -0400, Christoph Hellwig wrote: > | This is a respin of the patches Barry Naujok wrote at SGI for reducing > | the memory usage in repair. I've split it up, fixed a few small bugs > | and added two preparatory cleanups - but all the real work is Barry's. 
> | There has been lots of heavy testing on large filesystems by Barry
> | on the original patches, and quite a lot of testing on slightly smaller
> | filesystems by me. These were all ad-hoc tests as XFSQA coverage is
> | rather low on repair. My plan is to add various additional testcase
> | for XFSQA both for intentional corruptions as well as reproducing past
> | reported bugs before we'll release these patches in xfsprogs. But I think
> | it would be good if we could get them into the development git tree to
> | get wider coverage already.
>
> How do these changes affect xfs_repair I/O performance? Barry changes
> were previously withheld within SGI due to a regression in performance.

They were withheld? First I've heard about that.

I spent a lot of time on those changes to minimize the performance
impact, and with an increased xfs_repair cache size it can actually be
faster now, depending on the system's RAM and filesystem size.

And it's certainly faster than xfs_repair before my performance
optimisation changes.

Barry.

From SRS0+7RX2+1+fromorbit.com=david@internode.on.net Thu Sep 3 21:57:49 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n842vTbQ158463 for ; Thu, 3 Sep 2009 21:57:39 -0500 X-ASG-Debug-ID: 1252033084-3ace004b0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.internode.on.net (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 775C915CFEFC for ; Thu, 3 Sep 2009 19:58:05 -0700 (PDT) Received: from mail.internode.on.net (bld-mail16.adl2.internode.on.net [150.101.137.101]) by cuda.sgi.com with ESMTP id hPG62UB754eENMMd for ; Thu, 03 Sep 2009 19:58:05 -0700 (PDT) Received: from discord (unverified [121.44.12.22]) by mail.internode.on.net (SurgeMail 3.8f2) with ESMTP id 4673549-1927428 for multiple; Fri, 04 Sep 2009 12:27:55 +0930 (CST) Received: from dave by discord with local (Exim 4.69) (envelope-from ) id 1MjOzp-0007vO-PZ; Fri, 04 Sep 2009 12:57:53 +1000 Date: Fri, 4 Sep 2009 12:57:53 +1000 From: Dave Chinner To: Geoffrey Wehrman Cc: Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/14] repair memory usage reductions Subject: Re: [PATCH 00/14] repair memory usage reductions Message-ID: <20090904025753.GB7146@discord.disaster> References: <20090902175531.469184575@bombadil.infradead.org> <20090903204940.GB24510@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090903204940.GB24510@sgi.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Barracuda-Connect: bld-mail16.adl2.internode.on.net[150.101.137.101] X-Barracuda-Start-Time: 1252033106 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_SC5_SA210e X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8045 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.00 BSF_SC5_SA210e Custom Rule SA210e X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Thu, Sep 03, 2009 at 03:49:40PM -0500, Geoffrey Wehrman wrote: > On Wed, Sep 02, 2009 at 01:55:31PM -0400, Christoph Hellwig wrote: > | This is a respin of the patches Barry Naujok wrote at SGI for reducing > | the memory usage in repair. I've split it up, fixed a few small bugs > | and added two preparatory cleanups - but all the real work is Barry's. > | There has been lots of heavy testing on large filesystems by Barry > | on the original patches, and quite a lot of testing on slightly smaller > | filesystems by me. These were all ad-hoc tests as XFSQA coverage is > | rather low on repair. My plan is to add various additional testcase > | for XFSQA both for intentional corruptions as well as reproducing past > | reported bugs before we'll release these patches in xfsprogs. But I think > | it would be good if we could get them into the development git tree to > | get wider coverage already. > > How do these changes affect xfs_repair I/O performance? Barry changes > were previously withheld within SGI due to a regression in performance. Christoph asked me to repeat what I said on #xfs w.r.t the regression. The repair slowdowns were a result of increased CPU usage of the btree structures used to track free space compared to manipulating massive bitmaps. Hence if you have a disk subsystem fast enough that prefetching could keep the CPUs 100% busy processing all the incoming metadata the memory-optimised repair was about 30% slower than the existing repair code. 
However, getting to the point of being CPU bound with the current repair
code requires having a *lot* of memory, so the more common case is
that you have to add gigabytes of swap space so that repair can run.
In these situations, the current repair will run much, much slower
than the memory optimised repair because the new version does not
have to swap.

Indeed, I recall one of the driving factors for this work was the
SGI customer that needed to connect their 300TB (or was it 600TB?)
XFS filesystem to an Altix with 2TB of RAM to be able to repair it
because the server head connected to the filesystem did not have 2TB
of storage available to assign as swap space. That is, XFS
scalability is limited by the amount of memory needed by repair....

Another mitigating factor is that the worst regressions were on
ia64, for which bitmap manipulation is far more friendly than
branchy, cache-miss causing btree traversals. Hence the regression
will be less (maybe even not present) on current x86-64 CPUs which
handle branches and cache misses far, far better than Altix/ia64....

With that in mind, I think the memory usage optimisation is far more
important to the majority of XFS users than the CPU usage regression
it causes, as the majority of users don't have RAM-rich environments
to run repair in.

Cheers,

Dave.
-- Dave Chinner david@fromorbit.com From axboe@kernel.dk Fri Sep 4 02:19:45 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_35 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n847JO1r183045 for ; Fri, 4 Sep 2009 02:19:35 -0500 X-ASG-Debug-ID: 1252048816-1ba501590000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from kernel.dk (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7C72A427849 for ; Fri, 4 Sep 2009 00:20:16 -0700 (PDT) Received: from kernel.dk (brick.kernel.dk [93.163.65.50]) by cuda.sgi.com with ESMTP id Nx0WC8pNEZFjnNh0 for ; Fri, 04 Sep 2009 00:20:16 -0700 (PDT) Received: by kernel.dk (Postfix, from userid 1000) id B625C37A0C7; Fri, 4 Sep 2009 09:20:13 +0200 (CEST) Date: Fri, 4 Sep 2009 09:20:13 +0200 From: Jens Axboe To: Theodore Tso Cc: Christoph Hellwig , Dave Chinner , Chris Mason , linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: ext4 writepages is making tiny bios? Subject: Re: ext4 writepages is making tiny bios? 
Message-ID: <20090904072013.GR18599@kernel.dk> References: <20090901184450.GB7885@think> <20090901205744.GE6996@mit.edu> <20090901212740.GA9930@infradead.org> <20090903055201.GA7146@discord.disaster> <20090903164209.GA28384@infradead.org> <20090904001545.GA30759@mit.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090904001545.GA30759@mit.edu> X-Barracuda-Connect: brick.kernel.dk[93.163.65.50] X-Barracuda-Start-Time: 1252048819 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8063 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Thu, Sep 03 2009, Theodore Tso wrote: > On Thu, Sep 03, 2009 at 12:42:09PM -0400, Christoph Hellwig wrote: > > > Careful: > > > > > > - tloff = min(tlast, startpage->index + 64); > > > + tloff = min(tlast, startpage->index + 8192); > > > > > > That will cause 64k page machines to try to write back 512MB at a > > > time. This will re-introduce similar to the behaviour in sles9 where > > > writeback would only terminate at the end of an extent (because the > > > mapping end wasn't capped like above). > > > > Pretty good point, any applies to all the different things we discussed > > recently. Ted, should be maybe introduce a max_writeback_mb instead of > > the max_writeback_pages in the VM, too? > > Good point. > > Jens, maybe we should replace my patch with this one, which makes the > tunable in terms of megabytes instead of pages? That is probably a better metric than 'pages', lets update it. 
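
To make the units in this exchange concrete, here is a minimal sketch (not taken from any posted patch) of the page/megabyte conversion that the max_writeback_mb proposal relies on. The helper names mb_to_pages and pages_to_mb are hypothetical, and the page shifts 12 and 16 are simply assumed to stand for 4KiB and 64KiB pages:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a cap given in megabytes into a page
 * count, mirroring the patch's (max_writeback_mb << (20 - PAGE_SHIFT)).
 * page_shift is log2 of the page size in bytes (assumed <= 20).
 */
static inline uint64_t mb_to_pages(uint64_t mb, unsigned int page_shift)
{
	assert(page_shift <= 20);
	return mb << (20 - page_shift);
}

/*
 * Hypothetical inverse: how many whole megabytes a given number of
 * pages covers for the same page_shift.
 */
static inline uint64_t pages_to_mb(uint64_t pages, unsigned int page_shift)
{
	assert(page_shift <= 20);
	return pages >> (20 - page_shift);
}
```

Under these assumptions, pages_to_mb(8192, 16) gives 512, matching the 512MB figure quoted above, while the same 8192 pages with 4KiB pages (shift 12) come to only 32MB. Going the other way, mb_to_pages(128, 12) is 32768 pages and mb_to_pages(128, 16) is 2048 pages, which is why a cap expressed in megabytes covers the same amount of data on both page sizes.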
-- Jens Axboe From andi@firstfloor.org Fri Sep 4 06:08:25 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n84B85uF202182 for ; Fri, 4 Sep 2009 06:08:15 -0500 X-ASG-Debug-ID: 1252062511-5c8e013e0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from one.firstfloor.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 93B69428774 for ; Fri, 4 Sep 2009 04:08:31 -0700 (PDT) Received: from one.firstfloor.org (one.firstfloor.org [213.235.205.2]) by cuda.sgi.com with ESMTP id 6nD9mk7I7mNiIYbc for ; Fri, 04 Sep 2009 04:08:31 -0700 (PDT) Received: from basil.firstfloor.org (p5B3CB6DD.dip0.t-ipconnect.de [91.60.182.221]) by one.firstfloor.org (Postfix) with ESMTP id AD7821F0800E; Fri, 4 Sep 2009 13:08:25 +0200 (CEST) Received: by basil.firstfloor.org (Postfix, from userid 1000) id 37A1CB16FF; Fri, 4 Sep 2009 13:08:25 +0200 (CEST) To: pg_xf2@xf2.to.sabi.co.UK (Peter Grandi) Cc: Linux XFS X-ASG-Orig-Subj: Re: xfs data loss Subject: Re: xfs data loss From: Andi Kleen References: <4A975A35.3060809@sandeen.net> <4A981133.6060009@sandeen.net> <19101.5976.387292.614270@tree.ty.sabi.co.uk> Date: Fri, 04 Sep 2009 13:08:25 +0200 In-Reply-To: <19101.5976.387292.614270@tree.ty.sabi.co.uk> (Peter Grandi's message of "Tue, 1 Sep 2009 12:45:12 +0000") Message-ID: <87tyzjufva.fsf@basil.nowhere.org> User-Agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/22.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Barracuda-Connect: one.firstfloor.org[213.235.205.2] X-Barracuda-Start-Time: 1252062539 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, 
SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8075 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean pg_xf2@xf2.to.sabi.co.UK (Peter Grandi) writes: > > Depends -- some people here are XFS salesmen, in that their career > and employability depend at least in part on widespread adoption > of XFS, and on support from other kernel subsystem guys, who may > be one day on an interview panel (the guild of Linux kernel > hackers is a pretty small and closed world in practice). These are > sell-side engineers, and they will be smooth and emollient even in > the face of outrageously ridiculous stuff. The main thing that seems `outrageously ridiculous' is your cynical and totally unfair and in my experience incorrect description of the people who are doing great work on XFS and unlike you actually helping users on this mailing list and improving Linux. 
-Andi From Daniele.Passerone@empa.ch Fri Sep 4 06:45:28 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n84Bj3eJ204182 for ; Fri, 4 Sep 2009 06:45:18 -0500 X-ASG-Debug-ID: 1252064755-0e7d00fd0000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.empa.ch (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1AA2A1D68214 for ; Fri, 4 Sep 2009 04:45:55 -0700 (PDT) Received: from mx1.empa.ch (mx1.empa.ch [152.88.7.31]) by cuda.sgi.com with ESMTP id fUXphFPBvKP8bkGP for ; Fri, 04 Sep 2009 04:45:55 -0700 (PDT) Received: from eaw-exc-hub1.eawag.wroot.emp-eaw.ch (localhost [127.0.0.1]) by mx1.empa.ch (Spam & Virus Firewall) with ESMTP id E010BD5FE1 for ; Fri, 4 Sep 2009 13:45:54 +0200 (CEST) Received: from eaw-exc-hub1.eawag.wroot.emp-eaw.ch (eaw-exc-hub1.emp-eaw.ch [152.88.5.116]) by mx1.empa.ch with ESMTP id HsDkvChiYVU24MP7 for ; Fri, 04 Sep 2009 13:45:54 +0200 (CEST) Received: from DU-Exc-Mail.empa.emp-eaw.ch ([fe80::bc9b:a2a9:e3fb:5e94]) by eaw-exc-hub1.eawag.wroot.emp-eaw.ch ([2002:9858:574::9858:574]) with mapi; Fri, 4 Sep 2009 13:45:54 +0200 From: "Passerone, Daniele" To: "xfs@oss.sgi.com" Date: Fri, 4 Sep 2009 13:45:53 +0200 X-ASG-Orig-Subj: Re: xfs data loss Subject: Re: xfs data loss Thread-Topic: Re: xfs data loss Thread-Index: AcotVUHk47j4TfaOStmYMe0NogtfMg== Message-ID: Accept-Language: it-IT, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: it-IT, en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Barracuda-Connect: mx1.empa.ch[152.88.7.31] X-Barracuda-Start-Time: 1252064758 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0618 1.0000 -1.6259 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.63 X-Barracuda-Spam-Status: No, SCORE=-1.63 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8077 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

Commenting further on my preceding message, I would just like to stress the fact that everybody here has tried to help - xfs and non-xfs people. So I have seen no emollient answers here, at least not to my query.

Mr. Peter Grandi was harsh - very harsh at the beginning, but I think he also spent time thinking about my problem. For that I am grateful.

I am less grateful for being called "outrageously ridiculous". But I can skip that in times of trouble...

Daniele

From gwehrman@sgi.com Fri Sep 4 08:46:31 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from relay.sgi.com (relay2.corp.sgi.com [137.38.102.29]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n84DkA98211688 for ; Fri, 4 Sep 2009 08:46:21 -0500 Received: from goalpost.americas.sgi.com (goalpost.americas.sgi.com [128.162.232.54]) by relay2.corp.sgi.com (Postfix) with ESMTP id EAA65304064 for ; Fri, 4 Sep 2009 06:47:08 -0700 (PDT) Received: by goalpost.americas.sgi.com (Postfix, from userid 14442) id 6169E2526DDE; Fri, 4 Sep 2009 08:37:37 -0500 (CDT) Date: Fri, 4 Sep 2009 08:37:37 -0500 From: Geoffrey Wehrman To: Dave Chinner Cc: Christoph Hellwig , xfs@oss.sgi.com Subject: Re: [PATCH 00/14] repair memory usage reductions Message-ID: <20090904133737.GD12052@sgi.com> References: <20090902175531.469184575@bombadil.infradead.org>
<20090903204940.GB24510@sgi.com> <20090904025753.GB7146@discord.disaster> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090904025753.GB7146@discord.disaster> User-Agent: Mutt/1.5.14 (2007-02-12) X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Fri, Sep 04, 2009 at 12:57:53PM +1000, Dave Chinner wrote: | Christoph asked me to repeat what I said on #xfs w.r.t the regression. Thank you for the detailed description. All I had was a statement from January 2008, "Barry has completed the memory optimization, but initial testing shows that performance has regressed." That was the last update recorded on Barry's work. | With that in mind, I think the memory usage optimisation is far more | important to the majority of XFS users than the CPU usage regression | it causes as the majority of users don't have RAM-rich environments | to run repair in. I agree. -- Geoffrey Wehrman 651-683-5496 gwehrman@sgi.com From BATV+e49c75e08a496b1b2e4f+2203+infradead.org+hch@bombadil.srs.infradead.org Fri Sep 4 09:53:52 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n84ErPCh216643 for ; Fri, 4 Sep 2009 09:53:42 -0500 X-ASG-Debug-ID: 1252076063-738e02950000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7B0D1429773; Fri, 4 Sep 2009 07:54:23 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id ObPMo6K6OskN6uwm; Fri, 04 Sep 2009 07:54:23 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local 
(Exim 4.69 #1 (Red Hat Linux)) id 1Mja85-0002mD-Dv; Fri, 04 Sep 2009 14:51:09 +0000 Date: Fri, 4 Sep 2009 10:51:09 -0400 From: Christoph Hellwig To: Geoffrey Wehrman Cc: Dave Chinner , Christoph Hellwig , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/14] repair memory usage reductions Subject: Re: [PATCH 00/14] repair memory usage reductions Message-ID: <20090904145109.GA8351@infradead.org> References: <20090902175531.469184575@bombadil.infradead.org> <20090903204940.GB24510@sgi.com> <20090904025753.GB7146@discord.disaster> <20090904133737.GD12052@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090904133737.GD12052@sgi.com> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1252076063 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

On Fri, Sep 04, 2009 at 08:37:37AM -0500, Geoffrey Wehrman wrote:
> On Fri, Sep 04, 2009 at 12:57:53PM +1000, Dave Chinner wrote:
> | Christoph asked me to repeat what I said on #xfs w.r.t the regression.
>
> Thank you for the detailed description. All I had was a statement from
> January 2008, "Barry has completed the memory optimization, but initial
> testing shows that performance has regressed." That was the last update
> recorded on Barry's work.
>
> | With that in mind, I think the memory usage optimisation is far more
> | important to the majority of XFS users than the CPU usage regression
> | it causes as the majority of users don't have RAM-rich environments
> | to run repair in.
>
> I agree.

In my testing I haven't seen big differences in performance; it
sometimes got a bit faster and sometimes a bit slower. I will send out
a more detailed performance report in a few days.

From michael.monnerie@is.it-management.at Fri Sep 4 12:26:12 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n84HPo1f225931 for ; Fri, 4 Sep 2009 12:26:01 -0500 X-ASG-Debug-ID: 1252085181-351d00460000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailsrv5.zmi.at (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id F20F042A3CF for ; Fri, 4 Sep 2009 10:26:21 -0700 (PDT) Received: from mailsrv5.zmi.at (mailsrv5.zmi.at [212.69.164.54]) by cuda.sgi.com with ESMTP id AqPWFdWAtapbA9kd for ; Fri, 04 Sep 2009 10:26:21 -0700 (PDT) Received: from mailsrv.i.zmi.at (h081217106033.dyn.cm.kabsi.at [81.217.106.33]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailsrv2.i.zmi.at", Issuer "power4u.zmi.at" (not verified)) by mailsrv5.zmi.at (Postfix) with ESMTP id 90A736CF for ; Fri, 4 Sep 2009 19:25:45 +0200 (CEST) Received: from saturn.localnet (saturn.i.zmi.at [10.72.27.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mailsrv.i.zmi.at (Postfix) with ESMTPSA id E0FB840015E for ; Fri, 4 Sep 2009 19:25:45 +0200 (CEST) From: Michael Monnerie Organization: it-management http://it-management.at To: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH 00/14] repair memory usage reductions Subject: Re: [PATCH 00/14] repair memory usage reductions Date: Fri, 4 Sep 2009 19:24:35 +0200 User-Agent: KMail/1.10.3 (Linux/2.6.30.5-ZMI; KDE/4.1.3; x86_64; ; ) References: <20090902175531.469184575@bombadil.infradead.org> <20090904133737.GD12052@sgi.com> <20090904145109.GA8351@infradead.org> In-Reply-To:
<20090904145109.GA8351@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200909041924.35987@zmi.at> X-Barracuda-Connect: mailsrv5.zmi.at[212.69.164.54] X-Barracuda-Start-Time: 1252085201 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8101 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean

On Friday 04 September 2009 Christoph Hellwig wrote:
> In my testing I haven't seen big differences in performance, it
> sometimes got a bit faster and sometimes a bit slower. I will send
> out a more detailed performance report in a few days.

From what I've read, it should be faster on a machine with 2GB RAM and
10TB storage, while it may be slower on a 64GB RAM machine with 1TB of
xfs storage. Given that disks grow faster than RAM sizes, and that with
virtualization a single machine typically does not have much RAM these
days, I guess speed will improve overall with the patches.

Best regards, zmi
-- 
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31
.network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import" // Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4 // Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4 From sandeen@sandeen.net Fri Sep 4 17:35:15 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_64, J_CHICKENPOX_74,J_CHICKENPOX_84 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n84MYtZ7246830 for ; Fri, 4 Sep 2009 17:35:05 -0500 X-ASG-Debug-ID: 1252103740-3bd003560000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id E82FB42B84F for ; Fri, 4 Sep 2009 15:35:40 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by cuda.sgi.com with ESMTP id D3x8zZYtZCmmAfIx for ; Fri, 04 Sep 2009 15:35:40 -0700 (PDT) Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n84MZdGX025638 for ; Fri, 4 Sep 2009 18:35:39 -0400 Received: from neon.msp.redhat.com (neon.msp.redhat.com [10.15.80.10]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id n84MZcZA009199 for ; Fri, 4 Sep 2009 18:35:38 -0400 Message-ID: <4AA19639.6090208@sandeen.net> Date: Fri, 04 Sep 2009 17:35:37 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.21 (X11/20090320) MIME-Version: 1.0 To: xfs mailing list X-ASG-Orig-Subj: [PATCH] xfsprogs: mark some functions as noreturn Subject: [PATCH] xfsprogs: mark some functions as noreturn Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 X-Barracuda-Connect: mx1.redhat.com[209.132.183.28] X-Barracuda-Start-Time: 1252103745 X-Barracuda-Bayes: 
INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8121 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean Static checkers are a lot less noisy if they know certain functions are noreturn. Making this change removed about 50 errors from "clang" (http://clang-analyzer.llvm.org) output. Signed-off-by: Eric Sandeen --- diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c index 69d91c5..ab8a7d9 100644 --- a/mkfs/xfs_mkfs.c +++ b/mkfs/xfs_mkfs.c @@ -27,7 +27,7 @@ */ static void conflict(char opt, char *tab[], int oldidx, int newidx); static void illegal(char *value, char *opt); -static void reqval(char opt, char *tab[], int idx); +static __attribute__((noreturn)) void reqval(char opt, char *tab[], int idx); static void respec(char opt, char *tab[], int idx); static void unknown(char opt, char *s); static int ispow2(unsigned int i); @@ -2464,7 +2464,7 @@ ispow2( return (i & (i - 1)) == 0; } -static void +static void __attribute__((noreturn)) reqval( char opt, char *tab[], diff --git a/repair/err_protos.h b/repair/err_protos.h index 556e9b9..6944950 100644 --- a/repair/err_protos.h +++ b/repair/err_protos.h @@ -16,7 +16,11 @@ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ -void do_abort(char const *, ...); /* abort, internal error */ -void do_error(char const *, ...); /* abort, system error */ -void do_warn(char const *, ...); /* issue warning */ -void do_log(char const *, ...); /* issue log message */ +/* abort, internal error */ +void __attribute__((noreturn)) do_abort(char const *, ...); +/* abort, system error */
+void __attribute__((noreturn)) do_error(char const *, ...); +/* issue warning */ +void do_warn(char const *, ...); +/* issue log message */ +void do_log(char const *, ...); diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index e9e5965..5dfc3c3 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -351,7 +351,7 @@ do_msg(int do_abort, char const *msg, va_list args) } } -void +void __attribute__((noreturn)) do_error(char const *msg, ...) { va_list args; @@ -366,7 +366,7 @@ do_error(char const *msg, ...) * like do_error, only the error is internal, no system * error so no oserror processing */ -void +void __attribute__((noreturn)) do_abort(char const *msg, ...) { va_list args; From BATV+e49c75e08a496b1b2e4f+2203+infradead.org+hch@bombadil.srs.infradead.org Fri Sep 4 18:08:35 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n84N8AQ4248898 for ; Fri, 4 Sep 2009 18:08:25 -0500 X-ASG-Debug-ID: 1252105745-66ff00360000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id AA72A15DFCE2 for ; Fri, 4 Sep 2009 16:09:05 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34]) by cuda.sgi.com with ESMTP id DA75Dzvgw6cHJE6X for ; Fri, 04 Sep 2009 16:09:05 -0700 (PDT) X-ASG-Whitelist: Client Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1Mjhtw-0006vI-Pv; Fri, 04 Sep 2009 23:09:04 +0000 Date: Fri, 4 Sep 2009 19:09:04 -0400 From: Christoph Hellwig To: Eric Sandeen Cc: xfs mailing list X-ASG-Orig-Subj: Re: [PATCH] xfsprogs: mark some functions as noreturn Subject: Re: [PATCH] xfsprogs: mark some functions as noreturn 
Message-ID: <20090904230904.GA25934@infradead.org> References: <4AA19639.6090208@sandeen.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4AA19639.6090208@sandeen.net> User-Agent: Mutt/1.5.19 (2009-01-05) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Connect: bombadil.infradead.org[18.85.46.34] X-Barracuda-Start-Time: 1252105745 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean On Fri, Sep 04, 2009 at 05:35:37PM -0500, Eric Sandeen wrote: > Static checkers are a lot less noisy if they know certain > functions are noreturn. > > Making this change removed about 50 errors from "clang" output. > (http://clang-analyzer.llvm.org) output. Not pretty but useful, Reviewed-by: Christoph Hellwig > -void do_abort(char const *, ...); /* abort, internal error */ > -void do_error(char const *, ...); /* abort, system error */ > -void do_warn(char const *, ...); /* issue warning */ > -void do_log(char const *, ...); /* issue log message */ > +/* abort, internal error */ > +void __attribute__((noreturn)) do_abort(char const *, ...); > +/* abort, system error */ > +void __attribute__((noreturn)) do_error(char const *, ...); > +/* issue warning */ > +void do_warn(char const *, ...); > +/* issue log message */ > +void do_log(char const *, ...); It would be good to add the proper printflike attributes to these to also get vararg typechecking. 
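As a rough illustration of the suggestion above, the sketch below combines the noreturn attribute with the GCC/clang format(printf, ...) attribute, so the compiler can check variadic arguments against the format string. The names ex_warn, ex_fatal, EX_NORETURN and EX_PRINTF are hypothetical and are not part of xfsprogs; ex_warn formats into a caller-supplied buffer only so that the result is easy to inspect.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper macros; the names are illustrative only. */
#if defined(__GNUC__) || defined(__clang__)
#define EX_NORETURN	__attribute__((noreturn))
#define EX_PRINTF(f, a)	__attribute__((format(printf, f, a)))
#else
#define EX_NORETURN
#define EX_PRINTF(f, a)
#endif

/* Warning path: returns normally; format string is argument 3. */
int ex_warn(char *buf, size_t len, char const *fmt, ...) EX_PRINTF(3, 4);

/* Fatal path: never returns; format string is argument 1. */
void ex_fatal(char const *fmt, ...) EX_NORETURN EX_PRINTF(1, 2);

int
ex_warn(char *buf, size_t len, char const *fmt, ...)
{
	va_list	args;
	int	n;

	assert(fmt != NULL);
	va_start(args, fmt);
	n = vsnprintf(buf, len, fmt, args);	/* length the full message needs */
	va_end(args);
	return n;
}

void
ex_fatal(char const *fmt, ...)
{
	va_list	args;

	assert(fmt != NULL);
	va_start(args, fmt);
	vfprintf(stderr, fmt, args);
	va_end(args);
	exit(1);	/* does not return, matching the noreturn attribute */
}
```

With these declarations, a call such as ex_warn(buf, sizeof(buf), "inode %d bad", "x") should be flagged by -Wformat on compilers that support the attribute, and a static checker can treat any code following a call to ex_fatal as unreachable.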
From greg@kroah.com Fri Sep 4 19:21:07 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=BAYES_00,J_CHICKENPOX_46 autolearn=no version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n850KlBI253849 for ; Fri, 4 Sep 2009 19:20:57 -0500 X-ASG-Debug-ID: 1252110079-67f801990000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from coco.kroah.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 2D5BE42BEC7 for ; Fri, 4 Sep 2009 17:21:19 -0700 (PDT) Received: from coco.kroah.org (kroah.org [198.145.64.141]) by cuda.sgi.com with ESMTP id FqRHC31OShnAbn0g for ; Fri, 04 Sep 2009 17:21:19 -0700 (PDT) Received: from localhost (c-98-246-45-209.hsd1.or.comcast.net [98.246.45.209]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by coco.kroah.org (Postfix) with ESMTPSA id A3D6149102; Fri, 4 Sep 2009 17:21:18 -0700 (PDT) X-Mailbox-Line: From gregkh@mini.kroah.org Fri Sep 4 17:14:55 2009 Message-Id: <20090905001455.221274331@mini.kroah.org> User-Agent: quilt/0.48-1 Date: Fri, 04 Sep 2009 17:14:26 -0700 From: Greg KH To: linux-kernel@vger.kernel.org, stable@kernel.org Cc: stable-review@kernel.org, torvalds@linux-foundation.org, akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk, xfs@oss.sgi.com, Christoph Hellwig X-ASG-Orig-Subj: [patch 51/71] vfs: fix inode_init_always calling convention Subject: [patch 51/71] vfs: fix inode_init_always calling convention References: <20090905001335.106974681@mini.kroah.org> Content-Disposition: inline; filename=vfs-fix-inode_init_always-calling-convention.patch Lines: 144 In-Reply-To: <20090905001824.GA18171@kroah.com> X-Barracuda-Connect: kroah.org[198.145.64.141] X-Barracuda-Start-Time: 1252110102 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 
X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8129 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean 2.6.30-stable review patch. If anyone has any objections, please let us know. ------------------ From: Christoph Hellwig backport of upstream commit 54e346215e4fe2ca8c94c54e546cc61902060510 Currently inode_init_always calls into ->destroy_inode if the additional initialization fails. That's not only counter-intuitive because inode_init_always did not allocate the inode structure, but in case of XFS it's actively harmful as ->destroy_inode might delete the inode from a radix-tree it has never been added to. This in turn might end up deleting the inode for the same inum that has been instantiated by another process and cause lots of subtle problems. Also, in the case of re-initializing a reclaimable inode in XFS, it would free an inode we still want to keep alive. Signed-off-by: Christoph Hellwig Reviewed-by: Eric Sandeen Signed-off-by: Greg Kroah-Hartman --- fs/inode.c | 30 +++++++++++++++++------------- fs/xfs/xfs_iget.c | 17 +++++------------ include/linux/fs.h | 2 +- 3 files changed, 23 insertions(+), 26 deletions(-) --- a/fs/inode.c +++ b/fs/inode.c @@ -118,12 +118,11 @@ static void wake_up_inode(struct inode * * These are initializations that need to be done on every inode * allocation as the fields are not initialised by slab allocation.
*/ -struct inode *inode_init_always(struct super_block *sb, struct inode *inode) +int inode_init_always(struct super_block *sb, struct inode *inode) { static const struct address_space_operations empty_aops; static struct inode_operations empty_iops; static const struct file_operations empty_fops; - struct address_space *const mapping = &inode->i_data; inode->i_sb = sb; @@ -150,7 +149,7 @@ struct inode *inode_init_always(struct s inode->dirtied_when = 0; if (security_inode_alloc(inode)) - goto out_free_inode; + goto out; /* allocate and initialize an i_integrity */ if (ima_inode_alloc(inode)) @@ -189,16 +188,12 @@ struct inode *inode_init_always(struct s inode->i_private = NULL; inode->i_mapping = mapping; - return inode; + return 0; out_free_security: security_inode_free(inode); -out_free_inode: - if (inode->i_sb->s_op->destroy_inode) - inode->i_sb->s_op->destroy_inode(inode); - else - kmem_cache_free(inode_cachep, (inode)); - return NULL; +out: + return -ENOMEM; } EXPORT_SYMBOL(inode_init_always); @@ -211,9 +206,18 @@ static struct inode *alloc_inode(struct else inode = kmem_cache_alloc(inode_cachep, GFP_KERNEL); - if (inode) - return inode_init_always(sb, inode); - return NULL; + if (!inode) + return NULL; + + if (unlikely(inode_init_always(sb, inode))) { + if (inode->i_sb->s_op->destroy_inode) + inode->i_sb->s_op->destroy_inode(inode); + else + kmem_cache_free(inode_cachep, inode); + return NULL; + } + + return inode; } void destroy_inode(struct inode *inode) --- a/fs/xfs/xfs_iget.c +++ b/fs/xfs/xfs_iget.c @@ -63,6 +63,10 @@ xfs_inode_alloc( ip = kmem_zone_alloc(xfs_inode_zone, KM_SLEEP); if (!ip) return NULL; + if (inode_init_always(mp->m_super, VFS_I(ip))) { + kmem_zone_free(xfs_inode_zone, ip); + return NULL; + } ASSERT(atomic_read(&ip->i_iocount) == 0); ASSERT(atomic_read(&ip->i_pincount) == 0); @@ -104,17 +108,6 @@ xfs_inode_alloc( #ifdef XFS_DIR2_TRACE ip->i_dir_trace = ktrace_alloc(XFS_DIR2_KTRACE_SIZE, KM_NOFS); #endif - /* - * Now initialise the VFS 
inode. We do this after the xfs_inode - * initialisation as internal failures will result in ->destroy_inode - * being called and that will pass down through the reclaim path and - * free the XFS inode. This path requires the XFS inode to already be - * initialised. Hence if this call fails, the xfs_inode has already - * been freed and we should not reference it at all in the error - * handling. - */ - if (!inode_init_always(mp->m_super, VFS_I(ip))) - return NULL; /* prevent anyone from using this yet */ VFS_I(ip)->i_state = I_NEW|I_LOCK; @@ -166,7 +159,7 @@ xfs_iget_cache_hit( * errors cleanly, then tag it so it can be set up correctly * later. */ - if (!inode_init_always(mp->m_super, VFS_I(ip))) { + if (inode_init_always(mp->m_super, VFS_I(ip))) { error = ENOMEM; goto out_error; } --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2135,7 +2135,7 @@ extern loff_t default_llseek(struct file extern loff_t vfs_llseek(struct file *file, loff_t offset, int origin); -extern struct inode * inode_init_always(struct super_block *, struct inode *); +extern int inode_init_always(struct super_block *, struct inode *); extern void inode_init_once(struct inode *); extern void inode_add_to_lists(struct super_block *, struct inode *); extern void iput(struct inode *); From greg@kroah.com Fri Sep 4 19:21:10 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n850KoMZ253853 for ; Fri, 4 Sep 2009 19:21:00 -0500 X-ASG-Debug-ID: 1252110080-3e6303190000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from coco.kroah.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5E33642BEC9 for ; Fri, 4 Sep 2009 17:21:20 -0700 (PDT) Received: from coco.kroah.org (kroah.org 
[198.145.64.141]) by cuda.sgi.com with ESMTP id cpdJCYQ8cHIuQ8Gx for ; Fri, 04 Sep 2009 17:21:20 -0700 (PDT) Received: from localhost (c-98-246-45-209.hsd1.or.comcast.net [98.246.45.209]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by coco.kroah.org (Postfix) with ESMTPSA id 9903D49107; Fri, 4 Sep 2009 17:21:19 -0700 (PDT) X-Mailbox-Line: From gregkh@mini.kroah.org Fri Sep 4 17:14:55 2009 Message-Id: <20090905001455.382972737@mini.kroah.org> User-Agent: quilt/0.48-1 Date: Fri, 04 Sep 2009 17:14:27 -0700 From: Greg KH To: linux-kernel@vger.kernel.org, stable@kernel.org Cc: stable-review@kernel.org, torvalds@linux-foundation.org, akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk, xfs@oss.sgi.com, Christoph Hellwig X-ASG-Orig-Subj: [patch 52/71] vfs: add __destroy_inode Subject: [patch 52/71] vfs: add __destroy_inode References: <20090905001335.106974681@mini.kroah.org> Content-Disposition: inline; filename=vfs-add-__destroy_inode.patch Lines: 60 In-Reply-To: <20090905001824.GA18171@kroah.com> X-Barracuda-Connect: kroah.org[198.145.64.141] X-Barracuda-Start-Time: 1252110105 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -1.52 X-Barracuda-Spam-Status: No, SCORE=-1.52 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests=BSF_RULE7568M X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8129 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_RULE7568M Custom Rule 7568M X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean 2.6.30-stable review patch. If anyone has any objections, please let us know. 
------------------ From: Christoph Hellwig backport of upstream commit 2e00c97e2c1d2ffc9e26252ca26b237678b0b772 When we want to tear down an inode that lost the add to the cache race in XFS we must not call into ->destroy_inode because that would delete the inode that won the race from the inode cache radix tree. This patch provides the __destroy_inode helper needed to fix this; the actual fix will be in the next patch. As XFS was the only reason destroy_inode was exported we shift the export to the new __destroy_inode. Signed-off-by: Christoph Hellwig Reviewed-by: Eric Sandeen Signed-off-by: Greg Kroah-Hartman --- fs/inode.c | 10 +++++++--- include/linux/fs.h | 1 + 2 files changed, 8 insertions(+), 3 deletions(-) --- a/fs/inode.c +++ b/fs/inode.c @@ -220,18 +220,22 @@ static struct inode *alloc_inode(struct return inode; } -void destroy_inode(struct inode *inode) +void __destroy_inode(struct inode *inode) { BUG_ON(inode_has_buffers(inode)); ima_inode_free(inode); security_inode_free(inode); +} +EXPORT_SYMBOL(__destroy_inode); + +void destroy_inode(struct inode *inode) +{ + __destroy_inode(inode); if (inode->i_sb->s_op->destroy_inode) inode->i_sb->s_op->destroy_inode(inode); else kmem_cache_free(inode_cachep, (inode)); } -EXPORT_SYMBOL(destroy_inode); - /* * These are initializations that only need to be done --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2162,6 +2162,7 @@ extern void __iget(struct inode * inode) extern void iget_failed(struct inode *); extern void clear_inode(struct inode *); extern void destroy_inode(struct inode *); +extern void __destroy_inode(struct inode *); extern struct inode *new_inode(struct super_block *); extern int should_remove_suid(struct dentry *); extern int file_remove_suid(struct file *); From greg@kroah.com Fri Sep 4 19:21:07 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n850Klox253845 for ; Fri, 4 Sep 2009 19:20:57 -0500 X-ASG-Debug-ID: 1252110081-3e6602e00000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from coco.kroah.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3BED442BEC8 for ; Fri, 4 Sep 2009 17:21:22 -0700 (PDT) Received: from coco.kroah.org (kroah.org [198.145.64.141]) by cuda.sgi.com with ESMTP id lJtLpJ6HvXPsaDV5 for ; Fri, 04 Sep 2009 17:21:22 -0700 (PDT) Received: from localhost (c-98-246-45-209.hsd1.or.comcast.net [98.246.45.209]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by coco.kroah.org (Postfix) with ESMTPSA id 9A99549111; Fri, 4 Sep 2009 17:21:21 -0700 (PDT) X-Mailbox-Line: From gregkh@mini.kroah.org Fri Sep 4 17:14:55 2009 Message-Id: <20090905001455.686767014@mini.kroah.org> User-Agent: quilt/0.48-1 Date: Fri, 04 Sep 2009 17:14:29 -0700 From: Greg KH To: linux-kernel@vger.kernel.org, stable@kernel.org Cc: stable-review@kernel.org, torvalds@linux-foundation.org, akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk, Felix Blyakher , xfs@oss.sgi.com, Christoph Hellwig X-ASG-Orig-Subj: [patch 54/71] xfs: fix spin_is_locked assert on uni-processor builds Subject: [patch 54/71] xfs: fix spin_is_locked assert on uni-processor builds References: <20090905001335.106974681@mini.kroah.org> Content-Disposition: inline; filename=xfs-fix-spin_is_locked-assert-on-uni-processor-builds.patch Lines: 32 In-Reply-To: <20090905001824.GA18171@kroah.com> X-Barracuda-Connect: kroah.org[198.145.64.141] X-Barracuda-Start-Time: 1252110105 X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210 X-Barracuda-Virus-Scanned: by cuda.sgi.com at sgi.com X-Barracuda-Spam-Score: -2.02 X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=2.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=2.1 tests= 
X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.8129 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Virus-Scanned: ClamAV version 0.94.2, clamav-milter version 0.94.2 on oss.sgi.com X-Virus-Status: Clean 2.6.30-stable review patch. If anyone has any objections, please let us know. ------------------ From: Christoph Hellwig upstream commit a8914f3a6d72c97328597a556a99daaf5cc288ae Without SMP or preemption spin_is_locked always returns false, so we can't do an assert with it. Instead use assert_spin_locked, which does the right thing on all builds. Signed-off-by: Christoph Hellwig Reviewed-by: Eric Sandeen Reported-by: Johannes Engel Tested-by: Johannes Engel Signed-off-by: Felix Blyakher Signed-off-by: Greg Kroah-Hartman --- fs/xfs/xfs_log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -3180,7 +3180,7 @@ try_again: STATIC void xlog_state_want_sync(xlog_t *log, xlog_in_core_t *iclog) { - ASSERT(spin_is_locked(&log->l_icloglock)); + assert_spin_locked(&log->l_icloglock); if (iclog->ic_state == XLOG_STATE_ACTIVE) { xlog_state_switch_iclogs(log, iclog, 0); From greg@kroah.com Fri Sep 4 19:21:14 2009 X-Spam-Checker-Version: SpamAssassin 3.3.0-rupdated (updated) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-rupdated Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n850KsGT253858 for ; Fri, 4 Sep 2009 19:21:04 -0500 X-ASG-Debug-ID: 1252110080-6ea901300000-NocioJ X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from coco.kroah.org (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8065D42BEC9 for ; Fri, 4 Sep 2009 17:21:21 -0700 (PDT) Received: from coco.kroah.org (kroah.org [198.145.64.141]) by cuda.sgi.com with ESMTP id DEQR2KPjYbExrBt8 for ; 
Fri, 04 Sep 2009 17:21:21 -0700 (PDT) Received: from localhost (c-98-246-45-209.hsd1.or.comcast.net [98.246.45.209]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by coco.kroah.org (Postfix) with ESMTPSA id 86FE54910B; Fri, 4 Sep 2009 17:21:20 -0700 (PDT) X-Mailbox-Line: From gregkh@mini.kroah.org Fri Sep 4 17:14:55 2009 Message-Id: <20090905001455.522626105@mini.kroah.org> User-Agent: quilt/0.48-1 Date: Fri, 04 Sep 2009 17:14:28 -0700 From: Greg KH To: linux-kernel@vger.kernel.org, stable@kernel.org Cc: stable-review@kernel.org, torvalds@linux-foundation.org, akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk, xfs@oss.sgi.com, Christoph Hellwig X-ASG-Orig-Subj: [patch 53/71] xfs: fix freeing of inodes not yet added to the inode cache Subject