
Re: TAKE 950027 - xfs_icsb_lock_all_counters fails with CONFIG_PREEMPT a

To: Linus Torvalds <torvalds@xxxxxxxx>
Subject: Re: TAKE 950027 - xfs_icsb_lock_all_counters fails with CONFIG_PREEMPT and >=256p
From: David Chinner <dgc@xxxxxxx>
Date: Fri, 3 Mar 2006 07:32:29 +1100
Cc: Andi Kleen <ak@xxxxxxx>, David Chinner <dgc@xxxxxxx>, linux-xfs@xxxxxxxxxxx, mingo@xxxxxxx, tony.luck@xxxxxxxxx
In-reply-to: <Pine.LNX.4.64.0603021003420.22647@xxxxxxxxxxx>
References: <20060301125320.20FDA49F1681@xxxxxxxxxxxxxxxxxxxxxxx> <200603021309.46495.ak@xxxxxxx> <Pine.LNX.4.64.0603020846080.22647@xxxxxxxxxxx> <200603021852.35443.ak@xxxxxxx> <Pine.LNX.4.64.0603021003420.22647@xxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/
On Thu, Mar 02, 2006 at 10:06:50AM -0800, Linus Torvalds wrote:
> On Thu, 2 Mar 2006, Andi Kleen wrote:
> >
> > That would be spinlock nesting of 2 per CPU on a 1024 CPU machine. 
> > Not exactly plenty. Better 12-14 at least.
> That's PLENTY.
> First off, name somebody who sells bigger systems than that.
> Second, taking two spinlocks per CPU is crazy in the first place. Who does 
> that? Talk about scaling badly.

Nobody does that - it was a purely hypothetical question. The algorithm
needs to lock out all the per-cpu counters while state transitions occur
(i.e. while it does global manipulations of the counters).

> Third, code that cares can avoid it entirely by just bumping the preempt 
> counter _once_, and then using the raw spinlock code. That's how you 
> should do it for big-rw-locks anyway, just because it's less insane, if 
> that is what XFS is doing with its "two spinlocks per CPU" thing.

Exactly the change I made, except it uses an atomic bit field and ndelay
rather than raw spinlocks. Spinlocks were a bad choice in the first
place because I was not aware of the counter overflow problem.
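
For readers following along, here is a rough userspace sketch of the idea
(not the actual XFS patch). It assumes C11 atomics, uses nanosleep() as a
stand-in for the kernel's ndelay(), and every name in it (demo_cntr,
demo_lock_all, NR_CPUS_DEMO, CNTR_LOCK_BIT) is made up for illustration.
The point is that taking a lock bit per counter does not bump any
per-lock nesting count, so locking every counter in turn cannot overflow it.

```c
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define NR_CPUS_DEMO  4          /* hypothetical CPU count for the sketch */
#define CNTR_LOCK_BIT 0x1UL      /* hypothetical "locked" flag bit */

struct demo_cntr {
    atomic_ulong flags;          /* bit 0 acts as the lock */
    long count;                  /* this CPU's share of the counter */
};

static struct demo_cntr demo_cntrs[NR_CPUS_DEMO];

/* Try once to set the lock bit; true means we now own the counter. */
static bool demo_trylock(struct demo_cntr *c)
{
    unsigned long old = atomic_fetch_or_explicit(&c->flags, CNTR_LOCK_BIT,
                                                 memory_order_acquire);
    return (old & CNTR_LOCK_BIT) == 0;
}

/* Spin until the bit is ours, backing off briefly between attempts. */
static void demo_lock(struct demo_cntr *c)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000 };

    while (!demo_trylock(c))
        nanosleep(&ts, NULL);    /* userspace stand-in for ndelay() */
}

/* Clear the lock bit; the caller must currently hold it. */
static void demo_unlock(struct demo_cntr *c)
{
    unsigned long old = atomic_fetch_and_explicit(&c->flags, ~CNTR_LOCK_BIT,
                                                  memory_order_release);
    assert(old & CNTR_LOCK_BIT);
    (void)old;
}

/* Lock out every per-CPU counter, always in the same order. */
static void demo_lock_all(void)
{
    for (size_t i = 0; i < NR_CPUS_DEMO; i++)
        demo_lock(&demo_cntrs[i]);
}

static void demo_unlock_all(void)
{
    for (size_t i = 0; i < NR_CPUS_DEMO; i++)
        demo_unlock(&demo_cntrs[i]);
}

/* A "global manipulation": sum all counters while they are locked out. */
static long demo_sum_all(void)
{
    long sum = 0;

    demo_lock_all();
    for (size_t i = 0; i < NR_CPUS_DEMO; i++)
        sum += demo_cntrs[i].count;
    demo_unlock_all();
    return sum;
}
```

In the kernel the equivalent would presumably disable preemption once
around the whole lock-all/unlock-all section rather than once per counter;
the sketch leaves that out because userspace has no equivalent.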


Dave Chinner
R&D Software Engineer
SGI Australian Software Group
