On Tue, 15 Jun 2004 15:29:09 +1000
Nathan Scott <nathans@xxxxxxx> wrote:
> > My theory is that one of the ENOSPC error paths forgets to brelse a buffer,
> > which causes the semaphore to get lost and causes the deadlocks later. This
> > is likely an AGF buffer, since xfs_alloc_read_agf() and children are
> > blocking too.
> Yep, makes sense.
Unfortunately not. I noticed that even the normal path (without any errors)
does not brelse the buffers. This means they must be freed somewhere else
and the few brelses that are there are likely bogus. I just haven't found
the code that does this. I suppose they are somehow tracked in the
transaction, which makes sense.
Also, adding brelses did not stop the deadlocks I was seeing.
I added some instrumentation and all lost buffers seem to originate
from xfs_alloc_fix_freelists (with different calls). I also tried with
only a single CPU and it still happens, so it isn't an SMP race.
I will look into this more later.
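
For reference, the failure mode from the original theory (a buffer lock
leaked on an error path, so that a later caller can never take it) looks
roughly like the toy user-space model below. Nothing here is XFS code:
toy_buf, toy_buf_trylock, toy_buf_release and toy_alloc are names made up
for illustration, and a plain flag stands in for the buffer semaphore. Where
the real code would block forever, the model returns -EDEADLK instead.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a locked buffer; NOT the real XFS buffer structure. */
struct toy_buf {
    bool locked;         /* models the buffer semaphore */
    const char *holder;  /* who took the lock, for instrumentation */
};

/* Try to take the buffer lock; returns false where real code would block. */
bool toy_buf_trylock(struct toy_buf *bp, const char *who)
{
    if (bp->locked)
        return false;
    bp->locked = true;
    bp->holder = who;
    return true;
}

/* Models a brelse-style release: drop the lock and forget the holder. */
void toy_buf_release(struct toy_buf *bp)
{
    bp->locked = false;
    bp->holder = NULL;
}

/*
 * Hypothetical allocation step. If release_on_error is false, the
 * out-of-space path returns without releasing the buffer, which is
 * the leak the theory describes.
 */
int toy_alloc(struct toy_buf *agf, bool out_of_space, bool release_on_error)
{
    if (!toy_buf_trylock(agf, "toy_alloc"))
        return -EDEADLK;  /* real code would sleep here forever */

    if (out_of_space) {
        if (release_on_error)
            toy_buf_release(agf);
        return -ENOSPC;
    }

    toy_buf_release(agf);
    return 0;
}
```

With release_on_error set to false, one failed toy_alloc() leaves the buffer
locked with holder still pointing at "toy_alloc", and every later call on the
same buffer returns -EDEADLK. Recording the holder is the same idea as the
instrumentation mentioned above: it tells you which caller a lost buffer
came from.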