On Fri, 2011-01-21 at 04:22 -0500, Christoph Hellwig wrote:
> During an allocation or freeing of an extent, we occasionally have an
> interesting situation happen during a btree update. We may merge a block
> as part of a record delete, then allocate it again immediately afterwards
> when inserting a modified extent. xfs_alloc_fixup_trees() does this sort
> of manipulation to the btrees, and so can merge then immediately split
> the tree, resulting in the same busy block being used from the
Methinks someone must have observed this stuff by tracing...
> Previously, this would trigger a synchronous log force of the entire log
> (as the current transaction had an lsn of 0 until committed), but continue
> to allow the extent to be used because if it is not on disk now then it
> must have been freed and marked busy in this transaction.
This could explain a mysterious slowdown I was looking at
a while ago.
> In the case of allocbt blocks, we log the fact that they get moved to the
> freelist and we also log them when they move off the free list back into
> the free space trees. When we move the blocks off the freelist back to
> the trees, we mark the extent busy at that point so that it doesn't get
> reused until the transaction is committed.
> This means that as long as the block is on the AGFL and is only used
> for allocbt blocks, we don't have to force the log to ensure prior
> manipulating transactions are on disk as the process of log recovery
> will replay the transactions in the correct order.
> However, we still need to protect against the fact that as we approach
> ENOSPC in an AG we can allocate data and metadata blocks direct from the
> AGFL. In this case, we need to treat btree blocks just like any other
> busy extent. Hence we still need to track btree blocks in the busy extent
> tree, but we need to distinguish them from normal extents so we can apply
> the necessary exceptions for btree block allocation.
> Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
Sorry, this ended up being a pretty pointless note...
I see what you're doing here but have not completed
my detailed review. I also have read Dave's comments
and I think he's right that what the block is being
allocated *for* (metadata or data) determines whether
the log force is required.
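
To make sure I have the rule straight, here is a rough sketch of
the decision as I read it from the description above. The enum and
function names are made up for illustration and are not taken from
the patch; the split is "allocbt reuse" versus "everything else",
which is my assumption about how the exception would be keyed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tags for an entry in the busy extent tree. */
enum busy_kind {
	BUSY_NONE,	/* block is not tracked as busy */
	BUSY_EXTENT,	/* ordinary freed extent */
	BUSY_ALLOCBT	/* freed allocbt block sitting on the AGFL */
};

/* Hypothetical description of what the block is being allocated for. */
enum alloc_purpose {
	ALLOC_FOR_ALLOCBT,	/* goes back into the free space btrees */
	ALLOC_FOR_OTHER		/* data or other metadata, e.g. near ENOSPC */
};

/*
 * Return true if reusing the block needs a log force first.
 *
 * A freed allocbt block that is reused as an allocbt block needs no
 * force, because log recovery replays the transactions in order.
 * Any other reuse of a busy block is treated like a normal busy extent.
 */
static bool need_log_force(enum busy_kind busy, enum alloc_purpose purpose)
{
	assert(purpose == ALLOC_FOR_ALLOCBT || purpose == ALLOC_FOR_OTHER);

	if (busy == BUSY_NONE)
		return false;
	if (busy == BUSY_ALLOCBT && purpose == ALLOC_FOR_ALLOCBT)
		return false;
	return true;
}
```

If that matches the intent, then the interesting case is a
BUSY_ALLOCBT block handed out for data, which still has to take
the slow path.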
Anyway, I need to go now unfortunately. I'll try to
finish reviewing this and the last one by Monday.