On Mon, Oct 27, 2008 at 01:30:58PM +1100, Timothy Shimmin wrote:
> Dave Chinner wrote:
> > Ok, I think I've found the regression - it's introduced by the AIL
> > cursor modifications. The patch below has been running for 15
> > minutes now on my UML box that would have hung in a couple of
> > minutes otherwise.
> Yeah, the fix looks good. The previous code is pretty
> obviously broken - a search which always returns NULL.
> Which raises the question of the best way to test this AIL code.
> I dunno - it would be nice to have independent testing of data
> structures, but perhaps that is too ambitious.
> OOC, so the call path for this code....
>   xfsaild -> xfsaild_push(ailp, &last_pushed_lsn)
>     -> lip = xfs_trans_ail_cursor_first(ailp, cur, *last_lsn)
> Initially, last_lsn = 0 in xfsaild
> but it will be updated via last_pushed_lsn.
> So it looks like things will work initially when lsn == 0, because
> xfs_trans_ail_cursor_first special-cases that and uses the min.
> But as soon as the lsn is set to non-zero,
> xfs_trans_ail_cursor_first will return NULL,
> and xfsaild_push will return early.
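
To make the failure mode concrete, here is a minimal user-space sketch of the
pattern described above. It is not the kernel source: the list type, the
lsn_t typedef and both function names are hypothetical stand-ins, and a plain
singly linked list sorted by LSN is assumed in place of the real AIL. The
first function resets its result after the search no matter what the loop
found; the second skips the reset whenever a match is found.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;            /* hypothetical stand-in for an LSN type */

struct log_item {                  /* hypothetical stand-in for an AIL item */
	lsn_t lsn;
	struct log_item *next;     /* NULL terminates the list */
};

/*
 * Broken pattern: leaving the loop with "break" falls through to the
 * reset, so any non-zero lsn yields NULL even when a match exists.
 */
struct log_item *cursor_first_broken(struct log_item *head, lsn_t lsn)
{
	struct log_item *lip;

	if (lsn == 0)
		return head;       /* special case: start at the minimum */

	for (lip = head; lip != NULL; lip = lip->next) {
		if (lip->lsn >= lsn)
			break;
	}
	lip = NULL;                /* always reached: the match is discarded */
	return lip;
}

/*
 * Fixed pattern: a match jumps over the reset, so NULL is returned only
 * when the walk runs off the end of the list.
 */
struct log_item *cursor_first_fixed(struct log_item *head, lsn_t lsn)
{
	struct log_item *lip;

	if (lsn == 0)
		return head;

	for (lip = head; lip != NULL; lip = lip->next) {
		if (lip->lsn >= lsn)
			goto out;  /* skip the "lip = NULL" below */
	}
	lip = NULL;                /* no item at or beyond the requested lsn */
out:
	return lip;
}
```

With items at LSNs 10, 20 and 30, the broken variant returns the head for
lsn == 0 and NULL for every other value, which matches the "works at first,
then returns early" behaviour in the call path above.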
Right - that was the bug. With the fix we will only return NULL if
we walk off the end of the AIL list before we get to the LSN being
requested to start at. Otherwise we jump over the "lip = NULL" and
start at the first log item with an LSN greater than or equal to the