
Re: [PATCH] Always reset btree cursor after an insert

To: Dave Chinner <dchinner@xxxxxxxxx>
Subject: Re: [PATCH] Always reset btree cursor after an insert
From: Lachlan McIlroy <lachlan@xxxxxxx>
Date: Tue, 17 Jun 2008 11:24:17 +1000
Cc: xfs-dev <xfs-dev@xxxxxxx>, xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <200806161014.04407.dchinner@xxxxxxxxx>
References: <4855CE2D.70505@xxxxxxx> <20080616050150.GK3700@disturbed> <48560B0B.3060204@xxxxxxx> <200806161014.04407.dchinner@xxxxxxxxx>
Reply-to: lachlan@xxxxxxx
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird 2.0.0.14 (X11/20080421)
Dave Chinner wrote:
> On Sunday 15 June 2008 11:41 pm, Lachlan McIlroy wrote:
>> Dave Chinner wrote:
>>> On Mon, Jun 16, 2008 at 12:21:33PM +1000, Lachlan McIlroy wrote:
>>>> After a btree insert operation a cursor can be invalid due to block
>>>> splits and maybe a new root block.  We reset the cursor in
>>>> xfs_bmbt_insert() in the cases where we think we need to but it
>>>> isn't enough as we still see assertions.  Just do what we do elsewhere
>>>> and reset the cursor unconditionally.
>>> Ok, so you should also kill the new code in the btree insert that
>>> revalidates the btree cursor. IIRC, this was the only place it was
>>> needed for....
>> That was the plan.
>
> Cool. Please include it in the next version of the patch.
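
The stale-cursor problem described above can be illustrated with a minimal sketch. This is not XFS code: the toy_* names, the sorted-array "tree" and the slot-index cursor are all hypothetical, chosen only to show why a position remembered before an insert cannot be trusted afterwards and why an unconditional lookup restores it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy "tree": a sorted array of record keys. Not XFS code. */
#define TOY_MAX_RECS 16

struct toy_tree {
	unsigned long keys[TOY_MAX_RECS];
	size_t nrecs;
};

/* Hypothetical cursor: remembers a slot index, much as a btree cursor
 * remembers a block and a record index within that block. */
struct toy_cursor {
	struct toy_tree *tree;
	size_t slot;
};

/* Point the cursor at the first key >= 'key'.
 * *stat is 1 on an exact match, 0 otherwise. */
static void toy_lookup_eq(struct toy_cursor *cur, unsigned long key, int *stat)
{
	size_t i = 0;

	while (i < cur->tree->nrecs && cur->tree->keys[i] < key)
		i++;
	cur->slot = i;
	*stat = (i < cur->tree->nrecs && cur->tree->keys[i] == key);
}

/* Insert 'key' in sorted order. Returns 0 on success, -1 if the array is
 * full or the key already exists. Deliberately leaves cursors untouched. */
static int toy_insert(struct toy_tree *t, unsigned long key)
{
	size_t i = 0;

	if (t->nrecs == TOY_MAX_RECS)
		return -1;
	while (i < t->nrecs && t->keys[i] < key)
		i++;
	if (i < t->nrecs && t->keys[i] == key)
		return -1;
	memmove(&t->keys[i + 1], &t->keys[i],
		(t->nrecs - i) * sizeof(t->keys[0]));
	t->keys[i] = key;
	t->nrecs++;
	return 0;
}

/* Returns 1 if the cursor went stale after the insert and the
 * unconditional re-lookup repaired it, 0 otherwise. */
static int toy_demo(void)
{
	struct toy_tree t = { .keys = { 10, 30, 40 }, .nrecs = 3 };
	struct toy_cursor cur = { .tree = &t, .slot = 0 };
	int stat, stale;

	toy_lookup_eq(&cur, 30, &stat);		/* cursor now at key 30 */
	if (!stat)
		return 0;
	if (toy_insert(&t, 20))			/* records shift under the cursor */
		return 0;
	stale = (t.keys[cur.slot] != 30);	/* old slot no longer holds 30 */
	toy_lookup_eq(&cur, 30, &stat);		/* reset unconditionally */
	return stale && stat == 1 && t.keys[cur.slot] == 30;
}
```

Calling toy_demo() exercises the sequence: look up, insert, observe the stale slot, look up again. A real btree can also split blocks or grow a new root, which this sketch does not model.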

>>>> +                       if ((error = xfs_bmbt_lookup_eq(cur, new->br_startoff,
>>>> +                                       new->br_startblock, new->br_blockcount,
>>>> +                                       &i)))
>>>>                                 goto done;
>>>
>>>                         error = xfs_bmbt_lookup_eq(cur, new->br_startoff,
>>>                                         new->br_startblock, new->br_blockcount, &i);
>>>                         if (error)
>>>                                 goto done;
>>>
>>>> -                       ASSERT(i == 1);
>>>> +                       ASSERT(i == 0);
>>> ASSERT? How about a WANT_CORRUPTED_GOTO()?
>> I was just being consistent with the rest of the code.  If you think this
>> ASSERT should be changed then what about all of them?
>
> Well, the ASSERT means silent failure on a production system.
> Given that failure here indicates a corrupt btree, then we really
> should be treating it as such. i.e. shut down the filesystem. This
> might have saved us a whole heap of trouble tracking down these
> problems as a shutdown here would have pointed us right at the
> source of the problem...


Dave, what I was asking is what about the rest of the ASSERTs in
this file - should we change them too?  There's a lot of them.
After this change they are all equally likely to trigger so if
it makes sense to change one then the same argument applies to
all of them.
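
To make the difference under discussion concrete, here is a minimal sketch of the two styles of check. The TOY_* macros are hypothetical stand-ins and are not the XFS definitions; TOY_ECORRUPTED is simply mapped to EIO for illustration. The sketch assumes a debug-only assert that compiles to nothing in a production build, and a corruption check that is always compiled in and bails out to a cleanup label.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for illustration only, not the XFS macros. */
#ifdef TOY_DEBUG
#define TOY_ASSERT(expr)	assert(expr)
#else
#define TOY_ASSERT(expr)	((void)0)	/* compiled out in production */
#endif

#define TOY_ECORRUPTED	EIO	/* assumption: any "corrupt" error code */

/* Always compiled in: on a failed check, record an error in the local
 * variable 'error' and jump to the cleanup label. */
#define TOY_WANT_CORRUPTED_GOTO(expr, label)	\
	do {					\
		if (!(expr)) {			\
			error = TOY_ECORRUPTED;	\
			goto label;		\
		}				\
	} while (0)

/* 'stat' plays the role of the lookup result 'i' in the quoted patch,
 * 'expected' is the value the caller requires. */
static int toy_check_with_assert(int stat, int expected)
{
	int error = 0;

	TOY_ASSERT(stat == expected);	/* no effect without TOY_DEBUG */
	/* ... a real caller would carry on modifying the tree here ... */
	return error;
}

static int toy_check_with_goto(int stat, int expected)
{
	int error = 0;

	TOY_WANT_CORRUPTED_GOTO(stat == expected, done);
	/* ... only reached when the check passed ... */
done:
	return error;
}
```

Built without TOY_DEBUG, toy_check_with_assert(1, 0) returns 0 and the mismatch goes unnoticed, while toy_check_with_goto(1, 0) returns TOY_ECORRUPTED so the caller can react, for instance by shutting the filesystem down.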

