Re: [PATCH] fix corruption case for block size < page size

To: lachlan@xxxxxxx
Subject: Re: [PATCH] fix corruption case for block size < page size
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Tue, 16 Dec 2008 00:10:53 -0600
Cc: xfs-oss <xfs@xxxxxxxxxxx>
In-reply-to: <49474530.2080809@xxxxxxx>
References: <49435F35.40109@xxxxxxxxxxx> <4943FCD7.2010509@xxxxxxxxxxx> <494735D9.8020809@xxxxxxx> <49473F5C.3070308@xxxxxxxxxxx> <49474530.2080809@xxxxxxx>
User-agent: Thunderbird 2.0.0.18 (Macintosh/20081105)
Lachlan McIlroy wrote:
> Eric Sandeen wrote:
>> Actually, after the truncate down step (3) we should have:
>>
>>      |<--------trunc-----------------------
>> 3: |11??|                                       trunc down to 1/2 block
>>      ^
>>      |
>>     EOF
>>
>> Hm, but does the end of this block get zeroed now or only when we
>> subsequently extend the size?  The latter I think...?
> Only when extending the file size.

Right.

>> So I think in the next step:
>>
>>  trunc-->|
>> 4: |1100|                                        trunc up to block+1byte
>>       ^^
>>   now || this part of the block gets zeroed, right, by xfs_zero_eof?
> Yes (by xfs_zero_last_block()).

Right.  :)  But I *think* that after this step we are actually zeroing
into block 1 (2nd block) and causing it to get zeroed/mapped.  Off by
one maybe?
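
To make the suspected off-by-one concrete, here is a rough sketch of the byte-range arithmetic only. It assumes a 512-byte block size (inferred from the "trunc down to 1/2 block" = 256 bytes above); the helper names `blocks_touched` and `tail_of_last_block` are hypothetical and are not XFS functions. It shows that zeroing just the tail of the old EOF block stays inside block 0, while a range running all the way to the new EOF at 513 bytes reaches into block 1.

```python
BLOCK_SIZE = 512  # assumption: 256 bytes is described as "1/2 block" in this thread


def blocks_touched(old_eof, new_eof, block_size=BLOCK_SIZE):
    """Return the block indices overlapped by the byte range [old_eof, new_eof)."""
    if new_eof <= old_eof:
        return []
    first = old_eof // block_size
    last = (new_eof - 1) // block_size  # index of the block holding the last byte
    return list(range(first, last + 1))


def tail_of_last_block(old_eof, block_size=BLOCK_SIZE):
    """Return (start, end) of the range from old_eof to the end of its block.

    The range is empty (start == end) when old_eof is block-aligned.
    """
    end = -(-old_eof // block_size) * block_size  # round old_eof up to a block boundary
    return (old_eof, end)


if __name__ == "__main__":
    # Step 3 leaves EOF at 256; step 4 truncates up to block + 1 byte = 513.
    print(tail_of_last_block(256))    # range that only finishes block 0
    print(blocks_touched(256, 512))   # zeroing just that tail
    print(blocks_touched(256, 513))   # zeroing all the way to the new EOF
```

If the zeroing in step 4 covered the second range rather than the first, block 1 would be touched even though it should still be a hole. Whether the kernel code actually does that is exactly what is still being checked here.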

>>> Because of the truncate to 256 bytes
>>> only the first block is allocated and everything beyond 512 bytes is
>>> a hole.  
>> Yep, up until the last write anyway.
>>
>>> More specifically there is a hole under the remainder of the
>>> page so xfs_zero_eof() will skip that region and not zero anything.
>> Well, the last write (step 5) is still completely within the page...
>>
>> Right, that's what it *should* be doing; but in page_state_convert (and
>> I'll admit to not having this 100% nailed down) we write block 1 and map
>> blocks 2 & 3 back into the file, and get:
>>
>> # |1100|0000|1111|1111|2222|----|----|----|
>>              ^^^^ ^^^^
>> where these  |||| |||| blocks are stale data, and block 1 is written
>> (but at least zeroed).  How block 1 got zeroed I guess I'm not quite
> I think block 1 got zeroed during the last write because the file size
> was extended from 513 to 2048.  Byte 513 is just inside block 1.  But
> that block should have been a hole and xfs_zero_last_block() should
> have skipped it.

I think the 2nd extending write does skip it, but after a bit more looking,
the first extending truncate might step into it by one... still looking
into that.

>> certain yet.  But it does not appear that blocks 2 and 3 get *written*
>> any time other than step 1; blktrace seems to confirm this.  block 1
>> does get written, and 0s are written.  (But I don't think this block
>> ever should get written either; EOF landed there but only via truncate,
>> not a write).
> Agree.
> 
>> Crap, now you've got me slightly confused again, and I'll need to look a
>> bit more to be sure I'm 100% clear on what's getting zeroed and when vs.
>> what's getting mapped and why.  :)
> That makes two.

:)

> Something else to consider is that there may be allocated blocks
> entirely beyond eof due to speculative allocation.  This means that just
> because a block within a page is beyond eof does not mean it covers a
> hole.  This is why xfs_zero_eof() looks for blocks to zero between the
> old eof and the new eof.

True... yeah, my test may yet be a bit naive.
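
As a toy illustration of that point (not the real xfs_zero_eof() logic), the sketch below models a file as a set of allocated block indices and picks out the blocks between the old and new EOF that are actually allocated. The 512-byte block size, the block-granular model, the helper name `blocks_to_zero`, and the example allocation sets are all assumptions made up for illustration.

```python
BLOCK_SIZE = 512  # assumption, matching the 512-byte blocks discussed in this thread


def blocks_to_zero(old_eof, new_eof, allocated, block_size=BLOCK_SIZE):
    """Return allocated block indices overlapping [old_eof, new_eof).

    Blocks in that range which are not in `allocated` are holes and are skipped.
    This is block-granular only; partial-block handling is ignored.
    """
    if new_eof <= old_eof:
        return []
    first = old_eof // block_size
    last = (new_eof - 1) // block_size
    return [b for b in range(first, last + 1) if b in allocated]


if __name__ == "__main__":
    # Hypothetical layout: block 0 written, blocks 2 and 3 left allocated
    # beyond EOF (for example by speculative allocation), block 1 a hole.
    print(blocks_to_zero(513, 2048, {0, 2, 3}))
    # Same size change with nothing allocated past block 0: nothing to zero.
    print(blocks_to_zero(513, 2048, {0}))
```

In the first layout the blocks past the old EOF are not holes, so checking only "is this block beyond EOF?" would miss them.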

-Eric
