
Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc

To: David Chinner <dgc@xxxxxxx>
Subject: Re: [PATCH 1/5] fallocate() implementation in i86, x86_64 and powerpc
From: Andreas Dilger <adilger@xxxxxxxxxxxxx>
Date: Thu, 14 Jun 2007 03:14:58 -0600
Cc: "Amit K. Arora" <aarora@xxxxxxxxxxxxxxxxxx>, Suparna Bhattacharya <suparna@xxxxxxxxxx>, torvalds@xxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, cmm@xxxxxxxxxx
In-reply-to: <20070613235217.GS86004887@xxxxxxx>
Mail-followup-to: David Chinner <dgc@xxxxxxx>, "Amit K. Arora" <aarora@xxxxxxxxxxxxxxxxxx>, Suparna Bhattacharya <suparna@xxxxxxxxxx>, torvalds@xxxxxxxx, akpm@xxxxxxxxxxxxxxxxxxxx, linux-fsdevel@xxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, linux-ext4@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, cmm@xxxxxxxxxx
References: <20070426175056.GA25321@xxxxxxxxxxxxxxxxxxxx> <20070426180332.GA7209@xxxxxxxxxxxxxxxxxxxx> <20070509160102.GA30745@xxxxxxxxxxxxxxxxxxxx> <20070510005926.GT85884050@xxxxxxx> <20070510115620.GB21400@xxxxxxxxxxxxxxxxxxxx> <20070510223950.GD86004887@xxxxxxx> <20070511110301.GB28425@xxxxxxxxxx> <20070512080157.GF85884050@xxxxxxx> <20070612061652.GA6320@xxxxxxxxxxxxxxxxxxxx> <20070613235217.GS86004887@xxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.4.1i
On Jun 14, 2007  09:52 +1000, David Chinner wrote:
> B FA_PREALLOCATE
> provides the same functionality as
> B FA_ALLOCATE
> except it does not ever change the file size. This allows allocation
> of zero blocks beyond the end of file and is useful for optimising
> append workloads.
> TP
> B FA_DEALLOCATE
> removes the underlying disk space with the given range. The disk space
> shall be removed regardless of its contents so both allocated space
> from
> B FA_ALLOCATE
> and
> B FA_PREALLOCATE
> as well as from
> B write(3)
> will be removed.
> B FA_DEALLOCATE
> shall never remove disk blocks outside the range specified.

So this is essentially the same as "punch".  There doesn't seem to be
a mechanism to only unallocate unused FA_{PRE,}ALLOCATE space at the
end.

> B FA_DEALLOCATE
> shall never change the file size. If changing the file size
> is required when deallocating blocks from an offset to end
> of file (or beyond end of file),
> B ftruncate64(3)
> should be used.

This also seems to be a bit of a wart, since it isn't a natural converse
of either of the above functions.  How about having two modes,
similar to FA_ALLOCATE and FA_PREALLOCATE?  Say, FA_PUNCH (which
would be as you describe here: it deletes all data in the specified
range, changing the file size if it overlaps EOF), and FA_DEALLOCATE,
which only deallocates unused FA_{PRE,}ALLOCATE space?

We might also consider making @mode be a mask instead of an enumeration:

FA_FL_DEALLOC   0x01 (default allocate)
FA_FL_KEEP_SIZE 0x02 (default extend/shrink size)
FA_FL_DEL_DATA  0x04 (default keep written data on DEALLOC)

We might then build FA_ALLOCATE and FA_DEALLOCATE out of these flags
without making the interface sub-optimal.
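
As a rough sketch of what I mean, the modes could be composed from the
three flags something like this.  The flag values are the ones listed
above; the particular compositions (especially FA_PUNCH), the FA_FL_ALL
mask and the fa_mode_valid() helper are only assumptions for
illustration and are not part of the posted patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits as listed above. */
#define FA_FL_DEALLOC   0x01 /* default: allocate */
#define FA_FL_KEEP_SIZE 0x02 /* default: extend/shrink size */
#define FA_FL_DEL_DATA  0x04 /* default: keep written data on DEALLOC */

/* Hypothetical mask of all known flags, used only for validation. */
#define FA_FL_ALL (FA_FL_DEALLOC | FA_FL_KEEP_SIZE | FA_FL_DEL_DATA)

/* Assumed compositions of the named modes; one possible mapping. */
#define FA_ALLOCATE     0                                 /* allocate, may extend size */
#define FA_PREALLOCATE  FA_FL_KEEP_SIZE                   /* allocate, never change size */
#define FA_DEALLOCATE   FA_FL_DEALLOC                     /* drop unused preallocation only */
#define FA_PUNCH        (FA_FL_DEALLOC | FA_FL_DEL_DATA)  /* drop written data too */

/* The three flags must occupy distinct bits for the mask to work. */
static_assert((FA_FL_DEALLOC & FA_FL_KEEP_SIZE) == 0 &&
              (FA_FL_DEALLOC & FA_FL_DEL_DATA) == 0 &&
              (FA_FL_KEEP_SIZE & FA_FL_DEL_DATA) == 0,
              "FA_FL_* flags must not overlap");

/*
 * Hypothetical sanity check: reject unknown bits, and reject
 * FA_FL_DEL_DATA without FA_FL_DEALLOC, since deleting data only
 * makes sense when deallocating (an assumption, not settled API).
 */
static bool fa_mode_valid(unsigned int mode)
{
	if (mode & ~FA_FL_ALL)
		return false;
	if ((mode & FA_FL_DEL_DATA) && !(mode & FA_FL_DEALLOC))
		return false;
	return true;
}
```

With a mapping like this, the "punch without changing the file size"
behaviour described in the man page draft would be
FA_FL_DEALLOC | FA_FL_KEEP_SIZE | FA_FL_DEL_DATA, with no need for a
new enumeration value.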

I suppose it might be a bit late in the game to add a "goal"
parameter and e.g. FA_FL_REQUIRE_GOAL, FA_FL_NEAR_GOAL, etc to make
the API more suitable for XFS?  The goal could be a single __u64, or
a struct with e.g. __u64 byte offset (possibly also __u32 lun like
in FIEMAP).  I guess the one potential limitation here is the
number of function parameters on some architectures.
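
To make the struct variant concrete, here is a minimal sketch.
Everything in it is hypothetical: the struct name and layout, the
padding field, the flag values (simply continuing after the three
flags above) and the rule that the two goal flags are mutually
exclusive are my assumptions, not an existing interface:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, continuing after 0x01/0x02/0x04. */
#define FA_FL_REQUIRE_GOAL 0x08 /* fail unless allocated at the goal */
#define FA_FL_NEAR_GOAL    0x10 /* allocate as close to the goal as possible */

/* Hypothetical goal descriptor passed alongside offset/len. */
struct fa_goal {
	uint64_t offset;   /* byte offset on the device to aim for */
	uint32_t lun;      /* device/LUN index, as in FIEMAP */
	uint32_t reserved; /* explicit padding, must be zero */
};

/* Explicit padding keeps the layout the same on 32- and 64-bit ABIs. */
static_assert(sizeof(struct fa_goal) == 16, "unexpected fa_goal layout");

/* Assumed rule: at most one of the two goal flags may be set. */
static bool fa_goal_flags_valid(unsigned int mode)
{
	unsigned int goal = mode & (FA_FL_REQUIRE_GOAL | FA_FL_NEAR_GOAL);

	return goal != (FA_FL_REQUIRE_GOAL | FA_FL_NEAR_GOAL);
}
```

Passing a pointer to such a struct would cost one extra syscall
argument regardless of how many fields the goal grows later.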

> B ENOSPC
> There is not enough space left on the device containing the file
> referred to by
> IR fd.

This should probably say whether space is removed on failure or not.  In
some (primitive) implementations it might no longer be possible to
distinguish between unwritten extents and zero-filled blocks, though
at this point DEALLOC of zero-filled blocks might not be harmful either.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

