[ ... ]
> I've read a recommendation to start the partition on the 1MB
> mark. Does this make sense?
As a general principle it is a good one, and it has almost no
cost. Indeed, recent versions of some partitioning tools do that
by default. I often recommend aligning partitions to 1GiB
instead, also because I like to have 1GiB or so of empty space
at the very beginning and end of a drive.
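For what it's worth, the alignment and the empty space at both
ends can be expressed directly in parted's units. A sketch,
assuming /dev/sdX is a placeholder for the target disk (this is
destructive, so double-check the device name):

```shell
# Partition starting at the 1MiB mark, using the whole disk:
parted -s /dev/sdX mklabel gpt
parted -s /dev/sdX mkpart primary 1MiB 100%

# Or, aligned to 1GiB with ~1GiB left free at each end:
# parted -s /dev/sdX mkpart primary 1GiB -1GiB

# Verify the alignment of partition 1:
parted /dev/sdX align-check optimal 1
```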
> I'd like to read about the NFS blog entry but the link you
> included results in a 404. I forgot to mention in my last
Oops, I forgot a bit of the URL:
Note that currently I suggest different values from:
* 4% of memory as "dirty" pages is, on today's large-memory
  systems, often a gigantic absolute amount.
  I had provided an elegant patch to specify the same in
  absolute terms in
  but now the official way is the "_bytes" alternative
  (vm.dirty_bytes and vm.dirty_background_bytes).
* 2% as the level at which writing becomes uncached is too
  low, and the system becomes unresponsive when that level is
  crossed. Sure, it is risky, but, regretfully, I think that
  maintaining responsiveness is usually better than limiting
  outstanding background writes.
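To make the absolute-terms point concrete, a sketch of setting
the "_bytes" sysctls (the numbers here are illustrative
examples, not recommendations; tune them for your workload and
storage speed):

```shell
# Hard limit: writers block once this much dirty memory accumulates.
sysctl -w vm.dirty_bytes=$((400*1024*1024))

# Background writeback starts at this level.
sysctl -w vm.dirty_background_bytes=$((100*1024*1024))
```

Note that setting the "_bytes" variants zeroes the corresponding
"_ratio" ones, and vice versa; put the chosen values in
/etc/sysctl.conf to make them persistent.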
> Based on what I understood from your thoughts above, if an
> application issues a flush/fsync and it does not complete due
> to some catastrophic crash, xfs on its own cannot roll back
> to the previous version of the file in case of an unfinished
> write operation. Disabling the device caches wouldn't help either
If your goal is to make sure incomplete updates don't get
persisted, disabling device caches might help with that, in a
very perverse way (if the whole partial update is still in the
device cache, it just vanishes). Forget that of course :-).
The main message is that filesystems in UNIX-like systems should
not provide atomic transactions, just the means to do them at
the application level, because they are both difficult and very
expensive.
The secondary message is that some applications and the firmware
of some host adapters and drives don't do the right thing, and
if you really want to make sure about atomic transactions it is
an expensive and difficult system integration challenge.
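The classic application-level means is the write-new, flush,
rename-over pattern, since rename(2) is atomic on POSIX
filesystems. A minimal sketch, assuming coreutils 8.24 or later
(where sync(1) accepts a file argument); the filenames are made
up for the example:

```shell
# Work in a scratch directory with an existing "config" file.
dir=$(mktemp -d) && cd "$dir"
echo "old contents" > config

# 1. Write the new version to a temp file on the SAME filesystem.
tmp="config.new.$$"
printf '%s\n' "new contents" > "$tmp"

# 2. Force the new data to stable storage before exposing it.
sync "$tmp"

# 3. Atomically replace: readers see the old file or the new
#    one, never a partially written mix.
mv "$tmp" config
```

A robust implementation would also fsync the containing
directory so the rename itself survives a crash, which plain
shell cannot express; that is part of why this is an application
problem rather than a filesystem one.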
> [ ... ] only filesystems that do COW can do this at the
> expense of performance? (btrfs and zfs, please hurry and grow
Filesystems that do COW sort-of do *global* "rolling" updates,
that is, filetree-level snapshots, but that's a side effect of a
choice made for other reasons (consistency more than currency).
> [ ... ] If you were in my place with the resource constraints,
> you'd go with: xfs with barriers on top of mdraid10 with
> device cache ON and setting vm/dirty_bytes, [ ... ]
Yes, that seems a reasonable overall tradeoff, because XFS is
implemented to provide well defined (and documented) semantics,
to check whether the underlying storage layer actually does
barriers, and to perform decently even if "delayed" writing is
not that delayed.
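One way to see the barrier check in action: XFS enables write
barriers by default, and if the lower layer cannot support them
the kernel logs a warning at mount time. A sketch, with /dev/md0
and /srv as example names for your mdraid10 array and mount
point:

```shell
# Mount with barriers (the default for XFS; shown explicitly):
mount -t xfs -o barrier /dev/md0 /srv

# Look for a complaint like "Disabling barriers, ..." which
# means the storage layer did not honour them:
dmesg | grep -i barrier
```

If that warning appears, flushes are not reaching stable
storage, and the whole tradeoff above needs rethinking.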