>> As to 'ext4' and doing (euphemism) insipid tests involving
>> peculiar setups, there is an interesting story in this post:
> I really don't see the connection to this thread. You're
> advocating mostly that tar use fsync on every file, which to
> me seems absurd.
Rather different: I am pointing out that there is a fundamental
problem, that the spectrum of safety/speed tradeoffs covers 2
orders of magnitude as to speed, and that for equivalent points
XFS and 'ext4' don't perform that differently (factor of 2 in
this particular "test", which is sort of "noise").
Note: it is Schilling who advocates for 'tar' to 'fsync' every
file, and he gives some pretty good reasons why that should
be the default, and why that should not be that expensive,
(which I think is a bit optimistic). My advocacy in that thread
was that having different safety/speed tradeoffs is a good
thing, if they are honestly represented as tradeoffs.
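To make the tradeoff concrete, here is a hypothetical sketch (not
taken from any real 'tar' implementation) of what "fsync every
file" means: each extracted file is forced to stable storage
before the next one is written, trading one write barrier per
file for per-file durability.

```python
import os
import tarfile

def extract_with_fsync(archive_path, dest_dir):
    """Extract a tar archive, fsync()ing each regular file on the way.

    Illustrative only: shows the cost structure of the safety/speed
    tradeoff, one durability barrier per extracted file.
    """
    with tarfile.open(archive_path) as tar:
        for member in tar:
            tar.extract(member, path=dest_dir)
            if member.isreg():
                full = os.path.join(dest_dir, member.name)
                fd = os.open(full, os.O_RDONLY)
                try:
                    # The expensive part: wait for this file to be
                    # durable before moving on to the next one.
                    os.fsync(fd)
                finally:
                    os.close(fd)
```

Whether this is "not that expensive" depends on how well the
filesystem and storage layer can overlap the barriers, which is
exactly where the per-filesystem differences show up.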
So it is likely that if there is a significant difference you
are getting a different tradeoff, even if you may not *want* a
different one.
Note: JFS and XFS are more or less as good as it gets as to
"general purpose" filesystems, and when people complain
about the "speed" of them, odds are that they are using them
improperly, or hitting corner cases, or there is a problem in the
application or storage layer. To get something better than
JFS or XFS one must look at filesystems based on radically
different tradeoffs, like NILFS2 (log), OCFS2 (shareable) or
BTRFS (COW). In your case perhaps NILFS2 would give the best
results.
And that's what seems to be happening: 'ext4' seems to commit
metadata and data in spacewise order, XFS in timewise order,
because the seek order on writeout probably reflects the order
in which files were extracted from the 'tar' file.
> If the system goes down halfway through tar extraction, I
> would delete the tree and untar again. What do I care if some
> files are corrupt, when the entire tree is incomplete anyway?
Maybe you don't care; but filesystems are not psychic (they use
hardwired and adaptive policy, not predictive) and given that
most people seem to care, the default for XFS is to try harder to
keep metadata durable.
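As an aside on what "metadata durable" means at the application
level: a new file's directory entry is metadata held by the parent
directory, and the portable way to make the entry itself durable,
independent of any one filesystem's policy, is to fsync() the
parent too. A minimal sketch (illustrative names, not from 'tar'):

```python
import os

def create_durably(path, data):
    """Write a new file and make both its data and its name durable."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)      # file data and inode made durable
    finally:
        os.close(fd)
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)     # directory entry made durable
    finally:
        os.close(dfd)
```

A filesystem with more conservative defaults just does more of
this kind of work for you, whether or not you asked for it.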
Also various versions of 'tar' have options that allow
continuing rather than restarting an extraction because some
people prefer that.
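The "continue rather than restart" idea can be sketched in a few
lines: skip members whose files already exist at the destination,
similar in spirit to GNU tar's --skip-old-files. This is a hedged
illustration, not how any particular 'tar' implements it:

```python
import os
import tarfile

def resume_extract(archive_path, dest_dir):
    """Extract only the members not already present on disk."""
    with tarfile.open(archive_path) as tar:
        for member in tar:
            target = os.path.join(dest_dir, member.name)
            if member.isreg() and os.path.exists(target):
                continue  # already extracted before the interruption
            tar.extract(member, path=dest_dir)
```

Note the limitation: a file that was half-written at the moment of
the crash would also be skipped, which is one reason some people
prefer fsync-per-file extraction in the first place.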
> [ ... ] It's just that untarring large source trees is a very
> typical workload for me.
Well, it makes a lot of difference whether you are creating an
extreme corner case just to see what happens, or whether you
have a real problem, even a corner case problem, about which you
have to make some compromise.
The problem you have described seems rather strange:
* You write a lot of little files to memory, as you have way
more memory than data.
* The whole is written out in one go to a RAID6 set, on a
  storage layer that can do 500-700MB/s but does 1/5th of that.
* You don't do anything else with the files.
> And I just don't want to accept that XFS cannot do better than
> being several orders of magnitude slower than ext4 (speaking
> of binary orders of magnitude).
> As I see it, both file systems give the same guarantees:
> 1) That upon completion of sync, all data is readily available
> on permanent storage.
> 2) That the file system metadata doesn't suffer corruption,
> should the system lose power during the operation.
Yes, but they also give you some *implicit* guarantees that are
different. For example that:
* XFS spreads out files for you so you can better take
advantage of parallelism in your storage layer, and further
allocations are more resistant to fragmentation.
* 'ext4' probably commits in a different and less safe order
from XFS. If the storage layer rearranged IO order this
might matter a lot less.
You may not care about either, but then you are doing something
rather unusual.
For example, if you were to use your freshly written sources to
do a build, then conceivably spreading the files over 4 AGs
means that the builds can be much quicker on a system with
available hardware parallelism.
Also, *you* don't care about the order in which losses would
happen, or how much would be lost, if the system crashes; but
most users tend
to want to avoid repeating work, because either they are not
copying data, or the copy is huge and they don't want to restart
it from the beginning.