
Re: XFS corruption during power-blackout

To: Chris Wedgwood <cw@xxxxxxxx>
Subject: Re: XFS corruption during power-blackout
From: Stewart Smith <stewart@xxxxxxxxxxxxxxxx>
Date: Fri, 01 Jul 2005 11:09:01 +1000
Cc: Bryan Henderson <hbryan@xxxxxxxxxx>, Al Boldi <a1426z@xxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, linux-xfs@xxxxxxxxxxx, Steve Lord <lord@xxxxxxx>, "'Nathan Scott'" <nathans@xxxxxxx>, reiserfs-list@xxxxxxxxxxx
In-reply-to: <054069.b93858d6b97c07747dc32be2dd8981b254d981528781006053dce7be58de88865a43b162.IBX@xxxxxxxxxxxxxxxxxxxxx>
References: <254889.27725ab660aa106eb6acc07307d71ef1fbd5b6fd366aebef9e2f611750fbcb467e46e8a4.IBX@xxxxxxxxxxxxxxxxxxxxx> <OFC7BE2BBA.42F70A93-ON88257030.00595A48-88257030.005A99A3@xxxxxxxxxx> <054069.b93858d6b97c07747dc32be2dd8981b254d981528781006053dce7be58de88865a43b162.IBX@xxxxxxxxxxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
On Thu, 2005-06-30 at 11:46 -0700, Chris Wedgwood wrote:
> Yes, but POSIX is broken in places.  The Linux implementation (now and
> for some time, but not always) won't return until all dirty data is
> flushed.

POSIX, in regard to fsync(), provides "flexibility for the
implementation" - maybe your environment is special and you don't buffer
anything, so fsync() is a no-op. Or perhaps you cannot control some of
the disk caches, so fsync() is a no-op.

In newer systems, you can check for the flag _POSIX_SYNCHRONIZED_IO
that, if set, guarantees that fsync() synchronously flushes buffers to
disk. However, this only came into the spec in '99 or 2000, I think, so
there are still a lot of systems in which you have to know the
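
As a rough sketch of what such a check could look like in C: the
function name synchronized_io_supported() is made up for illustration,
and the code assumes a POSIX <unistd.h>. It only reports what the
interface promises; it cannot tell you whether a drive's own write
cache honours the flush.

```c
#include <unistd.h>

/*
 * Hypothetical helper (not from any library): returns 1 if the system
 * advertises the POSIX Synchronized Input and Output option, 0 otherwise.
 */
int synchronized_io_supported(void)
{
#if defined(_POSIX_SYNCHRONIZED_IO) && (_POSIX_SYNCHRONIZED_IO + 0) > 0
    /* A value greater than zero means the option is always supported. */
    return 1;
#elif defined(_SC_SYNCHRONIZED_IO)
    /* Otherwise ask at run time; sysconf() returns -1 if unsupported. */
    return sysconf(_SC_SYNCHRONIZED_IO) > 0 ? 1 : 0;
#else
    /* No way to ask: assume the guarantee is not available. */
    return 0;
#endif
}
```

A caller could use the result to decide whether to print a warning at
startup rather than silently trusting fsync().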

> > and some 'sync' programs do multiple sync()s.
> Such programs are arguably broken (grub maybe?).  If one doesn't work,
> then why should doing it <n>-times?

It's a legacy from the days when sync was an async operation. The idea
was that the time it took to type sync and press enter three times
(note: typing it each time, not up-arrow and enter) would be long enough
for the buffers that started to get flushed on the first sync to have
hit disk.

> > And it's also filesystem-type-dependent.
> If a filesystem doesn't flush reliably with sync, I would call that a
> bug.
> > fsync(), on the other hand, is a true synchronizing operation.
> Again that requires the fs to behave correctly so if it fails it
> should be reported as a bug.

It's all fun and games - reliably getting data to disk is not fun. If
Linux can reliably follow the idea that fsync() is synchronous and
really does flush everything to disk, then it will be a lot better off
than a lot of other platforms.
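
For reference, a minimal sketch of the pattern an application relies on
here, assuming fsync() really is synchronous. The name write_durably()
is hypothetical. Note that this only syncs the file itself; for a newly
created file, the containing directory may need its own fsync() as well.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path and do not report
 * success until fsync() has returned without error.
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry the write */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;                     /* handle short writes */
        len -= (size_t)n;
    }

    /* The data is only claimed to be on disk once fsync() succeeds. */
    if (fsync(fd) != 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return close(fd) != 0 ? -1 : 0;
}
```

The important part is checking the return value of fsync(): if it
fails, the application must not assume the data reached the disk.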

Also, it'd be useful to have a list of where bugs affecting this have
been found and in what kernels. It is not out of the question to
explicitly code in exceptions (read: big warnings to users) for these

I guess a list of known-bad drives and controllers could be useful too.
Doubly useful if the kernel could report this, but a userspace list
would also be good. 

Stewart Smith (stewart@xxxxxxxxxxxxxxxx)

