Re: synchronization of XFS

To: Steve Lord <lord@xxxxxxx>
Subject: Re: synchronization of XFS
From: Chris Wedgwood <cw@xxxxxxxx>
Date: Thu, 25 Mar 2004 06:07:23 -0800
Cc: Stefan Smietanowski <stesmi@xxxxxxxxxx>, "IKARASHI, Seiichi" <ikarashi@xxxxxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx
In-reply-to: <4062E19A.90207@xxxxxxx>
References: <4060F7FC.8090602@xxxxxxxxxxxxxxxx> <20040325063902.GA9697@xxxxxxxxxxxxxxxxxxxxxxx> <4062C97A.6030702@xxxxxxxxxxxxxxxx> <20040325124152.GA12078@xxxxxxxxxxxxxxxxxxxxxxx> <4062D7E5.6070501@xxxxxxxxxx> <20040325132200.GA12333@xxxxxxxxxxxxxxxxxxxxxxx> <4062E19A.90207@xxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
On Thu, Mar 25, 2004 at 07:41:46AM -0600, Steve Lord wrote:

> sync means the filesystem will look the same after a crash, it does
> not mean all the data has to hit the disk. In the case of XFS it
> does not really mean this. Sync will make sure that the journal is
> upto date on disk, so the metadata can be recovered after a
> crash.

fsync()/sync() should mean all dirty DATA blocks and METADATA blocks
are flushed to the backing store.  Anything less than this is a bug.

SUS requires only that sync() schedule the writes to the file system;
they need not complete before the system call returns.  However, Linux
has for some time actually had the system call block during write-out
(very early versions of Linux didn't do this).


> Making sync push all the metadata itself would slow it down.

sync() should also flush the metadata.  If people wish to only flush
data blocks there is fdatasync() for smart applications.
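
To make the distinction concrete, here is a minimal sketch (Python for
brevity; os.fsync() and os.fdatasync() are thin wrappers over the system
calls).  The helper name write_and_flush and its data_only flag are made
up for illustration, and nothing here says anything about what XFS does
underneath.

```python
import os
import tempfile


def write_and_flush(path, payload, data_only=False):
    """Write payload to path and block until the kernel reports the flush done.

    data_only=True asks for fdatasync(), which may skip metadata (such as
    timestamps) that is not needed to read the data back.  Otherwise
    fsync() is used, which covers the file's data and its metadata.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # os.write() may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        if data_only and hasattr(os, "fdatasync"):
            os.fdatasync(fd)
        else:
            # Fallback for platforms without fdatasync(), and the default path.
            os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        write_and_flush(target, b"data and metadata")
        write_and_flush(target, b"data only", data_only=True)
```

Whether the bytes really reach the platter once the call returns still
depends on the file system and the drive's write cache.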

Applications such as MTAs and some databases *depend* on these
semantics.
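
As an illustration of the kind of dependence meant here, the following
is a rough sketch of a spool-style write that only reports success after
fsync() has returned.  It is not taken from any particular MTA or
database; durable_store and the ".tmp" naming are hypothetical, and
fsync() on a directory descriptor is assumed to behave as it does on
Linux.

```python
import os
import tempfile


def durable_store(directory, name, payload):
    """Store payload as directory/name, returning only after fsync() succeeds."""
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name)

    # "x" mode fails if the temporary file already exists.
    with open(tmp_path, "xb") as f:
        f.write(payload)
        f.flush()              # push Python's buffer down to the kernel
        os.fsync(f.fileno())   # ask the kernel to flush data and metadata

    # Make the file visible under its final name in one step.
    os.rename(tmp_path, final_path)

    # Flush the directory too, so the new name itself is on stable storage.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spool:
        print(durable_store(spool, "msg-0001", b"Subject: test\n\nbody\n"))
```

A caller would acknowledge the message to the sender only after
durable_store() returns, which is safe only if fsync() really means what
it is supposed to mean.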

Now, since you wrote the new sync code I'm not going to argue as to
what XFS does, but as far as I can tell sync() does indeed flush all
data blocks to disk.

Because metadata isn't backed by the buffer cache I guess that
doesn't get flushed?  But ideally it should :-)

> Now if grub is opening the block device and reading out of that, it
> is looking at the same pages for metadata that xfs is looking at in
> memory. There is a bug where you can get corruption if you access
> the block device in parallel with the filesystem. Possibly this is
> behind the problem.

This will cause an oops on 2.6.x, won't it?  So I suspect that if this
were behind the problem, the report would have been different.

