To: Alex Wun <alexwun@xxxxxxxxxxxxx>, lord@xxxxxxx
Subject: Re: XFS Writes: IOLOCK_EXCL
From: Nathan Scott <nathans@xxxxxxx>
Date: Wed, 25 Feb 2004 17:29:07 +1100
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <01fa01c3fafa$7c3482d0$9002a8c0@xxxxxxxxxxxxx>
References: <Pine.LNX.4.44.0402240914430.12803-100000@xxxxxxxxxxxxxxxxxxxxxx> <01fa01c3fafa$7c3482d0$9002a8c0@xxxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mutt/1.5.3i
On Tue, Feb 24, 2004 at 12:20:22PM -0500, Alex Wun wrote:
> Hi.

hi there.

> What I've been trying to do is squeeze some more write performance out of my
> system.

Good stuff.

> I've ended up fiddling with some XFS code and have found that write
> performance can improve significantly in certain applications (i.e.
> exporting the filesystem over a network).  I've left the system running (and

Can you describe your workload there a bit more?  Do you have
multiple writers to the same file (from different NFS clients,
say)?  Readers too?  Thanks.

> writing) over the weekend and it seems to be stable.  Although things look
> good for now, I'd like to be certain that I haven't broken anything.
> Essentially, this is what I've done:
> In the function xfs_fsync():
> Instead of:
>             VOP_FLUSH_PAGES(...);
> I essentially have:
>                   VOP_COMMIT_PAGES(...);
>                   xfs_iunlock(ip, XFS_IOLOCK_EXCL);
>                   VOP_CLEANUP_PAGES(...);
>                   xfs_ilock(ip, XFS_IOLOCK_EXCL);
> Where VOP_COMMIT_PAGES() + VOP_CLEANUP_PAGES() does the same thing as
> VOP_FLUSH_PAGES() but in two parts, first the writing of the pages then the
> cleanup.  So I've basically released the XFS_IOLOCK_EXCL during page
> cleanup.

By "cleanup" I guess you mean the filemap_fdatawait() call in
fs_flush_pages?  So, you're releasing the IOLOCK in between the
initiation of IO and the wait-for-completion of that IO.  Hmm,
I'll need to think about that - Steve, anything immediately
obvious to you as to whether that's going to go bad?

filemap_fdatawait is only processing already-locked pages, so
other writers would be synchronised by the page lock.  However,
another writer can sneak into the window created there and do
nasty things like updating the inode size or doing direct IO
when you weren't looking, etc.  At least I think so; I'm not
convinced yet though.
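
As a rough illustration of that window, here is a user-space toy
in C.  It is not XFS code: struct toy_inode, other_writer_extend()
and toy_fsync() are made-up names, and a plain boolean stands in
for the IOLOCK.  It only shows how a second writer could change
the inode size between the start of IO and the end of the wait
once the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an inode; hypothetical, not a real XFS structure. */
struct toy_inode {
    bool   iolock_held;   /* models XFS_IOLOCK_EXCL being held */
    size_t size;          /* models the in-core file size */
};

/* A second writer extends the file only if the iolock is free. */
static bool other_writer_extend(struct toy_inode *ip, size_t new_size)
{
    if (ip->iolock_held)
        return false;     /* a real writer would block here instead */
    ip->size = new_size;
    return true;
}

/*
 * Models the fsync path.  Returns true if the size seen after the
 * wait still matches the size seen when IO was started.
 */
static bool toy_fsync(struct toy_inode *ip, bool drop_lock_for_wait)
{
    size_t size_at_start;
    bool unchanged;

    assert(!ip->iolock_held);         /* caller must not hold the lock */
    ip->iolock_held = true;
    size_at_start = ip->size;         /* "initiate IO" */

    if (drop_lock_for_wait)
        ip->iolock_held = false;      /* the window opens */

    /* Another writer tries to get in while we "wait for IO". */
    other_writer_extend(ip, size_at_start + 4096);

    if (drop_lock_for_wait)
        ip->iolock_held = true;       /* the window closes */

    unchanged = (ip->size == size_at_start);
    ip->iolock_held = false;
    return unchanged;
}
```

With drop_lock_for_wait false the other writer is kept out and the
size stays put; with it true the writer gets in and the size grows
by 4096 while the "wait" is in progress.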

What pointed out this spot in the code as a problem initially?
Is this the call down from NFS where it wants to write a range
within a file, but since there's no such interface it has to sync
all the dirty data?  Someone else was asking me if that could
be improved a little while ago, but there are no VFS interfaces
to provide that functionality (yet?) - XFS would be able to use
that if it existed.
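
To make the range-versus-everything point concrete, a small toy
sketch in C.  toy_flush_range() is purely hypothetical (as noted,
no such VFS interface exists), and struct toy_mapping is an
invented stand-in for a file's dirty pages, not a kernel type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 16

/* Toy page cache: dirty[i] is true if page i still needs writing. */
struct toy_mapping {
    bool dirty[TOY_NPAGES];
};

/*
 * Hypothetical range flush: cleans dirty pages in [first, last]
 * (inclusive) and returns how many pages were written.
 */
static size_t toy_flush_range(struct toy_mapping *m, size_t first, size_t last)
{
    size_t i, written = 0;

    assert(first <= last && last < TOY_NPAGES);
    for (i = first; i <= last; i++) {
        if (m->dirty[i]) {
            m->dirty[i] = false;
            written++;
        }
    }
    return written;
}

/* Whole-file sync: every dirty page goes out, whoever dirtied it. */
static size_t toy_flush_all(struct toy_mapping *m)
{
    return toy_flush_range(m, 0, TOY_NPAGES - 1);
}
```

If a client only cares about pages 1-2, the range call writes just
those, whereas toy_flush_all() also pushes out unrelated dirty pages
elsewhere in the file.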

> This may not work for everyone but it seems to give me better writing (maybe
> strict operating conditions will apply?). But I'd just like to ask, from a
> conceptual stand-point, whether this is a legitimate thing to do.

I'm a little concerned that this may be opening a window for
data corruption, but I need to think about it a whole lot.

> Thanks for your time,

Thanks for looking into this.


