
Re: [reiserfs-list] Re: benchmarks

To: Nikita Danilov <NikitaDanilov@xxxxxxxxx>
Subject: Re: [reiserfs-list] Re: benchmarks
From: Xuan Baldauf <xuan--reiserfs@xxxxxxxxxxx>
Date: Mon, 16 Jul 2001 21:34:18 +0200
Cc: Hans Reiser <reiser@xxxxxxxxxxx>, Russell Coker <russell@xxxxxxxxxxxx>, Chris Wedgwood <cw@xxxxxxxx>, rsharpe@xxxxxxxxxx, Xuan Baldauf <xuan--reiserfs@xxxxxxxxxxx>, Seth Mos <knuffie@xxxxxxxxx>, Federico Sevilla III <jijo@xxxxxxxxxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx, reiserfs-list@xxxxxxxxxxx
References: <Pine.BSI.4.10.10107141752080.18419-100000@xs3.xs4all.nl> <3B5169E5.827BFED@namesys.com> <20010716210029.I11938@weta.f00f.org> <20010716101313.2DC3E965@lyta.coker.com.au> <3B52C49F.9FE1F503@namesys.com> <15186.51514.66966.458597@beta.namesys.com>
Sender: owner-linux-xfs@xxxxxxxxxxx

Nikita Danilov wrote:

> Hans Reiser writes:
>  > Russell Coker wrote:
>  > >
>  > > On Mon, 16 Jul 2001 11:00, Chris Wedgwood wrote:
>  > > > On Sun, Jul 15, 2001 at 02:01:09PM +0400, Hans Reiser wrote:
>  > > >
>  > > >     Making the server stateless is wrong
>  > > >
>  > > > why?
>  > >
>  > > Because it leads to all the problems we have seen!  Why not have the
>  > > client have an open file handle (the way Samba works and the way the
>  > > Unix file system API works)?  Then when the server goes down the client
>  > > sends a request to open the file again...
>
> If you have 10000 clients each opening 100 files you got 1e6 opened
> files on the server---it wouldn't work. NFS was designed to be stateless
> to be scalable.

Every existing file has at least one name (or is a member of the hidden
to-be-deleted directory, and so has a name, too), and an object_id.
Suppose the object_id is 32 bytes long. A virtual file descriptor may be
4 bytes long, plus some housekeeping metadata of 28 bytes. That is 64
bytes per open file, so your scenario of 1e6 open files occupies 64MB.
Where's the problem? 80% of those 64MB can be swapped out.
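
The arithmetic behind that 64MB figure, as a quick Python sketch. The
per-entry sizes are only the guesses made above, and the constant and
function names are made up for illustration:

```python
# Back-of-the-envelope estimate of server-side state for open files.
# All sizes are the assumptions stated in this mail, not measured values.
OBJECT_ID_BYTES = 32       # assumed size of an object_id
VIRTUAL_FD_BYTES = 4       # assumed size of a virtual file descriptor
HOUSEKEEPING_BYTES = 28    # assumed housekeeping metadata per open file

CLIENTS = 10_000
OPEN_FILES_PER_CLIENT = 100


def state_bytes(clients: int, files_per_client: int) -> int:
    """Total bytes of per-open-file state the server would hold."""
    per_entry = OBJECT_ID_BYTES + VIRTUAL_FD_BYTES + HOUSEKEEPING_BYTES
    return clients * files_per_client * per_entry


total = state_bytes(CLIENTS, OPEN_FILES_PER_CLIENT)
print(f"{total} bytes = {total / 1e6:.0f} MB")  # 64000000 bytes = 64 MB
```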

>
>
>  > >
>  > > >     making the readdir a multioperation act is wrong
>  > > >
>  > > > why? i have 3M directories... are you saying clients should read the
>  > > > whole thing at once?
>  > >
>  > > No.  findfirst()/findnext() is the correct way of doing this.  Forcing
>  > > the client to read through 3M directory entries to see if "foo.*"
>  > > matches anything is just wrong.  The client should be able to ask for a
>  > > list of file names matching certain criteria (time stamp, name,
>  > > ownership, etc).  The findfirst() and findnext() APIs on DOS, OS/2, and
>  > > Windows do this quite well.
>  >
>  > there is a fundamental conflict between having cookies, shrinkable
>  > directories, and the ability to find foo.* without reading the whole
>  > directory, all at the same time.
>  >
>  > NFS V4 is designed by braindead twerps incapable of layering software
>  > when designing it.
>
> Just cannot stand it. You mean that NFS v4 features database in a kernel
> too? (It's our insider joke.)

NFS should not be kernel-bound. The NFS server application mimics the
applications on the NFS clients. If this is not possible, something is
wrong.

>
>
>  >
>  > >
>  > > If you have 3M directory entries then SMB should kick butt over NFS.
>  > >
>  > > Also while we're at it, one of the worst things about NFS is the issue of
>  > > delete.  Because it's stateless NFS servers implement unlink as "mv" and
>  > > things get nasty from there...
>  > >
>  > > --
>  > > http://www.coker.com.au/bonnie++/     Bonnie++ hard drive benchmark
>  > > http://www.coker.com.au/postal/       Postal SMTP/POP benchmark
>  > > http://www.coker.com.au/projects.html Projects I am working on
>  > > http://www.coker.com.au/~russell/     My home page
>
> Nikita.

Being "stateless" is only a weak way to implement disconnected operation.
If there were state (if the server could know that a client has a file
descriptor to a file), the client could be informed that its virtual file
descriptors to the files to be deleted are invalid now. This only fails if
the network is down, so this is a disconnected operation problem.
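
A minimal sketch of the kind of server-side state meant here, in Python.
The class, its methods, and the client/object names are all hypothetical;
it only shows that once the server records who holds a virtual file
descriptor, an unlink can produce the list of holders to notify:

```python
from collections import defaultdict


class OpenFileTable:
    """Hypothetical server-side record of which client holds which virtual fd."""

    def __init__(self):
        # object_id -> set of (client_id, virtual_fd) pairs
        self._holders = defaultdict(set)

    def open(self, client_id, virtual_fd, object_id):
        """Remember that this client holds a virtual fd for the object."""
        self._holders[object_id].add((client_id, virtual_fd))

    def close(self, client_id, virtual_fd, object_id):
        """Forget one holder; drop the entry once nobody holds the object."""
        self._holders[object_id].discard((client_id, virtual_fd))
        if not self._holders[object_id]:
            del self._holders[object_id]

    def unlink(self, object_id):
        """Drop all state for the object and return the holders to notify."""
        return sorted(self._holders.pop(object_id, set()))


table = OpenFileTable()
table.open("client-a", 3, "obj-1")
table.open("client-b", 7, "obj-1")
stale = table.unlink("obj-1")
print(stale)  # [('client-a', 3), ('client-b', 7)]
```

Delivering the notification is the part that fails when the network is
down, which is where the disconnected operation problem comes in.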

By the way: if NFS were scalable, why doesn't it allow every handle from
the server (like inode numbers, directory position cookies) to be of
variable, dynamic, server-determined size? That would be scalable.
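
To illustrate what a server-determined handle size could look like on the
wire, here is a small Python sketch using a length prefix. The 16-bit
big-endian prefix and the function names are assumptions for this
example, not anything taken from the NFS protocol:

```python
import struct


def encode_handle(handle: bytes) -> bytes:
    """Prefix an opaque handle with its length (16-bit, big-endian)."""
    if len(handle) > 0xFFFF:
        raise ValueError("handle too long for a 16-bit length prefix")
    return struct.pack(">H", len(handle)) + handle


def decode_handle(buf: bytes, offset: int = 0):
    """Return (handle, next_offset) for the handle starting at offset."""
    (length,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("truncated handle")
    return buf[start:end], end


# Two handles of different sizes packed back to back.
wire = encode_handle(b"\x01\x02\x03\x04") + encode_handle(b"a-much-longer-cookie")
first, pos = decode_handle(wire)
second, pos = decode_handle(wire, pos)
```

The client treats the handle as opaque bytes and never needs to know how
large the server chose to make it.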

P.S.: Hans, how do you prevent object_id|inode reuse? By using a mount_id
plus a generation counter per mount?
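
To make the question concrete, this is roughly the scheme I have in mind,
sketched in Python. It is purely an illustration with made-up names, not
a description of what reiserfs actually does:

```python
import itertools


class HandleAllocator:
    """Hypothetical handles of the form (mount_id, generation, object_id)."""

    def __init__(self, mount_id):
        self.mount_id = mount_id
        self._generation = itertools.count(1)  # per-mount counter
        self._live = {}  # object_id -> generation that is currently valid

    def allocate(self, object_id):
        """Hand out a fresh handle, even if the object_id was used before."""
        gen = next(self._generation)
        self._live[object_id] = gen
        return (self.mount_id, gen, object_id)

    def release(self, object_id):
        """The object was deleted; its object_id may be reused later."""
        self._live.pop(object_id, None)

    def is_valid(self, handle):
        """A handle is valid only for this mount and the current generation."""
        mount_id, gen, object_id = handle
        return mount_id == self.mount_id and self._live.get(object_id) == gen


alloc = HandleAllocator(mount_id=1)
old = alloc.allocate(42)
alloc.release(42)
new = alloc.allocate(42)   # same object_id, reused
print(alloc.is_valid(old), alloc.is_valid(new))  # False True
```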



