A file change notification API is an inherently reasonable thing to have, for
use by more than just NFS (emacs, for instance).

Xuan Baldauf wrote:
> Ragnar Kjørstad wrote:
> > > > For each open file you have:
> > > >
> > > > struct file (96b)
> > > > struct inode (460b)
> > > > struct dentry (112b)
> > > >
> > > > at least. With 1M open files this totals 668M of kernel memory, which is
> > > > unpageable.
> > >
> > > As I said below, the NFS server should be a user-space daemon.
> > But the data structures mentioned above would still be kernel memory, so
> > that will not solve the problem.
> Actually, you are right. :-(
> But once there is a filesystem change notification API, a userspace NFS
> server application could create a "catch-all" notification rule which says
> that it should be informed about every change within the whole VFS,
> including deletion of objects (like a file or a filename). All the tracking
> and NFS client notification then becomes a swappable user space problem.
> If Nikita does not like 1M open files by file descriptor on a server due to
> memory scalability reasons, maybe he does like 1M accessible files by
> object_id|inode. Then the server issues virtual NFS filehandles to clients
> and stores internally which inode number maps to which virtual NFS
> filehandles. For every file deleted, the NFS server checks which clients to
> notify using the network. If this is not possible, then it is a
> Or, to put it the other way around: an inode number is a "ticket for
> accessing a specified file on the server, issued by the server". This inode
> number is state. (The same applies to directory cookies.) Just because
> the server currently cannot invalidate the ticket, it does not lose
> its property of being state; it is simply state which is never released. The
> more files are opened by NFS on the server, the more state is given to
> the client. Because the state is not released (the server has to accept
> requests for files whose inode numbers were sent some reboots before),
> NFS forces the NFS server to have a memory leak by design, or to be broken
> in another way by design (like risking referring to a different file by the
> same object_id; requiring the server to have unique object_ids over
> indefinite space and time while object_ids are limited; requiring the server
> to use only a limited number of directory cookies over indefinite space and
> time, thus requiring reiserfs to be unnecessarily inefficient for
> directories (yet it is more efficient than many other filesystems)).
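
The virtual filehandle scheme quoted above could be sketched roughly as
follows. This is only an illustration in Python, not code from any existing
NFS server: the class HandleTable, its methods, and the on_delete() hook
(standing in for whatever a change notification API would deliver) are all
hypothetical names. Handles come from a counter that is never reused, so a
recycled inode number cannot be reached through an old handle.

```python
import itertools


class HandleTable:
    """Hypothetical user-space table of virtual NFS filehandles."""

    def __init__(self):
        self._next = itertools.count(1)  # handle values are never reused
        self._inode_of = {}              # handle -> inode number
        self._handles_of = {}            # inode number -> set of handles
        self._client_of = {}             # handle -> client that holds it

    def issue(self, client, inode):
        """Give a client a fresh handle for an inode."""
        handle = next(self._next)
        self._inode_of[handle] = inode
        self._handles_of.setdefault(inode, set()).add(handle)
        self._client_of[handle] = client
        return handle

    def lookup(self, handle):
        """Return the inode for a handle, or None if the handle is stale."""
        return self._inode_of.get(handle)

    def on_delete(self, inode):
        """Assumed to be called when a deletion notification arrives.

        Drops every handle for the inode and returns the set of clients
        that would have to be told over the network.
        """
        clients = set()
        for handle in self._handles_of.pop(inode, set()):
            del self._inode_of[handle]
            clients.add(self._client_of.pop(handle))
        return clients
```

The point of the sketch is that the table lives in ordinary, swappable
process memory, and that state is released when the file goes away instead
of being kept forever.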