> I've noticed that the page cleaner daemon, which is supposed to map
> delayed allocations to disk when necessary (out of memory), walks
> sequentially through whole memory searching for "delayed allocated" pages.
> I think there is a problem here -- the page cleaner should take page aging
> information into account.
> While the VM may be trying to sync relatively old pages which are in the
> inactive_dirty list, the page cleaner is mapping relatively new "delayed
> allocated" pages to disk.
> The right thing, IMHO, is to make the page cleaner look at the inactive
> dirty list when we are under a low memory condition.
> This way it will map "older" dirty pages, making it possible for
> page_launder() to sync them to disk and potentially free them later.
Yes, you are correct that the page cleaner is insensitive to the age
of the data it is working on; with respect to age, its choice of pages
is in effect random.
There are a couple of points to be made on the daemon.
First, we would really like not to have the daemon in its current state
at all, and instead use an address space flush operation to trigger the
activity and drive this directly out of the vm system. From discussions in
Miami at the storage workshop, the correct way to do this may still involve
a special daemon for xfs, but it would be handed work by the vm. This would
also be used for subsequent writes of data (non delayed allocation), which
currently use buffer heads; I/O clustering could then be applied to those
writes as well. Getting to this stage would involve abstracting knowledge
of buffer heads out of the vm and hiding them behind another flush method
(I think). The flush call would have to be free to flush more data, or
different data, than it was asked to flush.
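
To make the shape of such a flush method concrete, here is a rough
user-space sketch, not kernel code. Every name in it (struct aspace,
struct aspace_ops, cluster_flush, vm_flush_page) is hypothetical and made
up for illustration. The vm side asks for one page to be flushed, and the
filesystem side is free to write out the whole contiguous dirty run around
that page:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16

struct aspace;

/* Hypothetical per-address-space method table. */
struct aspace_ops {
    /*
     * Asked to flush page `index`. Returns the number of pages
     * actually written, which may be more than one.
     */
    size_t (*flush)(struct aspace *as, size_t index);
};

/* Hypothetical stand-in for the pages of one file. */
struct aspace {
    bool dirty[NPAGES];
    const struct aspace_ops *ops;
};

/*
 * Filesystem side: cluster the contiguous run of dirty pages around
 * `index` and write the whole run in one go.
 */
static size_t cluster_flush(struct aspace *as, size_t index)
{
    assert(index < NPAGES);

    if (!as->dirty[index])
        return 0;

    size_t start = index;
    size_t end = index + 1;

    while (start > 0 && as->dirty[start - 1])
        start--;
    while (end < NPAGES && as->dirty[end])
        end++;

    /* Clearing the flag stands in for block allocation plus the write. */
    for (size_t i = start; i < end; i++)
        as->dirty[i] = false;

    return end - start;
}

static const struct aspace_ops cluster_ops = { .flush = cluster_flush };

/* VM side: sees only the method, with no knowledge of buffer heads. */
static size_t vm_flush_page(struct aspace *as, size_t index)
{
    return as->ops->flush(as, index);
}
```

With pages 3, 4 and 5 dirty, a request to flush page 4 would write all
three, and the caller only learns how many pages went out. That is the
freedom the flush call would need in order to do I/O clustering.
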
Second, your initial comment misses one of the points of the page cleaner:
it is the only thread of activity which is going to move delalloc pages
out to disk. If it were based purely on aging out pages due to memory
pressure, you could end up with data written to files not getting flushed
to disk for a very long time. Delayed allocate data needs to be treated
more like other writes to disk.
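
One way to read the two requirements together is a selection test along
the following lines. This is only a sketch under my own assumptions:
struct dpage, MAX_DIRTY_AGE and cleaner_should_flush are hypothetical
names, and the tick-based age limit is invented for the example. A
delalloc page is pushed out once it has been dirty for too long, whatever
the memory situation, and under low memory the cleaner additionally takes
pages which the vm has already aged onto the inactive dirty list:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-page state; this is not the kernel's struct page. */
struct dpage {
    bool delalloc;          /* dirty, with no disk blocks allocated yet */
    bool inactive;          /* sits on the inactive dirty list */
    unsigned long dirtied;  /* tick at which the page was dirtied */
};

#define MAX_DIRTY_AGE 30UL  /* assumed limit, in ticks */

static bool cleaner_should_flush(const struct dpage *p,
                                 unsigned long now, bool low_memory)
{
    assert(now >= p->dirtied);

    if (!p->delalloc)
        return false;

    /* Time based: bound how long written data can stay off disk. */
    if (now - p->dirtied >= MAX_DIRTY_AGE)
        return true;

    /* Pressure based: under low memory, prefer pages the vm has aged. */
    return low_memory && p->inactive;
}
```

The first test is what keeps delayed allocate data behaving like other
writes; the second is where page aging information would come in.
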
Pagebuf has a long way to go before it is a completed project, especially
if kiobufs get the revamp which appears to be on its way!