On 03/29/12 07:07, Dave Chinner wrote:
From: Dave Chinner<dchinner@xxxxxxxxxx>
When memory allocation fails to add the page array or the pages to
a buffer during xfs_buf_get(), the buffer is left in the cache in a
partially initialised state. There is enough state left for the next
lookup on that buffer to find the buffer, and for the buffer to then
be used without finishing the initialisation. As a result, when an
attempt to do IO on the buffer occurs, it fails with EIO because
there are no pages attached to the buffer.
We cannot remove the buffer from the cache immediately and free it,
because there may already be a racing lookup that is blocked on the
buffer lock. Hence the moment we unlock the buffer to then free it,
the other user is woken and we have a use-after-free situation.
Hence we have to mark the buffer as "broken" and check that after we
have gained the buffer lock on a cache hit lookup. This enables
racing lookups to avoid the broken buffer and drop their references,
allowing the buffer to be freed.
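
For illustration only, the check on a cache hit described above might be modelled in user space roughly as follows. This is a single-threaded sketch, not the code from the patch: struct demo_buf, its fields, and demo_lookup_hit() are hypothetical names, and the real buffer lock and reference counting are reduced to plain fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached buffer; not the real struct xfs_buf. */
struct demo_buf {
	int  refcount;	/* references held by the cache and by lookups */
	bool locked;	/* models the buffer lock */
	bool broken;	/* set when attaching pages to the buffer failed */
};

/*
 * Models a cache-hit lookup: take a reference, take the buffer lock,
 * and only then test the broken flag.
 */
static struct demo_buf *demo_lookup_hit(struct demo_buf *bp)
{
	bp->refcount++;		/* reference taken while the buffer is in the cache */
	assert(!bp->locked);	/* single-threaded model: the lock is always free */
	bp->locked = true;

	if (bp->broken) {
		/* Never use a broken buffer: unlock and drop our reference. */
		bp->locked = false;
		bp->refcount--;
		return NULL;
	}
	return bp;		/* returned locked, with a reference held */
}
```

The flag is tested only after the lock is held, so a lookup that was blocked on the lock sees the flag as soon as it is woken and backs out by dropping its reference instead of touching the buffer.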
This, however, doesn't solve the problem completely - there may be a
delay in the buffer getting freed (e.g. pre-emption), so when we try
the lookup a second time with a new buffer to insert into the tree,
if we find the broken buffer again, drop the buffer lock, sleep for
a short while, and try the lookup again. When the broken buffer is
finally removed from the cache we will make forwards progress.
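
As a rough sketch of the retry behaviour described above (again a user-space model under stated assumptions, not the patch itself): demo_cache, demo_try_insert() and demo_get_with_retry() are hypothetical names, the eventual freeing of the broken buffer is simulated with a countdown, and the short sleep is modelled by a counter rather than a real delay.

```c
#include <assert.h>

enum demo_result { DEMO_INSERTED, DEMO_FOUND_BROKEN };

/* Hypothetical cache state for one buffer location. */
struct demo_cache {
	int attempts_until_freed;	/* lookups that still hit the broken buffer */
	int sleeps;			/* how many times the caller backed off */
};

/* One lookup attempt: reports the broken buffer until it has been freed. */
static enum demo_result demo_try_insert(struct demo_cache *c)
{
	if (c->attempts_until_freed > 0) {
		c->attempts_until_freed--;
		return DEMO_FOUND_BROKEN;
	}
	return DEMO_INSERTED;
}

/* Retry until the broken buffer is gone; returns the number of retries. */
static int demo_get_with_retry(struct demo_cache *c)
{
	int retries = 0;

	while (demo_try_insert(c) == DEMO_FOUND_BROKEN) {
		/* Stand-in for "drop the buffer lock, sleep for a short while". */
		c->sleeps++;
		retries++;
	}
	return retries;
}
```

The loop terminates only because the broken buffer is eventually removed from the cache; the back-off gives whoever holds the last reference a chance to run and free it.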
Signed-off-by: Dave Chinner<dchinner@xxxxxxxxxx>
fs/xfs/xfs_buf.c | 33 ++++++++++++++++++++++++++++++++-
fs/xfs/xfs_buf.h | 2 ++
fs/xfs/xfs_trace.h | 2 ++
3 files changed, 36 insertions(+), 1 deletions(-)
Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>