Allowing writeback from reclaim context causes massive problems with stack
overflows as we can call into the writeback code which tends to be a heavy
stack user both in the generic code and XFS from random contexts that
perform memory allocations.
Follow the example of btrfs (and in slightly different form ext4) and refuse
to write out data from reclaim context. This issue should really be handled
by the VM so that we can tune better for this case, but until we get it
sorted out there we have to hack around this in each filesystem with a
complex writeback path.
Signed-off-by: Christoph Hellwig <hch@xxxxxx>
--- xfs.orig/fs/xfs/linux-2.6/xfs_aops.c 2010-05-29 18:09:01.713253891
+++ xfs/fs/xfs/linux-2.6/xfs_aops.c 2010-05-29 18:11:46.847261713 +0200
@@ -1333,6 +1333,21 @@ xfs_vm_writepage(
trace_xfs_writepage(inode, page, 0);
+ * Refuse to write the page out if we are called from reclaim context.
+ * This is primarily to avoid stack overflows when called from deep
+ * used stacks in random callers for direct reclaim, but disabling
+ * reclaim for kswap is a nice side-effect as kswapd causes rather
+ * suboptimal I/O patters, too.
+ * This should really be done by the core VM, but until that happens
+ * filesystems like XFS, btrfs and ext4 have to take care of this
+ * by themselves.
+ if (current->flags & PF_MEMALLOC)
+ goto out_fail;
* We need a transaction if:
* 1. There are delalloc buffers on the page
* 2. The page is uptodate and we have unmapped buffers