
Re: 3.2.9 and locking problem

To: Arkadiusz Miśkiewicz <arekm@xxxxxxxx>
Subject: Re: 3.2.9 and locking problem
From: Dave Chinner <david@xxxxxxxxxxxxx>
Date: Mon, 12 Mar 2012 11:53:25 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <201203092028.47177.arekm@xxxxxxxx>
References: <201203092028.47177.arekm@xxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Mar 09, 2012 at 08:28:47PM +0100, Arkadiusz Miśkiewicz wrote:
> 
> Are there any bugs in the area visible in the tracebacks below? I have a
> system where one operation (upgrade of a single rpm package) causes the
> rpm process to hang in D-state, sysrq-w below:
> 
> [  400.755253] SysRq : Show Blocked State
> [  400.758507]   task                        PC stack   pid father
> [  400.758507] rpm             D 0000000100005781     0  8732   8698 0x00000000
> [  400.758507]  ffff88021657dc48 0000000000000086 ffff880200000000 ffff88025126f480
> [  400.758507]  ffff880252276630 ffff88021657dfd8 ffff88021657dfd8 ffff88021657dfd8
> [  400.758507]  ffff880252074af0 ffff880252276630 ffff88024cb0d005 ffff88021657dcb0
> [  400.758507] Call Trace:
> [  400.758507]  [<ffffffff8114b22a>] ? kmem_cache_free+0x2a/0x110
> [  400.758507]  [<ffffffff8114d2ed>] ? kmem_cache_alloc+0x11d/0x140
> [  400.758507]  [<ffffffffa00df3c7>] ? kmem_zone_alloc+0x67/0xe0 [xfs]
> [  400.758507]  [<ffffffff8148b78a>] schedule+0x3a/0x50
> [  400.758507]  [<ffffffff8148d25d>] rwsem_down_failed_common+0xbd/0x150
> [  400.758507]  [<ffffffff8148d303>] rwsem_down_write_failed+0x13/0x20
> [  400.758507]  [<ffffffff812652a3>] call_rwsem_down_write_failed+0x13/0x20
> [  400.758507]  [<ffffffff8148c8ed>] ? down_write+0x2d/0x40
> [  400.758507]  [<ffffffffa00cf97c>] xfs_ilock+0xcc/0x120 [xfs]
> [  400.758507]  [<ffffffffa00d4ace>] xfs_setattr_nonsize+0x1ce/0x5b0 [xfs]
> [  400.758507]  [<ffffffff81265502>] ? __strncpy_from_user+0x22/0x60
> [  400.758507]  [<ffffffffa00d52ab>] xfs_vn_setattr+0x1b/0x40 [xfs]
> [  400.758507]  [<ffffffff8117c1a2>] notify_change+0x1a2/0x340
> [  400.758507]  [<ffffffff8115ed80>] chown_common+0xd0/0xf0
> [  400.758507]  [<ffffffff8115fe4c>] sys_chown+0xac/0x1a0
> [  400.758507]  [<ffffffff81495112>] system_call_fastpath+0x16/0x1b

I can't see why we'd get a task stuck here - it's waiting on the
XFS_ILOCK_EXCL. The only reason for that would be a leaked unlock
somewhere. Since you can reproduce this fairly quickly, running an
event trace via trace-cmd for all the xfs_ilock trace points and
posting the report output might tell us which inode is blocked and
where the unlock was leaked (if that is the cause).
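[For reference, one way to capture such a trace is sketched below. The
tracepoint names are the usual XFS ones (`xfs_ilock`, `xfs_ilock_nowait`,
`xfs_ilock_demote`, `xfs_iunlock`); verify what your kernel exports with
`trace-cmd list -e xfs` first. Recording needs root, and the output file
name is just an example.]

```shell
# Start a system-wide recording of the xfs_ilock family of
# tracepoints plus the matching unlock event, in the background.
# Requires root and a kernel built with XFS tracepoints.
trace-cmd record -e 'xfs:xfs_ilock*' -e 'xfs:xfs_iunlock' &

# ... reproduce the hang (e.g. re-run the rpm upgrade) in another
# terminal, then stop the recording:
kill %1
wait

# Turn the resulting trace.dat into a human-readable report; this
# is the output worth posting to the list.
trace-cmd report > xfs_ilock_trace.txt
```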

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
