Mikhail Pershin authored
mdt_dom_discard_data() takes a new lock to force data discard for all
conflicting client locks. This was done in the context of unlink RPC
processing and could leave the RPC stuck waiting for clients to cancel
their locks, leading to cascading timeouts for any other locks waiting
on the same resource and parent directory.

The patch skips waiting for the discard lock in the current context by
using its own CP callback, which does not wait for blocking locks. Such
locks are finished later by LDLM and cleaned up in that completion
callback. The current thread therefore only makes sure the discard
locks are taken and the BL ASTs are sent, but does not wait for the
locks to be granted, which fixes the original problem.

At the same time, this opens a window for a race with data being
flushed on the client, so new IO from a client may arrive for a
just-unlinked object, causing an error message that cannot be
distinguished from other, possibly critical, situations. To solve that,
the unlinked object is pinned in memory until the discard lock is
granted. Such objects can then be easily identified as stale, and any
IO against them can be silently ignored.

Older clients are not fully compatible with async DoM discard, so the
patch also adds a new connection flag, ASYNC_DISCARD, to distinguish
old clients and use the old blocking discard for them.

Test-Parameters: testlist=racer,racer,racer
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I419677af43c33e365a246fe12205b506209deace
Reviewed-on: https://review.whamcloud.com/34071
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
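
For readers unfamiliar with the pattern, the idea in the message above
(enqueue the discard lock without waiting, pin the unlinked object until
the lock is granted, and silently ignore IO against a stale object) can
be modelled roughly as follows. This is only a simplified sketch, not
the patch itself: every type and function name here (struct obj, struct
discard_lock, unlink_async, lock_manager_grant, handle_io) is
hypothetical and does not correspond to real Lustre or LDLM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not real Lustre types. */
struct obj {
	int  refcount;	/* references pinning the object in memory */
	bool stale;	/* set once the object has been unlinked */
};

struct discard_lock {
	struct obj *target;
	bool granted;
	/* completion callback, invoked when the lock is finally granted */
	void (*completion)(struct discard_lock *lk);
};

/* Completion callback: mark the lock granted and drop the pin. */
static void discard_done(struct discard_lock *lk)
{
	lk->granted = true;
	lk->target->refcount--;
}

/*
 * Unlink path: mark the object stale, pin it, and register the discard
 * lock. The caller returns immediately instead of waiting for the grant.
 */
static void unlink_async(struct obj *o, struct discard_lock *lk)
{
	o->stale = true;
	o->refcount++;		/* pin until the discard lock is granted */
	lk->target = o;
	lk->granted = false;
	lk->completion = discard_done;
}

/* Runs later, once conflicting client locks have been cancelled. */
static void lock_manager_grant(struct discard_lock *lk)
{
	lk->completion(lk);
}

/* IO handler: IO against a stale (unlinked) object is silently ignored. */
static int handle_io(const struct obj *o, int nbytes)
{
	if (o->stale)
		return 0;	/* nothing processed, no error reported */
	return nbytes;
}
```

In this model the unlinking thread only calls unlink_async(); the pin
taken there is released in discard_done(), so late IO can always find
the object and recognize it as stale rather than hitting an error.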
9c028e74