    LU-11359 mdt: fix mdt_dom_discard_data() timeouts · 9c028e74
    Mikhail Pershin authored
    mdt_dom_discard_data() issues a new lock to force data
    discard for all conflicting client locks. This was done in
    the context of unlink RPC processing and could leave it stuck
    waiting for clients to cancel their locks, leading to cascading
    timeouts for any other locks waiting on the same resource and
    on the parent directory.
    The patch skips waiting for the discard lock in the current
    context by using its own CP callback, which doesn't wait for
    blocking locks. The discard locks are finished later by LDLM
    and cleaned up in that completion callback. So the current
    thread just makes sure the discard locks are taken and the BL
    ASTs are sent, but doesn't wait for lock granting, which fixes
    the original problem.
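The non-waiting completion callback described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: every name here (`toy_lock`, `toy_enqueue_discard`, `toy_discard_cp_async`, `toy_grant`) is hypothetical and none of it is Lustre or LDLM API.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names are Lustre APIs. */
enum toy_lock_state { TOY_WAITING, TOY_GRANTED, TOY_RELEASED };

struct toy_lock {
	enum toy_lock_state state;
	int bl_asts_sent;	/* blocking callbacks sent to conflicting holders */
};

/* Enqueue a discard lock: notify conflicting holders, never wait here. */
static void toy_enqueue_discard(struct toy_lock *lk, int conflicting)
{
	lk->state = conflicting > 0 ? TOY_WAITING : TOY_GRANTED;
	lk->bl_asts_sent = conflicting;
}

/*
 * Completion callback that does not block. It may be called by the
 * enqueuing thread and again later by the lock manager. It returns
 * true only when the lock was granted and has now been cleaned up.
 */
static bool toy_discard_cp_async(struct toy_lock *lk)
{
	if (lk->state != TOY_GRANTED)
		return false;	/* still blocked: return immediately */
	lk->state = TOY_RELEASED;
	return true;
}

/* Lock manager side: all conflicting locks have been cancelled. */
static void toy_grant(struct toy_lock *lk)
{
	lk->state = TOY_GRANTED;
}
```

In this model the enqueuing thread calls `toy_discard_cp_async()` once, gets `false` back without sleeping, and moves on; the cleanup happens on the later call that follows `toy_grant()`.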
    At the same time, that opens a window for a race with data
    being flushed on the client, so new IO from a client may arrive
    for a just-unlinked object, causing an error message, and that
    case cannot be distinguished from other, possibly critical,
    situations. To solve that, the unlinked object is pinned in
    memory until the discard lock is granted. Such objects can
    therefore be easily identified as stale, and any IO against
    them can be silently ignored.
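The pinning idea above can be sketched as follows. This is a minimal illustration, assuming a simple pin counter; the names `toy_obj`, `toy_unlink`, `toy_discard_granted` and `toy_io` are hypothetical and do not correspond to MDT code.

```c
#include <stdbool.h>

/* Hypothetical toy model; none of these names are Lustre APIs. */
enum toy_io_result { TOY_IO_DONE, TOY_IO_IGNORED_STALE, TOY_IO_ERROR };

struct toy_obj {
	int  pins;	/* references keeping the object in memory */
	bool in_memory;
	bool unlinked;
};

/* Unlink: mark the object and pin it until the discard lock is granted. */
static void toy_unlink(struct toy_obj *o)
{
	o->unlinked = true;
	o->pins++;
}

/* Discard lock granted: drop the pin, letting the object go away. */
static void toy_discard_granted(struct toy_obj *o)
{
	if (--o->pins == 0)
		o->in_memory = false;
}

/* IO from a client that may race with the unlink. */
static enum toy_io_result toy_io(const struct toy_obj *o)
{
	if (!o->in_memory)
		return TOY_IO_ERROR;		/* unexpected case: report it */
	if (o->unlinked)
		return TOY_IO_IGNORED_STALE;	/* known stale object: ignore */
	return TOY_IO_DONE;
}
```

Because the pinned object is still in memory and carries the unlinked mark, late IO can be recognized as harmless instead of being reported like an unexplained missing object.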
    Older clients are not fully compatible with async DoM discard,
    so the patch also adds a new connection flag, ASYNC_DISCARD, to
    distinguish old clients and use the old blocking discard for them.
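Choosing the discard mode from a client's connection flags might look like the sketch below. The macro name, its bit position, and the helper `toy_pick_discard_mode` are assumptions made for illustration only; the real flag definition lives in the patch itself.

```c
#include <stdint.h>

/* Hypothetical flag bit, chosen only for this sketch. */
#define TOY_CONNECT_ASYNC_DISCARD (UINT64_C(1) << 0)

enum toy_discard_mode { TOY_DISCARD_BLOCKING, TOY_DISCARD_ASYNC };

/* Clients that did not negotiate the flag get the old blocking path. */
static enum toy_discard_mode toy_pick_discard_mode(uint64_t client_flags)
{
	return (client_flags & TOY_CONNECT_ASYNC_DISCARD) ?
		TOY_DISCARD_ASYNC : TOY_DISCARD_BLOCKING;
}
```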
    Test-Parameters: testlist=racer,racer,racer
    Signed-off-by: Mikhail Pershin <>
    Change-Id: I419677af43c33e365a246fe12205b506209deace
    Tested-by: Jenkins
    Reviewed-by: Andreas Dilger <>
    Reviewed-by: Patrick Farrell <>
    Tested-by: Maloo <>
    Reviewed-by: Oleg Drokin <>