    ccaeafd4
    b=15230 · ccaeafd4
    Yury Umanets authored
    r=nikita,shadow
    
    - fixed handling of OBD_FAIL_$PREF_$OPC_NET fail_ids in mdt. The former code in
    mdt_req_handle() did not check them correctly (a typo: && instead of &), so
    none of them worked. At the same time, some handlers, such as mdt_close() and
    mdt_enqueue(), tried to check them again (the result of an earlier wrong fix)
    but also did it incorrectly: they returned 0 without doing anything. This was
    meant to emulate a network failure, but because they returned 0 without
    allocating a reply buffer, they triggered the rs != NULL assertion in ptlrpc.
    Fixing this also fixed replay-single.sh test_53* and replay-dual.sh test_12,
    and possibly others;
    
    - removed the NET fail_id checks from mdt_close() and mdt_enqueue(), which
    were the sources of the recent assertion;
    
    - added a sanity check in mdt_req_handle() that catches any other case where a
    handler returns 0 without allocating reply buffers;
    
    - removed mdt_reply() and moved its one-line call into mdt_req_handle(). This
    was needed to simplify handling of NET fail_ids, in which case we should just
    return 0 and make sure that no reply is sent;
    
    - comments and cleanups;
    
    - in replay-dual.sh: removed test 8 from ALWAYS_EXCEPT. It passes in HEAD. It
    was originally placed into ALWAYS_EXCEPT for the old mds code and later moved
    to the HEAD test scripts, but as mds in HEAD is completely new, this bug no
    longer makes any sense there;
    
    - in replay-single.sh: removed tests 0b, 39 and 56 from ALWAYS_EXCEPT. They
    pass in HEAD. They are also obsolete and relate to closed bugs.
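
    For illustration only, here is a minimal sketch of the two ideas described
    above: how an && typo in place of & can keep a fail_id check from ever
    matching, and what a "returned 0 but no reply buffer" sanity check might
    look like. This is not the mdt code. All names, the mask, the id values
    and the no_reply flag are hypothetical and chosen only to make the
    example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Buggy form: '&&' collapses both operands to 0 or 1, so the result can
 * only equal armed_id when armed_id itself is 0 or 1. */
static bool fail_id_matches_buggy(unsigned int fail_id, unsigned int mask,
                                  unsigned int armed_id)
{
    return (fail_id && mask) == armed_id;
}

/* Fixed form: '&' keeps the masked bits, so the comparison is meaningful. */
static bool fail_id_matches(unsigned int fail_id, unsigned int mask,
                            unsigned int armed_id)
{
    return (fail_id & mask) == armed_id;
}

/* Hypothetical stand-in for a request; not a Lustre structure. */
struct demo_req {
    void *reply;     /* NULL until a reply buffer has been allocated */
    bool  no_reply;  /* assumed flag: a NET failure is being emulated */
};

/* A handler result is treated as consistent if it reports an error, if a
 * reply buffer exists, or if dropping the reply was intentional. */
static bool handler_result_is_sane(int rc, const struct demo_req *req)
{
    return rc != 0 || req->reply != NULL || req->no_reply;
}

/* Catch the inconsistency at the dispatch point rather than later on. */
static void check_handler_result(int rc, const struct demo_req *req)
{
    assert(handler_result_is_sane(rc, req));
}
```

    With a hypothetical id of 0x123 and mask of 0xffff, the buggy form
    compares 1 against 0x123 and never matches, while the fixed form does.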