lustre/tests/replay-single.sh · ccaeafd40baad0b8e8d0ab544a18f2f5b0b5f9a6 · debian-packages / lustre-release

b=15230 · ccaeafd4

Yury Umanets authored 16 years ago

r=nikita,shadow

- fixed handling of OBD_FAIL_$PREF_$OPC_NET fail_ids in mdt. The former code in
mdt_req_handle() did not check them correctly (a typo: && instead of &), so none
of them worked. At the same time, some handlers such as mdt_close() and
mdt_enqueue() tried to check them again (the result of an earlier wrong fix), but
also did so incorrectly: they returned 0 without doing anything. This was meant
to emulate a network failure, but because they returned 0 without allocating a
reply buffer, they triggered the rs != NULL assert in ptlrpc. Fixing this also
fixed replay-single.sh test_53* and replay-dual.sh test_12, and possibly others;

- removed the NET fail_id checks from mdt_close() and mdt_enqueue(), which were
the sources of the recent assert;

- added a sanity check in mdt_req_handle() for any other invalid case where a
handler returns 0 without allocating reply buffers;

- removed mdt_reply() and moved its one-line call into mdt_req_handle(). This was
needed to simplify handling of NET fail_ids, in which case we should just return
0 and make sure that no reply is sent;

- comments and cleanups;

- in replay-dual.sh: removed test 8 from ALWAYS_EXCEPT. It passes in HEAD. It was
originally placed into ALWAYS_EXCEPT for the old mds code and later moved to the
HEAD test scripts, but as the mds in HEAD is completely new, this bug no longer
makes sense there;

- in replay-single.sh: removed tests 0b, 39 and 56 from ALWAYS_EXCEPT. They pass
in HEAD. They are also obsolete and relate to closed bugs.
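
To illustrate the && versus & typo and the new sanity check described above, here
is a minimal sketch in C. It is not the Lustre code: the exact expression in
mdt_req_handle() is not shown in this message, and every name below (the DEMO_*
flag bits, demo_check_buggy(), demo_check_fixed(), demo_result_is_sane()) is
hypothetical and exists only for illustration.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the real Lustre fail_id values. */
#define DEMO_FAIL_NET   0x0100u
#define DEMO_FAIL_OTHER 0x0200u

/* Typo variant: logical AND is true whenever both operands are nonzero,
 * so it does not test whether the masked bit is actually set. */
static bool demo_check_buggy(unsigned int loc, unsigned int mask)
{
    return loc && mask;
}

/* Intended variant: bitwise AND is nonzero only if a bit from mask is set in loc. */
static bool demo_check_fixed(unsigned int loc, unsigned int mask)
{
    return (loc & mask) != 0;
}

/* Sketch of the sanity rule from the message: a handler result of 0 is only
 * acceptable if a reply buffer was allocated, or if the reply was dropped on
 * purpose to emulate a network failure. A nonzero rc is always acceptable here. */
static bool demo_result_is_sane(int rc, bool has_reply, bool reply_dropped)
{
    return rc != 0 || has_reply || reply_dropped;
}
```

With loc set to DEMO_FAIL_OTHER and mask set to DEMO_FAIL_NET, demo_check_buggy()
returns true while demo_check_fixed() returns false, which is how a one-character
typo can silently change what a flag test means. demo_result_is_sane(0, false,
false) returns false, matching the case that previously tripped the rs != NULL
assert.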
