- Nov 11, 2008
-
Elena Gryaznova authored
i=Adilger conf-sanity test_32* fix to not be skipped for remote setup
-
Yang Sheng authored
b=17374 i=shadow, bobijam kernel update for sles9 2.6.5-7.314.
-
Hongchao Zhang authored
b=17176 fixed a bug in the 14774 patch -- compare the peer's nid instead of self's nid in ptlrpc_connection when selecting failover MDS/OST nodes i=deen
-
Yang Sheng authored
b=17458 i=shadow, bobijam Update kernel to SLES10 SP2 2.6.16.60-0.31.
-
Hongchao Zhang authored
move the check of the OST's recovering state in osc_precreate out of the "if (oscc->oscc_last_id < oscc->oscc_next_id)" condition so that create operations don't use a recovering OST i=adilger i=nathan.rutman
-
- Nov 08, 2008
-
Alexey Lyashkov authored
Branch b1_6 b=16492 i=green i=johann
-
Yury Umanets authored
r=shadow,johann
  - make sure that no new inflight rpcs can arrive after ptlrpcd_deactivate_import(), for both synchronous and asynchronous sending. To do so, imp_inflight++ is done only when permission is granted by ptlrpc_import_delay_req(), which decides whether the req should be sent, deferred, or killed because the import is not in a state to send it in the foreseeable future. For async sending, an rpc is counted as inflight only when it is added to the sending or delaying list, not when it is merely added to a set for processing. This fixes an assert in ptlrpc_invalidate_import() and a number of other issues;
  - synchronize imp_inflight with presence on the sending or delaying list for the ptlrpc_queue_wait() case, so it is now guaranteed that if imp_inflight != 0 we can always find the hanging rpc on either the sending or the delaying list;
  - make sure that in ptlrpc_queue_wait() we remove the rpc from the sending or delaying list and decrement inflight only after ptlrpc_unregister_reply() is done. This keeps the accounting correct: an rpc can't be returned to the pool or counted as finished until lnet lets us go with a finished reply unlink;
  - check for inflight and rq_list in pinger;
  - comments, cleanups;
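The accounting rule described in this commit can be sketched in a few lines of C. This is only a toy model, not the Lustre code: the types `import_model` and `req_model` and the functions `import_delay_req()`, `queue_req()` and `finish_req()` are hypothetical names invented for illustration. It assumes that the inflight counter is bumped only once the delay check has allowed the request onto a list, and dropped only once the reply unlink has completed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these names come from Lustre. */
enum delay_verdict { REQ_SEND, REQ_DELAY, REQ_KILL };
enum req_list { LIST_NONE, LIST_SENDING, LIST_DELAYED };

struct import_model {
    int inflight;      /* models imp_inflight */
    bool deactivated;  /* set once the import has been deactivated */
    bool connected;    /* false means requests must be deferred */
};

struct req_model {
    enum req_list list; /* which list the request currently sits on */
};

/* Decide whether a request may be sent now, must wait, or must be killed. */
enum delay_verdict import_delay_req(const struct import_model *imp)
{
    if (imp->deactivated)
        return REQ_KILL;
    return imp->connected ? REQ_SEND : REQ_DELAY;
}

/* Count the request as inflight only when it is actually put on a list.
 * Returns 0 when queued, -1 when the request was killed. */
int queue_req(struct import_model *imp, struct req_model *req)
{
    switch (import_delay_req(imp)) {
    case REQ_KILL:
        req->list = LIST_NONE;
        return -1;
    case REQ_DELAY:
        req->list = LIST_DELAYED;
        break;
    case REQ_SEND:
        req->list = LIST_SENDING;
        break;
    }
    imp->inflight++;
    return 0;
}

/* Drop the request from its list and decrement inflight, but only after
 * the reply unlink has finished. */
void finish_req(struct import_model *imp, struct req_model *req,
                bool reply_unlinked)
{
    if (!reply_unlinked || req->list == LIST_NONE)
        return;
    assert(imp->inflight > 0);
    req->list = LIST_NONE;
    imp->inflight--;
}
```

In this model, `inflight != 0` always implies that some request is on the sending or delayed list, which is the invariant the commit message says it establishes.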
-
- Nov 07, 2008
-
Elena Gryaznova authored
i=Adilger replace cleanup_and_setup_lustre fn by check_and_setup_lustre fn
-
Yury Umanets authored
r=adilger,johann
  - removes the possibility of deadlock by disabling rehash in hash_del() operations and moving hash_add() calls out of spin_locks. The hash table has its own mechanisms for protecting its structures, and it also has a hash_add_unique() method for use in concurrent contexts;
  - fixed a missed lh_put() in hash_add_unique() which led to extra refs in some cases (extra ref to export) and an inability to clean up;
  - fixed __lustre_hash_set_theta() which set @max theta into ->lh_min_theta;
  - in lustre_hash_rehash_size() disable rehash also for the case when the new and old hash sizes are equal in corner cases (max_size or min_size). Before this fix it was possible to perform this expensive operation needlessly when the size did not actually change;
  - disable rehash in hash_add_unique() if no actual add happened because an entry with the same key was already found in the table;
  - some cleanups in hash table code;
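Two of the fixes above (skipping a rehash when the size would not change, and storing the max theta in the right field) can be illustrated with a small model. This is a sketch under stated assumptions, not the lustre_hash implementation: `hash_model`, `rehash_size()` and `set_theta()` are hypothetical names, and growing or shrinking by a factor of two is assumed purely for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hash table header. */
struct hash_model {
    unsigned cur_size;
    unsigned min_size;
    unsigned max_size;
    int min_theta;
    int max_theta;
};

/* Return the size to rehash to, or 0 when no rehash is needed.
 * At the min_size/max_size corners the clamped size equals the current
 * size, so the expensive rehash is skipped. */
unsigned rehash_size(const struct hash_model *h, bool grow)
{
    unsigned new_size;

    assert(h->min_size <= h->max_size);
    new_size = grow ? h->cur_size * 2 : h->cur_size / 2;
    if (new_size > h->max_size)
        new_size = h->max_size;
    if (new_size < h->min_size)
        new_size = h->min_size;
    return new_size == h->cur_size ? 0 : new_size;
}

/* Store each theta bound in its own field; the bug described above
 * wrote the max value into the min field. */
void set_theta(struct hash_model *h, int min, int max)
{
    assert(min <= max);
    h->min_theta = min;
    h->max_theta = max;
}
```

A table already at `max_size` that is asked to grow gets 0 back from `rehash_size()`, so the caller does nothing.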
-
Elena Gryaznova authored
i=Adilger check config if lustre is mounted before acc-sm run
-
Elena Gryaznova authored
i=Brian assert_DIR cleanup
-
Yury Umanets authored
r=tappro,johann - implements proper locking for rq pool freeing
-
Johann Lombardi authored
b=16860 i=nathan i=rread
Description: Excessive recovery window
Details: With AT enabled, the recovery window can be excessively long (6000+ seconds). To address this problem, we no longer use OBD_RECOVERY_FACTOR when extending the recovery window (the connect timeout no longer depends on the service time; it is now set to INITIAL_CONNECT_TIMEOUT) and clients report the old service time via pb_service_time.
-
Bobi Jam authored
b=16578 o=adilger A faster way to get a long string.
-
- Nov 06, 2008
-
Yury Umanets authored
- make sure that rpcs in RQ_PHASE_UNREGISTERING phase can be marked expired and interrupted.
-
Yury Umanets authored
r=johann,shadow
  - fixes ptlrpcd blocking on very long waits for reply unlink. To do so, a new rpc phase, RQ_PHASE_UNREGISTERING, is introduced, in which a request stays until lnet calls reply_in_callback() to signal that the reply is unlinked. All requests in this phase are skipped by ptlrpcd instead of waiting n * 300s on each of them. This allows ptlrpcd to process other rpcs in the set;
  - make sure that the inflight count is coherent with presence on the sending or delay list. That is, if we see inflight != 0, the rpc must be on one of these lists. This is very helpful in ptlrpc_invalidate_import() for showing all rpcs still waiting after the import is invalidated;
  - in ptlrpc_invalidate_import(), wait for the maximal rq_deadline - now over all inflight rpcs instead of obd_timeout which may be much longer. If the calculated timeout is 0, obd_timeout is used. This fixes the case where rq_deadline - now > obd_timeout (very easy to see in logs), which led to the inflight != 0 assert because inflight rpcs timed out after our wait period had finished;
  - in ptlrpc_invalidate_import(), wait forever for rpcs in the UNREGISTERING phase. If the wait timed out, assert inflight == 0 only when no rpcs are in the UNREGISTERING phase. Only those in the UNREGISTERING phase are allowed to stay longer than obd_timeout;
  - added the ptlrpc_move_rqphase() function. All phase changes go through it. Add debug_req() there to track down all phase changes;
  - conf_sanity.sh test_45 added to emulate a very long reply unlink and also the situation when rq_deadline - now > obd_timeout;
  - do not wait forever in ptlrpc_unregister_reply() for the async case (when used from sets). The sync case is left unchanged;
  - make sure that ptlrpc_set_next_timeout() yields a 1s timeout (instead of 0s) for a set with rpcs in the "unregistering" stage, to prevent ptlrpcd from sleeping forever and hanging in test_45;
  - in ptlrpcd() make sure that we do not sleep on a 0 timeout.
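The two timeout rules in this commit are simple enough to show as arithmetic. The sketch below is an illustration only: `invalidate_wait()` and `set_next_timeout()` are hypothetical helpers, deadlines are plain seconds, and the real functions work on request structures rather than arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Wait for the largest (deadline - now) among inflight rpcs.
 * If that comes out as 0, fall back to obd_timeout. */
long invalidate_wait(const long *deadlines, size_t n, long now,
                     long obd_timeout)
{
    long wait = 0;
    size_t i;

    assert(obd_timeout > 0);
    for (i = 0; i < n; i++) {
        long left = deadlines[i] - now;

        if (left > wait)
            wait = left;
    }
    return wait > 0 ? wait : obd_timeout;
}

/* A set holding rpcs in the unregistering phase must never yield a
 * 0s timeout, otherwise the daemon could sleep forever. */
long set_next_timeout(long computed, int unregistering)
{
    if (unregistering > 0 && computed < 1)
        return 1;
    return computed;
}
```

With deadlines of 110, 150 and 90 at `now = 100`, the wait is 50 seconds even if `obd_timeout` is 30, which is the situation the commit says used to trigger the inflight assert.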
-
- Nov 05, 2008
-
Andrew Perepechko authored
b=17371 i=Johann Lombardi i=Oleg Drokin fix a race between requeue thread processing and umount
-
Elena Gryaznova authored
i=Adilger correct remote_[mds|ost] fn to work correctly on configurations with several MDS/OSS nodes
-
kalpak authored
b=16438 i=adilger i=girish Mounting a filesystem with the extents feature will fail on big-endian systems since ext3-based ldiskfs is not supported on big-endian systems. This can be overridden with the "bigendian_extents" mount option.
-
Jinshan Xiong authored
b=15715 r=adilger,green Fixed the race between destroying and enqueuing an ldlm lock on the OST side.
-
Bobi Jam authored
b=16578 i=adilger
Description: ldlm_cancel_pack()) ASSERTION(max >= dlm->lock_count + count)
Details: If there is no extra space in the request for early cancels, ldlm_req_handles_avail() returns 0 instead of a negative value.
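Reading the Details line as the fixed behaviour, the clamp can be sketched as follows. This is a hypothetical model, not ldlm_req_handles_avail() itself: the function name `handles_avail()` and its byte-count parameters are assumptions made for the example.

```c
#include <assert.h>

/* Number of extra lock handles that fit in the space left in a request.
 * When the request is already full (or over-full), report 0 rather than
 * a negative count. */
int handles_avail(int req_space, int used_space, int handle_size)
{
    int room;

    assert(handle_size > 0);
    room = req_space - used_space;
    if (room <= 0)
        return 0;
    return room / handle_size;
}
```

A caller that adds this value to a lock count then never sees the count shrink because of a negative result.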
-
Liu Ying authored
-
- Nov 04, 2008
-
Yury Umanets authored
-
Yang Sheng authored
b=17534 i=adilger, yangsheng Fix for client crash caused by old-style mount command.
-
tianzy authored
Replace LBUG with RETURN(-EINVAL) to avoid crashing b=5135 i=adilger i=johann
-
- Nov 03, 2008
-
Mikhail Pershin authored
b:12512 i:grev, adilger
-
Andrew Perepechko authored
b=17493 i=Andreas Dilger i=Johann Lombardi handling of a broken readonly key
-
tianzy authored
fix an error in the test_18 of sanity-quota.sh b=17523 i=johann i=panda
-
Andreas Dilger authored
Quiet compiler warning about unused label. Conditional check will be optimized away by compiler.
-
Andreas Dilger authored
Fix 80-column line wrapping.
-
- Oct 31, 2008
-
Elena Gryaznova authored
i=Nikita sanity test_100 fix
-
Elena Gryaznova authored
i=Nikita test_53 fix
-
Andreas Dilger authored
Remove trailing whitespace.
-
Elena Gryaznova authored
o=Robert.Read i=grev test_27u fix
-
Andrew Perepechko authored
b=13904 i=Johann Lombardi i=ZhiYong Tian 64-bit quota support for kernel
-
cvs2svn authored
-
Yang Sheng authored
b=17379 i=adilger, johann Test case for recursive symlink.
-
Yang Sheng authored
b=17379 i=Brian(LLNL), johann Set recursive symlink depth to 5 when kernel has 4K stack.
-
tianzy authored
fix a possible NULL pointer in client_quota_ctl() b=17486 i=johann i=panda
-
- Oct 30, 2008
-
girish authored
i=johann i=adilger b=17448
-