  1. Nov 11, 2008
    • b=16551 · ee78b870
      Elena Gryaznova authored
      i=Adilger
      fix conf-sanity test_32* so that it is not skipped on remote setups
    • Branch b1_6 · e8519cd5
      Yang Sheng authored
      b=17374
      i=shadow, bobijam

      Kernel update for SLES9 2.6.5-7.314.
    • Branch b1_6 · a85a8430
      Hongchao Zhang authored
      b=17176

      fixed a bug in the 14774 patch: compare the peer's nid instead of our own nid
      in ptlrpc_connection when selecting failover MDS/OST nodes

      i=deen
    • Branch b1_6 · 9296174e
      Yang Sheng authored
      b=17458
      i=shadow, bobijam

      Update kernel to SLES10 SP2 2.6.16.60-0.31.
    • b=17495 · 0ea1ab8b
      Hongchao Zhang authored
      move the check for the OST's recovering state in osc_precreate
      out of the "if (oscc->oscc_last_id < oscc->oscc_next_id)" condition
      so that create operations don't use a recovering OST

      i=adilger
      i=nathan.rutman
  2. Nov 08, 2008
    • don't panic on nfs reexport. · 9b34922a
      Alexey Lyashkov authored
      Branch b1_6
      b=16492
      i=green
      i=johann
    • b=17310 · 7eab5d6b
      Yury Umanets authored
      r=shadow,johann
      
      - make sure that no new inflight rpcs can arrive after ptlrpcd_deactivate_import(), for both
      synchronous and asynchronous sending. To do so, imp_inflight++ is done only when permission is
      granted by ptlrpc_import_delay_req(), which decides whether a req should be sent, deferred, or
      killed because the import is not in a state to send it in the foreseeable future. For async
      sending, an rpc is counted as inflight only when it is added to the sending or delaying list,
      not when it is merely added to a set for processing.

      This fixes an assert in ptlrpc_invalidate_import() as well as a number of other issues;

      - synchronize imp_inflight with presence on the sending or delaying list for the
      ptlrpc_queue_wait() case. It is now guaranteed that if imp_inflight != 0, the hanging rpc can
      always be found on either the sending or the delaying list;

      - make sure that in ptlrpc_queue_wait() we remove the rpc from the sending or delaying list and
      decrement inflight only after ptlrpc_unregister_reply() is done. This keeps the accounting
      correct: an rpc can't be returned to the pool or counted as finished until lnet lets us go
      with a finished reply unlink;

      - check for inflight and rq_list in pinger;

      - comments, cleanups;
  3. Nov 07, 2008
    • b=17477 · 45671edb
      Elena Gryaznova authored
      i=Adilger
      replace the cleanup_and_setup_lustre fn with the check_and_setup_lustre fn
    • b=17511 · d91155fe
      Yury Umanets authored
      r=adilger,johann
      
      - removes a possible deadlock by disabling rehash in hash_del() operations and moving hash_add()
      calls out of spin_locks. The hash table has its own mechanisms for protecting its structures,
      and it also has a hash_add_unique() method for use in concurrent contexts;

      - fixed a missed lh_put() in hash_add_unique() which led to extra refs in some cases (an extra
      ref to the export) and an inability to clean up;

      - fixed __lustre_hash_set_theta(), which set the @max theta into ->lh_min_theta;

      - in lustre_hash_rehash_size(), also disable rehash when the new and old hash sizes are equal
      in corner cases (max_size or min_size). Before this fix, a needless and expensive rehash could
      be done even though the size did not actually change;

      - disable rehash in hash_add_unique() if no actual add happened because an entry with the same
      key was already found in the table;

      - some cleanups in hash table code;
    • b=17477 · f04bb1d7
      Elena Gryaznova authored
      i=Adilger
      check config if lustre is mounted before acc-sm run
    • b=14384 · ac21c0e8
      Elena Gryaznova authored
      i=Brian
      assert_DIR cleanup
    • b=17445 · 1dc6122d
      Yury Umanets authored
      r=tappro,johann

      - implements proper locking for rq pool freeing
    • Branch b1_6 · 1b818746
      Johann Lombardi authored
      b=16860
      i=nathan
      i=rread
      
      Description: Excessive recovery window
      Details    : With AT enabled, the recovery window can be excessively long (6000+
      	     seconds). To address this problem, we no longer use
      	     OBD_RECOVERY_FACTOR when extending the recovery window (the connect
      	     timeout no longer depends on the service time, it is set to
      	     INITIAL_CONNECT_TIMEOUT now) and clients report the old service
      	     time via pb_service_time.
    • Branch b1_6 · 7770cb12
      Bobi Jam authored
      b=16578
      o=adilger

      A faster way to get a long string.
  4. Nov 06, 2008
    • b=17310 · 03fbbb52
      Yury Umanets authored
      - make sure that rpcs in the RQ_PHASE_UNREGISTERING phase can be marked expired and interrupted.
    • b=17310 · 8c981415
      Yury Umanets authored
      r=johann,shadow
      
      - fixes ptlrpcd blocking on a very long wait for reply unlink. To do so, a new rpc phase,
      RQ_PHASE_UNREGISTERING, is introduced, in which a request stays until lnet calls
      reply_in_callback() to signal that the reply is unlinked. All requests in this phase are skipped
      by ptlrpcd instead of waiting n * 300s on each of them. This allows ptlrpcd to process other
      rpcs in the set;

      - make sure that the inflight count is coherent with presence on the sending or delay list. That
      is, if we see inflight != 0, the rpc must be on one of these lists. This is very helpful in
      ptlrpc_invalidate_import() for showing all rpcs still waiting after invalidating the import;

      - in ptlrpc_invalidate_import(), wait for the maximal rq_deadline - now over all inflight rpcs
      instead of obd_timeout which may be much longer. If the calculated timeout is 0, obd_timeout is
      used. This fixes the issue where rq_deadline - now > obd_timeout (very easy to see in logs),
      which led to the inflight != 0 assert because inflight rpcs timed out after our wait period
      had finished;

      - in ptlrpc_invalidate_import(), wait forever for rpcs in the UNREGISTERING phase. When the wait
      times out, assert inflight == 0 only if no rpcs are in the UNREGISTERING phase. Only those in
      the UNREGISTERING phase are allowed to stay longer than obd_timeout;

      - added the ptlrpc_move_rqphase() function. All phase changes go through it. Add debug_req()
      there to track all phase changes;

      - conf_sanity.sh test_45 added to emulate a very long reply unlink and also the situation when
      rq_deadline - now > obd_timeout;

      - do not wait forever in ptlrpc_unregister_reply() for the async case (when it is used from
      sets). The sync case is left unchanged;

      - make sure that ptlrpc_set_next_timeout() yields a 1s timeout (instead of 0s) for a set with
      rpcs in the "unregistering" stage, to prevent ptlrpcd from sleeping forever and hanging in
      test_45;

      - in ptlrpcd(), make sure that we do not sleep on a 0 timeout.
  5. Nov 05, 2008
    • Branch b1_6 · a5fa4e1d
      Andrew Perepechko authored
      b=17371
      i=Johann Lombardi
      i=Oleg Drokin

      fix a race between requeue thread processing and umount
    • b=16551 · 5bf79c45
      Elena Gryaznova authored
      i=Adilger
      fix the remote_[mds|ost] fn to work correctly on configurations
      with several MDS/OSS nodes
    • b1907268
      kalpak authored
      b=16438
      i=adilger
      i=girish

      Mounting a filesystem with the extents feature will fail on big-endian systems, since ext3-based ldiskfs is not supported on them. This can be overridden with the "bigendian_extents" mount option.
    • ee927013
      Jinshan Xiong authored
      b=15715
      r=adilger,green

      Fixed the race between destroying and enqueuing an ldlm lock on the OST side.
    • Branch b1_6 · e9d306ad
      Bobi Jam authored
      b=16578
      i=adilger

      Description: ldlm_cancel_pack() ASSERTION(max >= dlm->lock_count + count)
      Details    : If there is no extra space in the request for early cancels,
                   ldlm_req_handles_avail() returns 0 instead of a negative value.
    • *** empty log message *** · f391b34e
      Liu Ying authored
  6. Nov 04, 2008
  7. Nov 03, 2008
    • - test fix from 12512 · 1a2954ab
      Mikhail Pershin authored
        b:12512
        i:grev, adilger
    • cfa82133
      Andrew Perepechko authored
      b=17493
      i=Andreas Dilger
      i=Johann Lombardi
      handling of a broken readonly key
    • Branch b1_6 · ff76c1dc
      tianzy authored
      fix an error in test_18 of sanity-quota.sh
      b=17523
      i=johann
      i=panda
    • Branch b1_6 · 542edff1
      Andreas Dilger authored
      Quiet compiler warning about unused label.
      Conditional check will be optimized away by compiler.
    • Branch b1_6 · 99eee3f6
      Andreas Dilger authored
      Fix 80-column line wrapping.
  8. Oct 31, 2008
  9. Oct 30, 2008