  1. Nov 19, 2008
    • Elena Gryaznova's avatar
      b=17653 · b341fed2
      Elena Gryaznova authored
      i=Adilger
      test_21c fix: restore config
      b341fed2
    • Alexey Lyashkov's avatar
      fix handle ost additional correctly · 98892021
      Alexey Lyashkov authored
      Branch b1_6
      b=16492
      i=umka
      i=tappro
      98892021
    • Bobi Jam's avatar
      Branch b1_6 · 5142d04a
      Bobi Jam authored
      b=16992
      o=johann
      i=oleg.drokin (green)
      i=zhenyu.xu (bobijam)
      
      During ll_intent_lock(), the server looks up the parent and child and locks them. Between these events the parent could be deleted, in which case vfs_create may_access() fails with -ENOENT.
      
      The client intent disposition then becomes DISP_OPEN_CREATE | DISP_LOOKUP_NEG | DISP_LOOKUP_EXECD | DISP_IT_EXECD, and the request is freed twice.
      
      Solution: Clear DISP_ENQ_COMPLETE when we are going to release the intent (the request cannot be reused anyway).
      5142d04a
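A minimal sketch of how clearing a status bit on release can make a second release harmless, which is the general idea behind the fix above. The struct layouts, the `intent_release()` helper, and the flag value are illustrative assumptions, not Lustre's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only; assumed for this sketch. */
#define DISP_ENQ_COMPLETE 0x00400000u

/* Hypothetical stand-ins for the request and the lookup intent. */
struct request { int refcount; };
struct intent  { unsigned int disposition; struct request *req; };

/* Drop the intent's request reference at most once. Clearing
 * DISP_ENQ_COMPLETE tells later code the request is no longer
 * attached, so it will not be released a second time. */
static void intent_release(struct intent *it)
{
        assert(it != NULL);
        if ((it->disposition & DISP_ENQ_COMPLETE) && it->req != NULL) {
                it->req->refcount--;
                it->req = NULL;
        }
        it->disposition &= ~DISP_ENQ_COMPLETE;
}
```

Calling `intent_release()` twice on the same intent drops the reference only once in this sketch.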
  2. Nov 18, 2008
  3. Nov 17, 2008
    • tianzy's avatar
      Branch b1_6 · 08ce3a8f
      tianzy authored
      decay qos ost/oss penalties if MDS is not creating objects
      i=nathan
      i=johann
      08ce3a8f
    • tianzy's avatar
      Branch b1_6 · d9987483
      tianzy authored
      fix lov_brw_check() calling lov_stripe_intersects() with an incorrect parameter.
      written by nikita
      d9987483
    • tianzy's avatar
      Branch b1_6 · 92008de5
      tianzy authored
      fix the error handling on quota slaves
      i=johann
      i=panda
      92008de5
  4. Nov 15, 2008
  5. Nov 14, 2008
  6. Nov 13, 2008
    • Yury Umanets's avatar
      b=17479 · 06f85cc5
      Yury Umanets authored
      r=adilger,behlendorf1
      
      - avoid div/mod in lustre_hash code
      06f85cc5
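One common way to avoid division and modulo in a hash table is to keep the bucket count a power of two and mask the hash value instead. The sketch below shows that technique under this assumption; `bucket_index()` is a hypothetical helper and is not taken from the lustre_hash code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a hash value to a bucket index without '%'.
 * Assumes the table has exactly 2^bits buckets. */
static inline uint32_t bucket_index(uint32_t hash, unsigned int bits)
{
        assert(bits < 32);                           /* keep the shift well defined */
        uint32_t mask = ((uint32_t)1 << bits) - 1;   /* 2^bits - 1 */
        return hash & mask;                          /* equals hash % 2^bits */
}
```

For a table of 16 buckets (`bits == 4`), `bucket_index(37, 4)` gives the same bucket as `37 % 16`.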
    • tianzy's avatar
      Branch b1_6 · 2d954654
      tianzy authored
      fix lquota.ko failing to install when --disable-liblustre is used
      b=17620
      i=johann
      i=brian
      2d954654
    • Oleg Drokin's avatar
      b=16823 · 0b41127b
      Oleg Drokin authored
      r=shadow,adilger
      
      Lift 4G limit on stripe_size*stripe_count
      4G limit on stripe_size remains in place, though.
      0b41127b
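A 4G limit on stripe_size*stripe_count typically appears when the product is computed in 32 bits. The sketch below illustrates the wraparound and the usual remedy of widening before multiplying. The function names and the assumption that both values are 32-bit are illustrative and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration: total stripe width in bytes.
 * Widening to 64 bits before multiplying avoids wrapping at 4 GiB. */
static uint64_t stripe_width(uint32_t stripe_size, uint32_t stripe_count)
{
        assert(stripe_count > 0);
        return (uint64_t)stripe_size * stripe_count;
}

/* The same product computed in 32 bits, shown only to demonstrate the wrap. */
static uint32_t stripe_width_32(uint32_t stripe_size, uint32_t stripe_count)
{
        return stripe_size * stripe_count;
}
```

With a 1 GiB stripe size and 8 stripes, the 64-bit version yields 8 GiB while the 32-bit version wraps to 0. The stripe size itself still fits in 32 bits here, which matches the note that the 4G limit on stripe_size remains.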
  7. Nov 12, 2008
    • Elena Gryaznova's avatar
      b=17634 · 8a0375e6
      Elena Gryaznova authored
      i=Yury.Umanets
      insanity cleanup (remove dup fn, sync with HEAD t-f)
      8a0375e6
    • Yury Umanets's avatar
      b=17310 · c584acbe
      Yury Umanets authored
      r=shadow,vitaly
      - correct the check for phase in ptlrpc_expired_set() and a couple of other places.
      c584acbe
    • Elena Gryaznova's avatar
      b=16488 · c448adfa
      Elena Gryaznova authored
      i=Oleg.Drokin
      new runracer script
      c448adfa
    • Yury Umanets's avatar
      b=17037 · d35b4f51
      Yury Umanets authored
      r=tappro,wangdi
      
      - fixes an ost cleanup issue caused by a missed llcd_put() when the ost does not receive a disconnect from the mds;
      
        - do not sleep on a hanging llcd. Instead, assert on it _after_ stopping recov_thread's ptlrpcd, which should kill any remaining llcds;
      
        - fixes and cleanups, comments.
      d35b4f51
    • Elena Gryaznova's avatar
      b=17555 · 3a5105ef
      Elena Gryaznova authored
      i=Adilger
      use the current config instead of reformatting the fs to get a single ost
      3a5105ef
    • Hongchao Zhang's avatar
      branch b1_6 · 294526a1
      Hongchao Zhang authored
      b=17505
      
      remove "mfd" from "closing_list" because the "mfd" will be freed in mds_mfd_close
      
      i=robert.read
      294526a1
    • huanghua's avatar
      Branch b1_6 · dc1bb083
      huanghua authored
      b=17602
      i=yury.umanets
      i=tappro
      
      use 1.8/2.0 compatible MDT config for 1.6 mds, easy to upgrade.
      dc1bb083
  8. Nov 11, 2008
    • Elena Gryaznova's avatar
      b=16551 · ee78b870
      Elena Gryaznova authored
      i=Adilger
      conf-sanity test_32* fix to not be skipped for remote setup
      ee78b870
    • Yang Sheng's avatar
      Branch b1_6 · e8519cd5
      Yang Sheng authored
      b=17374
      i=shadow, bobijam
      
      kernel update for sles9 2.6.5-7.314.
      e8519cd5
    • Hongchao Zhang's avatar
      branch b1_6 · a85a8430
      Hongchao Zhang authored
      b=17176
      
      fixed a bug in the 14774 patch -- compare the peer's nid instead of our own nid
      in ptlrpc_connection when selecting failover MDS/OST nodes
      
      i=deen
      a85a8430
    • Yang Sheng's avatar
      Branch b1_6 · 9296174e
      Yang Sheng authored
      b=17458
      i=shadow, bobijam
      
      Update kernel to SLES10 SP2 2.6.16.60-0.31.
      9296174e
    • Hongchao Zhang's avatar
      b=17495 · 0ea1ab8b
      Hongchao Zhang authored
      move the check of the OST's recovering state in osc_precreate
      out of the "if (oscc->oscc_last_id < oscc->oscc_next_id)" condition
      so that create operations do not use a recovering OST
      
      i=adilger
      i=nathan.rutman
      0ea1ab8b
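The effect of hoisting the recovery check out of the id comparison can be sketched as follows. This is a simplified model under stated assumptions: `struct creator_state` and `can_use_for_create()` are hypothetical stand-ins, and the real osc_precreate return values are not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the OSC object creator state. */
struct creator_state {
        uint64_t last_id;     /* last precreated object id */
        uint64_t next_id;     /* next object id to hand out */
        bool     recovering;  /* OST is still recovering */
};

/* Returns true if the OST may be used for a new create. The recovering
 * check runs unconditionally, not only when precreated objects ran out. */
static bool can_use_for_create(const struct creator_state *c)
{
        assert(c != NULL);
        if (c->recovering)
                return false;
        return c->last_id >= c->next_id;   /* precreated objects remain */
}
```

In this model, an OST with precreated objects left but still in recovery is rejected, whereas a check nested inside the `last_id < next_id` branch would never have looked at the recovering flag.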
  9. Nov 08, 2008
    • Alexey Lyashkov's avatar
      don't panic on nfs reexport. · 9b34922a
      Alexey Lyashkov authored
      Branch b1_6
      b=16492
      i=green
      i=johann
      9b34922a
    • Yury Umanets's avatar
      b=17310 · 7eab5d6b
      Yury Umanets authored
      r=shadow,johann
      
      - make sure that no new inflight rpcs can arrive after ptlrpcd_deactivate_import(), for both
      synchronous and asynchronous sending. To do so, imp_inflight++ is done only when permission is
      granted by ptlrpc_import_delay_req(), which decides whether the request should be sent, deferred,
      or killed because the import is not in a state to send it in the foreseeable future. For async
      sending, an rpc is counted as inflight only when it is added to the sending or delaying list,
      rather than when it is merely added to a set for processing.
      
      This fixes an assert in ptlrpc_invalidate_import() and a number of other issues;
      
      - synchronize imp_inflight with presence on the sending or delaying list for the
      ptlrpc_queue_wait() case, so that if imp_inflight != 0 a hanging rpc can always be found on
      either the sending or the delaying list;
      
      - make sure that in ptlrpc_queue_wait() we remove the rpc from the sending or delaying list and
      decrement inflight only after ptlrpc_unregister_reply() is done. This keeps the accounting
      correct: an rpc can't be returned to the pool or counted as finished until lnet signals that the
      reply unlink has finished;
      
      - check for inflight and rq_list in pinger;
      
      - comments, cleanups;
      7eab5d6b
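The accounting rule described above (count an rpc as inflight only when it lands on the sending or delaying list) can be modelled with a small sketch. All names here (`struct import_sketch`, `queue_request()`, the verdict enum) are hypothetical and stand in for the real ptlrpc structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts, modelling "sent, deferred or killed". */
enum delay_verdict { REQ_SEND, REQ_DELAY, REQ_KILL };

/* Hypothetical model of an import: list lengths instead of real lists. */
struct import_sketch {
        int inflight;
        int on_sending;
        int on_delaying;
};

/* Count the rpc as inflight only when it is placed on a list. */
static void queue_request(struct import_sketch *imp, enum delay_verdict v)
{
        assert(imp != NULL);
        switch (v) {
        case REQ_SEND:
                imp->on_sending++;
                imp->inflight++;
                break;
        case REQ_DELAY:
                imp->on_delaying++;
                imp->inflight++;
                break;
        case REQ_KILL:
                break;          /* never counted, never listed */
        }
}

/* Invariant: inflight != 0 implies an rpc is on one of the lists. */
static int accounting_is_coherent(const struct import_sketch *imp)
{
        return imp->inflight == imp->on_sending + imp->on_delaying;
}
```

Keeping the counter and the lists in step is what lets an invalidation path find every rpc that the counter says is still outstanding.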
  10. Nov 07, 2008
    • Elena Gryaznova's avatar
      b=17477 · 45671edb
      Elena Gryaznova authored
      i=Adilger
      replace cleanup_and_setup_lustre fn by check_and_setup_lustre fn
      45671edb
    • Yury Umanets's avatar
      b=17511 · d91155fe
      Yury Umanets authored
      r=adilger,johann
      
      - removes a possible deadlock by disabling rehash in hash_del() operations and moving hash_add()
      calls out of spin_locks. The hash table has its own mechanisms for protecting its structures, and
      it also has a hash_add_unique() method for use in concurrent contexts;
      
      - fixed a missed lh_put() in hash_add_unique() which led to extra refs in some cases (extra ref to
      export) and an inability to clean up;
      
      - fixed __lustre_hash_set_theta() which set @max theta into ->lh_min_theta;
      
      - in lustre_hash_rehash_size(), also disable rehash when the new and old hash sizes are equal
      in corner cases (max_size or min_size). Before this fix, an expensive rehash could be done
      needlessly even though the size did not actually change;
      
      - disable rehash in hash_add_unique() if no actual add happened because an entry with the same
      key is already in the table;
      
      - some cleanups in hash table code;
      d91155fe
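The corner case for lustre_hash_rehash_size() described above can be sketched as a clamp followed by a comparison: if the clamped new size equals the current size, no rehash is needed. The helpers below are hypothetical and express sizes as a number of bits. They are not the actual lustre_hash functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: next table size in bits, clamped to [min, max]. */
static unsigned int next_bits(unsigned int cur, unsigned int min,
                              unsigned int max, bool grow)
{
        assert(min <= cur && cur <= max);
        if (grow)
                return cur < max ? cur + 1 : max;
        return cur > min ? cur - 1 : min;
}

/* Skip the expensive rehash when clamping leaves the size unchanged. */
static bool rehash_needed(unsigned int cur, unsigned int min,
                          unsigned int max, bool grow)
{
        return next_bits(cur, min, max, grow) != cur;
}
```

A table already at its maximum size that is asked to grow reports that no rehash is needed, as does a table at its minimum that is asked to shrink.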
    • Elena Gryaznova's avatar
      b=17477 · f04bb1d7
      Elena Gryaznova authored
      i=Adilger
      check config if lustre is mounted before acc-sm run
      f04bb1d7
    • Elena Gryaznova's avatar
      b=14384 · ac21c0e8
      Elena Gryaznova authored
      i=Brian
      assert_DIR cleanup
      ac21c0e8
    • Yury Umanets's avatar
      b=17445 · 1dc6122d
      Yury Umanets authored
      r=tappro,johann
      
      - implements proper locking for rq pool freeing
      1dc6122d
    • Johann Lombardi's avatar
      Branch b1_6 · 1b818746
      Johann Lombardi authored
      b=16860
      i=nathan
      i=rread
      
      Description: Excessive recovery window
      Details    : With AT enabled, the recovery window can be excessively long (6000+
      	     seconds). To address this problem, we no longer use
      	     OBD_RECOVERY_FACTOR when extending the recovery window (the connect
      	     timeout no longer depends on the service time, it is set to
      	     INITIAL_CONNECT_TIMEOUT now) and clients report the old service
      	     time via pb_service_time.
      1b818746
    • Bobi Jam's avatar
      Branch b1_6 · 7770cb12
      Bobi Jam authored
      b=16578
      o=adilger
      
      A faster way to get long string.
      7770cb12
  11. Nov 06, 2008
    • Yury Umanets's avatar
      b=17310 · 03fbbb52
      Yury Umanets authored
      - make sure that rpcs in RQ_PHASE_UNREGISTERING phase can be marked expired and interrupted.
      03fbbb52
    • Yury Umanets's avatar
      b=17310 · 8c981415
      Yury Umanets authored
      r=johann,shadow
      
      - fixes ptlrpcd blocking on a very long wait for reply unlink. To do so, a new rpc phase,
      RQ_PHASE_UNREGISTERING, is introduced, in which a request stays until lnet calls
      reply_in_callback() to signal that the reply is unlinked. All requests in this phase are skipped
      by ptlrpcd instead of waiting n * 300s on each of them. This allows ptlrpcd to process other rpcs
      in the set;
      
      - make sure that the inflight count is coherent with presence on the sending or delay list. That
      is, if we see inflight != 0, the rpc must be on one of these lists. This is very helpful in
      ptlrpc_invalidate_import() for showing all rpcs still waiting after invalidating the import;
      
      - in ptlrpc_invalidate_import(), wait for the maximum of rq_deadline - now over all inflight
      rpcs, which may be much longer than obd_timeout, instead of waiting for obd_timeout. If the
      calculated timeout is 0, obd_timeout is used. This fixes the case where rq_deadline - now >
      obd_timeout (very easy to see in logs), which led to the inflight != 0 assert because inflight
      rpcs timed out after our wait period had finished;
      
      - in ptlrpc_invalidate_import(), wait forever for rpcs in UNREGISTERING phase. When the wait
      times out, assert inflight == 0 only if no rpcs are in UNREGISTERING phase. Only those in
      UNREGISTERING phase are allowed to stay longer than obd_timeout;
      
      - added ptlrpc_move_rqphase() function. All phase changes go through it. Add debug_req() there to
      track down all phase changes;
      
      - conf_sanity.sh test_45 added to emulate a very long reply unlink and also the situation when
      rq_deadline - now > obd_timeout;
      
      - do not wait forever in ptlrpc_unregister_reply() for the async case (using it from sets). The
      sync case is left unchanged;
      
      - make sure that ptlrpc_set_next_timeout() yields a 1s timeout (instead of 0s) for a set with
      rpcs in the "unregistering" stage, to prevent ptlrpcd from sleeping forever and hanging in test_45;
      
      - in ptlrpcd() make sure that we do not sleep on 0 timeout.
      8c981415
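The wait-period rule for ptlrpc_invalidate_import() described above (take the largest rq_deadline - now over the inflight rpcs, and fall back to obd_timeout if that is 0) can be sketched as below. The function name, the array-of-deadlines input, and the use of `long` for seconds are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: how long to wait for inflight rpcs to drain.
 * deadlines[] holds each inflight rpc's deadline in seconds. */
static long invalidate_wait(const long *deadlines, size_t n,
                            long now, long obd_timeout)
{
        long timeout = 0;

        assert(deadlines != NULL || n == 0);
        for (size_t i = 0; i < n; i++) {
                long left = deadlines[i] - now;   /* time until this rpc expires */
                if (left > timeout)
                        timeout = left;
        }
        /* Nothing pending or everything already expired: use the default. */
        return timeout > 0 ? timeout : obd_timeout;
}
```

With deadlines of 130, 250 and 90 seconds, `now` at 100 and an obd_timeout of 100, the wait becomes 150 seconds, so the rpc with the latest deadline can still time out within the wait period.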
  12. Nov 05, 2008
    • Andrew Perepechko's avatar
      Branch b1_6 · a5fa4e1d
      Andrew Perepechko authored
      b=17371
      i=Johann Lombardi
      i=Oleg Drokin
      
      fix a race between requeue thread processing and umount
      a5fa4e1d
    • Elena Gryaznova's avatar
      b=16551 · 5bf79c45
      Elena Gryaznova authored
      i=Adilger
      correct remote_[mds|ost] fn to work correctly on configuration
      with several MDS/OSS nodes
      5bf79c45
    • kalpak's avatar
      · b1907268
      kalpak authored
      b=16438
      i=adilger
      i=girish
      
      Mounting a filesystem with the extents feature will fail on big-endian systems, since ext3-based ldiskfs is not supported there. This can be overridden with the "bigendian_extents" mount option.
      b1907268
    • Jinshan Xiong's avatar
      · ee927013
      Jinshan Xiong authored
      b=15715
      r=adilger,green
      
      Fixed the race between destroying and enqueuing an ldlm lock on the OST side.
      ee927013