- 29 Oct, 2020 1 commit
-
-
Oleg Drokin authored
When we are trying to grant a lock and hit an AST error, rerunning the policy is pointless: it cannot grant a potentially now-eligible lock, and our lock is already in all the queues. Instead, do what all the other handlers do for ERESTART: return and run a full resource reprocess. Change-Id: I3edb37bf084b2e26ba03cf2079d3358779c84b6e Signed-off-by:
Oleg Drokin <green@whamcloud.com> Reviewed-on: https://review.whamcloud.com/39598 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Yingjin Qian <qian@ddn.com>
-
- 19 Sep, 2020 3 commits
-
-
Vitaly Fertman authored
Allow pool recalc to be called forcefully, independently of the last recalc time. Call pool recalc forcefully on lock decref instead of on LRU cancel, to take into account the fresh SLV obtained from the server. Call LRU recalc from after_reply if a significant SLV change occurs. Add a sysfs attribute to control what counts as 'a significant SLV change'. Signed-off-by:
Vitaly Fertman <c17818@cray.com> Change-Id: Iffeb8d73effdfc494f412422f285921aa4eb9811 HPE-bug-id: LUS-8678 Reviewed-on: https://es-gerrit.dev.cray.com/157134 Reviewed-by:
Andriy Skulysh <c17819@cray.com> Tested-by:
Jenkins Build User <nssreleng@cray.com> Reviewed-by:
Alexey Lyashkov <c17817@cray.com> Reviewed-on: https://review.whamcloud.com/39564 Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Gu Zheng <gzheng@ddn.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Jinshan Xiong authored
We register ELC for extent locks to be canceled at enqueue time, but it has no positive effect on locks that have dirty pages under them. To keep the semantics of lru_size, the client should check how many unused locks are cached after adding a lock to the LRU list. If the hard limit (ns_max_unused) has already been exceeded, the client initiates the async lock cancellation process in batch mode (ns->ns_cancel_batch); see the sketch after this entry. To do this, re-use the new batching LRU cancel functionality: wherever unlimited LRU cancel is called (not ELC), try to cancel in batched mode. A new sysfs attribute named *lru_cancel_batch* is introduced into the ldlm namespace to control the batch count. Signed-off-by:
Jinshan Xiong <jinshan.xiong@intel.com> Signed-off-by:
Shuichi Ihara <sihara@ddn.com> Signed-off-by:
Gu Zheng <gzheng@ddn.com> Signed-off-by:
Vitaly Fertman <c17818@cray.com> Change-Id: Ib18b829372da8599ba872b5ac5ab7421661f942d Reviewed-on: https://es-gerrit.dev.cray.com/157068 Reviewed-by:
Andriy Skulysh <c17819@cray.com> Reviewed-by:
Alexey Lyashkov <c17817@cray.com> Tested-by:
Alexander Lezhoev <c17454@cray.com> Reviewed-on: https://review.whamcloud.com/39562 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
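A minimal sketch of the LRU check described in this entry, assuming a simplified namespace structure. The fields ns_nr_unused, ns_max_unused and ns_cancel_batch follow the commit message; the helper cancel_lru_batch_async() is hypothetical and stands in for the batching LRU cancel path, not the actual ldlm code.

```c
/* Sketch only, not the real ldlm code: after an unused lock is added to
 * the namespace LRU, start an asynchronous batched cancel once the hard
 * limit is exceeded.  cancel_lru_batch_async() is a hypothetical helper. */
static void example_lru_add_check(struct ldlm_namespace *ns)
{
	if (ns->ns_nr_unused > ns->ns_max_unused)
		cancel_lru_batch_async(ns, ns->ns_cancel_batch);
}
```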
-
Alexey Lyashkov authored
A partially initialized lock (one whose l_ast_data was never set) caused a failure with the blocking AST: the discard from the page cache was skipped, so stale data was read later via fast read. A slow read has a chance to attach this lock to the right IO, but that is not always true, so fast read should be disabled for such locks until l_ast_data is always set for DoM and Lock Ahead locks. HPE-bugid: LUS-8750 Change-Id: I2c5180c8044a12d7bd8f5f1c871447ca8b47a8ff Signed-off-by:
Alexey Lyashkov <c17817@cray.com> Reviewed-on: https://review.whamcloud.com/39318 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Vitaly Fertman <vitaly.fertman@hpe.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 27 May, 2020 1 commit
-
-
Bruno Faccini authored
Running with USE_LU_REF configured ("configure --enable-lu_ref") triggers an LBUG (because the "ref->lf_failed > 0" condition is false) due to using "current" as the lu_ref source, while in some cases lu_ref_del() occurs within a different task context. To avoid this, this patch changes the lu_ref source to the ldlm_lock address. Signed-off-by:
Bruno Faccini <bruno.faccini@intel.com> Change-Id: Ia35e31c1a722c03f97672025e2abff40486b3f76 Reviewed-on: https://review.whamcloud.com/37624 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com>
-
- 23 Apr, 2020 1 commit
-
-
James Simmons authored
In the latest kernels time_t has been removed: it had become a 64-bit value just like time64_t, so it is no longer needed. To avoid confusion between timestamps and timeout values, Lustre has a timeout_t typedef which is in seconds and is an s32, since timeouts are generally short; a sketch follows this entry. This helps avoid errors that have happened in the past with certain math operations between timeouts and timestamps that led to overflow and underflow. Change-Id: I4524456d514561e145201079a420ff89fa829602 Signed-off-by:
James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/31576 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com>
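A minimal sketch of the timestamp/timeout split described above. The timeout_t typedef mirrors the commit message; the helper seconds_until() is illustrative and not part of the patch.

```c
#include <linux/types.h>
#include <linux/ktime.h>

typedef s32 timeout_t;	/* interval in seconds, per the commit message */

/* Illustrative helper: derive a short timeout from time64_t timestamps,
 * clamping so that a deadline in the past cannot underflow. */
static timeout_t seconds_until(time64_t deadline)
{
	time64_t now = ktime_get_real_seconds();

	return deadline > now ? (timeout_t)(deadline - now) : 0;
}
```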
-
- 15 Apr, 2020 1 commit
-
-
Oleg Drokin authored
This makes the recurrence of LU-11568 back in force; I just did not realize it due to some mail filters. This reverts commit 0584eb73. Change-Id: I9efe670ab9b9b9f2ea81582fe67feaac668e54d5 Reviewed-on: https://review.whamcloud.com/38238 Reviewed-by:
Oleg Drokin <green@whamcloud.com> Tested-by:
Oleg Drokin <green@whamcloud.com>
-
- 14 Apr, 2020 1 commit
-
-
NeilBrown authored
This spinlock (l_lock) is only used to stabilise the l_resource pointer while taking a spinlock on the resource. This is not necessary - it is sufficient to take the resource spinlock and then check whether l_resource has changed or not. If it hasn't, then it cannot change until the resource spinlock is dropped. We must ensure this is safe even if the resource is freed before lock_res_and_lock() managed to get the lock. To do this we mark the slab as SLAB_TYPESAFE_BY_RCU and initialise the lock in an init_once() function, but not on every allocation (and specifically don't zero the whole struct on each allocation); a sketch of this pattern follows this entry. This means that if we find a resource after taking the RCU read lock, then it is always safe to take and then drop the spinlock. After taking the spinlock, we can check if it is more generally safe to use. Discarding l_lock shrinks 'struct ldlm_lock', which helps save memory. Change-Id: I2646f198ca60bdbd2e94922bf7679fab31f45c41 Signed-off-by:
NeilBrown <neilb@suse.com> Reviewed-on: https://review.whamcloud.com/35483 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Neil Brown <neilb@suse.de> Reviewed-by:
Petros Koutoupis <petros.koutoupis@hpe.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
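A minimal sketch of the SLAB_TYPESAFE_BY_RCU pattern described above, using a simplified structure instead of struct ldlm_lock; all names are illustrative.

```c
#include <linux/slab.h>
#include <linux/spinlock.h>

struct example_lock {
	spinlock_t	el_lock;	/* stays valid across RCU-safe reuse */
	/* ... fields (re)initialised on every allocation ... */
};

/* Runs once per slab object, not on every allocation, so the spinlock
 * remains initialised even if the object is freed and reused. */
static void example_lock_init_once(void *data)
{
	struct example_lock *lk = data;

	spin_lock_init(&lk->el_lock);
}

static struct kmem_cache *example_lock_cache;

static int example_cache_setup(void)
{
	example_lock_cache = kmem_cache_create("example_lock",
					       sizeof(struct example_lock), 0,
					       SLAB_TYPESAFE_BY_RCU,
					       example_lock_init_once);
	return example_lock_cache ? 0 : -ENOMEM;
}
```

Because the slab is type-safe under RCU, a reader that finds an object inside rcu_read_lock() may take and drop el_lock safely, and only then validate whether the object is still the one it expected - the same check the commit performs on l_resource.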
-
- 11 Mar, 2020 1 commit
-
-
NeilBrown authored
Just use current->pid and current->comm directly, instead of having wrappers. Linux-commit: 63fd7d04580b6345ff1e0aab906c034f973d493e Test-Parameters: trivial Change-Id: I278f32d6dd8c370a7ab211c5147ee8d246ea1893 Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-on: https://review.whamcloud.com/37776 Reviewed-by:
Neil Brown <neilb@suse.de> Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Yang Sheng <ys@whamcloud.com>
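A minimal sketch of the direct usage, assuming a generic pr_debug() call rather than Lustre's own debug macros.

```c
#include <linux/sched.h>
#include <linux/printk.h>

static void log_current_task(void)
{
	/* no wrapper needed: the task struct already carries both fields */
	pr_debug("handled by %s (pid %d)\n", current->comm, current->pid);
}
```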
-
- 05 Mar, 2020 1 commit
-
-
Mr NeilBrown authored
lock->l_resource can (sometimes) change when the resource isn't locked. So dereferencing lock->l_resource and then locking the resource looks wrong. As lock_res_and_lock() returns the locked resource, this code can easily be more obviously correct by using that return value. Change-Id: Iced0bf1af4fa8ddedffa817e00f1c6a02b035d76 Signed-off-by:
Mr NeilBrown <neilb@suse.de> Reviewed-on: https://review.whamcloud.com/35484 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
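A minimal sketch of the pattern described above, under the assumption (stated in the entry) that lock_res_and_lock() returns the locked resource; the surrounding function is illustrative, not the actual patch.

```c
/* Sketch only.  Instead of the racy sequence
 *	res = lock->l_resource;	// may change before the spinlock is taken
 *	lock_res(res);
 * use the resource handed back once it is stable: */
static void example_use_resource(struct ldlm_lock *lock)
{
	struct ldlm_resource *res = lock_res_and_lock(lock);

	/* ... operate on res while it is locked ... */
	unlock_res_and_lock(lock);
}
```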
-
- 01 Mar, 2020 1 commit
-
-
Wang Shilong authored
ldlm_glimpse_ast() is registered for a server lock, which means that when a client sends a glimpse request, the server just returns a special error for this lock. It is possible that the local object has its size expanded under this PW lock, so we should try to update the LVB upon this error. Originally, ldlm_cb_interpret() had code to handle this error, but it only tried to handle a race between clients; it did not cover server-lock cases, especially after lockless DIO was turned on. Fixes: 6bce5367 ("LU-4198 clio: turn on lockless for some kind of IO") Change-Id: Ic84fd19d9eaf7f8245b8f7a2165ee5913849ac01 Signed-off-by:
Wang Shilong <wshilong@ddn.com> Reviewed-on: https://review.whamcloud.com/37611 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Bobi Jam <bobijam@hotmail.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 25 Feb, 2020 1 commit
-
-
Bruno Faccini authored
Since osc_lock_upcall() uses the per-CPU env via cl_env_percpu_[get,put](), all underlying work must execute on the same CPU, meaning that no sleeping/scheduling must occur. This implies that all lu_ref related work must no longer use lu_ref_add(), which calls might_sleep() (likely to cause a scheduling/CPU switch), but lu_ref_add_atomic() instead. Signed-off-by:
Bruno Faccini <bruno.faccini@intel.com> Change-Id: Ide33d4c415e9e382f0bc344e2114182a1f122de6 Reviewed-on: https://review.whamcloud.com/37629 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Alexandr Boyko <c17825@cray.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 08 Feb, 2020 1 commit
-
-
Mr NeilBrown authored
When declaring a local list head, instead of struct list_head list; INIT_LIST_HEAD(&list); use LIST_HEAD(list); which does both steps. Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: I67bda77c04479e9b2b8c84f02bfb86d9c2ef5671 Reviewed-on: https://review.whamcloud.com/36955 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Arshad Hussain <arshad.super@gmail.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
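A minimal illustration of the change; the variable name is arbitrary.

```c
#include <linux/list.h>

static void before(void)
{
	struct list_head cancels;

	INIT_LIST_HEAD(&cancels);
	/* ... */
}

static void after(void)
{
	LIST_HEAD(cancels);	/* declaration and initialisation in one step */
	/* ... */
}
```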
-
- 28 Jan, 2020 1 commit
-
-
NeilBrown authored
hlist_head/hlist_node is the preferred data structure for hash tables. Not only does it make the 'head' smaller, but it also provides hlist_unhashed(), which can be used to check whether an object is in the list. This means that we don't need h_in any more. Change-Id: I18e2799a6e719b96ed47747375e4e20675d9b7cc Signed-off-by:
NeilBrown <neilb@suse.com> Reviewed-on: https://review.whamcloud.com/35862 Reviewed-by:
Neil Brown <neilb@suse.de> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Yang Sheng <ys@whamcloud.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
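A minimal sketch of the hlist usage described above, with a simplified handle structure rather than the real lustre_handle.

```c
#include <linux/types.h>
#include <linux/hashtable.h>

struct example_handle {
	struct hlist_node	h_link;
	u64			h_cookie;
};

static DEFINE_HASHTABLE(example_hash, 8);	/* buckets are hlist_head */

static void example_handle_add(struct example_handle *h)
{
	hash_add(example_hash, &h->h_link, h->h_cookie);
}

static bool example_handle_is_hashed(struct example_handle *h)
{
	/* replaces an explicit "am I in the table?" flag such as h_in */
	return !hlist_unhashed(&h->h_link);
}
```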
-
- 10 Jan, 2020 2 commits
-
-
NeilBrown authored
Now that portals_handle_ops contains only a char*, it is functioning primarily to identify the owner of each handle. So change the name to h_owner, and the type to const char*. Note: this h_owner is now quite different from the similar h_owner in the server code. When the server code is merged, the "med" pointer should be stored in the "mfd" and validated separately. Change-Id: Ie2e9134ea22c4929683c84bf45c41b96b348d0a2 Signed-off-by:
NeilBrown <neilb@suse.com> Reviewed-on: https://review.whamcloud.com/35798 Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Neil Brown <neilb@suse.de> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mr NeilBrown authored
If l_wait_event() is passed an lwi initialised with one of LWI_TIMEOUT_INTR( time, NULL, NULL, NULL) LWI_TIMEOUT_INTR( time, NULL, LWI_ON_SIGNAL_NOOP, NULL) LWI_TIMEOUT( time, NULL, NULL) where time != 0, then it behaves much like wait_event_idle_timeout(). All signals are blocked, and it waits either for the condition to be true, or for the timeout (in jiffies). Note that LWI_ON_SIGNAL_NOOP has no effect here. l_wait_event() returns 0 when the condition is true, or -ETIMEDOUT when the timeout occurs. wait_event_idle_timeout() instead returns a positive number when the condition is true, and 0 when the timeout occurs. So in the cases where return value is used, handling needs to be adjusted accordingly. Note that in some cases where cfs_fail_val gives the time to wait for, the current code re-tests the wait time against zero as cfs_fail_val can change asynchronously. This is because l_wait_event() behaves quite differently if th...
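A minimal sketch of the return-value adjustment described above; the wait queue, condition, and timeout are placeholders.

```c
#include <linux/wait.h>
#include <linux/errno.h>

static int wait_for_flag(wait_queue_head_t *waitq, bool *flag, long timeout)
{
	/* l_wait_event() with LWI_TIMEOUT returned 0 on success and
	 * -ETIMEDOUT on timeout; wait_event_idle_timeout() returns the
	 * remaining jiffies (> 0) on success and 0 on timeout, so callers
	 * that used the return value need this mapping: */
	if (wait_event_idle_timeout(*waitq, *flag, timeout) == 0)
		return -ETIMEDOUT;
	return 0;
}
```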
-
- 20 Dec, 2019 1 commit
-
-
NeilBrown authored
OBD_FREE_RCU and the hop_free call-back together form an overly complex mechanism equivalent to kfree_rcu() or call_rcu(...). Discard them and use the simpler approach. This removes the only use for the field h_size, so discard that too. Change-Id: I3b4135565dab6a9aa5034f42ae3f9b66851cae31 Signed-off-by:
NeilBrown <neilb@suse.com> Reviewed-on: https://review.whamcloud.com/35797 Reviewed-by:
Neil Brown <neilb@suse.de> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Reviewed-by:
Petros Koutoupis <pkoutoupis@cray.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com>
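A minimal sketch of the simpler approach, assuming an object that embeds a struct rcu_head; this is illustrative, not the portals_handle code itself.

```c
#include <linux/slab.h>
#include <linux/rcupdate.h>

struct example_obj {
	struct rcu_head	eo_rcu;
	/* ... payload ... */
};

static void example_obj_free(struct example_obj *obj)
{
	/* freed after a grace period; no per-type free callback or
	 * recorded size (h_size) is needed */
	kfree_rcu(obj, eo_rcu);
}
```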
-
- 14 Dec, 2019 2 commits
-
-
Alexey Lyashkov authored
Regression introduced by "LU-580: update mgc llog process code": it takes an additional cld reference for the lock, but the lock cancel is forgotten during normal shutdown, so this lock holds the cld on the list for a long time and any config modification needs to cancel each lock separately. Cray-bugid: LUS-6253 Fixes: 5538eee2 ("LU-580: update mgc llog process code") Signed-off-by:
Alexey Lyashkov <c17817@cray.com> Change-Id: Ic83e42666bf788739a2f81ab0c66632daa329290 Reviewed-on: https://review.whamcloud.com/32890 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Alexandr Boyko <c17825@cray.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Andriy Skulysh authored
The original request can be processed after the resent request, so it can create a lock on the MDT without a client lock, or unlock another lock. Make flock enqueue use the modify RPC slot. Change-Id: Icfee202fe2e389beda1116f78f8b933c7ea182fb Cray-bug-id: LUS-5739 Signed-off-by:
Andriy Skulysh <c17819@cray.com> Signed-off-by:
Vitaly Fertman <c17818@cray.com> Reviewed-by:
Alexander Boyko <c17825@cray.com> Reviewed-by:
Andrew Perepechko <c17827@cray.com> Reviewed-on: https://review.whamcloud.com/36340 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Alexandr Boyko <c17825@cray.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 06 Dec, 2019 4 commits
-
-
NeilBrown authored
Most objects with a lustre_handle have a refcount. The exception is mdt_mfd, which uses locking (med_open_lock) to manage its lifetime. The lustre_handles code currently needs a call-out to increment its refcount. To simplify things, move the refcount into the lustre_handle (which will be largely ignored by mdt_mfd) and discard the call-out. To avoid warnings when refcount debugging is enabled, the refcount of mdt_mfd is initialized to 1 and decremented after any class_handle2object() call which would have incremented it. In order to preserve the same debug messages, we store an object type name in the portals_handle_ops, and use that in a CDEBUG() when incrementing the ref count. Change-Id: I1920330b2aeffd4b865cb9b249997aa28b209c33 Signed-off-by:
NeilBrown <neilb@suse.com> Reviewed-on: https://review.whamcloud.com/35794 Reviewed-by:
Neil Brown <neilb@suse.de> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-...
-
Mr NeilBrown authored
Rather than struct list_head foo = LIST_HEAD_INIT(foo); use LIST_HEAD(foo); This is shorter and more in-keeping with upstream style. Test-Parameters: trivial Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: I36aa8c7e0763f3dfc88fe482cd28935184c1effa Reviewed-on: https://review.whamcloud.com/36669 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Ben Evans <bevans@cray.com> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mr NeilBrown authored
Having: return; } at the end of a void function is unnecessary noise. Where it is the *only* statement in the function, it can be useful, so those remain unchanged. The rest have been removed. Test-Parameters: trivial Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: If02f6f5b91d4134cf95a68ebccc83df28c360fb2 Reviewed-on: https://review.whamcloud.com/36654 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Petros Koutoupis <pkoutoupis@cray.com> Reviewed-by:
Ben Evans <bevans@cray.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mr NeilBrown authored
When l_wait_event() is passed an 'lwi' which is initialised to all zeroes, it behaves exactly like wait_event_idle(): - no timeout - not interrupted by any signal - doesn't add to load average. So change all these instances to wait_event_idle(), or in two cases, to wait_event_idle_exclusive(). There are three ways that lwi gets set to all zeros: struct l_wait_info lwi = { 0 }; lwi = LWI_INTR(NULL, NULL); memset(&lwi, 0, sizeof(lwi)); Change-Id: Ia6723cbe248ce067331a002e5e9d54796739c08a Signed-off-by:
Mr NeilBrown <neilb@suse.de> Reviewed-on: https://review.whamcloud.com/35971 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Petros Koutoupis <pkoutoupis@cray.com> Reviewed-by:
Yang Sheng <ys@whamcloud.com> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
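A minimal sketch of one such substitution; the wait queue and condition are placeholders.

```c
#include <linux/wait.h>
#include <linux/atomic.h>

static void drain_requests(wait_queue_head_t *waitq, atomic_t *nreqs)
{
	/* previously:
	 *	struct l_wait_info lwi = { 0 };
	 *	l_wait_event(*waitq, atomic_read(nreqs) == 0, &lwi);
	 * no timeout, uninterruptible, not counted in the load average: */
	wait_event_idle(*waitq, atomic_read(nreqs) == 0);
}
```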
-
- 04 Oct, 2019 1 commit
-
-
Lai Siyao authored
Layout_change is a reint operation and should be handled the same as other reint operations, so that resend and replay can work correctly. Also replace the lock passed into ldlm_handle_enqueue0() with the lock taken in mdt_layout_change(). This avoids taking the lock again in ldlm_handle_enqueue0() and also makes replay easier. Note that before replacing, the mode is downgraded from EX to CR, because the client only needs that mode; this avoids an unnecessary lock cancel later. Add the missing resent reconstructor for REINT_RESYNC. Signed-off-by:
Lai Siyao <lai.siyao@whamcloud.com> Change-Id: I328044dacbf18d03232c9bbb51271f6202e9b939 Reviewed-on: https://review.whamcloud.com/35465 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 20 Sep, 2019 2 commits
-
-
Mr NeilBrown authored
Each of the functions changed here will have code changes in the next patch, so fix up all the indentation first. Test-Parameters: trivial Signed-off-by:
Mr NeilBrown <neilb@suse.com> Change-Id: Ib10e999a8c58eb96d3312878be91b465da3a2df8 Reviewed-on: https://review.whamcloud.com/35970 Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Arshad Hussain <arshad.super@gmail.com> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Patrick Farrell authored
When there are lock-ahead write locks on a file, the server sends one glimpse AST RPC to each client having such (it may have many) locks. This callback is sent to the lock having the highest offset. Client's glimpse callback goes up to the clio layers and gets the global (not lock-specific) view of size. The clio layers are connected to the extent lock through the l_ast_data (which points to the OSC object). Speculative locks (AGL, lockahead) do not have l_ast_data initialised until an IO happens under the lock. Thus, some speculative locks may not have l_ast_data initialized. It is possible for the client to do a write using one lock (changing file size), but for the glimpse AST to be sent to another lock without l_ast_data initialized. Currently, a lock with no l_ast_data set returns ELDLM_NO_LOCK_DATA to the server. In this case, this means we do not return the updated size. The solution is to search the granted lock tree for any lock with initialized l_ast_data (it points to the OSC object which is the same for all the extent locks) and to reach the clio layers for the size through this lock instead. cray-bug-id: LUS-6747 Signed-off-by:
Patrick Farrell <pfarrell@whamcloud.com> Change-Id: I6c60f4133154a3d6652315f155af24bbc5752dd2 Reviewed-on: https://review.whamcloud.com/33660 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Bobi Jam <bobijam@hotmail.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 27 Aug, 2019 1 commit
-
-
Andriy Skulysh authored
setxattr takes the inode lock and sends a reint to the MDS; truncate takes the MDS_INODELOCK_DOM lock and wants to acquire the inode lock. The MDS locks are for different bits (MDS_INODELOCK_UPDATE|MDS_INODELOCK_XATTR vs MDS_INODELOCK_DOM), but they block each other if some blocking lock was present earlier. If an IBITS waiting lock has no conflicts with any lock in the granted queue or any lock ahead of it in the waiting queue, then it can be granted. Use separate waiting lists for each ibit to eliminate full lr_waiting list scans. Cray-bug-id: LUS-6970 Change-Id: I95b2ed0b1a0063b7ece5277a5ee06e2511d44e5f Signed-off-by:
Andriy Skulysh <c17819@cray.com> Reviewed-on: https://review.whamcloud.com/35057 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Patrick Farrell <pfarrell@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 15 Aug, 2019 1 commit
-
-
NeilBrown authored
lustre_handles assigns a 64-bit unique identifier (a 'cookie') to objects of various types and stores them in a hash table, allowing them to be accessed by the cookie. There is a facility for type checking by recording an 'owner' for each object and checking the owner on lookup. Unfortunately this is not used - owner is always zero for the client. Each object also contains an h_ops pointer which can be used to reliably identify an owner. So discard h_owner, pass an 'ops' pointer to class_handle2object(), and only return objects for which the h_ops matches. Note: this h_owner is now quite different from the similar h_owner in the server code. When the server code is merged the "med" pointer should be stored in the "mfd" and validated separately. This reduces the size of the portals_handle by one pointer, which benefits various other structures including struct ldlm_lock, which can be very populous and so is best kept small. Change-Id: I9cf2b32f8b...
-
- 03 Jul, 2019 1 commit
-
-
NeilBrown authored
'extern' declarations should only appear in .h files. All these names are declared in .h files as needed, and these duplicate declarations in .c files can be removed. Test-Parameters: trivial Change-Id: Ic563789f350fd21fd033f1d3c49cdac2125b86c5 Signed-off-by:
NeilBrown <neilb@suse.com> Reviewed-on: https://review.whamcloud.com/35294 Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by: Jenkins Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Arshad Hussain <arshad.super@gmail.com> Reviewed-by:
Petros Koutoupis <pkoutoupis@cray.com>
-
- 25 Jun, 2019 1 commit
-
-
NeilBrown authored
Since 2.6.36, Linux' vsprintf has supported %pV which supports "recursive sprintf" - exactly the task that libcfs_debug_vmsg2 aims to provide. Instead of calling libcfs_debug_vmsg2(), we can put the fmt and args in a 'struct va_format', and pass the address of that structure to the "%pV" format. So do this to remove all users of libcfs_debug_vmsg2(). Linux-commit: 0fe922e1eca8e2850f0e6c535a14ba7414ca73c2 Change-Id: I6952ca8fdb619423639734aab1a30f4635b089cc Signed-off-by:
NeilBrown <neilb@suse.com> Reviewed-on: https://review.whamcloud.com/35224 Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Tested-by: Jenkins Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Chris Horn <hornc@cray.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
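A minimal sketch of the "%pV" recursion, assuming a plain printk-style wrapper rather than the CDEBUG machinery.

```c
#include <linux/printk.h>
#include <stdarg.h>

static __printf(1, 2) void example_log(const char *fmt, ...)
{
	struct va_format vaf;
	va_list args;

	va_start(args, fmt);
	vaf.fmt = fmt;
	vaf.va = &args;
	pr_debug("example: %pV", &vaf);	/* inner fmt/args expand in place */
	va_end(args);
}
```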
-
- 01 Jun, 2019 1 commit
-
-
Alex Zhuravlev authored
To be able to look up the env by the current thread where it's too complicated to pass the env by argument. This version has stats to see slow/fast lookups: in sanity-benchmark there were 172850 fast lookups (from the per-CPU cache) and 27228 slow lookups (from the rhashtable). We are going to watch the ratio in autotest reports. Fixes: 2339e1b3 ("LU-11483 ldlm ofd_lvbo_init() and mdt_lvbo_fill() create env") Fixes: e02cb407 ("LU-11164 ldlm: pass env to lvbo methods") Change-Id: Ia760e10fa5c68e7a18284e4726d215b330fc0eed Signed-off-by:
Alex Zhuravlev <bzzz@whamcloud.com> Reviewed-on: https://review.whamcloud.com/34566 Tested-by: Jenkins Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andrew Perepechko <c17827@cray.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com>
-
- 21 Mar, 2019 2 commits
-
-
Andrew Perepechko authored
This patch removes ll_md_blocking_ast() processing for locks that were not granted. The reason is that ll_invalidate_negative_children() can slow down I/O significantly, for no benefit, if there are thousands or millions of files in the directory cache. Change-Id: Ic69c5f02f71c14db4b9609677d102dd2993f4feb Seagate-bug-id: MRP-3409 Signed-off-by:
Andrew Perepechko <c17827@cray.com> Reviewed-on: https://review.whamcloud.com/19665 Tested-by: Jenkins Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Lai Siyao <lai.siyao@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mikhail Pershin authored
DOM lock matching tries to ignore locks with the LDLM_FL_KMS_IGNORE flag during ldlm_lock_match(), but checks that only after the ldlm_lock_match() call. Therefore, if there is any lock with that flag in the queue, all other locks after it are ignored and a new lock is created, causing a large number of locks on a single resource in some access patterns. The patch extends the lock_matches() function to check flags to exclude and adds ldlm_lock_match_with_skip() to use that when needed. A corresponding test was added in sanity-dom.sh. Test-Parameters: testlist=sanity-dom Signed-off-by:
Mikhail Pershin <mpershin@whamcloud.com> Change-Id: Ic45ca10f0e603e79a3a00e4fde13a5fae15ea5fc Reviewed-on: https://review.whamcloud.com/34261 Tested-by: Jenkins Reviewed-by:
Patrick Farrell <pfarrell@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Lai Siyao <lai.siyao@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 04 Jan, 2019 1 commit
-
-
Patrick Farrell authored
The search_itree and search_queue functions should both return either a pointer to a found lock or NULL. Currently, search_itree just returns the contents of data->lmd_lock, whether or not a lock was found, and search_queue will do the same under certain circumstances. Zero lmd_lock in both search_* functions, and also stop searching in search_itree once a lock is found. cray-bug-id: LUS-6783 Signed-off-by:
Patrick Farrell <paf@cray.com> Change-Id: Ie231166756e60c228370f8f1a019ccfe14dfda6a Reviewed-on: https://review.whamcloud.com/33754 Tested-by: Jenkins Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <uja.ornl@yahoo.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 17 Nov, 2018 1 commit
-
-
Mikhail Pershin authored
The previous commit 954cc675 skips the bl_ast for a local lock, but there are cases on the MDT when a local lock can become a client lock, see mdt_intent_lock_replace(). In that case the client should be notified if the lock is a blocking lock. This patch reverts commit 954cc675 and provides an alternative solution: during the downgrade to COS, the lock renews its own blocking AST state and starts reprocessing. Any new lock conflict will cause a new blocking AST and the related async commit as needed. Test-Parameters: mdssizegb=20 testlist=racer,racer,racer Signed-off-by:
Mikhail Pershin <mpershin@whamcloud.com> Change-Id: I41adab5c805a59fdbeade8ae3556556b779dc3c0 Reviewed-on: https://review.whamcloud.com/33458 Reviewed-by:
Vitaly Fertman <c17818@cray.com> Tested-by: Jenkins Tested-by:
Maloo <hpdd-maloo@intel.com> Reviewed-by:
Lai Siyao <lai.siyao@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 23 Oct, 2018 1 commit
-
-
Yang Sheng authored
Use INIT_LIST_HEAD_RCU to keep the compiler from optimizing the list-head initialisation too aggressively in some cases; a sketch follows this entry. Signed-off-by:
Yang Sheng <ys@whamcloud.com> Change-Id: I66b340ac3147d2cb911a2b7d3e210c6847047dac Reviewed-on: https://review.whamcloud.com/33317 Tested-by: Jenkins Tested-by:
Maloo <hpdd-maloo@intel.com> Reviewed-by:
James Simmons <uja.ornl@yahoo.com> Tested-by:
James Simmons <uja.ornl@yahoo.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
John L. Hammond <jhammond@whamcloud.com>
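A minimal sketch of the initializer, with an illustrative structure rather than the real ldlm one.

```c
#include <linux/rculist.h>

struct example_res {
	struct list_head	er_granted;
};

static void example_res_reset(struct example_res *res)
{
	/* WRITE_ONCE() inside INIT_LIST_HEAD_RCU() keeps the compiler from
	 * tearing or reordering the pointer stores that concurrent lockless
	 * readers may observe. */
	INIT_LIST_HEAD_RCU(&res->er_granted);
}
```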
-
- 10 Oct, 2018 1 commit
-
-
Mikhail Pershin authored
Update l_blocking_lock under lock to prevent a race between the lock_handle_convert0() and ldlm_work_bl_ast() code. Signed-off-by:
Mikhail Pershin <mpershin@whamcloud.com> Change-Id: I881a1daf6f3b09677abcd6a85f6891d409926cc8 Reviewed-on: https://review.whamcloud.com/33124 Tested-by: Jenkins Tested-by:
Maloo <hpdd-maloo@intel.com> Reviewed-by:
Lai Siyao <lai.siyao@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 05 Oct, 2018 1 commit
-
-
Alex Zhuravlev authored
To save on env allocation. Benchmarks made by Shuichi Ihara demonstrated a 13% improvement for small I/Os: 564k vs 639k IOPS; the details can be found in LU-11164. Change-Id: I797e3d7e19ef408993004a2b872842d655240525 Signed-off-by:
Alex Zhuravlev <bzzz@whamcloud.com> Reviewed-on: https://review.whamcloud.com/32832 Tested-by: Jenkins Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Patrick Farrell <paf@cray.com> Tested-by:
Maloo <hpdd-maloo@intel.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 18 Aug, 2018 1 commit
-
-
John L. Hammond authored
In enum ldlm_intent_flags, remove the obsolete constants IT_UNLINK, IT_TRUNC, IT_EXEC, IT_PIN, IT_SETXATTR. Remove any handling code for these opcodes. Signed-off-by:
John L. Hammond <john.hammond@intel.com> Change-Id: I66f20e4c881cb77a481805a148a33f1c2daa5f0c Reviewed-on: https://review.whamcloud.com/32361 Reviewed-by:
Fan Yong <fan.yong@intel.com> Tested-by: Jenkins Tested-by:
Maloo <hpdd-maloo@intel.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 29 May, 2018 1 commit
-
-
Alexander Boyko authored
When a race happens between ldlm_server_blocking_ast() and ldlm_request_cancel(), at_measured() is called with a wrong value equal to the current time. Even worse, ldlm_bl_timeout() can return current_time*1.5. Before the time functions were fixed for 64 bit by LU-9019 (e920be68), this race led to ETIMEDOUT at ptlrpc_import_delay_req() and client eviction during bl AST sending. The wrong type conversion takes place at ptlrpc_send_limit_expired() in cfs_time_seconds(). We should not take cancels into account if the BLAST was not sent, just because last_activity is not properly initialised; it destroys the AT completely. The patch divides l_last_activity into the client-side l_activity and the server-side l_blast_sent for better understanding. l_blast_sent is used for the blocking AST only, to measure the time between the BLAST and the cancel request. For example: server cancels blocked lock after 1518731697s waiting_locks_callback()) ### lock callback timer expired after 0s: evicting client Signed-off-by:
Alexander Boyko <c17825@cray.com> Change-Id: I44962d2b3675b77e09182bbe062bdd78d6cb0af5 Cray-bug-id: LUS-5736 Reviewed-on: https://review.whamcloud.com/32133 Tested-by: Jenkins Reviewed-by:
Andreas Dilger <andreas.dilger@intel.com> Tested-by:
Maloo <hpdd-maloo@intel.com> Reviewed-by:
Vitaly Fertman <c17818@cray.com> Reviewed-by:
James Simmons <uja.ornl@yahoo.com> Reviewed-by:
Mike Pershin <mike.pershin@intel.com> Reviewed-by:
Oleg Drokin <oleg.drokin@intel.com>
-