LU-9749 llite: Reduce overhead for ll_do_fast_read

In ll_do_fast_read, looking up a cl_env adds some overhead, and
can also cause spinlock contention on older kernels. Fast read
can safely use the preallocated percpu cl_env, so do that to
reduce overhead.

SLES numbers are on a recent Xeon; CentOS numbers are on VMs on
older hardware. SLES has queued spinlocks and scales perfectly
with multiple threads, with or without this patch. CentOS scales
poorly at small I/O sizes without this patch.

SLES is SLES12SP2, CentOS is CentOS 7.3.

SLES:
1 thread
            8b      1K      1M
Without:    23      2200    6800
With:       27.5    2500    7200

4 threads
            8b      1K      1M
Without:    90      8700    27000
With:       108     10000   28000

Earlier kernel (CentOS 7.3):
1 thread
            8b      1K      1M
Without:    9       1000    5100
With:       12      1300    5800

4 threads
            8b      1K      1M
Without:    22      2400    17000
With:       48      4900    20000

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ice5d653ace5ce76bc8911501a9b15c11b7a3234a
Reviewed-on: https://review.whamcloud.com/27970
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>