- 07 Aug, 2020 1 commit
-
-
Lai Siyao authored
Pack parent FID in getattr request if OBD_CONNECT2_GETATTR_PFID is enabled, otherwise fill it with target FID for backward compatibility. Fixes: f9a2da63 ("LU-13437 mdt: don't fetch LOOKUP lock for remot...") Test-Parameters: clientversion=2.12 testlist=sanity env=SANITY_EXCEPT="27M 151 156" Test-Parameters: serverversion=2.12 testlist=sanity env=SANITY_EXCEPT="56 165 205b" Signed-off-by:
Lai Siyao <lai.siyao@whamcloud.com> Change-Id: Idcf8388b65dee1f0a09a53b240ce8303f3c6ff75 Reviewed-on: https://review.whamcloud.com/39290 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Neil Brown <neilb@suse.de> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 20 Jul, 2020 1 commit
-
-
Sebastien Buisson authored
Encryption layer needs to set an encryption context on files and dirs that are encrypted. This context is stored as an extended attribute, that then needs to be fetched upon metadata ops like lookup, getattr, open, truncate, and layout. With this patch we send encryption context to the MDT along with create RPCs. This closes the insecure window between creation and setting of the encryption context, and saves a setxattr request. This patch also introduces a way to have the MDT return encryption context upon granted lock reply, making the encryption context retrieval atomic, and sparing the client an additional getxattr request. Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49" clientdistro=el8.1 fstype=ldiskfs mdscount=2 mdtcount=4 Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49" clientdistro=el8.1 fstype=zfs mdscount=2 mdtcount=4 Test-Parameters: clientversion=2.12 env=SANITY_EXCEPT="27M 56ra 151 156 802" Test-Parameters: serverversion=2.12 env=SANITY_EXCEPT="56oc 56od 165a 165b 165d 205b" Test-Parameters: serverversion=2.12 clientdistro=el8.1 env=SANITYN_EXCEPT=106,SANITY_EXCEPT="56oc 56od 165a 165b 165d 205b" Signed-off-by:
Sebastien Buisson <sbuisson@ddn.com> Change-Id: I45599cdff13d5587103aff6edd699abcda6cb8f4 Reviewed-on: https://review.whamcloud.com/38430 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 10 Jul, 2020 1 commit
-
-
Mr NeilBrown authored
The lli_lsm_sem locks taken by ll_prep_md_op_data() are sometimes released by a different thread. This confuses lockdep unless we explain the situation. So use down_read_non_owner() and up_read_non_owner(). Test-Parameters: trivial Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: Ie6543706c658fc427461ef03448f3fcf90abaab7 Reviewed-on: https://review.whamcloud.com/39234 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 04 Jul, 2020 2 commits
-
-
Mr NeilBrown authored
A kthread runs with the same fs_struct as init. It is only helpful to unshare this if the thread will change one of the fields in the fs_struct: root directory current working directory umask. No lustre kthread changes any of these, so there is no need to call unshare_fs_struct(). Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: I7309b6ed184b14a272bad7dc5149ad36281f948e Reviewed-on: https://review.whamcloud.com/39132 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Yang Sheng <ys@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mr NeilBrown authored
It is Linux policy to avoid #ifdef in C files where convenient - .h files are OK. This patch defines a few inline functions which differ depending on CONFIG_LUSTRE_FS_POSIX_ACL, and removes some #ifdefs from .c files. Test-Parameters: trivial Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: I680bcf568d3a09d3768cc992a53671352bd125fd Reviewed-on: https://review.whamcloud.com/39131 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Jian Yu <yujian@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 23 Jun, 2020 1 commit
-
-
Lai Siyao authored
Pack the parent FID in getattr by FID, which will be used to check whether the child is a remote object on the parent. The helper function is called mdt_is_remote_object(). NB: a directory shard is not treated as a remote object, because otherwise the client would need to revalidate shards whenever the dir is accessed, which would hurt performance considerably. For getattr by FID, if the object is a remote file on the parent, don't fetch the LOOKUP lock, otherwise the client may see stale dir entries. Signed-off-by:
Lai Siyao <lai.siyao@whamcloud.com> Change-Id: Id181ecc053579ee394080381a82706334503ced0 Reviewed-on: https://review.whamcloud.com/38561 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Yingjin Qian <qian@ddn.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 19 Jun, 2020 1 commit
-
-
Andriy Skulysh authored
All MDT intent RPCs are sent with the inode mutex locked, while read/write and setattr unlock the inode mutex on entry, take the LDLM lock, lock the inode mutex again and send the RPC. So a deadlock can occur, since the LDLM lock is the same in the DoM case. In fact read/write and setattr take lli_trunc_sem, so the inode mutex can be omitted in the truncate case. Replace inode_lock with a new lli_setattr_mutex to keep protection from concurrent setattr time updates. HPE-bug-id: LUS-8455 Change-Id: Ie294154306cc3b6cff977a2dff485e8d44145ed9 Reviewed-by:
Andrew Perepechko <c17827@cray.com> Reviewed-by:
Vitaly Fertman <c17818@cray.com> Signed-off-by:
Andriy Skulysh <c17819@cray.com> Reviewed-on: https://review.whamcloud.com/38288 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 16 Jun, 2020 1 commit
-
-
Sebastien Buisson authored
Truncation of encrypted files is not a trivial operation. The page corresponding to the point where truncation occurs must be read, decrypted, zeroed after truncation point, re-encrypted and then written back. Signed-off-by:
Sebastien Buisson <sbuisson@ddn.com> Change-Id: I834f9372913d7051b1e0821515d3fea0873ffd78 Reviewed-on: https://review.whamcloud.com/37794 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
John L. Hammond <jhammond@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
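The read, decrypt, zero, re-encrypt, write-back sequence described in this commit can be illustrated with a minimal user-space sketch. Everything here is an assumption made for illustration: the `demo_` names, the 16-byte page size, and the XOR "cipher" are toy stand-ins and are not the Lustre or fscrypt implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 16u	/* toy page size, hypothetical */

/* Toy stand-in for a block cipher: XOR with a position-dependent key.
 * It is its own inverse and is NOT secure. */
static void demo_xor_page(unsigned char *page, unsigned char key)
{
	size_t i;

	for (i = 0; i < DEMO_PAGE_SIZE; i++)
		page[i] ^= (unsigned char)(key + i);
}

/* Rewrite the page that contains the truncation point of an encrypted,
 * page-aligned buffer so that its tail decrypts to zeroes. */
static void demo_truncate_encrypted(unsigned char *file, size_t new_size,
				    unsigned char key)
{
	size_t off = new_size % DEMO_PAGE_SIZE;
	unsigned char page[DEMO_PAGE_SIZE];
	unsigned char *slot;

	assert(file != NULL);
	if (off == 0)
		return;	/* page-aligned truncate: no partial page to fix */

	slot = file + (new_size - off);
	memcpy(page, slot, DEMO_PAGE_SIZE);		/* read the page */
	demo_xor_page(page, key);			/* decrypt */
	memset(page + off, 0, DEMO_PAGE_SIZE - off);	/* zero after the point */
	demo_xor_page(page, key);			/* re-encrypt */
	memcpy(slot, page, DEMO_PAGE_SIZE);		/* write back */
}
```

Zeroing the ciphertext directly would not work, because the zeroed bytes would decrypt to garbage rather than to zeroes; this is why the page has to make a round trip through plaintext.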
-
- 10 Jun, 2020 2 commits
-
-
Patrick Farrell authored
The statfs state flags are oddly named "OS_STATE_[STATE]" Rename them to "OS_STATFS_[STATE]" to make their role clearer and make them easier to find. Test-Parameters: trivial Signed-off-by:
Patrick Farrell <pfarrell@whamcloud.com> Change-Id: I3f43b3e73155d9fbd8b3e0fa52e7f4d26b9d2f89 Reviewed-on: https://review.whamcloud.com/34289 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Patrick Farrell <farr0186@gmail.com>
-
Sebastien Buisson authored
Client enables encryption by default. However, this should be possible only if server side is encryption aware. Moreover, we want to give the ability to decide which clients can make use of encryption, by extending the nodemap mechanism with a new 'forbid_encryption' property, set to 0 by default. Signed-off-by:
Sebastien Buisson <sbuisson@ddn.com> Change-Id: I765e5ce555e8277319c03c770cb6e6ac73cfc9e8 Reviewed-on: https://review.whamcloud.com/36433 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
John L. Hammond <jhammond@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 06 Jun, 2020 1 commit
-
-
Sebastien Buisson authored
Enable client side encryption. By default it is activated, letting the user specify the actual encryption policy to use on a per-directory basis. It is possible to deactivate client side encryption by using the 'noencrypt' mount option. Also add the test dummy encryption mode option to ease testing. Signed-off-by:
Sebastien Buisson <sbuisson@ddn.com> Change-Id: I0e8d4db7ab8a77aba0600788cca9403f7c50f8a6 Reviewed-on: https://review.whamcloud.com/36143 Reviewed-by:
John L. Hammond <jhammond@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 27 May, 2020 2 commits
-
-
James Simmons authored
Lustre uses a workqueue to clear out stale exports. Bind this workqueue to the cores used by Lustre defined by the CPT setup. Move the code handling workqueue binding to libcfs so it can be used by everyone. Rename CONFIG_LUSTRE_PINGER to CONFIG_LUSTRE_FS_PINGER to match linux client. Change-Id: Ifa109f6a93e6ec6bbdef5e91fe8ca1cde0eaea3e Signed-off-by:
James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/38212 Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Wang Shilong <wshilong@ddn.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Qian Yingjin authored
The statx() system call interface can specify a bitmask to fetch specific attributes from a file (e.g. st_uid, st_gid, st_mode, and st_btime = file creation time), rather than fetching all of the normal stat() attributes (such as st_size and st_blocks). It also has an AT_STATX_DONT_SYNC mode which allows the kernel to return cached attributes without flushing all of the client data and fetching an accurate result from the server. The conditions for adding the statx() API to Lustre are mature: 1. statx() was added in Linux 4.11+; 2. glibc supports statx() (glibc 2.28+ -> RHEL 8, Ubuntu 18.10+); 3. The support for stat(1) and ls(1) to use statx(3) to fetch only the required attributes has landed in the upstream GNU coreutils package. This patch integrates the statx() API with Lustre so that we can take advantage of the efficiencies available: - Only fetch MDS attributes if STATX_SIZE, STATX_BLOCKS and STATX_MTIME are not requested, and avoid OSS glimpse RPCs completely; - Hook this into statahead to avoid async glimpse locks (AGL) if OST information is not needed; - Enhance the MDS RPC interface to return the file creation time stored in both ldiskfs and ZFS already, and enable STATX_BTIME; - Better support for AT_STATX_DONT_SYNC mode. Return the "lazy" attributes or cached attributes (even stale) on a client if available without any RPCs to servers (MDS and OSS). - statx (lustre/test/statx): port coreutils ls/stat by using the statx(3) system call if the OS supports it. - Test scripts: use statx() to verify the btime attribute and the advantages described above. Test-Parameters: clientdistro=el8 Test-Parameters: clientdistro=ubuntu1804 Signed-off-by:
Qian Yingjin <qian@ddn.com> Change-Id: I8432c9029bad9dea3e1ebc13a0d6978131d9b929 Reviewed-on: https://review.whamcloud.com/36674 Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com>
-
- 14 May, 2020 1 commit
-
-
Andreas Dilger authored
The client connection UUID sent to the servers (ASCII format) was being truncated to only 16 bytes in size, like '595f3c6a-20ae-4' instead of a full UUID like '18ae0f9a-4b09-4599-8ced-0f2126eab425'. This was caused by using UUID_SIZE to limit the size of the "%pU" string printed to avoid overflowing the target buffer, but in fact UUID_SIZE is the size of the binary uuid_t (16 bytes) instead of the size of struct obd_uuid (40 bytes) where the ASCII version of the UUID is stored. Fix this to use sizeof(target) rather than an external constant, which is exactly why sizeof(target) should always be used. The usage in osd_scrub.c is not actually broken, but it is still better to use sizeof(target) to avoid future inconsistencies. Fixes: 604c266a ("LU-11803 obd: replace class_uuid with linux kernel version") Signed-off-by:
Andreas Dilger <adilger@whamcloud.com> Change-Id: I05325646cd632a09997d6632a483909629ce7057 Reviewed-on: https://review.whamcloud.com/38443 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Arshad Hussain <arshad.super@gmail.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
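The truncation described in this commit can be reproduced in user space with a minimal sketch. The struct and function names below are hypothetical stand-ins (the real code formats a uuid_t with "%pU" inside the kernel); only the 16-byte and 40-byte sizes come from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_UUID_BIN_SIZE 16	/* size of a binary UUID, like UUID_SIZE */

/* Stand-in for struct obd_uuid: 40 bytes holding the ASCII form. */
struct demo_obd_uuid {
	char uuid[40];
};

/* Buggy variant: limits the output by the size of the binary UUID. */
static void demo_uuid_print_buggy(struct demo_obd_uuid *target,
				  const char *ascii)
{
	assert(target != NULL && ascii != NULL);
	snprintf(target->uuid, DEMO_UUID_BIN_SIZE, "%s", ascii);
}

/* Fixed variant: limits the output by the size of the destination. */
static void demo_uuid_print_fixed(struct demo_obd_uuid *target,
				  const char *ascii)
{
	assert(target != NULL && ascii != NULL);
	snprintf(target->uuid, sizeof(target->uuid), "%s", ascii);
}
```

With a limit of 16, snprintf() keeps 15 characters plus the terminating NUL, which matches the 15-character truncated form quoted in the commit message.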
-
- 08 May, 2020 1 commit
-
-
James Simmons authored
In the late 4.X kernel cycle Xarrays were introduced with the goal of replacing the radix tree for the page cache. It is highly optimized for densely packed data, which is the case for several items in Lustre such as the static array for the obd devices and quota ids. This patch provides Xarray support for kernels that lack it. The current version of Xarray back ported is from the 5.4-rc2 kernel. Change-Id: I54f9046f50a353e1cd4271c0b97207062bbf3898 Signed-off-by:
James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/37391 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Neil Brown <neilb@suse.de> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 07 May, 2020 1 commit
-
-
Chris Horn authored
This patch adds a new constant, LNET_NID_LO_0, to represent the lolnd NID 0@lo. HPE-bug-id: LUS-8457 Signed-off-by:
Chris Horn <hornc@cray.com> Change-Id: I3e57637f297b8de306905a447af8f025e31d1fcf Reviewed-on: https://review.whamcloud.com/38312 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Serguei Smirnov <ssmirnov@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 23 Apr, 2020 1 commit
-
-
Sebastien Buisson authored
Instead of using inode->i_blkbits to infer preferred IO size, just set stat->blksize to the right value in ll_getattr_dentry(). Signed-off-by:
Sebastien Buisson <sbuisson@ddn.com> Change-Id: If705c7d52bdfabdd3e669ca2d34f8cc0ca1ae08a Reviewed-on: https://review.whamcloud.com/38176 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com>
-
- 07 Apr, 2020 1 commit
-
-
James Simmons authored
A workqueue is used by Lustre to optimize readahead. This workqueue can run on any core and can easily be oversubscribed. This will have a negative impact on HPC applications running on a Lustre client. Limit the number of threads a workqueue can run to the size of the CPU allocated for Lustre and only allow those threads to run on the cores belonging to the CPT set. Change-Id: Ifcc662d52843f5028c34d55695c1d6297e5c00b0 Signed-off-by:
James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/37717 Reviewed-by:
Wang Shilong <wshilong@ddn.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Stephen Champion <stephen.champion@hpe.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 06 Apr, 2020 1 commit
-
-
Andreas Dilger authored
The OBD_CONNECT_LOCKAHEAD_OLD feature was added for a short time for compatibility with an implementation of LDLM lockahead that was later replaced by OBD_CONNECT_LOCKAHEAD2. Remove the compatibility code for this old implementation that has been disabled since the 2.13 release. Also remove other obsolete compatibility code dating back to 2.8.53. Test-Parameters: trivial Signed-off-by:
Andreas Dilger <adilger@whamcloud.com> Change-Id: I4444eff180b2c6e2b27d260413f2debbb2ce7057 Reviewed-on: https://review.whamcloud.com/38109 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org>
-
- 31 Mar, 2020 1 commit
-
-
Lai Siyao authored
Current directory striping strategy is fixed striping, i.e., it calls (hash(filename) % stripe_count) to decide stripe for file. The problem with this approach is that if stripe_count changes, most of the files will need to be relocated between MDTs. This makes directory split/merge quite expensive. This patch introduces consistent hash striping strategy: it calls (hash(filename) % LMV_CRUSH_PG_COUNT) to locate PG_ID (placement group index), and then calls crush_hash(PG_ID, stripe_index) to get a straw for each stripe, and the stripe with the highest straw will be used to place this file. As we can see, it uses the CRUSH algorithm, but it only uses it to map placement groups pseudo-randomly among all stripes, and does not use it to choose MDTs if the MDT is not specified. The latter is done by MDT object QoS allocation in LMV and LOD (LMV decides the starting stripe MDT, while LOD decides the remaining stripes). This implementation contains the changes below: ...
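The straw-based placement described above can be sketched as follows. This is only an illustration under stated assumptions: the FNV-1a name hash, the integer mixer standing in for crush_hash(), the `demo_` names, and the placement group count of 4096 are all hypothetical and are not taken from the Lustre sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PG_COUNT 4096u	/* assumed value, stands in for LMV_CRUSH_PG_COUNT */

/* FNV-1a, used here as a stand-in for the filename hash. */
static uint32_t demo_name_hash(const char *name)
{
	uint32_t h = 2166136261u;

	assert(name != NULL);
	for (; *name != '\0'; name++) {
		h ^= (unsigned char)*name;
		h *= 16777619u;
	}
	return h;
}

/* Stand-in for crush_hash(PG_ID, stripe_index): a simple integer mixer. */
static uint32_t demo_straw(uint32_t pg_id, uint32_t stripe_index)
{
	uint32_t x = (pg_id * 0x9e3779b1u) ^ ((stripe_index + 1) * 0x85ebca6bu);

	x ^= x >> 16;
	x *= 0x7feb352du;
	x ^= x >> 15;
	x *= 0x846ca68bu;
	x ^= x >> 16;
	return x;
}

/* Draw one straw per stripe and pick the stripe with the highest straw. */
static uint32_t demo_pick_stripe(const char *name, uint32_t stripe_count)
{
	uint32_t pg_id = demo_name_hash(name) % DEMO_PG_COUNT;
	uint32_t best = 0, best_straw = 0, i;

	assert(stripe_count > 0);
	for (i = 0; i < stripe_count; i++) {
		uint32_t straw = demo_straw(pg_id, i);

		if (i == 0 || straw > best_straw) {
			best = i;
			best_straw = straw;
		}
	}
	return best;
}
```

Because the straw of an existing stripe does not depend on stripe_count, adding a stripe can only move a name onto the new stripe or leave it where it was, which is the property that makes split/merge cheaper than with (hash % stripe_count).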
-
- 24 Mar, 2020 1 commit
-
-
Wang Shilong authored
Currently async readahead is limited by the following factors: 1) @ra_max_pages_per_file; 2) @ra_max_read_ahead_whole_pages; 3) @ra_async_pages_per_file_threshold. If an admin sets @ra_max_read_ahead_whole_pages to a large value such as 4G, then with 16M RPCs we could have 256 async readahead requests in flight at the same time, which could consume all CPU resources for readahead without any limit. Even though we could set @max_active for the workqueue, so that RA requests are held in the workqueue pool and the CPU is kept from being busy, the RA requests still use CPU later, and we might still submit too many requests to the workqueue. So instead of limiting it in the workqueue, limit it earlier: if there are already too many async RA requests in the system (by default, 1/2 of the CPU cores), fall back to sync RA, which keeps read threads from using all CPU resources. Change-Id: I370c04e014f24c795c1a28effca9c51b1db2a417 Signed-off-by:
Wang Shilong <wshilong@ddn.com> Reviewed-on: https://review.whamcloud.com/37927 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 01 Mar, 2020 1 commit
-
-
Andreas Dilger authored
Move the max_read_ahead_* tunables from debugfs to sysfs, since they follow the one-value-per-file rule and should be visible to regular users. Rename the functions and constants from *readahead* to *read_ahead* or *READ_AHEAD* to match the tunable names from procfs. Deprecate usage of llprocfs_str_with_units_to_s64(), lu_str_to_s64(), llprocfs_str_with_units_to_u64(), and lu_str_to_u64(), and instead use sysfs_memparse() to parse sizes in the few remaining places where they are used. A separate patch will remove those functions. Minor fix to the "lctl set_param" man page. Fixes: adb5aca3 ("LU-8066 llite: Move all remaining procfs entries to debugfs") Signed-off-by:
Andreas Dilger <adilger@whamcloud.com> Change-Id: I2cdf5f8f0aeca458ed1989366102c33ae83ebbe5 Reviewed-on: https://review.whamcloud.com/34849 Reviewed-by:
James Simmons <jsimmons@infradead.org> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Wang Shilong <wshilong@ddn.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 20 Feb, 2020 3 commits
-
-
James Simmons authored
Remove from the lustre kernel code all the support for kernels earlier than the RHEL7 3.10+. This greatly simplifies the code and makes build times much better. Change-Id: If52091ac5249b2719b992032040ccf30cc5bf0e4 Signed-off-by:
James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/37085 Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Petros Koutoupis <petros.koutoupis@hpe.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Yang Sheng <ys@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mr NeilBrown authored
The LUSTRE_FPRIVATE() macro adds no value. Instead of LUSTRE_FPRIVATE(file) use file->private_data which is shorter and more familiar, and widely used elsewhere in lustre. Also re-indent several functions where this was changed, to use TABs. Also join together some strings that were split across 2 lines. Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: I811aea8069b22beed15fd96d8c6bef8eca42defd Reviewed-on: https://review.whamcloud.com/36652 Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Arshad Hussain <arshad.super@gmail.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mr NeilBrown authored
If the lwi passed to l_wait_event() was created with lwi = LWI_INTR(LWI_ON_SIGNAL_NOOP, NULL); the effect is to wait with no timeout and blocking any non-fatal signals. For this, we now have l_wait_event_abortable(), or for one case l_wait_event_abortable_exclusive(); So use those. l_wait_event_abortable() will return -ERESTARTSYS if a signal was received, while l_wait_event() returns -EINTR. We need to be careful to handle this difference. Signed-off-by:
Mr NeilBrown <neilb@suse.com> Change-Id: Iadf0fab92fcfd46802766198dcbe6b6b349214fa Reviewed-on: https://review.whamcloud.com/35975 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Sebastien Buisson <sbuisson@ddn.com> Reviewed-by:
Alex Zhuravlev <bzzz@whamcloud.com> Reviewed-by:
Petros Koutoupis <petros.koutoupis@hpe.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 14 Feb, 2020 1 commit
-
-
Mr NeilBrown authored
The construct: set_current_state(TASK_UNINTERRUPTIBLE); schedule_timeout(time); is more clearly expressed as schedule_timeout_uninterruptible(time); and similarly with TASK_INTERRUPTIBLE / schedule_timeout_interruptible(). Establishing this practice makes it harder to forget to call set_current_state(), as has happened a couple of times - in lnet_peer_discovery and mdd_changelog_fini(). Also, there is no need to set_current_state(TASK_RUNNING) after calling schedule*(). That state is guaranteed to have been set. In mdd_changelog_fini() there was an attempt to sleep for 10 microseconds. This will always round up to 1 jiffy, so just make it schedule_timeout_uninterruptible(1). Finally a few places where the number of seconds was multiplied by 1, have had the '1 *' removed. Test-Parameters: trivial Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: I01b37039de0bf7e07480de372c1a4cfe78a8cdd8 Reviewed-on: https://review.whamcloud.com/3665...
-
- 28 Jan, 2020 1 commit
-
-
Andriy Skulysh authored
On error ll_open_cleanup() is called while intent lock remains pinned. So eviction can happen while close request waits for a mod rpc slot. Release intent lock before ll_open_cleanup() Change-Id: Ia422351f3f54fc652078f742f2ead0bf278c9d17 Cray-bug-id: LUS-8055 Signed-off-by:
Andriy Skulysh <c17819@cray.com> Reviewed-by:
Alexander Boyko <c17825@cray.com> Reviewed-by:
Andrew Perepechko <c17827@cray.com> Reviewed-by:
Vitaly Fertman <c17818@cray.com> Reviewed-on: https://review.whamcloud.com/37096 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Alexandr Boyko <c17825@cray.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 23 Jan, 2020 3 commits
-
-
Lai Siyao authored
Deadlock may happen in the following scenario: a lookup process called ll_update_lsm_md(), it found lli->lli_lsm_md is NULL, then down_write(&lli->lli_lsm_sem). But another lookup process initialized lli->lli_lsm_md after this check and before the write lock, so the first lookup process called up_read(&lli->lli_lsm_sem) and returned, so the write lock is never released, which causes subsequent lookups to deadlock. Rearrange the code to simplify the locking: 1. take read lock. 2. if lsm was initialized and unchanged, release read lock and return. 3. otherwise release read lock and take write lock. 4. free current lsm and initialize with new lsm. 5. release write lock. 6. initialize stripes with read lock. Signed-off-by:
Lai Siyao <lai.siyao@whamcloud.com> Change-Id: Ifcc25a957983512db6f29105b5ca5b6ec914cb4b Reviewed-on: https://review.whamcloud.com/37182 Tested-by:
jenkins <devops@whamcloud.com> Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Hongchao Zhang <hongchao@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Andreas Dilger authored
The llapi_getname() function returns the combined fsname and client instance as one string, which is fine when using the entire string, but the output cannot be safely parsed into separate fsname and instance strings in all cases. Introduce new llapi_get_fsname() and llapi_get_instance() functions that return only the fsname and instance strings, since the source string returned from the kernel can be unambiguously separated before it is returned in a combined string via llapi_getname(). Fix the lfs_getname() '-n' and '-i' options to use the new routines rather than parsing the output from llapi_getname(). Add man pages for these functions. Fixes: 2a4821b8 ("LU-12159 utils: improve lfs getname functionality") Signed-off-by:
Andreas Dilger <adilger@whamcloud.com> Change-Id: Iaf5846a0ae147a428f66ec8a1d0251e7e12540e5 Reviewed-on: https://review.whamcloud.com/35451 Reviewed-by:
Olaf Faaland-LLNL <faaland1@llnl.gov> Reviewed-by:
James Simmons <jsimmons@infradead.org> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
NeilBrown authored
lli_trunc_sem can lead to a deadlock. vvp_io_read_start takes lli_trunc_sem, and can take mmap sem in the direct i/o case, via generic_file_read_iter->ll_direct_IO->get_user_pages_unlocked vvp_io_fault_start is called with mmap_sem held (taken in the kernel page fault code), and takes lli_trunc_sem. These aren't necessarily the same mmap_sem, but can be if you mmap a lustre file, then read into that mapped memory from the file. These are both 'down_read' calls on lli_trunc_sem so they don't directly conflict, but if vvp_io_setattr_start() is called to truncate the file between these, it does 'down_write' on lli_trunc_sem. As semaphores are queued, this down_write blocks subsequent reads. This means if the page fault has taken the mmap_sem, but not yet the lli_trunc_sem in vvp_io_fault_start, it will wait behind the lli_trunc_sem down_write from vvp_io_setattr_start. At the same time, vvp_io_read_start is holding the lli_trunc_sem and waiting for the mmap_sem, which will not be released because vvp_io_fault_start cannot get the lli_trunc_sem because the setattr 'down_write' operation is queued in front of it. Solve this by replacing with a hand-coded semaphore, using atomic counters and wait_var_event(). This allows a special down_read_nowait which ignores waiting down_write operations. This combined with waking up all waiters at once guarantees that down_read_nowait can always 'join' another down_read, guaranteeing our ability to take the semaphore twice for read and avoiding the deadlock. I'd like there to be a better way to fix this, but I haven't found it yet. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Patrick Farrell <pfarrell@whamcloud.com> Change-Id: Ibd3abf4df1f1f6f45e440733a364999bd608b191 Reviewed-on: https://review.whamcloud.com/35271 Reviewed-by:
Neil Brown <neilb@suse.de> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 18 Jan, 2020 1 commit
-
-
Mr NeilBrown authored
lustre_fill_super() calls client_fill_super() without holding a reference to the module containing client_fill_super. If that module is unloaded at a bad time, this can crash. To be able to get a reference to the module using try_module_get(), we need a pointer to the module. So replace lustre_register_client_fill_super() and lustre_register_kill_super_cb() with a single lustre_register_super_ops() which also passes a module pointer. Then use a spinlock to ensure the module pointer isn't removed while try_module_get() is running, and use try_module_get() to ensure we have a reference before calling client_fill_super(). Now that we take the reference to the module before calling lustre_fill_super(), we don't need to take one inside lustre_fill_super(). Linux-commit: d487fe31f49e78f3cdd826923bf0c340a839ffd8 Signed-off-by:
Mr NeilBrown <neilb@suse.de> Change-Id: I9474622f2a253d9882eae3f0578c50782dd11ad4 Reviewed-on: https://review.whamcloud.co...
-
- 10 Jan, 2020 1 commit
-
-
Patrick Farrell authored
Remove a few config checks for kernel versions we no longer support. Only 3.10+ kernels are now supported. Signed-off-by:
Patrick Farrell <pfarrell@whamcloud.com> Change-Id: I4f4177c512a37fb7a78bab69aa89aa7199ab30b4 Reviewed-on: https://review.whamcloud.com/35342 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Petros Koutoupis <pkoutoupis@cray.com> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 20 Dec, 2019 1 commit
-
-
NeilBrown authored
This arg is always NULL and is never used. So discard it from this and related functions. Linux-commit: 7dc2155195586ec75f53d6dcd381f935ccc35d02 Change-Id: I00b16115edbff0de7605768121981b928585552c Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-on: https://review.whamcloud.com/35427 Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Neil Brown <neilb@suse.de> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 06 Dec, 2019 2 commits
-
-
Mr NeilBrown authored
Next patch will make some code changes to ll_put_super(), so fix up indenting and fix a couple of checkpatch warnings first. Test-Parameters: trivial Signed-off-by:
Mr NeilBrown <neilb@suse.com> Change-Id: I502c81f481c1046d0943f1407a910c1fceeb7ecc Reviewed-on: https://review.whamcloud.com/35974 Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Reviewed-by:
Arshad Hussain <arshad.super@gmail.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
Mr NeilBrown authored
When polling without any usable wait queue, it is clearest to have an explicit poll loop. So don't use l_wait_event() in these two cases, but use a while loop with ssleep(1); Signed-off-by:
Mr NeilBrown <neilb@suse.com> Change-Id: Ic6a203085699fb9802d32871479c822ebe3c2510 Reviewed-on: https://review.whamcloud.com/35968 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
James Simmons <jsimmons@infradead.org> Reviewed-by:
Shaun Tancheff <stancheff@cray.com> Reviewed-by:
Petros Koutoupis <pkoutoupis@cray.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
-
- 05 Dec, 2019 1 commit
-
-
Qian Yingjin authored
When initializing a new inode, the saved flags are wrongly set to PCC_DATASET_NONE, which means that the file is known to be in none of the PCC datasets. This patch corrects it to PCC_DATASET_INVALID. Signed-off-by:
Qian Yingjin <qian@ddn.com> Change-Id: Id775a20711cbc89979e81cbb2b0fe77dc5a850d5 Reviewed-on: https://review.whamcloud.com/36923 Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Reviewed-by:
Li Xi <lixi@ddn.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com>
-
- 03 Dec, 2019 1 commit
-
-
Qian Yingjin authored
When the inode of a PCC cached file in unused state is evicted from icache due to memory pressure or manual icache cleanup (i.e. "echo 3 > /proc/sys/vm/drop_caches"), this file will be detached from PCC as well, and all PCC state for this file is cleared. In the current design, PCC only tries to auto attach a file that was once attached into PCC according to the in-memory PCC state. Thus later IO for the file is not directed to PCC and will trigger the data restore. If this is not the desired result for the user, then we need to try to auto attach a file that was never attached into PCC, or that was once attached but detached as a result of shrinking its inode from icache. Although the number of candidates for auto attach increases, only files in HSM released state (which can be obtained directly from the file layout) will be checked. This bug is easily reproduced on rhel8. It seems that the command "echo 3 > /proc/sys/vm/drop_caches" will drop all unused inodes from icache,...
-
- 12 Nov, 2019 2 commits
-
-
Alex Zhuravlev authored
otherwise client umount can get stuck if the MDS is down for some reason. recovery-small/110k simulates this. Signed-off-by:
Alex Zhuravlev <bzzz@whamcloud.com> Change-Id: I40f6059d429b51a877deb532c1d0302dba0d5c85 Reviewed-on: https://review.whamcloud.com/36297 Reviewed-by:
Andreas Dilger <adilger@whamcloud.com> Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Mike Pershin <mpershin@whamcloud.com>
-
Andreas Dilger authored
Add the elapsed time of VFS operations to the llite stats counter, instead of just tracking the number of operations, to allow tracking of operation round-trip latency. Update sanity test_127[ab] to check that llite.*.stats and osc.*.stats counter shows read/write stats in usec, and fix code style nearby. Signed-off-by:
Andreas Dilger <adilger@whamcloud.com> Change-Id: I40e188374f91c030d978a83157d8869e928cab07 Reviewed-on: https://review.whamcloud.com/36078 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Li Xi <lixi@ddn.com> Reviewed-by:
Wang Shilong <wshilong@ddn.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
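Tracking elapsed time in addition to the operation count, as this commit describes, can be sketched with a small accumulator. The struct layout and the `demo_` names are assumptions for illustration and do not reflect the actual llite stats counters.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Per-operation counter that records latency as well as the call count. */
struct demo_op_stats {
	uint64_t count;
	uint64_t sum_usec;
	uint64_t min_usec;
	uint64_t max_usec;
};

/* Elapsed time between two timestamps, in microseconds. */
static uint64_t demo_elapsed_usec(const struct timespec *start,
				  const struct timespec *end)
{
	int64_t sec, nsec, total_nsec;

	assert(start != NULL && end != NULL);
	sec = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
	nsec = (int64_t)end->tv_nsec - (int64_t)start->tv_nsec;
	total_nsec = sec * 1000000000LL + nsec;
	assert(total_nsec >= 0);
	return (uint64_t)(total_nsec / 1000);
}

/* Add one completed operation and its latency to the counter. */
static void demo_stats_tally(struct demo_op_stats *st, uint64_t usec)
{
	assert(st != NULL);
	if (st->count == 0 || usec < st->min_usec)
		st->min_usec = usec;
	if (usec > st->max_usec)
		st->max_usec = usec;
	st->count++;
	st->sum_usec += usec;
}
```

A caller would take one timestamp before the operation and one after it, for example with clock_gettime(CLOCK_MONOTONIC, ...), and pass the difference to demo_stats_tally(); the average latency is then sum_usec divided by count.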
-
- 22 Oct, 2019 1 commit
-
-
Andreas Dilger authored
When mounting a client with "-o localflock" or equivalent option in /etc/fstab, it does not clear out the "flock" mount option flag from the superblock. This results in "flock" still being the option used and it displays both options in the /proc/mounts output: 10.0.0.1@o2ib:/lfs on /mnt/lfs type lustre (rw,flock,localflock) Mount a client with both "flock,localflock" as mount options and verify that the "flock" option is cleared by "localflock", and vice versa. Verify that "noflock" clears both options. Remove the "remount_client()" helper in conf-sanity.sh, since this shadows a helper function of the same name in test-framework.sh and is confusing. Instead, use "mount_client()" now that it can accept mount options, and just pass "remount" explicitly in a few places. Fixes: 3613af3e ("LU-10885 llite: enable flock mount option by default") Test-Parameters: trivial testlist=conf-sanity Signed-off-by:
Andreas Dilger <adilger@whamcloud.com> Change-Id: Ie31b0c4f6674c99d3ed5b73caa39cfc23d3ebbe5 Reviewed-on: https://review.whamcloud.com/36452 Tested-by:
jenkins <devops@whamcloud.com> Tested-by:
Maloo <maloo@whamcloud.com> Reviewed-by:
Ben Evans <bevans@cray.com> Reviewed-by:
Hongchao Zhang <hongchao@whamcloud.com> Reviewed-by:
Oleg Drokin <green@whamcloud.com>
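The mutual clearing of the "flock" and "localflock" options described in this commit can be sketched with two flag bits. The flag names and values below are hypothetical placeholders, not the real superblock flags.

```c
#include <assert.h>
#include <string.h>

#define DEMO_SBI_FLOCK		0x01u	/* hypothetical flag values */
#define DEMO_SBI_LOCALFLOCK	0x02u

/* Apply one lock-related mount option, making sure that setting one
 * mode clears the other and that "noflock" clears both. */
static unsigned int demo_apply_lock_option(unsigned int flags,
					   const char *opt)
{
	assert(opt != NULL);
	if (strcmp(opt, "flock") == 0) {
		flags &= ~DEMO_SBI_LOCALFLOCK;
		flags |= DEMO_SBI_FLOCK;
	} else if (strcmp(opt, "localflock") == 0) {
		flags &= ~DEMO_SBI_FLOCK;
		flags |= DEMO_SBI_LOCALFLOCK;
	} else if (strcmp(opt, "noflock") == 0) {
		flags &= ~(DEMO_SBI_FLOCK | DEMO_SBI_LOCALFLOCK);
	}
	return flags;
}
```

Starting from the default where "flock" is enabled, applying "localflock" leaves only the local flag set, so a listing such as (rw,flock,localflock) can no longer occur in this model.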
-