aboutsummaryrefslogtreecommitdiffstats
path: root/fs/quota/dquot.c
AgeCommit message (Collapse)Author
2024-03-26quota: Fix rcu annotations of inode dquot pointersJan Kara
[ Upstream commit 179b8c97ebf63429589f5afeba59a181fe70603e ] Dquot pointers in i_dquot array in the inode are protected by dquot_srcu. Annotate the array pointers with __rcu, perform the locked dereferences with srcu_dereference_check() instead of plain reads, and set the array elements with rcu_assign_pointer(). Fixes: b9ba6f94b238 ("quota: remove dqptr_sem") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402061900.rTuYDlo6-lkp@intel.com/ Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26quota: Fix potential NULL pointer dereferenceWang Jianjian
[ Upstream commit d0aa72604fbd80c8aabb46eda00535ed35570f1f ] Below race may cause NULL pointer dereference P1 P2 dquot_free_inode quota_off drop_dquot_ref remove_dquot_ref dquots = i_dquot(inode) dquots = i_dquot(inode) srcu_read_lock dquots[cnt]) != NULL (1) dquots[type] = NULL (2) spin_lock(&dquots[cnt]->dq_dqb_lock) (3) .... If dquot_free_inode(or other routines) checks inode's quota pointers (1) before quota_off sets it to NULL(2) and use it (3) after that, NULL pointer dereference will be triggered. So let's fix it by using a temporary pointer to avoid this issue. Signed-off-by: Wang Jianjian <wangjianjian3@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240202081852.2514092-1-wangjianjian3@huawei.com> Stable-dep-of: 179b8c97ebf6 ("quota: Fix rcu annotations of inode dquot pointers") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26quota: simplify drop_dquot_ref()Baokun Li
[ Upstream commit 7bce48f0fec602b3b6c335963b26d9eefa417788 ] As Honza said, remove_inode_dquot_ref() currently does not release the last dquot reference but instead adds the dquot to tofree_head list. This is because dqput() can sleep while dropping of the last dquot reference (writing back the dquot and calling ->release_dquot()) and that must not happen under dq_list_lock. Now that dqput() queues the final dquot cleanup into a workqueue, remove_inode_dquot_ref() can call dqput() unconditionally and we can significantly simplify it. Here we open code the simplified code of remove_inode_dquot_ref() into remove_dquot_ref() and remove the function put_dquot_list() which is no longer used. Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230630110822.3881712-6-libaokun1@huawei.com> Stable-dep-of: 179b8c97ebf6 ("quota: Fix rcu annotations of inode dquot pointers") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28quota: explicitly forbid quota files from being encryptedEric Biggers
commit d3cc1b0be258191d6360c82ea158c2972f8d3991 upstream. Since commit d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key"), xfstest generic/270 causes a WARNING when run on f2fs with test_dummy_encryption in the mount options: $ kvm-xfstests -c f2fs/encrypt generic/270 [...] WARNING: CPU: 1 PID: 2453 at fs/crypto/keyring.c:240 fscrypt_destroy_keyring+0x1f5/0x260 The cause of the WARNING is that not all encrypted inodes have been evicted before fscrypt_destroy_keyring() is called, which violates an assumption. This happens because the test uses an external quota file, which gets automatically encrypted due to test_dummy_encryption. Encryption of quota files has never really been supported. On ext4, ext4_quota_read() does not decrypt the data, so encrypted quota files are always considered invalid on ext4. On f2fs, f2fs_quota_read() uses the pagecache, so trying to use an encrypted quota file gets farther, resulting in the issue described above being possible. But this was never intended to be possible, and there is no use case for it. Therefore, make the quota support layer explicitly reject using IS_ENCRYPTED inodes when quotaon is attempted. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230905003227.326998-1-ebiggers@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-19quota: Fix slow quotaoffJan Kara
commit 869b6ea1609f655a43251bf41757aa44e5350a8f upstream. Eric has reported that commit dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should provide") heavily increases runtime of generic/270 xfstest for ext4 in nojournal mode. The reason for this is that ext4 in nojournal mode leaves dquots dirty until the last dqput() and thus the cleanup done in quota_release_workfn() has to write them all. Due to the way quota_release_workfn() is written this results in synchronize_srcu() call for each dirty dquot which makes the dquot cleanup when turning quotas off extremely slow. To be able to avoid synchronize_srcu() for each dirty dquot we need to rework how we track dquots to be cleaned up. Instead of keeping the last dquot reference while it is on releasing_dquots list, we drop it right away and mark the dquot with new DQ_RELEASING_B bit instead. This way we can we can remove dquot from releasing_dquots list when new reference to it is acquired and thus there's no need to call synchronize_srcu() each time we drop dq_list_lock. References: https://lore.kernel.org/all/ZRytn6CxFK2oECUt@debian-BULLSEYE-live-builder-AMD64 Reported-by: Eric Whitney <enwlinux@gmail.com> Fixes: dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should provide") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19quota: fix dqput() to follow the guarantees dquot_srcu should provideBaokun Li
[ Upstream commit dabc8b20756601b9e1cc85a81d47d3f98ed4d13a ] The dquot_mark_dquot_dirty() using dquot references from the inode should be protected by dquot_srcu. quota_off code takes care to call synchronize_srcu(&dquot_srcu) to not drop dquot references while they are used by other users. But dquot_transfer() breaks this assumption. We call dquot_transfer() to drop the last reference of dquot and add it to free_dquots, but there may still be other users using the dquot at this time, as shown in the function graph below: cpu1 cpu2 _________________|_________________ wb_do_writeback CHOWN(1) ... ext4_da_update_reserve_space dquot_claim_block ... dquot_mark_dquot_dirty // try to dirty old quota test_bit(DQ_ACTIVE_B, &dquot->dq_flags) // still ACTIVE if (test_bit(DQ_MOD_B, &dquot->dq_flags)) // test no dirty, wait dq_list_lock ... dquot_transfer __dquot_transfer dqput_all(transfer_from) // rls old dquot dqput // last dqput dquot_release clear_bit(DQ_ACTIVE_B, &dquot->dq_flags) atomic_dec(&dquot->dq_count) put_dquot_last(dquot) list_add_tail(&dquot->dq_free, &free_dquots) // add the dquot to free_dquots if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags)) add dqi_dirty_list // add released dquot to dirty_list This can cause various issues, such as dquot being destroyed by dqcache_shrink_scan() after being added to free_dquots, which can trigger a UAF in dquot_mark_dquot_dirty(); or after dquot is added to free_dquots and then to dirty_list, it is added to free_dquots again after dquot_writeback_dquots() is executed, which causes the free_dquots list to be corrupted and triggers a UAF when dqcache_shrink_scan() is called for freeing dquot twice. As Honza said, we need to fix dquot_transfer() to follow the guarantees dquot_srcu should provide. But calling synchronize_srcu() directly from dquot_transfer() is too expensive (and mostly unnecessary). So we add dquot whose last reference should be dropped to the new global dquot list releasing_dquots, and then queue work item which would call synchronize_srcu() and after that perform the final cleanup of all the dquots on releasing_dquots. Fixes: 4580b30ea887 ("quota: Do not dirty bad dquots") Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230630110822.3881712-5-libaokun1@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19quota: add new helper dquot_active()Baokun Li
[ Upstream commit 33bcfafc48cb186bc4bbcea247feaa396594229e ] Add new helper function dquot_active() to make the code more concise. Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230630110822.3881712-4-libaokun1@huawei.com> Stable-dep-of: dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should provide") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19quota: rename dquot_active() to inode_quota_active()Baokun Li
[ Upstream commit 4b9bdfa16535de8f49bf954aeed0f525ee2fc322 ] Now we have a helper function dquot_dirty() to determine if dquot has DQ_MOD_B bit. dquot_active() can easily be misunderstood as a helper function to determine if dquot has DQ_ACTIVE_B bit. So we avoid this by renaming it to inode_quota_active() and later on we will add the helper function dquot_active() to determine if dquot has DQ_ACTIVE_B bit. Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230630110822.3881712-3-libaokun1@huawei.com> Stable-dep-of: dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should provide") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19quota: factor out dquot_write_dquot()Baokun Li
[ Upstream commit 024128477809f8073d870307c8157b8826ebfd08 ] Refactor out dquot_write_dquot() to reduce duplicate code. Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230630110822.3881712-2-libaokun1@huawei.com> Stable-dep-of: dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should provide") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27quota: fix warning in dqgrab()Ye Bin
[ Upstream commit d6a95db3c7ad160bc16b89e36449705309b52bcb ] There's issue as follows when do fault injection: WARNING: CPU: 1 PID: 14870 at include/linux/quotaops.h:51 dquot_disable+0x13b7/0x18c0 Modules linked in: CPU: 1 PID: 14870 Comm: fsconfig Not tainted 6.3.0-next-20230505-00006-g5107a9c821af-dirty #541 RIP: 0010:dquot_disable+0x13b7/0x18c0 RSP: 0018:ffffc9000acc79e0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88825e41b980 RDX: 0000000000000000 RSI: ffff88825e41b980 RDI: 0000000000000002 RBP: ffff888179f68000 R08: ffffffff82087ca7 R09: 0000000000000000 R10: 0000000000000001 R11: ffffed102f3ed026 R12: ffff888179f68130 R13: ffff888179f68110 R14: dffffc0000000000 R15: ffff888179f68118 FS: 00007f450a073740(0000) GS:ffff88882fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe96f2efd8 CR3: 000000025c8ad000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> dquot_load_quota_sb+0xd53/0x1060 dquot_resume+0x172/0x230 ext4_reconfigure+0x1dc6/0x27b0 reconfigure_super+0x515/0xa90 __x64_sys_fsconfig+0xb19/0xd20 do_syscall_64+0x39/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd Above issue may happens as follows: ProcessA ProcessB ProcessC sys_fsconfig vfs_fsconfig_locked reconfigure_super ext4_remount dquot_suspend -> suspend all type quota sys_fsconfig vfs_fsconfig_locked reconfigure_super ext4_remount dquot_resume ret = dquot_load_quota_sb add_dquot_ref do_open -> open file O_RDWR vfs_open do_dentry_open get_write_access atomic_inc_unless_negative(&inode->i_writecount) ext4_file_open dquot_file_open dquot_initialize __dquot_initialize dqget atomic_inc(&dquot->dq_count); __dquot_initialize __dquot_initialize dqget if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) ext4_acquire_dquot -> Return error DQ_ACTIVE_B flag isn't set dquot_disable invalidate_dquots if (atomic_read(&dquot->dq_count)) dqgrab WARN_ON_ONCE(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) -> Trigger warning In the above scenario, 'dquot->dq_flags' has no DQ_ACTIVE_B is normal when dqgrab(). To solve above issue just replace the dqgrab() use in invalidate_dquots() with atomic_inc(&dquot->dq_count). Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230605140731.2427629-3-yebin10@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-27quota: Properly disable quotas when add_dquot_ref() failsJan Kara
[ Upstream commit 6a4e3363792e30177cc3965697e34ddcea8b900b ] When add_dquot_ref() fails (usually due to IO error or ENOMEM), we want to disable quotas we are trying to enable. However dquot_disable() call was passed just the flags we are enabling so in case flags == DQUOT_USAGE_ENABLED dquot_disable() call will just fail with EINVAL instead of properly disabling quotas. Fix the problem by always passing DQUOT_LIMITS_ENABLED | DQUOT_USAGE_ENABLED to dquot_disable() in this case. Reported-and-tested-by: Ye Bin <yebin10@huawei.com> Reported-by: syzbot+e633c79ceaecbf479854@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230605140731.2427629-2-yebin10@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-12ext4: fix bug_on in __es_tree_search caused by bad quota inodeBaokun Li
commit d323877484765aaacbb2769b06e355c2041ed115 upstream. We got a issue as fllows: ================================================================== kernel BUG at fs/ext4/extents_status.c:202! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 1 PID: 810 Comm: mount Not tainted 6.1.0-rc1-next-g9631525255e3 #352 RIP: 0010:__es_tree_search.isra.0+0xb8/0xe0 RSP: 0018:ffffc90001227900 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 0000000077512a0f RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000002a10 RDI: ffff8881004cd0c8 RBP: ffff888177512ac8 R08: 47ffffffffffffff R09: 0000000000000001 R10: 0000000000000001 R11: 00000000000679af R12: 0000000000002a10 R13: ffff888177512d88 R14: 0000000077512a10 R15: 0000000000000000 FS: 00007f4bd76dbc40(0000)GS:ffff88842fd00000(0000)knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005653bf993cf8 CR3: 000000017bfdf000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ext4_es_cache_extent+0xe2/0x210 ext4_cache_extents+0xd2/0x110 ext4_find_extent+0x5d5/0x8c0 ext4_ext_map_blocks+0x9c/0x1d30 ext4_map_blocks+0x431/0xa50 ext4_getblk+0x82/0x340 ext4_bread+0x14/0x110 ext4_quota_read+0xf0/0x180 v2_read_header+0x24/0x90 v2_check_quota_file+0x2f/0xa0 dquot_load_quota_sb+0x26c/0x760 dquot_load_quota_inode+0xa5/0x190 ext4_enable_quotas+0x14c/0x300 __ext4_fill_super+0x31cc/0x32c0 ext4_fill_super+0x115/0x2d0 get_tree_bdev+0x1d2/0x360 ext4_get_tree+0x19/0x30 vfs_get_tree+0x26/0xe0 path_mount+0x81d/0xfc0 do_mount+0x8d/0xc0 __x64_sys_mount+0xc0/0x160 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> ================================================================== Above issue may happen as follows: ------------------------------------- ext4_fill_super ext4_orphan_cleanup ext4_enable_quotas ext4_quota_enable ext4_iget --> get error inode <5> ext4_ext_check_inode --> Wrong imode makes it escape inspection make_bad_inode(inode) --> EXT4_BOOT_LOADER_INO set imode dquot_load_quota_inode vfs_setup_quota_inode --> check pass dquot_load_quota_sb v2_check_quota_file v2_read_header ext4_quota_read ext4_bread ext4_getblk ext4_map_blocks ext4_ext_map_blocks ext4_find_extent ext4_cache_extents ext4_es_cache_extent __es_tree_search.isra.0 ext4_es_end --> Wrong extents trigger BUG_ON In the above issue, s_usr_quota_inum is set to 5, but inode<5> contains incorrect imode and disordered extents. Because 5 is EXT4_BOOT_LOADER_INO, the ext4_ext_check_inode check in the ext4_iget function can be bypassed, finally, the extents that are not checked trigger the BUG_ON in the __es_tree_search function. To solve this issue, check whether the inode is bad_inode in vfs_setup_quota_inode(). Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20221026042310.3839669-2-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-22quota: Prevent memory allocation recursion while holding dq_lockMatthew Wilcox (Oracle)
[ Upstream commit 537e11cdc7a6b3ce94fa25ed41306193df9677b7 ] As described in commit 02117b8ae9c0 ("f2fs: Set GF_NOFS in read_cache_page_gfp while doing f2fs_quota_read"), we must not enter filesystem reclaim while holding the dq_lock. Prevent this more generally by using memalloc_nofs_save() while holding the lock. Link: https://lore.kernel.org/r/20220605143815.2330891-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-23quota: make dquot_quota_sync return errors from ->sync_fsDarrick J. Wong
[ Upstream commit dd5532a4994bfda0386eb2286ec00758cee08444 ] Strangely, dquot_quota_sync ignores the return code from the ->sync_fs call, which means that quotacalls like Q_SYNC never see the error. This doesn't seem right, so fix that. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-10quota: Use 'hlist_for_each_entry' to simplify codeChristophe JAILLET
Use 'hlist_for_each_entry' instead of hand writing it. This saves a few lines of code. Link: https://lore.kernel.org/r/f82d3e33964dcbd2aac19866735e0a8381c8a735.1619599407.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jan Kara <jack@suse.cz>
2020-12-09fs: quota: fix array-index-out-of-bounds bug by passing correct argument to ↵Anant Thazhemadam
vfs_cleanup_quota_inode() When dquot_resume() was last updated, the argument that got passed to vfs_cleanup_quota_inode was incorrectly set. If type = -1 and dquot_load_quota_sb() returns a negative value, then vfs_cleanup_quota_inode() gets called with -1 passed as an argument, and this leads to an array-index-out-of-bounds bug. Fix this issue by correctly passing the arguments. Fixes: ae45f07d47cc ("quota: Simplify dquot_resume()") Link: https://lore.kernel.org/r/20201208194338.7064-1-anant.thazhemadam@gmail.com Reported-by: syzbot+2643e825238d7aabb37f@syzkaller.appspotmail.com Tested-by: syzbot+2643e825238d7aabb37f@syzkaller.appspotmail.com CC: stable@vger.kernel.org Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2020-06-24block: move block-related definitions out of fs.hChristoph Hellwig
Move most of the block related definition out of fs.h into more suitable headers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-27sysctl: pass kernel pointers to ->proc_handlerChristoph Hellwig
Instead of having all the sysctl handlers deal with user pointers, which is rather hairy in terms of the BPF interaction, copy the input to and from userspace in common code. This also means that the strings are always NUL-terminated by the common code, making the API a little bit safer. As most handler just pass through the data to one of the common handlers a lot of the changes are mechnical. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-12-18fs: avoid softlockups in s_inodes iteratorsEric Sandeen
Anything that walks all inodes on sb->s_inodes list without rescheduling risks softlockups. Previous efforts were made in 2 functions, see: c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb() ac05fbb inode: don't softlockup when evicting inodes but there hasn't been an audit of all walkers, so do that now. This also consistently moves the cond_resched() calls to the bottom of each loop in cases where it already exists. One loop remains: remove_dquot_ref(), because I'm not quite sure how to deal with that one w/o taking the i_lock. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-12-06Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds
Pull vfs d_inode/d_flags memory ordering fixes from Al Viro: "Fallout from tree-wide audit for ->d_inode/->d_flags barriers use. Basically, the problem is that negative pinned dentries require careful treatment - unless ->d_lock is locked or parent is held at least shared, another thread can make them positive right under us. Most of the uses turned out to be safe - the main surprises as far as filesystems are concerned were - race in dget_parent() fastpath, that might end up with the caller observing the returned dentry _negative_, due to insufficient barriers. It is positive in memory, but we could end up seeing the wrong value of ->d_inode in CPU cache. Fixed. - manual checks that result of lookup_one_len_unlocked() is positive (and rejection of negatives). Again, insufficient barriers (we might end up with inconsistent observed values of ->d_inode and ->d_flags). Fixed by switching to a new primitive that does the checks itself and returns ERR_PTR(-ENOENT) instead of a negative dentry. That way we get rid of boilerplate converting negatives into ERR_PTR(-ENOENT) in the callers and have a single place to deal with the barrier-related mess - inside fs/namei.c rather than in every caller out there. The guts of pathname resolution *do* need to be careful - the race found by Ritesh is real, as well as several similar races. Fortunately, it turns out that we can take care of that with fairly local changes in there. The tree-wide audit had not been fun, and I hate the idea of repeating it. I think the right approach would be to annotate the places where we are _not_ guaranteed ->d_inode/->d_flags stability and have sparse catch regressions. But I'm still not sure what would be the least invasive way of doing that and it's clearly the next cycle fodder" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/namei.c: fix missing barriers when checking positivity fix dget_parent() fastpath race new helper: lookup_positive_unlocked() fs/namei.c: pull positivity check into follow_managed()
2019-11-15new helper: lookup_positive_unlocked()Al Viro
Most of the callers of lookup_one_len_unlocked() treat negatives are ERR_PTR(-ENOENT). Provide a helper that would do just that. Note that a pinned positive dentry remains positive - it's ->d_inode is stable, etc.; a pinned _negative_ dentry can become positive at any point as long as you are not holding its parent at least shared. So using lookup_one_len_unlocked() needs to be careful; lookup_positive_unlocked() is safer and that's what the callers end up open-coding anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-11-11fs/quota: handle overflows of sysctl fs.quota.* and report as unsigned longKonstantin Khlebnikov
Quota statistics counted as 64-bit per-cpu counter. Reading sums per-cpu fractions as signed 64-bit int, filters negative values and then reports lower half as signed 32-bit int. Result may looks like: fs.quota.allocated_dquots = 22327 fs.quota.cache_hits = -489852115 fs.quota.drops = -487288718 fs.quota.free_dquots = 22083 fs.quota.lookups = -486883485 fs.quota.reads = 22327 fs.quota.syncs = 335064 fs.quota.writes = 3088689 Values bigger than 2^31-1 reported as negative. All counters except "allocated_dquots" and "free_dquots" are monotonic, thus they should be reported as is without filtering negative values. Kernel doesn't have generic helper for 64-bit sysctl yet, let's use at least unsigned long. Link: https://lore.kernel.org/r/157337934693.2078.9842146413181153727.stgit@buzz Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jan Kara <jack@suse.cz>
2019-11-06Pull series refactoring quota enabling and disabling code.Jan Kara
2019-11-04quota: Handle quotas without quota inodes in dquot_get_state()Jan Kara
Make dquot_get_state() gracefully handle a situation when there are no quota files present even though quotas are enabled. Signed-off-by: Jan Kara <jack@suse.cz>
2019-11-04quota: Make dquot_disable() work without quota inodesJan Kara
Quota on and quota off are protected by s_umount semaphore held in exclusive mode since commit 7d6cd73d33b6 "quota: Hold s_umount in exclusive mode when enabling / disabling quotas". This makes it impossible for dquot_disable() to race with other enabling or disabling of quotas. Simplify the cleanup done by dquot_disable() based on this fact and also remove some stale comments. As a bonus this cleanup makes dquot_disable() properly handle a case when there are no quota inodes. Signed-off-by: Jan Kara <jack@suse.cz>
2019-11-04quota: Drop dquot_enable()Jan Kara
Now dquot_enable() has only two internal callers and both of them just need to update quota flags and don't need most of checks. Just drop dquot_enable() and fold necessary functionality into the two calling places. Signed-off-by: Jan Kara <jack@suse.cz>
2019-11-04quota: Rename vfs_load_quota_inode() to dquot_load_quota_inode()Jan Kara
Rename vfs_load_quota_inode() to dquot_load_quota_inode() to be consistent with naming of other functions used for enabling quota accounting from filesystems. Also export the function and add some sanity checks to assure filesystems are calling the function properly. Signed-off-by: Jan Kara <jack@suse.cz>
2019-11-04quota: Simplify dquot_resume()Jan Kara
We already have quota inode loaded when resuming quotas. Use vfs_load_quota() to avoid some pointless churn with the quota inode. Signed-off-by: Jan Kara <jack@suse.cz>
2019-11-04quota: Factor out setup of quota inodeJan Kara
Factor out setting up of quota inode and eventual error cleanup from vfs_load_quota_inode(). This will simplify situation for filesystems that don't have any quota inodes. Signed-off-by: Jan Kara <jack@suse.cz>
2019-10-31quota: Check that quota is not dirty before releaseDmitry Monakhov
There is a race window where quota was redirted once we drop dq_list_lock inside dqput(), but before we grab dquot->dq_lock inside dquot_release() TASK1 TASK2 (chowner) ->dqput() we_slept: spin_lock(&dq_list_lock) if (dquot_dirty(dquot)) { spin_unlock(&dq_list_lock); dquot->dq_sb->dq_op->write_dquot(dquot); goto we_slept if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) { spin_unlock(&dq_list_lock); dquot->dq_sb->dq_op->release_dquot(dquot); dqget() mark_dquot_dirty() dqput() goto we_slept; } So dquot dirty quota will be released by TASK1, but on next we_sleept loop we detect this and call ->write_dquot() for it. XFSTEST: https://github.com/dmonakhov/xfstests/commit/440a80d4cbb39e9234df4d7240aee1d551c36107 Link: https://lore.kernel.org/r/20191031103920.3919-2-dmonakhov@openvz.org CC: stable@vger.kernel.org Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru> Signed-off-by: Jan Kara <jack@suse.cz>
2019-10-31quota: fix livelock in dquot_writeback_dquotsDmitry Monakhov
Write only quotas which are dirty at entry. XFSTEST: https://github.com/dmonakhov/xfstests/commit/b10ad23566a5bf75832a6f500e1236084083cddc Link: https://lore.kernel.org/r/20191031103920.3919-1-dmonakhov@openvz.org CC: stable@vger.kernel.org Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru> Signed-off-by: Jan Kara <jack@suse.cz>
2019-10-04quota: code cleanup for hash bits calculationChengguang Xu
Code cleanup for hash bits calculation by calling ilog2(). Link: https://lore.kernel.org/r/20190923135223.27674-1-cgxu519@zoho.com.cn Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn> Signed-off-by: Jan Kara <jack@suse.cz>
2019-10-04quota: avoid increasing DQST_LOOKUPS when iterating over dirty/inuse listChengguang Xu
It is meaningless to increase DQST_LOOKUPS number while iterating over dirty/inuse list, so just avoid it. Link: https://lore.kernel.org/r/20190926083408.4269-1-cgxu519@zoho.com.cn Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn> Signed-off-by: Jan Kara <jack@suse.cz>
2019-07-31quota: fix condition for resetting time limit in do_set_dqblk()Chengguang Xu
We reset time limit when current usage is smaller or equal to soft limit in other place, so follow this rule in do_set_dqblk(). Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn> Link: https://lore.kernel.org/r/20190724053216.19392-1-cgxu519@zoho.com.cn Signed-off-by: Jan Kara <jack@suse.cz>
2019-07-10Merge tag 'for_v5.3-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, udf and quota updates from Jan Kara: - some ext2 fixes and cleanups - a fix of udf bug when extending files - a fix of quota Q_XGETQSTAT[V] handling * tag 'for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix incorrect final NOT_ALLOCATED (hole) extent length ext2: Use kmemdup rather than duplicating its implementation quota: honor quota type in Q_XGETQSTAT[V] calls ext2: Always brelse bh on failure in ext2_iget() ext2: add missing brelse() in ext2_iget() ext2: Fix a typo in ext2_getattr argument ext2: fix a typo in comment ext2: add missing brelse() in ext2_new_inode() ext2: optimize ext2_xattr_get() ext2: introduce new helper for xattr entry comparison ext2: merge xattr next entry check to ext2_xattr_entry_valid() ext2: code cleanup for ext2_preread_inode() ext2: code cleanup by using test_opt() and clear_opt() doc: ext2: update description of quota options for ext2 ext2: Strengthen xattr block checks ext2: Merge loops in ext2_xattr_set() ext2: introduce helper for xattr entry validation ext2: introduce helper for xattr header validation quota: add dqi_dirty_list description to comment of Dquot List Management
2019-06-19quota: fix a problem about transfer quotayangerkun
Run below script as root, dquot_add_space will return -EDQUOT since __dquot_transfer call dquot_add_space with flags=0, and dquot_add_space think it's a preallocation. Fix it by set flags as DQUOT_SPACE_WARN. mkfs.ext4 -O quota,project /dev/vdb mount -o prjquota /dev/vdb /mnt setquota -P 23 1 1 0 0 /dev/vdb dd if=/dev/zero of=/mnt/test-file bs=4K count=1 chattr -p 23 test-file Fixes: 7b9ca4c61bc2 ("quota: Reduce contention on dq_data_lock") Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-05-20quota: add dqi_dirty_list description to comment of Dquot List ManagementChengguang Xu
Actually there are four lists for dquot management, so add the description of dqui_dirty_list to comment. Signed-off-by: Chengguang Xu <cgxu519@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-04-30quota: check time limit when back out space/inode changeChengguang Xu
When we fail from allocating inode/space, we back out the change we already did. In a special case which has exceeded soft limit by the change, we should also check time limit and reset it properly. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-04-25fs/quota: erase unused but set variable warningJiang Biao
Local variable *reserved* of remove_dquot_ref() is only used if define CONFIG_QUOTA_DEBUG, but not ebraced in CONFIG_QUOTA_DEBUG macro, which leads to unused-but-set-variable warning when compiling. This patch ebrace it into CONFIG_QUOTA_DEBUG macro like what is done in add_dquot_ref(). Signed-off-by: Jiang Biao <benbjiang@tencent.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-04-25quota: fix wrong indentationChengguang Xu
We need to check return code only when calling ->read_dqblk(), so fix it properly. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jan Kara <jack@suse.cz>
2019-03-26quota: remove trailing whitespacesSascha Hauer
This removes all trailing whitespaces in fs/quota/. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Jan Kara <jack@suse.cz>
2019-03-26quota: code cleanup for __dquot_alloc_space()Chengguang Xu
Replace (flags & DQUOT_SPACE_RESERVE) with variable reserve. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20quota: Cleanup list iteration in dqcache_shrink_scan()Jan Kara
Use list_first_entry() and list_empty() instead of opencoded variants. Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-06-20quota: reclaim least recently used dquotsGreg Thelen
The dquots in the free_dquots list are not reclaimed in LRU way. put_dquot_last() puts entries to the tail and dqcache_shrink_scan() frees from the tail. Free unreferenced dquots in LRU order because it seems more reasonable than freeing most recently used. Signed-off-by: Greg Thelen <gthelen@google.com> Signed-off-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-04-09fs: quota: Replace GFP_ATOMIC with GFP_KERNEL in dquot_initJia-Ju Bai
dquot_init() is never called in atomic context. This function is only set as a parameter of fs_initcall(). Despite never getting called from atomic context, dquot_init() calls __get_free_pages() with GFP_ATOMIC, which waits busily for allocation. GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, to avoid busy waiting and improve the possibility of sucessful allocation. This is found by a static analysis tool named DCNS written by myself. And I also manually check it. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2017-11-29quota: Check for register_shrinker() failure.Tetsuo Handa
register_shrinker() might return -ENOMEM error since Linux 3.12. Call panic() as with other failure checks in this function if register_shrinker() failed. Fixes: 1d3d4437eae1 ("vmscan: per-node deferred work") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Jan Kara <jack@suse.com> Cc: Michal Hocko <mhocko@suse.com> Reviewed-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>
2017-11-28quota: propagate error from __dquot_initializeChao Yu
In commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()"), we have propagated error from __dquot_initialize to caller, but we forgot to handle such error in add_dquot_ref(), so, currently, during quota accounting information initialization flow, if we failed for some of inodes, we just ignore such error, and do account for others, which is not a good implementation. In this patch, we choose to let user be aware of such error, so after turning on quota successfully, we can make sure all inodes disk usage can be accounted, which will be more reasonable. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>
2017-11-14Merge branch 'for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, ext2, isofs and udf fixes from Jan Kara: - two small quota error handling fixes - two isofs fixes for architectures with signed char - several udf block number overflow and signedness fixes - ext2 rework of mount option handling to avoid GFP_KERNEL allocation with spinlock held - ... it also contains a patch to implement auditing of responses to fanotify permission events. That should have been in the fanotify pull request but I mistakenly merged that patch into a wrong branch and noticed only now at which point I don't think it's worth rebasing and redoing. * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: quota: be aware of error from dquot_initialize quota: fix potential infinite loop isofs: use unsigned char types consistently isofs: fix timestamps beyond 2027 udf: Fix some sign-conversion warnings udf: Fix signed/unsigned format specifiers udf: Fix 64-bit sign extension issues affecting blocks > 0x7FFFFFFF udf: Remove some outdate references from documentation udf: Avoid overflow when session starts at large offset ext2: Fix possible sleep in atomic during mount option parsing ext2: Parse mount options into a dedicated structure audit: Record fanotify access control decisions
2017-11-14Merge udf, isofs, quota, ext2 changes for 4.15-rc1.Jan Kara
2017-11-13quota: be aware of error from dquot_initializeChao Yu
Commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()") missed to handle error from dquot_initialize in dquot_file_open, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jan Kara <jack@suse.cz>