aboutsummaryrefslogtreecommitdiffstats
path: root/fs/f2fs/super.c
AgeCommit message (Collapse)Author
2017-07-27f2fs: sanity check checkpoint segno and blkoffJin Qian
commit 15d3042a937c13f5d9244241c7a9c8416ff6e82a upstream. Make sure segno and blkoff read from raw image are valid. Signed-off-by: Jin Qian <jinqian@google.com> [Jaegeuk Kim: adjust minor coding style] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-03f2fs: Make flush bios explicitely syncJan Kara
Commit b685d3d65ac7 "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...} definitions. generic_make_request_checks() however strips REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report volatile write cache and thus write effectively becomes asynchronous which can lead to performance regressions. Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: b685d3d65ac791406e0dfd8779cc9b3707fea5a3 Cc: stable@vger.kernel.org # 4.9+ CC: Jaegeuk Kim <jaegeuk@kernel.org> CC: linux-f2fs-devel@lists.sourceforge.net Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-03f2fs: introduce CP_TRIMMED_FLAG to avoid unneeded discardChao Yu
Introduce CP_TRIMMED_FLAG to indicate all invalid block were trimmed before umount, so once we do mount with image which contain the flag, we don't record invalid blocks as undiscard one, when fstrim is being triggered, we can avoid issuing redundant discard commands. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-02f2fs: sanity check segment countJin Qian
F2FS uses 4 bytes to represent block address. As a result, supported size of disk is 16 TB and it equals to 16 * 1024 * 1024 / 2 segments. Signed-off-by: Jin Qian <jinqian@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-10f2fs: clean up get_valid_blocks with consistent parameterJaegeuk Kim
This patch cleans up get_valid_blocks, which has no functional change. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-10f2fs: introduce f2fs_wait_discard_biosChao Yu
Split f2fs_wait_discard_bios from f2fs_wait_discard_bio, just for cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-05f2fs: avoid IO split due to mixed WB_SYNC_ALL and WB_SYNC_NONEJaegeuk Kim
If two threads try to flush dirty pages in different inodes respectively, f2fs_write_data_pages() will produce WRITE and WRITE_SYNC one at a time, resulting in a lot of 4KB seperated IOs. So, this patch gives higher priority to WB_SYNC_ALL IOs and gathers write IOs with a big WRITE_SYNC'ed bio. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-05f2fs: write small sized IO to hot logJaegeuk Kim
It would better split small and large IOs separately in order to get more consecutive big writes. The default threshold is set to 64KB, but configurable by sysfs/min_hot_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-29f2fs: allocate node and hot data in the beginning of partitionJaegeuk Kim
In order to give more spatial locality, this patch changes the block allocation policy which assigns beginning of partition for small and hot data/node blocks. In order to do this, we set noheap allocation by default and introduce another mount option, heap, to reset it back. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-25f2fs: allow write page cache when writting cpYunlei He
This patch allow write data to normal file when writting new checkpoint. We relax three limitations for write_begin path: 1. data allocation 2. node allocation 3. variables in checkpoint Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-21f2fs: add fault injection on f2fs_truncateJaegeuk Kim
Inject a fault during f2fs_truncate(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-21f2fs: build stat_info before orphan inode recoveryJaegeuk Kim
f2fs_sync_fs() -> write_checkpoint() calls stat_inc_cp_count(sbi->stat_info), which needs stat_info allocation. Otherwise, we can hit: [254042.598623] ? count_shadow_nodes+0xa0/0xa0 [254042.598633] f2fs_sync_fs+0x65/0xd0 [f2fs] [254042.598645] f2fs_balance_fs_bg+0xe4/0x1c0 [f2fs] [254042.598657] f2fs_write_node_pages+0x34/0x1a0 [f2fs] [254042.598664] ? pagevec_lookup_entries+0x1e/0x30 [254042.598673] do_writepages+0x1e/0x30 [254042.598682] __writeback_single_inode+0x45/0x330 [254042.598688] writeback_single_inode+0xd7/0x190 [254042.598694] write_inode_now+0x86/0xa0 [254042.598699] iput+0x122/0x200 [254042.598709] f2fs_fill_super+0xd4a/0x14d0 [f2fs] [254042.598717] mount_bdev+0x184/0x1c0 [254042.598934] ? f2fs_commit_super+0x100/0x100 [f2fs] [254042.599142] f2fs_mount+0x15/0x20 [f2fs] [254042.599349] mount_fs+0x39/0x160 [254042.599554] ? __alloc_percpu+0x15/0x20 [254042.599759] vfs_kern_mount+0x67/0x110 [254042.599972] do_mount+0x1bb/0xc80 [254042.600175] ? memdup_user+0x42/0x60 [254042.600380] SyS_mount+0x83/0xd0 [254042.600583] entry_SYSCALL_64_fastpath+0x1e/0xad Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-01Merge tag 'for-f2fs-4.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This round introduces several interesting features such as on-disk NAT bitmaps, IO alignment, and a discard thread. And it includes a couple of major bug fixes as below. Enhancements: - introduce on-disk bitmaps to avoid scanning NAT blocks when getting free nids - support IO alignment to prepare open-channel SSD integration in future - introduce a discard thread to avoid long latency during checkpoint and fstrim - use SSR for warm node and enable inline_xattr by default - introduce in-memory bitmaps to check FS consistency for debugging - improve write_begin by avoiding needless read IO Bug fixes: - fix broken zone_reset behavior for SMR drive - fix wrong victim selection policy during GC - fix missing behavior when preparing discard commands - fix bugs in atomic write support and fiemap - workaround to handle multiple f2fs_add_link calls having same name ... and it includes a bunch of clean-up patches as well" * tag 'for-f2fs-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (97 commits) f2fs: avoid to flush nat journal entries f2fs: avoid to issue redundant discard commands f2fs: fix a plint compile warning f2fs: add f2fs_drop_inode tracepoint f2fs: Fix zoned block device support f2fs: remove redundant set_page_dirty() f2fs: fix to enlarge size of write_io_dummy mempool f2fs: fix memory leak of write_io_dummy mempool during umount f2fs: fix to update F2FS_{CP_}WB_DATA count correctly f2fs: use MAX_FREE_NIDS for the free nids target f2fs: introduce free nid bitmap f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint f2fs: update the comment of default nr_pages to skipping f2fs: drop the duplicate pval in f2fs_getxattr f2fs: Don't update the xattr data that same as the exist f2fs: kill __is_extent_same f2fs: avoid bggc->fggc when enough free segments are avaliable after cp f2fs: select target segment with closer temperature in SSR mode f2fs: show simple call stack in fault injection message f2fs: no need lock_op in f2fs_write_inline_data ...
2017-02-27f2fs: add f2fs_drop_inode tracepointHou Pengyang
Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27f2fs: Fix zoned block device supportMasato Suzuki
The introduction of the multi-device feature partially broke the support for zoned block devices. In the function f2fs_scan_devices, sbi->devs allocation and initialization is skipped in the case of a single device mount. This result in no device information structure being allocated for the device. This is fine if the device is a regular device, but in the case of a zoned block device, the device zone type array is not initialized, which causes the function __f2fs_issue_discard_zone to fail as get_blkz_type is unable to determine the zone type of a section. Fix this by always allocating and initializing the sbi->devs device information array even in the case of a single device if that device is zoned. For this particular case, make sure to obtain a reference on the single device so that the call to blkdev_put() in destroy_device_list operates as expected. Fixes: 3c62be17d4f562f4 ("f2fs: support multiple devices") Cc: <stable@vger.kernel.org> # v4.10 Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com> Acked-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27f2fs: fix to enlarge size of write_io_dummy mempoolChao Yu
It needs to double cache size of write_io_dummy mempool, otherwise we may run out of cache in scenraio of Data/Node IOs were issued concurrently. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27f2fs: fix memory leak of write_io_dummy mempool during umountChao Yu
Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23f2fs: enable inline_xattr by defaultChao Yu
In android, since SElinux is enable, security policy will be appliedd for each file, it stores in inode as an xattr entry, so it will take one 4k size node block additionally for each file. Let's enable inline_xattr by default in order to save storage space. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23f2fs: introduce noinline_xattr mount optionChao Yu
This patch introduces new mount option 'noinline_xattr', so we can disable inline xattr functionality which is already set as a default mount option. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23f2fs: super: constify fscrypt_operations structureBhumika Goyal
Declare fscrypt_operations structure as const as it is only stored in the s_cop field of a super_block structure. This field is of type const, so fscrypt_operations structure having this property can be made const too. File size before: fs/f2fs/super.o text data bss dec hex filename 54131 31355 184 85670 14ea6 fs/f2fs/super.o File size after: fs/f2fs/super.o text data bss dec hex filename 54227 31259 184 85670 14ea6 fs/f2fs/super.o Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23f2fs: show checkpoint version at mount timeJaegeuk Kim
If we mounted f2fs successfully, let's show current checkpoint version. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-22f2fs: show the fault injection mount optionKaixu Xia
This patch shows the fault injection mount option in f2fs_show_options(). Signed-off-by: Kaixu Xia <xiakaixu@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-22f2fs: factor out discard command info into discard_cmd_controlJaegeuk Kim
This patch adds discard_cmd_control with the existing discarding controls. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-20Merge tag 'fscrypt-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt Pull fscrypt updates from Ted Ts'o: "Various cleanups for the file system encryption feature" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: fscrypt: constify struct fscrypt_operations fscrypt: properly declare on-stack completion fscrypt: split supp and notsupp declarations into their own headers fscrypt: remove redundant assignment of res fscrypt: make fscrypt_operations.key_prefix a string fscrypt: remove unused 'mode' member of fscrypt_ctx ext4: don't allow encrypted operations without keys fscrypt: make test_dummy_encryption require a keyring key fscrypt: factor out bio specific functions fscrypt: pass up error codes from ->get_context() fscrypt: remove user-triggerable warning messages fscrypt: use EEXIST when file already uses different policy fscrypt: use ENOTDIR when setting encryption policy on nondirectory fscrypt: use ENOKEY when file cannot be created w/o key
2017-02-08fscrypt: constify struct fscrypt_operationsEric Biggers
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Richard Weinberger <richard@nod.at>
2017-01-29f2fs: relax async discard commands moreJaegeuk Kim
This patch relaxes async discard commands to avoid waiting its end_io during checkpoint. Instead of waiting them during checkpoint, it will be done when actually reusing them. Test on initial partition of nvme drive. # time fstrim /mnt/test Before : 6.158s After : 4.822s Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-01-29f2fs: get io size bit from mount optionJaegeuk Kim
This patch adds to set io_size_bits from mount option. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-01-29f2fs: support IO alignment for DATA and NODE writesJaegeuk Kim
This patch implements IO alignment by filling dummy blocks in DATA and NODE write bios. If we can guarantee, for example, 32KB or 64KB for such the IOs, we can eliminate underlying dummy page problem which FTL conducts in order to close MLC or TLC partial written pages. Note that, - it requires "-o mode=lfs". - IO size should be power of 2, not exceed BIO_MAX_PAGES, 256. - read IO is still 4KB. - do checkpoint at fsync, if dummy NODE page was written. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-01-12block: Rename blk_queue_zone_size and bdev_zone_sizeDamien Le Moal
All block device data fields and functions returning a number of 512B sectors are by convention named xxx_sectors while names in the form xxx_size are generally used for a number of bytes. The blk_queue_zone_size and bdev_zone_size functions were not following this convention so rename them. No functional change is introduced by this patch. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Collapsed the two patches, they were nonsensically split and broke bisection. Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-08fscrypt: make fscrypt_operations.key_prefix a stringEric Biggers
There was an unnecessary amount of complexity around requesting the filesystem-specific key prefix. It was unclear why; perhaps it was envisioned that different instances of the same filesystem type could use different key prefixes, or that key prefixes could be binary. However, neither of those things were implemented or really make sense at all. So simplify the code by making key_prefix a const char *. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Richard Weinberger <richard@nod.at> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-14Merge tag 'for-f2fs-4.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch series contains several performance tuning patches regarding to the IO submission flow, in addition to supporting new features such as a ZBC-base drive and multiple devices. It also includes some major bug fixes such as: - checkpoint version control - fdatasync-related roll-forward recovery routine - memory boundary or null-pointer access in corner cases - missing error cases It has various minor clean-up patches as well" * tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits) f2fs: fix a missing size change in f2fs_setattr f2fs: fix to access nullified flush_cmd_control pointer f2fs: free meta pages if sanity check for ckpt is failed f2fs: detect wrong layout f2fs: call sync_fs when f2fs is idle Revert "f2fs: use percpu_counter for # of dirty pages in inode" f2fs: return AOP_WRITEPAGE_ACTIVATE for writepage f2fs: do not activate auto_recovery for fallocated i_size f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack f2fs: fix 32-bit build f2fs: set ->owner for debugfs status file's file_operations f2fs: fix incorrect free inode count in ->statfs f2fs: drop duplicate header timer.h f2fs: fix wrong AUTO_RECOVER condition f2fs: do not recover i_size if it's valid f2fs: fix fdatasync f2fs: fix to account total free nid correctly f2fs: fix an infinite loop when flush nodes in cp f2fs: don't wait writeback for datas during checkpoint f2fs: fix wrong written_valid_blocks counting ...
2016-12-07f2fs: fix to access nullified flush_cmd_control pointerJaegeuk Kim
f2fs_sync_file() remount_ro - f2fs_readonly - destroy_flush_cmd_control - f2fs_issue_flush - no fcc pointer! So, this patch doesn't free fcc in this case, but just stop its kernel thread which sends flush commands. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-07f2fs: detect wrong layoutJaegeuk Kim
Previous mkfs.f2fs allows small partition inappropriately, so f2fs should detect that as well. Refer this in f2fs-tools. mkfs.f2fs: detect small partition by overprovision ratio and # of segments Reported-and-Tested-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-05Revert "f2fs: use percpu_counter for # of dirty pages in inode"Jaegeuk Kim
This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76. The perpcu_counter doesn't provide atomicity in single core and consume more DRAM. That incurs fs_mark test failure due to ENOMEM. Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-25f2fs: fix incorrect free inode count in ->statfsChao Yu
While calculating inode count that we can create at most in the left space, we should consider space which data/node blocks occupied, since we create data/node mixly in main area. So fix the wrong calculation in ->statfs. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-25f2fs: support multiple devicesJaegeuk Kim
This patch implements multiple devices support for f2fs. Given multiple devices by mkfs.f2fs, f2fs shows them entirely as one big volume under one f2fs instance. Internal block management is very simple, but we will modify block allocation and background GC policy to boost IO speed by exploiting them accoording to each device speed. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: remove checkpoint in f2fs_freezeJaegeuk Kim
The generic freeze_super() calls sync_filesystems() before f2fs_freeze(). So, basically we don't need to do checkpoint in f2fs_freeze(). But, in xfs/068, it triggers circular locking problem below due to gc_mutex for checkpoint. ====================================================== [ INFO: possible circular locking dependency detected ] 4.9.0-rc1+ #132 Tainted: G OE ------------------------------------------------------- 1. wait for __sb_start_write() by [<ffffffff9845f353>] dump_stack+0x85/0xc2 [<ffffffff980e80bf>] print_circular_bug+0x1cf/0x230 [<ffffffff980eb4d0>] __lock_acquire+0x19e0/0x1bc0 [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220 [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs] [<ffffffff9826bdd0>] __sb_start_write+0x130/0x200 [<ffffffffc08c7c3b>] ? f2fs_drop_inode+0x9b/0x160 [f2fs] [<ffffffffc08c7c3b>] f2fs_drop_inode+0x9b/0x160 [f2fs] [<ffffffff98289991>] iput+0x171/0x2c0 [<ffffffffc08cfccf>] f2fs_sync_inode_meta+0x3f/0xf0 [f2fs] [<ffffffffc08cfe04>] block_operations+0x84/0x110 [f2fs] [<ffffffffc08cff78>] write_checkpoint+0xe8/0xf20 [f2fs] [<ffffffff980e979d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffff9803e9d9>] ? sched_clock+0x9/0x10 [<ffffffffc08c6de9>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc08c6df5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffff982a4f90>] ? do_fsync+0x70/0x70 [<ffffffff982a4f90>] ? do_fsync+0x70/0x70 [<ffffffff982a4fb0>] sync_fs_one_sb+0x20/0x30 [<ffffffff9826ca3e>] iterate_supers+0xae/0x100 [<ffffffff982a50b5>] sys_sync+0x55/0x90 [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6 2. wait for sbi->gc_mutex by [<ffffffff980ebdcb>] lock_acquire+0x11b/0x220 [<ffffffff989063d6>] mutex_lock_nested+0x76/0x3f0 [<ffffffffc08c6de9>] f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc08c7a6c>] f2fs_freeze+0x1c/0x20 [f2fs] [<ffffffff9826b6ef>] freeze_super+0xcf/0x190 [<ffffffff9827eebc>] do_vfs_ioctl+0x53c/0x6a0 [<ffffffff9827f099>] SyS_ioctl+0x79/0x90 [<ffffffff9890b345>] entry_SYSCALL_64_fastpath+0x23/0xc6 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Cache zoned block devices zone typeDamien Le Moal
With the zoned block device feature enabled, section discard need to do a zone reset for sections contained in sequential zones, and a regular discard (if supported) for sections stored in conventional zones. Avoid the need for a costly report zones to obtain a section zone type when discarding it by caching the types of the device zones in the super block information. This cache is initialized at mount time for mounts with the zoned block device feature enabled. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Do not allow adaptive mode for host-managed zoned block devicesDamien Le Moal
The LFS mode is mandatory for host-managed zoned block devices as update in place optimizations are not possible for segments in sequential zones. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Always enable discard for zoned blocks devicesDamien Le Moal
Zone write pointer reset acts as discard for zoned block devices. So if the zoned block device feature is enabled, always declare that discard is enabled, even if the device does not actually support the command. For the same reason, prevent the use the "nodicard" mount option. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Suppress discard warning message for zoned block devicesDamien Le Moal
For zoned block devices, discard is replaced by zone reset. So do not warn if the device does not supports discard. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Check zoned block feature for host-managed zoned block devicesDamien Le Moal
The F2FS_FEATURE_BLKZONED feature indicates that the drive was formatted with zone alignment optimization. This is optional for host-aware devices, but mandatory for host-managed zoned block devices. So check that the feature is set in this latter case. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Use generic zoned block device terminologyDamien Le Moal
SMR stands for "Shingled Magnetic Recording" which makes sense only for hard disk drives (spinning rust). The ZBC/ZAC standards enable management of SMR disks, but solid state drives may also support those standards. So rename the HMSMR feature to BLKZONED to avoid a HDD centric terminology. For the same reason, rename f2fs_sb_mounted_hmsmr to f2fs_sb_mounted_blkzoned. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: Add missing break in switch-caseDamien Le Moal
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: avoid infinite loop in the EIO case on recover_orphan_inodesJaegeuk Kim
This patch should fix an infinite loop case below. F2FS-fs : inject IO error in f2fs_read_end_io+0xf3/0x120 [f2fs] F2FS-fs (nvme0n1p1): recover_orphan_inode: orphan failed (ino=39ac1a), run fsck to fix. ... [<ffffffffc0b11ede>] sync_meta_pages+0xae/0x270 [f2fs] [<ffffffffc0b288dd>] ? flush_sit_entries+0x8d/0x960 [f2fs] [<ffffffffc0b13801>] write_checkpoint+0x361/0xf20 [f2fs] [<ffffffffb40e979d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc0b0a199>] ? f2fs_sync_fs+0x79/0x190 [f2fs] [<ffffffffc0b0a1a5>] f2fs_sync_fs+0x85/0x190 [f2fs] [<ffffffffc0b2560e>] f2fs_balance_fs_bg+0x7e/0x1c0 [f2fs] [<ffffffffc0b216c4>] f2fs_write_node_pages+0x34/0x320 [f2fs] [<ffffffffb41dff21>] do_writepages+0x21/0x30 [<ffffffffb429edb1>] __writeback_single_inode+0x61/0x760 [<ffffffffb490a937>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffb42a0805>] writeback_single_inode+0xd5/0x190 [<ffffffffb42a0959>] write_inode_now+0x99/0xc0 [<ffffffffb4289a16>] iput+0x1f6/0x2c0 [<ffffffffc0b0e3be>] f2fs_fill_super+0xe0e/0x1300 [f2fs] [<ffffffffb426c394>] ? sget_userns+0x4f4/0x530 [<ffffffffb426c692>] mount_bdev+0x182/0x1b0 [<ffffffffc0b0d5b0>] ? f2fs_commit_super+0x100/0x100 [f2fs] [<ffffffffc0b0a375>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffffb426d038>] mount_fs+0x38/0x170 [<ffffffffb428ec9b>] vfs_kern_mount+0x6b/0x160 [<ffffffffb4291d9e>] do_mount+0x1be/0xd60 [<ffffffffb4291a57>] ? copy_mount_options+0xb7/0x220 [<ffffffffb4292c54>] SyS_mount+0x94/0xd0 [<ffffffffb490b345>] entry_SYSCALL_64_fastpath+0x23/0xc6 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: remove percpu_count due to performance regressionJaegeuk Kim
This patch removes percpu_count usage due to performance regression in iozone. Fixes: 523be8a6b3 ("f2fs: use percpu_counter for page counters") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: keep dirty inodes selectively for checkpointJaegeuk Kim
This is to avoid no free segment bug during checkpoint caused by a number of dirty inodes. The case was reported by Chao like this. 1. mount with lazytime option 2. fill 4k file until disk is full 3. sync filesystem 4. read all files in the image 5. umount In this case, we actually don't need to flush dirty inode to inode page during checkpoint. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: fix to release discard entries during checkpointChao Yu
In f2fs_fill_super, if there is any IO error occurs during recovery, cached discard entries will be leaked, in order to avoid this, make write_checkpoint() handle memory release by itself, besides, move clear_prefree_segments to write_checkpoint for readability. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-01block,fs: use REQ_* flags directlyChristoph Hellwig
Remove the WRITE_* and READ_SYNC wrappers, and just use the flags directly. Where applicable this also drops usage of the bio_set_op_attrs wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-30f2fs: support checkpoint error injectionChao Yu
This patch adds to support checkpoint error injection in f2fs for testing fatal error tolerance, it will be useful that it can simulate abnormal power off by f2fs itself instead of calling godown ioctl by running apps. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>