aboutsummaryrefslogtreecommitdiffstats
path: root/fs/f2fs/segment.h
AgeCommit message (Collapse)Author
2018-09-19f2fs: do not set free of current sectionYunlong Song
[ Upstream commit 3611ce9911267cb93d364bd71ddea6821278d11f ] For the case when sbi->segs_per_sec > 1, take section:segment = 5 for example, if segment 1 is just used and allocate new segment 2, and the blocks of segment 1 is invalidated, at this time, the previous code will use __set_test_and_free to free the free_secmap and free_sections++, this is not correct since it is still a current section, so fix it. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-04f2fs: fix to update mtime correctlyChao Yu
If we change system time to the past, get_mtime() will return a overflowed time, and SIT_I(sbi)->max_mtime will be udpated incorrectly, this patch fixes the two issues. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: avoid stucking GC due to atomic writeChao Yu
f2fs doesn't allow abuse on atomic write class interface, so except limiting in-mem pages' total memory usage capacity, we need to limit atomic-write usage as well when filesystem is seriously fragmented, otherwise we may run into infinite loop during foreground GC because target blocks in victim segment are belong to atomic opened file for long time. Now, we will detect failure due to atomic write in foreground GC, if the count exceeds threshold, we will drop all atomic written data in cache, by this, I expect it can keep our system running safely to prevent Dos attack. In addition, his patch adds to show GC skip information in debugfs, now it just shows count of skipped caused by atomic write. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: clean up with is_valid_blkaddr()Chao Yu
- rename is_valid_blkaddr() to is_valid_meta_blkaddr() for readability. - introduce is_valid_blkaddr() for cleanup. No logic change in this patch. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31Revert "f2fs: add ovp valid_blocks check for bg gc victim to fg_gc"Chao Yu
For extreme case: 10 section, op = 10%, no_fggc_threshold = 90% All section usage: 85% 85% 85% 85% 90% 90% 95% 95% 95% 95% During foreground GC, if we skip select dirty section whose usage is larger than no_fggc_threshold, we can only recycle 80% invalid space from four 85% usage sections and two 90% usage sections, result in encountering out-of-space issue. This reverts commit e93b9865251a0503d83fd570e7d5a7c8bc351715 to fix this issue, besides, we keep the logic that we scan all dirty section when searching a victim, so that GC can select victim with least valid blocks. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: don't use GFP_ZERO for page cachesChao Yu
Related to https://lkml.org/lkml/2018/4/8/661 Sometimes, we need to write meta data to new allocated block address, then we will allocate a zeroed page in inner inode's address space, and fill partial data in it, and leave other place with zero value which means some fields are initial status. There are two inner inodes (meta inode and node inode) setting __GFP_ZERO, I have just checked them, for both of them, we can avoid using __GFP_ZERO, and do initialization by ourselves to avoid unneeded/redundant zeroing from mm. Cc: <stable@vger.kernel.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-18f2fs: check blkaddr more accuratly before issue a bioYunlei He
This patch check blkaddr more accuratly before issue a write or read bio. Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: add auto tuning for small devicesJaegeuk Kim
If f2fs is running on top of very small devices, it's worth to avoid abusing free LBAs. In order to achieve that, this patch introduces some parameter tuning. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-01-25f2fs: rebuild sit page from sit info in memYunlei He
This patch rebuild sit page from sit info in mem instead of issue a read io. I test this method and the result is as below: Pre: mmc_perf_test-12061 [001] ...1 976.819992: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [001] ...1 976.856446: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 998.976946: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 999.023269: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1022.060772: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1022.111034: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [002] ...1 1070.127643: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1070.187352: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1095.942124: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1095.995975: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1122.535091: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [003] ...1 1122.586521: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [001] ...1 1147.897487: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [001] ...1 1147.959438: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [003] ...1 1177.926951: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [002] ...1 1177.976823: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-12061 [002] ...1 1204.176087: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-12061 [002] ...1 1204.239046: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit Some sit flush consume more than 50ms. Now: mmc_perf_test-2187 [007] ...1 196.840684: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [007] ...1 196.841258: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [007] ...1 219.430582: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [007] ...1 219.431144: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [002] ...1 243.638678: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [000] ...1 243.638980: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [002] ...1 265.392180: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [002] ...1 265.392245: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [000] ...1 290.309051: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [000] ...1 290.309116: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [003] ...1 317.144209: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [003] ...1 317.145913: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [005] ...1 343.224954: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [005] ...1 343.225574: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [000] ...1 370.239846: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [000] ...1 370.241138: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [001] ...1 397.029043: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [001] ...1 397.030750: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit mmc_perf_test-2187 [003] ...1 425.386377: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = start flush sit mmc_perf_test-2187 [003] ...1 425.387735: f2fs_write_checkpoint: dev = (259,44), checkpoint for Sync, state = end flush sit Most sit flush consume no more than 1ms. Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-01-22f2fs: split need_inplace_updateChao Yu
This patch splits need_inplace_update to two functions: a. should_update_inplace() includes all conditions that we must use IPU. b. should_update_outplace() includes all conditions that we must use OPU. So that, in f2fs_ioc_set_pin_file() and f2fs_defragment_range(), we can use corresponding function to check whether we can trigger OPU/IPU or not. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-01-02f2fs: return error during fill_superJaegeuk Kim
Let's avoid BUG_ON during fill_super, when on-disk was totall corrupted. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05f2fs: check curseg space before foreground GCChao Yu
When we are closing to trigger foreground GC, if there are only a few of dirty metas, we can log these dirty metas in left space of opened segments instead of triggering foreground GC. With this patch, total count of foreground GC triggered by test/generic/* of fstest suit reduce from 254 to 184. So let's do the check before foreground GC anyway. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05f2fs: use rw_semaphore to protect SIT cacheChao Yu
There are some cases user didn't update SIT cache under this lock, so let's use rw_semaphore instead of mutex to enhance concurrently accessing. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: fix to correct no_fggc_candidateChao Yu
There may be extreme case as below: For one section contains one segment, and there are total 100 segments with 10% over-privision ratio in f2fs partition, fggc_threshold will be rounded down to 460 instead of 460.8 as below caclulation: sbi->fggc_threshold = div_u64((u64)(main_count - ovp_count) * BLKS_PER_SEC(sbi), (main_count - resv_count)); If section usage is as: 60 segments which contain 460 valid blocks 40 segments which contain 462 valid blocks As valid block number in all sections is large than fggc_threshold, so none of them will be chosen as candidate due to incorrect fggc_threshold. Let's just soften the term of choosing foreground GC candidates. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-10-26f2fs: split discard policyChao Yu
There are many different scenarios such as fstrim, umount, urgent or background where we will issue discards, actually, they need use different policy in aspect of io aware, discard granularity, delay interval and so on. But now they just share one common discard policy, so there will be race when changing policy in between these scenarios, the interference of changing discard policy will be very serious. This patch changes to split discard policy for different scenarios. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-11f2fs: speed up gc_urgent mode with SSRJaegeuk Kim
This patch activates SSR in gc_urgent mode. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-29f2fs: wake up discard_thread iff there is a candidateJaegeuk Kim
This patch fixes to avoid needless wake ups. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-21f2fs: remove unused function overprovision_sectionsYunlong Song
Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-08-15f2fs: use IPU for cold filesJaegeuk Kim
We expect cold files write data sequentially, but sometimes some of small data can be updated, which incurs fragmentation. Let's avoid that. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-07-10Merge tag 'for-f2fs-4.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've added new features such as disk quota and statx, and modified internal bio management flow to merge more IOs depending on block types. We've also made internal threads freezeable for Android battery life. In addition to them, there are some patches to avoid lock contention as well as a couple of deadlock conditions. Enhancements: - support usrquota, grpquota, and statx - manage DATA/NODE typed bios separately to serialize more IOs - modify f2fs_lock_op/wio_mutex to avoid lock contention - prevent lock contention in migratepage Bug fixes: - fix missing load of written inode flag - fix worst case victim selection in GC - freezeable GC and discard threads for Android battery life - sanitize f2fs metadata to deal with security hole - clean up sysfs-related code and docs" * tag 'for-f2fs-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (59 commits) f2fs: support plain user/group quota f2fs: avoid deadlock caused by lock order of page and lock_op f2fs: use spin_{,un}lock_irq{save,restore} f2fs: relax migratepage for atomic written page f2fs: don't count inode block in in-memory inode.i_blocks Revert "f2fs: fix to clean previous mount option when remount_fs" f2fs: do not set LOST_PINO for renamed dir f2fs: do not set LOST_PINO for newly created dir f2fs: skip ->writepages for {mete,node}_inode during recovery f2fs: introduce __check_sit_bitmap f2fs: stop gc/discard thread in prior during umount f2fs: introduce reserved_blocks in sysfs f2fs: avoid redundant f2fs_flush after remount f2fs: report # of free inodes more precisely f2fs: add ioctl to do gc with target block address f2fs: don't need to check encrypted inode for partial truncation f2fs: measure inode.i_blocks as generic filesystem f2fs: set CP_TRIMMED_FLAG correctly f2fs: require key for truncate(2) of encrypted file f2fs: move sysfs code from super.c to fs/f2fs/sysfs.c ...
2017-05-23f2fs: split bio cacheJaegeuk Kim
Split DATA/NODE type bio cache according to different temperature, so write IOs with the same temperature can be merged in corresponding bio cache as much as possible, otherwise, different temperature write IOs submitting into one bio cache will always cause split of bio. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-05-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more updates from Andrew Morton: - the rest of MM - various misc things - procfs updates - lib/ updates - checkpatch updates - kdump/kexec updates - add kvmalloc helpers, use them - time helper updates for Y2038 issues. We're almost ready to remove current_fs_time() but that awaits a btrfs merge. - add tracepoints to DAX * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits) drivers/staging/ccree/ssi_hash.c: fix build with gcc-4.4.4 selftests/vm: add a test for virtual address range mapping dax: add tracepoint to dax_insert_mapping() dax: add tracepoint to dax_writeback_one() dax: add tracepoints to dax_writeback_mapping_range() dax: add tracepoints to dax_load_hole() dax: add tracepoints to dax_pfn_mkwrite() dax: add tracepoints to dax_iomap_pte_fault() mtd: nand: nandsim: convert to memalloc_noreclaim_*() treewide: convert PF_MEMALLOC manipulations to new helpers mm: introduce memalloc_noreclaim_{save,restore} mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC mm/huge_memory.c: deposit a pgtable for DAX PMD faults when required mm/huge_memory.c: use zap_deposited_table() more time: delete CURRENT_TIME_SEC and CURRENT_TIME gfs2: replace CURRENT_TIME with current_time apparmorfs: replace CURRENT_TIME with current_time() lustre: replace CURRENT_TIME macro fs: ubifs: replace CURRENT_TIME_SEC with current_time fs: ufs: use ktime_get_real_ts64() for birthtime ...
2017-05-08fs: f2fs: use ktime_get_real_seconds for sit_info timesDeepa Dinamani
CURRENT_TIME_SEC is not y2038 safe. Replace use of CURRENT_TIME_SEC with ktime_get_real_seconds in segment timestamps used by GC algorithm including the segment mtime timestamps. Link: http://lkml.kernel.org/r/1491613030-11599-2-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Chao Yu <yuchao0@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-05-02f2fs: reconstruct code to write a data pageHou Pengyang
This patch introduces encrypt_one_page which encrypts one data page before submit_bio, and change the use of need_inplace_update. Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-24f2fs: skip encrypted inode in ASYNC IPU policyHou Pengyang
Async request may be throttled in block layer, so page for async may keep WRITE_BACK for a long time. For encrytped inode, we need wait on page writeback no matter if the device supports BDI_CAP_STABLE_WRITES. This may result in a higher waiting page writeback time for async encrypted inode page. This patch skips IPU for encrypted inode's updating write. Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-24f2fs: add ioctl to flush data from faster device to cold areaJaegeuk Kim
This patch adds an ioctl to flush data in faster device to cold area. User can give device number and number of segments to move. It doesn't move it if there is only one device. The parameter looks like: struct f2fs_flush_device { u32 dev_num; /* device number to flush */ u32 segments; /* # of segments to flush */ }; Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-19f2fs: introduce async IPU policyHou Pengyang
This patch introduces an ASYNC IPU policy. Under senario of large # of async updating(e.g. log writing in Android), disk would be seriously fragmented, and higher frequent gc would be triggered. This patch uses IPU to rewrite the async update writting, since async is NOT sensitive to io latency. Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
2017-04-10f2fs: clean up some macros in terms of GET_SEGNOJaegeuk Kim
This patch cleans several macros by introducing: - BLKS_PER_SEC - GET_SEC_FROM_SEG - GET_SEG_FROM_SEC - GET_ZONE_FROM_SEC - GET_ZONE_FROM_SEG Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-10f2fs: clean up get_valid_blocks with consistent parameterJaegeuk Kim
This patch cleans up get_valid_blocks, which has no functional change. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-10f2fs: use segment number for get_valid_blocksJaegeuk Kim
This patch fixes to submit a segment number for get_valid_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-10f2fs: guard macro variables with bracesTomohiro Kusumi
Add braces around variables used within macros for those make sense to do it. Many of the macros in f2fs already do this. What this commit doesn't do is anything that changes line# as a result of adding braces, which usually affects the binary via __LINE__. Confirmed no diff in fs/f2fs/f2fs.ko before/after this commit on x86_64, to make sure this has no functional change as well as there's been no unexpected side effect due to callers' arithmetics within the existing code. Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-04-05f2fs: write small sized IO to hot logJaegeuk Kim
It would better split small and large IOs separately in order to get more consecutive big writes. The default threshold is set to 64KB, but configurable by sysfs/min_hot_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-29f2fs: start SSR much eariler to avoid FG_GCJaegeuk Kim
This patch initiates SSR much eariler, resulting in less FG_GC. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27f2fs: update the comment of default nr_pages to skippingKinglong Mee
Fixes: 2c237ebaa4 ("f2fs: avoid writing node/metapages during writes") Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23f2fs: add ovp valid_blocks check for bg gc victim to fg_gcHou Pengyang
For foreground gc, greedy algorithm should be adapted, which makes this formula work well: (2 * (100 / config.overprovision + 1) + 6) But currently, we fg_gc have a prior to select bg_gc victim segments to gc first, these victims are selected by cost-benefit algorithm, we can't guarantee such segments have the small valid blocks, which may destroy the f2fs rule, on the worstest case, would consume all the free segments. This patch fix this by add a filter in check_bg_victims, if segment's has # of valid blocks over overprovision ratio, skip such segments. Cc: <stable@vger.kernel.org> Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-22f2fs: check in-memory sit version bitmapChao Yu
This patch adds a mirror for sit version bitmap, and use it to detect in-memory bitmap corruption which may be caused by bit-transition of cache or memory overflow. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-22f2fs: check in-memory block bitmapChao Yu
This patch adds a mirror for valid block bitmap, and use it to detect in-memory bitmap corruption which may be caused by bit-transition of cache or memory overflow. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-01-29f2fs: support IO alignment for DATA and NODE writesJaegeuk Kim
This patch implements IO alignment by filling dummy blocks in DATA and NODE write bios. If we can guarantee, for example, 32KB or 64KB for such the IOs, we can eliminate underlying dummy page problem which FTL conducts in order to close MLC or TLC partial written pages. Note that, - it requires "-o mode=lfs". - IO size should be power of 2, not exceed BIO_MAX_PAGES, 256. - read IO is still 4KB. - do checkpoint at fsync, if dummy NODE page was written. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-07f2fs: detect wrong layoutJaegeuk Kim
Previous mkfs.f2fs allows small partition inappropriately, so f2fs should detect that as well. Refer this in f2fs-tools. mkfs.f2fs: detect small partition by overprovision ratio and # of segments Reported-and-Tested-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: use BIO_MAX_PAGES for bio allocationJaegeuk Kim
We don't need to allocate bio partially in order to maximize sequential writes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23f2fs: count dirty inodes to flush node pages during checkpointJaegeuk Kim
If there are a lot of dirty inodes, we need to flush all of them when doing checkpoint. So, we need to count this for enough free space. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-12f2fs: check free_sections for defragmentationJaegeuk Kim
Fix wrong condition check for defragmentation of a file. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-24f2fs: not allow to write illegal blkaddrYunlei He
we came across an error as below: [build_nat_area_bitmap:1710] nid[0x 1718] addr[0x 1c18ddc] ino[0x 1718] [build_nat_area_bitmap:1710] nid[0x 1719] addr[0x 1c193d5] ino[0x 1719] [build_nat_area_bitmap:1710] nid[0x 171a] addr[0x 1c1736e] ino[0x 171a] [build_nat_area_bitmap:1710] nid[0x 171b] addr[0x 58b3ee8f] ino[0x815f92ed] [build_nat_area_bitmap:1710] nid[0x 171c] addr[0x fcdc94b] ino[0x49366377] [build_nat_area_bitmap:1710] nid[0x 171d] addr[0x 7cd2facf] ino[0xb3c55300] [build_nat_area_bitmap:1710] nid[0x 171e] addr[0x bd4e25d0] ino[0x77c34c09] ... ... [build_nat_area_bitmap:1710] nid[0x 1718] addr[0x 1c18ddc] ino[0x 1718] [build_nat_area_bitmap:1710] nid[0x 1719] addr[0x 1c193d5] ino[0x 1719] [build_nat_area_bitmap:1710] nid[0x 171a] addr[0x 1c1736e] ino[0x 171a] [build_nat_area_bitmap:1710] nid[0x 171b] addr[0x 58b3ee8f] ino[0x815f92ed] [build_nat_area_bitmap:1710] nid[0x 171c] addr[0x fcdc94b] ino[0x49366377] [build_nat_area_bitmap:1710] nid[0x 171d] addr[0x 7cd2facf] ino[0xb3c55300] [build_nat_area_bitmap:1710] nid[0x 171e] addr[0x bd4e25d0] ino[0x77c34c09] One nat block may be stepped by a data block, so this patch forbid to write if the blkaddr is illegal Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-15f2fs: add maximum prefree segmentsJaegeuk Kim
In 1TB storage, we need to admit 22841 prefree segments, which can consume too much segments. This patch sets 8GB in max. prefree segments in that case. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06f2fs: avoid writing node/metapages during writesJaegeuk Kim
Let's keep more node/meta pages in run time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-13f2fs: introduce mode=lfs mount optionJaegeuk Kim
This mount option is to enable original log-structured filesystem forcefully. So, there should be no random writes for main area. Especially, this supports host-managed SMR device. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02f2fs: do not skip writing data pagesJaegeuk Kim
For data pages, let's try to flush as much as possible in background. On /dev/pmem0, 1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync Before : 800 MB/s After : 1.1 GB/s 2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 Before : 1.3 GB/s After : 2.2 GB/s Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02f2fs: flush inode metadata when checkpoint is doingJaegeuk Kim
This patch registers all the inodes which have dirty metadata to sync when checkpoint is doing. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02f2fs: use inode pointer for {set, clear}_inode_flagJaegeuk Kim
This patch refactors to use inode pointer for set_inode_flag and clear_inode_flag. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-07f2fs: shrink size of struct seg_entryChao Yu
Restructure struct seg_entry to eliminate holes in it, after that, in 32-bits machine, it reduces size from 32 bytes to 24 bytes; in 64-bits machine, it reduces size from 56 bytes to 40 bytes. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>