summaryrefslogtreecommitdiffstats
path: root/fs/f2fs
AgeCommit message (Collapse)Author
2013-06-17f2fs: add remount_fs callback supportNamjae Jeon
Add the f2fs_remount function call which will be used during the filesystem remounting. This function will help us to change the mount options specific to f2fs. Also modify the f2fs background_gc mount option, which will allow the user to dynamically trun on/off the garbage collection in f2fs based on the background_gc value. If background_gc=on, Garbage collection will be turned off & if background_gc=off, Garbage collection will be truned on. By default the garbage collection is on in f2fs. Change Log: v2: Incorporated the review comments by Gu Zheng. Removing the restore part for VFS flags Updating comments with proper flag conditions Display GC background option as ON/OFF Revised conditions to stop GC in case of remount v1: Initial changes for adding remount_fs callback support. Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: change /** with /* for the coding style] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-14f2fs: recover wrong pino after checkpoint during fsyncJaegeuk Kim
If a file is linked, f2fs loose its parent inode number so that fsync calls for the linked file should do checkpoint all the time. But, if we can recover its parent inode number after the checkpoint, we can adjust roll-forward mechanism for the further fsync calls, which is able to improve the fsync performance significatly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-14f2fs: optimize do_write_data_page()Haicheng Li
Since "need_inplace_update() == true" is a very rare case, using unlikely() to give compiler a chance to optimize the code. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-14f2fs: make locate_dirty_segment() as staticHaicheng Li
It's used only locally and could be static. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-14f2fs: remove unnecessary parameter "offset" from __add_sum_entry()Haicheng Li
We can get the value directly from pointer "curseg". Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-14f2fs: avoid freqeunt write_inode callsJaegeuk Kim
If update_inode is called, we don't need to do write_inode. So, let's use a *dirty* flag for each inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-14f2fs: optimise the truncate_data_blocks_range() rangeNamjae Jeon
The function truncate_data_blocks_range() decrements the valid block count of inode via dec_valid_block_count(). Since this function updates the i_blocks field of inode, we can update this field once we have calculated total the number of blocks to be freed. Therefore we can decrement valid blocks outside of the for loop. if (nr_free) { + dec_valid_block_count(sbi, dn->inode, nr_free); set_page_dirty(dn->node_page); sync_inode_page(dn); } 'nr_free' tells the total number of blocks freed. So, we can just directly pass this value to dec_valid_block_count() and update the i_blocks. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-14f2fs: use the F2FS specific flags in f2fs_ioctl()Namjae Jeon
In f2fs_ioctl() function, it is using generic flags. Since F2FS specific flags are defined. So lets use those flags. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-12f2fs: sync dir->i_size with its block allocationJaegeuk Kim
If new dentry block is allocated and its i_size is updated, we should update its inode block together in order to sync i_size and its block allocation. Otherwise, we can loose additional dentry block due to the unconsistent i_size. Errorneous Scenario ------------------- In the recovery routine, - recovery_dentry | - __f2fs_add_link | | - get_new_data_page | | | - i_size_write(new_i_size) | | | - mark_inode_dirty_sync(dir) | | - update_parent_metadata | | | - mark_inode_dirty(dir) | - write_checkpoint - sync_dirty_dir_inodes - filemap_flush(dentry_blocks) - f2fs_write_data_page - skip to write the last dentry block due to index < i_size In the above flow, new_i_size is not updated to its inode block so that the last dentry block will be lost accordingly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-11f2fs: fix i_blocks translation on various types of filesJaegeuk Kim
Basically an inode manages the number of allocated blocks with inode->i_blocks which is represented in a unit of sectors, not file system blocks. But, f2fs has used i_blocks in a unit of file system blocks, and f2fs_getattr translates it to the number of sectors when fstat is called. However, previously f2fs_file_inode_operations only has this, so this patch adds it to all the types of inode_operations. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-11f2fs: set sb->s_fs_info before calling parse_options()Gu Zheng
In f2fs_fill_super(), set sb->s_fs_info before calling parse_options(), then we can get f2fs_sb_info via F2FS_SB(sb) in parse_options(). So that the second argument "sbi" of func parse_options() is no longer needed. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-11f2fs: support xattr security labelsJaegeuk Kim
This patch adds the support of security labels for f2fs, which will be used by Linus Security Models (LSMs). Quote from http://en.wikipedia.org/wiki/Linux_Security_Modules: "Linux Security Modules (LSM) is a framework that allows the Linux kernel to support a variety of computer security models while avoiding favoritism toward any single security implementation. The framework is licensed under the terms of the GNU General Public License and is standard part of the Linux kernel since Linux 2.6. AppArmor, SELinux, Smack and TOMOYO Linux are the currently accepted modules in the official kernel.". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-07f2fs: fix iget/iput of dir during recoveryJaegeuk Kim
It is possible that iput is skipped after iget during the recovery. In recover_dentry(), dir = f2fs_iget(); ... if (de && inode->i_ino == le32_to_cpu(de->ino)) goto out; In this case, this dir is not able to be added in dirty_dir_inode_list. The actual linking is done only when set_page_dirty() is called. So let's add this newly got inode into the list explicitly, and put it at the end of the recovery routine. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-06f2fs: reorganise the function get_victim_by_defaultNamjae Jeon
Fix the function get_victim_by_default, where it checks for the condition that p.min_segno != NULL_SEGNO as shown: if (p.min_segno != NULL_SEGNO) goto got_it; and if above condition is true then got_it: if (p.min_segno != NULL_SEGNO) { So this condition is being checked twice. Hence move the goto statement after the if condition so that duplication of condition check is avoided. Also this function makes a call to get_max_cost() to compute the max cost based on the f2fs_sbi_info and victim policy. Since get_max_cost depends on on three parameters of victim_sel_policy => alloc_mode, gc_mode & ofs_unit, once this victim policy is initialised, these value will not change till the execution time of get_victim_by_default() & also f2fs_sbi_info structure parameters will not change. Hence making calls to get_max_cost() in while loop does not seems to be a good point. Instead we can call it once in begining and store the results in local variable, which later can serve our purpose for comparing the cost with max cost inside the while loop. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-06-03f2fs: handle errors from get_node_page callsJason Hrycay
Add check for error pointers returned from get_node_page in order to avoid dereferencing a bad address on the next use. Signed-off-by: Jason Hrycay <jason.hrycay@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: cover cp_file information with ilockJaegeuk Kim
If a file is linked with other files, it should be checkpointed at every fsync calls. For this, we use set_cp_file() with FADVISE_CP_BIT, but previously we didn't cover the flag by the global lock. This patch fixes that the inode page stores this correctly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix incorrect iputs during the dentry recoveryJaegeuk Kim
- iget/iput flow in the dentry recovery process 1. *dir* = f2fs_iget 2. set FI_DELAY_IPUT to *dir* 3. add *dir* to the dirty_dir_list - __f2fs_add_link - recover_dentry) 4. iput *dir* by remove_dirty_dir_inode - sync_dirty_dir_inodes - write_chekcpoint If *dir*'s i_count is not 1 (i.e., root dir), remove_dirty_dir_inode is called later and then iput is triggered again due to the FI_DELAY_IPUT flag. So, let's unset the flag properly once iput is triggered. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix dentry recovery routineJaegeuk Kim
The error scenario is: 1. create /a (1.a link /a /b) 2. sync 3. unlinke /a 4. create /a 5. fsync /a 6. Sudden power-off When the f2fs recovers the fsynced dentry, /a, we discover an exsiting dentry at f2fs_find_entry() in recover_dentry(). In such the case, we should unlink the existing dentry and its inode and then recover newly created dentry. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: iput only if whole data blocks are flushedJaegeuk Kim
If there remains some unwritten blocks from the recovery, we should not call iput on that directory inode. Otherwise, we can loose some dentry blocks after the recovery. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: return proper error from start_gc_threadNamjae Jeon
when there is an error from kthread_run, then return proper error rather than returning -ENOMEM. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: optimize several routines in node.hNamjae Jeon
There are various functions with common code which could be separated out to make common routines. So, made new routines and in order to retain the same call path and no major changes, written some macros to access those routines. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: remove unneeded initializations in f2fs_parent_dirNamjae Jeon
There is no need to initialize few pointers in f2fs_parent_dir as the values are not checked and instead directly initialized values are used. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: push some variables to debug partNamjae Jeon
Some, counters are needed only for the statistical information while debugging. So, those can be controlled using CONFIG_F2FS_STAT_FS, pushing the usage for few variables under this flag. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: align data types between on-disk and in-memory block addressesJaegeuk Kim
The on-disk block address is defined as __le32, but in-memory block address, block_t, does as u64. Let's synchronize them to 32 bits. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: dereferencing an ERR_PTRDan Carpenter
There is an error path where "dir" is an ERR_PTR. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: use iholdJaegeuk Kim
Use the following helper function committed by Al. commit 7de9c6ee3ecffd99e1628e81a5ea5468f7581a1f Author: Al Viro <viro@zeniv.linux.org.uk> Date: Sat Oct 23 11:11:40 2010 -0400 new helper: ihold() ... Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: should not make_bad_inode on f2fs_link failureJaegeuk Kim
If -ENOSPC is met during f2fs_link, we should not make the inode as bad. The inode is still alive. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix to handle do_recover_data errorsJaegeuk Kim
This patch adds error handling codes of check_index_in_prev_nodes and its caller, do_recover_data. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: reuse the locked dnode page and its inodeJaegeuk Kim
This patch fixes the following deadlock bug during the recovery. INFO: task mount:1322 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mount D ffffffff81125870 0 1322 1266 0x00000000 ffff8801207e39d8 0000000000000046 ffff88012ab1dee0 0000000000000046 ffff8801207e3a08 ffff880115903f40 ffff8801207e3fd8 ffff8801207e3fd8 ffff8801207e3fd8 ffff880115903f40 ffff8801207e39d8 ffff88012fc94520 Call Trace: [<ffffffff81125870>] ? __lock_page+0x70/0x70 [<ffffffff816a92d9>] schedule+0x29/0x70 [<ffffffff816a93af>] io_schedule+0x8f/0xd0 [<ffffffff8112587e>] sleep_on_page+0xe/0x20 [<ffffffff816a649a>] __wait_on_bit_lock+0x5a/0xc0 [<ffffffff81125867>] __lock_page+0x67/0x70 [<ffffffff8106c7b0>] ? autoremove_wake_function+0x40/0x40 [<ffffffff81126857>] find_lock_page+0x67/0x80 [<ffffffff8112698f>] find_or_create_page+0x3f/0xb0 [<ffffffffa03901a8>] ? sync_inode_page+0xa8/0xd0 [f2fs] [<ffffffffa038fdf7>] get_node_page+0x67/0x180 [f2fs] [<ffffffffa039818b>] recover_fsync_data+0xacb/0xff0 [f2fs] [<ffffffff816aaa1e>] ? _raw_spin_unlock+0x3e/0x40 [<ffffffffa0389634>] f2fs_fill_super+0x7d4/0x850 [f2fs] [<ffffffff81184cf9>] mount_bdev+0x1c9/0x210 [<ffffffffa0388e60>] ? validate_superblock+0x180/0x180 [f2fs] [<ffffffffa0387635>] f2fs_mount+0x15/0x20 [f2fs] [<ffffffff81185a13>] mount_fs+0x43/0x1b0 [<ffffffff81145ba0>] ? __alloc_percpu+0x10/0x20 [<ffffffff811a0796>] vfs_kern_mount+0x76/0x120 [<ffffffff811a2cb7>] do_mount+0x237/0xa10 [<ffffffff81140b9b>] ? strndup_user+0x5b/0x80 [<ffffffff811a3520>] SyS_mount+0x90/0xe0 [<ffffffff816b3502>] system_call_fastpath+0x16/0x1b The bug is triggered when check_index_in_prev_nodes tries to get the direct node page by calling get_node_page. At this point, if the direct node page is already locked by get_dnode_of_data, its caller, we got a deadlock condition. This patch adds additional condition check for the reuse of locked direct node pages prior to the get_node_page call. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix wrong condition checkJaegeuk Kim
While an orphan inode has zero link_count, f2fs_gc is able to select the inode for foreground gc. - f2fs_gc - do_garbage_collect - gc_data_segment : f2fs_iget is failed : get_valid_blocks() != 0, so that retry --> here we got the infinite loop. This patch resolved this issue. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: add f2fs_readonly()Jaegeuk Kim
Introduce a simple macro function for readability. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: avoid RECLAIM_FS-ON-W: deadlockJaegeuk Kim
This patch tries to avoid the following deadlock condition of which the reclaim path can trigger f2fs_balance_fs again. ================================= [ INFO: inconsistent lock state ] --------------------------------- inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. kswapd0/41 [HC0[0]:SC0[0]:HE1:SE1] takes: (&sbi->gc_mutex){+.+.?.}, at: f2fs_balance_fs+0xe6/0x100 [f2fs] {RECLAIM_FS-ON-W} state was registered at: [<ffffffff810aa5a9>] mark_held_locks+0xb9/0x140 [<ffffffff810aae85>] lockdep_trace_alloc+0x85/0xf0 [<ffffffff8113ab2c>] __alloc_pages_nodemask+0x7c/0x9b0 [<ffffffff81175aa8>] alloc_pages_current+0xb8/0x180 [<ffffffff811319cf>] __page_cache_alloc+0xaf/0xd0 [<ffffffff8113225c>] find_or_create_page+0x4c/0xb0 [<ffffffffa021359e>] find_data_page+0x14e/0x210 [f2fs] [<ffffffffa021161b>] f2fs_gc+0x9eb/0xd90 [f2fs] [<ffffffffa0218fae>] f2fs_balance_fs+0xee/0x100 [f2fs] [<ffffffffa020848c>] f2fs_setattr+0x6c/0x200 [f2fs] [<ffffffff811ae51b>] notify_change+0x1db/0x3a0 [<ffffffff8118fbd0>] do_truncate+0x60/0xa0 [<ffffffff8118fd95>] vfs_truncate+0x185/0x1b0 [<ffffffff8118fe1c>] do_sys_truncate+0x5c/0xa0 [<ffffffff8118ffee>] SyS_truncate+0xe/0x10 [<ffffffff816e2b42>] system_call_fastpath+0x16/0x1b Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: don't do checkpoint if error is occurredJaegeuk Kim
If we met an error during the dentry recovery, we should not conduct checkpoint. Otherwise, some errorneous dentry blocks overwrites the existing blocks that contain the remaining recovery information. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix to unlock page before exitJaegeuk Kim
If we got an error after lock_page, we should unlock it before exit. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: remove unnecessary kmap/kunmap operationsJaegeuk Kim
The allocated page used by the recovery is not on HIGHMEM, so that we don't need to use kmap/kunmap. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: reorganize f2fs_vm_page_mkwriteNamjae Jeon
Few things can be changed in the default mkwrite function 1) Make file_update_time at the start before acquiring any lock 2) the condition page_offset(page) >= i_size_read(inode) should be changed to page_offset(page) > i_size_read 3) Move wait_on_page_writeback. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: use list_for_each_entry rather than list_for_each_entry_safemajianpeng
We can do this, since now we use a global mutex, f2fs_stat_mutex to protect its list operations. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> [Jaegeuk Kim: add description] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: remove unecessary variable and codeHaicheng Li
Code cleanup without behavior changed. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs, lockdep: annotate mutex_lock_all()Peter Zijlstra
Majianpeng reported a lockdep splat for f2fs. It turns out mutex_lock_all() acquires an array of locks (in global/local lock style). Any such operation is always serialized using cp_mutex, therefore there is no fs_lock[] lock-order issue; tell lockdep about this using the mutex_lock_nest_lock() primitive. Reported-by: majianpeng <majianpeng@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: add debug msgs in the recovery routineJaegeuk Kim
This patch adds some trivial debugging messages in the recovery process. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: update inode page after creationJaegeuk Kim
I found a bug when testing power-off-recovery as follows. [Bug Scenario] 1. create a file 2. fsync the file 3. reboot w/o any sync 4. try to recover the file - found its fsync mark - found its dentry mark : try to recover its dentry - get its file name - get its parent inode number : here we got zero value The reason why we get the wrong parent inode number is that we didn't synchronize the inode page with its newly created inode information perfectly. Especially, previous f2fs stores fi->i_pino and writes it to the cached node page in a wrong order, which incurs the zero-valued i_pino during the recovery. So, this patch modifies the creation flow to fix the synchronization order of inode page with its inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: change get_new_data_page to pass a locked node pageJaegeuk Kim
This patch is for passing a locked node page to get_dnode_of_data. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: skip get_node_page if locked node page is passedJaegeuk Kim
If get_dnode_of_data gets a locked node page, let's skip redundant get_node_page calls. This is for the futher enhancement. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: remove unnecessary por_doing checkJaegeuk Kim
This por_doing check is totally not related to the recovery process. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix BUG_ON during f2fs_evict_inode(dir)Jaegeuk Kim
During the dentry recovery routine, recover_inode() triggers __f2fs_add_link with its directory inode. In the following scenario, a bug is captured. 1. dir = f2fs_iget(pino) 2. __f2fs_add_link(dir, name) 3. iput(dir) -> f2fs_evict_inode() faces with BUG_ON(atomic_read(fi->dirty_dents)) Kernel BUG at ffffffffa01c0676 [verbose debug info unavailable] [<ffffffffa01c0676>] f2fs_evict_inode+0x276/0x300 [f2fs] Call Trace: [<ffffffff8118ea00>] evict+0xb0/0x1b0 [<ffffffff8118f1c5>] iput+0x105/0x190 [<ffffffffa01d2dac>] recover_fsync_data+0x3bc/0x1070 [f2fs] [<ffffffff81692e8a>] ? io_schedule+0xaa/0xd0 [<ffffffff81690acb>] ? __wait_on_bit_lock+0x7b/0xc0 [<ffffffff8111a0e7>] ? __lock_page+0x67/0x70 [<ffffffff81165e21>] ? kmem_cache_alloc+0x31/0x140 [<ffffffff8118a502>] ? __d_instantiate+0x92/0xf0 [<ffffffff812a949b>] ? security_d_instantiate+0x1b/0x30 [<ffffffff8118a5b4>] ? d_instantiate+0x54/0x70 This means that we should flush all the dentry pages between iget and iput(). But, during the recovery routine, it is unallowed due to consistency, so we have to wait the whole recovery process. And then, write_checkpoint flushes all the dirty dentry blocks, and nicely we can put the stale dir inodes from the dirty_dir_inode_list. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix por_doing variable coverageJaegeuk Kim
The reason of using sbi->por_doing is to alleviate data writes during the recovery. The find_fsync_dnodes() produces some dirty dentry pages, so we should cover it too with sbi->por_doing. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: remove redundant assignmentJaegeuk Kim
We don't need to assign a value redundantly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix the inconsistent state of data pagesJaegeuk Kim
In get_lock_data_page, if there is a data race between get_dnode_of_data for node and grab_cache_page for data, f2fs is able to face with the following BUG_ON(dn.data_blkaddr == NEW_ADDR). kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/data.c:251! [<ffffffffa044966c>] get_lock_data_page+0x1ec/0x210 [f2fs] Call Trace: [<ffffffffa043b089>] f2fs_readdir+0x89/0x210 [f2fs] [<ffffffff811a0920>] ? fillonedir+0x100/0x100 [<ffffffff811a0920>] ? fillonedir+0x100/0x100 [<ffffffff811a07f8>] vfs_readdir+0xb8/0xe0 [<ffffffff811a0b4f>] sys_getdents+0x8f/0x110 [<ffffffff816d7999>] system_call_fastpath+0x16/0x1b This bug is able to be occurred when the block address of the data block is changed after f2fs_put_dnode(). In order to avoid that, this patch fixes the lock order of node and data blocks in which the node block lock is covered by the data block lock. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-28f2fs: fix inconsistency of block count during recoveryJaegeuk Kim
Currently f2fs recovers the dentry of fsynced files. When power-off-recovery is conducted, this newly recovered inode should increase node block count as well as inode block count. This patch resolves this inconsistency that results in: 1. create a file 2. write data 3. fsync 4. reboot without sync 5. mount and recover the file 6. node block count is 1 and inode block count is 2 : fall into the inconsistent state 7. unlink the file : trigger the following BUG_ON ------------[ cut here ]------------ kernel BUG at /home/zeus/f2fs_test/src/fs/f2fs/f2fs.h:716! Call Trace: [<ffffffffa0344100>] ? get_node_page+0x50/0x1a0 [f2fs] [<ffffffffa0344bfc>] remove_inode_page+0x8c/0x100 [f2fs] [<ffffffffa03380f0>] ? f2fs_evict_inode+0x180/0x2d0 [f2fs] [<ffffffffa033812e>] f2fs_evict_inode+0x1be/0x2d0 [f2fs] [<ffffffff811c7a67>] evict+0xa7/0x1a0 [<ffffffff811c82b5>] iput+0x105/0x190 [<ffffffff811c2b30>] d_kill+0xe0/0x120 [<ffffffff811c2c57>] dput+0xe7/0x1e0 [<ffffffff811acc3d>] __fput+0x19d/0x2d0 [<ffffffff811acd7e>] ____fput+0xe/0x10 [<ffffffff81070645>] task_work_run+0xb5/0xe0 [<ffffffff81002941>] do_notify_resume+0x71/0xb0 [<ffffffff8175f14a>] int_signal+0x12/0x17 Reported-and-Tested-by: Chris Fries <C.Fries@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-05-21mm: change invalidatepage prototype to accept lengthLukas Czerner
Currently there is no way to truncate partial page where the end truncate point is not at the end of the page. This is because it was not needed and the functionality was enough for file system truncate operation to work properly. However more file systems now support punch hole feature and it can benefit from mm supporting truncating page just up to the certain point. Specifically, with this functionality truncate_inode_pages_range() can be changed so it supports truncating partial page at the end of the range (currently it will BUG_ON() if 'end' is not at the end of the page). This commit changes the invalidatepage() address space operation prototype to accept range to be invalidated and update all the instances for it. We also change the block_invalidatepage() in the same way and actually make a use of the new length argument implementing range invalidation. Actual file system implementations will follow except the file systems where the changes are really simple and should not change the behaviour in any way .Implementation for truncate_page_range() which will be able to accept page unaligned ranges will follow as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com>