~shefty/rdma-dev.git
9 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Fri, 21 Dec 2012 02:14:31 +0000 (18:14 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull VFS update from Al Viro:
 "fscache fixes, ESTALE patchset, vmtruncate removal series, assorted
  misc stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (79 commits)
  vfs: make lremovexattr retry once on ESTALE error
  vfs: make removexattr retry once on ESTALE
  vfs: make llistxattr retry once on ESTALE error
  vfs: make listxattr retry once on ESTALE error
  vfs: make lgetxattr retry once on ESTALE
  vfs: make getxattr retry once on an ESTALE error
  vfs: allow lsetxattr() to retry once on ESTALE errors
  vfs: allow setxattr to retry once on ESTALE errors
  vfs: allow utimensat() calls to retry once on an ESTALE error
  vfs: fix user_statfs to retry once on ESTALE errors
  vfs: make fchownat retry once on ESTALE errors
  vfs: make fchmodat retry once on ESTALE errors
  vfs: have chroot retry once on ESTALE error
  vfs: have chdir retry lookup and call once on ESTALE error
  vfs: have faccessat retry once on an ESTALE error
  vfs: have do_sys_truncate retry once on an ESTALE error
  vfs: fix renameat to retry on ESTALE errors
  vfs: make do_unlinkat retry once on ESTALE errors
  vfs: make do_rmdir retry once on ESTALE errors
  vfs: add a flags argument to user_path_parent
  ...

9 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Linus Torvalds [Fri, 21 Dec 2012 02:05:28 +0000 (18:05 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/signal

Pull signal handling cleanups from Al Viro:
 "sigaltstack infrastructure + conversion for x86, alpha and um,
  COMPAT_SYSCALL_DEFINE infrastructure.

  Note that there are several conflicts between "unify
  SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline;
  resolution is trivial - just remove definitions of SS_ONSTACK and
  SS_DISABLED from arch/*/uapi/asm/signal.h; they are all identical and
  include/uapi/linux/signal.h contains the unified variant."

Fixed up conflicts as per Al.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  alpha: switch to generic sigaltstack
  new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those
  generic compat_sys_sigaltstack()
  introduce generic sys_sigaltstack(), switch x86 and um to it
  new helper: compat_user_stack_pointer()
  new helper: restore_altstack()
  unify SS_ONSTACK/SS_DISABLE definitions
  new helper: current_user_stack_pointer()
  missing user_stack_pointer() instances
  Bury the conditionals from kernel_thread/kernel_execve series
  COMPAT_SYSCALL_DEFINE: infrastructure

9 years agoMerge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Linus Torvalds [Fri, 21 Dec 2012 01:56:23 +0000 (17:56 -0800)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM fixes from Russell King:
 "A number of smallish fixes scattered around the ARM code.  Probably
  the most serious one is the one from Al addressing the missing locking
  in the swap emulation code."

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7607/1: realview: fix private peripheral memory base for EB rev. B boards
  ARM: 7606/1: cache: flush to LoUU instead of LoUIS on uniprocessor CPUs
  ARM: missing ->mmap_sem around find_vma() in swp_emulate.c
  ARM: 7605/1: vmlinux.lds: Move .notes section next to the rodata
  ARM: 7602/1: Pass real "__machine_arch_type" variable to setup_machine_tags() procedure
  ARM: 7600/1: include CONFIG_DEBUG_LL_INCLUDE rather than mach/debug-macro.S

9 years agoMerge tag 'fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Fri, 21 Dec 2012 01:55:34 +0000 (17:55 -0800)]
Merge tag 'fixes2' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes part 2 from Olof Johansson:
 "Here are a few more fixes for 3.8.  Two branches of fixes for Samsung
  platforms, including fixes for the audio build errors on all non-DT
  platforms.  There's also a fixup to the sunxi device-tree file renames
  due to a bad patch application by me, and a fix for OMAP due to
  function renames merged through the powerpc tree."

* tag 'fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: OMAP2+: Fix compillation error in mach-omap2/timer.c
  ARM: sunxi: rename device tree source files
  ARM: EXYNOS: Avoid passing the clks through platform data
  ARM: S5PV210: Avoid passing the clks through platform data
  ARM: S5P64X0: Add I2S clkdev support
  ARM: S5PC100: Add I2S clkdev support
  ARM: S3C64XX: Add I2S clkdev support
  ARM: EXYNOS: Fix MSHC clocks instance names
  ARM: EXYNOS: Fix NULL pointer dereference bug in SMDKV310
  ARM: EXYNOS: Fix NULL pointer dereference bug in SMDK4X12
  ARM: EXYNOS: Fix NULL pointer dereference bug in Origen
  ARM: SAMSUNG: Add missing include guard to gpio-core.h
  pinctrl: exynos5440/samsung: Staticize pcfgs
  pinctrl: samsung: Fix a typo in pinctrl-samsung.h
  ARM: EXYNOS: fix skip scu_enable() for EXYNOS5440
  ARM: EXYNOS: fix GIC using for EXYNOS5440
  ARM: EXYNOS: fix build error when MFC is not selected

9 years agoMerge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds [Fri, 21 Dec 2012 01:52:06 +0000 (17:52 -0800)]
Merge branch 'misc' of git://git./linux/kernel/git/mmarek/kbuild

Pull kbuild misc changes from Michal Marek:
 "This is the non-critical part of kbuild

   - scripts/kernel-doc requires a "Return:" section for non-void
     functions
   - ARCH=arm SUBARCH=... support for make tags
   - COMPILED_SOURCE=1 support for make tags (only indexes .c files for
     which a .o exists)
   - New coccinelle check
   - Option parsing fix for scripts/config"

* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  scripts/config: Fix wrong "shift" for --keep-case
  scripts/tags.sh: Support compiled source
  scripts/tags.sh: Support subarch for ARM
  scripts/coccinelle/misc/warn.cocci: use WARN
  scripts/kernel-doc: check that non-void fcts describe their return value
  Kernel-doc: Convention: Use a "Return" section to describe return values

9 years agovfs: make lremovexattr retry once on ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:18 +0000 (12:10 -0500)]
vfs: make lremovexattr retry once on ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make removexattr retry once on ESTALE
Jeff Layton [Tue, 11 Dec 2012 17:10:17 +0000 (12:10 -0500)]
vfs: make removexattr retry once on ESTALE

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make llistxattr retry once on ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:17 +0000 (12:10 -0500)]
vfs: make llistxattr retry once on ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make listxattr retry once on ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:16 +0000 (12:10 -0500)]
vfs: make listxattr retry once on ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make lgetxattr retry once on ESTALE
Jeff Layton [Tue, 11 Dec 2012 17:10:16 +0000 (12:10 -0500)]
vfs: make lgetxattr retry once on ESTALE

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make getxattr retry once on an ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:16 +0000 (12:10 -0500)]
vfs: make getxattr retry once on an ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: allow lsetxattr() to retry once on ESTALE errors
Jeff Layton [Tue, 11 Dec 2012 17:10:15 +0000 (12:10 -0500)]
vfs: allow lsetxattr() to retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: allow setxattr to retry once on ESTALE errors
Jeff Layton [Tue, 11 Dec 2012 17:10:15 +0000 (12:10 -0500)]
vfs: allow setxattr to retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: allow utimensat() calls to retry once on an ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:14 +0000 (12:10 -0500)]
vfs: allow utimensat() calls to retry once on an ESTALE error

Clearly, we can't handle the NULL filename case, but we can deal with
the case where there's a real pathname.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: fix user_statfs to retry once on ESTALE errors
Jeff Layton [Tue, 11 Dec 2012 17:10:14 +0000 (12:10 -0500)]
vfs: fix user_statfs to retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make fchownat retry once on ESTALE errors
Jeff Layton [Tue, 11 Dec 2012 17:10:13 +0000 (12:10 -0500)]
vfs: make fchownat retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make fchmodat retry once on ESTALE errors
Jeff Layton [Tue, 11 Dec 2012 17:10:13 +0000 (12:10 -0500)]
vfs: make fchmodat retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: have chroot retry once on ESTALE error
Jeff Layton [Thu, 20 Dec 2012 22:08:32 +0000 (17:08 -0500)]
vfs: have chroot retry once on ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: have chdir retry lookup and call once on ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:12 +0000 (12:10 -0500)]
vfs: have chdir retry lookup and call once on ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: have faccessat retry once on an ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:11 +0000 (12:10 -0500)]
vfs: have faccessat retry once on an ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: have do_sys_truncate retry once on an ESTALE error
Jeff Layton [Tue, 11 Dec 2012 17:10:11 +0000 (12:10 -0500)]
vfs: have do_sys_truncate retry once on an ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: fix renameat to retry on ESTALE errors
Jeff Layton [Tue, 11 Dec 2012 17:10:10 +0000 (12:10 -0500)]
vfs: fix renameat to retry on ESTALE errors

...as always, rename is the messiest of the bunch. We have to track
whether to retry or not via a separate flag since the error handling
is already quite complex.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make do_unlinkat retry once on ESTALE errors
Jeff Layton [Thu, 20 Dec 2012 21:38:04 +0000 (16:38 -0500)]
vfs: make do_unlinkat retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make do_rmdir retry once on ESTALE errors
Jeff Layton [Thu, 20 Dec 2012 21:28:33 +0000 (16:28 -0500)]
vfs: make do_rmdir retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: add a flags argument to user_path_parent
Jeff Layton [Tue, 11 Dec 2012 17:10:09 +0000 (12:10 -0500)]
vfs: add a flags argument to user_path_parent

...so we can pass in LOOKUP_REVAL. For now, nothing does yet.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: fix linkat to retry once on ESTALE errors
Jeff Layton [Thu, 20 Dec 2012 21:15:38 +0000 (16:15 -0500)]
vfs: fix linkat to retry once on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: fix symlinkat to retry on ESTALE errors
Jeff Layton [Tue, 11 Dec 2012 17:10:08 +0000 (12:10 -0500)]
vfs: fix symlinkat to retry on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: fix mkdirat to retry once on an ESTALE error
Jeff Layton [Thu, 20 Dec 2012 21:04:09 +0000 (16:04 -0500)]
vfs: fix mkdirat to retry once on an ESTALE error

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: fix mknodat to retry on ESTALE errors
Jeff Layton [Thu, 20 Dec 2012 21:00:10 +0000 (16:00 -0500)]
vfs: fix mknodat to retry on ESTALE errors

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: turn is_dir argument to kern_path_create into a lookup_flags arg
Jeff Layton [Tue, 11 Dec 2012 17:10:06 +0000 (12:10 -0500)]
vfs: turn is_dir argument to kern_path_create into a lookup_flags arg

Where we can pass in LOOKUP_DIRECTORY or LOOKUP_REVAL. Any other flags
passed in here are currently ignored.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: fix readlinkat to retry on ESTALE
Jeff Layton [Tue, 11 Dec 2012 17:10:06 +0000 (12:10 -0500)]
vfs: fix readlinkat to retry on ESTALE

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: make fstatat retry on ESTALE errors from getattr call
Jeff Layton [Tue, 11 Dec 2012 17:10:05 +0000 (12:10 -0500)]
vfs: make fstatat retry on ESTALE errors from getattr call

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: add a retry_estale helper function to handle retries on ESTALE
Jeff Layton [Thu, 20 Dec 2012 19:59:40 +0000 (14:59 -0500)]
vfs: add a retry_estale helper function to handle retries on ESTALE

This function is expected to be called from path-based syscalls to help
them decide whether to try the lookup and call again in the event that
they got an -ESTALE return back on an earier try.

Currently, we only retry the call once on an ESTALE error, but in the
event that we decide that that's not enough in the future, we should be
able to change the logic in this helper without too much effort.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoMerge branch 'fscache' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells...
Al Viro [Thu, 20 Dec 2012 23:49:14 +0000 (18:49 -0500)]
Merge branch 'fscache' of git://git./linux/kernel/git/dhowells/linux-fs into for-linus

9 years agovfs: d_obtain_alias() needs to use "/" as default name.
NeilBrown [Fri, 9 Nov 2012 00:09:37 +0000 (16:09 -0800)]
vfs: d_obtain_alias() needs to use "/" as default name.

NFS appears to use d_obtain_alias() to create the root dentry rather than
d_make_root.  This can cause 'prepend_path()' to complain that the root
has a weird name if an NFS filesystem is lazily unmounted.  e.g.  if
"/mnt" is an NFS mount then

 { cd /mnt; umount -l /mnt ; ls -l /proc/self/cwd; }

will cause a WARN message like
   WARNING: at /home/git/linux/fs/dcache.c:2624 prepend_path+0x1d7/0x1e0()
   ...
   Root dentry has weird name <>

to appear in kernel logs.

So change d_obtain_alias() to use "/" rather than "" as the anonymous
name.

Signed-off-by: NeilBrown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: Remove useless function prototypes
Alessio Igor Bogani [Thu, 13 Dec 2012 11:22:39 +0000 (12:22 +0100)]
vfs: Remove useless function prototypes

Commit 8e22cc88d68ca1a46d7d582938f979eb640ed30f removes the (un)lock_super
function definitions but forgets to remove their prototypes.

Signed-off-by: Alessio Igor Bogani <abogani@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agodocumentation: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 11:00:38 +0000 (12:00 +0100)]
documentation: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agomm: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 11:00:02 +0000 (12:00 +0100)]
mm: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:59:20 +0000 (11:59 +0100)]
vfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agontfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:58:36 +0000 (11:58 +0100)]
ntfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agonilfs2: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:57:37 +0000 (11:57 +0100)]
nilfs2: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoncpfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:57:03 +0000 (11:57 +0100)]
ncpfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agominix: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:56:25 +0000 (11:56 +0100)]
minix: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agologfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:55:42 +0000 (11:55 +0100)]
logfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agohfsplus: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:55:07 +0000 (11:55 +0100)]
hfsplus: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agojfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:54:25 +0000 (11:54 +0100)]
jfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agohpfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:53:50 +0000 (11:53 +0100)]
hpfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoFS-Cache: Clear remaining page count on retrieval cancellation
David Howells [Fri, 14 Dec 2012 11:02:22 +0000 (11:02 +0000)]
FS-Cache: Clear remaining page count on retrieval cancellation

Provide fscache_cancel_op() with a pointer to a function it should invoke under
lock if it cancels an operation.

Use this to clear the remaining page count upon cancellation of a pending
retrieval operation so that fscache_release_retrieval_op() doesn't get an
assertion failure (see below).  This can happen when a signal occurs, say from
CTRL-C being pressed during data retrieval.

FS-Cache: Assertion failed
3 == 0 is false
------------[ cut here ]------------
kernel BUG at fs/fscache/page.c:237!
invalid opcode: 0000 [#641] SMP
Modules linked in: cachefiles(F) nfsv4(F) nfsv3(F) nfsv2(F) nfs(F) fscache(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F)
CPU 0
Pid: 6075, comm: slurp-q Tainted: GF     D      3.7.0-rc8-fsdevel+ #411                  /DG965RY
RIP: 0010:[<ffffffffa007f328>]  [<ffffffffa007f328>] fscache_release_retrieval_op+0x75/0xff [fscache]
RSP: 0000:ffff88001c6d7988  EFLAGS: 00010296
RAX: 000000000000000f RBX: ffff880014cdfe00 RCX: ffffffff6c102000
RDX: ffffffff8102d1ad RSI: ffffffff6c102000 RDI: ffffffff8102d1d6
RBP: ffff88001c6d7998 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffe00
R13: ffff88001c6d7ab4 R14: ffff88001a8638a0 R15: ffff88001552b190
FS:  00007f877aaf0700(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff11378fd2 CR3: 000000001c6c6000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process slurp-q (pid: 6075, threadinfo ffff88001c6d6000, task ffff88001c6c4080)
Stack:
 ffffffffa007ec07 ffff880014cdfe00 ffff88001c6d79c8 ffffffffa007db4d
 ffffffffa007ec07 ffff880014cdfe00 00000000fffffe00 ffff88001c6d7ab4
 ffff88001c6d7a38 ffffffffa008116d 0000000000000000 ffff88001c6c4080
Call Trace:
 [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache]
 [<ffffffffa007db4d>] fscache_put_operation+0x135/0x2ed [fscache]
 [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache]
 [<ffffffffa008116d>] __fscache_read_or_alloc_pages+0x413/0x4bc [fscache]
 [<ffffffff810ac8ae>] ? __alloc_pages_nodemask+0x195/0x75c
 [<ffffffffa00aab0f>] __nfs_readpages_from_fscache+0x86/0x13d [nfs]
 [<ffffffffa00a5fe0>] nfs_readpages+0x186/0x1bd [nfs]
 [<ffffffff810d23c8>] ? alloc_pages_current+0xc7/0xe4
 [<ffffffff810a68b5>] ? __page_cache_alloc+0x84/0x91
 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0
 [<ffffffff810afaa3>] __do_page_cache_readahead+0x237/0x2e0
 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0
 [<ffffffff810afe3e>] ra_submit+0x1c/0x20
 [<ffffffff810b019b>] ondemand_readahead+0x359/0x382
 [<ffffffff810b0279>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff810a77b5>] generic_file_aio_read+0x26b/0x637
 [<ffffffffa00f1852>] ? nfs_mark_delegation_referenced+0xb/0xb [nfsv4]
 [<ffffffffa009cc85>] nfs_file_read+0xaa/0xcf [nfs]
 [<ffffffff810db5b3>] do_sync_read+0x91/0xd1
 [<ffffffff810dbb8b>] vfs_read+0x9b/0x144
 [<ffffffff810dbc78>] sys_read+0x44/0x75
 [<ffffffff81422892>] system_call_fastpath+0x16/0x1b

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Mark cancellation of in-progress operation
David Howells [Thu, 13 Dec 2012 20:03:13 +0000 (20:03 +0000)]
FS-Cache: Mark cancellation of in-progress operation

Mark as cancelled an operation that is in progress rather than pending at the
time it is cancelled, and call fscache_complete_op() to cancel an operation so
that blocked ops can be started.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: One of the write operation paths doesn't set the object state
David Howells [Fri, 7 Dec 2012 10:41:26 +0000 (10:41 +0000)]
FS-Cache: One of the write operation paths doesn't set the object state

In fscache_write_op(), if the object is determined to have become inactive or
to have lost its cookie, we don't move the operation state from in-progress,
and so an assertion in fscache_put_operation() fails with an assertion (see
below).

Instrumenting fscache_op_work_func() indicates that it called
fscache_write_op() before calling fscache_put_operation() - where the assertion
failed.  The assertion at line 433 indicates that the operation state is
IN_PROGRESS rather than being COMPLETE or CANCELLED.

Instrumenting fscache_write_op() showed that it was being called on an object
that had had its cookie removed and that this was due to relinquishment of the
cookie by the netfs.  At this point fscache no longer has access to the pages
of netfs data that were requested to be written, and so simply cancelling the
operation is the thing to do.

FS-Cache: Assertion failed
3 == 5 is false
------------[ cut here ]------------
kernel BUG at fs/fscache/operation.c:433!
invalid opcode: 0000 [#1] SMP
Modules linked in: cachefiles(F) nfsv4(F) nfsv3(F) nfsv2(F) nfs(F) fscache(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F)
CPU 0
Pid: 1035, comm: kworker/u:3 Tainted: GF            3.7.0-rc8-fsdevel+ #411                  /DG965RY
RIP: 0010:[<ffffffffa007db22>]  [<ffffffffa007db22>] fscache_put_operation+0x11a/0x2ed [fscache]
RSP: 0018:ffff88003e32bcf8  EFLAGS: 00010296
RAX: 000000000000000f RBX: ffff88001818eb78 RCX: ffffffff6c102000
RDX: ffffffff8102d1ad RSI: ffffffff6c102000 RDI: ffffffff8102d1d6
RBP: ffff88003e32bd18 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa00811da
R13: 0000000000000001 R14: 0000000100625d26 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff7dd31c68 CR3: 000000003d730000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:3 (pid: 1035, threadinfo ffff88003e32a000, task ffff88003bb38080)
Stack:
 ffffffff8102d1ad ffff88001818eb78 ffffffffa00811da 0000000000000001
 ffff88003e32bd48 ffffffffa007f0ad ffff88001818eb78 ffffffff819583c0
 ffff88003df24e00 ffff88003882c3e0 ffff88003e32bde8 ffffffff81042de0
Call Trace:
 [<ffffffff8102d1ad>] ? vprintk_emit+0x3c6/0x41a
 [<ffffffffa00811da>] ? __fscache_read_or_alloc_pages+0x4bc/0x4bc [fscache]
 [<ffffffffa007f0ad>] fscache_op_work_func+0xec/0x123 [fscache]
 [<ffffffff81042de0>] process_one_work+0x21c/0x3b0
 [<ffffffff81042d82>] ? process_one_work+0x1be/0x3b0
 [<ffffffffa007efc1>] ? fscache_operation_gc+0x23e/0x23e [fscache]
 [<ffffffff8104332e>] worker_thread+0x202/0x2df
 [<ffffffff8104312c>] ? rescuer_thread+0x18e/0x18e
 [<ffffffff81047c1c>] kthread+0xd0/0xd8
 [<ffffffff81421bfa>] ? _raw_spin_unlock_irq+0x29/0x3e
 [<ffffffff81047b4c>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff814227ec>] ret_from_fork+0x7c/0xb0
 [<ffffffff81047b4c>] ? __init_kthread_worker+0x55/0x55

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Fix signal handling during waits
David Howells [Fri, 7 Dec 2012 18:08:02 +0000 (18:08 +0000)]
FS-Cache: Fix signal handling during waits

wait_on_bit() with TASK_INTERRUPTIBLE returns 1 rather than a negative error
code, so change what we check for.  This means that the signal handling in
fscache_wait_for_retrieval_activation()  should now work properly.

Without this, the following bug can be seen if CTRL-C is pressed during
fscache read operation:

FS-Cache: Assertion failed
2 == 3 is false
------------[ cut here ]------------
kernel BUG at fs/fscache/page.c:347!
invalid opcode: 0000 [#1] SMP
Modules linked in: cachefiles(F) nfsv4(F) nfsv3(F) nfsv2(F) nfs(F) fscache(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F)
CPU 1
Pid: 15006, comm: slurp-q Tainted: GF            3.7.0-rc8-fsdevel+ #411                  /DG965RY
RIP: 0010:[<ffffffffa007fcb4>]  [<ffffffffa007fcb4>] fscache_wait_for_retrieval_activation+0x167/0x177 [fscache]
RSP: 0018:ffff88002a4c39a8  EFLAGS: 00010292
RAX: 000000000000001a RBX: ffff88002d3dc158 RCX: 0000000000008685
RDX: ffffffff8102ccd6 RSI: 0000000000000001 RDI: ffffffff8102d1d6
RBP: ffff88002a4c39c8 R08: 0000000000000002 R09: 0000000000000000
R10: ffffffff8163afa0 R11: ffff88003bd11900 R12: ffffffffa00868c8
R13: ffff880028306458 R14: ffff88002d3dc1b0 R15: ffff88001372e538
FS:  00007f17426a0700(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f1742494a44 CR3: 0000000031bd7000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process slurp-q (pid: 15006, threadinfo ffff88002a4c2000, task ffff880023de3040)
Stack:
 ffff88002d3dc158 ffff88001372e538 ffff88002a4c3ab4 ffff8800283064e0
 ffff88002a4c3a38 ffffffffa0080f6d 0000000000000000 ffff880023de3040
 ffff88002a4c3ac8 ffffffff810ac8ae ffff880028306458 ffff88002a4c3bc8
Call Trace:
 [<ffffffffa0080f6d>] __fscache_read_or_alloc_pages+0x24f/0x4bc [fscache]
 [<ffffffff810ac8ae>] ? __alloc_pages_nodemask+0x195/0x75c
 [<ffffffffa00aab0f>] __nfs_readpages_from_fscache+0x86/0x13d [nfs]
 [<ffffffffa00a5fe0>] nfs_readpages+0x186/0x1bd [nfs]
 [<ffffffff810d23c8>] ? alloc_pages_current+0xc7/0xe4
 [<ffffffff810a68b5>] ? __page_cache_alloc+0x84/0x91
 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0
 [<ffffffff810afaa3>] __do_page_cache_readahead+0x237/0x2e0
 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0
 [<ffffffff810afe3e>] ra_submit+0x1c/0x20
 [<ffffffff810b019b>] ondemand_readahead+0x359/0x382
 [<ffffffff810b0279>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff810a77b5>] generic_file_aio_read+0x26b/0x637
 [<ffffffffa00f1852>] ? nfs_mark_delegation_referenced+0xb/0xb [nfsv4]
 [<ffffffffa009cc85>] nfs_file_read+0xaa/0xcf [nfs]
 [<ffffffff810db5b3>] do_sync_read+0x91/0xd1
 [<ffffffff810dbb8b>] vfs_read+0x9b/0x144
 [<ffffffff810dbc78>] sys_read+0x44/0x75
 [<ffffffff81422892>] system_call_fastpath+0x16/0x1b

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoNFS4: Open files for fscaching
David Howells [Wed, 5 Dec 2012 16:31:49 +0000 (16:31 +0000)]
NFS4: Open files for fscaching

nfs4_file_open() should open files for fscaching.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Add transition to handle invalidate immediately after lookup
David Howells [Wed, 5 Dec 2012 13:34:49 +0000 (13:34 +0000)]
FS-Cache: Add transition to handle invalidate immediately after lookup

Add a missing transition to the FS-Cache object state machine to handle an
invalidation event occuring between the back end completing the object lookup
by calling fscache_obtained_object() (which moves to state OBJECT_AVAILABLE)
and the backend returning to fscache_lookup_object() and thence to
fscache_object_state_machine() which then does a goto lookup_transit to handle
the transition - but lookup_transit doesn't handle EV_INVALIDATE.

Without this, the following BUG can be logged:

FS-Cache: Unsupported event 2 [5/f7] in state OBJECT_AVAILABLE
------------[ cut here ]------------
kernel BUG at fs/fscache/object.c:357!

Where event 2 is EV_INVALIDATE.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoMerge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds [Thu, 20 Dec 2012 22:15:53 +0000 (14:15 -0800)]
Merge branch 'kbuild' of git://git./linux/kernel/git/mmarek/kbuild

Pull kbuild changes from Michal Marek:
 "The kbuild changes are minimal this time:

   - scripts/pnmlogo fix for some newer format

   - minor top-level Makefile cleanup

   - fix for a v3.5 regression with make clean M=<directory>"

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kbuild: Do not remove vmlinux when cleaning external module
  scripts/pnmtologo: fix for plain PBM
  kbuild: Remove reference to uninitialised variable

9 years agoARM: OMAP2+: Trivial fix for IOMMU merge issue
Tony Lindgren [Thu, 20 Dec 2012 19:50:34 +0000 (11:50 -0800)]
ARM: OMAP2+: Trivial fix for IOMMU merge issue

Commit 787314c35fbb ("Merge tag 'iommu-updates-v3.8' of
git://git./linux/kernel/git/joro/iommu") did not account for the changed
header location.

The headers were made local to mach-omap2 as they are specific to omap2+
only, and we wanted to get most of the #include <plat/*.h> headers fixed
up anyways for the ARM multiplatform support.

We attempted to avoid this kind of merge conflict early on by setting up
a minimal git branch shared by the arm-soc tree and the iommu tree, but
looks like we still hit a merge issue there as the branches got merged
as various topic branches.

Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoNFS: nfs_migrate_page() does not wait for FS-Cache to finish with a page
David Howells [Wed, 5 Dec 2012 13:34:49 +0000 (13:34 +0000)]
NFS: nfs_migrate_page() does not wait for FS-Cache to finish with a page

nfs_migrate_page() does not wait for FS-Cache to finish with a page, probably
leading to the following bad-page-state:

 BUG: Bad page state in process python-bin  pfn:17d39b
 page:ffffea00053649e8 flags:004000000000100c count:0 mapcount:0 mapping:(null)
index:38686 (Tainted: G    B      ---------------- )
 Pid: 31053, comm: python-bin Tainted: G    B      ----------------
2.6.32-71.24.1.el6.x86_64 #1
 Call Trace:
 [<ffffffff8111bfe7>] bad_page+0x107/0x160
 [<ffffffff8111ee69>] free_hot_cold_page+0x1c9/0x220
 [<ffffffff8111ef19>] __pagevec_free+0x59/0xb0
 [<ffffffff8104b988>] ? flush_tlb_others_ipi+0x128/0x130
 [<ffffffff8112230c>] release_pages+0x21c/0x250
 [<ffffffff8115b92a>] ? remove_migration_pte+0x28a/0x2b0
 [<ffffffff8115f3f8>] ? mem_cgroup_get_reclaim_stat_from_page+0x18/0x70
 [<ffffffff81122687>] ____pagevec_lru_add+0x167/0x180
 [<ffffffff811226f8>] __lru_cache_add+0x58/0x70
 [<ffffffff81122731>] lru_cache_add_lru+0x21/0x40
 [<ffffffff81123f49>] putback_lru_page+0x69/0x100
 [<ffffffff8115c0bd>] migrate_pages+0x13d/0x5d0
 [<ffffffff81122687>] ? ____pagevec_lru_add+0x167/0x180
 [<ffffffff81152ab0>] ? compaction_alloc+0x0/0x370
 [<ffffffff8115255c>] compact_zone+0x4cc/0x600
 [<ffffffff8111cfac>] ? get_page_from_freelist+0x15c/0x820
 [<ffffffff810672f4>] ? check_preempt_wakeup+0x1c4/0x3c0
 [<ffffffff8115290e>] compact_zone_order+0x7e/0xb0
 [<ffffffff81152a49>] try_to_compact_pages+0x109/0x170
 [<ffffffff8111e94d>] __alloc_pages_nodemask+0x5ed/0x850
 [<ffffffff814c9136>] ? thread_return+0x4e/0x778
 [<ffffffff81150d43>] alloc_pages_vma+0x93/0x150
 [<ffffffff81167ea5>] do_huge_pmd_anonymous_page+0x135/0x340
 [<ffffffff814cb6f6>] ? rwsem_down_read_failed+0x26/0x30
 [<ffffffff81136755>] handle_mm_fault+0x245/0x2b0
 [<ffffffff814ce383>] do_page_fault+0x123/0x3a0
 [<ffffffff814cbdf5>] page_fault+0x25/0x30

nfs_migrate_page() calls nfs_fscache_release_page() which doesn't actually wait
- even if __GFP_WAIT is set.  The reason that doesn't wait is that
fscache_maybe_release_page() might deadlock the allocator as the work threads
writing to the cache may all end up sleeping on memory allocation.

However, I wonder if that is actually a problem.  There are a number of things
I can do to deal with this:

 (1) Make nfs_migrate_page() wait.

 (2) Make fscache_maybe_release_page() honour the __GFP_WAIT flag.

 (3) Set a timeout around the wait.

 (4) Make nfs_migrate_page() return an error if the page is still busy.

For the moment, I'll select (2) and (4).

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
9 years agoFS-Cache: Exclusive op submission can BUG if there's been an I/O error
David Howells [Wed, 5 Dec 2012 13:34:48 +0000 (13:34 +0000)]
FS-Cache: Exclusive op submission can BUG if there's been an I/O error

The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG
if there's been an I/O error because it may see the parent cache object in an
unexpected state.  It should only BUG if there hasn't been an I/O error.

In this case the problem was produced by remounting the cache partition to be
R/O.  The EROFS state was detected and the cache was aborted, but not
everything handled the aborting correctly.

SysRq : Emergency Remount R/O
EXT4-fs (sda6): re-mounted. Opts: (null)
Emergency Remount complete
CacheFiles: I/O Error: Failed to update xattr with error -30
FS-Cache: Cache cachefiles stopped due to I/O error
------------[ cut here ]------------
kernel BUG at fs/fscache/operation.c:128!
invalid opcode: 0000 [#1] SMP
CPU 0
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093                  /DG965RY
RIP: 0010:[<ffffffffa00739c0>]  [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache]
RSP: 0018:ffff880000853d40  EFLAGS: 00010206
RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0
RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280
RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268
R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162
FS:  0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040)
Stack:
 ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260
 ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198
 ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308
Call Trace:
 [<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache]
 [<ffffffff81047e44>] process_one_work+0x1eb/0x37f
 [<ffffffff81047de6>] ? process_one_work+0x18d/0x37f
 [<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache]
 [<ffffffff810482e4>] worker_thread+0x15a/0x21a
 [<ffffffff8104818a>] ? rescuer_thread+0x188/0x188
 [<ffffffff8104bf96>] kthread+0x7f/0x87
 [<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
 [<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe
 [<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53
 [<ffffffff813ad6f0>] ? gs_change+0xb/0xb

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Limit the number of I/O error reports for a cache
David Howells [Wed, 5 Dec 2012 13:34:48 +0000 (13:34 +0000)]
FS-Cache: Limit the number of I/O error reports for a cache

Limit the number of I/O error reports for a cache to 1 to prevent massive
amounts of noise.  After the first I/O error the cache is taken off line
automatically, so must be restarted to resume caching.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Don't mask off the object event mask when printing it
David Howells [Wed, 5 Dec 2012 13:34:47 +0000 (13:34 +0000)]
FS-Cache: Don't mask off the object event mask when printing it

Don't mask off the object event mask when printing it.  That way it can be seen
if threre are bits set that shouldn't be.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Initialise the object event mask with the calculated mask
David Howells [Wed, 5 Dec 2012 13:34:46 +0000 (13:34 +0000)]
FS-Cache: Initialise the object event mask with the calculated mask

Initialise the object event mask with the calculated mask rather than unmasking
undefined events also.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Convert the object event ID #defines into an enum
David Howells [Wed, 5 Dec 2012 13:34:46 +0000 (13:34 +0000)]
FS-Cache: Convert the object event ID #defines into an enum

Convert the fscache_object event IDs from #defines into an enum.  Also add an
extra label to the enum to carry the event count and redefine the event mask
in terms of that.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoCacheFiles: Add missing retrieval completions
David Howells [Wed, 5 Dec 2012 13:34:45 +0000 (13:34 +0000)]
CacheFiles: Add missing retrieval completions

CacheFiles is missing some calls to fscache_retrieval_complete() in the error
handling/collision paths of its reader functions.

This can be seen by the following assertion tripping in fscache_put_operation()
whereby the operation being destroyed is still in the in-progress state and has
not been cancelled or completed:

FS-Cache: Assertion failed
3 == 5 is false
------------[ cut here ]------------
kernel BUG at fs/fscache/operation.c:408!
invalid opcode: 0000 [#1] SMP
CPU 2
Modules linked in: xfs ioatdma dca loop joydev evdev
psmouse dcdbas pcspkr serio_raw i5000_edac edac_core i5k_amb shpchp
pci_hotplug sg sr_mod]

Pid: 8062, comm: httpd Not tainted 3.1.0-rc8 #1 Dell Inc. PowerEdge 1950/0DT097
RIP: 0010:[<ffffffff81197b24>]  [<ffffffff81197b24>] fscache_put_operation+0x304/0x330
RSP: 0018:ffff880062f739d8  EFLAGS: 00010296
RAX: 0000000000000025 RBX: ffff8800c5122e84 RCX: ffffffff81ddf040
RDX: 00000000ffffffff RSI: 0000000000000082 RDI: ffffffff81ddef30
RBP: ffff880062f739f8 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000003 R12: ffff8800c5122e40
R13: ffff880037a2cd20 R14: ffff880087c7a058 R15: ffff880087c7a000
FS:  00007f63dcf636e0(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0c0a91f000 CR3: 0000000062ec2000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process httpd (pid: 8062, threadinfo ffff880062f72000, task ffff880087e58000)
Stack:
 ffff880062f73bf8 0000000000000000 ffff880062f73bf8 ffff880037a2cd20
 ffff880062f73a68 ffffffff8119aa7e ffff88006540e000 ffff880062f73ad4
 ffff88008e9a4308 ffff880037a2cd20 ffff880062f73a48 ffff8800c5122e40
Call Trace:
 [<ffffffff8119aa7e>] __fscache_read_or_alloc_pages+0x1fe/0x530
 [<ffffffff81250780>] __nfs_readpages_from_fscache+0x70/0x1c0
 [<ffffffff8123142a>] nfs_readpages+0xca/0x1e0
 [<ffffffff815f3c06>] ? rpc_do_put_task+0x36/0x50
 [<ffffffff8122755b>] ? alloc_nfs_open_context+0x4b/0x110
 [<ffffffff815ecd1a>] ? rpc_call_sync+0x5a/0x70
 [<ffffffff810e7e9a>] __do_page_cache_readahead+0x1ca/0x270
 [<ffffffff810e7f61>] ra_submit+0x21/0x30
 [<ffffffff810e818d>] ondemand_readahead+0x11d/0x250
 [<ffffffff810e83b6>] page_cache_sync_readahead+0x36/0x60
 [<ffffffff810dffa4>] generic_file_aio_read+0x454/0x770
 [<ffffffff81224ce1>] nfs_file_read+0xe1/0x130
 [<ffffffff81121bd9>] do_sync_read+0xd9/0x120
 [<ffffffff8114088f>] ? mntput+0x1f/0x40
 [<ffffffff811238cb>] ? fput+0x1cb/0x260
 [<ffffffff81122938>] vfs_read+0xc8/0x180
 [<ffffffff81122af5>] sys_read+0x55/0x90

Reported-by: Mark Moseley <moseleymark@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoNFS: Use FS-Cache invalidation
David Howells [Thu, 20 Dec 2012 21:52:38 +0000 (21:52 +0000)]
NFS: Use FS-Cache invalidation

Use the new FS-Cache invalidation facility from NFS to deal with foreign
changes being detected on the server rather than attempting to retire the old
cookie and get a new one.

The problem with the old method was that NFS did not wait for all outstanding
storage and retrieval ops on the cache to complete.  There was no automatic
wait between the calls to ->readpages() and calls to invalidate_inode_pages2()
as the latter can only wait on locked pages that have been added to the
pagecache (which they haven't yet on entry to ->readpages()).

This was leading to oopses like the one below when an outstanding read got cut
off from its cookie by a premature release.

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
PGD 15889067 PUD 15890067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

Pid: 4544, comm: tar Not tainted 3.1.0-rc4-fsdevel+ #1064                  /DG965RY
RIP: 0010:[<ffffffffa0075118>]  [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
RSP: 0018:ffff8800158799e8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800070d41e0 RCX: ffff8800083dc1b0
RDX: 0000000000000000 RSI: ffff880015879960 RDI: ffff88003e627b90
RBP: ffff880015879a28 R08: 0000000000000002 R09: 0000000000000002
R10: 0000000000000001 R11: ffff880015879950 R12: ffff880015879aa4
R13: 0000000000000000 R14: ffff8800083dc158 R15: ffff880015879be8
FS:  00007f671e9d87c0(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000a8 CR3: 000000001587f000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process tar (pid: 4544, threadinfo ffff880015878000, task ffff880015875040)
Stack:
 ffffffffa00b1759 ffff8800070dc158 ffff8800000213da ffff88002a286508
 ffff880015879aa4 ffff880015879be8 0000000000000001 ffff88002a2866e8
 ffff880015879a88 ffffffffa00b20be 00000000000200da ffff880015875040
Call Trace:
 [<ffffffffa00b1759>] ? nfs_fscache_wait_bit+0xd/0xd [nfs]
 [<ffffffffa00b20be>] __nfs_readpages_from_fscache+0x7e/0x13f [nfs]
 [<ffffffff81095fe7>] ? __alloc_pages_nodemask+0x156/0x662
 [<ffffffffa0098763>] nfs_readpages+0xee/0x187 [nfs]
 [<ffffffff81098a5e>] __do_page_cache_readahead+0x1be/0x267
 [<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
 [<ffffffff81098d7b>] ra_submit+0x1c/0x20
 [<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
 [<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
 [<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
 [<ffffffff810c22c4>] do_sync_read+0xba/0xfa
 [<ffffffff810a62c9>] ? might_fault+0x4e/0x9e
 [<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
 [<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
 [<ffffffff810c29a4>] vfs_read+0xaa/0x13a
 [<ffffffff810c2a79>] sys_read+0x45/0x6c
 [<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b

Reported-by: Mark Moseley <moseleymark@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoCacheFiles: Implement invalidation
David Howells [Thu, 20 Dec 2012 21:52:36 +0000 (21:52 +0000)]
CacheFiles: Implement invalidation

Implement invalidation for CacheFiles.  This is in two parts:

 (1) Provide an invalidation method (which just truncates the backing file).

 (2) Abort attempts to copy anything read from the backing file whilst
     invalidation is in progress.

Question: CacheFiles uses truncation in a couple of places.  It has been using
notify_change() rather than sys_truncate() or something similar.  This means
it bypasses a bunch of checks and suchlike that it possibly should be making
(security, file locking, lease breaking, vfsmount write).  Should it be using
vfs_truncate() as added by a preceding patch or should it use notify_write()
and assume that anyone poking around in the cache files on disk gets
everything they deserve?

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoVFS: Make more complete truncate operation available to CacheFiles
David Howells [Thu, 20 Dec 2012 21:52:36 +0000 (21:52 +0000)]
VFS: Make more complete truncate operation available to CacheFiles

Make a more complete truncate operation available to CacheFiles (including
security checks and suchlike) so that it can use this to clear invalidated
cache files.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoMerge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Thu, 20 Dec 2012 22:04:11 +0000 (14:04 -0800)]
Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux

Pull nfsd update from Bruce Fields:
 "Included this time:

   - more nfsd containerization work from Stanislav Kinsbursky: we're
     not quite there yet, but should be by 3.9.

   - NFSv4.1 progress: implementation of basic backchannel security
     negotiation and the mandatory BACKCHANNEL_CTL operation.  See

       http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues

     for remaining TODO's

   - Fixes for some bugs that could be triggered by unusual compounds.
     Our xdr code wasn't designed with v4 compounds in mind, and it
     shows.  A more thorough rewrite is still a todo.

   - If you've ever seen "RPC: multiple fragments per record not
     supported" logged while using some sort of odd userland NFS client,
     that should now be fixed.

   - Further work from Jeff Layton on our mechanism for storing
     information about NFSv4 clients across reboots.

   - Further work from Bryan Schumaker on his fault-injection mechanism
     (which allows us to discard selective NFSv4 state, to excercise
     rarely-taken recovery code paths in the client.)

   - The usual mix of miscellaneous bugs and cleanup.

  Thanks to everyone who tested or contributed this cycle."

* 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
  nfsd4: don't leave freed stateid hashed
  nfsd4: free_stateid can use the current stateid
  nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
  nfsd: warn on odd reply state in nfsd_vfs_read
  nfsd4: fix oops on unusual readlike compound
  nfsd4: disable zero-copy on non-final read ops
  svcrpc: fix some printks
  NFSD: Correct the size calculation in fault_inject_write
  NFSD: Pass correct buffer size to rpc_ntop
  nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
  nfsd: simplify service shutdown
  nfsd: replace boolean nfsd_up flag by users counter
  nfsd: simplify NFSv4 state init and shutdown
  nfsd: introduce helpers for generic resources init and shutdown
  nfsd: make NFSd service structure allocated per net
  nfsd: make NFSd service boot time per-net
  nfsd: per-net NFSd up flag introduced
  nfsd: move per-net startup code to separated function
  nfsd: pass net to __write_ports() and down
  nfsd: pass net to nfsd_set_nrthreads()
  ...

9 years agoFS-Cache: Provide proper invalidation
David Howells [Thu, 20 Dec 2012 21:52:36 +0000 (21:52 +0000)]
FS-Cache: Provide proper invalidation

Provide a proper invalidation method rather than relying on the netfs retiring
the cookie it has and getting a new one.  The problem with this is that isn't
easy for the netfs to make sure that it has completed/cancelled all its
outstanding storage and retrieval operations on the cookie it is retiring.

Instead, have the cache provide an invalidation method that will cancel or wait
for all currently outstanding operations before invalidating the cache, and
will cause new operations to queue up behind that.  Whilst invalidation is in
progress, some requests will be rejected until the cache can stack a barrier on
the operation queue to cause new operations to be deferred behind it.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Thu, 20 Dec 2012 22:00:13 +0000 (14:00 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull Ceph update from Sage Weil:
 "There are a few different groups of commits here.  The largest is
  Alex's ongoing work to enable the coming RBD features (cloning,
  striping).  There is some cleanup in libceph that goes along with it.

  Cyril and David have fixed some problems with NFS reexport (leaking
  dentries and page locks), and there is a batch of patches from Yan
  fixing problems with the fs client when running against a clustered
  MDS.  There are a few bug fixes mixed in for good measure, many of
  which will be going to the stable trees once they're upstream.

  My apologies for the late pull.  There is still a gremlin in the rbd
  map/unmap code and I was hoping to include the fix for that as well,
  but we haven't been able to confirm the fix is correct yet; I'll send
  that in a separate pull once it's nailed down."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits)
  rbd: get rid of rbd_{get,put}_dev()
  libceph: register request before unregister linger
  libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
  libceph: init event->node in ceph_osdc_create_event()
  libceph: init osd->o_node in create_osd()
  libceph: report connection fault with warning
  libceph: socket can close in any connection state
  rbd: don't use ENOTSUPP
  rbd: remove linger unconditionally
  rbd: get rid of RBD_MAX_SEG_NAME_LEN
  libceph: avoid using freed osd in __kick_osd_requests()
  ceph: don't reference req after put
  rbd: do not allow remove of mounted-on image
  libceph: Unlock unprocessed pages in start_read() error path
  ceph: call handle_cap_grant() for cap import message
  ceph: Fix __ceph_do_pending_vmtruncate
  ceph: Don't add dirty inode to dirty list if caps is in migration
  ceph: Fix infinite loop in __wake_requests
  ceph: Don't update i_max_size when handling non-auth cap
  bdi_register: add __printf verification, fix arg mismatch
  ...

9 years agoFS-Cache: Fix operation state management and accounting
David Howells [Thu, 20 Dec 2012 21:52:35 +0000 (21:52 +0000)]
FS-Cache: Fix operation state management and accounting

Fix the state management of internal fscache operations and the accounting of
what operations are in what states.

This is done by:

 (1) Give struct fscache_operation a enum variable that directly represents the
     state it's currently in, rather than spreading this knowledge over a bunch
     of flags, who's processing the operation at the moment and whether it is
     queued or not.

     This makes it easier to write assertions to check the state at various
     points and to prevent invalid state transitions.

 (2) Add an 'operation complete' state and supply a function to indicate the
     completion of an operation (fscache_op_complete()) and make things call
     it.  The final call to fscache_put_operation() can then check that an op
     in the appropriate state (complete or cancelled).

 (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
     govern the state of an object:

(a) The ->n_ops is now the number of extant operations on the object
    and is now decremented by fscache_put_operation() only.

(b) The ->n_in_progress is simply the number of objects that have been
    taken off of the object's pending queue for the purposes of being
    run.  This is decremented by fscache_op_complete() only.

(c) The ->n_exclusive is the number of exclusive ops that have been
    submitted and queued or are in progress.  It is decremented by
    fscache_op_complete() and by fscache_cancel_op().

     fscache_put_operation() and fscache_operation_gc() now no longer try to
     clean up ->n_exclusive and ->n_in_progress.  That was leading to double
     decrements against fscache_cancel_op().

     fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
     double decrements against fscache_put_operation().

     fscache_submit_exclusive_op() now decides whether it has to queue an op
     based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
     will persist in being true even after all preceding operations have been
     cancelled or completed.  Furthermore, if an object is active and there are
     runnable ops against it, there must be at least one op running.

 (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
     provide a function to record completion of the pages as they complete.

     When n_pages reaches 0, the operation is deemed to be complete and
     fscache_op_complete() is called.

     Add calls to fscache_retrieval_complete() anywhere we've finished with a
     page we've been given to read or allocate for.  This includes places where
     we just return pages to the netfs for reading from the server and where
     accessing the cache fails and we discard the proposed netfs page.

The bugs in the unfixed state management manifest themselves as oopses like the
following where the operation completion gets out of sync with return of the
cookie by the netfs.  This is possible because the cache unlocks and returns
all the netfs pages before recording its completion - which means that there's
nothing to stop the netfs discarding them and returning the cookie.

FS-Cache: Cookie 'NFS.fh' still has outstanding reads
------------[ cut here ]------------
kernel BUG at fs/fscache/cookie.c:519!
invalid opcode: 0000 [#1] SMP
CPU 1
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
RIP: 0010:[<ffffffffa007050a>]  [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
Stack:
 ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
Call Trace:
 [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
 [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
 [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
 [<ffffffff810d8d47>] evict+0xa1/0x15c
 [<ffffffff810d8e2e>] dispose_list+0x2c/0x38
 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
 [<ffffffff810c56b7>] prune_super+0xd5/0x140
 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab
 [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
 [<ffffffff8103e009>] ? process_timeout+0xb/0xb
 [<ffffffff8109dba3>] kswapd+0x270/0x289
 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
 [<ffffffff8104bf7a>] kthread+0x7f/0x87
 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
 [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Make cookie relinquishment wait for outstanding reads
David Howells [Thu, 20 Dec 2012 21:52:35 +0000 (21:52 +0000)]
FS-Cache: Make cookie relinquishment wait for outstanding reads

Make fscache_relinquish_cookie() log a warning and wait if there are any
outstanding reads left on the cookie it was given.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoCacheFiles: Make some debugging statements conditional
David Howells [Thu, 20 Dec 2012 21:52:34 +0000 (21:52 +0000)]
CacheFiles: Make some debugging statements conditional

Downgrade some debugging statements to not unconditionally print stuff, but
rather be conditional on the appropriate module parameter setting.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoFS-Cache: Check that there are no read ops when cookie relinquished
David Howells [Thu, 20 Dec 2012 21:52:33 +0000 (21:52 +0000)]
FS-Cache: Check that there are no read ops when cookie relinquished

Check that the netfs isn't trying to relinquish a cookie that still has read
operations in progress upon it.  If there are, then give log a warning and BUG.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoCacheFiles: Downgrade the requirements passed to the allocator
David Howells [Thu, 20 Dec 2012 21:52:33 +0000 (21:52 +0000)]
CacheFiles: Downgrade the requirements passed to the allocator

Downgrade the requirements passed to the allocator in the gfp flags parameter.
FS-Cache/CacheFiles can handle OOM conditions simply by aborting the attempt to
store an object or a page in the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
9 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux...
Linus Torvalds [Thu, 20 Dec 2012 21:57:09 +0000 (13:57 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

Pull two btrfs reverts from Chris Mason:
 "I had missed that for two of the patches in my last pull, we had
  included different fixes during 3.7."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Revert "Btrfs: reorder tree mod log operations in deleting a pointer"
  Revert "Btrfs: MOD_LOG_KEY_REMOVE_WHILE_MOVING never change node's nritems"

9 years agoMerge tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk...
Linus Torvalds [Thu, 20 Dec 2012 21:54:51 +0000 (13:54 -0800)]
Merge tag 'for-3.8-merge' of git://git./linux/kernel/git/jaegeuk/f2fs

Pull new F2FS filesystem from Jaegeuk Kim:
 "Introduce a new file system, Flash-Friendly File System (F2FS), to
  Linux 3.8.

  Highlights:
   - Add initial f2fs source codes
   - Fix an endian conversion bug
   - Fix build failures on random configs
   - Fix the power-off-recovery routine
   - Minor cleanup, coding style, and typos patches"

From the Kconfig help text:

  F2FS is based on Log-structured File System (LFS), which supports
  versatile "flash-friendly" features. The design has been focused on
  addressing the fundamental issues in LFS, which are snowball effect
  of wandering tree and high cleaning overhead.

  Since flash-based storages show different characteristics according to
  the internal geometry or flash memory management schemes aka FTL, F2FS
  and tools support various parameters not only for configuring on-disk
  layout, but also for selecting allocation and cleaning algorithms.

and there's an article by Neil Brown about it on lwn.net:

  http://lwn.net/Articles/518988/

* tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits)
  f2fs: fix tracking parent inode number
  f2fs: cleanup the f2fs_bio_alloc routine
  f2fs: introduce accessor to retrieve number of dentry slots
  f2fs: remove redundant call to f2fs_put_page in delete entry
  f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask
  f2fs: rewrite f2fs_bio_alloc to make it simpler
  f2fs: fix a typo in f2fs documentation
  f2fs: remove unused variable
  f2fs: move error condition for mkdir at proper place
  f2fs: remove unneeded initialization
  f2fs: check read only condition before beginning write out
  f2fs: remove unneeded memset from init_once
  f2fs: show error in case of invalid mount arguments
  f2fs: fix the compiler warning for uninitialized use of variable
  f2fs: resolve build failures
  f2fs: adjust kernel coding style
  f2fs: fix endian conversion bugs reported by sparse
  f2fs: remove unneeded version.h header file from f2fs.h
  f2fs: update the f2fs document
  f2fs: update Kconfig and Makefile
  ...

9 years agoCacheFiles: Fix the marking of cached pages
David Howells [Thu, 20 Dec 2012 21:52:32 +0000 (21:52 +0000)]
CacheFiles: Fix the marking of cached pages

Under some circumstances CacheFiles defers the marking of pages with PG_fscache
so that it can take advantage of pagevecs to reduce the number of calls to
fscache_mark_pages_cached() and the netfs's hook to keep track of this.

There are, however, two problems with this:

 (1) It can lead to the PG_fscache mark being applied _after_ the page is set
     PG_uptodate and unlocked (by the call to fscache_end_io()).

 (2) CacheFiles's ref on the page is dropped immediately following
     fscache_end_io() - and so may not still be held when the mark is applied.
     This can lead to the page being passed back to the allocator before the
     mark is applied.

Fix this by, where appropriate, marking the page before calling
fscache_end_io() and releasing the page.  This means that we can't take
advantage of pagevecs and have to make a separate call for each page to the
marking routines.

The symptoms of this are Bad Page state errors cropping up under memory
pressure, for example:

BUG: Bad page state in process tar  pfn:002da
page:ffffea0000009fb0 count:0 mapcount:0 mapping:          (null) index:0x1447
page flags: 0x1000(private_2)
Pid: 4574, comm: tar Tainted: G        W   3.1.0-rc4-fsdevel+ #1064
Call Trace:
 [<ffffffff8109583c>] ? dump_page+0xb9/0xbe
 [<ffffffff81095916>] bad_page+0xd5/0xea
 [<ffffffff81095d82>] get_page_from_freelist+0x35b/0x46a
 [<ffffffff810961f3>] __alloc_pages_nodemask+0x362/0x662
 [<ffffffff810989da>] __do_page_cache_readahead+0x13a/0x267
 [<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
 [<ffffffff81098d7b>] ra_submit+0x1c/0x20
 [<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
 [<ffffffff81098ee2>] ? ondemand_readahead+0x163/0x29a
 [<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
 [<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
 [<ffffffff810c22c4>] do_sync_read+0xba/0xfa
 [<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
 [<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
 [<ffffffff810c29a4>] vfs_read+0xaa/0x13a
 [<ffffffff810c2a79>] sys_read+0x45/0x6c
 [<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b

As can be seen, PG_private_2 (== PG_fscache) is set in the page flags.

Instrumenting fscache_mark_pages_cached() to verify whether page->mapping was
set appropriately showed that sometimes it wasn't.  This led to the discovery
that sometimes the page has apparently been reclaimed by the time the marker
got to see it.

Reported-by: M. Stevens <m@tippett.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
9 years agolib: atomic64: Initialize locks statically to fix early users
Stephen Boyd [Thu, 20 Dec 2012 07:39:48 +0000 (23:39 -0800)]
lib: atomic64: Initialize locks statically to fix early users

The atomic64 library uses a handful of static spin locks to implement
atomic 64-bit operations on architectures without support for atomic
64-bit instructions.

Unfortunately, the spinlocks are initialized in a pure initcall and that
is too late for the vfs namespace code which wants to use atomic64
operations before the initcall is run.

This became a problem as of commit 8823c079ba71: "vfs: Add setns support
for the mount namespace".

This leads to BUG messages such as:

  BUG: spinlock bad magic on CPU#0, swapper/0/0
   lock: atomic64_lock+0x240/0x400, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
    do_raw_spin_lock+0x158/0x198
    _raw_spin_lock_irqsave+0x4c/0x58
    atomic64_add_return+0x30/0x5c
    alloc_mnt_ns.clone.14+0x44/0xac
    create_mnt_ns+0xc/0x54
    mnt_init+0x120/0x1d4
    vfs_caches_init+0xe0/0x10c
    start_kernel+0x29c/0x300

coming out early on during boot when spinlock debugging is enabled.

Fix this by initializing the spinlocks statically at compile time.

Reported-and-tested-by: Vaibhav Bedia <vaibhav.bedia@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agohfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:53:15 +0000 (11:53 +0100)]
hfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agobfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:52:33 +0000 (11:52 +0100)]
bfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoaffs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:51:53 +0000 (11:51 +0100)]
affs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoadfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:51:11 +0000 (11:51 +0100)]
adfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoocfs2: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:50:20 +0000 (11:50 +0100)]
ocfs2: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoomfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:49:42 +0000 (11:49 +0100)]
omfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Acked-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoprocfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:48:48 +0000 (11:48 +0100)]
procfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoreiserfs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:47:31 +0000 (11:47 +0100)]
reiserfs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agosysv: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:45:58 +0000 (11:45 +0100)]
sysv: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoufs: drop vmtruncate
Marco Stornelli [Sat, 15 Dec 2012 10:45:14 +0000 (11:45 +0100)]
ufs: drop vmtruncate

Removed vmtruncate

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agofs: Fix imbalance in freeze protection in mark_files_ro()
Jan Kara [Wed, 5 Dec 2012 13:40:14 +0000 (14:40 +0100)]
fs: Fix imbalance in freeze protection in mark_files_ro()

File descriptors (even those for writing) do not hold freeze protection.
Thus mark_files_ro() must call __mnt_drop_write() to only drop protection
against remount read-only. Calling mnt_drop_write_file() as we do now
results in:

[ BUG: bad unlock balance detected! ]
3.7.0-rc6-00028-g88e75b6 #101 Not tainted
-------------------------------------
kworker/1:2/79 is trying to release lock (sb_writers) at:
[<ffffffff811b33b4>] mnt_drop_write+0x24/0x30
but there are no more locks to release!

Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: remove DCACHE_NEED_LOOKUP
Jeff Layton [Wed, 28 Nov 2012 16:30:53 +0000 (11:30 -0500)]
vfs: remove DCACHE_NEED_LOOKUP

The code that relied on that flag was ripped out of btrfs quite some
time ago, and never added back. Josef indicated that he was going to
take a different approach to the problem in btrfs, and that we
could just eliminate this flag.

Cc: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agopath_init(): make -ENOTDIR failure exits consistent
Al Viro [Thu, 20 Dec 2012 18:41:28 +0000 (13:41 -0500)]
path_init(): make -ENOTDIR failure exits consistent

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs: remove unneeded permission check from path_init
Jeff Layton [Tue, 11 Dec 2012 13:56:16 +0000 (08:56 -0500)]
vfs: remove unneeded permission check from path_init

When path_init is called with a valid dfd, that code checks permissions
on the open directory fd and returns an error if the check fails. This
permission check is redundant, however.

Both callers of path_init immediately call link_path_walk afterward. The
first thing that link_path_walk does for pathnames that do not consist
only of slashes is to check for exec permissions at the starting point of
the path walk.  And this check in path_init() is on the path taken only
when *name != '/' && *name != '\0'.

In most cases, these checks are very quick, but when the dfd is for a
file on a NFS mount with the actimeo=0, each permission check goes
out onto the wire. The result is 2 identical ACCESS calls.

Given that these codepaths are fairly "hot", I think it makes sense to
eliminate the permission check in path_init and simply assume that the
caller will eventually check the permissions before proceeding.

Reported-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agovfs, freeze: use ACCESS_ONCE() to guard access to ->mnt_flags
Miao Xie [Fri, 16 Nov 2012 09:23:50 +0000 (17:23 +0800)]
vfs, freeze: use ACCESS_ONCE() to guard access to ->mnt_flags

The compiler may optimize the while loop and make the check just be done once,
so we should use ACCESS_ONCE() to guard access to ->mnt_flags

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
9 years agoMerge tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro...
Linus Torvalds [Thu, 20 Dec 2012 18:07:25 +0000 (10:07 -0800)]
Merge tag 'iommu-updates-v3.8' of git://git./linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "A few new features this merge-window.  The most important one is
  probably, that dma-debug now warns if a dma-handle is not checked with
  dma_mapping_error by the device driver.  This requires minor changes
  to some architectures which make use of dma-debug.  Most of these
  changes have the respective Acks by the Arch-Maintainers.

  Besides that there are updates to the AMD IOMMU driver for refactor
  the IOMMU-Groups support and to make sure it does not trigger a
  hardware erratum.

  The OMAP changes (for which I pulled in a branch from Tony Lindgren's
  tree) have a conflict in linux-next with the arm-soc tree.  The
  conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is
  deleted in the arm-soc tree.  It is safe to delete the file too so
  solve the conflict.  Similar changes are done in the arm-soc tree in
  the common clock framework migration.  A missing hunk from the patch
  in the IOMMU tree will be submitted as a seperate patch when the
  merge-window is closed."

* tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits)
  ARM: dma-mapping: support debug_dma_mapping_error
  ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks
  iommu/omap: Adapt to runtime pm
  iommu/omap: Migrate to hwmod framework
  iommu/omap: Keep mmu enabled when requested
  iommu/omap: Remove redundant clock handling on ISR
  iommu/amd: Remove obsolete comment
  iommu/amd: Don't use 512GB pages
  iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch
  iommu/tegra: gart: Move bus_set_iommu after probe for multi arch
  iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all
  tile: dma_debug: add debug_dma_mapping_error support
  sh: dma_debug: add debug_dma_mapping_error support
  powerpc: dma_debug: add debug_dma_mapping_error support
  mips: dma_debug: add debug_dma_mapping_error support
  microblaze: dma-mapping: support debug_dma_mapping_error
  ia64: dma_debug: add debug_dma_mapping_error support
  c6x: dma_debug: add debug_dma_mapping_error support
  ARM64: dma_debug: add debug_dma_mapping_error support
  intel-iommu: Prevent devices with RMRRs from being placed into SI Domain
  ...

9 years agointel-iommu: Free old page tables before creating superpage
Woodhouse, David [Wed, 19 Dec 2012 13:25:35 +0000 (13:25 +0000)]
intel-iommu: Free old page tables before creating superpage

The dma_pte_free_pagetable() function will only free a page table page
if it is asked to free the *entire* 2MiB range that it covers. So if a
page table page was used for one or more small mappings, it's likely to
end up still present in the page tables... but with no valid PTEs.

This was fine when we'd only be repopulating it with 4KiB PTEs anyway
but the same virtual address range can end up being reused for a
*large-page* mapping. And in that case were were trying to insert the
large page into the second-level page table, and getting a complaint
from the sanity check in __domain_mapping() because there was already a
corresponding entry. This was *relatively* harmless; it led to a memory
leak of the old page table page, but no other ill-effects.

Fix it by calling dma_pte_clear_range (hopefully redundant) and
dma_pte_free_pagetable() before setting up the new large page.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Tested-by: Ravi Murty <Ravi.Murty@intel.com>
Tested-by: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: stable@kernel.org [3.0+]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoARM: OMAP2+: Fix compillation error in mach-omap2/timer.c
Peter Ujfalusi [Wed, 19 Dec 2012 09:50:09 +0000 (10:50 +0100)]
ARM: OMAP2+: Fix compillation error in mach-omap2/timer.c

prom_add_property() has been renamed to of_add_property()
This patch fixes the following comilation error:

arch/arm/mach-omap2/timer.c: In function ‘omap_get_timer_dt’:
arch/arm/mach-omap2/timer.c:178:3: error: implicit declaration of function ‘prom_add_property’ [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[1]: *** [arch/arm/mach-omap2/timer.o] Error 1
make[1]: *** Waiting for unfinished jobs....

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
9 years agoMerge branch 'v3.8-samsung-fixes-audio' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Thu, 20 Dec 2012 16:06:30 +0000 (08:06 -0800)]
Merge branch 'v3.8-samsung-fixes-audio' of git://git./linux/kernel/git/kgene/linux-samsung into fixes

From Kukjin Kim:

This is for fix the following build error in dev-audio.c on current Samsung
platforms :-(

arch/arm/mach-exynos/dev-audio.c:58:4: error: unknown field 'src_clk'
specified in initializer
arch/arm/mach-exynos/dev-audio.c:58:4: warning: initialization makes integer
from pointer without a cast [enabled by default]
arch/arm/mach-exynos/dev-audio.c:58:4: warning: (near initialization for
'i2sv5_pdata.type.i2s.idma_addr') [enabled by default]
arch/arm/mach-exynos/dev-audio.c:91:4: error: unknown field 'src_clk'
specified in initializer
arch/arm/mach-exynos/dev-audio.c:91:4: warning: initialization makes integer
from pointer without a cast [enabled by default]
arch/arm/mach-exynos/dev-audio.c:91:4: warning: (near initialization for
'i2sv3_pdata.type.i2s.idma_addr') [enabled by default]

* 'v3.8-samsung-fixes-audio' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: Avoid passing the clks through platform data
  ARM: S5PV210: Avoid passing the clks through platform data
  ARM: S5P64X0: Add I2S clkdev support
  ARM: S5PC100: Add I2S clkdev support
  ARM: S3C64XX: Add I2S clkdev support

Signed-off-by: Olof Johansson <olof@lixom.net>
9 years agoMerge branch 'v3.8-samsung-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Thu, 20 Dec 2012 16:04:48 +0000 (08:04 -0800)]
Merge branch 'v3.8-samsung-fixes-1' of git://git./linux/kernel/git/kgene/linux-samsung into fixes

From Kukjin Kim:

Here is Samsung fixes-1 for v3.8-rc1.
Most of them are trivial fixes which are for NULL pointer dereference, MSHC
clocks instance names and exynos5440 stuff.

* 'v3.8-samsung-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: Fix MSHC clocks instance names
  ARM: EXYNOS: Fix NULL pointer dereference bug in SMDKV310
  ARM: EXYNOS: Fix NULL pointer dereference bug in SMDK4X12
  ARM: EXYNOS: Fix NULL pointer dereference bug in Origen
  ARM: SAMSUNG: Add missing include guard to gpio-core.h
  pinctrl: exynos5440/samsung: Staticize pcfgs
  pinctrl: samsung: Fix a typo in pinctrl-samsung.h
  ARM: EXYNOS: fix skip scu_enable() for EXYNOS5440
  ARM: EXYNOS: fix GIC using for EXYNOS5440
  ARM: EXYNOS: fix build error when MFC is not selected

Signed-off-by: Olof Johansson <olof@lixom.net>
9 years agoARM: sunxi: rename device tree source files
Olof Johansson [Wed, 12 Dec 2012 16:08:23 +0000 (17:08 +0100)]
ARM: sunxi: rename device tree source files

This is the rename portion of "ARM: sunxi: Change device tree naming
scheme for sunxi" that were missed when the patch was applied.

Signed-off-by: Olof Johansson <olof@lixom.net>
9 years agorbd: get rid of rbd_{get,put}_dev()
Alex Elder [Fri, 16 Nov 2012 15:29:16 +0000 (09:29 -0600)]
rbd: get rid of rbd_{get,put}_dev()

The functions rbd_get_dev() and rbd_put_dev() are trivial wrappers
that add no value, and their existence suggests they may do more
than what they do.

Get rid of them.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Dan Mick <dan.mick@inktank.com>
9 years agolibceph: register request before unregister linger
Alex Elder [Thu, 6 Dec 2012 13:22:04 +0000 (07:22 -0600)]
libceph: register request before unregister linger

In kick_requests(), we need to register the request before we
unregister the linger request.  Otherwise the unregister will
reset the request's osd pointer to NULL.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>