~shefty/rdma-dev.git
8 years agomisc/tifm_core: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:31 +0000 (17:04 -0800)]
misc/tifm_core: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomisc/c2port: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:30 +0000 (17:04 -0800)]
misc/c2port: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Rodolfo Giometti <giometti@linux.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomfd: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:29 +0000 (17:04 -0800)]
mfd: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomemstick: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:28 +0000 (17:04 -0800)]
memstick: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodm: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:26 +0000 (17:04 -0800)]
dm: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/qib: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:25 +0000 (17:04 -0800)]
IB/qib: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mike Marciniszyn <infinipath@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/ocrdma: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:24 +0000 (17:04 -0800)]
IB/ocrdma: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/mlx4: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:23 +0000 (17:04 -0800)]
IB/mlx4: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/ipath: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:22 +0000 (17:04 -0800)]
IB/ipath: convert to idr_alloc()

Convert to the much saner new idr interface.

[yongjun_wei@trendmicro.com.cn: use GFP_NOWAIT under spin lock]
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/ehca: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:21 +0000 (17:04 -0800)]
IB/ehca: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Cc: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/cxgb4: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:20 +0000 (17:04 -0800)]
IB/cxgb4: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/cxgb3: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:18 +0000 (17:04 -0800)]
IB/cxgb3: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/amso1100: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:17 +0000 (17:04 -0800)]
IB/amso1100: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Cc: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoIB/core: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:16 +0000 (17:04 -0800)]
IB/core: convert to idr_alloc()

Convert to the much saner new idr interface.

v2: Mike triggered WARN_ON() in idr_preload() because send_mad(),
    which may be used from non-process context, was calling
    idr_preload() unconditionally.  Preload iff @gfp_mask has
    __GFP_WAIT.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reported-by: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoi2c: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:15 +0000 (17:04 -0800)]
i2c: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Wolfram Sang <wolfram@the-dreams.de>
Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm/vmwgfx: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:14 +0000 (17:04 -0800)]
drm/vmwgfx: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm/via: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:12 +0000 (17:04 -0800)]
drm/via: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm/sis: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:11 +0000 (17:04 -0800)]
drm/sis: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm/i915: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:10 +0000 (17:04 -0800)]
drm/i915: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm/exynos: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:09 +0000 (17:04 -0800)]
drm/exynos: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Airlie <airlied@linux.ie>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:08 +0000 (17:04 -0800)]
drm: convert to idr_alloc()

Convert to the much saner new idr interface.

* drm_ctxbitmap_next() error handling in drm_addctx() seems broken.
  drm_ctxbitmap_next() return -errno on failure not -1.

[artem.savkov@gmail.com: missing idr_preload_end in drm_gem_flink_ioctl]
[jslaby@suse.cz: fix drm_gem_flink_ioctl() return value]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Artem Savkov <artem.savkov@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agogpio: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:06 +0000 (17:04 -0800)]
gpio: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofirewire: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:05 +0000 (17:04 -0800)]
firewire: convert to idr_alloc()

Convert to the much saner new idr interface.

v2: Stefan pointed out that add_client_resource() may be called from
    non-process context.  Preload iff @gfp_mask contains __GFP_WAIT.
    Also updated to include minor upper limit check.

[tim.gardner@canonical.com: fix accidentally orphaned 'minor'[
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofirewire: add minor number range check to fw_device_init()
Tejun Heo [Thu, 28 Feb 2013 01:04:04 +0000 (17:04 -0800)]
firewire: add minor number range check to fw_device_init()

fw_device_init() didn't check whether the allocated minor number isn't
too large.  Fail if it goes overflows MINORBITS.

Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodmaengine: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:03 +0000 (17:04 -0800)]
dmaengine: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Dan Williams <djbw@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodca: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:02 +0000 (17:04 -0800)]
dca: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrbd: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:01 +0000 (17:04 -0800)]
drbd: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoatm/nicstar: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:04:00 +0000 (17:04 -0800)]
atm/nicstar: convert to idr_alloc()

Convert to the much saner new idr interface.  The existing code looks
buggy to me - ID 0 is treated as no-ID but allocation specifies 0 as
lower limit and there's no error handling after partial success.  This
conversion keeps the bugs unchanged.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoblock/loop: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:03:58 +0000 (17:03 -0800)]
block/loop: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoblock: convert to idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:03:57 +0000 (17:03 -0800)]
block: convert to idr_alloc()

Convert to the much saner new idr interface.  Both bsg and genhd
protect idr w/ mutex making preloading unnecessary.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoblock: fix synchronization and limit check in blk_alloc_devt()
Tejun Heo [Thu, 28 Feb 2013 01:03:56 +0000 (17:03 -0800)]
block: fix synchronization and limit check in blk_alloc_devt()

idr allocation in blk_alloc_devt() wasn't synchronized against lookup
and removal, and its limit check was off by one - 1 << MINORBITS is
the number of minors allowed, not the maximum allowed minor.

Add locking and rename MAX_EXT_DEVT to NR_EXT_DEVT and fix limit
checking.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: implement idr_preload[_end]() and idr_alloc()
Tejun Heo [Thu, 28 Feb 2013 01:03:55 +0000 (17:03 -0800)]
idr: implement idr_preload[_end]() and idr_alloc()

The current idr interface is very cumbersome.

* For all allocations, two function calls - idr_pre_get() and
  idr_get_new*() - should be made.

* idr_pre_get() doesn't guarantee that the following idr_get_new*()
  will not fail from memory shortage.  If idr_get_new*() returns
  -EAGAIN, the caller is expected to retry pre_get and allocation.

* idr_get_new*() can't enforce upper limit.  Upper limit can only be
  enforced by allocating and then freeing if above limit.

* idr_layer buffer is unnecessarily per-idr.  Each idr ends up keeping
  around MAX_IDR_FREE idr_layers.  The memory consumed per idr is
  under two pages but it makes it difficult to make idr_layer larger.

This patch implements the following new set of allocation functions.

* idr_preload[_end]() - Similar to radix preload but doesn't fail.
  The first idr_alloc() inside preload section can be treated as if it
  were called with @gfp_mask used for idr_preload().

* idr_alloc() - Allocate an ID w/ lower and upper limits.  Takes
  @gfp_flags and can be used w/o preloading.  When used inside
  preloaded section, the allocation mask of preloading can be assumed.

If idr_alloc() can be called from a context which allows sufficiently
relaxed @gfp_mask, it can be used by itself.  If, for example,
idr_alloc() is called inside spinlock protected region, preloading can
be used like the following.

idr_preload(GFP_KERNEL);
spin_lock(lock);

id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT);

spin_unlock(lock);
idr_preload_end();
if (id < 0)
error;

which is much simpler and less error-prone than idr_pre_get and
idr_get_new*() loop.

The new interface uses per-pcu idr_layer buffer and thus the number of
idr's in the system doesn't affect the amount of memory used for
preloading.

idr_layer_alloc() is introduced to handle idr_layer allocations for
both old and new ID allocation paths.  This is a bit hairy now but the
new interface is expected to replace the old and the internal
implementation eventually will become simpler.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: refactor idr_get_new_above()
Tejun Heo [Thu, 28 Feb 2013 01:03:54 +0000 (17:03 -0800)]
idr: refactor idr_get_new_above()

Move slot filling to idr_fill_slot() from idr_get_new_above_int() and
make idr_get_new_above() directly call it.  idr_get_new_above_int() is
no longer needed and removed.

This will be used to implement a new ID allocation interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: remove _idr_rc_to_errno() hack
Tejun Heo [Thu, 28 Feb 2013 01:03:53 +0000 (17:03 -0800)]
idr: remove _idr_rc_to_errno() hack

idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate
exception conditions internally.  The return value is later translated
to errno values using _idr_rc_to_errno().

This is confusing.  Drop the custom ones and consistently use -EAGAIN
for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for
"ran out of ID space".

Due to the weird memory preloading mechanism, [ra]_get_new*() return
-EAGAIN on memory shortage, so we need to substitute -ENOMEM w/
-EAGAIN on those interface functions.  They'll eventually be cleaned
up and the translations will go away.

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: relocate idr_for_each_entry() and reorganize id[r|a]_get_new()
Tejun Heo [Thu, 28 Feb 2013 01:03:52 +0000 (17:03 -0800)]
idr: relocate idr_for_each_entry() and reorganize id[r|a]_get_new()

* Move idr_for_each_entry() definition next to other idr related
  definitions.

* Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above().

This changes the implementation of idr_get_new() but the new
implementation is trivial.  This patch doesn't introduce any
functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: cosmetic updates to struct / initializer definitions
Tejun Heo [Thu, 28 Feb 2013 01:03:51 +0000 (17:03 -0800)]
idr: cosmetic updates to struct / initializer definitions

* Tab align fields like a normal person.

* Drop the unnecessary 0 inits from IDR_INIT().

This patch is purely cosmetic.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: deprecate idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:50 +0000 (17:03 -0800)]
idr: deprecate idr_remove_all()

There was only one legitimate use of idr_remove_all() and a lot more of
incorrect uses (or lack of it).  Now that idr_destroy() implies
idr_remove_all() and all the in-kernel users updated not to use it,
there's no reason to keep it around.  Mark it deprecated so that we can
later unexport it.

idr_remove_all() is made an inline function calling __idr_remove_all()
to avoid triggering deprecated warning on EXPORT_SYMBOL().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agocgroup: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:49 +0000 (17:03 -0800)]
cgroup: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoinotify: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:48 +0000 (17:03 -0800)]
inotify: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agonfs: idr_destroy() no longer needs idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:46 +0000 (17:03 -0800)]
nfs: idr_destroy() no longer needs idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop reference to idr_remove_all().  Note that the code
wasn't completely correct before because idr_remove() on all entries
doesn't necessarily release all idr_layers which could lead to memory
leak.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodlm: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:45 +0000 (17:03 -0800)]
dlm: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.

The conversion isn't completely trivial for recover_idr_clear() as it's
the only place in kernel which makes legitimate use of idr_remove_all()
w/o idr_destroy().  Replace it with idr_remove() call inside
idr_for_each_entry() loop.  It goes on top so that it matches the
operation order in recover_idr_del().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodlm: use idr_for_each_entry() in recover_idr_clear() error path
Tejun Heo [Thu, 28 Feb 2013 01:03:44 +0000 (17:03 -0800)]
dlm: use idr_for_each_entry() in recover_idr_clear() error path

Convert recover_idr_clear() to use idr_for_each_entry() instead of
idr_for_each().  It's somewhat less efficient this way but it shouldn't
matter in an error path.  This is to help with deprecation of
idr_remove_all().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agorpmsg: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:43 +0000 (17:03 -0800)]
rpmsg: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoremoteproc: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:42 +0000 (17:03 -0800)]
remoteproc: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodm: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:41 +0000 (17:03 -0800)]
dm: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrm: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:39 +0000 (17:03 -0800)]
drm: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

* drm_ctxbitmap_cleanup() was calling idr_remove_all() but forgetting
  idr_destroy() thus leaking all buffered free idr_layers.  Replace it
  with idr_destroy().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Airlie <airlied@linux.ie>
Cc: Inki Dae <inki.dae@samsung.com>
Cc: Joonyoung Shim <jy0922.shim@samsung.com>
Cc: Seung-Woo Kim <sw0312.kim@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofirewire: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:38 +0000 (17:03 -0800)]
firewire: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoblock/loop: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:37 +0000 (17:03 -0800)]
block/loop: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoatm/nicstar: don't use idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:36 +0000 (17:03 -0800)]
atm/nicstar: don't use idr_remove_all()

idr_destroy() can destroy idr by itself and idr_remove_all() is being
deprecated.  Drop its usage.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Chas Williams <chas@cmf.nrl.navy.mil>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: make idr_destroy() imply idr_remove_all()
Tejun Heo [Thu, 28 Feb 2013 01:03:35 +0000 (17:03 -0800)]
idr: make idr_destroy() imply idr_remove_all()

idr is silly in quite a few ways, one of which is how it's supposed to
be destroyed - idr_destroy() doesn't release IDs and doesn't even whine
if the idr isn't empty.  If the caller forgets idr_remove_all(), it
simply leaks memory.

Even ida gets this wrong and leaks memory on destruction.  There is
absoltely no reason not to call idr_remove_all() from idr_destroy().
Nobody is abusing idr_destroy() for shrinking free layer buffer and
continues to use idr after idr_destroy(), so it's safe to do remove_all
from destroy.

In the whole kernel, there is only one place where idr_remove_all() is
legitimiately used without following idr_destroy() while there are quite
a few places where the caller forgets either idr_remove_all() or
idr_destroy() leaking memory.

This patch makes idr_destroy() call idr_destroy_all() and updates the
function description accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoidr: fix a subtle bug in idr_get_next()
Tejun Heo [Thu, 28 Feb 2013 01:03:34 +0000 (17:03 -0800)]
idr: fix a subtle bug in idr_get_next()

The iteration logic of idr_get_next() is borrowed mostly verbatim from
idr_for_each().  It walks down the tree looking for the slot matching
the current ID.  If the matching slot is not found, the ID is
incremented by the distance of single slot at the given level and
repeats.

The implementation assumes that during the whole iteration id is aligned
to the layer boundaries of the level closest to the leaf, which is true
for all iterations starting from zero or an existing element and thus is
fine for idr_for_each().

However, idr_get_next() may be given any point and if the starting id
hits in the middle of a non-existent layer, increment to the next layer
will end up skipping the same offset into it.  For example, an IDR with
IDs filled between [64, 127] would look like the following.

          [  0  64 ... ]
       /----/   |
       |        |
      NULL    [ 64 ... 127 ]

If idr_get_next() is called with 63 as the starting point, it will try
to follow down the pointer from 0.  As it is NULL, it will then try to
proceed to the next slot in the same level by adding the slot distance
at that level which is 64 - making the next try 127.  It goes around the
loop and finds and returns 127 skipping [64, 126].

Note that this bug also triggers in idr_for_each_entry() loop which
deletes during iteration as deletions can make layers go away leaving
the iteration with unaligned ID into missing layers.

Fix it by ensuring proceeding to the next slot doesn't carry over the
unaligned offset - ie.  use round_up(id + 1, slot_distance) instead of
id += slot_distance.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: David Teigland <teigland@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoblock: fix ext_devt_idr handling
Tomas Henzl [Thu, 28 Feb 2013 01:03:32 +0000 (17:03 -0800)]
block: fix ext_devt_idr handling

While adding and removing a lot of disks disks and partitions this
sometimes shows up:

  WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
  Hardware name:
  sysfs: cannot create duplicate filename '/dev/block/259:751'
  Modules linked in: raid1 autofs4 bnx2fc cnic uio fcoe libfcoe libfc 8021q scsi_transport_fc scsi_tgt garp stp llc sunrpc cpufreq_ondemand powernow_k8 freq_table mperf ipv6 dm_mirror dm_region_hash dm_log power_meter microcode dcdbas serio_raw amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core k10temp bnx2 sg ixgbe dca mdio ext4 mbcache jbd2 dm_round_robin sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi pata_atiixp ahci mptsas mptscsih mptbase scsi_transport_sas dm_multipath dm_mod [last unloaded: scsi_wait_scan]
  Pid: 44103, comm: async/16 Not tainted 2.6.32-195.el6.x86_64 #1
  Call Trace:
    warn_slowpath_common+0x87/0xc0
    warn_slowpath_fmt+0x46/0x50
    sysfs_add_one+0xc9/0x130
    sysfs_do_create_link+0x12b/0x170
    sysfs_create_link+0x13/0x20
    device_add+0x317/0x650
    idr_get_new+0x13/0x50
    add_partition+0x21c/0x390
    rescan_partitions+0x32b/0x470
    sd_open+0x81/0x1f0 [sd_mod]
    __blkdev_get+0x1b6/0x3c0
    blkdev_get+0x10/0x20
    register_disk+0x155/0x170
    add_disk+0xa6/0x160
    sd_probe_async+0x13b/0x210 [sd_mod]
    add_wait_queue+0x46/0x60
    async_thread+0x102/0x250
    default_wake_function+0x0/0x20
    async_thread+0x0/0x250
    kthread+0x96/0xa0
    child_rip+0xa/0x20
    kthread+0x0/0xa0
    child_rip+0x0/0x20

This most likely happens because dev_t is freed while the number is
still used and idr_get_new() is not protected on every use.  The fix
adds a mutex where it wasn't before and moves the dev_t free function so
it is called after device del.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokexec: avoid freeing NULL pointer in image_crash_alloc()
Zhang Yanfei [Thu, 28 Feb 2013 01:03:31 +0000 (17:03 -0800)]
kexec: avoid freeing NULL pointer in image_crash_alloc()

Though there is no error if we free a NULL pointer, I think we could
avoid this behaviour.  Change the code a little in kimage_crash_alloc()
could avoid this kind of unnecessary free.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokexec: fix memory leak in function kimage_normal_alloc
Zhang Yanfei [Thu, 28 Feb 2013 01:03:29 +0000 (17:03 -0800)]
kexec: fix memory leak in function kimage_normal_alloc

If kimage_normal_alloc() fails to alloc pages for image->swap_page, it
should call kimage_free_page_list() to free allocated pages in
image->control_pages list before it frees image.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokexec: prevent double free on image allocation failure
Sasha Levin [Thu, 28 Feb 2013 01:03:28 +0000 (17:03 -0800)]
kexec: prevent double free on image allocation failure

If kimage_normal_alloc() fails to initialize an allocated kimage, it will
free the image but would still set 'rimage', as a result kexec_load will
try to free it again.

This would explode as part of the freeing process is accessing internal
members which point to uninitialized memory.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokexec: export PG_hwpoison flag into vmcoreinfo
Mitsuhiro Tanino [Thu, 28 Feb 2013 01:03:27 +0000 (17:03 -0800)]
kexec: export PG_hwpoison flag into vmcoreinfo

This patch exports a PG_hwpoison into vmcoreinfo when
CONFIG_MEMORY_FAILURE is defined.  "makedumpfile" needs to read
information of memory, such as 'mem_section', 'zone', 'pageflags' from
vmcore.

We introduce a function into "makedumpfile" to exclude hwpoison page from
vmcore dump.  In order to introduce this function, PG_hwpoison flag have
to export into vmcoreinfo.

Signed-off-by: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokexec: get rid of duplicate check for hole_end
Zhang Yanfei [Thu, 28 Feb 2013 01:03:26 +0000 (17:03 -0800)]
kexec: get rid of duplicate check for hole_end

hole_end has been checked to make sure it is <= crash_res.end in the while
condition check, so the if condition check is duplicate.

Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokexec: add the values related to buddy system for filtering free pages.
Atsushi Kumagai [Thu, 28 Feb 2013 01:03:25 +0000 (17:03 -0800)]
kexec: add the values related to buddy system for filtering free pages.

tAdd adds the values related to buddy system to vmcoreinfo data so that
makedumpfile (dump filtering command) can filter out all free pages with
the new logic.

It's faster than the current logic because it can distinguish free page
by analyzing page structure at the same time as filtering for other
unnecessary pages (e.g.  anonymous page).

OTOH, the current logic has to trace free_list to distinguish free pages
while analyzing page structure to filter out other unnecessary pages.

The new logic uses the fact that buddy page is marked by _mapcount ==
PAGE_BUDDY_MAPCOUNT_VALUE.  But, _mapcount shares its memory with other
fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab
is set or not before looking up _mapcount value.  And we can get the
order of buddy system from private field.  To sum it up, the values
below are required for this logic.

Required values:
  - OFFSET(page._mapcount)
  - OFFSET(page.private)
  - NUMBER(PG_slab)
  - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)

Changelog from v1 to v2:
1. remove SIZE(pageflags)
  The new logic was changed after I sent v1 patch.
  Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile.

What's makedumpfile:
  makedumpfile creates a small dumpfile by excluding unnecessary pages
  for the analysis. To distinguish unnecessary pages, makedumpfile gets
  the vmcoreinfo data which has the minimum debugging information only
  for dump filtering.

Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofork: unshare: remove dead code
Alan Cox [Thu, 28 Feb 2013 01:03:23 +0000 (17:03 -0800)]
fork: unshare: remove dead code

If new_nsproxy is set we will always call switch_task_namespaces and
then set new_nsproxy back to NULL so the reassignment and fall through
check are redundant

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/seq_file.c:seq_lseek(): fix switch statement indenting
Andrew Morton [Thu, 28 Feb 2013 01:03:22 +0000 (17:03 -0800)]
fs/seq_file.c:seq_lseek(): fix switch statement indenting

[akpm@linux-foundation.org: checkpatch fixes]
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoseq-file: use SEEK_ macros instead of hardcoded numbers
Cyrill Gorcunov [Thu, 28 Feb 2013 01:03:21 +0000 (17:03 -0800)]
seq-file: use SEEK_ macros instead of hardcoded numbers

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agocoredump: use a freezable_schedule for the coredump_finish wait
Mandeep Singh Baines [Thu, 28 Feb 2013 01:03:20 +0000 (17:03 -0800)]
coredump: use a freezable_schedule for the coredump_finish wait

Prevents hung_task detector from panicing the machine. This is also
needed to prevent this wait from blocking suspend.

(It doesnt' currently block suspend but it would once the next
patch in this series is applied.)

[yongjun_wei@trendmicro.com.cn: kernel/exit.c: remove duplicated include]
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolockdep: check that no locks held at freeze time
Mandeep Singh Baines [Thu, 28 Feb 2013 01:03:18 +0000 (17:03 -0800)]
lockdep: check that no locks held at freeze time

We shouldn't try_to_freeze if locks are held.  Holding a lock can cause a
deadlock if the lock is later acquired in the suspend or hibernate path
(e.g.  by dpm).  Holding a lock can also cause a deadlock in the case of
cgroup_freezer if a lock is held inside a frozen cgroup that is later
acquired by a process outside that group.

[akpm@linux-foundation.org: export debug_check_no_locks_held]
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Ben Chan <benchan@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/proc/vmcore.c: put if tests in the top of the while loop to reduce duplication
Zhang Yanfei [Thu, 28 Feb 2013 01:03:17 +0000 (17:03 -0800)]
fs/proc/vmcore.c: put if tests in the top of the while loop to reduce duplication

In read_vmcore() two `if' tests are duplicated.  Change the position of
them could reduce the duplication.  This change does not affect the
behaviour of the function.

[akpm@linux-foundation.org: avoid `if (foo = bar)' thing, use min_t()]
[akpm@linux-foundation.org: s/max_t/min_t/]
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/proc: clean up printks
Andrew Morton [Thu, 28 Feb 2013 01:03:16 +0000 (17:03 -0800)]
fs/proc: clean up printks

- use pr_foo() throughout

- remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...")

- nuke a few warnings which I've never seen happen, ever.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agocoredump: remove redundant defines for dumpable states
Kees Cook [Thu, 28 Feb 2013 01:03:15 +0000 (17:03 -0800)]
coredump: remove redundant defines for dumpable states

The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_*
defines introduced in 54b501992dd2 ("coredump: warn about unsafe
suid_dumpable / core_pattern combo").  Remove the new ones, and use the
prior values instead.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Chen Gang <gang.chen@asianux.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agokernel/signal.c: fix suboptimal printk usage
Valdis Kletnieks [Thu, 28 Feb 2013 01:03:13 +0000 (17:03 -0800)]
kernel/signal.c: fix suboptimal printk usage

Several printk's were missing KERN_INFO and KERN_CONT flags.  In
addition, a printk that was outside a #if/#endif should have been
inside, which would result in stray blank line on non-x86 boxes.

Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agosignal: allow to send any siginfo to itself
Andrey Vagin [Thu, 28 Feb 2013 01:03:12 +0000 (17:03 -0800)]
signal: allow to send any siginfo to itself

The idea is simple.  We need to get the siginfo for each signal on
checkpointing dump, and then return it back on restore.

The first problem is that the kernel doesn't report complete siginfos to
userspace.  In a signal handler the kernel strips SI_CODE from siginfo.
When a siginfo is received from signalfd, it has a different format with
fixed sizes of fields.  The interface of signalfd was extended.  If a
signalfd is created with the flag SFD_RAW, it returns siginfo in a raw
format.

rt_sigqueueinfo looks suitable for restoring signals, but it can't send
siginfo with a positive si_code, because these codes are reserved for
the kernel.  In the real world each person has right to do anything with
himself, so I think a process should able to send any siginfo to itself.

This patch:

The kernel prevents sending of siginfo with positive si_code, because
these codes are reserved for kernel.  I think we can allow a task to
send such a siginfo to itself.  This operation should not be dangerous.

This functionality is required for restoring signals in
checkpoint/restart.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoDocumentation/cgroups/blkio-controller.txt: fix typo
Warren Turkal [Thu, 28 Feb 2013 01:03:11 +0000 (17:03 -0800)]
Documentation/cgroups/blkio-controller.txt: fix typo

Signed-off-by: Warren Turkal <wt@ooyala.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoDocumentation/DMA-API-HOWTO.txt: minor grammar corrections
Shuah Khan [Thu, 28 Feb 2013 01:03:10 +0000 (17:03 -0800)]
Documentation/DMA-API-HOWTO.txt: minor grammar corrections

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofat: mark fs as dirty on mount and clean on umount
Oleksij Rempel [Thu, 28 Feb 2013 01:03:09 +0000 (17:03 -0800)]
fat: mark fs as dirty on mount and clean on umount

There is no documented methods to mark FAT as dirty.  Unofficially MS
started to use reserved Byte in boot sector for this purpose, at least
since Win 2000.  With Win 7 user is warned if fs is dirty and asked to
clean it.

Different versions of Win, handle it in different ways, but always have
same meaning:

- Win 2000 and XP, set it on write operations and
  remove it after operation was finnished
- Win 7, set dirty flag on first write and remove it on umount.

We will do it as follows:

- set dirty flag on mount. If fs was initially dirty, warn user,
  remember it and do not do any changes to boot sector.
- clean it on umount. If fs was initially dirty, leave it dirty.
- do not do any thing if fs mounted read-only.
- TODO: leave fs dirty if we found some error after mount.

Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofat: add extended fileds to struct fat_boot_sector
Oleksij Rempel [Thu, 28 Feb 2013 01:03:07 +0000 (17:03 -0800)]
fat: add extended fileds to struct fat_boot_sector

Later we will need "state" field to check if volume was cleanly unmounted.

Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohfsplus: fix issue with unzeroed unused b-tree nodes
Vyacheslav Dubeyko [Thu, 28 Feb 2013 01:03:06 +0000 (17:03 -0800)]
hfsplus: fix issue with unzeroed unused b-tree nodes

The fsck_hfs (under MacOS X) complains about unzeroed unused b-tree nodes
after deletion of folders' tree under Linux.

SYMPTOMS:

  Running Disk Utiltiy's "Verify Disk" on "test" gives the following:
  Verifying volume “Test”
  Checking file systemChecking Journaled HFS Plus volume.
  Checking extents overflow file.
  Checking catalog file.
  Unused node is not erased (node = 3111)
  Checking multi-linked files.
  Checking catalog hierarchy.
  Checking extended attributes file.
  Checking volume bitmap.
  Checking volume information.
  The volume Test was found corrupt and needs to be repaired.
  Error: This disk needs to be repaired. Click Repair Disk.

REPRODUCING PATH:

1. Prepare HFS+ (non-case sensitive) partition (for example, 5GB)
   under MacOS X.
2. Copy linux kernel source tree (for example, 3.7-rc6 version) on
   this partition under MacOS X.
3. Then switch to Linux and mount this prepared partition.
4. Execute `sudo rm -r` under prepared directory with linux kernel
   source tree.
5. Unmount and boot back into OS X.
6. Open up Disk Utility and verify partition.

REPRODUCIBILITY: 100%

FIX:

It is added code of node clearing in hfs_bnode_put() method for the case
when node has flag HFS_BNODE_DELETED.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Kyle Laracey <kalaracey@gmail.com>
Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohfsplus: add support of manipulation by attributes file
Vyacheslav Dubeyko [Thu, 28 Feb 2013 01:03:04 +0000 (17:03 -0800)]
hfsplus: add support of manipulation by attributes file

Add support of manipulation by attributes file.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohfsplus: rework functionality of getting, setting and deleting of extended attributes
Vyacheslav Dubeyko [Thu, 28 Feb 2013 01:03:03 +0000 (17:03 -0800)]
hfsplus: rework functionality of getting, setting and deleting of extended attributes

Rework functionality of getting, setting and deleting of extended attributes.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohfsplus: add functionality of manipulating by records in attributes tree
Vyacheslav Dubeyko [Thu, 28 Feb 2013 01:03:01 +0000 (17:03 -0800)]
hfsplus: add functionality of manipulating by records in attributes tree

Add functionality of manipulating by records in attributes tree.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohfsplus: add on-disk layout declarations related to attributes tree
Vyacheslav Dubeyko [Thu, 28 Feb 2013 01:03:00 +0000 (17:03 -0800)]
hfsplus: add on-disk layout declarations related to attributes tree

Add all necessary on-disk layout declarations related to attributes file.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohfsplus: add osx.* prefix for handling namespace of Mac OS X extended attributes
Vyacheslav Dubeyko [Thu, 28 Feb 2013 01:02:59 +0000 (17:02 -0800)]
hfsplus: add osx.* prefix for handling namespace of Mac OS X extended attributes

hfsplus: reworked support of extended attributes.

Current mainline implementation of hfsplus file system driver treats as
extended attributes only two fields (fdType and fdCreator) of user_info
field in file description record (struct hfsplus_cat_file).  It is
possible to get or set only these two fields as extended attributes.
But HFS+ treats as com.apple.FinderInfo extended attribute an union of
user_info and finder_info fields as for file (struct hfsplus_cat_file)
as for folder (struct hfsplus_cat_folder).  Moreover, current mainline
implementation of hfsplus file system driver doesn't support special
metadata file - attributes tree.

Mac OS X 10.4 and later support extended attributes by making use of the
HFS+ filesystem Attributes file B*-tree feature which allows for named
forks.  Mac OS X supports only inline extended attributes, limiting
their size to 3802 bytes.  Any regular file may have a list of extended
attributes.  HFS+ supports an arbitrary number of named forks.  Each
attribute is denoted by a name and the associated data.  The name is a
null-terminated Unicode string.  It is possible to list, to get, to set,
and to remove extended attributes from files or directories.

It exists some peculiarity during getting of extended attributes list by
means of getfattr utility.  The getfattr utility expects prefix "user."
before any extended attribute's name.  So, it ignores any names that
don't contained such prefix.  Such behavior of getfattr utility results
in unexpected empty output of extended attributes list even in the case
when file (or folder) contains extended attributes.  It needs to use
empty string as regular expression pattern for names matching (getfattr
--match="").

For support of extended attributes in HFS+:
1. It was added necessary on-disk layout declarations related to Attributes
   tree into hfsplus_raw.h file.
2. It was added attributes.c file with implementation of functionality of
   manipulation by records in Attributes tree.
3. It was reworked hfsplus_listxattr, hfsplus_getxattr, hfsplus_setxattr
   functions in ioctl.c. Moreover, it was added hfsplus_removexattr method.

This patch:

Add osx.* prefix for handling namespace of Mac OS X extended attributes.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolib/scatterlist: use page iterator in the mapping iterator
Imre Deak [Thu, 28 Feb 2013 01:02:57 +0000 (17:02 -0800)]
lib/scatterlist: use page iterator in the mapping iterator

For better code reuse use the newly added page iterator to iterate
through the pages.  The offset, length within the page is still
calculated by the mapping iterator as well as the actual mapping.  Idea
from Tejun Heo.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolib/scatterlist: add simple page iterator
Imre Deak [Thu, 28 Feb 2013 01:02:56 +0000 (17:02 -0800)]
lib/scatterlist: add simple page iterator

Add an iterator to walk through a scatter list a page at a time starting
at a specific page offset.  As opposed to the mapping iterator this is
meant to be small, performing well even in simple loops like collecting
all pages on the scatterlist into an array or setting up an iommu table
based on the pages' DMA address.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: update Tegra section to capture all Tegra files
Stephen Warren [Thu, 28 Feb 2013 01:02:54 +0000 (17:02 -0800)]
MAINTAINERS: update Tegra section to capture all Tegra files

The intent is to ensure that all Tegra-related patches are sent to the
linux-tegra@ mailing list, so people can keep up-to-date on all misc
driver changes.

Doing this with a keyword is far simpler and more compact than listing
all Tegra-related drivers, even if wildcards were used.

Words such as integrate or integrator are common.  Ensure the character
right before "tegra" isn't a-z (case-insensitive), to make sure the
keyword doesn't match those.

The only files that the keyword doesn't match are the NVEC driver.  Add
the linux-tegra mailing list to the NVEC entry to solve this.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Cc: Joe Perches <joe@perches.com>
Cc: Julian Andres Klode <jak@jak-linux.org>
Cc: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoget_maintainer: allow keywords to match filenames
Stephen Warren [Thu, 28 Feb 2013 01:02:53 +0000 (17:02 -0800)]
get_maintainer: allow keywords to match filenames

Allow K: entries in MAINTAINERS to match directly against filenames;
either those extracted from patch +++ or --- lines, or those specified
on the command-line using the -f option.

This potentially allows fewer lines in a MAINTAINERS entry, if all the
relevant files are scattered throughout the whole kernel tree, yet
contain some common keyword.  An example would be using an ARM SoC name
as the keyword to catch all related drivers.

I don't think setting exact_pattern_match_hash would be appropriate
here; at least for intended Tegra use case, this feature is to ensure
that all Tegra-related driver changes get Cc'd to the Tegra mailing
list.  Setting exact_pattern_match_hash would prevent git history
parsing for e.g.  S-o-b tags, which still seems like it would be useful.
Hence, this flag isn't set.

Signed-off-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agousermodehelper: cleanup/fix __orderly_poweroff() && argv_free()
Oleg Nesterov [Thu, 28 Feb 2013 01:02:52 +0000 (17:02 -0800)]
usermodehelper: cleanup/fix __orderly_poweroff() && argv_free()

__orderly_poweroff() does argv_free() if call_usermodehelper_fns()
returns -ENOMEM.  As Lucas pointed out, this can be wrong if -ENOMEM was
not triggered by the failing call_usermodehelper_setup(), in this case
both __orderly_poweroff() and argv_cleanup() can do kfree().

Kill argv_cleanup() and change __orderly_poweroff() to call argv_free()
unconditionally like do_coredump() does.  This info->cleanup() is not
needed (and wrong) since 6c0c0d4d "fix bug in orderly_poweroff() which
did the UMH_NO_WAIT => UMH_WAIT_EXEC change, we can rely on the fact
that CLONE_VFORK can't return until do_execve() succeeds/fails.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: hongfeng <hongfeng@marvell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: use vm_unmapped_area() on frv architecture
Michel Lespinasse [Thu, 28 Feb 2013 01:02:51 +0000 (17:02 -0800)]
mm: use vm_unmapped_area() on frv architecture

Update the frv arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: ac->ac_allow_chain_relink=0 won't disable group relink
Xiaowei.Hu [Thu, 28 Feb 2013 01:02:49 +0000 (17:02 -0800)]
ocfs2: ac->ac_allow_chain_relink=0 won't disable group relink

ocfs2_block_group_alloc_discontig() disables chain relink by setting
ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple
cluster groups.

It doesn't keep the credits for all chain relink,but
ocfs2_claim_suballoc_bits overrides this in this call trace:
ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()->
__ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits()
ocfs2_claim_suballoc_bits set ac->ac_allow_chain_relink = 1; then call
ocfs2_search_chain() one time and disable it again, and then we run out
of credits.

Fix is to allow relink by default and disable it in
ocfs2_block_group_alloc_discontig.

Without this patch, End-users will run into a crash due to run out of
credits, backtrace like this:

  RIP: 0010:[<ffffffffa0808b14>]  [<ffffffffa0808b14>]
  jbd2_journal_dirty_metadata+0x164/0x170 [jbd2]
  RSP: 0018:ffff8801b919b5b8  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0
  RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8
  RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0
  R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800
  FS:  00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0)
  Call Trace:
    ocfs2_journal_dirty+0x2f/0x70 [ocfs2]
    ocfs2_relink_block_group+0x111/0x480 [ocfs2]
    ocfs2_search_chain+0x455/0x9a0 [ocfs2]
    ...

Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2: fix ocfs2_init_security_and_acl() to initialize acl correctly
Jeff Liu [Thu, 28 Feb 2013 01:02:48 +0000 (17:02 -0800)]
ocfs2: fix ocfs2_init_security_and_acl() to initialize acl correctly

We need to re-initialize the security for a new reflinked inode with its
parent dirs if it isn't specified to be preserved for ocfs2_reflink().
However, the code logic is broken at ocfs2_init_security_and_acl()
although ocfs2_init_security_get() succeed.  As a result,
ocfs2_acl_init() does not involked and therefore the default ACL of
parent dir was missing on the new inode.

Note this was introduced by 9d8f13ba3 ("security: new
security_inode_init_security API adds function callback")

To reproduce:

    set default ACL for the parent dir(ocfs2 in this case):
    $ setfacl -m default:user:jeff:rwx ../ocfs2/
    $ getfacl ../ocfs2/
    # file: ../ocfs2/
    # owner: jeff
    # group: jeff
    user::rwx
    group::r-x
    other::r-x
    default:user::rwx
    default:user:jeff:rwx
    default:group::r-x
    default:mask::rwx
    default:other::r-x

    $ touch a
    $ getfacl a
    # file: a
    # owner: jeff
    # group: jeff
    user::rw-
    group::rw-
    other::r--

Before patching, create reflink file b from a, the user
default ACL entry(user:jeff:rwx)was missing:

    $ ./ocfs2_reflink a b
    $ getfacl b
    # file: b
    # owner: jeff
    # group: jeff
    user::rw-
    group::rw-
    other::r--

In this case, the end user can also observed an error message at syslog:

  (ocfs2_reflink,3229,2):ocfs2_init_security_and_acl:7193 ERROR: status = 0

After applying this patch, create reflink file c from a:

    $ ./ocfs2_reflink a c
    $ getfacl c
    # file: c
    # owner: jeff
    # group: jeff
    user::rw-
    user:jeff:rwx #effective:rw-
    group::r-x #effective:r--
    mask::rw-
    other::r--

Test program:
/* Usage: reflink <source> <dest> */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>

static int
reflink_file(char const *src_name, char const *dst_name,
     bool preserve_attrs)
{
int fd;

#ifndef REFLINK_ATTR_NONE
#  define REFLINK_ATTR_NONE 0
#endif
#ifndef REFLINK_ATTR_PRESERVE
#  define REFLINK_ATTR_PRESERVE 1
#endif
#ifndef OCFS2_IOC_REFLINK
struct reflink_arguments {
uint64_t old_path;
uint64_t new_path;
uint64_t preserve;
};

#  define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments)
#endif
struct reflink_arguments args = {
.old_path = (unsigned long) src_name,
.new_path = (unsigned long) dst_name,
.preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE :
     REFLINK_ATTR_NONE,
};

fd = open(src_name, O_RDONLY);
if (fd < 0) {
fprintf(stderr, "Failed to open %s: %s\n",
src_name, strerror(errno));
return -1;
}

if (ioctl(fd, OCFS2_IOC_REFLINK, &args) < 0) {
fprintf(stderr, "Failed to reflink %s to %s: %s\n",
src_name, dst_name, strerror(errno));
return -1;
}
}

int
main(int argc, char *argv[])
{
if (argc != 3) {
fprintf(stdout, "Usage: %s source dest\n", argv[0]);
return 1;
}

return reflink_file(argv[1], argv[2], 0);
}

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Tao Ma <boyu.mt@taobao.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoscripts/kernel-doc: handle struct member __aligned without numbers
Nishanth Menon [Thu, 28 Feb 2013 01:02:46 +0000 (17:02 -0800)]
scripts/kernel-doc: handle struct member __aligned without numbers

Commit ef5da59f1260 ("scripts/kernel-doc: handle struct member
__aligned") permits "char something [123] __aligned(8);".

However, by using \d we constraint ourselves with integers.  This is not
always the case.  In fact, it might be better to do char something[123]
__aligned(sizeof(u16));

For example, With wireless_dev defining:

    u8 address[ETH_ALEN] __aligned(sizeof(u16));

With \d, scripts/kernel-doc erroneously says:

    Warning(include/net/cfg80211.h:2618): Excess struct/union/enum/typedef member 'address' description in 'wireless_dev'

This is because the regex __aligned\s*\(\d+\) fails match at \d as
sizeof is used.

So replace \d with .  to indicate "something" in kernel-doc to ignore
__aligned(SOMETHING) in structs.  With this change, we can use integers
OR sizeof() or macros as we please.

Signed-off-by: Nishanth Menon <nm@ti.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: accelerate munlock() treatment of THP pages
Michel Lespinasse [Thu, 28 Feb 2013 01:02:44 +0000 (17:02 -0800)]
mm: accelerate munlock() treatment of THP pages

munlock_vma_pages_range() was always incrementing addresses by PAGE_SIZE
at a time.  When munlocking THP pages (or the huge zero page), this
resulted in taking the mm->page_table_lock 512 times in a row.

We can do better by making use of the page_mask returned by
follow_page_mask (for the huge zero page case), or the size of the page
munlock_vma_page() operated on (for the true THP page case).

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agobacklight: add new lp8788 backlight driver
Kim, Milo [Thu, 28 Feb 2013 01:02:43 +0000 (17:02 -0800)]
backlight: add new lp8788 backlight driver

TI LP8788 PMU supports regulators, battery charger, RTC, ADC, backlight
dri= ver and current sinks.  This patch enables LP8788 backlight module.

(Brightness mode)
The brightness is controlled by PWM input or I2C register.
All modes are supported in the driver.

(Platform data)
Configurable data can be defined in the platform side.
 name                  : backlight driver name. (default: "lcd-backlight")
 initial_brightness    : initial value of backlight brightness
 bl_mode               : brightness control by PWM or lp8788 register
 dim_mode              : dimming mode selection
 full_scale            : full scale current setting
 rise_time             : brightness ramp up step time
 fall_time             : brightness ramp down step time
 pwm_pol               : PWM polarity setting when bl_mode is PWM based
 period_ns             : platform specific PWM period value. unit is nano.

The default values are set in case no platform data is defined.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Thierry Reding <thierry.reding@avionic-design.de>
Cc: "devendra.aaru" <devendra.aaru@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolib/devres.c: fix misplaced #endif
Jingoo Han [Thu, 28 Feb 2013 01:02:42 +0000 (17:02 -0800)]
lib/devres.c: fix misplaced #endif

A misplaced #endif causes link errors related to pcim_*() functions.

This is because pcim_*() functions are related to CONFIG_PCI option,
however these are not related to CONFIG_HAS_IOPORT option.  Therefore,
when CONFIG_PCI is enabled and CONFIG_HAS_IOPORT is not enabled, it makes
link errors related to pcim_*() functions as below:

drivers/ata/libata-sff.c:3233: undefined reference to `pcim_iomap_regions'
drivers/ata/libata-sff.c:3238: undefined reference to `pcim_iomap_table'
drivers/built-in.o: In function `ata_pci_sff_init_host':
drivers/ata/libata-sff.c:2318: undefined reference to `pcim_iomap_regions'
drivers/ata/libata-sff.c:2329: undefined reference to `pcim_iomap_table

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Greg KH <greg@kroah.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: use vm_unmapped_area() on parisc architecture
Michel Lespinasse [Thu, 28 Feb 2013 01:02:40 +0000 (17:02 -0800)]
mm: use vm_unmapped_area() on parisc architecture

Update the parisc arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

[akpm@linux-foundation.org: remove now-unused DCACHE_ALIGN(), per James]
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Acked-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrivers/video/backlight/ams369fg06.c: make power_on() call optional
Jingoo Han [Thu, 28 Feb 2013 01:02:39 +0000 (17:02 -0800)]
drivers/video/backlight/ams369fg06.c: make power_on() call optional

This patch makes power_on() call optional.  The voltage source can be
provided to some boards using ams369fg06 panel, thus in this case, power
on/off sequence is not necessary.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agocheckpatch: improve CamelCase test for Page
Joe Perches [Thu, 28 Feb 2013 01:02:38 +0000 (17:02 -0800)]
checkpatch: improve CamelCase test for Page

Add the ClearPage/SetPage/TestClearPage/TestSetPage variants to the not
reported Page CamelCase variables.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agosysrq: don't depend on weak undefined arrays to have an address that compares as...
Linus Torvalds [Wed, 27 Feb 2013 17:59:50 +0000 (09:59 -0800)]
sysrq: don't depend on weak undefined arrays to have an address that compares as NULL

When taking an address of an extern array, gcc quite naturally should be
able to say "an address of an object can never be NULL" and just
optimize away the test entirely.

However, the new alternate sysrq reset code (commit 154b7a489a5b:
"Input: sysrq - allow specifying alternate reset sequence") did exactly
that, and declared platform_sysrq_reset_seq[] as a weak array, and
expecting that testing the address of the array would show whether it
actually got linked against something or not.

And that doesn't work with all gcc versions.  Clearly it works with
*some* versions of gcc, and maybe it's even supposed to work, but it
really is a very fragile concept.

So instead of testing the address of the weak variable, just create a
weak instance of that array that is empty.  If some platform then has a
real platform_sysrq_reset_seq[] that overrides our weak one, the linker
will switch to that one, and it all works without any run-time
conditionals at all.

Reported-by: Dave Airlie <airlied@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: do not grow the stack vma just because of an overrun on preceding vma
Linus Torvalds [Wed, 27 Feb 2013 16:36:04 +0000 (08:36 -0800)]
mm: do not grow the stack vma just because of an overrun on preceding vma

The stack vma is designed to grow automatically (marked with VM_GROWSUP
or VM_GROWSDOWN depending on architecture) when an access is made beyond
the existing boundary.  However, particularly if you have not limited
your stack at all ("ulimit -s unlimited"), this can cause the stack to
grow even if the access was really just one past *another* segment.

And that's wrong, especially since we first grow the segment, but then
immediately later enforce the stack guard page on the last page of the
segment.  So _despite_ first growing the stack segment as a result of
the access, the kernel will then make the access cause a SIGSEGV anyway!

So do the same logic as the guard page check does, and consider an
access to within one page of the next segment to be a bad access, rather
than growing the stack to abut the next segment.

Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Wed, 27 Feb 2013 04:16:07 +0000 (20:16 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs pile (part one) from Al Viro:
 "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
  locking violations, etc.

  The most visible changes here are death of FS_REVAL_DOT (replaced with
  "has ->d_weak_revalidate()") and a new helper getting from struct file
  to inode.  Some bits of preparation to xattr method interface changes.

  Misc patches by various people sent this cycle *and* ocfs2 fixes from
  several cycles ago that should've been upstream right then.

  PS: the next vfs pile will be xattr stuff."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
  saner proc_get_inode() calling conventions
  proc: avoid extra pde_put() in proc_fill_super()
  fs: change return values from -EACCES to -EPERM
  fs/exec.c: make bprm_mm_init() static
  ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
  ocfs2: fix possible use-after-free with AIO
  ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
  get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
  target: writev() on single-element vector is pointless
  export kernel_write(), convert open-coded instances
  fs: encode_fh: return FILEID_INVALID if invalid fid_type
  kill f_vfsmnt
  vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
  nfsd: handle vfs_getattr errors in acl protocol
  switch vfs_getattr() to struct path
  default SET_PERSONALITY() in linux/elf.h
  ceph: prepopulate inodes only when request is aborted
  d_hash_and_lookup(): export, switch open-coded instances
  9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
  9p: split dropping the acls from v9fs_set_create_acl()
  ...

8 years agoMerge tag 'xtensa-next-20130225' of git://github.com/czankel/xtensa-linux
Linus Torvalds [Wed, 27 Feb 2013 03:53:12 +0000 (19:53 -0800)]
Merge tag 'xtensa-next-20130225' of git://github.com/czankel/xtensa-linux

Pull xtensa update from Chris Zankel:
 "Added features:
   - add support for thread local storage (TLS)

   - add accept4 and finit_module syscalls

   - support medium-priority interrupts

   - add support for dc232c processor variant

   - support file-base simulated disk for ISS simulator

  Bug fixes:

   - fix return values returned by the str[n]cmp functions

   - avoid mmap cache aliasing

   - fix handling of 'windowed registers' in ptrace"

* tag 'xtensa-next-20130225' of git://github.com/czankel/xtensa-linux:
  xtensa: add accept4 syscall
  xtensa: add support for TLS
  xtensa: add missing include asm/uaccess.h to checksum.h
  xtensa: do not enable GENERIC_GPIO by default
  xtensa: complete ptrace handling of register windows
  xtensa: add support for oprofile
  xtensa: move spill_registers to traps.h
  xtensa: ISS: add host file-based simulated disk
  xtensa: fix str[n]cmp return value
  xtensa: avoid mmap cache aliasing
  xtensa: add finit_module syscall
  xtensa: pull signal definitions from signal-defs.h
  xtensa: fix ipc_parse_version selection
  xtensa: dispatch medium-priority interrupts
  xtensa: Add config files for Diamond 233L - Rev C processor variant
  xtensa: use new common dtc rule
  xtensa: rename prom_update_property to of_update_property

8 years agoMerge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
Linus Torvalds [Wed, 27 Feb 2013 03:50:22 +0000 (19:50 -0800)]
Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze update from Michal Simek:
 "Microblaze changes.

  After my discussion with Arnd I have also added there asm-generic io
  patch which is Acked by him and Geert."

* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
  asm-generic: io: Fix ioread16/32be and iowrite16/32be
  microblaze: Do not use module.h in files which are not modules
  microblaze: Fix coding style issues
  microblaze: Add missing return from debugfs_tlb
  microblaze: Makefile clean
  microblaze: Add .gitignore entries for auto-generated files
  microblaze: Fix strncpy_from_user macro

8 years agoMerge branch 'for-upstream' of git://openrisc.net/jonas/linux
Linus Torvalds [Wed, 27 Feb 2013 03:46:23 +0000 (19:46 -0800)]
Merge branch 'for-upstream' of git://openrisc.net/jonas/linux

Pull OpenRISC updates from Jonas Bonn:
 "An equal number of bug fixes and trivial cleanups; no new features.

   - Two patches to fix errors thrown by the updated toolchain.

   - Three other bug fixes.

   - Four trivial cleanups."

* 'for-upstream' of git://openrisc.net/jonas/linux:
  openrisc: add missing header inclusion
  openrisc: really pass correct arg to schedule_tail
  Add bitops include needed for ext2 filesystem
  openrisc: update DTLB-miss handler last
  openrisc: fix up vmalloc page table loading
  openrisc idle: delete pm_idle
  openrisc: remove CONFIG_SYMBOL_PREFIX
  openrisc: avoid using function parameter regs in reset vector
  openrisc: remove unused current_regs

8 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 27 Feb 2013 03:45:29 +0000 (19:45 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge
  Revert "x86, mm: Make spurious_fault check explicitly check explicitly check the PRESENT bit"
  x86/mm/numa: Don't check if node is NUMA_NO_NODE
  x86, efi: Make "noefi" really disable EFI runtime serivces
  x86/apic: Fix parsing of the 'lapic' cmdline option