~shefty/rdma-dev.git
9 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
David S. Miller [Sat, 28 Jan 2012 01:40:18 +0000 (20:40 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless

9 years agonet: explicitly add jump_label.h header to sock.h
Glauber Costa [Thu, 26 Jan 2012 12:09:28 +0000 (12:09 +0000)]
net: explicitly add jump_label.h header to sock.h

Commit 36a1211970193ce215de50ed1e4e1272bc814df1 removed linux/module.h
include statement from one of the headers that end up in net/sock.h.
It was providing us with static_branch() definition implicitly, so
after its removal the build got broken.

To fix this, and avoid having this happening in the future,
let me do the right thing and include linux/jump_label.h
explicitly in sock.h.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: RTNETLINK adjusting values of min_ifinfo_dump_size
Stefan Gula [Thu, 26 Jan 2012 11:01:06 +0000 (11:01 +0000)]
net: RTNETLINK adjusting values of min_ifinfo_dump_size

Setting link parameters on a netdevice changes the value
of if_nlmsg_size(), therefore it is necessary to recalculate
min_ifinfo_dump_size.

Signed-off-by: Stefan Gula <steweg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoipv6: Fix ip_gre lockless xmits.
Willem de Bruijn [Thu, 26 Jan 2012 10:34:35 +0000 (10:34 +0000)]
ipv6: Fix ip_gre lockless xmits.

Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK.  Sit and
ipip set this unconditionally in ops->setup, but gre enables it
conditionally after parameter passing in ops->newlink. This is
not called during tunnel setup as below, however, so GRE tunnels are
still taking the lock.

modprobe ip_gre
ip tunnel add test0 mode gre remote 10.5.1.1 dev lo
ip link set test0 up
ip addr add 10.6.0.1 dev test0
 # cat /sys/class/net/test0/features
 # $DIR/test_tunnel_xmit 10 10.5.2.1
ip route add 10.5.2.0/24 dev test0
ip tunnel del test0

The newlink callback is only called in rtnl_netlink, and only if
the device is new, as it calls register_netdevice internally. Gre
tunnels are created at 'ip tunnel add' with ioctl SIOCADDTUNNEL,
which calls ipgre_tunnel_locate, which calls register_netdev.
rtnl_newlink is called at 'ip link set', but skips ops->newlink
and the device is up with locking still enabled. The equivalent
ipip tunnel works fine, btw (just substitute 'method gre' for
'method ipip').

On kernels before /sys/class/net/*/features was removed [1],
the first commented out line returns 0x6000 with method gre,
which indicates that NETIF_F_LLTX (0x1000) is not set. With ipip,
it reports 0x7000. This test cannot be used on recent kernels where
the sysfs file is removed (and ETHTOOL_GFEATURES does not currently
work for tunnel devices, because they lack dev->ethtool_ops).

The second commented out line calls a simple transmission test [2]
that sends on 24 cores at maximum rate. Results of a single run:

ipip: 19,372,306
gre before patch:  4,839,753
gre after patch: 19,133,873

This patch replicates the condition check in ipgre_newlink to
ipgre_tunnel_locate. It works for me, both with oseq on and off.
This is the first time I looked at rtnetlink and iproute2 code,
though, so someone more knowledgeable should probably check the
patch. Thanks.

The tail of both functions is now identical, by the way. To avoid
code duplication, I'll be happy to rework this and merge the two.

[1] http://patchwork.ozlabs.org/patch/104610/
[2] http://kernel.googlecode.com/files/xmit_udp_parallel.c

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoxen-netfront: correct MAX_TX_TARGET calculation.
Wei Liu [Thu, 26 Jan 2012 07:23:23 +0000 (07:23 +0000)]
xen-netfront: correct MAX_TX_TARGET calculation.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetns: fix net_alloc_generic()
Eric Dumazet [Thu, 26 Jan 2012 00:41:38 +0000 (00:41 +0000)]
netns: fix net_alloc_generic()

When a new net namespace is created, we should attach to it a "struct
net_generic" with enough slots (even empty), or we can hit the following
BUG_ON() :

[  200.752016] kernel BUG at include/net/netns/generic.h:40!
...
[  200.752016]  [<ffffffff825c3cea>] ? get_cfcnfg+0x3a/0x180
[  200.752016]  [<ffffffff821cf0b0>] ? lockdep_rtnl_is_held+0x10/0x20
[  200.752016]  [<ffffffff825c41be>] caif_device_notify+0x2e/0x530
[  200.752016]  [<ffffffff810d61b7>] notifier_call_chain+0x67/0x110
[  200.752016]  [<ffffffff810d67c1>] raw_notifier_call_chain+0x11/0x20
[  200.752016]  [<ffffffff821bae82>] call_netdevice_notifiers+0x32/0x60
[  200.752016]  [<ffffffff821c2b26>] register_netdevice+0x196/0x300
[  200.752016]  [<ffffffff821c2ca9>] register_netdev+0x19/0x30
[  200.752016]  [<ffffffff81c1c67a>] loopback_net_init+0x4a/0xa0
[  200.752016]  [<ffffffff821b5e62>] ops_init+0x42/0x180
[  200.752016]  [<ffffffff821b600b>] setup_net+0x6b/0x100
[  200.752016]  [<ffffffff821b6466>] copy_net_ns+0x86/0x110
[  200.752016]  [<ffffffff810d5789>] create_new_namespaces+0xd9/0x190

net_alloc_generic() should take into account the maximum index into the
ptr array, as a subsystem might use net_generic() anytime.

This also reduces number of reallocations in net_assign_generic()

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Sjur Brændeland <sjur.brandeland@stericsson.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: bind() optimize port allocation
Flavio Leitner [Wed, 25 Jan 2012 08:34:52 +0000 (08:34 +0000)]
tcp: bind() optimize port allocation

Port autoselection finds a port and then drop the lock,
then right after that, gets the hash bucket again and lock it.

Fix it to go direct.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: bind() fix autoselection to share ports
Flavio Leitner [Wed, 25 Jan 2012 08:34:51 +0000 (08:34 +0000)]
tcp: bind() fix autoselection to share ports

The current code checks for conflicts when the application
requests a specific port.  If there is no conflict, then
the request is granted.

On the other hand, the port autoselection done by the kernel
fails when all ports are bound even when there is a port
with no conflict available.

The fix changes port autoselection to check if there is a
conflict and use it if not.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agol2tp: l2tp_ip - fix possible oops on packet receive
James Chapman [Wed, 25 Jan 2012 02:39:05 +0000 (02:39 +0000)]
l2tp: l2tp_ip - fix possible oops on packet receive

When a packet is received on an L2TP IP socket (L2TPv3 IP link
encapsulation), the l2tpip socket's backlog_rcv function calls
xfrm4_policy_check(). This is not necessary, since it was called
before the skb was added to the backlog. With CONFIG_NET_NS enabled,
xfrm4_policy_check() will oops if skb->dev is null, so this trivial
patch removes the call.

This bug has always been present, but only when CONFIG_NET_NS is
enabled does it cause problems. Most users are probably using UDP
encapsulation for L2TP, hence the problem has only recently
surfaced.

EIP: 0060:[<c12bb62b>] EFLAGS: 00210246 CPU: 0
EIP is at l2tp_ip_recvmsg+0xd4/0x2a7
EAX: 00000001 EBX: d77b5180 ECX: 00000000 EDX: 00200246
ESI: 00000000 EDI: d63cbd30 EBP: d63cbd18 ESP: d63cbcf4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Call Trace:
 [<c1218568>] sock_common_recvmsg+0x31/0x46
 [<c1215c92>] __sock_recvmsg_nosec+0x45/0x4d
 [<c12163a1>] __sock_recvmsg+0x31/0x3b
 [<c1216828>] sock_recvmsg+0x96/0xab
 [<c10b2693>] ? might_fault+0x47/0x81
 [<c10b2693>] ? might_fault+0x47/0x81
 [<c1167fd0>] ? _copy_from_user+0x31/0x115
 [<c121e8c8>] ? copy_from_user+0x8/0xa
 [<c121ebd6>] ? verify_iovec+0x3e/0x78
 [<c1216604>] __sys_recvmsg+0x10a/0x1aa
 [<c1216792>] ? sock_recvmsg+0x0/0xab
 [<c105a99b>] ? __lock_acquire+0xbdf/0xbee
 [<c12d5a99>] ? do_page_fault+0x193/0x375
 [<c10d1200>] ? fcheck_files+0x9b/0xca
 [<c10d1259>] ? fget_light+0x2a/0x9c
 [<c1216bbb>] sys_recvmsg+0x2b/0x43
 [<c1218145>] sys_socketcall+0x16d/0x1a5
 [<c11679f0>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c100305f>] sysenter_do_call+0x12/0x38
Code: c6 05 8c ea a8 c1 01 e8 0c d4 d9 ff 85 f6 74 07 3e ff 86 80 00 00 00 b9 17 b6 2b c1 ba 01 00 00 00 b8 78 ed 48 c1 e8 23 f6 d9 ff <ff> 76 0c 68 28 e3 30 c1 68 2d 44 41 c1 e8 89 57 01 00 83 c4 0c

Signed-off-by: James Chapman <jchapman@katalix.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Tue, 24 Jan 2012 23:51:40 +0000 (15:51 -0800)]
Merge git://git./linux/kernel/git/davem/net

Davem says:

1) Fix JIT code generation on x86-64 for divide by zero, from Eric Dumazet.

2) tg3 header length computation correction from Eric Dumazet.

3) More build and reference counting fixes for socket memory cgroup
   code from Glauber Costa.

4) module.h snuck back into a core header after all the hard work we
   did to remove that, from Paul Gortmaker and Jesper Dangaard Brouer.

5) Fix PHY naming regression and add some new PCI IDs in stmmac, from
   Alessandro Rubini.

6) Netlink message generation fix in new team driver, should only advertise
   the entries that changed during events, from Jiri Pirko.

7) SRIOV VF registration and unregistration fixes, and also add a
   missing PCI ID, from Roopa Prabhu.

8) Fix infinite loop in tx queue flush code of brcmsmac, from Stanislaw Gruszka.

9) ftgmac100/ftmac100 build fix, missing interrupt.h include.

10) Memory leak fix in net/hyperv do_set_mutlicast() handling, from Wei Yongjun.

11) Off by one fix in netem packet scheduler, from Vijay Subramanian.

12) TCP loss detection fix from Yuchung Cheng.

13) TCP reset packet MD5 calculation uses wrong address, fix from Shawn Lu.

14) skge carrier assertion and DMA mapping fixes from Stephen Hemminger.

15) Congestion recovery undo performed at the wrong spot in BIC and CUBIC
    congestion control modules, fix from Neal Cardwell.

16) Ethtool ETHTOOL_GSSET_INFO is unnecessarily restrictive, from Michał Mirosław.

17) Fix triggerable race in ipv6 sysctl handling, from Francesco Ruggeri.

18) Statistics bug fixes in mlx4 from Eugenia Emantayev.

19) rds locking bug fix during info dumps, from your's truly.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits)
  rds: Make rds_sock_lock BH rather than IRQ safe.
  netprio_cgroup.h: dont include module.h from other includes
  net: flow_dissector.c missing include linux/export.h
  team: send only changed options/ports via netlink
  net/hyperv: fix possible memory leak in do_set_multicast()
  drivers/net: dsa/mv88e6xxx.c files need linux/module.h
  stmmac: added PCI identifiers
  llc: Fix race condition in llc_ui_recvmsg
  stmmac: fix phy naming inconsistency
  dsa: Add reporting of silicon revision for Marvell 88E6123/88E6161/88E6165 switches.
  tg3: fix ipv6 header length computation
  skge: add byte queue limit support
  mv643xx_eth: Add Rx Discard and Rx Overrun statistics
  bnx2x: fix compilation error with SOE in fw_dump
  bnx2x: handle CHIP_REVISION during init_one
  bnx2x: allow user to change ring size in ISCSI SD mode
  bnx2x: fix Big-Endianess in ethtool -t
  bnx2x: fixed ethtool statistics for MF modes
  bnx2x: credit-leakage fixup on vlan_mac_del_all
  macvlan: fix a possible use after free
  ...

9 years agords: Make rds_sock_lock BH rather than IRQ safe.
David S. Miller [Tue, 24 Jan 2012 22:03:44 +0000 (17:03 -0500)]
rds: Make rds_sock_lock BH rather than IRQ safe.

rds_sock_info() triggers locking warnings because we try to perform a
local_bh_enable() (via sock_i_ino()) while hardware interrupts are
disabled (via taking rds_sock_lock).

There is no reason for rds_sock_lock to be a hardware IRQ disabling
lock, none of these access paths run in hardware interrupt context.

Therefore making it a BH disabling lock is safe and sufficient to
fix this bug.

Reported-by: Kumar Sanghvi <kumaras@chelsio.com>
Reported-by: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetprio_cgroup.h: dont include module.h from other includes
Paul Gortmaker [Tue, 24 Jan 2012 11:33:19 +0000 (11:33 +0000)]
netprio_cgroup.h: dont include module.h from other includes

A considerable effort was invested in wiping out module.h
from being present in all the other standard includes.  This
one leaked back in, but once again isn't strictly necessary,
so remove it.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: flow_dissector.c missing include linux/export.h
Jesper Dangaard Brouer [Tue, 24 Jan 2012 21:03:33 +0000 (16:03 -0500)]
net: flow_dissector.c missing include linux/export.h

The file net/core/flow_dissector.c seems to be missing
including linux/export.h.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoteam: send only changed options/ports via netlink
Jiri Pirko [Tue, 24 Jan 2012 05:16:00 +0000 (05:16 +0000)]
team: send only changed options/ports via netlink

This patch changes event message behaviour to send only updated records
instead of whole list. This fixes bug on which userspace receives non-actual
data in case multiple events occur in row.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet/hyperv: fix possible memory leak in do_set_multicast()
Wei Yongjun [Tue, 24 Jan 2012 10:21:28 +0000 (10:21 +0000)]
net/hyperv: fix possible memory leak in do_set_multicast()

do_set_multicast() may not free the memory malloc in
netvsc_set_multicast_list().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agodrivers/net: dsa/mv88e6xxx.c files need linux/module.h
Paul Gortmaker [Tue, 24 Jan 2012 10:41:40 +0000 (10:41 +0000)]
drivers/net: dsa/mv88e6xxx.c files need linux/module.h

An implicit instance of module.h leaked back into existence
and was masking the fact that these drivers weren't calling
out the include for itself.  Fix the drivers before we remove
the implicit include path via net/netprio_cgroup.h file.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agostmmac: added PCI identifiers
Alessandro Rubini [Mon, 23 Jan 2012 23:08:56 +0000 (23:08 +0000)]
stmmac: added PCI identifiers

STM has a device ID within its own VENDOR space, and it is being
used in the STA2X11 I/O Hub.

Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agollc: Fix race condition in llc_ui_recvmsg
Radu Iliescu [Thu, 19 Jan 2012 03:57:57 +0000 (03:57 +0000)]
llc: Fix race condition in llc_ui_recvmsg

There is a race on sk_receive_queue between llc_ui_recvmsg and
sock_queue_rcv_skb.

Our current solution is to protect skb_eat in llc_ui_recvmsg
with the queue spinlock.

Signed-off-by: Radu Iliescu <riliescu@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agostmmac: fix phy naming inconsistency
Alessandro Rubini [Mon, 23 Jan 2012 23:26:48 +0000 (23:26 +0000)]
stmmac: fix phy naming inconsistency

After commit "db8857b stmmac: use an unique MDIO bus name" my
device stopped being probed because two different names were being
used in different places. This fixes the inconsistency.

Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Florian Fainelli <florian@openwrt.org>
Acked-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Tue, 24 Jan 2012 20:12:40 +0000 (12:12 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Pass information that quota is stored in system file to userspace
  ext2: protect inode changes in the SETVERSION and SETFLAGS ioctls
  jbd: Issue cache flush after checkpointing

9 years agoiwlwifi: fix PCI-E transport "inta" race
Johannes Berg [Thu, 19 Jan 2012 16:20:57 +0000 (08:20 -0800)]
iwlwifi: fix PCI-E transport "inta" race

When an interrupt comes in, we read the reason
bits and collect them into "trans_pcie->inta".
This happens with the spinlock held. However,
there's a bug resetting this variable -- that
happens after the spinlock has been released.
This means that it is possible for interrupts
to be missed if the reset happens after some
other interrupt reasons were already added to
the variable.

I found this by code inspection, looking for a
reason that we sometimes see random commands
time out. It seems possible that this causes
such behaviour, but I can't say for sure right
now since it happens extremely infrequently on
my test systems.

Cc: stable@vger.kernel.org [3.2]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
9 years agomac80211: set bss_conf.idle when vif is connected
Eliad Peller [Wed, 11 Jan 2012 11:11:50 +0000 (13:11 +0200)]
mac80211: set bss_conf.idle when vif is connected

__ieee80211_recalc_idle() iterates through the vifs,
sets bss_conf.idle = true if they are disconnected,
and increases "count" if they are not (which later
gets evaluated in order to determine whether the
device is idle).

However, the loop doesn't set bss_conf.idle = false
(along with increasing "count"), causing the device
idle state and the vif idle state to get out of sync
in some cases.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
9 years agomac80211: update oper_channel on ibss join
Eliad Peller [Tue, 10 Jan 2012 13:19:54 +0000 (15:19 +0200)]
mac80211: update oper_channel on ibss join

Commit 13c40c5 ("mac80211: Add HT operation modes for IBSS") broke
ibss operation by mistakenly removing the local->oper_channel
update (causing ibss to start on the wrong channel). fix it.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
9 years agoregulator: Fix documentation for of_node parameter of regulator_register()
Mark Brown [Tue, 24 Jan 2012 11:17:26 +0000 (11:17 +0000)]
regulator: Fix documentation for of_node parameter of regulator_register()

Commit 5bc75a886353 ("kernel-doc: fix new warning in regulator core")
added documentation for of_node to address a warning but the
documentation didn't explain what the parameter is for so would be
likely to be unhelpful for users.  Clarify that.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Tue, 24 Jan 2012 18:25:29 +0000 (10:25 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix silent outputs from docking-station jacks of Dell laptops
  ALSA: HDA: Use model=auto for Thinkpad T510
  ALSA: hda - Fix buffer-alignment regression with Nvidia HDMI
  ALSA: hda - Fix a unused variable warning
  snd-hda-intel: better Alienware M17x R3 quirk
  ALSA: hda/realtek - Remove use_jack_tbl field
  ALSA: hda/realtek - Avoid conflict of unsol-events with static quirks
  ALSA: hda/realtek - Avoid multi-ios conflicting with multi-speakers

9 years agogma500: Fix shmem mapping
Alan Cox [Tue, 24 Jan 2012 16:57:42 +0000 (16:57 +0000)]
gma500: Fix shmem mapping

GMA500 did it the old way and it's been on the TODO list to fix.
Current kernels now blow up if we use the old way so we'd better
do the work!

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomigrate_mode.h is not exported to user mode
Stephen Rothwell [Tue, 24 Jan 2012 00:41:32 +0000 (11:41 +1100)]
migrate_mode.h is not exported to user mode

so move its include into fs.h inside the __KERNEL__ protection.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoMerge tag 'pm-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Mon, 23 Jan 2012 23:11:27 +0000 (15:11 -0800)]
Merge tag 'pm-fixes-for-3.3' of git://git./linux/kernel/git/rafael/linux-pm

Power management fixes for 3.3

Two fixes for regressions introduced during the merge window, one fix for
a long-standing obscure issue in the computation of hibernate image size
and two small PM documentation fixes.

* tag 'pm-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / Sleep: Fix read_unlock_usermodehelper() call.
  PM / Hibernate: Rewrite unlock_system_sleep() to fix s2disk regression
  PM / Hibernate: Correct additional pages number calculation
  PM / Documentation: Fix minor issue in freezing_of_tasks.txt
  PM / Documentation: Fix spelling mistake in basic-pm-debugging.txt

9 years agoMerge tag 'arm-soc-imx-move' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Mon, 23 Jan 2012 22:50:30 +0000 (14:50 -0800)]
Merge tag 'arm-soc-imx-move' of git://git./linux/kernel/git/arm/arm-soc

Consolidate i.MX 5 platforms to be under the new shared i.MX 3/5/6 tree.

* tag 'arm-soc-imx-move' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM i.MX: Update defconfig
  ARM i.MX: Merge i.MX5 support into mach-imx
  ARM i.MX5: remove unnecessary includes from board files

Fix up fairly trivial conflicts due to various changes nearby in
arch/arm/{mach,plat}-imx/{Kconfig,Makefile}

Pull request had been sent to the wrong email address, but happened
before the merge window closed.  I'm merging the MX 5 consolidation,
since it apparently will help the next development window and will avoid
conflicts later as per Arnd.

9 years agoPM / Sleep: Fix read_unlock_usermodehelper() call.
Tetsuo Handa [Mon, 23 Jan 2012 20:59:08 +0000 (21:59 +0100)]
PM / Sleep: Fix read_unlock_usermodehelper() call.

Commit b298d289
 "PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()"
added read_unlock_usermodehelper() but read_unlock_usermodehelper() is called
without read_lock_usermodehelper() when kmalloc() failed.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
9 years agodsa: Add reporting of silicon revision for Marvell 88E6123/88E6161/88E6165 switches.
Chris Healy [Sun, 22 Jan 2012 21:20:54 +0000 (21:20 +0000)]
dsa: Add reporting of silicon revision for Marvell 88E6123/88E6161/88E6165 switches.

Add reporting of silicon revision during the probe function for Marvell 88E6123/88E6161/88E6165 switches.

Signed-off-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotg3: fix ipv6 header length computation
Eric Dumazet [Mon, 23 Jan 2012 01:22:09 +0000 (01:22 +0000)]
tg3: fix ipv6 header length computation

tg3_start_xmit() makes the wrong assumption for TSOV6 that skb->head
doesnt include any payload data.

if (skb_is_gso_v6(skb))
hdr_len = skb_headlen(skb) - ETH_HLEN;

This is not true anymore after commit f07d960df3 (tcp: avoid frag
allocation for small frames)

We should instead use : skb_transport_offset(skb) + tcp_hdrlen(skb)

Its also true for IPv4

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Matt Carlson <mcarlson@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoskge: add byte queue limit support
stephen hemminger [Sun, 22 Jan 2012 09:40:40 +0000 (09:40 +0000)]
skge: add byte queue limit support

This also changes the cleanup logic slightly to aggregate
completed notifications for multiple packets.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomv643xx_eth: Add Rx Discard and Rx Overrun statistics
Paulius Zaleckas [Mon, 23 Jan 2012 01:16:35 +0000 (01:16 +0000)]
mv643xx_eth: Add Rx Discard and Rx Overrun statistics

These statistics helped me a lot while searching who is losing
packets in my setup.
I added these stats to MIB group since they are very similar,
but just in other registers.
I have tested this patch on 88F6281 SoC.

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnx2x: fix compilation error with SOE in fw_dump
Yuval Mintz [Mon, 23 Jan 2012 07:31:56 +0000 (07:31 +0000)]
bnx2x: fix compilation error with SOE in fw_dump

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnx2x: handle CHIP_REVISION during init_one
Ariel Elior [Mon, 23 Jan 2012 07:31:55 +0000 (07:31 +0000)]
bnx2x: handle CHIP_REVISION during init_one

The macro `CHIP_IS_E1x' requires `bp' to be initialized.
As `bp' is not yet initialized during this phase of `bnx2x_init_dev',
it accessed uninitialized fields in the struct.

Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnx2x: allow user to change ring size in ISCSI SD mode
Dmitry Kravkov [Mon, 23 Jan 2012 07:31:54 +0000 (07:31 +0000)]
bnx2x: allow user to change ring size in ISCSI SD mode

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnx2x: fix Big-Endianess in ethtool -t
Dmitry Kravkov [Mon, 23 Jan 2012 07:31:53 +0000 (07:31 +0000)]
bnx2x: fix Big-Endianess in ethtool -t

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnx2x: fixed ethtool statistics for MF modes
Yuval Mintz [Mon, 23 Jan 2012 07:31:52 +0000 (07:31 +0000)]
bnx2x: fixed ethtool statistics for MF modes

Previosuly, in MF modes `ethtool -S' lacked some of the statistics
which appeared in non-MF modes. This has been fixed.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobnx2x: credit-leakage fixup on vlan_mac_del_all
Yuval Mintz [Mon, 23 Jan 2012 07:31:51 +0000 (07:31 +0000)]
bnx2x: credit-leakage fixup on vlan_mac_del_all

Upon insertion of elements into the execution queue, it is validated
that there are enough credits to support additional vlan-macs,
and the credits are consumed. However, when removing a pending
command in `bnx2x_vland_mac_del_all' the consumed credits are not
released, which might cause leakage and eventually the inability to
add new vlan-macs in certain scenarios.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomacvlan: fix a possible use after free
Eric Dumazet [Mon, 23 Jan 2012 05:38:59 +0000 (05:38 +0000)]
macvlan: fix a possible use after free

Commit bc416d9768 (macvlan: handle fragmented multicast frames) added a
possible use after free in macvlan_handle_frame(), since
ip_check_defrag() uses pskb_may_pull() : skb header can be reallocated.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoMerge branch 'kernel-doc' from Randy Dunlap
Linus Torvalds [Mon, 23 Jan 2012 18:08:08 +0000 (10:08 -0800)]
Merge branch 'kernel-doc' from Randy Dunlap

The usual kernel-doc fixups from Randy.  Some of them David acked as
merged in his tree, this is the random left-overs.

* kernel-doc:
  docbook: fix sched source file names in device-drivers book
  docbook: change iomap source filename in deviceiobook
  docbook: don't use serial_core.h in device-drivers book
  kernel-doc: fix kernel-doc warnings in sched
  kernel-doc: fix new warnings in cfg80211.h
  kernel-doc: fix new warning in usb.h
  kernel-doc: fix new warnings in device.h
  kernel-doc: fix new warnings in debugfs
  kernel-doc: fix new warning in regulator core
  kernel-doc: fix new warnings in pci
  kernel-doc: fix new warnings in driver-core
  kernel-doc: fix new warnings in auditsc.c
  scripts/kernel-doc: fix fatal error caused by cfg80211.h

9 years agoMerge branch 'akpm'
Linus Torvalds [Mon, 23 Jan 2012 17:27:54 +0000 (09:27 -0800)]
Merge branch 'akpm'

Quoth Andrew:
  "Random fixes.  And a simple new LED driver which I'm trying to sneak
   in while you're not looking."

Sneaking successful.

* akpm:
  score: fix off-by-one index into syscall table
  mm: fix rss count leakage during migration
  SHM_UNLOCK: fix Unevictable pages stranded after swap
  SHM_UNLOCK: fix long unpreemptible section
  kdump: define KEXEC_NOTE_BYTES arch specific for s390x
  mm/hugetlb.c: undo change to page mapcount in fault handler
  mm: memcg: update the correct soft limit tree during migration
  proc: clear_refs: do not clear reserved pages
  drivers/video/backlight/l4f00242t03.c: return proper error in l4f00242t03_probe if regulator_get() fails
  drivers/video/backlight/adp88x0_bl.c: fix bit testing logic
  kprobes: initialize before using a hlist
  ipc/mqueue: simplify reading msgqueue limit
  leds: add led driver for Bachmann's ot200
  mm: __count_immobile_pages(): make sure the node is online
  mm: fix NULL ptr dereference in __count_immobile_pages
  mm: fix warnings regarding enum migrate_mode

9 years agoMerge branch 'for-linus-3.3' of git://git.linaro.org/people/sumitsemwal/linux-dma-buf
Linus Torvalds [Mon, 23 Jan 2012 17:26:37 +0000 (09:26 -0800)]
Merge branch 'for-linus-3.3' of git://git.linaro.org/people/sumitsemwal/linux-dma-buf

* 'for-linus-3.3' of git://git.linaro.org/people/sumitsemwal/linux-dma-buf:
  MAINTAINERS: Add dma-buf sharing framework maintainer

9 years agoALSA: hda - Fix silent outputs from docking-station jacks of Dell laptops
Takashi Iwai [Mon, 23 Jan 2012 17:23:36 +0000 (18:23 +0100)]
ALSA: hda - Fix silent outputs from docking-station jacks of Dell laptops

The recent change of the power-widget handling for IDT codecs caused
the silent output from the docking-station line-out jack.  This was
partially fixed by the commit f2cbba7602383cd9cdd21f0a5d0b8bd1aad47b33
"ALSA: hda - Fix the lost power-setup of seconary pins after PM resume".
But the line-out on the docking-station is still silent when booted
with the jack plugged even by this fix.

The remainig bug is that the power-widget is set off in stac92xx_init()
because the pins in cfg->line_out_pins[] aren't checked there properly
but only hp_pins[] are checked in is_nid_hp_pin().

This patch fixes the problem by checking both HP and line-out pins
and leaving the power-map correctly.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42637

Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
9 years agoMerge git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Mon, 23 Jan 2012 16:59:49 +0000 (08:59 -0800)]
Merge git://git.samba.org/sfrench/cifs-2.6

* git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Rename *UCS* functions to *UTF16*
  [CIFS] ACL and FSCACHE support no longer EXPERIMENTAL
  [CIFS] Fix build break with multiuser patch when LANMAN disabled
  cifs: warn about impending deprecation of legacy MultiuserMount code
  cifs: fetch credentials out of keyring for non-krb5 auth multiuser mounts
  cifs: sanitize username handling
  keys: add a "logon" key type
  cifs: lower default wsize when unix extensions are not used
  cifs: better instrumentation for coalesce_t2
  cifs: integer overflow in parse_dacl()
  cifs: Fix sparse warning when calling cifs_strtoUCS
  CIFS: Add descriptions to the brlock cache functions

9 years agodocbook: fix sched source file names in device-drivers book
Randy Dunlap [Sat, 21 Jan 2012 19:03:24 +0000 (11:03 -0800)]
docbook: fix sched source file names in device-drivers book

Fix warning since kernel scheduler functions and kernel-doc
notation were moved to other files.

docproc: lnx-33-rc1/kernel/sched.c: No such file or directory

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agodocbook: change iomap source filename in deviceiobook
Randy Dunlap [Sat, 21 Jan 2012 19:03:20 +0000 (11:03 -0800)]
docbook: change iomap source filename in deviceiobook

Fix warning since kernel-doc comments were moved to another
source file.

Warning(lib/iomap.c): no structured comments found

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agodocbook: don't use serial_core.h in device-drivers book
Randy Dunlap [Sat, 21 Jan 2012 19:03:16 +0000 (11:03 -0800)]
docbook: don't use serial_core.h in device-drivers book

Fix new kernel-doc warning.  This file no longer contains
kernel-doc comments.

Warning(include/linux/serial_core.h): no structured comments found

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix kernel-doc warnings in sched
Randy Dunlap [Sat, 21 Jan 2012 19:03:13 +0000 (11:03 -0800)]
kernel-doc: fix kernel-doc warnings in sched

Fix new kernel-doc notation warnings:

Warning(include/linux/sched.h:2094): No description found for parameter 'p'
Warning(include/linux/sched.h:2094): Excess function parameter 'tsk' description in 'is_idle_task'
Warning(kernel/sched/cpupri.c:139): No description found for parameter 'newpri'
Warning(kernel/sched/cpupri.c:139): Excess function parameter 'pri' description in 'cpupri_set'
Warning(kernel/sched/cpupri.c:208): Excess function parameter 'bootmem' description in 'cpupri_init'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warnings in cfg80211.h
Randy Dunlap [Sat, 21 Jan 2012 19:03:00 +0000 (11:03 -0800)]
kernel-doc: fix new warnings in cfg80211.h

Fix new kernel-doc warnings:

Warning(include/net/cfg80211.h:1165): No description found for parameter 'channel_type'
Warning(include/net/cfg80211.h:2090): No description found for parameter 'probe_resp_offload'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: linux-wireless@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warning in usb.h
Randy Dunlap [Sat, 21 Jan 2012 19:02:56 +0000 (11:02 -0800)]
kernel-doc: fix new warning in usb.h

Fix new kernel-doc warning:

Warning(include/linux/usb.h:1251): No description found for parameter 'num_mapped_sgs'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warnings in device.h
Randy Dunlap [Sat, 21 Jan 2012 19:02:51 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in device.h

Fix new kernel-doc warnings:

Warning(include/linux/device.h:299): No description found for parameter 'name'
Warning(include/linux/device.h:299): No description found for parameter 'subsys'
Warning(include/linux/device.h:299): No description found for parameter 'node'
Warning(include/linux/device.h:299): No description found for parameter 'add_dev'
Warning(include/linux/device.h:299): No description found for parameter 'remove_dev'
Warning(include/linux/device.h:685): No description found for parameter 'id'
Warning(include/linux/device.h:1009): No description found for parameter '__driver'
Warning(include/linux/device.h:1009): No description found for parameter '__register'
Warning(include/linux/device.h:1009): No description found for parameter '__unregister'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warnings in debugfs
Randy Dunlap [Sat, 21 Jan 2012 19:02:42 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in debugfs

Fix new kernel-doc warnings:

Warning(fs/debugfs/file.c:556): No description found for parameter 'nregs'
Warning(fs/debugfs/file.c:556): Excess function parameter 'mregs' description in 'debugfs_print_regs32'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warning in regulator core
Randy Dunlap [Sat, 21 Jan 2012 19:02:38 +0000 (11:02 -0800)]
kernel-doc: fix new warning in regulator core

Fix new kernel-doc warning:

Warning(drivers/regulator/core.c:2741): No description found for parameter 'of_node'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Liam Girdwood <lrg@ti.com>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warnings in pci
Randy Dunlap [Sat, 21 Jan 2012 19:02:35 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in pci

Fix new kernel-doc warnings:

Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warnings in driver-core
Randy Dunlap [Sat, 21 Jan 2012 19:02:28 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in driver-core

Fix new kernel-doc warnings:

Warning(drivers/base/bus.c:925): No description found for parameter 'key'
Warning(drivers/base/bus.c:1241): No description found for parameter 'subsys'
Warning(drivers/base/bus.c:1241): No description found for parameter 'groups'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokernel-doc: fix new warnings in auditsc.c
Randy Dunlap [Sat, 21 Jan 2012 19:02:24 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in auditsc.c

Fix new kernel-doc warnings in auditsc.c:

Warning(kernel/auditsc.c:1875): No description found for parameter 'success'
Warning(kernel/auditsc.c:1875): No description found for parameter 'return_code'
Warning(kernel/auditsc.c:1875): Excess function parameter 'pt_regs' description in '__audit_syscall_exit'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoscripts/kernel-doc: fix fatal error caused by cfg80211.h
Randy Dunlap [Sat, 21 Jan 2012 18:31:54 +0000 (10:31 -0800)]
scripts/kernel-doc: fix fatal error caused by cfg80211.h

include/net/cfg80211.h uses __must_check in functions that
have kernel-doc notation.  This was confusing scripts/kernel-doc,
so have scripts/kernel-doc ignore "__must_check".

Error(include/net/cfg80211.h:2702): cannot understand prototype: 'struct cfg80211_bss * __must_check cfg80211_inform_bss(...)

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoscore: fix off-by-one index into syscall table
Dan Rosenberg [Fri, 20 Jan 2012 22:34:27 +0000 (14:34 -0800)]
score: fix off-by-one index into syscall table

If the provided system call number is equal to __NR_syscalls, the
current check will pass and a function pointer just after the system
call table may be called, since sys_call_table is an array with total
size __NR_syscalls.

Whether or not this is a security bug depends on what the compiler puts
immediately after the system call table.  It's likely that this won't do
anything bad because there is an additional NULL check on the syscall
entry, but if there happens to be a non-NULL value immediately after the
system call table, this may result in local privilege escalation.

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: <stable@vger.kernel.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Eugene Teo <eugeneteo@kernel.sg>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm: fix rss count leakage during migration
Konstantin Khlebnikov [Fri, 20 Jan 2012 22:34:24 +0000 (14:34 -0800)]
mm: fix rss count leakage during migration

Memory migration fills a pte with a migration entry and it doesn't
update the rss counters.  Then it replaces the migration entry with the
new page (or the old one if migration failed).  But between these two
passes this pte can be unmaped, or a task can fork a child and it will
get a copy of this migration entry.  Nobody accounts for this in the rss
counters.

This patch properly adjust rss counters for migration entries in
zap_pte_range() and copy_one_pte().  Thus we avoid extra atomic
operations on the migration fast-path.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoSHM_UNLOCK: fix Unevictable pages stranded after swap
Hugh Dickins [Fri, 20 Jan 2012 22:34:21 +0000 (14:34 -0800)]
SHM_UNLOCK: fix Unevictable pages stranded after swap

Commit cc39c6a9bbde ("mm: account skipped entries to avoid looping in
find_get_pages") correctly fixed an infinite loop; but left a problem
that find_get_pages() on shmem would return 0 (appearing to callers to
mean end of tree) when it meets a run of nr_pages swap entries.

The only uses of find_get_pages() on shmem are via pagevec_lookup(),
called from invalidate_mapping_pages(), and from shmctl SHM_UNLOCK's
scan_mapping_unevictable_pages().  The first is already commented, and
not worth worrying about; but the second can leave pages on the
Unevictable list after an unusual sequence of swapping and locking.

Fix that by using shmem_find_get_pages_and_swap() (then ignoring the
swap) instead of pagevec_lookup().

But I don't want to contaminate vmscan.c with shmem internals, nor
shmem.c with LRU locking.  So move scan_mapping_unevictable_pages() into
shmem.c, renaming it shmem_unlock_mapping(); and rename
check_move_unevictable_page() to check_move_unevictable_pages(), looping
down an array of pages, oftentimes under the same lock.

Leave out the "rotate unevictable list" block: that's a leftover from
when this was used for /proc/sys/vm/scan_unevictable_pages, whose flawed
handling involved looking at pages at tail of LRU.

Was there significance to the sequence first ClearPageUnevictable, then
test page_evictable, then SetPageUnevictable here? I think not, we're
under LRU lock, and have no barriers between those.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: <stable@vger.kernel.org> [back to 3.1 but will need respins]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoSHM_UNLOCK: fix long unpreemptible section
Hugh Dickins [Fri, 20 Jan 2012 22:34:19 +0000 (14:34 -0800)]
SHM_UNLOCK: fix long unpreemptible section

scan_mapping_unevictable_pages() is used to make SysV SHM_LOCKed pages
evictable again once the shared memory is unlocked.  It does this with
pagevec_lookup()s across the whole object (which might occupy most of
memory), and takes 300ms to unlock 7GB here.  A cond_resched() every
PAGEVEC_SIZE pages would be good.

However, KOSAKI-san points out that this is called under shmem.c's
info->lock, and it's also under shm.c's shm_lock(), both spinlocks.
There is no strong reason for that: we need to take these pages off the
unevictable list soonish, but those locks are not required for it.

So move the call to scan_mapping_unevictable_pages() from shmem.c's
unlock handling up to shm.c's unlock handling.  Remove the recently
added barrier, not needed now we have spin_unlock() before the scan.

Use get_file(), with subsequent fput(), to make sure we have a reference
to mapping throughout scan_mapping_unevictable_pages(): that's something
that was previously guaranteed by the shm_lock().

Remove shmctl's lru_add_drain_all(): we don't fault in pages at SHM_LOCK
time, and we lazily discover them to be Unevictable later, so it serves
no purpose for SHM_LOCK; and serves no purpose for SHM_UNLOCK, since
pages still on pagevec are not marked Unevictable.

The original code avoided redundant rescans by checking VM_LOCKED flag
at its level: now avoid them by checking shp's SHM_LOCKED.

The original code called scan_mapping_unevictable_pages() on a locked
area at shm_destroy() time: perhaps we once had accounting cross-checks
which required that, but not now, so skip the overhead and just let
inode eviction deal with them.

Put check_move_unevictable_page() and scan_mapping_unevictable_pages()
under CONFIG_SHMEM (with stub for the TINY case when ramfs is used),
more as comment than to save space; comment them used for SHM_UNLOCK.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokdump: define KEXEC_NOTE_BYTES arch specific for s390x
Michael Holzheu [Fri, 20 Jan 2012 22:34:16 +0000 (14:34 -0800)]
kdump: define KEXEC_NOTE_BYTES arch specific for s390x

kdump only allocates memory for the prstatus ELF note.  For s390x,
besides of prstatus multiple ELF notes for various different register
types are stored.  Therefore the currently allocated memory is not
sufficient.  With this patch the KEXEC_NOTE_BYTES macro can be defined
by architecture code and for s390x it is set to the correct size now.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm/hugetlb.c: undo change to page mapcount in fault handler
Hillf Danton [Fri, 20 Jan 2012 22:34:13 +0000 (14:34 -0800)]
mm/hugetlb.c: undo change to page mapcount in fault handler

Page mapcount should be updated only if we are sure that the page ends
up in the page table otherwise we would leak if we couldn't COW due to
reservations or if idx is out of bounds.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm: memcg: update the correct soft limit tree during migration
Johannes Weiner [Fri, 20 Jan 2012 22:34:12 +0000 (14:34 -0800)]
mm: memcg: update the correct soft limit tree during migration

end_migration() passes the old page instead of the new page to commit
the charge.  This page descriptor is not used for committing itself,
though, since we also pass the (correct) page_cgroup descriptor.  But
it's used to find the soft limit tree through the page's zone, so the
soft limit tree of the old page's zone is updated instead of that of the
new page's, which might get slightly out of date until the next charge
reaches the ratelimit point.

This glitch has been present since 5564e88 ("memcg: condense
page_cgroup-to-page lookup points").

This fixes a bug that I introduced in 2.6.38.  It's benign enough (to my
knowledge) that we probably don't want this for stable.

Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoproc: clear_refs: do not clear reserved pages
Will Deacon [Fri, 20 Jan 2012 22:34:09 +0000 (14:34 -0800)]
proc: clear_refs: do not clear reserved pages

/proc/pid/clear_refs is used to clear the Referenced and YOUNG bits for
pages and corresponding page table entries of the task with PID pid, which
includes any special mappings inserted into the page tables in order to
provide things like vDSOs and user helper functions.

On ARM this causes a problem because the vectors page is mapped as a
global mapping and since ec706dab ("ARM: add a vma entry for the user
accessible vector page"), a VMA is also inserted into each task for this
page to aid unwinding through signals and syscall restarts.  Since the
vectors page is required for handling faults, clearing the YOUNG bit (and
subsequently writing a faulting pte) means that we lose the vectors page
*globally* and cannot fault it back in.  This results in a system deadlock
on the next exception.

To see this problem in action, just run:

$ echo 1 > /proc/self/clear_refs

on an ARM platform (as any user) and watch your system hang.  I think this
has been the case since 2.6.37

This patch avoids clearing the aforementioned bits for reserved pages,
therefore leaving the vectors page intact on ARM.  Since reserved pages
are not candidates for swap, this change should not have any impact on the
usefulness of clear_refs.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Reported-by: Moussa Ba <moussaba@micron.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: <stable@vger.kernel.org> [2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agodrivers/video/backlight/l4f00242t03.c: return proper error in l4f00242t03_probe if...
Axel Lin [Fri, 20 Jan 2012 22:34:06 +0000 (14:34 -0800)]
drivers/video/backlight/l4f00242t03.c: return proper error in l4f00242t03_probe if regulator_get() fails

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Alberto Panizzo <alberto@amarulasolutions.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agodrivers/video/backlight/adp88x0_bl.c: fix bit testing logic
Axel Lin [Fri, 20 Jan 2012 22:34:05 +0000 (14:34 -0800)]
drivers/video/backlight/adp88x0_bl.c: fix bit testing logic

We need to write new value if the bit mask fields of new value is not
equal to old value.  It does not make sense to write new value only when
all the bit_mask bits are zero.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Michael Hennerich <michael.hennerich@analog.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agokprobes: initialize before using a hlist
Ananth N Mavinakayanahalli [Fri, 20 Jan 2012 22:34:04 +0000 (14:34 -0800)]
kprobes: initialize before using a hlist

Commit ef53d9c5e ("kprobes: improve kretprobe scalability with hashed
locking") introduced a bug where we can potentially leak
kretprobe_instances since we initialize a hlist head after having used
it.

Initialize the hlist head before using it.

Reported by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Srinivasa D S <srinivasa@in.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoipc/mqueue: simplify reading msgqueue limit
Davidlohr Bueso [Fri, 20 Jan 2012 22:34:01 +0000 (14:34 -0800)]
ipc/mqueue: simplify reading msgqueue limit

Because the current task is being used to get the limit, we can simply
use rlimit() instead of task_rlimit().

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoleds: add led driver for Bachmann's ot200
Sebastian Andrzej Siewior [Fri, 20 Jan 2012 22:33:59 +0000 (14:33 -0800)]
leds: add led driver for Bachmann's ot200

Add support for leds on Bachmann's ot200 visualisation device.  The
device has three leds on the back panel (led_err, led_init and led_run)
and can handle up to seven leds on the front panel.

The driver was written by Linutronix on behalf of Bachmann electronic
GmbH.  It incorporates feedback from Lars-Peter Clausen

[akpm@linux-foundation.org: add dependency on HAS_IOMEM]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm: __count_immobile_pages(): make sure the node is online
Michal Hocko [Fri, 20 Jan 2012 22:33:58 +0000 (14:33 -0800)]
mm: __count_immobile_pages(): make sure the node is online

page_zone() requires an online node otherwise we are accessing NULL
NODE_DATA.  This is not an issue at the moment because node_zones are
located at the structure beginning but this might change in the future
so better be careful about that.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm: fix NULL ptr dereference in __count_immobile_pages
Michal Hocko [Fri, 20 Jan 2012 22:33:55 +0000 (14:33 -0800)]
mm: fix NULL ptr dereference in __count_immobile_pages

Fix the following NULL ptr dereference caused by

  cat /sys/devices/system/memory/memory0/removable

Pid: 13979, comm: sed Not tainted 3.0.13-0.5-default #1 IBM BladeCenter LS21 -[7971PAM]-/Server Blade
RIP: __count_immobile_pages+0x4/0x100
Process sed (pid: 13979, threadinfo ffff880221c36000, task ffff88022e788480)
Call Trace:
  is_pageblock_removable_nolock+0x34/0x40
  is_mem_section_removable+0x74/0xf0
  show_mem_removable+0x41/0x70
  sysfs_read_file+0xfe/0x1c0
  vfs_read+0xc7/0x130
  sys_read+0x53/0xa0
  system_call_fastpath+0x16/0x1b

We are crashing because we are trying to dereference NULL zone which
came from pfn=0 (struct page ffffea0000000000). According to the boot
log this page is marked reserved:
e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)

and early_node_map confirms that:
early_node_map[3] active PFN ranges
    1: 0x00000010 -> 0x0000009c
    1: 0x00000100 -> 0x000bffa3
    1: 0x00100000 -> 0x00240000

The problem is that memory_present works in PAGE_SECTION_MASK aligned
blocks so the reserved range sneaks into the the section as well.  This
also means that free_area_init_node will not take care of those reserved
pages and they stay uninitialized.

When we try to read the removable status we walk through all available
sections and hope that the zone is valid for all pages in the section.
But this is not true in this case as the zone and nid are not initialized.

We have only one node in this particular case and it is marked as node=1
(rather than 0) and that made the problem visible because page_to_nid will
return 0 and there are no zones on the node.

Let's check that the zone is valid and that the given pfn falls into its
boundaries and mark the section not removable.  This might cause some
false positives, probably, but we do not have any sane way to find out
whether the page is reserved by the platform or it is just not used for
whatever other reasons.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agomm: fix warnings regarding enum migrate_mode
Andrew Morton [Fri, 20 Jan 2012 22:33:53 +0000 (14:33 -0800)]
mm: fix warnings regarding enum migrate_mode

sparc64 allmodconfig:

In file included from include/linux/compat.h:15,
                 from /usr/src/25/arch/sparc/include/asm/siginfo.h:19,
                 from include/linux/signal.h:5,
                 from include/linux/sched.h:73,
                 from arch/sparc/kernel/asm-offsets.c:13:
include/linux/fs.h:618: warning: parameter has incomplete type

It seems that my sparc64 compiler (gcc-3.4.5) doesn't like the forward
declaration of enums.

Fix this by moving the "enum migrate_mode" definition into its own header
file.

Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andy Isaacson <adi@hexapodia.org>
Cc: Nai Xia <nai.xia@gmail.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9 years agoALSA: HDA: Use model=auto for Thinkpad T510
David Henningsson [Mon, 23 Jan 2012 15:39:55 +0000 (16:39 +0100)]
ALSA: HDA: Use model=auto for Thinkpad T510

The user reports that model=auto works fine for him. Using
model=auto bring in new features such as jack detection notification
to userspace.

Alsa info is available at http://paste.ubuntu.com/805351/

Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
9 years agoALSA: hda - Fix buffer-alignment regression with Nvidia HDMI
Takashi Iwai [Mon, 23 Jan 2012 16:10:24 +0000 (17:10 +0100)]
ALSA: hda - Fix buffer-alignment regression with Nvidia HDMI

The commit 2ae66c26550cd94b0e2606a9275eb0ab7070ad0e
    ALSA: hda: option to enable arbitrary buffer/period sizes
introduced a regression on machines with Intel controller and Nvidia
HDMI.  The reason is that the driver modifies the global variable
align_buffer_size when an Intel controller is found, and the Nvidia
HDMI controller is probed after Intel although Nvidia chips require
the aligned buffers.

This patch fixes the problem by moving the flag into the local struct
so that it's not affected by other controllers.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42567

Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
9 years agoethtool: allow ETHTOOL_GSSET_INFO for users
Michał Mirosław [Sun, 22 Jan 2012 00:20:40 +0000 (00:20 +0000)]
ethtool: allow ETHTOOL_GSSET_INFO for users

Allow ETHTOOL_GSSET_INFO ethtool ioctl() for unprivileged users.
ETHTOOL_GSTRINGS is already allowed, but is unusable without this one.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobluetooth: hci: Fix type of "enable_hs" to bool.
David S. Miller [Sun, 22 Jan 2012 19:45:14 +0000 (14:45 -0500)]
bluetooth: hci: Fix type of "enable_hs" to bool.

Fixes:

net/bluetooth/hci_core.c: In function ‘__check_enable_hs’:
net/bluetooth/hci_core.c:2587:1: warning: return from incompatible pointer type [enabled by default]

Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: introduce res_counter_charge_nofail() for socket allocations
Glauber Costa [Fri, 20 Jan 2012 04:57:16 +0000 (04:57 +0000)]
net: introduce res_counter_charge_nofail() for socket allocations

There is a case in __sk_mem_schedule(), where an allocation
is beyond the maximum, but yet we are allowed to proceed.
It happens under the following condition:

sk->sk_wmem_queued + size >= sk->sk_sndbuf

The network code won't revert the allocation in this case,
meaning that at some point later it'll try to do it. Since
this is never communicated to the underlying res_counter
code, there is an inbalance in res_counter uncharge operation.

I see two ways of fixing this:

1) storing the information about those allocations somewhere
   in memcg, and then deducting from that first, before
   we start draining the res_counter,
2) providing a slightly different allocation function for
   the res_counter, that matches the original behavior of
   the network code more closely.

I decided to go for #2 here, believing it to be more elegant,
since #1 would require us to do basically that, but in a more
obscure way.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
CC: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Laurent Chavey <chavey@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agocgroup: make sure memcg margin is 0 when over limit
Glauber Costa [Fri, 20 Jan 2012 04:57:15 +0000 (04:57 +0000)]
cgroup: make sure memcg margin is 0 when over limit

For the memcg sock code, we'll need to register allocations
that are temporarily over limit. Let's make sure that margin
is 0 in this case.

I am keeping this as a separate patch, so that if any weirdness
interaction appears in the future, we can now exactly what caused
it.

Suggested by Johannes Weiner

Signed-off-by: Glauber Costa <glommer@parallels.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Johannes Weiner <hannes@cmpxchg.org>
CC: Michal Hocko <mhocko@suse.cz>
CC: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonet: fix socket memcg build with !CONFIG_NET
Glauber Costa [Fri, 20 Jan 2012 04:57:14 +0000 (04:57 +0000)]
net: fix socket memcg build with !CONFIG_NET

There is still a build bug with the sock memcg code, that triggers
with !CONFIG_NET, that survived my series of randconfig builds.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
CC: Hiroyouki Kamezawa <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agokernel-doc: fix new warning in net/sock.h
Randy Dunlap [Sat, 21 Jan 2012 09:03:10 +0000 (09:03 +0000)]
kernel-doc: fix new warning in net/sock.h

Fix new kernel-doc warning:

Warning(include/net/sock.h:372): No description found for parameter 'sk_cgrp_prioidx'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agokernel-doc: fix new warning in net/phy/mdio_bus.c
Randy Dunlap [Sat, 21 Jan 2012 09:03:04 +0000 (09:03 +0000)]
kernel-doc: fix new warning in net/phy/mdio_bus.c

Fix new kernel-doc warning:

Warning(drivers/net/phy/mdio_bus.c:49): No description found for parameter 'size'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: md5: using remote adress for md5 lookup in rst packet
shawnlu [Fri, 20 Jan 2012 12:22:04 +0000 (12:22 +0000)]
tcp: md5: using remote adress for md5 lookup in rst packet

md5 key is added in socket through remote address.
remote address should be used in finding md5 key when
sending out reset packet.

Signed-off-by: shawnlu <shawn.lu@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agobe2net: create RSS rings even in multi-channel configs
Sathya Perla [Thu, 19 Jan 2012 20:34:04 +0000 (20:34 +0000)]
be2net: create RSS rings even in multi-channel configs

Currently RSS rings are not created in a multi-channel config.
RSS rings can be created on one (out of four) interfaces per port in a
multi-channel config. Doing this insulates the driver from a FW bug wherin
multi-channel config is wrongly reported even when not enabled. This also
helps performance in a multi-channel config, as one interface per port gets
RSS rings.

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agopktgen: Fix unsigned function that is returning negative vals
Paul Gortmaker [Thu, 19 Jan 2012 15:40:06 +0000 (15:40 +0000)]
pktgen: Fix unsigned function that is returning negative vals

Every call to num_args() immediately checks the return value for
less than zero, as it will return -EFAULT for a failed get_user()
call.  So it makes no sense for the function to be declared as an
unsigned long.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: detect loss above high_seq in recovery
Yuchung Cheng [Thu, 19 Jan 2012 14:42:21 +0000 (14:42 +0000)]
tcp: detect loss above high_seq in recovery

Correctly implement a loss detection heuristic: New sequences (above
high_seq) sent during the fast recovery are deemed lost when higher
sequences are SACKed.

Current code does not catch these losses, because tcp_mark_head_lost()
does not check packets beyond high_seq. The fix is straight-forward by
checking packets until the highest sacked packet. In addition, all the
FLAG_DATA_LOST logic are in-effective and redundant and can be removed.

Update the loss heuristic comments. The algorithm above is documented
as heuristic B, but it is redundant too because heuristic A already
covers B.

Note that this change only marks some forward-retransmitted packets LOST.
It does NOT forbid TCP performing further CWR on new losses. A potential
follow-up patch under preparation is to perform another CWR on "new"
losses such as
1) sequence above high_seq is lost (by resetting high_seq to snd_nxt)
2) retransmission is lost.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoskge: check for PCI dma mapping errors
stephen hemminger [Thu, 19 Jan 2012 14:37:18 +0000 (14:37 +0000)]
skge: check for PCI dma mapping errors

Driver should check for mapping errors.
Machines with limited DMA maps may return an error when a PCI map is
requested (not an issue on standard x86).

Also use upper/lower 32 bits macros for clarity.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoskge: don't assert carrier until link is up
stephen hemminger [Thu, 19 Jan 2012 14:35:25 +0000 (14:35 +0000)]
skge: don't assert carrier until link is up

Skge device would assert carrier (link up) as soon as network device open
was called, rather than waiting until PHY has detected link.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agonetem: Fix off-by-one bug in reordering
Vijay Subramanian [Thu, 19 Jan 2012 10:20:59 +0000 (10:20 +0000)]
netem: Fix off-by-one bug in reordering

With netem reordering, a gap of N is supposed to reorder every Nth packet with
given reorder probability.  However, the code currently skips N packets and
reorders every (N+1)th packet.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlx4_core: map async events to arbitrary slave eqs
Marcel Apfelbaum [Thu, 19 Jan 2012 09:45:46 +0000 (09:45 +0000)]
mlx4_core: map async events to arbitrary slave eqs

Slave async events were mapped to single eq. This patch fixes this issue, so
the slaves can map the async events to any eq.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlx4_core: Fix mtt profile issue
Marcel Apfelbaum [Thu, 19 Jan 2012 09:45:31 +0000 (09:45 +0000)]
mlx4_core: Fix mtt profile issue

Num mtts from profile is really the number of mtt segments.
Thus, in make profile, to get the proper number of MTT entries,
must multiply num_mtts by mtts per segment.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlx4_core: removed function index from vf.
Marcel Apfelbaum [Thu, 19 Jan 2012 09:45:19 +0000 (09:45 +0000)]
mlx4_core: removed function index from vf.

The Virtual Functions should not be aware their function number.

Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlx4_en: eth statistics modification
Eugenia Emantayev [Thu, 19 Jan 2012 09:45:05 +0000 (09:45 +0000)]
mlx4_en: eth statistics modification

In native mode display all available staticstics.
In SRIOV mode on VF display only SW counters statistics,
in SRIOV mode on hypervisor display SW counters and errors (got from FW)
statistics.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlx4: VF is not allowed to perform dump stats
Eugenia Emantayev [Thu, 19 Jan 2012 09:44:37 +0000 (09:44 +0000)]
mlx4: VF is not allowed to perform dump stats

In multifunction mode - DUMP_STATS command is not executed
for VFs.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agomlx4_en: clear all eth statistics when port goes up
Eugenia Emantayev [Thu, 19 Jan 2012 09:42:37 +0000 (09:42 +0000)]
mlx4_en: clear all eth statistics when port goes up

Bug fix: Not all stats fields were cleared.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Reviewed-by: Yevgeny Petriln <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: fix undo after RTO for CUBIC
Neal Cardwell [Wed, 18 Jan 2012 17:47:59 +0000 (17:47 +0000)]
tcp: fix undo after RTO for CUBIC

This patch fixes CUBIC so that cwnd reductions made during RTOs can be
undone (just as they already can be undone when using the default/Reno
behavior).

When undoing cwnd reductions, BIC-derived congestion control modules
were restoring the cwnd from last_max_cwnd. There were two problems
with using last_max_cwnd to restore a cwnd during undo:

(a) last_max_cwnd was set to 0 on state transitions into TCP_CA_Loss
(by calling the module's reset() functions), so cwnd reductions from
RTOs could not be undone.

(b) when fast_covergence is enabled (which it is by default)
last_max_cwnd does not actually hold the value of snd_cwnd before the
loss; instead, it holds a scaled-down version of snd_cwnd.

This patch makes the following changes:

(1) upon undo, revert snd_cwnd to ca->loss_cwnd, which is already, as
the existing comment notes, the "congestion window at last loss"

(2) stop forgetting ca->loss_cwnd on TCP_CA_Loss events

(3) use ca->last_max_cwnd to check if we're in slow start

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Sangtae Ha <sangtae.ha@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agotcp: fix undo after RTO for BIC
Neal Cardwell [Wed, 18 Jan 2012 17:47:58 +0000 (17:47 +0000)]
tcp: fix undo after RTO for BIC

This patch fixes BIC so that cwnd reductions made during RTOs can be
undone (just as they already can be undone when using the default/Reno
behavior).

When undoing cwnd reductions, BIC-derived congestion control modules
were restoring the cwnd from last_max_cwnd. There were two problems
with using last_max_cwnd to restore a cwnd during undo:

(a) last_max_cwnd was set to 0 on state transitions into TCP_CA_Loss
(by calling the module's reset() functions), so cwnd reductions from
RTOs could not be undone.

(b) when fast_covergence is enabled (which it is by default)
last_max_cwnd does not actually hold the value of snd_cwnd before the
loss; instead, it holds a scaled-down version of snd_cwnd.

This patch makes the following changes:

(1) upon undo, revert snd_cwnd to ca->loss_cwnd, which is already, as
the existing comment notes, the "congestion window at last loss"

(2) stop forgetting ca->loss_cwnd on TCP_CA_Loss events

(3) use ca->last_max_cwnd to check if we're in slow start

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9 years agoenic: fix compile when CONFIG_PCI_IOV is not enabled
Roopa Prabhu [Thu, 19 Jan 2012 22:25:36 +0000 (22:25 +0000)]
enic: fix compile when CONFIG_PCI_IOV is not enabled

reverting back change that access enic->num_vfs outside
CONFIG_PCI_IOV

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>