linux.git - Linux Kernel

Age	Commit message (Collapse)	Author
2015-12-04	sched/cputime: Fix invalid gtime in proc	Hiroshi Shimamoto
	/proc/stats shows invalid gtime when the thread is running in guest. When vtime accounting is not enabled, we cannot get a valid delta. The delta is calculated with now - tsk->vtime_snap, but tsk->vtime_snap is only updated when vtime accounting is runtime enabled. This patch makes task_gtime() just return gtime without computing the buggy non-existing tickless delta when vtime accounting is not enabled. Use context_tracking_is_enabled() to check if vtime is accounting on some cpu, in which case only we need to check the tickless delta. This way we fix the gtime value regression on machines not running nohz full. The kernel config contains CONFIG_VIRT_CPU_ACCOUNTING_GEN=y and CONFIG_NO_HZ_FULL_ALL=n and boot without nohz_full. I ran and stop a busy loop in VM and see the gtime in host. Dump the 43rd field which shows the gtime in every second: # while :; do awk '{print $3" "$43}' /proc/3955/task/4014/stat; sleep 1; done S 4348 R 7064566 R 7064766 R 7064967 R 7065168 S 4759 S 4759 During running busy loop, it returns large value. After applying this patch, we can see right gtime. # while :; do awk '{print $3" "$43}' /proc/10913/task/10956/stat; sleep 1; done S 5338 R 5365 R 5465 R 5566 R 5666 S 5726 S 5726 Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1447948054-28668-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-04	sched/core: Clear the root_domain cpumasks in init_rootdomain()	Xunlei Pang
	root_domain::rto_mask allocated through alloc_cpumask_var() contains garbage data, this may cause problems. For instance, When doing pull_rt_task(), it may do useless iterations if rto_mask retains some extra garbage bits. Worse still, this violates the isolated domain rule for clustered scheduling using cpuset, because the tasks(with all the cpus allowed) belongs to one root domain can be pulled away into another root domain. The patch cleans the garbage by using zalloc_cpumask_var() instead of alloc_cpumask_var() for root_domain::rto_mask allocation, thereby addressing the issues. Do the same thing for root_domain's other cpumask memembers: dlo_mask, span, and online. Signed-off-by: Xunlei Pang <xlpang@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1449057179-29321-1-git-send-email-xlpang@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-04	sched/core: Remove false-positive warning from wake_up_process()	Sasha Levin
	Because wakeups can (fundamentally) be late, a task might not be in the expected state. Therefore testing against a task's state is racy, and can yield false positives. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: oleg@redhat.com Fixes: 9067ac85d533 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task") Link: http://lkml.kernel.org/r/1448933660-23082-1-git-send-email-sasha.levin@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-04	sched/wait: Fix signal handling in bit wait helpers	Peter Zijlstra
	Vladimir reported getting RCU stall warnings and bisected it back to commit: 743162013d40 ("sched: Remove proliferation of wait_on_bit() action functions") That commit inadvertently reversed the calls to schedule() and signal_pending(), thereby not handling the case where the signal receives while we sleep. Reported-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: mark.rutland@arm.com Cc: neilb@suse.de Cc: oleg@redhat.com Fixes: 743162013d40 ("sched: Remove proliferation of wait_on_bit() action functions") Fixes: cbbce8220949 ("SCHED: add some "wait..on_bit...timeout()" interfaces.") Link: http://lkml.kernel.org/r/20151201130404.GL3816@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-04	x86/mm: Fix regression with huge pages on PAE	Kirill A. Shutemov
	Recent PAT patchset has caused issue on 32-bit PAE machines: page:eea45000 count:0 mapcount:-128 mapping: (null) index:0x0 flags: 0x40000000() page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) < 0) ------------[ cut here ]------------ kernel BUG at /home/build/linux-boris/mm/huge_memory.c:1485! invalid opcode: 0000 [#1] SMP [...] Call Trace: unmap_single_vma ? __wake_up unmap_vmas unmap_region do_munmap vm_munmap SyS_munmap do_fast_syscall_32 ? __do_page_fault sysenter_past_esp Code: ... EIP: [<c11bde80>] zap_huge_pmd+0x240/0x260 SS:ESP 0068:f6459d98 The problem is in pmd_pfn_mask() and pmd_flags_mask(). These helpers use PMD_PAGE_MASK to calculate resulting mask. PMD_PAGE_MASK is 'unsigned long', not 'unsigned long long' as phys_addr_t is on 32-bit PAE (ARCH_PHYS_ADDR_T_64BIT). As a result, the upper bits of resulting mask get truncated. pud_pfn_mask() and pud_flags_mask() aren't problematic since we don't have PUD page table level on 32-bit systems, but it's reasonable to keep them consistent with PMD counterpart. Introduce PHYSICAL_PMD_PAGE_MASK and PHYSICAL_PUD_PAGE_MASK in addition to existing PHYSICAL_PAGE_MASK and reworks helpers to use them. Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> [ Fix -Woverflow warnings from the realmode code. ] Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jürgen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: elliott@hpe.com Cc: konrad.wilk@oracle.com Cc: linux-mm <linux-mm@kvack.org> Fixes: f70abb0fc3da ("x86/asm: Fix pud/pmd interfaces to handle large PAT bit") Link: http://lkml.kernel.org/r/1448878233-11390-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-04	drm/nouveau: Fix pre-nv50 pageflip events (v4)	Daniel Vetter
	Apparently pre-nv50 pageflip events happen before the actual vblank period. Therefore that functionality got semi-disabled in commit af4870e406126b7ac0ae7c7ce5751f25ebe60f28 Author: Mario Kleiner <mario.kleiner.de@gmail.com> Date: Tue May 13 00:42:08 2014 +0200 drm/nouveau/kms/nv04-nv40: fix pageflip events via special case. Unfortunately that hack got uprooted in commit cc1ef118fc099295ae6aabbacc8af94d8d8885eb Author: Thierry Reding <treding@nvidia.com> Date: Wed Aug 12 17:00:31 2015 +0200 drm/irq: Make pipe unsigned and name consistent Triggering a warning when trying to sample the vblank timestamp for a non-existing pipe. There's a few ways to fix this: - Open-code the old behaviour, which just enshrines this slight breakage of the userspace ABI. - Revert Mario's commit and again inflict broken timestamps, again not pretty. - Fix this for real by delaying the pageflip TS until the next vblank interrupt, thereby making it accurate. This patch implements the third option. Since having a page flip interrupt that happens when the pageflip gets armed and not when it completes in the next vblank seems to be fairly common (older i915 hw works very similarly) create a new helper to arm vblank events for such drivers. v2 (Mario Kleiner): - Fix function prototypes in drmP.h - Add missing vblank_put() for pageflip completion without pageflip event. - Initialize sequence number for queued pageflip event to avoid trouble in drm_handle_vblank_events(). - Remove dead code and spelling fix. v3 (Mario Kleiner): - Add a signed-off-by and cc stable tag per Ilja's advice. v4 (Thierry Reding): - Fix kerneldoc typo, discovered by Michel Dänzer - Rearrange tags and changelog Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=106431 Cc: Thierry Reding <treding@nvidia.com> Cc: Mario Kleiner <mario.kleiner.de@gmail.com> Acked-by: Ben Skeggs <bskeggs@redhat.com> Cc: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: stable@vger.kernel.org # v4.3 Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-12-04	drm: Fix an unwanted master inheritance v2	Thomas Hellstrom
	A client calling drmSetMaster() using a file descriptor that was opened when another client was master would inherit the latter client's master object and all its authenticated clients. This is unwanted behaviour, and when this happens, instead allocate a brand new master object for the client calling drmSetMaster(). Fixes a BUG() throw in vmw_master_set(). Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-12-04	Merge tag 'imx-drm-fixes-2015-12-01' of ↵	Dave Airlie
	git://git.pengutronix.de/git/pza/linux into drm-fixes imx-drm crtc, plane, parallel panel, and TV encoder fixes - Use drm_crtc_send_vblank_event to fix per crtc vblank handling - Move the crtc device of_node assignment out of the ipuv3-crtc driver into ipu-common code, where the devices are created. - Fix parallel display support with simple-panels - Remove some unused fields and superfluous checks - Switch to universal planes and add error handling for primary plane creation - Fix module autoload for TV encoder driver * tag 'imx-drm-fixes-2015-12-01' of git://git.pengutronix.de/git/pza/linux: drm: imx: imx-tve: Fix module autoload for OF platform driver drm: imx: convert to drm_crtc_send_vblank_event() GPU-DRM-IMX: Delete an unnecessary check before drm_fbdev_cma_restore_mode() drm/imx: Remove of_node assignment from ipuv3-crtc driver probe gpu: ipu-v3: Assign of_node of child platform devices to corresponding ports gpu: ipu-v3: Remove reg_offset field gpu: ipu-v3: drop unused dmfc field from client platform data drm/imx: parallel-display: allow to determine bus format from the connected panel drm/imx: ipuv3-crtc: Return error if ipu_plane_init() fails for primary plane drm/imx: switch to universal planes
2015-12-04	Merge tag 'drm-intel-fixes-2015-12-03' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm-intel into drm-fixes Another batch of drm/i915 fixes for v4.4, on top of the ones from earlier this week. One timeout handling regression fix from Chris, and backport of five patches from our -next to fix a power management related HDMI hotplug regression. * tag 'drm-intel-fixes-2015-12-03' of git://anongit.freedesktop.org/drm-intel: drm/i915: take a power domain reference while checking the HDMI live status drm/i915: add MISSING_CASE to a few port/aux power domain helpers drm/i915/ddi: fix intel_display_port_aux_power_domain() after HDMI detect drm/i915: Introduce a gmbus power domain drm/i915: Clean up AUX power domain handling drm/i915: Check the timeout passed to i915_wait_request
2015-12-04	PM / Domains: Fix bad of_node_put() in failure paths of genpd_dev_pm_attach()	Eric Anholt
	It looks like these meant to be unreffing the of_parse_phandle_with_args() node, since the error paths above it don't do of_node_put. That function returns a new ref in pd_args.np, though, not a new ref on dev->of_node. Also, it would have leaked the ref in the success case. Fixes "ERROR: Bad of_node_put()" on bcm2835 in the -EPROBE_DEFER case. Fixes: aa42240ab254 (PM / Domains: Add generic OF-based PM domain look-up) Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Kevin Hilman <khilman@linaro.org> Cc: 3.18+ <stable@vger.kernel.org> # 3.18+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-03	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds
	Pull networking fixes from David Miller: "A lot of Thanksgiving turkey leftovers accumulated, here goes: 1) Fix bluetooth l2cap_chan object leak, from Johan Hedberg. 2) IDs for some new iwlwifi chips, from Oren Givon. 3) Fix rtlwifi lockups on boot, from Larry Finger. 4) Fix memory leak in fm10k, from Stephen Hemminger. 5) We have a route leak in the ipv6 tunnel infrastructure, fix from Paolo Abeni. 6) Fix buffer pointer handling in arm64 bpf JIT,f rom Zi Shen Lim. 7) Wrong lockdep annotations in tcp md5 support, fix from Eric Dumazet. 8) Work around some middle boxes which prevent proper handling of TCP Fast Open, from Yuchung Cheng. 9) TCP repair can do huge kmalloc() requests, build paged SKBs instead. From Eric Dumazet. 10) Fix msg_controllen overflow in scm_detach_fds, from Daniel Borkmann. 11) Fix device leaks on ipmr table destruction in ipv4 and ipv6, from Nikolay Aleksandrov. 12) Fix use after free in epoll with AF_UNIX sockets, from Rainer Weikusat. 13) Fix double free in VRF code, from Nikolay Aleksandrov. 14) Fix skb leaks on socket receive queue in tipc, from Ying Xue. 15) Fix ifup/ifdown crach in xgene driver, from Iyappan Subramanian. 16) Fix clearing of persistent array maps in bpf, from Daniel Borkmann. 17) In TCP, for the cross-SYN case, we don't initialize tp->copied_seq early enough. From Eric Dumazet. 18) Fix out of bounds accesses in bpf array implementation when updating elements, from Daniel Borkmann. 19) Fill gaps in RCU protection of np->opt in ipv6 stack, from Eric Dumazet. 20) When dumping proxy neigh entries, we have to accomodate NULL device pointers properly, from Konstantin Khlebnikov. 21) SCTP doesn't release all ipv6 socket resources properly, fix from Eric Dumazet. 22) Prevent underflows of sch->q.qlen for multiqueue packet schedulers, also from Eric Dumazet. 23) Fix MAC and unicast list handling in bnxt_en driver, from Jeffrey Huang and Michael Chan. 24) Don't actively scan radar channels, from Antonio Quartulli" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (110 commits) net: phy: reset only targeted phy bnxt_en: Setup uc_list mac filters after resetting the chip. bnxt_en: enforce proper storing of MAC address bnxt_en: Fixed incorrect implementation of ndo_set_mac_address net: lpc_eth: remove irq > NR_IRQS check from probe() net_sched: fix qdisc_tree_decrease_qlen() races openvswitch: fix hangup on vxlan/gre/geneve device deletion ipv4: igmp: Allow removing groups from a removed interface ipv6: sctp: implement sctp_v6_destroy_sock() arm64: bpf: add 'store immediate' instruction ipv6: kill sk_dst_lock ipv6: sctp: add rcu protection around np->opt net/neighbour: fix crash at dumping device-agnostic proxy entries sctp: use GFP_USER for user-controlled kmalloc sctp: convert sack_needed and sack_generation to bits ipv6: add complete rcu protection around np->opt bpf: fix allocation warnings in bpf maps and integer overflow mvebu: dts: enable IP checksum with jumbo frames for Armada 38x on Port0 net: mvneta: enable setting custom TX IP checksum limit net: mvneta: fix error path for building skb ...
2015-12-03	Merge branch 'for-linus' of git://git.kernel.dk/linux-block	Linus Torvalds
	Pull block fixes from Jens Axboe: "A collection of fixes from this series. The most important here is a regression fix for an issue that some folks would hit in blk-merge.c, and the NVMe queue depth limit for the screwed up Apple "nvme" controller. In more detail, this pull request contains: - a set of fixes for null_blk, including a fix for a few corner cases where we could hang the device. From Arianna and Paolo. - lightnvm: - A build improvement from Keith. - Update the qemu pci id detection from Matias. - Error handling fixes for leaks and other little fixes from Sudip and Wenwei. - fix from Eric where BLKRRPART would not return EBUSY for whole device mounts, only when partitions were mounted. - fix from Jan Kara, where EOF O_DIRECT reads would return negatively. - remove check for rq_mergeable() when checking limits for cloned requests. The check doesn't make any sense. It's assuming that since NOMERGE is set on the request that we don't have to recalculate limits since the request didn't change, but that's not true if the request has been redirected. From Hannes. - correctly get the bio front segment value set for single segment bio's, fixing a BUG() in blk-merge. From Ming" * 'for-linus' of git://git.kernel.dk/linux-block: nvme: temporary fix for Apple controller reset null_blk: change type of completion_nsec to unsigned long null_blk: guarantee device restart in all irq modes null_blk: set a separate timer for each command blk-merge: fix computing bio->bi_seg_front_size in case of single segment direct-io: Fix negative return from dio read beyond eof block: Always check queue limits for cloned requests lightnvm: missing nvm_lock acquire lightnvm: unconverted ppa returned in get_bb_tbl lightnvm: refactor and change vendor id for qemu lightnvm: do device max sectors boundary check first lightnvm: fix ioctl memory leaks lightnvm: free memory when gennvm register fails lightnvm: Simplify config when disabled Return EBUSY from BLKRRPART for mounted whole-dev fs
2015-12-03	Merge tag 'trace-v4.4-rc3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "During the merge window I added a new file that is used to filter trace events on pids. It filters all events where only tasks with their pid in that file exists. It also handles the sched_switch and sched_wakeup trace events where the current task does not have its pid in the file, but the task either being switched to or awaken does. Unfortunately, I forgot about sched_wakeup_new and sched_waking. Both of these tracepoints use the same class as the sched_wakeup tracepoint, and they too should be included in what gets filtered by the set_event_pid file" * tag 'trace-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Add sched_wakeup_new and sched_waking tracepoints for pid filter
2015-12-03	Merge tag 'mac80211-for-davem-2015-12-02' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A small set of fixes for 4.4: * fix scanning in mac80211 to not actively scan radar channels (from Antonio) * fix uninitialized variable in remain-on-channel that could lead to treating frame TX as remain-on-channel and not sending the frame at all * remove NL80211_FEATURE_FULL_AP_CLIENT_STATE again, it was broken and needs more work, we'll enable it later * fix call_rcu() induced use-after-reset/free in mesh (that was suddenly causing issues in certain tests) * always request block-ack window size 64 as we found some APs will otherwise crash (really ...) * fix P2P-Device teardown sequence to avoid restarting with uninitialized data ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	net: phy: reset only targeted phy	Jérôme Pouiller
	It is possible to address another chip on same MDIO bus. The case is correctly handled for media advertising. It is taken into account only if mii_data->phy_id == phydev->addr. However, this condition was missing for reset case. Signed-off-by: Jérôme Pouiller <jezz@sysmic.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	Merge branch 'bnxt_en-fixes'	David S. Miller
	Michael Chan says: ==================== bnxt_en: set mac address and uc_list bug fixes. Fix ndo_set_mac_address() for PF and VF. Re-apply uc_list after chip reset. v2: Fix compile error if CONFIG_BNXT_SRIOV is not set. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	bnxt_en: Setup uc_list mac filters after resetting the chip.	Michael Chan
	Call bnxt_cfg_rx_mode() in bnxt_init_chip() to setup uc_list and mc_list mac address filters. Before the patch, uc_list is not setup again after chip reset (such as ethtool ring size change) and macvlans don't work any more after that. Modify bnxt_cfg_rx_mode() to return error codes appropriately so that the init chip sequence can detect any failures. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	bnxt_en: enforce proper storing of MAC address	Jeffrey Huang
	For PF, the bp->pf.mac_addr always holds the permanent MAC addr assigned by the HW. For VF, the bp->vf.mac_addr always holds the administrator assigned VF MAC addr. The random generated VF MAC addr should never get stored to bp->vf.mac_addr. This way, when the VF wants to change the MAC address, we can tell if the adminstrator has already set it and disallow the VF from changing it. v2: Fix compile error if CONFIG_BNXT_SRIOV is not set. Signed-off-by: Jeffrey Huang <huangjw@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	bnxt_en: Fixed incorrect implementation of ndo_set_mac_address	Jeffrey Huang
	The existing ndo_set_mac_address only copies the new MAC addr and didn't set the new MAC addr to the HW. The correct way is to delete the existing default MAC filter from HW and add the new one. Because of RFS filters are also dependent on the default mac filter l2 context, the driver must go thru close_nic() to delete the default MAC and RFS filters, then open_nic() to set the default MAC address to HW. Signed-off-by: Jeffrey Huang <huangjw@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	net: lpc_eth: remove irq > NR_IRQS check from probe()	Vladimir Zapolskiy
	If the driver is used on an ARM platform with SPARSE_IRQ defined, semantics of NR_IRQS is different (minimal value of virtual irqs) and by default it is set to 16, see arch/arm/include/asm/irq.h. This value may be less than the actual number of virtual irqs, which may break the driver initialization. The check removal allows to use the driver on such a platform, and, if irq controller driver works correctly, the check is not needed on legacy platforms. Fixes a runtime problem: lpc-eth 31060000.ethernet: error getting resources. lpc_eth: lpc-eth: not found (-6). Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	net_sched: fix qdisc_tree_decrease_qlen() races	Eric Dumazet
	qdisc_tree_decrease_qlen() suffers from two problems on multiqueue devices. One problem is that it updates sch->q.qlen and sch->qstats.drops on the mq/mqprio root qdisc, while it should not : Daniele reported underflows errors : [ 681.774821] PAX: sch->q.qlen: 0 n: 1 [ 681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head; [ 681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G O 4.2.6.201511282239-1-grsec #1 [ 681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015 [ 681.774956] ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c [ 681.774959] ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b [ 681.774960] ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001 [ 681.774962] Call Trace: [ 681.774967] [<ffffffffa95d2810>] dump_stack+0x4c/0x7f [ 681.774970] [<ffffffffa91a44f4>] report_size_overflow+0x34/0x50 [ 681.774972] [<ffffffffa94d17e2>] qdisc_tree_decrease_qlen+0x152/0x160 [ 681.774976] [<ffffffffc02694b1>] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel] [ 681.774978] [<ffffffffc02680a0>] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel] [ 681.774980] [<ffffffffa94cd92d>] __qdisc_run+0x4d/0x1d0 [ 681.774983] [<ffffffffa949b2b2>] net_tx_action+0xc2/0x160 [ 681.774985] [<ffffffffa90664c1>] __do_softirq+0xf1/0x200 [ 681.774987] [<ffffffffa90665ee>] run_ksoftirqd+0x1e/0x30 [ 681.774989] [<ffffffffa90896b0>] smpboot_thread_fn+0x150/0x260 [ 681.774991] [<ffffffffa9089560>] ? sort_range+0x40/0x40 [ 681.774992] [<ffffffffa9085fe4>] kthread+0xe4/0x100 [ 681.774994] [<ffffffffa9085f00>] ? kthread_worker_fn+0x170/0x170 [ 681.774995] [<ffffffffa95d8d1e>] ret_from_fork+0x3e/0x70 mq/mqprio have their own ways to report qlen/drops by folding stats on all their queues, with appropriate locking. A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup() without proper locking : concurrent qdisc updates could corrupt the list that qdisc_match_from_root() parses to find a qdisc given its handle. Fix first problem adding a TCQ_F_NOPARENT qdisc flag that qdisc_tree_decrease_qlen() can use to abort its tree traversal, as soon as it meets a mq/mqprio qdisc children. Second problem can be fixed by RCU protection. Qdisc are already freed after RCU grace period, so qdisc_list_add() and qdisc_list_del() simply have to use appropriate rcu list variants. A future patch will add a per struct netdev_queue list anchor, so that qdisc_tree_decrease_qlen() can have more efficient lookups. Reported-by: Daniele Fucini <dfucini@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Cong Wang <cwang@twopensource.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	openvswitch: fix hangup on vxlan/gre/geneve device deletion	Paolo Abeni
	Each openvswitch tunnel vport (vxlan,gre,geneve) holds a reference to the underlying tunnel device, but never released it when such device is deleted. Deleting the underlying device via the ip tool cause the kernel to hangup in the netdev_wait_allrefs() loop. This commit ensure that on device unregistration dp_detach_port_notify() is called for all vports that hold the device reference, properly releasing it. Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device") Fixes: b2acd1dc3949 ("openvswitch: Use regular GRE net_device instead of vport") Fixes: 6b001e682e90 ("openvswitch: Use Geneve device.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	Merge branch 'mkp-fixes' into fixes	James Bottomley

2015-12-03	mpt3sas: fix Kconfig dependency problem for mpt2sas back compatibility	James Bottomley
	The non-PCI builds of the O day test project are failing: On Thu, 2015-12-03 at 05:02 +0800, kbuild test robot wrote: > warning: (SCSI_MPT2SAS) selects SCSI_MPT3SAS which has unmet direct > dependencies (SCSI_LOWLEVEL && PCI && SCSI) The problem is that select and depend don't interact because Kconfig doesn't have a SAT solver, so depend picks up dependencies and select does onward selects, but select doesn't pick up dependencies. To fix this, we need to add the correct dependencies to the MPT2SAS option like this. Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: b840c3627b6f4f856b333a14a72f8ed86da2f86c Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2015-12-03	ipv4: igmp: Allow removing groups from a removed interface	Andrew Lunn
	When a multicast group is joined on a socket, a struct ip_mc_socklist is appended to the sockets mc_list containing information about the joined group. If the interface is hot unplugged, this entry becomes stale. Prior to commit 52ad353a5344f ("igmp: fix the problem when mc leave group") it was possible to remove the stale entry by performing a IP_DROP_MEMBERSHIP, passing either the old ifindex or ip address on the interface. However, this fix enforces that the interface must still exist. Thus with time, the number of stale entries grows, until sysctl_igmp_max_memberships is reached and then it is not possible to join and more groups. The previous patch fixes an issue where a IP_DROP_MEMBERSHIP is performed without specifying the interface, either by ifindex or ip address. However here we do supply one of these. So loosen the restriction on device existence to only apply when the interface has not been specified. This then restores the ability to clean up the stale entries. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: 52ad353a5344f "(igmp: fix the problem when mc leave group") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	ipv6: sctp: implement sctp_v6_destroy_sock()	Eric Dumazet
	Dmitry Vyukov reported a memory leak using IPV6 SCTP sockets. We need to call inet6_destroy_sock() to properly release inet6 specific fields. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	Merge branch 'for-upstream' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2015-12-01 Here's a Bluetooth fix for the 4.4-rc series that fixes a memory leak of the Security Manager L2CAP channel that'll happen for every LE connection. Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	arm64: bpf: add 'store immediate' instruction	Yang Shi
	aarch64 doesn't have native store immediate instruction, such operation has to be implemented by the below instruction sequence: Load immediate to register Store register Signed-off-by: Yang Shi <yang.shi@linaro.org> CC: Zi Shen Lim <zlim.lnx@gmail.com> CC: Xi Wang <xi.wang@gmail.com> Reviewed-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	ipv6: kill sk_dst_lock	Eric Dumazet
	While testing the np->opt RCU conversion, I found that UDP/IPv6 was using a mixture of xchg() and sk_dst_lock to protect concurrent changes to sk->sk_dst_cache, leading to possible corruptions and crashes. ip6_sk_dst_lookup_flow() uses sk_dst_check() anyway, so the simplest way to fix the mess is to remove sk_dst_lock completely, as we did for IPv4. __ip6_dst_store() and ip6_dst_store() share same implementation. sk_setup_caps() being called with socket lock being held or not, we have to use sk_dst_set() instead of __sk_dst_set() Note that I had to move the "np->dst_cookie = rt6_get_cookie(rt);" in ip6_dst_store() before the sk_setup_caps(sk, dst) call. This is because ip6_dst_store() can be called from process context, without any lock held. As soon as the dst is installed in sk->sk_dst_cache, dst can be freed from another cpu doing a concurrent ip6_dst_store() Doing the dst dereference before doing the install is needed to make sure no use after free would trigger. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	ipv6: sctp: add rcu protection around np->opt	Eric Dumazet
	This patch completes the work I did in commit 45f6fad84cc3 ("ipv6: add complete rcu protection around np->opt"), as I missed sctp part. This simply makes sure np->opt is used with proper RCU locking and accessors. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03	net/neighbour: fix crash at dumping device-agnostic proxy entries	Konstantin Khlebnikov
	Proxy entries could have null pointer to net-device. Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Fixes: 84920c1420e2 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	sctp: use GFP_USER for user-controlled kmalloc	Marcelo Ricardo Leitner
	Dmitry Vyukov reported that the user could trigger a kernel warning by using a large len value for getsockopt SCTP_GET_LOCAL_ADDRS, as that value directly affects the value used as a kmalloc() parameter. This patch thus switches the allocation flags from all user-controllable kmalloc size to GFP_USER to put some more restrictions on it and also disables the warn, as they are not necessary. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	sctp: convert sack_needed and sack_generation to bits	Marcelo Ricardo Leitner
	They don't need to be any bigger than that and with this we start a new bitfield for tracking association runtime stuff, like zero window situation. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	ipv6: add complete rcu protection around np->opt	Eric Dumazet
	This patch addresses multiple problems : UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions while socket is not locked : Other threads can change np->opt concurrently. Dmitry posted a syzkaller (http://github.com/google/syzkaller) program desmonstrating use-after-free. Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock() and dccp_v6_request_recv_sock() also need to use RCU protection to dereference np->opt once (before calling ipv6_dup_options()) This patch adds full RCU protection to np->opt Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	bpf: fix allocation warnings in bpf maps and integer overflow	Alexei Starovoitov
	For large map->value_size the user space can trigger memory allocation warnings like: WARNING: CPU: 2 PID: 11122 at mm/page_alloc.c:2989 __alloc_pages_nodemask+0x695/0x14e0() Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [<ffffffff82743b56>] dump_stack+0x68/0x92 lib/dump_stack.c:50 [<ffffffff81244ec9>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:460 [<ffffffff812450f9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:493 [< inline >] __alloc_pages_slowpath mm/page_alloc.c:2989 [<ffffffff81554e95>] __alloc_pages_nodemask+0x695/0x14e0 mm/page_alloc.c:3235 [<ffffffff816188fe>] alloc_pages_current+0xee/0x340 mm/mempolicy.c:2055 [< inline >] alloc_pages include/linux/gfp.h:451 [<ffffffff81550706>] alloc_kmem_pages+0x16/0xf0 mm/page_alloc.c:3414 [<ffffffff815a1c89>] kmalloc_order+0x19/0x60 mm/slab_common.c:1007 [<ffffffff815a1cef>] kmalloc_order_trace+0x1f/0xa0 mm/slab_common.c:1018 [< inline >] kmalloc_large include/linux/slab.h:390 [<ffffffff81627784>] __kmalloc+0x234/0x250 mm/slub.c:3525 [< inline >] kmalloc include/linux/slab.h:463 [< inline >] map_update_elem kernel/bpf/syscall.c:288 [< inline >] SYSC_bpf kernel/bpf/syscall.c:744 To avoid never succeeding kmalloc with order >= MAX_ORDER check that elem->value_size and computed elem_size are within limits for both hash and array type maps. Also add __GFP_NOWARN to kmalloc(value_size \| elem_size) to avoid OOM warnings. Note kmalloc(key_size) is highly unlikely to trigger OOM, since key_size <= 512, so keep those kmalloc-s as-is. Large value_size can cause integer overflows in elem_size and map.pages formulas, so check for that as well. Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	Merge branch 'mvneta-fixes'	David S. Miller
	Marcin Wojtas says: ==================== Marvell Armada XP/370/38X Neta fixes I'm sending v4 with corrected commit log of the last patch, in order to avoid possible conflicts between the branches as suggested by Gregory Clement. Best regards, Marcin Wojtas Changes from v4: * Correct commit log of patch 6/6 Changes from v2: * Style fixes in patch updating mbus protection * Remove redundant stable notifications except for patch 4/6 Changes from v1: * update MBUS windows access protection register once, after whole loop * add fixing value of MVNETA_RXQ_INTR_ENABLE_ALL_MASK * add fixing error path for skb_build() * add possibility of setting custom TX IP checksum limit in DT property ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	mvebu: dts: enable IP checksum with jumbo frames for Armada 38x on Port0	Marcin Wojtas
	The Ethernet controller found in the Armada 38x SoC's family support TCP/IP checksumming with frame sizes larger than 1600 bytes, however only on port 0. This commit enables it by setting 'tx-csum-limit' to 9800B in 'ethernet@70000' node. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	net: mvneta: enable setting custom TX IP checksum limit	Marcin Wojtas
	Since Armada 38x SoC can support IP checksum for jumbo frames only on a single port, it means that this feature should be enabled per-port, rather than for the whole SoC. This patch enables setting custom TX IP checksum limit by adding new optional property to the mvneta device tree node. If not used, by default 1600B is set for "marvell,armada-370-neta" and 9800B for other strings, which ensures backward compatibility. Binding documentation is updated accordingly. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	net: mvneta: fix error path for building skb	Marcin Wojtas
	In the actual RX processing, there is same error path for both descriptor ring refilling and building skb fails. This is not correct, because after successful refill, the ring is already updated with newly allocated buffer. Then, in case of build_skb() fail, hitherto code left the original buffer unmapped. This patch fixes above situation by swapping error check of skb build with DMA-unmap of original buffer. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Acked-by: Simon Guinot <simon.guinot@sequanux.org> Cc: <stable@vger.kernel.org> # v4.2+ Fixes a84e32894191 ("net: mvneta: fix refilling for Rx DMA buffers") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	net: mvneta: fix bit assignment for RX packet irq enable	Marcin Wojtas
	A value originally defined in the driver was inappropriate. Even though the ingress was somehow working, writing MVNETA_RXQ_INTR_ENABLE_ALL_MASK to MVNETA_INTR_ENABLE didn't make any effect, because the bits [31:16] are reserved and read-only. This commit updates MVNETA_RXQ_INTR_ENABLE_ALL_MASK to be compliant with the controller's documentation. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	net: mvneta: fix bit assignment in MVNETA_RXQ_CONFIG_REG	Marcin Wojtas
	MVNETA_RXQ_HW_BUF_ALLOC bit which controls enabling hardware buffer allocation was mistakenly set as BIT(1). This commit fixes the assignment. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	net: mvneta: add configuration for MBUS windows access protection	Marcin Wojtas
	This commit adds missing configuration of MBUS windows access protection in mvneta_conf_mbus_windows function - a dedicated variable for that purpose remained there unused since v3.8 initial mvneta support. Because of that the register contents were inherited from the bootloader. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	Merge tag 'spi-fix-v4.4-rc3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "There's one fix for the core here, we weren't reinitialising the actual transferred length in messages when they get reused which meant that we'd just keep adding to the length if a message is reused. This has limited impact since it's only used in error handling cases but will really mess anything that tries to use it up when it triggers. As ever there's a small collection of driver specific fixes too" * tag 'spi-fix-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: bugfix: spi_message.transfer_length does not get reset spi: pl022: handle EPROBE_DEFER for dma spi: bcm63xx: use correct format string for printing a resource spi: mediatek: single device does not require cs_gpios spi: Add missing kerneldoc description for parameter
2015-12-02	cpufreq: use last policy after online for drivers with ->setpolicy	Srinivas Pandruvada
	For cpufreq drivers which use setpolicy interface, after offline->online the policy is set to default. This can be reproduced by setting the default policy of intel_pstate or longrun to ondemand and then change to "performance". After offline and online, the setpolicy will be called with the policy=ondemand. For drivers using governors this condition is handled by storing last_governor, during offline and restoring during online. The same should be done for drivers using setpolicy interface. Storing last_policy during offline and restoring during online. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-02	mac80211: fix off-channel mgmt-tx uninitialized variable usage	Johannes Berg
	In the last change here, I neglected to update the cookie in one code path: when a mgmt-tx has no real cookie sent to userspace as it doesn't wait for a response, but is off-channel. The original code used the SKB pointer as the cookie and always assigned the cookie to the TX SKB in ieee80211_start_roc_work(), but my change turned this around and made the code rely on a valid cookie being passed in. Unfortunately, the off-channel no-wait TX path wasn't assigning one at all, resulting in an uninitialized stack value being used. This wasn't handed back to userspace as a cookie (since in the no-wait case there isn't a cookie), but it was tested for non-zero to distinguish between mgmt-tx and off-channel. Fix this by assigning a dummy non-zero cookie unconditionally, and get rid of a misleading comment and some dead code while at it. I'll clean up the ACK SKB handling separately later. Fixes: 3b79af973cf4 ("mac80211: stop using pointers as userspace cookies") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-02	mac80211: do not actively scan DFS channels	Antonio Quartulli
	DFS channels should not be actively scanned as we can't be sure if we are allowed or not. If the current channel is in the DFS band, active scan might be performed after CSA, but we have no guarantee about other channels, therefore it is safer to prevent active scanning at all. Cc: stable@vger.kernel.org Signed-off-by: Antonio Quartulli <antonio@open-mesh.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-02	mac80211: don't teardown sdata on sdata stop	Eliad Peller
	Interfaces are being initialized (setup) on addition, and torn down on removal. However, p2p device is being torn down when stopped, resulting in the next p2p start operation being done on uninitialized interface. Solve it by calling ieee80211_teardown_sdata() only on interface removal (for the non-netdev case). Signed-off-by: Eliad Peller <eliadx.peller@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> [squashed in fix to call teardown after unregister] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-12-02	Merge branch 'thunderx-fixes'	David S. Miller
	Sunil Goutham says: ==================== thunderx: miscellaneous fixes This patch series contains fixes for various issues observed with BGX and NIC drivers. Changes from v1: - Fixed comment syle in the first patch of the series - Removed 'Increase transmit queue length' patch from the series, will recheck if it's a driver or system issue and resubmit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	net: thunderx: Enable BGX LMAC's RX/TX only after VF is up	Sunil Goutham
	Enable or disable BGX LMAC's RX/TX based on corresponding VF's status. If otherwise, when multiple LMAC's physical link is up then packets from all LMAC's whose corresponding VF is not yet initialized will get forwarded to VF0. This is due to VNIC's default configuration where CPI, RSSI e.t.c point to VF0/QSET0/RQ0. This patch will prevent multiple copies of packets on VF0. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-02	net: thunderx: Switchon carrier only upon interface link up	Sunil Goutham
	Call netif_carrier_on() only if interface's link is up. Switching this on upon IFF_UP by default, is causing issues with ethernet channel bonding in LACP mode. Initial NETDEV_CHANGE notification was being skipped. Also fixed some issues with link/speed/duplex reporting via ethtool. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>