Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync up with Linus #92

Merged
merged 110 commits into from
Aug 14, 2015
Merged

Sync up with Linus #92

merged 110 commits into from
Aug 14, 2015

Conversation

dabrace
Copy link
Owner

@dabrace dabrace commented Aug 14, 2015

No description provided.

Bob Liu and others added 30 commits July 24, 2015 09:07
There is a bug when migrate from !feature-persistent host to feature-persistent
host, because domU still thinks new host/backend doesn't support persistent.
Dmesg like:
backed has not unmapped grant: 839
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 839

The fix is to recheck feature-persistent of new backend in blkif_recover().
See: https://lkml.org/lkml/2015/5/25/469

As Roger suggested, we can split the part of blkfront_connect that checks for
optional features, like persistent grants, indirect descriptors and
flush/barrier features to a separate function and call it from both
blkfront_connect and blkif_recover

Acked-by: Roger Pau Monné <[email protected]>
Signed-off-by: Bob Liu <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
We should consider info->feature_persistent when adding indirect page to list
info->indirect_pages, else the BUG_ON() in blkif_free() would be triggered.

When we are using persistent grants the indirect_pages list
should always be empty because blkfront has pre-allocated enough
persistent pages to fill all requests on the ring.

CC: [email protected]
Acked-by: Roger Pau Monné <[email protected]>
Signed-off-by: Bob Liu <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
…gnt()

The BUG_ON() in purge_persistent_gnt() will be triggered when previous purge
work haven't finished.

There is a work_pending() before this BUG_ON, but it doesn't account if the work
is still currently running.

CC: [email protected]
Acked-by: Roger Pau Monné <[email protected]>
Signed-off-by: Bob Liu <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
…nux/kernel/git/konrad/xen into for-linus

Konrad writes:

"There are three bugs that have been found in the xen-blkfront (and
backend). Two of them have the stable tree CC-ed. They have been found
where an guest is migrating to a host that is missing
'feature-persistent' support (from one that has it enabled). We end up
hitting an BUG() in the driver code."
When the card is not owned by the PCIe bus, we need to
acquire ownership first. This flow is implemented in
iwl_pcie_prepare_card_hw. Because of a hardware bug, we
need to disable link power management before we can
request ownership otherwise the other user of the device
won't get notified that we are requesting the device which
will prevent us from acquire ownership.

Same holds for the down flow where we need to make sure
that any other potential user is notified that the driver
is going down.

CC: <[email protected]> [4.1]
Signed-off-by: Emmanuel Grumbach <[email protected]>
The code checks the total number of iterations to differentiate
between regular scan and scheduled scan. However, regular scan has
a total of one iteration, not zero. As a result, regular scan will
have lower priority than it should have, and in case scheduled
scan is already running when regular scan is requested, regular scan
will be delayed until scheduled scan is aborted.
Fix that by checking for total iterations number of one as an
identifier for regular scan.

Fixes: 133c825 ("iwlwifi: mvm: rename generic_scan_cmd functions to dwell")
Signed-off-by: Avraham Stern <[email protected]>
Signed-off-by: Emmanuel Grumbach <[email protected]>
When inserting a new register into a block, the present bit map size is
increased using krealloc. krealloc does not clear the additionally
allocated memory, leaving it filled with random values. Result is that
some registers are considered cached even though this is not the case.

Fix the problem by clearing the additionally allocated memory. Also, if
the bitmap size does not increase, do not reallocate the bitmap at all
to reduce overhead.

Fixes: 3f4ff56 ("regmap: rbtree: Make cache_present bitmap per node")
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]
Since 88eab47 ("netfilter: conntrack: adjust nf_conntrack_buckets default
value"), the hashtable can easily hit this warning. We got reports from users
that are getting this message in a quite spamming fashion, so better silence
this.

Signed-off-by: Pablo Neira Ayuso <[email protected]>
Acked-by: Florian Westphal <[email protected]>
We recently changed this from nf_conntrack_alloc() to nf_ct_tmpl_alloc()
so the error handling needs to changed to check for NULL instead of
IS_ERR().

Fixes: 0838aa7 ('netfilter: fix netns dependencies with conntrack templates')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
The stuck queue detection mechanism allows to detect queues
that are stuck. For sleeping clients, a queue may rightfully
be stuck: if a poor client implementation stays asleep for
more than 10s, then we don't want to trigger recovery flows
because of that client.
In order to cope with this, I added a mechanism that
monitors the state of the client: when a client goes to
sleep, the timer of his queues is frozen. When he wakes up,
the timer is reset to the right value so that if a client
was awake for more than 10s and the queues are stuck, only
then, the recovery flow will kick in.
This is valid only on non-shared queues: A-MPDU queues.

There was a bug in case we Tx to a sleeping client that has
an empty A-MPDU queue: the timer was armed to now + 10s.
This is bad, but pretty harmless.
The problem is that when the client wakes up, the timer is
modified to be now + remainder. But remainder is 0 since the
queue was empty when that client went to sleep...

Fix this by checking the state of the client before playing
with the timer when we add a packet to an empty queue.

Signed-off-by: Emmanuel Grumbach <[email protected]>
…b/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes

* a fix for the stuck TFD queue mechanism - it was producing
  noisy false alarms.
* a fix for the NIC prepare flow that prevented the driver
  from being able to access the device on certain systems.
* a fix for the scan prority handling which allows the
  regular scan to run even if a scheduled scan is already
  running.
Fixes commit eae79b4 ("rsi: fix memory leak in rsi_load_ta_instructions()")
which stopped the driver from functioning.

Firmware data has been allocated using vmalloc(), resulting in memory
that cannot be used for DMA. Hence the firmware was first copied to a
buffer allocated with kmalloc() in the original code. This patch reverts
the commit and only calls "kfree()" to release the buffer after sending
the data. This fixes the memory leak without breaking the driver.

Add a comment to the kmemdup() calls to explain why this is done, and abort
if memory allocation fails.

Tested on a Topic Miami-Florida board which contains the rsi SDIO chip.

Also added the same kfree() call to the USB glue driver. This was not
tested on actual hardware though, as I only have the SDIO version.

Fixes: eae79b4 ("rsi: fix memory leak in rsi_load_ta_instructions()")
Signed-off-by: Mike Looijmans <[email protected]>
Cc: [email protected]
Signed-off-by: Kalle Valo <[email protected]>
On the 2GHz and and on the 5GHZ band only the extpa_gain setting from
the 5GHz band was checked. this patch makes it check the property from
the correct band.

Signed-off-by: Hauke Mehrtens <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
In commit 33511b1 ("rtlwifi: add support to
send beacon frame"), the mechanism for sending beacons was established. That
patch works correctly for rtl8192cu, but there is a possibility of getting
the following warnings in the PCI drivers:

WARNING: CPU: 1 PID: 2439 at net/mac80211/driver-ops.h:12
ieee80211_bss_info_change_notify+0x179/0x1d0 [mac80211]()
wlp5s0:  Failed check-sdata-in-driver check, flags: 0x0

The warning is followed by a NULL pointer dereference as follows:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000006
IP: [<ffffffffc073998e>] rtl_get_tcb_desc+0x5e/0x760 [rtlwifi]

This problem was reported at http://thread.gmane.org/gmane.linux.kernel.wireless.general/138645,
but no solution was found at that time.

The problem was also reported at https://bugzilla.kernel.org/show_bug.cgi?id=9744
and this solution was developed and tested there.

The USB driver works with a NULL final argument in the adapter_tx() callback;
however, the PCI drivers need a struct rtl_tcb_desc in that position.

Fixes: 33511b1 ("rtlwifi: add support to send beacon frame.")
Signed-off-by: Luis Felipe Dominguez Vega <[email protected]>
Signed-off-by: Larry Finger <[email protected]>
Cc: Stable <[email protected]> [3.19+]
Signed-off-by: Kalle Valo <[email protected]>
…utable

The Sourcery CodeBench Lite 2014.05 toolchain (gcc 4.8.3, binutils
2.24.51) has a GCC which implements -fuse-ld, and it doesn't include
the gold linker, but it lacks an ld.bfd executable in its
installation.  This means that passing -fuse-ld=bfd fails with:

      VDSO    arch/arm/vdso/vdso.so.raw
    collect2: fatal error: cannot find 'ld'

Arguably this is a deficiency in the toolchain, but I suspect it's
commonly used enough that it's worth accommodating: just use

cc-ldoption (to cause a link attempt) instead of cc-option to test
whether we can use -fuse-ld.  So -fuse-ld=bfd won't be used with this
toolchain, but the build will rightly succeed, just as it does for
toolchains which don't implement -fuse-ld (and don't use gold as the
default linker).

Note: this will change the failure mode for a corner case I was trying
to handle in d2b30cd, where the toolchain defaults to the gold
linker and the BFD linker is not found in PATH, from:

      VDSO    arch/arm/vdso/vdso.so.raw
    collect2: fatal error: cannot find 'ld'

i.e. the BFD linker is not found, to:

      OBJCOPY arch/arm/vdso/vdso.so
    BFD: arch/arm/vdso/vdso.so: Not enough room for program headers, try
    linking with -N

that is, we fail to prevent gold from being used as the linker, and it
produces an object that objcopy can't digest.

Reported-by: Baruch Siach <[email protected]>
Tested-by: Baruch Siach <[email protected]>
Tested-by: Raphaël Poggi <[email protected]>
Fixes: d2b30cd ("ARM: 8384/1: VDSO: force use of BFD linker")
Cc: [email protected]
Signed-off-by: Nathan Lynch <[email protected]>
Signed-off-by: Russell King <[email protected]>
When removing a port's netdevice in 'rocker_remove_ports', we should
also free the allocated 'net_device' structure. Do that by calling
'free_netdev' after unregistering it.

Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Fixes: 4b8ac96 ("rocker: introduce rocker switch driver")
Acked-by: Scott Feldman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Alex reported the following crash when using fq_codel
with htb:

  crash> bt
  PID: 630839  TASK: ffff8823c990d280  CPU: 14  COMMAND: "tc"
   [... snip ...]
   #8 [ffff8820ceec17a0] page_fault at ffffffff8160a8c2
      [exception RIP: htb_qlen_notify+24]
      RIP: ffffffffa0841718  RSP: ffff8820ceec1858  RFLAGS: 00010282
      RAX: 0000000000000000  RBX: 0000000000000000  RCX: ffff88241747b400
      RDX: ffff88241747b408  RSI: 0000000000000000  RDI: ffff8811fb27d000
      RBP: ffff8820ceec1868   R8: ffff88120cdeff24   R9: ffff88120cdeff30
      R10: 0000000000000bd4  R11: ffffffffa0840919  R12: ffffffffa0843340
      R13: 0000000000000000  R14: 0000000000000001  R15: ffff8808dae5c2e8
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #9 [...] qdisc_tree_decrease_qlen at ffffffff81565375
  #10 [...] fq_codel_dequeue at ffffffffa084e0a0 [sch_fq_codel]
  #11 [...] fq_codel_reset at ffffffffa084e2f8 [sch_fq_codel]
  #12 [...] qdisc_destroy at ffffffff81560d2d
  #13 [...] htb_destroy_class at ffffffffa08408f8 [sch_htb]
  #14 [...] htb_put at ffffffffa084095c [sch_htb]
  #15 [...] tc_ctl_tclass at ffffffff815645a3
  #16 [...] rtnetlink_rcv_msg at ffffffff81552cb0
  [... snip ...]

As Jamal pointed out, there is actually no need to call dequeue
to purge the queued skb's in reset, data structures can be just
reset explicitly. Therefore, we reset everything except config's
and stats, so that we would have a fresh start after device flipping.

Fixes: 4b549a2 ("fq_codel: Fair Queue Codel AQM")
Reported-by: Alex Gartrell <[email protected]>
Cc: Alex Gartrell <[email protected]>
Cc: Jamal Hadi Salim <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
[[email protected]: added codel_vars_init() and qdisc_qstats_backlog_dec()]
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The driver code allows for the disabling of MSI interrupts; however the
module_parm line was missed and the option fails to show with modinfo.

Signed-off-by: Larry Finger <[email protected]>
Cc: Stable <[email protected]> [3.15+]
Signed-off-by: Kalle Valo <[email protected]>
openvswitch modifies the L4 checksum of a packet when modifying
the ip address. When an IP packet is fragmented only the first
fragment contains an L4 header and checksum. Prior to this change
openvswitch would modify all fragments, modifying application data
in non-first fragments, causing checksum failures in the
reassembled packet.

Signed-off-by: Glenn Griffin <[email protected]>
Acked-by: Pravin B Shelar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
When we share an action within a filter, the bind refcnt
should increase, therefore we should not call tcf_hash_release().

Cc: Jamal Hadi Salim <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
"len" is a signed integer.  We check that len is not negative, so it
goes from zero to INT_MAX.  PAGE_SIZE is unsigned long so the comparison
is type promoted to unsigned long.  ULONG_MAX - 4095 is a higher than
INT_MAX so the condition can never be true.

I don't know if this is harmful but it seems safe to limit "len" to
INT_MAX - 4095.

Fixes: a8c879a ('RDS: Info and stats')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
When vortex_up is failed, the skb buffers allocated by __netdev_alloc_skb
in vortex_open are not released, which may cause resource leaks.
This bug has been submitted before.
This patch modifies the error handling code to fix it.

Signed-off-by: Jia-Ju Bai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Multicast dst are not cached. They carry DST_NOCACHE.

As mentioned in commit f886497 ("ipv4: fix dst race in
sk_dst_get()"), these dst need special care before caching them
into a socket.

Caching them is allowed only if their refcnt was not 0, ie we
must use atomic_inc_not_zero()

Also, we must use READ_ONCE() to fetch sk->sk_rx_dst, as mentioned
in commit d0c294c ("tcp: prevent fetching dst twice in early demux
code")

Fixes: 421b388 ("udp: ipv4: Add udp early demux")
Tested-by: Gregory Hoggarth <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Gregory Hoggarth <[email protected]>
Reported-by: Alex Gartrell <[email protected]>
Cc: Michal Kubeček <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Determine if a fraglist is needed in the tx path, and allocate it if
necessary before setting up the copy and map operations.
Otherwise, undoing the copy and map operations is tricky.

This fixes a use-after-free: if allocating the fraglist failed, the copy
and map operations that had been set up were still executed, writing
over the data area of a freed skb.

Signed-off-by: Ross Lagerwall <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
When a node running DAT receives an ARP request from the LAN for the
first time, it is likely that this node will request the ARP entry
through the distributed ARP table (DAT) in the mesh.

Once a DAT reply is received the asking node must check if the MAC
address for which the IP address has been asked is local. If it is, the
node must drop the ARP reply bceause the client should have replied on
its own locally.

Forwarding this reply means fooling any L2 bridge (e.g. Ethernet
switches) lying between the batman-adv node and the LAN. This happens
because the L2 bridge will think that the client sending the ARP reply
lies somewhere in the mesh, while this node is sitting in the same LAN.

Reported-by: Simon Wunderlich <[email protected]>
Signed-off-by: Marek Lindner <[email protected]>
Signed-off-by: Antonio Quartulli <[email protected]>
batadv_softif_vlan_get() may return NULL which has to be verified
by the caller.

Fixes: 35df3b2 ("batman-adv: fix TT VLAN inconsistency on VLAN re-add")
Reported-by: Ryan Thompson <[email protected]>
Signed-off-by: Marek Lindner <[email protected]>
Signed-off-by: Antonio Quartulli <[email protected]>
The tt_local_entry deletion performed in batadv_tt_local_remove() was neither
protecting against simultaneous deletes nor checking whether the element was
still part of the list before calling hlist_del_rcu().

Replacing the hlist_del_rcu() call with batadv_hash_remove() provides adequate
protection via hash spinlocks as well as an is-element-still-in-hash check to
avoid 'blind' hash removal.

Fixes: 068ee6e ("batman-adv: roaming handling mechanism redesign")
Reported-by: [email protected]
Signed-off-by: Marek Lindner <[email protected]>
Signed-off-by: Antonio Quartulli <[email protected]>
Without this initialization, gateways which actually announce up/down
bandwidth of 0/0 could be added. If these nodes get purged via
_batadv_purge_orig() later, the gw_node structure does not get removed
since batadv_gw_node_delete() updates the gw_node with up/down
bandwidth of 0/0, and the updating function then discards the change
and does not free gw_node.

This results in leaking the gw_node structures, which references other
structures: gw_node -> orig_node -> orig_node_ifinfo -> hardif. When
removing the interface later, the open reference on the hardif may cause
hangs with the infamous "unregister_netdevice: waiting for mesh1 to
become free. Usage count = 1" message.

Signed-off-by: Simon Wunderlich <[email protected]>
Signed-off-by: Marek Lindner <[email protected]>
Signed-off-by: Antonio Quartulli <[email protected]>
The flags were ignored for this function when it was introduced. Also
fix the style problem in kzalloc.

Fixes: 0838aa7 (netfilter: fix netns dependencies with conntrack
templates)
Signed-off-by: Joe Stringer <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
This patch fixes how MGMT_EV_NEW_LONG_TERM_KEY event is build. Right now
val vield is filled with only 1 byte, instead of whole value. This bug
was introduced in
commit 1fc62c5 ("Bluetooth: Fix exposing full value of shortened LTKs")

Before that patch, if you paired with device using bluetoothd using simple
pairing, and then restarted bluetoothd, you would be able to re-connect,
but device would fail to establish encryption and would terminate
connection. After this patch connecting after bluetoothd restart works
fine.

Signed-off-by: Jakub Pawlowski <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
richardweinberger and others added 27 commits August 11, 2015 17:34
In kbuild it is allowed to define objects in files named "Makefile"
and "Kbuild".
Currently localmodconfig reads objects only from "Makefile"s and misses
modules like nouveau.

Link: http://lkml.kernel.org/r/[email protected]

Cc: [email protected]
Reported-and-tested-by: Leonidas Spyropoulos <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
…inux/kernel/git/rostedt/linux-kconfig

Pull localmodconfig fix from Steven Rostedt:
 "Leonidas Spyropoulos found that modules like nouveau were being
  unselected by make localmodconfig even though their configs were set
  and the module was loaded and visible by lsmod.

  The reason for this was because streamline-config.pl only looks at
  Makefiles, and not Kbuild files.  As these modules use Kbuild for
  their names, they too need to be checked by localmodconfig.  This was
  fixed by Richard Weinberger"

* tag 'localmodconfig-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-kconfig:
  localmodconfig: Use Kbuild files too
The device details and mapping trees were just being decremented
before.  Now btree_del() is called to do a deep delete.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Cc: [email protected]
When using nested btrees, the top leaves of the top levels contain
block addresses for the root of the next tree down.  If we shadow a
shared leaf node the leaf values (sub tree roots) should be incremented
accordingly.

This is only an issue if there is metadata sharing in the top levels.
Which only occurs if metadata snapshots are being used (as is possible
with dm-thinp).  And could result in a block from the thinp metadata
snap being reused early, thus corrupting the thinp metadata snap.

Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Cc: [email protected]
When creating dm-cache with the default policy, it will call
request_module("dm-cache-default") to register the default policy.
But the "dm-cache-default" alias was left referring to the MQ policy.
Fix this by moving the module alias to SMQ.

Fixes: bccab6a (dm cache: switch the "default" cache replacement policy from mq to smq)
Signed-off-by: Yi Zhang <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
…/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "regmap: Fix handling of present bits on rbtree cache block resize

  When expanding a cache block we use krealloc() to resize the register
  present bitmap without initialising the newly allocated data (the
  original code was written for kzalloc()).  Add an appropraite memset()
  to fix that"

* tag 'regmap-fix-v4.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: regcache-rbtree: Clean new present bits on present bitmap resize
Signed-off-by: Boyuan Zhang <[email protected]>
Reviewed-by: Christian König <[email protected]>
This reverts commit 78ad5cd.
This commit breaks dpm and suspend/resume on CZ.
…inux

Pull amd drm fixes from Alex Deucher:
 "Dave is on vacation at the moment, so please pull these radeon and
  amdgpu fixes directly.

  Just a few minor things for 4.2:

   - add a new radeon pci id
   - fix a power management regression in amdgpu
   - fix HEVC command buffer validation in amdgpu"

* 'drm-fixes-4.2' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: add new OLAND pci id
  Revert "drm/amdgpu: Configure doorbell to maximum slots"
  drm/amdgpu: add context buffer size check for HEVC
…git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Fix coarse clock monotonicity (VDSO timestamp off by one jiffy
  compared to the syscall one)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: VDSO: fix coarse clock monotonicity regression
In case we need to divert reads/writes using the slave MII bus, we may have
already fetched a valid PHY interface property from Device Tree, and that
mode is used by the PHY driver to make configuration decisions.

If we could not fetch the "phy-mode" property, we will assign p->phy_interface
to PHY_INTERFACE_MODE_NA, such that we can actually check for that condition as
to whether or not we should override the interface value.

Fixes: 1933492 ("net: dsa: Set valid phy interface type")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
When the first slave is added (such as during bootup) the first
gratuitous ARP gets dropped. We don't see this drop during a failover.
The packet gets dropped in qdisc (noop_enqueue).

The fix is to delay the sending of gratuitous ARPs till the bond dev's
carrier is present.

It can also be worked around by setting num_grat_arp to more than 1.

Signed-off-by: Venkat Venkatsubra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
MAX_FILER_IDX is the last usable index.  Using less-than
will already guarantee that one entry for catch-all rule
will be left, no need to subtract 1 here.

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
At a cost of one line let's make sure .count is correct
when calling gfar_process_filer_changes().

Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Current filer rule optimization is broken in several ways:
 (1) Can perform reads/writes beyond end of allocated tables.
     (gianfar_ethtool.c:1326).

(2) It breaks badly for rules with more than 2 specifiers
     (e.g. matching ip, port, tos).

Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1 tos 1 action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2 tos 2 action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3 tos 3 action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND         Q:00           ctrl:00000080 prop:00000210
01: FPR  == 00000210 AND CLE     Q:00           ctrl:00000281 prop:00000210
02: MASK == ffffffff AND         Q:00           ctrl:00000080 prop:ffffffff
03: DPT  == 00000003 AND         Q:00           ctrl:0000008e prop:00000003
04: TOS  == 00000003 AND         Q:00           ctrl:0000008a prop:00000003
05: DIA  == 0a000003 AND         Q:11           ctrl:0000448c prop:0a000003
06: DPT  == 00000002 AND         Q:00           ctrl:0000008e prop:00000002
07: TOS  == 00000002 AND         Q:00           ctrl:0000008a prop:00000002
08: DIA  == 0a000002 AND         Q:09           ctrl:0000248c prop:0a000002
09: DIA  == 0a000001 AND         Q:00           ctrl:0000008c prop:0a000001
0a: DPT  == 00000001 AND         Q:00           ctrl:0000008e prop:00000001
0b: TOS  == 00000001     CLE     Q:01           ctrl:0000060a prop:00000001
ff: MASK >= 00000000             Q:00           ctrl:00000020 prop:00000000

(Entire cluster gets AND-ed together).

 (3) We observed that the masking rules it generates do not
     play well with clustering on P2020.  Only first rule
     of the cluster would ever fire.  Given that optimizer
     relies heavily on masking this is very hard to fix.

Example:
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.1 dst-port 1  action 1
Added rule with ID 254
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.2 dst-port 2  action 9
Added rule with ID 253
# ethtool -N eth2 flow-type udp4 dst-ip 10.0.0.3 dst-port 3  action 17
Added rule with ID 252
# ./filer_decode /sys/kernel/debug/gfar1/filer_raw
00: MASK == 00000210 AND         Q:00           ctrl:00000080 prop:00000210
01: FPR  == 00000210 AND CLE     Q:00           ctrl:00000281 prop:00000210
02: MASK == ffffffff AND         Q:00           ctrl:00000080 prop:ffffffff
03: DPT  == 00000003 AND         Q:00           ctrl:0000008e prop:00000003
04: DIA  == 0a000003             Q:11           ctrl:0000440c prop:0a000003
05: DPT  == 00000002 AND         Q:00           ctrl:0000008e prop:00000002
06: DIA  == 0a000002             Q:09           ctrl:0000240c prop:0a000002
07: DIA  == 0a000001 AND         Q:00           ctrl:0000008c prop:0a000001
08: DPT  == 00000001     CLE     Q:01           ctrl:0000060e prop:00000001
ff: MASK >= 00000000             Q:00           ctrl:00000020 prop:00000000

Which looks correct according to the spec but only the first
(eth id 252)/last added rule for 10.0.0.3 will ever trigger.
As if filer did not treat the AND CLE as cluster start but
also kept AND-ing the rules.  We found no errata covering this.

The fact that nobody noticed (2) or (3) makes me think
that this feature is not very widely used and we should just
remove it.

Reported-by: Aleksander Dutkowski <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Acked-by: Claudiu Manoil <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Jakub Kicinski says:

====================
gianfar: filer changes

respinning with examples as requested.
====================

Signed-off-by: David S. Miller <[email protected]>
If register_hdlc_device() fails, the current code returns 0 but we
should return an error code instead.

Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
The commit

  de3910e ("edac: change the mem allocation scheme to
		 make Documentation/kobject.txt happy")

changed the memory allocation for the csrows member. But ppc4xx_edac was
forgotten in the patch. Fix it.

Signed-off-by: Michael Walle <[email protected]>
Cc: <[email protected]>
Cc: linux-edac <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
…rnel/git/bp/bp

Pull EDAC fix from Borislav Petkov:
 "A ppc4xx_edac fix for accessing ->csrows properly.  This driver was
  missed during the conversion a couple of years ago"

* tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, ppc4xx: Access mci->csrows array elements properly
Pull networking fixes from David Miller:

 1) Workaround hw bug when acquiring PCI bos ownership of iwlwifi
    devices, from Emmanuel Grumbach.

 2) Falling back to vmalloc in conntrack should not emit a warning, from
    Pablo Neira Ayuso.

 3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
    Felipe Dominguez Vega.

 4) Rocker doesn't free netdev on device removal, from Ido Schimmel.

 5) UDP multicast early sock demux has route handling races, from Eric
    Dumazet.

 6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.

 7) Fix use-after-free in skb_set_peeked, from Herbert Xu.

 8) Don't advertize NETIF_F_FRAGLIST in virtio_net driver, this can lead
    to fraglists longer than the driver can support.  From Jason Wang.

 9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.

10) Fix interrupt storm in bna driver, from Ivan Vecera.

11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.

12) Fix inet request sock leak, from Eric Dumazet.

13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
    driver, from LEROY Christophe.

14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
    to get fixed any time soon.  From Jakub Kicinski

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
  cosa: missing error code on failure in probe()
  gianfar: remove faulty filer optimizer
  gianfar: correct list membership accounting
  gianfar: correct filer table writing
  bonding: Gratuitous ARP gets dropped when first slave added
  net: dsa: Do not override PHY interface if already configured
  net: fs_enet: mask interrupts for TX partial frames.
  net: fs_enet: explicitly remove I flag on TX partial frames
  inet: fix possible request socket leak
  inet: fix races with reqsk timers
  mkiss: Fix error handling in mkiss_open()
  bnx2x: Free NVRAM lock at end of each page
  bnx2x: Prevent null pointer dereference on SKB release
  cxgb4: missing curly braces in t4_setup_debugfs()
  net-timestamp: Update skb_complete_tx_timestamp comment
  ipv6: don't reject link-local nexthop on other interface
  netlink: make sure -EBUSY won't escape from netlink_insert
  bna: fix interrupts storm caused by erroneous packets
  net: mvpp2: replace TX coalescing interrupts with hrtimer
  net: mvpp2: enable proper per-CPU TX buffers unmapping
  ...
This reverts commits 9a036b9 ("x86/signal/64: Remove 'fs' and 'gs'
from sigcontext") and c6f2062 ("x86/signal/64: Fix SS handling for
signals delivered to 64-bit programs").

They were cleanups, but they break dosemu by changing the signal return
behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
not actually changing any behavior - causes build problems).

Reported-and-tested-by: Stas Sergeev <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
…ux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - revert a fix from 4.2-rc5 that was causing lots of WARNING spam.

 - fix a memory leak affecting backends in HVM guests.

 - fix PV domU hang with certain configurations.

* tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
  Revert "xen/events/fifo: Handle linked events when closing a port"
  x86/xen: build "Xen PV" APIC driver for domU as well
Pull xen block driver fixes from Jens Axboe:
 "A few small bug fixes for xen-blk{front,back} that have been sitting
  over my vacation"

* 'for-linus' of git://git.kernel.dk/linux-block:
  xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
  xen-blkfront: don't add indirect pages to list when !feature_persistent
  xen-blkfront: introduce blkfront_gather_backend_features()
…el/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - two stable fixes for corruption seen in a snapshot of thinp metadata;
   metadata snapshots aren't widely used but help provide a consistent
   view of the metadata associated with an active thin-pool.

 - a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"

* tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache policy smq: move 'dm-cache-default' module alias to SMQ
  dm btree: add ref counting ops for the leaves of top level btrees
  dm thin metadata: delete btrees when releasing metadata snapshot
…mbers

Commit 3f5159a ("x86/asm/entry/32: Update -ENOSYS handling to match
the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
The proper error return value was never loaded into %rax, except if
things just happened to go through the audit paths, which ended up
reloading the return value.

This moves the loading or %rax into the normal system call path, just to
make sure the error case triggers it.  It's kind of sad, since it adds a
useless instruction to reload the register to the fast path, but it's
not like that single load from the stack is going to be noticeable.

Reported-by: David Drysdale <[email protected]>
Tested-by: Kees Cook <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Pull ARM fixes from Russell King:
 "Another few small ARM fixes, mostly addressing some VDSO issues"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8410/1: VDSO: fix coarse clock monotonicity regression
  ARM: 8409/1: Mark ret_fast_syscall as a function
  ARM: 8408/1: Fix the secondary_startup function in Big Endian case
  ARM: 8405/1: VDSO: fix regression with toolchains lacking ld.bfd executable
@dabrace dabrace merged this pull request into dabrace:pmcpqi-next Aug 14, 2015
dabrace pushed a commit that referenced this pull request Dec 16, 2016
There is at least one Chelsio 10Gb card which uses VPD area to store some
non-standard blocks (example below).  However pci_vpd_size() returns the
length of the first block only assuming that there can be only one VPD "End
Tag".

Since 4e1a635 ("vfio/pci: Use kernel VPD access functions"), VFIO
blocks access beyond that offset, which prevents the guest "cxgb3" driver
from probing the device.  The host system does not have this problem as its
driver accesses the config space directly without pci_read_vpd().

Add a quirk to override the VPD size to a bigger value.  The maximum size
is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h.
We do not read the tag as the cxgb3 driver does as the driver supports
writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes
boundary.  The quirk is registered for all devices supported by the cxgb3
driver.

This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3
driver itself accesses VPD directly and the problem only exists with the
vfio-pci driver (when cxgb3 is not running on the host and may not be even
loaded) which blocks accesses beyond the first block of VPD data.  However
vfio-pci itself does not have quirks mechanism so we add it to PCI.

This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]

This is what I parsed from its VPD:
===
b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K'
 0000 Large item 42 bytes; name 0x2 Identifier String
	b'10 Gigabit Ethernet-SR PCI Express Adapter'
 002d Large item 74 bytes; name 0x10
	#00 [EC] len=7: b'D76809 '
	#0a [FN] len=7: b'46K7897'
	#14 [PN] len=7: b'46K7897'
	#1e [MN] len=4: b'1037'
	#25 [FC] len=4: b'5769'
	#2c [SN] len=12: b'YL102035603V'
	#3b [NA] len=12: b'00145E992ED1'
 007a Small item 1 bytes; name 0xf End Tag

 0c00 Large item 16 bytes; name 0x2 Identifier String
	b'S310E-SR-X      '
 0c13 Large item 234 bytes; name 0x10
	#00 [PN] len=16: b'TBD             '
	#13 [EC] len=16: b'110107730D2     '
	#26 [SN] len=16: b'97YL102035603V  '
	#39 [NA] len=12: b'00145E992ED1'
	#48 [V0] len=6: b'175000'
	#51 [V1] len=6: b'266666'
	#5a [V2] len=6: b'266666'
	#63 [V3] len=6: b'2000  '
	#6c [V4] len=2: b'1 '
	#71 [V5] len=6: b'c2    '
	#7a [V6] len=6: b'0     '
	#83 [V7] len=2: b'1 '
	#88 [V8] len=2: b'0 '
	#8d [V9] len=2: b'0 '
	#92 [VA] len=2: b'0 '
	#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
 0d00 Large item 252 bytes; name 0x11
	#00 [VC] len=16: b'122310_1222 dp  '
	#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
	#26 [VE] len=16: b'122310_1353 fp  '
	#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
	#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
 0dff Small item 0 bytes; name 0xf End Tag

10f3 Large item 13315 bytes; name 0x62
!!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
===

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.