linux.git - Linux Kernel

Age	Commit message (Collapse)	Author
2021-08-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h 9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp") 9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins") 099fdeda659d ("bnxt_en: Event handler for PPS events") kernel/bpf/helpers.c include/linux/bpf-cgroup.h a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()") c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current") drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c 5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray") 2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer") MAINTAINERS 7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo") 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-10	net: switchdev: zero-initialize struct switchdev_notifier_fdb_info emitted ↵	Vladimir Oltean
	by drivers towards the bridge The blamed commit added a new field to struct switchdev_notifier_fdb_info, but did not make sure that all call paths set it to something valid. For example, a switchdev driver may emit a SWITCHDEV_FDB_ADD_TO_BRIDGE notifier, and since the 'is_local' flag is not set, it contains junk from the stack, so the bridge might interpret those notifications as being for local FDB entries when that was not intended. To avoid that now and in the future, zero-initialize all switchdev_notifier_fdb_info structures created by drivers such that all newly added fields to not need to touch drivers again. Fixes: 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications") Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20210810115024.1629983-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09	page_pool: keep pp info as long as page pool owns the page	Yunsheng Lin
	Currently, page->pp is cleared and set everytime the page is recycled, which is unnecessary. So only set the page->pp when the page is added to the page pool and only clear it when the page is released from the page pool. This is also a preparation to support allocating frag page in page pool. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09	devlink: Set device as early as possible	Leon Romanovsky
	All kernel devlink implementations call to devlink_alloc() during initialization routine for specific device which is used later as a parent device for devlink_register(). Such late device assignment causes to the situation which requires us to call to device_register() before setting other parameters, but that call opens devlink to the world and makes accessible for the netlink users. Any attempt to move devlink_register() to be the last call generates the following error due to access to the devlink->dev pointer. [ 8.758862] devlink_nl_param_fill+0x2e8/0xe50 [ 8.760305] devlink_param_notify+0x6d/0x180 [ 8.760435] __devlink_params_register+0x2f1/0x670 [ 8.760558] devlink_params_register+0x1e/0x20 The simple change of API to set devlink device in the devlink_alloc() instead of devlink_register() fixes all this above and ensures that prior to call to devlink_register() everything already set. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-06	net: mvvp2: fix short frame size on s390	John Hubbard
	On s390, the following build warning occurs: drivers/net/ethernet/marvell/mvpp2/mvpp2.h:844:2: warning: overflow in conversion from 'long unsigned int' to 'int' changes value from '18446744073709551584' to '-32' [-Woverflow] 844 \| ((total_size) - MVPP2_SKB_HEADROOM - MVPP2_SKB_SHINFO_SIZE) This happens because MVPP2_SKB_SHINFO_SIZE, which is 320 bytes (which is already 64-byte aligned) on some architectures, actually gets ALIGN'd up to 512 bytes in the s390 case. So then, when this is invoked: MVPP2_RX_MAX_PKT_SIZE(MVPP2_BM_SHORT_FRAME_SIZE) ...that turns into: 704 - 224 - 512 == -32 ...which is not a good frame size to end up with! The warning above is a bit lucky: it notices a signed/unsigned bad behavior here, which leads to the real problem of a frame that is too short for its contents. Increase MVPP2_BM_SHORT_FRAME_SIZE by 32 (from 704 to 736), which is just exactly big enough. (The other values can't readily be changed without causing a lot of other problems.) Fixes: 07dd0a7aae7f ("mvpp2: add basic XDP support") Cc: Sven Auhagen <sven.auhagen@voleatech.de> Cc: Matteo Croce <mcroce@microsoft.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Build failure in drivers/net/wwan/mhi_wwan_mbim.c: add missing parameter (0, assuming we don't want buffer pre-alloc). Conflict in drivers/net/dsa/sja1105/sja1105_main.c between: 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") 0fac6aa098ed ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode") Follow the instructions from the commit message of the former commit - removed the if conditions. When looking at commit 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") note that the mask_iotag fields get removed by the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-04	net/prestera: Fix devlink groups leakage in error flow	Leon Romanovsky
	Devlink trap group is registered but not released in error flow, add the missing devlink_trap_groups_unregister() call. Fixes: 0a9003f45e91 ("net: marvell: prestera: devlink: add traps/groups implementation") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-03	octeontx2-af: Fix spelling mistake "Makesure" -> "Make sure"	Colin Ian King
	There is a spelling mistake in a NL_SET_ERR_MSG_MOD message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210803105617.338546-1-colin.king@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-02	net: marvell: make the array name static, makes object smaller	Colin Ian King
	Don't populate the const array name on the stack but instead it static. Makes the object code smaller by 28 bytes. Add a missing const to clean up a checkpatch warning. Before: text data bss dec hex filename 124565 31565 384 156514 26362 drivers/net/ethernet/marvell/sky2.o After: text data bss dec hex filename 124441 31661 384 156486 26346 drivers/net/ethernet/marvell/sky2.o (gcc version 10.2.0) Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210801150647.145728-1-colin.king@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-02	octeontx2-pf: cn10k: Config DWRR weight based on MTU	Sunil Goutham
	Program SQ, MDQ, TL4 to TL2 transmit scheduler queues' DWRR weight based on DWRR MTU programmed at NIX_AF_DWRR_RPM_MTU. The DWRR MTU from admin function is retrieved via mbox. On OcteaonTx2 silicon, admin function driver responds with DWRR MTU as '1'. This helps to avoid silicon specific transmit scheduler DWRR quantum/weight configuration logic. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-02	octeontx2-af: cn10k: DWRR MTU configuration	Sunil Goutham
	On OcteonTx2 DWRR quantum is directly configured into each of the transmit scheduler queues. And PF/VF drivers were free to config any value upto 2^24. On CN10K, HW is modified, the quantum configuration at scheduler queues is in terms of weight. And SW needs to setup a base DWRR MTU at NIX_AF_DWRR_RPM_MTU / NIX_AF_DWRR_SDP_MTU. HW will do 'DWRR MTU * weight' to get the quantum. For LBK traffic, value programmed into NIX_AF_DWRR_RPM_MTU register is considered as DWRR MTU. This patch programs a default DWRR MTU of 8192 into HW and also provides a way to change this via devlink params. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-31	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Conflicting commits, all resolutions pretty trivial: drivers/bus/mhi/pci_generic.c 5c2c85315948 ("bus: mhi: pci-generic: configurable network interface MRU") 56f6f4c4eb2a ("bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean") drivers/nfc/s3fwrn5/firmware.c a0302ff5906a ("nfc: s3fwrn5: remove unnecessary label") 46573e3ab08f ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") 801e541c79bb ("nfc: s3fwrn5: fix undefined parameter values in dev_err()") MAINTAINERS 7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver") 8a7b46fa7902 ("MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-27	dev_ioctl: split out ndo_eth_ioctl	Arnd Bergmann
	Most users of ndo_do_ioctl are ethernet drivers that implement the MII commands SIOCGMIIPHY/SIOCGMIIREG/SIOCSMIIREG, or hardware timestamping with SIOCSHWTSTAMP/SIOCGHWTSTAMP. Separate these from the few drivers that use ndo_do_ioctl to implement SIOCBOND, SIOCBR and SIOCWANDEV commands. This is a purely cosmetic change intended to help readers find their way through the implementation. Cc: Doug Ledford <dledford@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: linux-rdma@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-27	octeontx2-af: Do NIX_RX_SW_SYNC twice	Sunil Goutham
	NIX_RX_SW_SYNC ensures all existing transactions are finished and pkts are written to LLC/DRAM, queues should be teared down after successful SW_SYNC. Due to a HW errata, in some rare scenarios an existing transaction might end after SW_SYNC operation. To ensure operation is fully done, do the SW_SYNC twice. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25	octeontx2-pf: Dont enable backpressure on LBK links	Hariprasad Kelam
	Avoid configure backpressure for LBK links as they don't support it and enable lmacs before configuration pause frames. Fixes: 75f36270990c ("octeontx2-pf: Support to enable/disable pause frames via ethtool") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25	octeontx2-pf: Fix interface down flag on error	Geetha sowjanya
	In the existing code while changing the number of TX/RX queues using ethtool the PF/VF interface resources are freed and reallocated (otx2_stop and otx2_open is called) if the device is in running state. If any resource allocation fails in otx2_open, driver free already allocated resources and return. But again, when the number of queues changes as the device state still running oxt2_stop is called. In which we try to free already freed resources leading to driver crash. This patch fixes the issue by setting the INTF_DOWN flag on error and free the resources in otx2_stop only if the flag is not set. Fixes: 50fe6c02e5ad ("octeontx2-pf: Register and handle link notifications") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25	octeontx2-af: Fix PKIND overlap between LBK and LMAC interfaces	Geetha sowjanya
	Currently PKINDs are not assigned to LBK channels. The default value of LBK_CHX_PKIND (channel to PKIND mapping) register is zero, which is resulting in a overlap of pkind between LBK and CGX LMACs. When KPU1 parser config is modified when PTP timestamping is enabled on the CGX LMAC interface it is impacting traffic on LBK interfaces as well. This patch fixes the issue by reserving the PKIND#0 for LBK devices. CGX mapped PF pkind starts from 1 and also fixes the max pkind available. Fixes: 421572175ba5 ("octeontx2-af: Support to enable/disable HW timestamping") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23	octeontx2-af: Fix uninitialized variables in rvu_switch	Subbaraya Sundeep
	Get the number of VFs of a PF correctly by calling rvu_get_pf_numvfs in rvu_switch_disable function. Also hwvf is not required hence remove it. Fixes: 23109f8dd06d ("octeontx2-af: Introduce internal packet switching") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23	octeontx2-af: Enhance mailbox trace entry	Jerin Jacob
	Added mailbox id to name translation on trace entry for better tracing output. Before the change: otx2_msg_process: [0002:01:00.0] msg:(0x03) error:0 After the change: otx2_msg_process: [0002:01:00.0] msg:(DETACH_RESOURCES) error:0 Signed-off-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23	net: bridge: switchdev: allow the TX data plane forwarding to be offloaded	Tobias Waldekranz
	Allow switchdevs to forward frames from the CPU in accordance with the bridge configuration in the same way as is done between bridge ports. This means that the bridge will only send a single skb towards one of the ports under the switchdev's control, and expects the driver to deliver the packet to all eligible ports in its domain. Primarily this improves the performance of multicast flows with multiple subscribers, as it allows the hardware to perform the frame replication. The basic flow between the driver and the bridge is as follows: - When joining a bridge port, the switchdev driver calls switchdev_bridge_port_offload() with tx_fwd_offload = true. - The bridge sends offloadable skbs to one of the ports under the switchdev's control using skb->offload_fwd_mark = true. - The switchdev driver checks the skb->offload_fwd_mark field and lets its FDB lookup select the destination port mask for this packet. v1->v2: - convert br_input_skb_cb::fwd_hwdoms to a plain unsigned long - introduce a static key "br_switchdev_fwd_offload_used" to minimize the impact of the newly introduced feature on all the setups which don't have hardware that can make use of it - introduce a check for nbp->flags & BR_FWD_OFFLOAD to optimize cache line access - reorder nbp_switchdev_frame_mark_accel() and br_handle_vlan() in __br_forward() - do not strip VLAN on egress if forwarding offload on VLAN-aware bridge is being used - propagate errors from .ndo_dfwd_add_station() if not EOPNOTSUPP v2->v3: - replace the solution based on .ndo_dfwd_add_station with a solution based on switchdev_bridge_port_offload - rename BR_FWD_OFFLOAD to BR_TX_FWD_OFFLOAD v3->v4: rebase v4->v5: - make sure the static key is decremented on bridge port unoffload - more function and variable renaming and comments for them: br_switchdev_fwd_offload_used to br_switchdev_tx_fwd_offload br_switchdev_accels_skb to br_switchdev_frame_uses_tx_fwd_offload nbp_switchdev_frame_mark_tx_fwd to nbp_switchdev_frame_mark_tx_fwd_to_hwdom nbp_switchdev_frame_mark_accel to nbp_switchdev_frame_mark_tx_fwd_offload fwd_accel to tx_fwd_offload Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	David S. Miller
	Conflicts are simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-23	octeontx2-af: Remove unnecessary devm_kfree	Sunil Goutham
	Remove devm_kfree of memory where VLAN entry to RVU PF mapping info is saved. This will be freed anyway at driver exit. Having this could result in warning from devm_kfree() if the memory is not allocated due to errors in rvu_nix_block_init() before nix_setup_txvlan(). Fixes: 9a946def264d ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-22	net: bridge: move the switchdev object replay helpers to "push" mode	Vladimir Oltean
	Starting with commit 4f2673b3a2b6 ("net: bridge: add helper to replay port and host-joined mdb entries"), DSA has introduced some bridge helpers that replay switchdev events (FDB/MDB/VLAN additions and deletions) that can be lost by the switchdev drivers in a variety of circumstances: - an IP multicast group was host-joined on the bridge itself before any switchdev port joined the bridge, leading to the host MDB entries missing in the hardware database. - during the bridge creation process, the MAC address of the bridge was added to the FDB as an entry pointing towards the bridge device itself, but with no switchdev ports being part of the bridge yet, this local FDB entry would remain unknown to the switchdev hardware database. - a VLAN/FDB/MDB was added to a bridge port that is a LAG interface, before any switchdev port joined that LAG, leading to the hardware database missing those entries. - a switchdev port left a LAG that is a bridge port, while the LAG remained part of the bridge, and all FDB/MDB/VLAN entries remained installed in the hardware database of the switchdev port. Also, since commit 0d2cfbd41c4a ("net: bridge: ignore switchdev events for LAG ports which didn't request replay"), DSA introduced a method, based on a const void ctx, to ensure that two switchdev ports under the same LAG that is a bridge port do not see the same MDB/VLAN entry being replayed twice by the bridge, once for every bridge port that joins the LAG. With so many ordering corner cases being possible, it seems unreasonable to expect a switchdev driver writer to get it right from the first try. Therefore, now that DSA has experimented with the bridge replay helpers for a little bit, we can move the code to the bridge driver where it is more readily available to all switchdev drivers. To convert the switchdev object replay helpers from "pull mode" (where the driver asks for them) to a "push mode" (where the bridge offers them automatically), the biggest problem is that the bridge needs to be aware when a switchdev port joins and leaves, even when the switchdev is only indirectly a bridge port (for example when the bridge port is a LAG upper of the switchdev). Luckily, we already have a hook for that, in the form of the newly introduced switchdev_bridge_port_offload() and switchdev_bridge_port_unoffload() calls. These offer a natural place for hooking the object addition and deletion replays. Extend the above 2 functions with: - pointers to the switchdev atomic notifier (for FDB replays) and the blocking notifier (for MDB and VLAN replays). - the "const void ctx" argument required for drivers to be able to disambiguate between which port is targeted, when multiple ports are lowers of the same LAG that is a bridge port. Most of the drivers pass NULL to this argument, except the ones that support LAG offload and have the proper context check already in place in the switchdev blocking notifier handler. Also unexport the replay helpers, since nobody except the bridge calls them directly now. Note that: (a) we abuse the terminology slightly, because FDB entries are not "switchdev objects", but we count them as objects nonetheless. With no direct way to prove it, I think they are not modeled as switchdev objects because those can only be installed by the bridge to the hardware (as opposed to FDB entries which can be propagated in the other direction too). This is merely an abuse of terms, FDB entries are replayed too, despite not being objects. (b) the bridge does not attempt to sync port attributes to newly joined ports, just the countable stuff (the objects). The reason for this is simple: no universal and symmetric way to sync and unsync them is known. For example, VLAN filtering: what to do on unsync, disable or leave it enabled? Similarly, STP state, ageing timer, etc etc. What a switchdev port does when it becomes standalone again is not really up to the bridge's competence, and the driver should deal with it. On the other hand, replaying deletions of switchdev objects can be seen a matter of cleanup and therefore be treated by the bridge, hence this patch. We make the replay helpers opt-in for drivers, because they might not bring immediate benefits for them: - nbp_vlan_init() is called _after_ netdev_master_upper_dev_link(), so br_vlan_replay() should not do anything for the new drivers on which we call it. The existing drivers where there was even a slight possibility for there to exist a VLAN on a bridge port before they join it are already guarded against this: mlxsw and prestera deny joining LAG interfaces that are members of a bridge. - br_fdb_replay() should now notify of local FDB entries, but I patched all drivers except DSA to ignore these new entries in commit 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications"). Driver authors can lift this restriction as they wish, and when they do, they can also opt into the FDB replay functionality. - br_mdb_replay() should fix a real issue which is described in commit 4f2673b3a2b6 ("net: bridge: add helper to replay port and host-joined mdb entries"). However most drivers do not offload the SWITCHDEV_OBJ_ID_HOST_MDB to see this issue: only cpsw and am65_cpsw offload this switchdev object, and I don't completely understand the way in which they offload this switchdev object anyway. So I'll leave it up to these drivers' respective maintainers to opt into br_mdb_replay(). So most of the drivers pass NULL notifier blocks for the replay helpers, except: - dpaa2-switch which was already acked/regression-tested with the helpers enabled (and there isn't much of a downside in having them) - ocelot which already had replay logic in "pull" mode - DSA which already had replay logic in "pull" mode An important observation is that the drivers which don't currently request bridge event replays don't even have the switchdev_bridge_port_{offload,unoffload} calls placed in proper places right now. This was done to avoid unnecessary rework for drivers which might never even add support for this. For driver writers who wish to add replay support, this can be used as a tentative placement guide: https://patchwork.kernel.org/project/netdevbpf/patch/20210720134655.892334-11-vladimir.oltean@nxp.com/ Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-22	net: bridge: switchdev: let drivers inform which bridge ports are offloaded	Vladimir Oltean
	On reception of an skb, the bridge checks if it was marked as 'already forwarded in hardware' (checks if skb->offload_fwd_mark == 1), and if it is, it assigns the source hardware domain of that skb based on the hardware domain of the ingress port. Then during forwarding, it enforces that the egress port must have a different hardware domain than the ingress one (this is done in nbp_switchdev_allowed_egress). Non-switchdev drivers don't report any physical switch id (neither through devlink nor .ndo_get_port_parent_id), therefore the bridge assigns them a hardware domain of 0, and packets coming from them will always have skb->offload_fwd_mark = 0. So there aren't any restrictions. Problems appear due to the fact that DSA would like to perform software fallback for bonding and team interfaces that the physical switch cannot offload. +-- br0 ---+ / / \| \ / / \| \ / \| \| bond0 / \| \| / \ swp0 swp1 swp2 swp3 swp4 There, it is desirable that the presence of swp3 and swp4 under a non-offloaded LAG does not preclude us from doing hardware bridging beteen swp0, swp1 and swp2. The bandwidth of the CPU is often times high enough that software bridging between {swp0,swp1,swp2} and bond0 is not impractical. But this creates an impossible paradox given the current way in which port hardware domains are assigned. When the driver receives a packet from swp0 (say, due to flooding), it must set skb->offload_fwd_mark to something. - If we set it to 0, then the bridge will forward it towards swp1, swp2 and bond0. But the switch has already forwarded it towards swp1 and swp2 (not to bond0, remember, that isn't offloaded, so as far as the switch is concerned, ports swp3 and swp4 are not looking up the FDB, and the entire bond0 is a destination that is strictly behind the CPU). But we don't want duplicated traffic towards swp1 and swp2, so it's not ok to set skb->offload_fwd_mark = 0. - If we set it to 1, then the bridge will not forward the skb towards the ports with the same switchdev mark, i.e. not to swp1, swp2 and bond0. Towards swp1 and swp2 that's ok, but towards bond0? It should have forwarded the skb there. So the real issue is that bond0 will be assigned the same hardware domain as {swp0,swp1,swp2}, because the function that assigns hardware domains to bridge ports, nbp_switchdev_add(), recurses through bond0's lower interfaces until it finds something that implements devlink (calls dev_get_port_parent_id with bool recurse = true). This is a problem because the fact that bond0 can be offloaded by swp3 and swp4 in our example is merely an assumption. A solution is to give the bridge explicit hints as to what hardware domain it should use for each port. Currently, the bridging offload is very 'silent': a driver registers a netdevice notifier, which is put on the netns's notifier chain, and which sniffs around for NETDEV_CHANGEUPPER events where the upper is a bridge, and the lower is an interface it knows about (one registered by this driver, normally). Then, from within that notifier, it does a bunch of stuff behind the bridge's back, without the bridge necessarily knowing that there's somebody offloading that port. It looks like this: ip link set swp0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v call_netdevice_notifiers \| v dsa_slave_netdevice_event \| v oh, hey! it's for me! \| v .port_bridge_join What we do to solve the conundrum is to be less silent, and change the switchdev drivers to present themselves to the bridge. Something like this: ip link set swp0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the \| \| hardware domain for v \| this port, and zero dsa_slave_netdevice_event \| if I got nothing. \| \| v \| oh, hey! it's for me! \| \| \| v \| .port_bridge_join \| \| \| +------------------------+ switchdev_bridge_port_offload(swp0, swp0) Then stacked interfaces (like bond0 on top of swp3/swp4) would be treated differently in DSA, depending on whether we can or cannot offload them. The offload case: ip link set bond0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge: Aye! I'll use this call_netdevice_notifiers ^ ppid as the \| \| switchdev mark for v \| bond0. dsa_slave_netdevice_event \| Coincidentally (or not), \| \| bond0 and swp0, swp1, swp2 v \| all have the same switchdev hmm, it's not quite for me, \| mark now, since the ASIC but my driver has already \| is able to forward towards called .port_lag_join \| all these ports in hw. for it, because I have \| a port with dp->lag_dev == bond0. \| \| \| v \| .port_bridge_join \| for swp3 and swp4 \| \| \| +------------------------+ switchdev_bridge_port_offload(bond0, swp3) switchdev_bridge_port_offload(bond0, swp4) And the non-offload case: ip link set bond0 master br0 \| v br_add_if() calls netdev_master_upper_dev_link() \| v bridge waiting: call_netdevice_notifiers ^ huh, switchdev_bridge_port_offload \| \| wasn't called, okay, I'll use a v \| hwdom of zero for this one. dsa_slave_netdevice_event : Then packets received on swp0 will \| : not be software-forwarded towards v : swp1, but they will towards bond0. it's not for me, but bond0 is an upper of swp3 and swp4, but their dp->lag_dev is NULL because they couldn't offload it. Basically we can draw the conclusion that the lowers of a bridge port can come and go, so depending on the configuration of lowers for a bridge port, it can dynamically toggle between offloaded and unoffloaded. Therefore, we need an equivalent switchdev_bridge_port_unoffload too. This patch changes the way any switchdev driver interacts with the bridge. From now on, everybody needs to call switchdev_bridge_port_offload and switchdev_bridge_port_unoffload, otherwise the bridge will treat the port as non-offloaded and allow software flooding to other ports from the same ASIC. Note that these functions lay the ground for a more complex handshake between switchdev drivers and the bridge in the future. For drivers that will request a replay of the switchdev objects when they offload and unoffload a bridge port (DSA, dpaa2-switch, ocelot), we place the call to switchdev_bridge_port_unoffload() strategically inside the NETDEV_PRECHANGEUPPER notifier's code path, and not inside NETDEV_CHANGEUPPER. This is because the switchdev object replay helpers need the netdev adjacency lists to be valid, and that is only true in NETDEV_PRECHANGEUPPER. Cc: Vadym Kochan <vkochan@marvell.com> Cc: Taras Chornyi <tchornyi@marvell.com> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Claudiu Manoil <claudiu.manoil@nxp.com> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch: regression Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com> # dpaa2-switch Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com> # ocelot-switch Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20	net: marvell: clean up trigraph warning on ??! string	Colin Ian King
	The character sequence ??! is a trigraph and causes the following clang warning: drivers/net/ethernet/marvell/mvneta.c:2604:39: warning: trigraph ignored [-Wtrigraphs] Clean this by replacing it with single ?. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20	net: mvpp2: deny disabling autoneg for 802.3z modes	Russell King (Oracle)
	The documentation for Armada 8040 says: Bit 2 Field InBandAnEn In-band Auto-Negotiation enable. ... When <PortType> = 1 (1000BASE-X) this field must be set to 1. We presently ignore whether userspace requests autonegotiation or not through the ethtool ksettings interface. However, we have some network interfaces that wish to do this. To offer a consistent API across network interfaces, deny the ability to disable autonegotiation on mvpp2 hardware when in 1000BASE-X and 2500BASE-X. This means the only way to switch between 2500BASE-X and 1000BASE-X on SFPs that support this will be: # ethtool -s ethX advertise 0x20000006000 # 1000BASE-X Pause AsymPause # ethtool -s ethX advertise 0xe000 # 2500BASE-X Pause AsymPause Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Marek Behún <kabel@kernel.org> Acked-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20	net: mvneta: deny disabling autoneg for 802.3z modes	Russell King (Oracle)
	The documentation for Armada 38x says: Bit 2 Field InBandAnEn In-band Auto-Negotiation enable. ... When <PortType> = 1 (1000BASE-X) this field must be set to 1. We presently ignore whether userspace requests autonegotiation or not through the ethtool ksettings interface. However, we have some network interfaces that wish to do this. To offer a consistent API across network interfaces, deny the ability to disable autonegotiation on mvneta hardware when in 1000BASE-X and 2500BASE-X. This means the only way to switch between 2500BASE-X and 1000BASE-X on SFPs that support this will be: # ethtool -s ethX advertise 0x20000002000 # 1000BASE-X Pause # ethtool -s ethX advertise 0xa000 # 2500BASE-X Pause Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-19	octeontx2-af: Introduce internal packet switching	Subbaraya Sundeep
	As of now any communication between CGXs PFs and their VFs within the system is possible only by external switches sending packets back to the system. This patch adds internal switching support. Broadcast packet replication is not covered here. RVU admin function (AF) maintains MAC addresses of all interfaces in the system. When switching is enabled, MCAM entries are allocated to install rules such that packets with DMAC matching any of the internal interface MAC addresses is punted back into the system via the loopback channel. On the receive side the default unicast rules are modified to not check for ingress channel. So any packet with matching DMAC irrespective of which interface it is coming from will be forwarded to the respective PF/VF interface. The transmit side rules and default unicast rules are updated if user changes MAC address of an interface. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-19	octeontx2-af: Prepare for allocating MCAM rules for AF	Subbaraya Sundeep
	AF till now only manages the allocation and freeing of MCAM rules for other PF/VFs in system. To implement L2 switching between all CGX mapped PF and VFs, AF requires MCAM entries for DMAC rules for each PF and VF. This patch modifies AF driver such that AF can also allocate MCAM rules and install rules for other PFs and VFs. All the checks like channel verification for RX rules and PF_FUNC verification for TX rules are relaxed in case AF is allocating or installing rules. Also all the entry and counter to owner mappings are set to NPC_MCAM_INVALID_MAP when they are free indicating those are not allocated to AF nor PF/VFs. This patch also ensures that AF allocated and installed entries are displayed in debugfs. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-19	octeontx2-af: Enable transmit side LBK link	Subbaraya Sundeep
	For enabling VF-VF switching the packets egressing out of CGX mapped VFs needed to be sent to LBK so that same packets are received back to the system. But the LBK link also needs to be enabled in addition to a VF's mapped CGX_LMAC link otherwise hardware raises send error interrupt indicating selected LBK link is not enabled in NIX_AF_TL3_TL2X_LINKX_CFG register. Hence this patch enables all LBK links in TL3_TL2_LINKX_CFG registers. Also to enable packet flow between PFs/VFs of NIX0 to PFs/VFs of NIX1(in 98xx silicon) the NPC TX DMAC rules has to be installed such that rules must be hit for any TX interface i.e., NIX0-TX or NIX1-TX provided DMAC match creteria is met. Hence this patch changes the behavior such that MCAM is programmed to match with any NIX0/1-TX interface for TX rules. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-14	Merge tag 'net-5.14-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski. "Including fixes from bpf and netfilter. Current release - regressions: - sock: fix parameter order in sock_setsockopt() Current release - new code bugs: - netfilter: nft_last: - fix incorrect arithmetic when restoring last used - honor NFTA_LAST_SET on restoration Previous releases - regressions: - udp: properly flush normal packet at GRO time - sfc: ensure correct number of XDP queues; don't allow enabling the feature if there isn't sufficient resources to Tx from any CPU - dsa: sja1105: fix address learning getting disabled on the CPU port - mptcp: addresses a rmem accounting issue that could keep packets in subflow receive buffers longer than necessary, delaying MPTCP-level ACKs - ip_tunnel: fix mtu calculation for ETHER tunnel devices - do not reuse skbs allocated from skbuff_fclone_cache in the napi skb cache, we'd try to return them to the wrong slab cache - tcp: consistently disable header prediction for mptcp Previous releases - always broken: - bpf: fix subprog poke descriptor tracking use-after-free - ipv6: - allocate enough headroom in ip6_finish_output2() in case iptables TEE is used - tcp: drop silly ICMPv6 packet too big messages to avoid expensive and pointless lookups (which may serve as a DDOS vector) - make sure fwmark is copied in SYNACK packets - fix 'disable_policy' for forwarded packets (align with IPv4) - netfilter: conntrack: - do not renew entry stuck in tcp SYN_SENT state - do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry - mptcp: cleanly handle error conditions with MP_JOIN and syncookies - mptcp: fix double free when rejecting a join due to port mismatch - validate lwtstate->data before returning from skb_tunnel_info() - tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path - mt76: mt7921: continue to probe driver when fw already downloaded - bonding: fix multiple issues with offloading IPsec to (thru?) bond - stmmac: ptp: fix issues around Qbv support and setting time back - bcmgenet: always clear wake-up based on energy detection Misc: - sctp: move 198 addresses from unusable to private scope - ptp: support virtual clocks and timestamping - openvswitch: optimize operation for key comparison" * tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (158 commits) net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave() sfc: add logs explaining XDP_TX/REDIRECT is not available sfc: ensure correct number of XDP queues sfc: fix lack of XDP TX queues - error XDP TX failed (-22) net: fddi: fix UAF in fza_probe net: dsa: sja1105: fix address learning getting disabled on the CPU port net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload net: Use nlmsg_unicast() instead of netlink_unicast() octeontx2-pf: Fix uninitialized boolean variable pps ipv6: allocate enough headroom in ip6_finish_output2() net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific net: bridge: multicast: fix MRD advertisement router port marking race net: bridge: multicast: fix PIM hello router port marking race net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340 dsa: fix for_each_child.cocci warnings virtio_net: check virtqueue_add_sgs() return value mptcp: properly account bulk freed memory selftests: mptcp: fix case multiple subflows limited by server mptcp: avoid processing packet if a subflow reset mptcp: fix syncookie process if mptcp can not_accept new subflow ...
2021-07-12	octeontx2-pf: Fix uninitialized boolean variable pps	Colin Ian King
	In the case where act->id is FLOW_ACTION_POLICE and also act->police.rate_bytes_ps > 0 or act->police.rate_pkt_ps is not > 0 the boolean variable pps contains an uninitialized value when function otx2_tc_act_set_police is called. Fix this by initializing pps to false. Addresses-Coverity: ("Uninitialized scalar variable)" Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-06	octeontx2-pf: Fix assigned error return value that is never used	Colin Ian King
	Currently when the call to otx2_mbox_alloc_msg_cgx_mac_addr_update fails the error return variable rc is being assigned -ENOMEM and does not return early. rc is then re-assigned and the error case is not handled correctly. Fix this by returning -ENOMEM rather than assigning rc. Addresses-Coverity: ("Unused value") Fixes: 79d2be385e9e ("octeontx2-pf: offload DMAC filters to CGX/RPM block") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-03	net: marvell: always set skb_shared_info in mvneta_swbm_add_rx_fragment	Lorenzo Bianconi
	Always set skb_shared_info data structure in mvneta_swbm_add_rx_fragment routine even if the fragment contains only the ethernet FCS. Fixes: 039fbc47f9f1 ("net: mvneta: alloc skb_shared_info on the mvneta_rx_swbm stack") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-03	Merge tag 'trace-v5.14' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - Added option for per CPU threads to the hwlat tracer - Have hwlat tracer handle hotplug CPUs - New tracer: osnoise, that detects latency caused by interrupts, softirqs and scheduling of other tasks. - Added timerlat tracer that creates a thread and measures in detail what sources of latency it has for wake ups. - Removed the "success" field of the sched_wakeup trace event. This has been hardcoded as "1" since 2015, no tooling should be looking at it now. If one exists, we can revert this commit, fix that tool and try to remove it again in the future. - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids. - New boot command line option "tp_printk_stop", as tp_printk causes trace events to write to console. When user space starts, this can easily live lock the system. Having a boot option to stop just after boot up is useful to prevent that from happening. - Have ftrace_dump_on_oops boot command line option take numbers that match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops. - Bootconfig clean ups, fixes and enhancements. - New ktest script that tests bootconfig options. - Add tracepoint_probe_register_may_exist() to register a tracepoint without triggering a WARN() if it already exists. BPF has a path from user space that can do this. All other paths are considered a bug. - Small clean ups and fixes tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits) tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT tracing: Simplify & fix saved_tgids logic treewide: Add missing semicolons to __assign_str uses tracing: Change variable type as bool for clean-up trace/timerlat: Fix indentation on timerlat_main() trace/osnoise: Make 'noise' variable s64 in run_osnoise() tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing tracing: Fix spelling in osnoise tracer "interferences" -> "interference" Documentation: Fix a typo on trace/osnoise-tracer trace/osnoise: Fix return value on osnoise_init_hotplug_support trace/osnoise: Make interval u64 on osnoise_main trace/osnoise: Fix 'no previous prototype' warnings tracing: Have osnoise_main() add a quiescent state for task rcu seq_buf: Make trace_seq_putmem_hex() support data longer than 8 seq_buf: Fix overflow in seq_buf_putmem_hex() trace/osnoise: Support hotplug operations trace/hwlat: Support hotplug operations trace/hwlat: Protect kdata->kthread with get/put_online_cpus trace: Add timerlat tracer trace: Add osnoise tracer ...
2021-07-01	octeontx2-pf: offload DMAC filters to CGX/RPM block	Hariprasad Kelam
	DMAC filtering can be achieved by either NPC MCAM rules or CGX/RPM MAC filters. Currently we are achieving this by NPC MCAM rules. This patch offloads DMAC filters to CGX/RPM MAC filters instead of NPC MCAM rules. Offloading DMAC filter to CGX/RPM block helps in reducing traffic to NPC block and save MCAM rules Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01	octeontx2-af: Debugfs support for DMAC filters	Hariprasad Kelam
	Add debugfs support to display CGX/RPM DMAC filter table associated with pf. cat /sys/kernel/debug/octeontx2/cgx/cgx0/lmac0/mac_filter PCI dev RVUPF BROADCAST MULTICAST FILTER-MODE 0002:02:00.0 PF2 ACCEPT ACCEPT UNICAST DMAC-INDEX ADDRESS 0 00:0f:b7:06:17:06 1 1a:1b:1c:1d:1e:01 2 1a:1b:1c:1d:1e:02 Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01	octeontx2-af: DMAC filter support in MAC block	Sunil Kumar Kori
	MAC block supports 32 dmac filters which are logically divided among all attached LMACS. For example MAC block0 having one LMAC then maximum supported filters are 32 where as MAC block1 having 4 enabled LMACS them maximum supported filteres are 8 for each LMAC. This patch adds mbox handlers to add/delete/update mac entry in DMAC filter table. Signed-off-by: Sunil Kumar Kori <skori@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01	octeontx2-pf: cn10k: Use runtime allocated LMTLINE region	Geetha sowjanya
	The current driver uses static LMTST region allocated by firmware. This memory gets populated as PF/VF BAR2. RVU PF/VF driver ioremap the memory as device memory for NIX/NPA operation. Since the memory is mapped as device memory we see performance degration. To address this issue this patch implements runtime memory allocation. RVU PF/VF allocates memory during device probe and share the base address with RVU AF. RVU AF then configure the LMT MAP table accordingly. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01	octeontx2-af: cn10k: Support configurable LMTST regions	Geetha sowjanya
	This patch extends the lmtst_tbl_setup_req mbox to support run time LMTST configuration. RVU PF/VF and DPDK/ODP allocates a LMT region, creates a translation entry for a device via VFIO IOCTLs. This IOVA is shared with AF through above mbox. AF then uses RVU_SMMU transulation Widget and gets PA for the IOVA and updates the LMTtable entry for that device. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01	octeontx2-af: cn10k: Setting up lmtst map table	Harman Kalra
	Introducing a new mailbox to support updating lmt entries and common lmt base address scheme i.e. multiple pcifuncs can share lmt region to reduce L1 cache pressure for application. Parameters passed to mailbox includes the primary pcifunc value whose lmt regions will be shared by other secondary pcifuncs. Here secondary pcifunc will be the one who is calling the mailbox. For example: By default each pcifunc has its own LMT base address: PCIFUNC1 LMT_BASE_ADDR A PCIFUNC2 LMT_BASE_ADDR B PCIFUNC3 LMT_BASE_ADDR C PCIFUNC4 LMT_BASE_ADDR D Application will choose PCIFUNC1 as base/primary pcifunc and as and when other pcifunc(secondary pcifuncs) gets probed, this mailbox will be called and LMTST table will be updated as: PCIFUNC1 LMT_BASE_ADDR A PCIFUNC2 LMT_BASE_ADDR A PCIFUNC3 LMT_BASE_ADDR A PCIFUNC4 LMT_BASE_ADDR A On FLR lmtst map table gets resetted to the default lmt base addresses for all secondary pcifuncs. Signed-off-by: Harman Kalra <hkalra@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-30	treewide: Add missing semicolons to __assign_str uses	Joe Perches
	The __assign_str macro has an unusual ending semicolon but the vast majority of uses of the macro already have semicolon termination. $ git grep -P '\b__assign_str\b' \| wc -l 551 $ git grep -P '\b__assign_str\b.*;' \| wc -l 480 Add semicolons to the __assign_str() uses without semicolon termination and all the other uses without semicolon termination via additional defines that are equivalent to __assign_str() with the eventual goal of removing the semicolon from the __assign_str() macro definition. Link: https://lore.kernel.org/lkml/1e068d21106bb6db05b735b4916bb420e6c9842a.camel@perches.com/ Link: https://lkml.kernel.org/r/48a056adabd8f70444475352f617914cef504a45.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-28	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28	net: switchdev: add a context void pointer to struct switchdev_notifier_info	Vladimir Oltean
	In the case where the driver asks for a replay of a certain type of event (port object or attribute) for a bridge port that is a LAG, it may do so because this port has just joined the LAG. But there might already be other switchdev ports in that LAG, and it is preferable that those preexisting switchdev ports do not act upon the replayed event. The solution is to add a context to switchdev events, which is NULL most of the time (when the bridge layer initiates the call) but which can be set to a value controlled by the switchdev driver when a replay is requested. The driver can then check the context to figure out if all ports within the LAG should act upon the switchdev event, or just the ones that match the context. We have to modify all switchdev_handle_* helper functions as well as the prototypes in the drivers that use these helpers too, because these helpers hide the underlying struct switchdev_notifier_info from us and there is no way to retrieve the context otherwise. The context structure will be populated and used in later patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25	net: mdiobus: withdraw fwnode_mdbiobus_register	Marcin Wojtas
	The newly implemented fwnode_mdbiobus_register turned out to be problematic - in case the fwnode_/of_/acpi_mdio are built as modules, a dependency cycle can be observed during the depmod phase of modules_install, eg.: depmod: ERROR: Cycle detected: fwnode_mdio -> of_mdio -> fwnode_mdio depmod: ERROR: Found 2 modules in dependency cycles! OR: depmod: ERROR: Cycle detected: acpi_mdio -> fwnode_mdio -> acpi_mdio depmod: ERROR: Found 2 modules in dependency cycles! A possible solution could be to rework fwnode_mdiobus_register, so that to merge the contents of acpi_mdiobus_register and of_mdiobus_register. However feasible, such change would be very intrusive and affect huge amount of the of_mdiobus_register users. Since there are currently 2 users of ACPI and MDIO (xgmac_mdio and mvmdio), withdraw the fwnode_mdbiobus_register and roll back to a simple 'if' condition in affected drivers. Fixes: 62a6ef6a996f ("net: mdiobus: Introduce fwnode_mdbiobus_register()") Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24	marvell: Remove rcu_read_lock() around XDP program invocation	Toke Høiland-Jørgensen
	The mvneta and mvpp2 drivers have rcu_read_lock()/rcu_read_unlock() pairs around XDP program invocations. However, the actual lifetime of the objects referred by the XDP program invocation is longer, all the way through to the call to xdp_do_flush(), making the scope of the rcu_read_lock() too small. This turns out to be harmless because it all happens in a single NAPI poll cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock() misleading. Rather than extend the scope of the rcu_read_lock(), just get rid of it entirely. With the addition of RCU annotations to the XDP_REDIRECT map types that take bh execution into account, lockdep even understands this to be safe, so there's really no reason to keep it around. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/bpf/20210624160609.292325-13-toke@redhat.com
2021-06-22	net: marvell: return csum computation result from mvneta_rx_csum/mvpp2_rx_csum	Lorenzo Bianconi
	This is a preliminary patch to add hw csum hint support to mvneta/mvpp2 xdp implementation Tested-by: Matteo Croce <mcroce@linux.microsoft.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22	octeontx2-af: Avoid field-overflowing memcpy()	Kees Cook
	In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. To avoid having memcpy() think a u64 "prof" is being written beyond, adjust the prof member type by adding struct nix_bandprof_s to the union to match the other structs. This silences the following future warning: In file included from ./include/linux/string.h:253, from ./include/linux/bitmap.h:10, from ./include/linux/cpumask.h:12, from ./arch/x86/include/asm/cpumask.h:5, from ./arch/x86/include/asm/msr.h:11, from ./arch/x86/include/asm/processor.h:22, from ./arch/x86/include/asm/timex.h:5, from ./include/linux/timex.h:65, from ./include/linux/time32.h:13, from ./include/linux/time.h:60, from ./include/linux/stat.h:19, from ./include/linux/module.h:13, from drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c:11: In function '__fortify_memcpy_chk', inlined from '__fortify_memcpy' at ./include/linux/fortify-string.h:310:2, inlined from 'rvu_nix_blk_aq_enq_inst' at drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c:910:5: ./include/linux/fortify-string.h:268:4: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); please use struct_group() [-Wattribute-warning] 268 \| __write_overflow_field(); \| ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/marvell/octeontx2/af/rvu_nix.c: ... else if (req->ctype == NIX_AQ_CTYPE_BANDPROF) memcpy(&rsp->prof, ctx, sizeof(struct nix_bandprof_s)); ... Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Subbaraya Sundeep<sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22	net: mvpp2: remove unused 'has_phy' field	Marcin Wojtas
	The 'has_phy' field from struct mvpp2_port is no longer used. Remove it. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-22	net: mvpp2: enable using phylink with ACPI	Marcin Wojtas
	Now that the MDIO and phylink are supported in the ACPI world, enable to use them in the mvpp2 driver. Ensure a backward compatibility with the firmware whose ACPI description does not contain the necessary elements for the proper phy handling and fall back to relying on the link interrupts instead. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>