summaryrefslogtreecommitdiff
path: root/Documentation/networking
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/networking')
-rw-r--r--Documentation/networking/00-INDEX2
-rw-r--r--Documentation/networking/LICENSE.qlcnic51
-rw-r--r--Documentation/networking/batman-adv.txt15
-rw-r--r--Documentation/networking/bonding.txt17
-rw-r--r--Documentation/networking/ieee802154.txt27
-rw-r--r--Documentation/networking/ifenslave.c2
-rw-r--r--Documentation/networking/ip-sysctl.txt42
-rw-r--r--Documentation/networking/ipvs-sysctl.txt62
-rw-r--r--Documentation/networking/mac80211-injection.txt4
-rw-r--r--Documentation/networking/netdevices.txt4
-rw-r--r--Documentation/networking/openvswitch.txt195
-rw-r--r--Documentation/networking/packet_mmap.txt2
-rw-r--r--Documentation/networking/scaling.txt10
-rw-r--r--Documentation/networking/stmmac.txt60
-rw-r--r--Documentation/networking/team.txt2
15 files changed, 404 insertions, 91 deletions
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index bbce1215434..9ad9ddeb384 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -144,6 +144,8 @@ nfc.txt
- The Linux Near Field Communication (NFS) subsystem.
olympic.txt
- IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info.
+openvswitch.txt
+ - Open vSwitch developer documentation.
operstates.txt
- Overview of network interface operational states.
packet_mmap.txt
diff --git a/Documentation/networking/LICENSE.qlcnic b/Documentation/networking/LICENSE.qlcnic
index 29ad4b10642..e7fb2c6023b 100644
--- a/Documentation/networking/LICENSE.qlcnic
+++ b/Documentation/networking/LICENSE.qlcnic
@@ -1,61 +1,22 @@
-Copyright (c) 2009-2010 QLogic Corporation
+Copyright (c) 2009-2011 QLogic Corporation
QLogic Linux qlcnic NIC Driver
-This program includes a device driver for Linux 2.6 that may be
-distributed with QLogic hardware specific firmware binary file.
You may modify and redistribute the device driver code under the
GNU General Public License (a copy of which is attached hereto as
Exhibit A) published by the Free Software Foundation (version 2).
-You may redistribute the hardware specific firmware binary file
-under the following terms:
-
- 1. Redistribution of source code (only if applicable),
- must retain the above copyright notice, this list of
- conditions and the following disclaimer.
-
- 2. Redistribution in binary form must reproduce the above
- copyright notice, this list of conditions and the
- following disclaimer in the documentation and/or other
- materials provided with the distribution.
-
- 3. The name of QLogic Corporation may not be used to
- endorse or promote products derived from this software
- without specific prior written permission
-
-REGARDLESS OF WHAT LICENSING MECHANISM IS USED OR APPLICABLE,
-THIS PROGRAM IS PROVIDED BY QLOGIC CORPORATION "AS IS'' AND ANY
-EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
-PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR
-BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
-EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
-TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
-ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
-OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
-OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
-POSSIBILITY OF SUCH DAMAGE.
-
-USER ACKNOWLEDGES AND AGREES THAT USE OF THIS PROGRAM WILL NOT
-CREATE OR GIVE GROUNDS FOR A LICENSE BY IMPLICATION, ESTOPPEL, OR
-OTHERWISE IN ANY INTELLECTUAL PROPERTY RIGHTS (PATENT, COPYRIGHT,
-TRADE SECRET, MASK WORK, OR OTHER PROPRIETARY RIGHT) EMBODIED IN
-ANY OTHER QLOGIC HARDWARE OR SOFTWARE EITHER SOLELY OR IN
-COMBINATION WITH THIS PROGRAM.
-
EXHIBIT A
- GNU GENERAL PUBLIC LICENSE
- Version 2, June 1991
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
- Preamble
+ Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
@@ -105,7 +66,7 @@ patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
- GNU GENERAL PUBLIC LICENSE
+ GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
@@ -304,7 +265,7 @@ make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
- NO WARRANTY
+ NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
diff --git a/Documentation/networking/batman-adv.txt b/Documentation/networking/batman-adv.txt
index 88d4afbdef9..221ad0cdf11 100644
--- a/Documentation/networking/batman-adv.txt
+++ b/Documentation/networking/batman-adv.txt
@@ -1,4 +1,4 @@
-[state: 17-04-2011]
+[state: 21-08-2011]
BATMAN-ADV
----------
@@ -68,9 +68,9 @@ All mesh wide settings can be found in batman's own interface
folder:
# ls /sys/class/net/bat0/mesh/
-# aggregated_ogms gw_bandwidth hop_penalty
-# bonding gw_mode orig_interval
-# fragmentation gw_sel_class vis_mode
+# aggregated_ogms fragmentation gw_sel_class vis_mode
+# ap_isolation gw_bandwidth hop_penalty
+# bonding gw_mode orig_interval
There is a special folder for debugging information:
@@ -200,15 +200,16 @@ abled during run time. Following log_levels are defined:
0 - All debug output disabled
1 - Enable messages related to routing / flooding / broadcasting
-2 - Enable route or tt entry added / changed / deleted
-3 - Enable all messages
+2 - Enable messages related to route added / changed / deleted
+4 - Enable messages related to translation table operations
+7 - Enable all messages
The debug output can be changed at runtime using the file
/sys/class/net/bat0/mesh/log_level. e.g.
# echo 2 > /sys/class/net/bat0/mesh/log_level
-will enable debug messages for when routes or TTs change.
+will enable debug messages for when routes change.
BATCTL
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 91df678fb7f..080ad26690a 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -196,6 +196,23 @@ or, for backwards compatibility, the option value. E.g.,
The parameters are as follows:
+active_slave
+
+ Specifies the new active slave for modes that support it
+ (active-backup, balance-alb and balance-tlb). Possible values
+ are the name of any currently enslaved interface, or an empty
+ string. If a name is given, the slave and its link must be up in order
+ to be selected as the new active slave. If an empty string is
+ specified, the current active slave is cleared, and a new active
+ slave is selected automatically.
+
+ Note that this is only available through the sysfs interface. No module
+ parameter by this name exists.
+
+ The normal value of this option is the name of the currently
+ active slave, or the empty string if there is no active slave or
+ the current mode does not use an active slave.
+
ad_select
Specifies the 802.3ad aggregation selection logic to use. The
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
index f41ea240522..1dc1c24a754 100644
--- a/Documentation/networking/ieee802154.txt
+++ b/Documentation/networking/ieee802154.txt
@@ -78,3 +78,30 @@ in software. This is currently WIP.
See header include/net/mac802154.h and several drivers in drivers/ieee802154/.
+6LoWPAN Linux implementation
+============================
+
+The IEEE 802.15.4 standard specifies an MTU of 128 bytes, yielding about 80
+octets of actual MAC payload once security is turned on, on a wireless link
+with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format
+[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
+taking into account limited bandwidth, memory, or energy resources that are
+expected in applications such as wireless Sensor Networks. [RFC4944] defines
+a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
+to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
+compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
+relatively large IPv6 and UDP headers down to (in the best case) several bytes.
+
+In Semptember 2011 the standard update was published - [RFC6282].
+It deprecates HC1 and HC2 compression and defines IPHC encoding format which is
+used in this Linux implementation.
+
+All the code related to 6lowpan you may find in files: net/ieee802154/6lowpan.*
+
+To setup 6lowpan interface you need (busybox release > 1.17.0):
+1. Add IEEE802.15.4 interface and initialize PANid;
+2. Add 6lowpan interface by command like:
+ # ip link add link wpan0 name lowpan0 type lowpan
+3. Set MAC (if needs):
+ # ip link set lowpan0 address de:ad:be:ef:ca:fe:ba:be
+4. Bring up 'lowpan0' interface
diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c
index 65968fbf1e4..ac5debb2f16 100644
--- a/Documentation/networking/ifenslave.c
+++ b/Documentation/networking/ifenslave.c
@@ -539,12 +539,14 @@ static int if_getconfig(char *ifname)
metric = 0;
} else
metric = ifr.ifr_metric;
+ printf("The result of SIOCGIFMETRIC is %d\n", metric);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFMTU, &ifr) < 0)
mtu = 0;
else
mtu = ifr.ifr_mtu;
+ printf("The result of SIOCGIFMTU is %d\n", mtu);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFDSTADDR, &ifr) < 0) {
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index ca5cdcd0f0e..ad3e80e17b4 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -20,7 +20,7 @@ ip_no_pmtu_disc - BOOLEAN
default FALSE
min_pmtu - INTEGER
- default 562 - minimum discovered Path MTU
+ default 552 - minimum discovered Path MTU
route/max_size - INTEGER
Maximum number of routes allowed in the kernel. Increase
@@ -31,6 +31,16 @@ neigh/default/gc_thresh3 - INTEGER
when using large numbers of interfaces and when communicating
with large numbers of directly-connected peers.
+neigh/default/unres_qlen_bytes - INTEGER
+ The maximum number of bytes which may be used by packets
+ queued for each unresolved address by other network layers.
+ (added in linux 3.3)
+
+neigh/default/unres_qlen - INTEGER
+ The maximum number of packets which may be queued for each
+ unresolved address by other network layers.
+ (deprecated in linux 3.3) : use unres_qlen_bytes instead.
+
mtu_expires - INTEGER
Time, in seconds, that cached PMTU information is kept.
@@ -165,6 +175,9 @@ tcp_congestion_control - STRING
connections. The algorithm "reno" is always available, but
additional choices may be available based on kernel configuration.
Default is set as part of kernel configuration.
+ For passive connections, the listener congestion control choice
+ is inherited.
+ [see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]
tcp_cookie_size - INTEGER
Default size of TCP Cookie Transactions (TCPCT) option, that may be
@@ -282,11 +295,11 @@ tcp_max_ssthresh - INTEGER
Default: 0 (off)
tcp_max_syn_backlog - INTEGER
- Maximal number of remembered connection requests, which are
- still did not receive an acknowledgment from connecting client.
- Default value is 1024 for systems with more than 128Mb of memory,
- and 128 for low memory machines. If server suffers of overload,
- try to increase this number.
+ Maximal number of remembered connection requests, which have not
+ received an acknowledgment from connecting client.
+ The minimal value is 128 for low memory machines, and it will
+ increase in proportion to the memory of machine.
+ If server suffers from overload, try increasing this number.
tcp_max_tw_buckets - INTEGER
Maximal number of timewait sockets held by system simultaneously.
@@ -1045,6 +1058,11 @@ conf/interface/*:
accept_ra - INTEGER
Accept Router Advertisements; autoconfigure using them.
+ It also determines whether or not to transmit Router
+ Solicitations. If and only if the functional setting is to
+ accept Router Advertisements, Router Solicitations will be
+ transmitted.
+
Possible values are:
0 Do not accept Router Advertisements.
1 Accept Router Advertisements if forwarding is disabled.
@@ -1115,14 +1133,14 @@ forwarding - INTEGER
Possible values are:
0 Forwarding disabled
1 Forwarding enabled
- 2 Forwarding enabled (Hybrid Mode)
FALSE (0):
By default, Host behaviour is assumed. This means:
1. IsRouter flag is not set in Neighbour Advertisements.
- 2. Router Solicitations are being sent when necessary.
+ 2. If accept_ra is TRUE (default), transmit Router
+ Solicitations.
3. If accept_ra is TRUE (default), accept Router
Advertisements (and do autoconfiguration).
4. If accept_redirects is TRUE (default), accept Redirects.
@@ -1133,16 +1151,10 @@ forwarding - INTEGER
This means exactly the reverse from the above:
1. IsRouter flag is set in Neighbour Advertisements.
- 2. Router Solicitations are not sent.
+ 2. Router Solicitations are not sent unless accept_ra is 2.
3. Router Advertisements are ignored unless accept_ra is 2.
4. Redirects are ignored.
- TRUE (2):
-
- Hybrid mode. Same behaviour as TRUE, except for:
-
- 2. Router Solicitations are being sent when necessary.
-
Default: 0 (disabled) if global forwarding is disabled (default),
otherwise 1 (enabled).
diff --git a/Documentation/networking/ipvs-sysctl.txt b/Documentation/networking/ipvs-sysctl.txt
index 4ccdbca0381..f2a2488f1bf 100644
--- a/Documentation/networking/ipvs-sysctl.txt
+++ b/Documentation/networking/ipvs-sysctl.txt
@@ -15,6 +15,23 @@ amemthresh - INTEGER
enabled and the variable is automatically set to 2, otherwise
the strategy is disabled and the variable is set to 1.
+conntrack - BOOLEAN
+ 0 - disabled (default)
+ not 0 - enabled
+
+ If set, maintain connection tracking entries for
+ connections handled by IPVS.
+
+ This should be enabled if connections handled by IPVS are to be
+ also handled by stateful firewall rules. That is, iptables rules
+ that make use of connection tracking. It is a performance
+ optimisation to disable this setting otherwise.
+
+ Connections handled by the IPVS FTP application module
+ will have connection tracking entries regardless of this setting.
+
+ Only available when IPVS is compiled with CONFIG_IP_VS_NFCT enabled.
+
cache_bypass - BOOLEAN
0 - disabled (default)
not 0 - enabled
@@ -39,7 +56,7 @@ debug_level - INTEGER
11 - IPVS packet handling (ip_vs_in/ip_vs_out)
12 or more - packet traversal
- Only available when IPVS is compiled with the CONFIG_IPVS_DEBUG
+ Only available when IPVS is compiled with CONFIG_IP_VS_DEBUG enabled.
Higher debugging levels include the messages for lower debugging
levels, so setting debug level 2, includes level 0, 1 and 2
@@ -123,13 +140,11 @@ nat_icmp_send - BOOLEAN
secure_tcp - INTEGER
0 - disabled (default)
- The secure_tcp defense is to use a more complicated state
- transition table and some possible short timeouts of each
- state. In the VS/NAT, it delays the entering the ESTABLISHED
- until the real server starts to send data and ACK packet
- (after 3-way handshake).
+ The secure_tcp defense is to use a more complicated TCP state
+ transition table. For VS/NAT, it also delays entering the
+ TCP ESTABLISHED state until the three way handshake is completed.
- The value definition is the same as that of drop_entry or
+ The value definition is the same as that of drop_entry and
drop_packet.
sync_threshold - INTEGER
@@ -141,3 +156,36 @@ sync_threshold - INTEGER
synchronized, every time the number of its incoming packets
modulus 50 equals the threshold. The range of the threshold is
from 0 to 49.
+
+snat_reroute - BOOLEAN
+ 0 - disabled
+ not 0 - enabled (default)
+
+ If enabled, recalculate the route of SNATed packets from
+ realservers so that they are routed as if they originate from the
+ director. Otherwise they are routed as if they are forwarded by the
+ director.
+
+ If policy routing is in effect then it is possible that the route
+ of a packet originating from a director is routed differently to a
+ packet being forwarded by the director.
+
+ If policy routing is not in effect then the recalculated route will
+ always be the same as the original route so it is an optimisation
+ to disable snat_reroute and avoid the recalculation.
+
+sync_version - INTEGER
+ default 1
+
+ The version of the synchronisation protocol used when sending
+ synchronisation messages.
+
+ 0 selects the original synchronisation protocol (version 0). This
+ should be used when sending synchronisation messages to a legacy
+ system that only understands the original synchronisation protocol.
+
+ 1 selects the current synchronisation protocol (version 1). This
+ should be used where possible.
+
+ Kernels with this sync_version entry are able to receive messages
+ of both version 1 and version 2 of the synchronisation protocol.
diff --git a/Documentation/networking/mac80211-injection.txt b/Documentation/networking/mac80211-injection.txt
index b30e81ad530..3a930072b16 100644
--- a/Documentation/networking/mac80211-injection.txt
+++ b/Documentation/networking/mac80211-injection.txt
@@ -23,6 +23,10 @@ radiotap headers and used to control injection:
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
current fragmentation threshold.
+ * IEEE80211_RADIOTAP_TX_FLAGS
+
+ IEEE80211_RADIOTAP_F_TX_NOACK: frame should be sent without waiting for
+ an ACK even if it is a unicast frame
The injection code can also skip all other currently defined radiotap fields
facilitating replay of captured radiotap headers directly.
diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt
index 87b3d15f523..89358341682 100644
--- a/Documentation/networking/netdevices.txt
+++ b/Documentation/networking/netdevices.txt
@@ -73,7 +73,7 @@ dev->hard_start_xmit:
has to lock by itself when needed. It is recommended to use a try lock
for this and return NETDEV_TX_LOCKED when the spin lock fails.
The locking there should also properly protect against
- set_multicast_list. Note that the use of NETIF_F_LLTX is deprecated.
+ set_rx_mode. Note that the use of NETIF_F_LLTX is deprecated.
Don't use it for new drivers.
Context: Process with BHs disabled or BH (timer),
@@ -92,7 +92,7 @@ dev->tx_timeout:
Context: BHs disabled
Notes: netif_queue_stopped() is guaranteed true
-dev->set_multicast_list:
+dev->set_rx_mode:
Synchronization: netif_tx_lock spinlock.
Context: BHs disabled
diff --git a/Documentation/networking/openvswitch.txt b/Documentation/networking/openvswitch.txt
new file mode 100644
index 00000000000..b8a048b8df3
--- /dev/null
+++ b/Documentation/networking/openvswitch.txt
@@ -0,0 +1,195 @@
+Open vSwitch datapath developer documentation
+=============================================
+
+The Open vSwitch kernel module allows flexible userspace control over
+flow-level packet processing on selected network devices. It can be
+used to implement a plain Ethernet switch, network device bonding,
+VLAN processing, network access control, flow-based network control,
+and so on.
+
+The kernel module implements multiple "datapaths" (analogous to
+bridges), each of which can have multiple "vports" (analogous to ports
+within a bridge). Each datapath also has associated with it a "flow
+table" that userspace populates with "flows" that map from keys based
+on packet headers and metadata to sets of actions. The most common
+action forwards the packet to another vport; other actions are also
+implemented.
+
+When a packet arrives on a vport, the kernel module processes it by
+extracting its flow key and looking it up in the flow table. If there
+is a matching flow, it executes the associated actions. If there is
+no match, it queues the packet to userspace for processing (as part of
+its processing, userspace will likely set up a flow to handle further
+packets of the same type entirely in-kernel).
+
+
+Flow key compatibility
+----------------------
+
+Network protocols evolve over time. New protocols become important
+and existing protocols lose their prominence. For the Open vSwitch
+kernel module to remain relevant, it must be possible for newer
+versions to parse additional protocols as part of the flow key. It
+might even be desirable, someday, to drop support for parsing
+protocols that have become obsolete. Therefore, the Netlink interface
+to Open vSwitch is designed to allow carefully written userspace
+applications to work with any version of the flow key, past or future.
+
+To support this forward and backward compatibility, whenever the
+kernel module passes a packet to userspace, it also passes along the
+flow key that it parsed from the packet. Userspace then extracts its
+own notion of a flow key from the packet and compares it against the
+kernel-provided version:
+
+ - If userspace's notion of the flow key for the packet matches the
+ kernel's, then nothing special is necessary.
+
+ - If the kernel's flow key includes more fields than the userspace
+ version of the flow key, for example if the kernel decoded IPv6
+ headers but userspace stopped at the Ethernet type (because it
+ does not understand IPv6), then again nothing special is
+ necessary. Userspace can still set up a flow in the usual way,
+ as long as it uses the kernel-provided flow key to do it.
+
+ - If the userspace flow key includes more fields than the
+ kernel's, for example if userspace decoded an IPv6 header but
+ the kernel stopped at the Ethernet type, then userspace can
+ forward the packet manually, without setting up a flow in the
+ kernel. This case is bad for performance because every packet
+ that the kernel considers part of the flow must go to userspace,
+ but the forwarding behavior is correct. (If userspace can
+ determine that the values of the extra fields would not affect
+ forwarding behavior, then it could set up a flow anyway.)
+
+How flow keys evolve over time is important to making this work, so
+the following sections go into detail.
+
+
+Flow key format
+---------------
+
+A flow key is passed over a Netlink socket as a sequence of Netlink
+attributes. Some attributes represent packet metadata, defined as any
+information about a packet that cannot be extracted from the packet
+itself, e.g. the vport on which the packet was received. Most
+attributes, however, are extracted from headers within the packet,
+e.g. source and destination addresses from Ethernet, IP, or TCP
+headers.
+
+The <linux/openvswitch.h> header file defines the exact format of the
+flow key attributes. For informal explanatory purposes here, we write
+them as comma-separated strings, with parentheses indicating arguments
+and nesting. For example, the following could represent a flow key
+corresponding to a TCP packet that arrived on vport 1:
+
+ in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
+ eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=17, tos=0,
+ frag=no), tcp(src=49163, dst=80)
+
+Often we ellipsize arguments not important to the discussion, e.g.:
+
+ in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
+
+
+Basic rule for evolving flow keys
+---------------------------------
+
+Some care is needed to really maintain forward and backward
+compatibility for applications that follow the rules listed under
+"Flow key compatibility" above.
+
+The basic rule is obvious:
+
+ ------------------------------------------------------------------
+ New network protocol support must only supplement existing flow
+ key attributes. It must not change the meaning of already defined
+ flow key attributes.
+ ------------------------------------------------------------------
+
+This rule does have less-obvious consequences so it is worth working
+through a few examples. Suppose, for example, that the kernel module
+did not already implement VLAN parsing. Instead, it just interpreted
+the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
+packet. The flow key for any packet with an 802.1Q header would look
+essentially like this, ignoring metadata:
+
+ eth(...), eth_type(0x8100)
+
+Naively, to add VLAN support, it makes sense to add a new "vlan" flow
+key attribute to contain the VLAN tag, then continue to decode the
+encapsulated headers beyond the VLAN tag using the existing field
+definitions. With this change, an TCP packet in VLAN 10 would have a
+flow key much like this:
+
+ eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
+
+But this change would negatively affect a userspace application that
+has not been updated to understand the new "vlan" flow key attribute.
+The application could, following the flow compatibility rules above,
+ignore the "vlan" attribute that it does not understand and therefore
+assume that the flow contained IP packets. This is a bad assumption
+(the flow only contains IP packets if one parses and skips over the
+802.1Q header) and it could cause the application's behavior to change
+across kernel versions even though it follows the compatibility rules.
+
+The solution is to use a set of nested attributes. This is, for
+example, why 802.1Q support uses nested attributes. A TCP packet in
+VLAN 10 is actually expressed as:
+
+ eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
+ ip(proto=6, ...), tcp(...)))
+
+Notice how the "eth_type", "ip", and "tcp" flow key attributes are
+nested inside the "encap" attribute. Thus, an application that does
+not understand the "vlan" key will not see either of those attributes
+and therefore will not misinterpret them. (Also, the outer eth_type
+is still 0x8100, not changed to 0x0800.)
+
+Handling malformed packets
+--------------------------
+
+Don't drop packets in the kernel for malformed protocol headers, bad
+checksums, etc. This would prevent userspace from implementing a
+simple Ethernet switch that forwards every packet.
+
+Instead, in such a case, include an attribute with "empty" content.
+It doesn't matter if the empty content could be valid protocol values,
+as long as those values are rarely seen in practice, because userspace
+can always forward all packets with those values to userspace and
+handle them individually.
+
+For example, consider a packet that contains an IP header that
+indicates protocol 6 for TCP, but which is truncated just after the IP
+header, so that the TCP header is missing. The flow key for this
+packet would include a tcp attribute with all-zero src and dst, like
+this:
+
+ eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
+
+As another example, consider a packet with an Ethernet type of 0x8100,
+indicating that a VLAN TCI should follow, but which is truncated just
+after the Ethernet type. The flow key for this packet would include
+an all-zero-bits vlan and an empty encap attribute, like this:
+
+ eth(...), eth_type(0x8100), vlan(0), encap()
+
+Unlike a TCP packet with source and destination ports 0, an
+all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
+VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
+attribute expressly to allow this situation to be distinguished.
+Thus, the flow key in this second example unambiguously indicates a
+missing or malformed VLAN TCI.
+
+Other rules
+-----------
+
+The other rules for flow keys are much less subtle:
+
+ - Duplicate attributes are not allowed at a given nesting level.
+
+ - Ordering of attributes is not significant.
+
+ - When the kernel sends a given flow key to userspace, it always
+ composes it the same way. This allows userspace to hash and
+ compare entire flow keys that it may not be able to fully
+ interpret.
diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt
index 4acea660372..1c08a4b0981 100644
--- a/Documentation/networking/packet_mmap.txt
+++ b/Documentation/networking/packet_mmap.txt
@@ -155,7 +155,7 @@ As capture, each frame contains two parts:
/* fill sockaddr_ll struct to prepare binding */
my_addr.sll_family = AF_PACKET;
- my_addr.sll_protocol = ETH_P_ALL;
+ my_addr.sll_protocol = htons(ETH_P_ALL);
my_addr.sll_ifindex = s_ifr.ifr_ifindex;
/* bind socket to eth0 */
diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
index fe67b5c79f0..579994afbe0 100644
--- a/Documentation/networking/scaling.txt
+++ b/Documentation/networking/scaling.txt
@@ -73,7 +73,7 @@ of queues to IRQs can be determined from /proc/interrupts. By default,
an IRQ may be handled on any CPU. Because a non-negligible part of packet
processing takes place in receive interrupt handling, it is advantageous
to spread receive interrupts between CPUs. To manually adjust the IRQ
-affinity of each interrupt see Documentation/IRQ-affinity. Some systems
+affinity of each interrupt see Documentation/IRQ-affinity.txt. Some systems
will be running irqbalance, a daemon that dynamically optimizes IRQ
assignments and as a result may override any manual settings.
@@ -208,7 +208,7 @@ The counter in rps_dev_flow_table values records the length of the current
CPU's backlog when a packet in this flow was last enqueued. Each backlog
queue has a head counter that is incremented on dequeue. A tail counter
is computed as head counter + queue length. In other words, the counter
-in rps_dev_flow_table[i] records the last element in flow i that has
+in rps_dev_flow[i] records the last element in flow i that has
been enqueued onto the currently designated CPU for flow i (of course,
entry i is actually selected by hash and multiple flows may hash to the
same entry i).
@@ -224,7 +224,7 @@ following is true:
- The current CPU's queue head counter >= the recorded tail counter
value in rps_dev_flow[i]
-- The current CPU is unset (equal to NR_CPUS)
+- The current CPU is unset (equal to RPS_NO_CPU)
- The current CPU is offline
After this check, the packet is sent to the (possibly updated) current
@@ -235,7 +235,7 @@ CPU.
==== RFS Configuration
-RFS is only available if the kconfig symbol CONFIG_RFS is enabled (on
+RFS is only available if the kconfig symbol CONFIG_RPS is enabled (on
by default for SMP). The functionality remains disabled until explicitly
configured. The number of entries in the global flow table is set through:
@@ -258,7 +258,7 @@ For a single queue device, the rps_flow_cnt value for the single queue
would normally be configured to the same value as rps_sock_flow_entries.
For a multi-queue device, the rps_flow_cnt for each queue might be
configured as rps_sock_flow_entries / N, where N is the number of
-queues. So for instance, if rps_flow_entries is set to 32768 and there
+queues. So for instance, if rps_sock_flow_entries is set to 32768 and there
are 16 configured receive queues, rps_flow_cnt for each queue might be
configured as 2048.
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index 57a24108b84..d0aeeadd264 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -4,14 +4,16 @@ Copyright (C) 2007-2010 STMicroelectronics Ltd
Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers
-(Synopsys IP blocks); it has been fully tested on STLinux platforms.
+(Synopsys IP blocks).
Currently this network device driver is for all STM embedded MAC/GMAC
-(i.e. 7xxx/5xxx SoCs) and it's known working on other platforms i.e. ARM SPEAr.
+(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XLINX XC2V3000
+FF1152AMT0221 D1215994A VIRTEX FPGA board.
-DWC Ether MAC 10/100/1000 Universal version 3.41a and DWC Ether MAC 10/100
-Universal version 4.0 have been used for developing the first code
-implementation.
+DWC Ether MAC 10/100/1000 Universal version 3.60a (and older) and DWC Ether MAC 10/100
+Universal version 4.0 have been used for developing this driver.
+
+This driver supports both the platform bus and PCI.
Please, for more information also visit: www.stlinux.com
@@ -76,7 +78,16 @@ core.
4.5) DMA descriptors
Driver handles both normal and enhanced descriptors. The latter has been only
-tested on DWC Ether MAC 10/100/1000 Universal version 3.41a.
+tested on DWC Ether MAC 10/100/1000 Universal version 3.41a and later.
+
+STMMAC supports DMA descriptor to operate both in dual buffer (RING)
+and linked-list(CHAINED) mode. In RING each descriptor points to two
+data buffer pointers whereas in CHAINED mode they point to only one data
+buffer pointer. RING mode is the default.
+
+In CHAINED mode each descriptor will have pointer to next descriptor in
+the list, hence creating the explicit chaining in the descriptor itself,
+whereas such explicit chaining is not possible in RING mode.
4.6) Ethtool support
Ethtool is supported. Driver statistics and internal errors can be taken using:
@@ -235,7 +246,38 @@ reset procedure etc).
o enh_desc.c: functions for handling enhanced descriptors
o norm_desc.c: functions for handling normal descriptors
-5) TODO:
+5) Debug Information
+
+The driver exports many information i.e. internal statistics,
+debug information, MAC and DMA registers etc.
+
+These can be read in several ways depending on the
+type of the information actually needed.
+
+For example a user can be use the ethtool support
+to get statistics: e.g. using: ethtool -S ethX
+(that shows the Management counters (MMC) if supported)
+or sees the MAC/DMA registers: e.g. using: ethtool -d ethX
+
+Compiling the Kernel with CONFIG_DEBUG_FS and enabling the
+STMMAC_DEBUG_FS option the driver will export the following
+debugfs entries:
+
+/sys/kernel/debug/stmmaceth/descriptors_status
+ To show the DMA TX/RX descriptor rings
+
+Developer can also use the "debug" module parameter to get
+further debug information.
+
+In the end, there are other macros (that cannot be enabled
+via menuconfig) to turn-on the RX/TX DMA debugging,
+specific MAC core debug printk etc. Others to enable the
+debug in the TX and RX processes.
+All these are only useful during the developing stage
+and should never enabled inside the code for general usage.
+In fact, these can generate an huge amount of debug messages.
+
+6) TODO:
o XGMAC is not supported.
- o Review the timer optimisation code to use an embedded device that will be
- available in new chip generations.
+ o Add the EEE - Energy Efficient Ethernet
+ o Add the PTP - precision time protocol
diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
new file mode 100644
index 00000000000..5a013686b9e
--- /dev/null
+++ b/Documentation/networking/team.txt
@@ -0,0 +1,2 @@
+Team devices are driven from userspace via libteam library which is here:
+ https://github.com/jpirko/libteam