.. SPDX-License-Identifier: GPL-2.0

=====================================================
Netdev features mess and how to get out from it alive
=====================================================

Author:
	Michał Mirosław <mirq-linux@rere.qmqm.pl>



Part I: Feature sets
====================

Long gone are the days when a network card would just take and give packets
verbatim.  Today's devices add multiple features and bugs (read: offloads)
that relieve an OS of various tasks like generating and checking checksums,
splitting packets, and classifying them.  Those capabilities and their state
are commonly referred to as netdev features in the Linux kernel world.

There are currently three sets of features relevant to the driver, and
one used internally by the network core:

 1. netdev->hw_features set contains features whose state may be changed
    (enabled or disabled) for a particular device at the user's request.
    This set should be initialized in the ndo_init callback and not
    changed later.

 2. netdev->features set contains features which are currently enabled
    for a device.  This should be changed only by the network core or in
    error paths of the ndo_set_features callback.

 3. netdev->vlan_features set contains features whose state is inherited
    by child VLAN devices (limits netdev->features set).  This is currently
    used for all VLAN devices whether tags are stripped or inserted in
    hardware or software.

 4. netdev->wanted_features set contains the feature set requested by the
    user.  This set is filtered by the ndo_fix_features callback whenever it
    or some device-specific conditions change.  This set is internal to
    the networking core and should not be referenced in drivers.



Part II: Controlling enabled features
=====================================

When the current feature set (netdev->features) is to be changed, a new set
is calculated and filtered by calling the ndo_fix_features callback
and netdev_fix_features().  If the resulting set differs from the current
set, it is passed to the ndo_set_features callback and (if the callback
returns success) replaces the value stored in netdev->features.
A NETDEV_FEAT_CHANGE notification is issued after that whenever the current
set might have changed.
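
The recalculation described above can be sketched as a small userspace model.
This is not kernel code: the struct, the callback signatures and every
``model_*`` name are hypothetical stand-ins, and the way the starting set is
built from ``features``, ``hw_features`` and ``wanted_features`` is an
assumption of this sketch.  The core's own netdev_fix_features() filtering is
only marked by a comment.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t features_t;            /* stand-in for the kernel's feature mask */

struct model_dev {
    features_t hw_features;             /* features the user may toggle */
    features_t features;                /* features currently enabled */
    features_t wanted_features;         /* features requested by the user */
    /* Optional callbacks; NULL models a missing callback. */
    features_t (*fix_features)(struct model_dev *dev, features_t f);
    int (*set_features)(struct model_dev *dev, features_t f);
};

/* Example callback that always fails, used to exercise the error path. */
static int model_set_fail(struct model_dev *dev, features_t f)
{
    (void)dev;
    (void)f;
    return -1;
}

/*
 * Returns 1 when the current set might have changed (the point at which a
 * NETDEV_FEAT_CHANGE notification would be issued), 0 when nothing was done.
 */
static int model_update_features(struct model_dev *dev)
{
    features_t f;
    int err = 0;

    /* Assumed starting point: non-toggleable bits plus the user's request. */
    f = (dev->features & ~dev->hw_features) | dev->wanted_features;

    if (dev->fix_features)
        f = dev->fix_features(dev, f);
    /* The core's netdev_fix_features() filtering would run here. */

    if (f == dev->features)
        return 0;                       /* nothing to change */

    if (dev->set_features)
        err = dev->set_features(dev, f);

    if (err == 0)
        dev->features = f;              /* success: the new set replaces the old */

    return 1;
}
```

In this model a failing callback leaves ``features`` untouched, yet the
function still returns 1, because the set might have changed.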

The following events trigger recalculation:

 1. device's registration, after ndo_init returned success
 2. user requested changes in features state
 3. netdev_update_features() is called

ndo_*_features callbacks are called with rtnl_lock held.  Missing callbacks
are treated as always returning success.

A driver that wants to trigger recalculation must do so by calling
netdev_update_features() while holding rtnl_lock.  This should not be done
from ndo_*_features callbacks.  netdev->features should not be modified by
the driver except by means of the ndo_fix_features callback.



Part III: Implementation hints
==============================

 * ndo_fix_features:

All dependencies between features should be resolved here.  The resulting
set can be reduced further by limitations imposed by the networking core
(as coded in netdev_fix_features()).  For this reason it is safer to disable
a feature when its dependencies are not met instead of forcing the
dependency on.

This callback should not modify hardware or driver state (it should be
stateless).  It can be called multiple times between successive
ndo_set_features calls.

The callback must not alter features contained in the NETIF_F_SOFT_FEATURES
or NETIF_F_NEVER_CHANGE sets.  The exception is NETIF_F_VLAN_CHALLENGED, but
care must be taken as the change won't affect already configured VLANs.
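
A minimal userspace sketch of the "disable rather than force" advice follows.
The ``DEMO_F_*`` bits, the MTU limit and ``struct demo_dev`` are hypothetical
and do not match the real NETIF_F_* values or any real driver; the function
only illustrates a stateless filter that drops features whose dependencies
are not met.

```c
#include <stdint.h>

typedef uint64_t features_t;

/* Hypothetical feature bits for this sketch only. */
#define DEMO_F_RXCSUM   (1ULL << 0)
#define DEMO_F_LRO      (1ULL << 1)
#define DEMO_F_TSO      (1ULL << 2)

#define DEMO_TSO_MAX_MTU 1500u          /* assumed hardware limit */

struct demo_dev {
    unsigned int mtu;
};

/*
 * Stateless filter: reads device state, returns a reduced copy of the
 * requested set, and modifies neither the device nor any hardware.
 */
static features_t demo_fix_features(const struct demo_dev *dev,
                                    features_t features)
{
    /* LRO needs RX checksumming here: drop LRO, do not force RXCSUM on. */
    if (!(features & DEMO_F_RXCSUM))
        features &= ~DEMO_F_LRO;

    /* Device-specific condition: no TSO above the assumed MTU limit. */
    if (dev->mtu > DEMO_TSO_MAX_MTU)
        features &= ~DEMO_F_TSO;

    return features;
}
```

Because the function only clears bits, its result stays valid even if the
networking core removes more features afterwards.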

 * ndo_set_features:

Hardware should be reconfigured to match the passed feature set.  The set
should not be altered unless some error condition happens that can't
be reliably detected in ndo_fix_features.  In this case, the callback
should update netdev->features to match the resulting hardware state.
Errors returned are not (and cannot be) propagated anywhere except dmesg.
(Note: successful return is zero, >0 means silent error.)
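
The error path can be illustrated with a userspace model.  Everything here is
hypothetical: ``struct demo_nic``, the ``DEMO_F_*`` bits and the simulated
``vlan_engine_broken`` failure exist only for this sketch.  It assumes that
on success the caller, not the callback, stores the new set.

```c
#include <stdint.h>

typedef uint64_t features_t;

/* Hypothetical feature bits for this sketch only. */
#define DEMO_F_RXCSUM   (1ULL << 0)
#define DEMO_F_RXVLAN   (1ULL << 1)

struct demo_nic {
    features_t features;        /* stand-in for netdev->features */
    features_t hw_state;        /* what the simulated hardware is doing */
    int vlan_engine_broken;     /* simulated failure that cannot be predicted */
};

static int demo_set_features(struct demo_nic *nic, features_t wanted)
{
    features_t changed = nic->features ^ wanted;
    features_t applied = nic->features;

    /* Toggling RX checksumming always works in this model. */
    if (changed & DEMO_F_RXCSUM)
        applied ^= DEMO_F_RXCSUM;

    /* Toggling VLAN stripping works only if the engine responds. */
    if ((changed & DEMO_F_RXVLAN) && !nic->vlan_engine_broken)
        applied ^= DEMO_F_RXVLAN;

    nic->hw_state = applied;

    if (applied != wanted) {
        /* Partial failure: record what the hardware really does. */
        nic->features = applied;
        return 1;               /* >0: silent error */
    }

    return 0;                   /* success: the caller stores 'wanted' */
}
```

On the failure path the stored feature set ends up describing the actual
hardware state instead of the requested one.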



Part IV: Features
=================

For the current list of features, see include/linux/netdev_features.h.
This section describes the semantics of some of them.

 * Transmit checksumming

For a complete description, see the comments near the top of
include/linux/skbuff.h.

Note: NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM + NETIF_F_IPV6_CSUM.
It means that the device can fill a TCP/UDP-like checksum anywhere in the
packets, whatever headers there might be.

 * Transmit TCP segmentation offload

NETIF_F_TSO_ECN means that hardware can properly split packets with the CWR
bit set, be it TCPv4 (when NETIF_F_TSO is enabled) or TCPv6 (NETIF_F_TSO6).

 * Transmit UDP segmentation offload

NETIF_F_GSO_UDP_L4 accepts a single UDP header with a payload that exceeds
gso_size.  On segmentation, it segments the payload on gso_size boundaries and
replicates the network and UDP headers (fixing up the last one if less than
gso_size).
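
The arithmetic behind this split can be sketched in plain C.  The struct and
function names are hypothetical, and the sketch only counts payload bytes; it
does not build headers.

```c
#include <stddef.h>

struct demo_seg_plan {
    size_t nsegs;       /* number of datagrams produced */
    size_t last_len;    /* payload bytes carried by the final datagram */
};

/* Split payload_len bytes on gso_size boundaries. */
static struct demo_seg_plan demo_udp_segment(size_t payload_len,
                                             size_t gso_size)
{
    struct demo_seg_plan plan = { 0, 0 };

    if (payload_len == 0 || gso_size == 0)
        return plan;                    /* nothing to segment */

    /* Round up: a partial trailing chunk still needs its own datagram. */
    plan.nsegs = (payload_len + gso_size - 1) / gso_size;

    /* All datagrams but the last carry exactly gso_size bytes. */
    plan.last_len = payload_len - (plan.nsegs - 1) * gso_size;

    return plan;
}
```

For a 4000-byte payload and a gso_size of 1472 this gives three datagrams,
the last carrying 1056 bytes, which is the one whose headers need fixing up.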

 * Transmit DMA from high memory

On platforms where this is relevant, NETIF_F_HIGHDMA signals that
ndo_start_xmit can handle skbs with frags in high memory.

 * Transmit scatter-gather

Those features say that ndo_start_xmit can handle fragmented skbs:
NETIF_F_SG --- paged skbs (skb_shinfo()->frags), NETIF_F_FRAGLIST ---
chained skbs (skb->next/prev list).

 * Software features

Features contained in NETIF_F_SOFT_FEATURES are features of the networking
stack.  Drivers should not change behaviour based on them.

 * LLTX driver (deprecated for hardware drivers)

NETIF_F_LLTX is meant to be used by drivers that don't need locking at all,
e.g. software tunnels.

This is also used in a few legacy drivers that implement their
own locking; don't use it for new (hardware) drivers.

 * netns-local device

NETIF_F_NETNS_LOCAL is set for devices that are not allowed to move between
network namespaces (e.g. loopback).

Don't use it in drivers.

 * VLAN challenged

NETIF_F_VLAN_CHALLENGED should be set for devices which can't cope with VLAN
headers.  Some drivers set this because the cards can't handle the bigger MTU.
[FIXME: Those cases could be fixed in VLAN code by allowing only reduced-MTU
VLANs.  This may not be useful, though.]

 * rx-fcs

This requests that the NIC append the Ethernet Frame Check Sequence (FCS)
to the end of the skb data.  This allows sniffers and other tools to
read the CRC recorded by the NIC on receipt of the packet.

 * rx-all

This requests that the NIC receive all possible frames, including errored
frames (such as bad FCS, etc).  This can be helpful when sniffing a link with
bad packets on it.  Some NICs may receive more packets if also put into normal
PROMISC mode.

 * rx-gro-hw

This requests that the NIC enable Hardware GRO (generic receive offload).
Hardware GRO is basically the exact reverse of TSO, and is generally
stricter than Hardware LRO.  A packet stream merged by Hardware GRO must
be re-segmentable by GSO or TSO back to the exact original packet stream.
Hardware GRO is dependent on RXCSUM since every packet successfully merged
by hardware must also have the checksum verified by hardware.
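
One consequence of the re-segmentation requirement can be shown with a small
userspace check on payload lengths.  This is a deliberately narrow sketch: the
function name is hypothetical, and it looks only at segment sizes, while real
hardware also has to compare headers and verify checksums before merging.

```c
#include <stddef.h>

/*
 * Returns 1 if payloads of lens[0..n-1] bytes could be merged into one
 * packet that splitting at gso_size turns back into exactly the same
 * sequence of lengths, 0 otherwise.
 */
static int demo_resegmentable(const unsigned int *lens, size_t n,
                              unsigned int gso_size)
{
    size_t i;

    if (n == 0 || gso_size == 0)
        return 0;

    /* Every segment except the last must be exactly gso_size long. */
    for (i = 0; i + 1 < n; i++) {
        if (lens[i] != gso_size)
            return 0;
    }

    /* The last segment may be shorter, but neither empty nor longer. */
    return lens[n - 1] > 0 && lens[n - 1] <= gso_size;
}
```

A short segment in the middle of a stream therefore ends the merge, because
segmentation at a fixed size could not reproduce it.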
185