============
Architecture
============

This document describes the **Distributed Switch Architecture (DSA)** subsystem
design principles, limitations, interactions with other subsystems, and how to
develop drivers for this subsystem as well as a TODO for developers interested
in joining the effort.

Design principles
=================

The Distributed Switch Architecture is a subsystem which was primarily designed
to support Marvell Ethernet switches (MV88E6xxx, a.k.a. Linkstreet product line)
using Linux, but has since evolved to support other vendors as well.

The original philosophy behind this design was that unmodified Linux tools such
as bridge, iproute2 and ifconfig should work transparently whether they
configure/query a switch port network device or a regular network device.

An Ethernet switch is typically comprised of multiple front-panel ports, and one
or more CPU or management ports. The DSA subsystem currently relies on the
presence of a management port connected to an Ethernet controller capable of
receiving Ethernet frames from the switch. This is a very common setup for all
kinds of Ethernet switches found in small office and home office products:
routers, gateways, or even top-of-rack switches. This host Ethernet controller
will be later referred to as "master" and "cpu" in DSA terminology and code.

The D in DSA stands for Distributed, because the subsystem has been designed
with the ability to configure and manage cascaded switches on top of each other
using upstream and downstream Ethernet links between switches. These specific
ports are referred to as "dsa" ports in DSA terminology and code. A collection
of multiple switches connected to each other is called a "switch tree".

For each front-panel port, DSA will create specialized network devices which are
used as controlling and data-flowing endpoints for use by the Linux networking
stack. These specialized network interfaces are referred to as "slave" network
interfaces in DSA terminology and code.

The ideal case for using DSA is when an Ethernet switch supports a "switch tag",
which is a hardware feature making the switch insert a specific tag into each
Ethernet frame it receives from, or sends to, specific ports to help the
management interface figure out:

- what port this frame is coming from
- why this frame got forwarded
- how to send CPU originated traffic to specific ports

The subsystem does support switches not capable of inserting/stripping tags, but
the features might be slightly limited in that case (traffic separation relies
on Port-based VLAN IDs).

Note that DSA does not currently create network interfaces for the "cpu" and
"dsa" ports because:

- the "cpu" port is the Ethernet switch facing side of the management
  controller, and as such, would duplicate functionality, since you
  would get two interfaces for the same conduit: master netdev, and "cpu" netdev

- the "dsa" port(s) are just conduits between two or more switches, and as such
  cannot really be used as proper network interfaces either; only the
  downstream, or the top-most upstream interface makes sense with that model

Switch tagging protocols
------------------------

DSA currently supports 5 different tagging protocols, and a tag-less mode as
well. The different protocols are implemented in:

- ``net/dsa/tag_trailer.c``: Marvell's 4-byte trailer tag mode (legacy)
- ``net/dsa/tag_dsa.c``: Marvell's original DSA tag
- ``net/dsa/tag_edsa.c``: Marvell's enhanced DSA tag
- ``net/dsa/tag_brcm.c``: Broadcom's 4-byte tag
- ``net/dsa/tag_qca.c``: Qualcomm's 2-byte tag

The exact format of the tag protocol is vendor specific, but in general, they
all contain something which:

- identifies which port the Ethernet frame came from/should be sent to
- provides a reason why this frame was forwarded to the management interface
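
As a rough illustration of these two pieces of information, the sketch below
parses a made-up tag. The 4-byte layout, the ``example_*`` names and the reason
codes are all invented for this example and do not correspond to any vendor
format or to code in ``net/dsa/``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical 4-byte tag layout, invented for illustration only:
 *   byte 0:    source port (receive) or destination port (transmit)
 *   byte 1:    reason code explaining why the frame reached the CPU
 *   bytes 2-3: reserved
 */
#define EXAMPLE_TAG_LEN 4

enum example_reason {
	EXAMPLE_REASON_FORWARD = 0,	/* regular forwarding decision */
	EXAMPLE_REASON_TRAP = 1,	/* frame trapped to the CPU */
};

struct example_tag_info {
	uint8_t port;
	uint8_t reason;
};

/* Parse the tag found at buf; returns 0 on success, -1 if buf is too short. */
static int example_tag_parse(const uint8_t *buf, size_t len,
			     struct example_tag_info *info)
{
	assert(buf != NULL && info != NULL);
	if (len < EXAMPLE_TAG_LEN)
		return -1;
	info->port = buf[0];
	info->reason = buf[1];
	return 0;
}
```

Real tag formats pack these fields differently (and often carry more), so the
actual parsers in ``net/dsa/tag_*.c`` remain the reference.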

Master network devices
----------------------

Master network devices are regular, unmodified Linux network device drivers for
the CPU/management Ethernet interface. Such a driver might occasionally need to
know whether DSA is enabled (e.g.: to enable/disable specific offload features),
but the DSA subsystem has been proven to work with industry standard drivers:
``e1000e``, ``mv643xx_eth`` etc. without having to introduce modifications to these
drivers. Such network devices are also often referred to as conduit network
devices since they act as a pipe between the host processor and the hardware
Ethernet switch.

Networking stack hooks
----------------------

When a master netdev is used with DSA, a small hook is placed in the
networking stack in order to have the DSA subsystem process the Ethernet
switch specific tagging protocol. DSA accomplishes this by registering a
specific (and fake) Ethernet type (later becoming ``skb->protocol``) with the
networking stack; this is also known as a ``ptype`` or ``packet_type``. A typical
Ethernet frame receive sequence looks like this:

Master network device (e.g.: e1000e):

1. Receive interrupt fires:

        - receive function is invoked
        - basic packet processing is done: getting length, status etc.
        - packet is prepared to be processed by the Ethernet layer by calling
          ``eth_type_trans``

2. net/ethernet/eth.c::

          eth_type_trans(skb, dev)
                  if (dev->dsa_ptr != NULL)
                          -> skb->protocol = ETH_P_XDSA

3. drivers/net/ethernet/\*::

          netif_receive_skb(skb)
                  -> iterate over registered packet_type
                          -> invoke handler for ETH_P_XDSA, calls dsa_switch_rcv()

4. net/dsa/dsa.c::

          -> dsa_switch_rcv()
                  -> invoke switch tag specific protocol handler in 'net/dsa/tag_*.c'

5. net/dsa/tag_*.c:

        - inspect and strip switch tag protocol to determine originating port
        - locate per-port network device
        - invoke ``eth_type_trans()`` with the DSA slave network device
        - invoke ``netif_receive_skb()``

Past this point, the DSA slave network devices get delivered regular Ethernet
frames that can be processed by the networking stack.
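
The receive sequence above can be modelled in user space roughly as follows.
This is only a sketch: the ``example_*`` types and functions are hypothetical
stand-ins for ``struct net_device``, ``eth_type_trans()`` and
``dsa_switch_rcv()``, and the tag parsing itself is assumed to have already
produced a source port number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_XDSA 0x00F8	/* fake ethertype, as in include/uapi/linux/if_ether.h */
#define EXAMPLE_NUM_PORTS 4

struct example_netdev {
	const char *name;
	void *dsa_ptr;		/* non-NULL once the device acts as a DSA master */
};

struct example_tree {
	struct example_netdev *slaves[EXAMPLE_NUM_PORTS];
};

/* Step 2: choose the protocol value the stack will dispatch on. */
static uint16_t example_type_trans(const struct example_netdev *dev,
				   uint16_t frame_ethertype)
{
	return dev->dsa_ptr ? ETH_P_XDSA : frame_ethertype;
}

/* Steps 4-5: map the port recovered from the tag to its slave device. */
static struct example_netdev *
example_switch_rcv(const struct example_netdev *master, unsigned int src_port)
{
	const struct example_tree *tree = master->dsa_ptr;

	assert(tree != NULL);
	if (src_port >= EXAMPLE_NUM_PORTS)
		return NULL;	/* unknown port: the frame would be dropped */
	return tree->slaves[src_port];
}
```

The point of the model is that the master device never interprets the frame
payload itself; the fake protocol value is what routes tagged frames to the
tag handler.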

Slave network devices
---------------------

Slave network devices created by DSA are stacked on top of their master network
device. Each of these network interfaces is responsible for being a
controlling and data-flowing end-point for each front-panel port of the switch.
These interfaces are specialized in order to:

- insert/remove the switch tag protocol (if it exists) when sending traffic
  to/from specific switch ports
- query the switch for ethtool operations: statistics, link state,
  Wake-on-LAN, register dumps...
- external/internal PHY management: link, auto-negotiation etc.

These slave network devices have custom net_device_ops and ethtool_ops function
pointers which allow DSA to introduce a level of layering between the networking
stack/ethtool, and the switch driver implementation.

Upon frame transmission from these slave network devices, DSA will look up which
switch tagging protocol is currently registered with these network devices, and
invoke a specific transmit routine which takes care of adding the relevant
switch tag in the Ethernet frames.

These frames are then queued for transmission using the master network device
``ndo_start_xmit()`` function. Since they contain the appropriate switch tag, the
Ethernet switch will be able to process these incoming frames from the
management interface and deliver these frames to the physical switch port.
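
A minimal sketch of such a transmit routine is shown below. It appends a
made-up 4-byte trailer tag carrying the destination port. The layout and the
``example_tag_xmit()`` name are assumptions for illustration, not the format of
any real tagging protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_TAG_LEN 4	/* hypothetical trailer tag size */

/*
 * Copy the frame into out and append a hypothetical trailer tag naming the
 * destination port. Returns the tagged length, or 0 if out is too small.
 */
static size_t example_tag_xmit(const uint8_t *frame, size_t len,
			       uint8_t dst_port, uint8_t *out, size_t out_size)
{
	assert(frame != NULL && out != NULL);
	if (out_size < len + EXAMPLE_TAG_LEN)
		return 0;
	memcpy(out, frame, len);
	out[len] = dst_port;				/* tag byte 0: destination port */
	memset(out + len + 1, 0, EXAMPLE_TAG_LEN - 1);	/* reserved tag bytes */
	return len + EXAMPLE_TAG_LEN;
}
```

In the kernel the tagged buffer would then be handed to the master device for
transmission; here the caller simply receives the new length.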

Graphical representation
------------------------

Summarized, this is basically what DSA looks like from a network device
perspective::

                |---------------------------
                | CPU network device (eth0)|
                ----------------------------
                | <tag added by switch     |
                |                          |
                |                          |
                |        tag added by CPU> |
        |--------------------------------------------|
        |            Switch driver                   |
        |--------------------------------------------|
                  ||        ||         ||
              |-------|  |-------|  |-------|
              | sw0p0 |  | sw0p1 |  | sw0p2 |
              |-------|  |-------|  |-------|

Slave MDIO bus
--------------

In order to be able to read from/write to a PHY built into the switch, DSA
creates a slave MDIO bus which allows a specific switch driver to divert and
intercept MDIO reads/writes towards specific PHY addresses. In most
MDIO-connected switches, these functions would utilize direct or indirect PHY
addressing mode to return standard MII registers from the switch builtin PHYs,
allowing the PHY library to obtain link status, link partner pages,
auto-negotiation results etc.

For Ethernet switches which have both external and internal MDIO busses, the
slave MII bus can be utilized to mux/demux MDIO reads and writes towards either
internal or external MDIO devices this switch might be connected to: internal
PHYs, external PHYs, or even external switches.

Data structures
---------------

DSA data structures are defined in ``include/net/dsa.h`` as well as
``net/dsa/dsa_priv.h``:

- ``dsa_chip_data``: platform data configuration for a given switch device.
  This structure describes a switch device's parent device, its address, as
  well as various properties of its ports: names/labels, and finally a routing
  table indication (when cascading switches)

- ``dsa_platform_data``: platform device configuration data which can reference
  a collection of dsa_chip_data structures if multiple switches are cascaded.
  The master network device this switch tree is attached to needs to be
  referenced

- ``dsa_switch_tree``: structure assigned to the master network device under
  ``dsa_ptr``. This structure references a dsa_platform_data structure as well as
  the tagging protocol supported by the switch tree, and which receive/transmit
  function hooks should be invoked. Information about the directly attached
  switch is also provided: CPU port. Finally, a collection of dsa_switch are
  referenced to address individual switches in the tree.

- ``dsa_switch``: structure describing a switch device in the tree, referencing
  a ``dsa_switch_tree`` as a backpointer, slave network devices, master network
  device, and a reference to the backing ``dsa_switch_ops``

- ``dsa_switch_ops``: structure referencing function pointers, see below for a
  full description.

Design limitations
==================

Limits on the number of devices and ports
-----------------------------------------

DSA currently limits the maximum number of switches within a tree to 4
(``DSA_MAX_SWITCHES``), and the number of ports per switch to 12 (``DSA_MAX_PORTS``).
These limits could be extended to support larger configurations should the need
arise.

Lack of CPU/DSA network devices
-------------------------------

DSA does not currently create slave network devices for the CPU or DSA ports, as
described before. This might be an issue in the following cases:

- inability to fetch switch CPU port statistics counters using ethtool, which
  can make it harder to debug MDIO switches connected using xMII interfaces

- inability to configure the CPU port link parameters based on the Ethernet
  controller capabilities attached to it: http://patchwork.ozlabs.org/patch/509806/

- inability to configure specific VLAN IDs / trunking VLANs between switches
  when using a cascaded setup

Common pitfalls using DSA setups
--------------------------------

Once a master network device is configured to use DSA (``dev->dsa_ptr`` becomes
non-NULL), and the switch behind it expects a tagging protocol, this network
interface can only be used as a conduit interface. Sending packets
directly through this interface (e.g.: opening a socket using this interface)
will not go through the switch tagging protocol transmit function, so
the Ethernet switch on the other end, expecting a tag, will typically drop this
frame.

Slave network devices check that the master network device is UP before allowing
you to administratively bring UP these slave network devices. A common
configuration mistake is forgetting to bring UP the master network device first.
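
The ordering constraint can be pictured with the following user-space sketch.
The ``example_netdev`` structure and ``example_open()`` are hypothetical, and the
use of ``-ENETDOWN`` as the error code is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct example_netdev {
	const char *name;
	bool up;
	struct example_netdev *master;	/* NULL for a regular (non-slave) device */
};

/* Bring a device up; a slave device refuses while its master is down. */
static int example_open(struct example_netdev *dev)
{
	assert(dev != NULL);
	if (dev->master && !dev->master->up)
		return -ENETDOWN;
	dev->up = true;
	return 0;
}
```

With this model, opening a slave before its master fails, while opening the
master first and then the slave succeeds.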

Interactions with other subsystems
==================================

DSA currently leverages the following subsystems:

- MDIO/PHY library: ``drivers/net/phy/phy.c``, ``mdio_bus.c``
- Switchdev: ``net/switchdev/*``
- Device Tree for various of_* functions

MDIO/PHY library
----------------

Slave network devices exposed by DSA may or may not be interfacing with PHY
devices (``struct phy_device`` as defined in ``include/linux/phy.h``), but the DSA
subsystem deals with all possible combinations:

- internal PHY devices, built into the Ethernet switch hardware
- external PHY devices, connected via an internal or external MDIO bus
- internal PHY devices, connected via an internal MDIO bus
- special, non-autonegotiated or non MDIO-managed PHY devices: SFPs, MoCA; a.k.a.
  fixed PHYs

The PHY configuration is done by the ``dsa_slave_phy_setup()`` function and the
logic basically looks like this:

- if Device Tree is used, the PHY device is looked up using the standard
  "phy-handle" property; if found, this PHY device is created and registered
  using ``of_phy_connect()``

- if Device Tree is used, and the PHY device is "fixed", that is, conforms to
  the definition of a non-MDIO managed PHY as defined in
  ``Documentation/devicetree/bindings/net/fixed-link.txt``, the PHY is registered
  and connected transparently using the special fixed MDIO bus driver

- finally, if the PHY is built into the switch, as is very common with
  standalone switch packages, the PHY is probed using the slave MII bus created
  by DSA
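
The decision order listed above can be summarized by the sketch below. It is a
simplified model, not the body of ``dsa_slave_phy_setup()``: the
``example_port_desc`` fields and the enum are hypothetical, and giving
"phy-handle" priority when both properties are present is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum example_phy_source {
	EXAMPLE_PHY_OF_HANDLE,		/* found through "phy-handle" */
	EXAMPLE_PHY_FIXED_LINK,		/* non-MDIO managed, fixed PHY */
	EXAMPLE_PHY_SLAVE_MII_BUS,	/* probed on the DSA slave MII bus */
};

struct example_port_desc {
	bool uses_device_tree;
	bool has_phy_handle;
	bool has_fixed_link;
};

/* Return where the PHY for this port would come from. */
static enum example_phy_source
example_phy_setup(const struct example_port_desc *port)
{
	assert(port != NULL);
	if (port->uses_device_tree && port->has_phy_handle)
		return EXAMPLE_PHY_OF_HANDLE;
	if (port->uses_device_tree && port->has_fixed_link)
		return EXAMPLE_PHY_FIXED_LINK;
	return EXAMPLE_PHY_SLAVE_MII_BUS;
}
```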

SWITCHDEV
---------

DSA directly utilizes SWITCHDEV when interfacing with the bridge layer, and
more specifically with its VLAN filtering portion when configuring VLANs on top
of per-port slave network devices. Since DSA primarily deals with
MDIO-connected switches, although not exclusively, SWITCHDEV's
prepare/abort/commit phases are often simplified into a prepare phase which
checks whether the operation is supported by the DSA switch driver, and a commit
phase which applies the changes.
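
The simplified two-phase flow might look like the sketch below, which models
adding a VLAN to a switch. All names here (``example_switch``,
``example_vlan_prepare()`` and so on) and the chosen error codes are
illustrative assumptions rather than the SWITCHDEV API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_VID 4094

struct example_switch {
	bool supports_vlan;
	uint16_t last_vid;	/* last VLAN ID written to the "hardware" */
};

/* Prepare phase: only validate, never modify switch state. */
static int example_vlan_prepare(const struct example_switch *sw, uint16_t vid)
{
	if (!sw->supports_vlan)
		return -EOPNOTSUPP;
	if (vid == 0 || vid > EXAMPLE_MAX_VID)
		return -EINVAL;
	return 0;
}

/* Commit phase: apply a change that prepare already accepted. */
static void example_vlan_commit(struct example_switch *sw, uint16_t vid)
{
	sw->last_vid = vid;
}

/* Run both phases in order; nothing is applied if prepare fails. */
static int example_vlan_add(struct example_switch *sw, uint16_t vid)
{
	int err;

	assert(sw != NULL);
	err = example_vlan_prepare(sw, vid);
	if (err)
		return err;
	example_vlan_commit(sw, vid);
	return 0;
}
```

Keeping all checks in the prepare step means the commit step has no failure
path, which is what makes dropping a separate abort phase reasonable.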

As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN
objects.

Device Tree
-----------

DSA features a standardized binding which is documented in
``Documentation/devicetree/bindings/net/dsa/dsa.txt``. PHY/MDIO library helper
functions such as ``of_get_phy_mode()``, ``of_phy_connect()`` are also used to query
per-port PHY specific details: interface connection, MDIO bus location etc.

Driver development
==================

DSA switch drivers need to implement a dsa_switch_ops structure which will
contain the various members described below.

``register_switch_driver()`` registers this dsa_switch_ops in DSA's internal list
of drivers to probe for. ``unregister_switch_driver()`` does the exact opposite.

Unless requested differently by setting the priv_size member accordingly, DSA
does not allocate any driver private context space.

Switch configuration
--------------------

- ``tag_protocol``: this is to indicate what kind of tagging protocol is supported,
  should be a valid value from the ``dsa_tag_protocol`` enum

- ``probe``: probe routine which will be invoked by the DSA platform device upon
  registration to test for the presence/absence of a switch device. For MDIO
  devices, it is recommended to issue a read towards internal registers using
  the switch pseudo-PHY and return whether this is a supported device. For other
  buses, return a non-NULL string

- ``setup``: setup function for the switch. This function is responsible for setting
  up the ``dsa_switch_ops`` private structure with all it needs: register maps,
  interrupts, mutexes, locks etc. This function is also expected to properly
  configure the switch to separate all network interfaces from each other, that
  is, they should be isolated by the switch hardware itself, typically by creating
  a Port-based VLAN ID for each port and allowing only the CPU port and the
  specific port to be in the forwarding vector. Ports that are unused by the
  platform should be disabled. Past this function, the switch is expected to be
  fully configured and ready to serve any kind of request. It is recommended
  to issue a software reset of the switch during this setup function in order to
  avoid relying on what a previous software agent such as a bootloader/firmware
  may have previously configured.
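
The port isolation expected from ``setup`` can be sketched as a per-port
membership bitmask computation. The function name, the fixed port count and the
treatment of the CPU port entry (member of every used port's domain) are
assumptions made for this example. A real driver would program equivalent
values into its hardware tables.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_NUM_PORTS 5

/*
 * Fill member_mask[p] with the ports allowed in port p's forwarding vector.
 * used_ports is a bitmask of the user ports wired up on this platform.
 */
static void example_isolate_ports(uint32_t member_mask[EXAMPLE_NUM_PORTS],
				  unsigned int cpu_port, uint32_t used_ports)
{
	unsigned int p;

	assert(cpu_port < EXAMPLE_NUM_PORTS);
	for (p = 0; p < EXAMPLE_NUM_PORTS; p++) {
		if (p == cpu_port)
			/* assumption: the CPU port reaches every used port */
			member_mask[p] = used_ports | (1u << cpu_port);
		else if (used_ports & (1u << p))
			/* user port: only itself and the CPU port */
			member_mask[p] = (1u << cpu_port) | (1u << p);
		else
			/* unused port: disabled, member of nothing */
			member_mask[p] = 0;
	}
}
```

For a 5-port switch with the CPU on port 4 and ports 0 to 2 in use, each user
port ends up paired only with the CPU port, so no two front-panel ports can
exchange frames in hardware.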

PHY devices and link management
-------------------------------

- ``get_phy_flags``: Some switches are interfaced to various kinds of Ethernet PHYs.
  If the PHY library PHY driver needs to know about information it cannot obtain
  on its own (e.g.: coming from switch memory mapped registers), this function
  should return a 32-bit bitmask of "flags" that is private between the switch
  driver and the Ethernet PHY driver in ``drivers/net/phy/\*``.

- ``phy_read``: Function invoked by the DSA slave MDIO bus when attempting to read
  the switch port MDIO registers. If unavailable, return 0xffff for each read.
  For builtin switch Ethernet PHYs, this function should allow reading the link
  status, auto-negotiation results, link partner pages etc.

- ``phy_write``: Function invoked by the DSA slave MDIO bus when attempting to write
  to the switch port MDIO registers. If unavailable, return a negative error
  code.

- ``adjust_link``: Function invoked by the PHY library when a slave network device
  is attached to a PHY device. This function is responsible for appropriately
  configuring the switch port link parameters: speed, duplex, pause based on
  what the ``phy_device`` is providing.

- ``fixed_link_update``: Function invoked by the PHY library, and specifically by
  the fixed PHY driver asking the switch driver for link parameters that could
  not be auto-negotiated, or obtained by reading the PHY registers through MDIO.
  This is particularly useful for specific kinds of hardware such as QSGMII,
  MoCA or other kinds of non-MDIO managed PHYs where out of band link
  information is obtained
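
The return conventions for ``phy_read`` and ``phy_write`` can be illustrated with
a small register-file model. The ``example_switch`` structure and function names
are hypothetical, and ``-ENODEV`` is just one plausible choice of negative error
code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define EXAMPLE_NUM_PHYS 4
#define EXAMPLE_NUM_REGS 32

struct example_switch {
	uint32_t phy_present;	/* bit n set: port n has a built-in PHY */
	uint16_t regs[EXAMPLE_NUM_PHYS][EXAMPLE_NUM_REGS];
};

static int example_has_phy(const struct example_switch *sw, int port)
{
	return port >= 0 && port < EXAMPLE_NUM_PHYS &&
	       (sw->phy_present & (1u << port));
}

/* Read a PHY register; 0xffff signals that no PHY answers at this address. */
static int example_phy_read(const struct example_switch *sw, int port, int regnum)
{
	assert(regnum >= 0 && regnum < EXAMPLE_NUM_REGS);
	if (!example_has_phy(sw, port))
		return 0xffff;
	return sw->regs[port][regnum];
}

/* Write a PHY register; a negative error code signals an unavailable PHY. */
static int example_phy_write(struct example_switch *sw, int port, int regnum,
			     uint16_t val)
{
	assert(regnum >= 0 && regnum < EXAMPLE_NUM_REGS);
	if (!example_has_phy(sw, port))
		return -ENODEV;
	sw->regs[port][regnum] = val;
	return 0;
}
```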

Ethtool operations
------------------

- ``get_strings``: ethtool function used to query the driver's strings, will
  typically return statistics strings, private flags strings etc.

- ``get_ethtool_stats``: ethtool function used to query per-port statistics and
  return their values. DSA overlays slave network devices general statistics:
  RX/TX counters from the network device, with switch driver specific statistics
  per port

- ``get_sset_count``: ethtool function used to query the number of statistics items

- ``get_wol``: ethtool function used to obtain Wake-on-LAN settings per-port. This
  function may, for certain implementations, also query the master network device
  Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN

- ``set_wol``: ethtool function used to configure Wake-on-LAN settings per-port,
  direct counterpart to get_wol with similar restrictions

- ``set_eee``: ethtool function which is used to configure a switch port's EEE
  (Green Ethernet) settings, can optionally invoke the PHY library to enable EEE
  at the PHY level if relevant. This function should enable EEE at the switch
  port MAC controller and data-processing logic

- ``get_eee``: ethtool function which is used to query a switch port's EEE settings.
  This function should return the EEE state of the switch port MAC controller
  and data-processing logic as well as query the PHY for its currently configured
  EEE settings

- ``get_eeprom_len``: ethtool function returning for a given switch the EEPROM
  length/size in bytes

- ``get_eeprom``: ethtool function returning for a given switch the EEPROM contents

- ``set_eeprom``: ethtool function writing specified data to a given switch EEPROM

- ``get_regs_len``: ethtool function returning the register length for a given
  switch

- ``get_regs``: ethtool function returning the Ethernet switch internal register
  contents. This function might require user-land code in ethtool to
  pretty-print register values and registers

Power management
----------------

- ``suspend``: function invoked by the DSA platform device when the system goes to
  suspend, should quiesce all Ethernet switch activities, but keep ports
  participating in Wake-on-LAN active as well as additional wake-up logic if
  supported

- ``resume``: function invoked by the DSA platform device when the system resumes,
  should resume all Ethernet switch activities and re-configure the switch to be
  in a fully active state

- ``port_enable``: function invoked by the DSA slave network device ndo_open
  function when a port is administratively brought up. This function should
  fully enable a given switch port. DSA takes care of marking the port with
  ``BR_STATE_BLOCKING`` if the port is a bridge member, or ``BR_STATE_FORWARDING`` if it
  is not, and propagating these changes down to the hardware

- ``port_disable``: function invoked by the DSA slave network device ndo_close
  function when a port is administratively brought down. This function should
  fully disable a given switch port. DSA takes care of marking the port with
  ``BR_STATE_DISABLED`` and propagating changes to the hardware if this port is
  disabled while being a bridge member
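
The state DSA assigns around ``port_enable`` and ``port_disable`` can be
summarized by the simplified model below. The ``example_port`` structure and
helper names are hypothetical. The ``BR_STATE_*`` values are redefined locally
to keep the sketch self-contained, and the close path is simplified to always
record the disabled state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror the bridge port states in include/uapi/linux/if_bridge.h. */
#define BR_STATE_DISABLED	0
#define BR_STATE_FORWARDING	3
#define BR_STATE_BLOCKING	4

struct example_port {
	bool bridge_member;
	int stp_state;
};

/* On open: bridge members start blocked, standalone ports forward at once. */
static void example_port_open(struct example_port *port)
{
	assert(port != NULL);
	port->stp_state = port->bridge_member ? BR_STATE_BLOCKING
					      : BR_STATE_FORWARDING;
}

/* On close: the port is marked disabled. */
static void example_port_close(struct example_port *port)
{
	assert(port != NULL);
	port->stp_state = BR_STATE_DISABLED;
}
```

Starting bridge members in the blocking state leaves it to the bridge layer to
move the port to forwarding once it has decided that is safe.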
475*4882a593Smuzhiyun
476*4882a593SmuzhiyunBridge layer
477*4882a593Smuzhiyun------------
478*4882a593Smuzhiyun
479*4882a593Smuzhiyun- ``port_bridge_join``: bridge layer function invoked when a given switch port is
480*4882a593Smuzhiyun  added to a bridge, this function should be doing the necessary at the switch
481*4882a593Smuzhiyun  level to permit the joining port from being added to the relevant logical
482*4882a593Smuzhiyun  domain for it to ingress/egress traffic with other members of the bridge.
483*4882a593Smuzhiyun
- ``port_bridge_leave``: bridge layer function invoked when a given switch port
  is removed from a bridge. This function should do whatever is necessary at
  the switch level to prevent the leaving port from exchanging ingress/egress
  traffic with the remaining bridge members. When the port leaves the bridge,
  its entries should be aged out in the switch hardware so that the switch can
  (re)learn MAC addresses behind this port.

- ``port_stp_state_set``: bridge layer function invoked when a given switch
  port STP state is computed by the bridge layer and should be propagated to
  the switch hardware to forward/block/learn traffic. The switch driver is
  responsible for computing an STP state change from the current and requested
  states, and for performing the relevant ageing based on the result.

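The three bridge layer callbacks above can be illustrated with a toy switch
model. Everything here is an assumption made for the sketch: the
``model_switch`` structure, the per-port forwarding mask, the learned-address
counter and the rule that ageing happens when a port stops learning are
hypothetical, and the CPU port is left out for brevity.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_NUM_PORTS 4
#define MODEL_NO_BRIDGE (-1)

enum model_stp {
    MODEL_DISABLED, MODEL_BLOCKING, MODEL_LISTENING,
    MODEL_LEARNING, MODEL_FORWARDING,
};

struct model_switch {
    int bridge[MODEL_NUM_PORTS];           /* bridge id, or MODEL_NO_BRIDGE */
    uint32_t fwd_mask[MODEL_NUM_PORTS];    /* ports each port may reach */
    enum model_stp stp[MODEL_NUM_PORTS];
    unsigned int learned[MODEL_NUM_PORTS]; /* learned MACs behind each port */
};

static void model_switch_init(struct model_switch *sw)
{
    memset(sw, 0, sizeof(*sw));
    for (int p = 0; p < MODEL_NUM_PORTS; p++)
        sw->bridge[p] = MODEL_NO_BRIDGE;
}

/* Rebuild every forwarding mask from the current bridge membership. */
static void model_recompute_masks(struct model_switch *sw)
{
    for (int p = 0; p < MODEL_NUM_PORTS; p++) {
        sw->fwd_mask[p] = 0;
        for (int q = 0; q < MODEL_NUM_PORTS; q++)
            if (q != p && sw->bridge[p] != MODEL_NO_BRIDGE &&
                sw->bridge[q] == sw->bridge[p])
                sw->fwd_mask[p] |= UINT32_C(1) << q;
    }
}

/* Join: put the port into the bridge's logical domain. */
static void model_bridge_join(struct model_switch *sw, int port, int bridge)
{
    sw->bridge[port] = bridge;
    model_recompute_masks(sw);
}

/* Leave: isolate the port and age out what was learned behind it. */
static void model_bridge_leave(struct model_switch *sw, int port)
{
    sw->bridge[port] = MODEL_NO_BRIDGE;
    sw->learned[port] = 0;
    model_recompute_masks(sw);
}

static int model_learns(enum model_stp s)
{
    return s == MODEL_LEARNING || s == MODEL_FORWARDING;
}

/* STP change: compare current and requested state, age out if needed. */
static void model_stp_state_set(struct model_switch *sw, int port,
                                enum model_stp requested)
{
    if (model_learns(sw->stp[port]) && !model_learns(requested))
        sw->learned[port] = 0;
    sw->stp[port] = requested;
}
```

Recomputing all masks on every membership change keeps the sketch short; a
real driver would typically touch only the registers of the affected ports.
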
Bridge VLAN filtering
---------------------

- ``port_vlan_filtering``: bridge layer function invoked when the bridge gets
  configured for turning on or off VLAN filtering. If nothing specific needs to
  be done at the hardware level, this callback does not need to be implemented.
  When VLAN filtering is turned on, the hardware must be programmed to reject
  802.1Q frames which have VLAN IDs outside of the programmed allowed VLAN ID
  map/rules. If there is no PVID programmed into the switch port, untagged
  frames must be rejected as well. When VLAN filtering is turned off, the
  switch must accept any 802.1Q frames irrespective of their VLAN ID, and
  untagged frames are allowed.

- ``port_vlan_prepare``: bridge layer function invoked when the bridge prepares
  the configuration of a VLAN on the given port. If the operation is not
  supported by the hardware, this function should return ``-EOPNOTSUPP`` to
  inform the bridge code to fall back to a software implementation. No hardware
  setup must be done in this function. See ``port_vlan_add`` for this and
  further details.

- ``port_vlan_add``: bridge layer function invoked when a VLAN is configured
  (tagged or untagged) for the given switch port

- ``port_vlan_del``: bridge layer function invoked when a VLAN is removed from the
  given switch port

- ``port_vlan_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each VLAN the given port is a member
  of. A switchdev object is used to carry the VID and bridge flags.

- ``port_fdb_add``: bridge layer function invoked when the bridge wants to
  install a Forwarding Database entry. The switch hardware should be programmed
  with the specified address in the specified VLAN ID in the forwarding
  database associated with this VLAN ID. If the operation is not supported,
  this function should return ``-EOPNOTSUPP`` to inform the bridge code to fall
  back to a software implementation.

.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
        of DSA, would be its port-based VLAN, used by the associated bridge device.

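The ingress rules stated for ``port_vlan_filtering`` can be written down as a
small decision function. This is a sketch only: ``struct model_vlan_port`` and
the ``model_*`` helpers are hypothetical names, and the allowed VLAN map is
modelled as a plain bitmap rather than any real hardware table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port VLAN state. */
struct model_vlan_port {
    bool filtering;            /* VLAN filtering turned on for this port */
    bool has_pvid;             /* a PVID is programmed on this port */
    uint8_t allowed[4096 / 8]; /* one bit per 12-bit VLAN ID */
};

/* Models a VLAN being added to the port's allowed map. */
static void model_vlan_add(struct model_vlan_port *p, uint16_t vid)
{
    vid &= 0x0fff;
    p->allowed[vid / 8] |= (uint8_t)(1u << (vid % 8));
}

/* Models a VLAN being removed from the port's allowed map. */
static void model_vlan_del(struct model_vlan_port *p, uint16_t vid)
{
    vid &= 0x0fff;
    p->allowed[vid / 8] &= (uint8_t)~(1u << (vid % 8));
}

/* Decide whether a frame is accepted on ingress.
 * tagged: frame carries an 802.1Q tag; vid is ignored when untagged. */
static bool model_ingress_accept(const struct model_vlan_port *p,
                                 bool tagged, uint16_t vid)
{
    if (!p->filtering)
        return true;        /* filtering off: accept any frame */
    if (!tagged)
        return p->has_pvid; /* untagged frames need a PVID */
    vid &= 0x0fff;
    return (p->allowed[vid / 8] & (1u << (vid % 8))) != 0;
}
```

With filtering on, a tagged frame is accepted only if its VLAN ID was added to
the map, and an untagged frame only if the port has a PVID.
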
- ``port_fdb_del``: bridge layer function invoked when the bridge wants to
  remove a Forwarding Database entry. The switch hardware should be programmed
  to delete the specified MAC address from the specified VLAN ID if it was
  mapped into this port's forwarding database.

- ``port_fdb_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each MAC address known to be behind
  the given port. A switchdev object is used to carry the VID and FDB info.

- ``port_mdb_prepare``: bridge layer function invoked when the bridge prepares
  the installation of a multicast database entry. If the operation is not
  supported, this function should return ``-EOPNOTSUPP`` to inform the bridge
  code to fall back to a software implementation. No hardware setup must be
  done in this function. See ``port_fdb_add`` for this and further details.

- ``port_mdb_add``: bridge layer function invoked when the bridge wants to
  install a multicast database entry. The switch hardware should be programmed
  with the specified address in the specified VLAN ID in the forwarding
  database associated with this VLAN ID.

.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
        of DSA, would be its port-based VLAN, used by the associated bridge device.

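The prepare/add split described for the multicast database can be sketched as
a two-phase operation: the prepare step only validates, and the add step
commits. The table layout, the ``hw_multicast`` capability flag, the
``-ENOSPC`` return for a full table and all ``model_*`` names are assumptions
made for this illustration; only the ``-EOPNOTSUPP`` convention comes from the
text above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MDB_SIZE 4

struct model_mdb_entry {
    uint8_t addr[6];
    uint16_t vid;
    int port;
};

/* Hypothetical hardware multicast table. */
struct model_mdb {
    bool hw_multicast; /* assumed capability flag */
    int used;
    struct model_mdb_entry e[MODEL_MDB_SIZE];
};

/* Prepare phase: validate only, never modify the table. */
static int model_mdb_prepare(const struct model_mdb *t)
{
    if (!t->hw_multicast)
        return -EOPNOTSUPP; /* tells the caller to fall back to software */
    if (t->used >= MODEL_MDB_SIZE)
        return -ENOSPC;     /* assumption: table full is reported here */
    return 0;
}

/* Commit phase: only called after a successful prepare. */
static void model_mdb_add(struct model_mdb *t, int port,
                          const uint8_t addr[6], uint16_t vid)
{
    struct model_mdb_entry *e = &t->e[t->used++];

    memcpy(e->addr, addr, sizeof(e->addr));
    e->vid = vid;
    e->port = port;
}

/* Caller side: returns true if the entry was offloaded to "hardware",
 * false if the caller has to handle it in software instead. */
static bool model_mdb_install(struct model_mdb *t, int port,
                              const uint8_t addr[6], uint16_t vid)
{
    if (model_mdb_prepare(t) != 0)
        return false;
    model_mdb_add(t, port, addr, vid);
    return true;
}
```

Keeping every failure path in the prepare step means the commit step cannot
fail halfway through, which is the point of doing no hardware setup there.
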
- ``port_mdb_del``: bridge layer function invoked when the bridge wants to
  remove a multicast database entry. The switch hardware should be programmed
  to delete the specified MAC address from the specified VLAN ID if it was
  mapped into this port's forwarding database.

- ``port_mdb_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each MAC address known to be behind
  the given port. A switchdev object is used to carry the VID and MDB info.

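The dump callbacks described above all follow the same pattern: the driver
walks its table and invokes the supplied callback once per entry that belongs
to the given port. The sketch below shows that pattern in isolation. The
callback type, the entry structure and the ``model_*`` names are hypothetical
and much simpler than the switchdev objects the real callbacks carry.

```c
#include <stdint.h>

struct model_fdb_entry {
    uint8_t addr[6];
    uint16_t vid;
    int port;
};

/* Hypothetical callback type; a non-zero return aborts the dump. */
typedef int (*model_dump_cb)(const struct model_fdb_entry *e, void *ctx);

/* Call cb for every table entry known to be behind the given port. */
static int model_port_fdb_dump(const struct model_fdb_entry *table, int n,
                               int port, model_dump_cb cb, void *ctx)
{
    for (int i = 0; i < n; i++) {
        int err;

        if (table[i].port != port)
            continue;
        err = cb(&table[i], ctx);
        if (err)
            return err;
    }
    return 0;
}

/* Example callback: count the entries that were reported. */
static int model_count_cb(const struct model_fdb_entry *e, void *ctx)
{
    (void)e;
    (*(int *)ctx)++;
    return 0;
}
```

Passing an opaque ``ctx`` pointer lets the caller decide what to do with each
entry, so the same walk serves counting, printing or filling a reply message.
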
TODO
====

Making SWITCHDEV and DSA converge towards a unified codebase
------------------------------------------------------------

SWITCHDEV properly takes care of abstracting the networking stack with
offload-capable hardware, but does not enforce a strict switch device driver
model. On the other hand, DSA enforces a fairly strict device driver model, and
deals with most of the switch-specific details. At some point we should
envision a merger between these two subsystems and get the best of both worlds.

Other low-hanging fruit
-----------------------

- making the number of ports fully dynamic and not dependent on ``DSA_MAX_PORTS``
- allowing more than one CPU/management interface:
  http://comments.gmane.org/gmane.linux.network/365657
- porting more drivers from other vendors:
  http://comments.gmane.org/gmane.linux.network/365510