============
Architecture
============

This document describes the **Distributed Switch Architecture (DSA)** subsystem
design principles, limitations, interactions with other subsystems, and how to
develop drivers for this subsystem as well as a TODO for developers interested
in joining the effort.

Design principles
=================

The Distributed Switch Architecture is a subsystem which was primarily designed
to support Marvell Ethernet switches (MV88E6xxx, a.k.a. Linkstreet product line)
using Linux, but has since evolved to support other vendors as well.

The original philosophy behind this design was to allow unmodified Linux tools
such as bridge, iproute2 and ifconfig to work transparently whether they
configure or query a switch port network device or a regular network device.

An Ethernet switch is typically comprised of multiple front-panel ports, and one
or more CPU or management ports. The DSA subsystem currently relies on the
presence of a management port connected to an Ethernet controller capable of
receiving Ethernet frames from the switch. This is a very common setup for all
kinds of Ethernet switches found in Small Home and Office products: routers,
gateways, or even top-of-rack switches. This host Ethernet controller will
be later referred to as "master" and "cpu" in DSA terminology and code.

The D in DSA stands for Distributed, because the subsystem has been designed
with the ability to configure and manage cascaded switches on top of each other
using upstream and downstream Ethernet links between switches. These specific
ports are referred to as "dsa" ports in DSA terminology and code. A collection
of multiple switches connected to each other is called a "switch tree".

For each front-panel port, DSA will create specialized network devices which are
used as controlling and data-flowing endpoints for use by the Linux networking
stack. These specialized network interfaces are referred to as "slave" network
interfaces in DSA terminology and code.

The ideal case for using DSA is when an Ethernet switch supports a "switch tag",
a hardware feature making the switch insert a specific tag into each Ethernet
frame it receives from or sends to specific ports, to help the management
interface figure out:

- what port is this frame coming from
- what was the reason why this frame got forwarded
- how to send CPU originated traffic to specific ports

The subsystem does support switches not capable of inserting/stripping tags, but
the features might be slightly limited in that case (traffic separation relies
on Port-based VLAN IDs).
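
As a rough illustration of what a switch tag carries, the sketch below models a
purely hypothetical 4-byte tag placed in front of the frame. The layout, the
``demo_`` names and the reason codes are assumptions made for this example
only; they do not correspond to any of the vendor formats implemented in
``net/dsa/tag_*.c``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag layout, for illustration only (not a real vendor format):
 *   byte 0: magic value identifying the tag
 *   byte 1: reason code (why the switch sent the frame to the CPU)
 *   byte 2: source port (switch -> CPU) or destination port (CPU -> switch)
 *   byte 3: reserved, always zero
 */
#define DEMO_TAG_LEN   4
#define DEMO_TAG_MAGIC 0xA5

enum demo_reason { DEMO_REASON_FORWARD = 0, DEMO_REASON_TRAP = 1 };

/* CPU -> switch: prepend a tag asking the switch to send the frame out of
 * 'port'. 'out' must have room for len + DEMO_TAG_LEN bytes. Returns the
 * length of the tagged frame. */
static size_t demo_tag_xmit(const uint8_t *frame, size_t len, uint8_t port,
                            uint8_t *out)
{
        assert(frame != NULL && out != NULL);
        out[0] = DEMO_TAG_MAGIC;
        out[1] = DEMO_REASON_FORWARD;
        out[2] = port;
        out[3] = 0;
        memcpy(out + DEMO_TAG_LEN, frame, len);
        return len + DEMO_TAG_LEN;
}

/* Switch -> CPU: validate and strip the tag. On success, returns 0 and
 * reports the originating port, the reason code and the untagged payload.
 * Returns -1 if the buffer does not start with a valid tag. */
static int demo_tag_rcv(const uint8_t *buf, size_t len, uint8_t *port,
                        uint8_t *reason, const uint8_t **payload,
                        size_t *payload_len)
{
        if (len < DEMO_TAG_LEN || buf[0] != DEMO_TAG_MAGIC)
                return -1;
        *reason = buf[1];
        *port = buf[2];
        *payload = buf + DEMO_TAG_LEN;
        *payload_len = len - DEMO_TAG_LEN;
        return 0;
}
```

The two helpers answer the three questions listed above: the receive side
recovers the originating port and the reason code, and the transmit side
encodes the destination port for CPU originated traffic.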

Note that DSA does not currently create network interfaces for the "cpu" and
"dsa" ports because:

- the "cpu" port is the Ethernet switch facing side of the management
  controller, and as such, would create a duplication of feature, since you
  would get two interfaces for the same conduit: master netdev, and "cpu" netdev

- the "dsa" port(s) are just conduits between two or more switches, and as such
  cannot really be used as proper network interfaces either, only the
  downstream, or the top-most upstream interface makes sense with that model

Switch tagging protocols
------------------------

DSA currently supports 5 different tagging protocols, and a tag-less mode as
well.
The different protocols are implemented in:

- ``net/dsa/tag_trailer.c``: Marvell's 4-byte trailer tag mode (legacy)
- ``net/dsa/tag_dsa.c``: Marvell's original DSA tag
- ``net/dsa/tag_edsa.c``: Marvell's enhanced DSA tag
- ``net/dsa/tag_brcm.c``: Broadcom's 4-byte tag
- ``net/dsa/tag_qca.c``: Qualcomm's 2-byte tag

The exact format of the tag protocol is vendor specific, but in general, they
all contain something which:

- identifies which port the Ethernet frame came from/should be sent to
- provides a reason why this frame was forwarded to the management interface

Master network devices
----------------------

Master network devices are regular, unmodified Linux network device drivers for
the CPU/management Ethernet interface. Such a driver might occasionally need to
know whether DSA is enabled (e.g.: to enable/disable specific offload features),
but the DSA subsystem has been proven to work with industry standard drivers:
``e1000e``, ``mv643xx_eth`` etc. without having to introduce modifications to these
drivers. Such network devices are also often referred to as conduit network
devices since they act as a pipe between the host processor and the hardware
Ethernet switch.

Networking stack hooks
----------------------

When a master netdev is used with DSA, a small hook is placed in the
networking stack in order to have the DSA subsystem process the Ethernet
switch specific tagging protocol. DSA accomplishes this by registering a
specific (and fake) Ethernet type (later becoming ``skb->protocol``) with the
networking stack; this is also known as a ``ptype`` or ``packet_type``. A typical
Ethernet frame receive sequence looks like this:

Master network device (e.g.: e1000e):

1. Receive interrupt fires:

   - receive function is invoked
   - basic packet processing is done: getting length, status etc.
   - packet is prepared to be processed by the Ethernet layer by calling
     ``eth_type_trans``

2. net/ethernet/eth.c::

          eth_type_trans(skb, dev)
                  if (dev->dsa_ptr != NULL)
                          -> skb->protocol = ETH_P_XDSA

3. drivers/net/ethernet/\*::

          netif_receive_skb(skb)
                  -> iterate over registered packet_type
                          -> invoke handler for ETH_P_XDSA, calls dsa_switch_rcv()

4. net/dsa/dsa.c::

          -> dsa_switch_rcv()
                  -> invoke switch tag specific protocol handler in 'net/dsa/tag_*.c'

5. net/dsa/tag_*.c:

   - inspect and strip switch tag protocol to determine originating port
   - locate per-port network device
   - invoke ``eth_type_trans()`` with the DSA slave network device
   - invoke ``netif_receive_skb()``

Past this point, the DSA slave network devices get delivered regular Ethernet
frames that can be processed by the networking stack.

Slave network devices
---------------------

Slave network devices created by DSA are stacked on top of their master network
device. Each of these network interfaces is responsible for being a
controlling and data-flowing end-point for one front-panel port of the switch.
These interfaces are specialized in order to:

- insert/remove the switch tag protocol (if it exists) when sending traffic
  to/from specific switch ports
- query the switch for ethtool operations: statistics, link state,
  Wake-on-LAN, register dumps...
- external/internal PHY management: link, auto-negotiation etc.

These slave network devices have custom net_device_ops and ethtool_ops function
pointers which allow DSA to introduce a level of layering between the networking
stack/ethtool, and the switch driver implementation.
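
The layering mentioned above can be pictured with a small user-space model. All
``demo_`` types and functions are hypothetical stand-ins, not the kernel's
``net_device_ops``, ``ethtool_ops`` or ``dsa_switch_ops`` definitions: the
per-port device only knows its switch and port index, and forwards a statistics
query to whatever callback the switch driver registered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_switch;

/* Hypothetical, trimmed-down stand-in for the switch driver callbacks. */
struct demo_switch_ops {
        /* Fill 'data' with the counters of 'port'; return how many were written. */
        int (*get_port_stats)(struct demo_switch *ds, int port,
                              uint64_t *data, int max);
};

struct demo_switch {
        const struct demo_switch_ops *ops;
        void *priv;     /* driver private context */
};

/* Hypothetical stand-in for a per-port ("slave") network device. */
struct demo_slave {
        struct demo_switch *ds;
        int port;
        uint64_t rx_packets;    /* software counters kept by the netdev */
        uint64_t tx_packets;
};

/* Generic layer: report the netdev counters first, then let the switch
 * driver append its hardware counters if it implements the callback. */
static int demo_slave_get_stats(struct demo_slave *p, uint64_t *data, int max)
{
        int n = 0;

        assert(p != NULL && data != NULL && max >= 2);
        data[n++] = p->rx_packets;
        data[n++] = p->tx_packets;
        if (p->ds->ops->get_port_stats)
                n += p->ds->ops->get_port_stats(p->ds, p->port,
                                                data + n, max - n);
        return n;
}

/* Example "driver": one fake hardware counter per port. */
static int demo_drv_get_port_stats(struct demo_switch *ds, int port,
                                   uint64_t *data, int max)
{
        const uint64_t *hw_counters = ds->priv;

        if (max < 1)
                return 0;
        data[0] = hw_counters[port];
        return 1;
}

static const struct demo_switch_ops demo_drv_ops = {
        .get_port_stats = demo_drv_get_port_stats,
};
```

The generic function never touches hardware itself, which is the point of the
indirection: a different switch driver only has to supply a different
``demo_switch_ops`` instance.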

Upon frame transmission from these slave network devices, DSA will look up which
switch tagging protocol is currently registered with these network devices, and
invoke a specific transmit routine which takes care of adding the relevant
switch tag in the Ethernet frames.

These frames are then queued for transmission using the master network device
``ndo_start_xmit()`` function. Since they contain the appropriate switch tag, the
Ethernet switch will be able to process these incoming frames from the
management interface and deliver them to the physical switch port.

Graphical representation
------------------------

Summarized, this is basically what DSA looks like from a network device
perspective::


                |---------------------------
                | CPU network device (eth0)|
                ----------------------------
                | <tag added by switch     |
                |                          |
                |                          |
                |        tag added by CPU> |
        |--------------------------------------------|
        | Switch driver                              |
        |--------------------------------------------|
                    ||         ||         ||
                |-------|  |-------|  |-------|
                | sw0p0 |  | sw0p1 |  | sw0p2 |
                |-------|  |-------|  |-------|



Slave MDIO bus
--------------

In order to be able to read from/write to a switch PHY built into
it, DSA creates a
slave MDIO bus which allows a specific switch driver to divert and intercept
MDIO reads/writes towards specific PHY addresses. In most MDIO-connected
switches, these functions would utilize direct or indirect PHY addressing mode
to return standard MII registers from the switch builtin PHYs, allowing the PHY
library to return link status, link partner pages, auto-negotiation
results etc.

For Ethernet switches which have both external and internal MDIO busses, the
slave MII bus can be utilized to mux/demux MDIO reads and writes towards either
internal or external MDIO devices this switch might be connected to: internal
PHYs, external PHYs, or even external switches.

Data structures
---------------

DSA data structures are defined in ``include/net/dsa.h`` as well as
``net/dsa/dsa_priv.h``:

- ``dsa_chip_data``: platform data configuration for a given switch device,
  this structure describes a switch device's parent device, its address, as
  well as various properties of its ports: names/labels, and finally a routing
  table indication (when cascading switches)

- ``dsa_platform_data``: platform device configuration data which can reference
  a collection of ``dsa_chip_data`` structures if multiple switches are cascaded,
  the master network device this switch tree is attached to needs to be
  referenced

- ``dsa_switch_tree``: structure assigned to the master network device under
  ``dsa_ptr``, this structure references a ``dsa_platform_data`` structure as well as
  the tagging protocol supported by the switch tree, and which receive/transmit
  function hooks should be invoked, information about the directly attached
  switch is also provided: CPU port. Finally, a collection of ``dsa_switch`` are
  referenced to address individual switches in the tree.

- ``dsa_switch``: structure describing a switch device in the tree, referencing
  a ``dsa_switch_tree`` as a backpointer, slave network devices, master network
  device, and a reference to the backing ``dsa_switch_ops``

- ``dsa_switch_ops``: structure referencing function pointers, see below for a
  full description.

Design limitations
==================

Limits on the number of devices and ports
-----------------------------------------

DSA currently limits the maximum number of switches within a tree to 4
(``DSA_MAX_SWITCHES``), and the number of ports per switch to 12 (``DSA_MAX_PORTS``).
These limits could be extended to support larger configurations should this need
arise.

Lack of CPU/DSA network devices
-------------------------------

DSA does not currently create slave network devices for the CPU or DSA ports, as
described before.
This might be an issue in the following cases:

- inability to fetch switch CPU port statistics counters using ethtool, which
  can make it harder to debug MDIO switches connected using xMII interfaces

- inability to configure the CPU port link parameters based on the Ethernet
  controller capabilities attached to it: http://patchwork.ozlabs.org/patch/509806/

- inability to configure specific VLAN IDs / trunking VLANs between switches
  when using a cascaded setup

Common pitfalls using DSA setups
--------------------------------

Once a master network device is configured to use DSA (``dev->dsa_ptr`` becomes
non-NULL), and the switch behind it expects a tagging protocol, this network
interface can only be used as a conduit interface. Sending packets
directly through this interface (e.g.: opening a socket using this interface)
will not make us go through the switch tagging protocol transmit function, so
the Ethernet switch on the other end, expecting a tag, will typically drop this
frame.

Slave network devices check that the master network device is UP before allowing
you to administratively bring UP these slave network devices. A common
configuration mistake is forgetting to bring UP the master network device first.
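
The ordering requirement described above can be modelled in a few lines. This
is a user-space sketch with hypothetical ``demo_`` types; the ``DEMO_IFF_UP``
flag and the choice of ``-ENETDOWN`` as the error value are assumptions of this
example, not a statement about the exact kernel code path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_IFF_UP 0x1u

struct demo_netdev {
        const char *name;
        unsigned int flags;
        struct demo_netdev *master;     /* NULL for the master itself */
};

/* Administratively bring an interface up. A slave refuses to come up
 * while its master is down; a master has no such dependency. */
static int demo_dev_open(struct demo_netdev *dev)
{
        assert(dev != NULL);
        if (dev->master && !(dev->master->flags & DEMO_IFF_UP))
                return -ENETDOWN;
        dev->flags |= DEMO_IFF_UP;
        return 0;
}
```

Calling ``demo_dev_open()`` on the slave first fails; bringing the master up
and retrying succeeds, which mirrors the order in which the interfaces have to
be brought up on a real system.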

Interactions with other subsystems
==================================

DSA currently leverages the following subsystems:

- MDIO/PHY library: ``drivers/net/phy/phy.c``, ``mdio_bus.c``
- Switchdev: ``net/switchdev/*``
- Device Tree for various of_* functions

MDIO/PHY library
----------------

Slave network devices exposed by DSA may or may not be interfacing with PHY
devices (``struct phy_device`` as defined in ``include/linux/phy.h``), but the DSA
subsystem deals with all possible combinations:

- internal PHY devices, built into the Ethernet switch hardware
- external PHY devices, connected via an internal or external MDIO bus
- internal PHY devices, connected via an internal MDIO bus
- special, non-autonegotiated or non MDIO-managed PHY devices: SFPs, MoCA; a.k.a.
  fixed PHYs

The PHY configuration is done by the ``dsa_slave_phy_setup()`` function and the
logic basically looks like this:

- if Device Tree is used, the PHY device is looked up using the standard
  "phy-handle" property, if found, this PHY device is created and registered
  using ``of_phy_connect()``

- if Device Tree is used, and the PHY device is "fixed", that is, conforms to
  the definition of a non-MDIO managed PHY as defined in
  ``Documentation/devicetree/bindings/net/fixed-link.txt``, the PHY is registered
  and connected transparently using the special fixed MDIO bus driver

- finally, if the PHY is built into the switch, as is very common with
  standalone switch packages, the PHY is probed using the slave MII bus created
  by DSA


SWITCHDEV
---------

DSA directly utilizes SWITCHDEV when interfacing with the bridge layer, and
more specifically with its VLAN filtering portion when configuring VLANs on top
of per-port slave network devices. Since DSA primarily deals with
MDIO-connected switches, although not exclusively, SWITCHDEV's
prepare/abort/commit phases are often simplified into a prepare phase which
checks whether the operation is supported by the DSA switch driver, and a commit
phase which applies the changes.

As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN
objects.

Device Tree
-----------

DSA features a standardized binding which is documented in
``Documentation/devicetree/bindings/net/dsa/dsa.txt``. PHY/MDIO library helper
functions such as ``of_get_phy_mode()``, ``of_phy_connect()`` are also used to query
per-port PHY specific details: interface connection, MDIO bus location etc.
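
The simplified prepare/commit handling described in the SWITCHDEV section can
be sketched as follows. The ``demo_`` names, the table size and the VLAN ID
limit of the imaginary hardware are all hypothetical; the point is only that
the prepare step validates a request without touching any state, and the commit
step applies a change that has already been validated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HW_MAX_VID 255     /* hypothetical: hardware handles VIDs 0..255 */
#define DEMO_HW_VLANS   4       /* hypothetical hardware table size */

struct demo_port {
        uint16_t vids[DEMO_HW_VLANS];
        int num_vids;
};

/* Prepare phase: check whether the request can be honoured. Must not
 * modify any state. */
static int demo_vlan_prepare(const struct demo_port *p, uint16_t vid)
{
        assert(p != NULL);
        if (vid > DEMO_HW_MAX_VID)
                return -EOPNOTSUPP;     /* caller may fall back to software */
        if (p->num_vids >= DEMO_HW_VLANS)
                return -ENOSPC;         /* hardware table is full */
        return 0;
}

/* Commit phase: apply a change that passed the prepare phase. */
static void demo_vlan_commit(struct demo_port *p, uint16_t vid)
{
        assert(p != NULL && p->num_vids < DEMO_HW_VLANS);
        p->vids[p->num_vids++] = vid;
}

/* Generic layer driving both phases for one request. */
static int demo_vlan_add(struct demo_port *p, uint16_t vid)
{
        int err = demo_vlan_prepare(p, vid);

        if (err)
                return err;
        demo_vlan_commit(p, vid);
        return 0;
}
```

Because every failure is reported from the prepare step, the commit step has
no error path, which is what makes collapsing the three SWITCHDEV phases into
two workable in this model.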

Driver development
==================

DSA switch drivers need to implement a ``dsa_switch_ops`` structure which will
contain the various members described below.

``register_switch_driver()`` registers this ``dsa_switch_ops`` in its internal list
of drivers to probe for. ``unregister_switch_driver()`` does the exact opposite.

Unless requested differently by setting the ``priv_size`` member accordingly, DSA
does not allocate any driver private context space.

Switch configuration
--------------------

- ``tag_protocol``: this is to indicate what kind of tagging protocol is supported,
  should be a valid value from the ``dsa_tag_protocol`` enum

- ``probe``: probe routine which will be invoked by the DSA platform device upon
  registration to test for the presence/absence of a switch device. For MDIO
  devices, it is recommended to issue a read towards internal registers using
  the switch pseudo-PHY and return whether this is a supported device. For other
  buses, return a non-NULL string

- ``setup``: setup function for the switch, this function is responsible for setting
  up the ``dsa_switch_ops`` private structure with all it needs: register maps,
  interrupts, mutexes, locks etc. This function is also expected to properly
  configure the switch to separate all network interfaces from each other, that
  is, they should be isolated by the switch hardware itself, typically by creating
  a Port-based VLAN ID for each port and allowing only the CPU port and the
  specific port to be in the forwarding vector. Ports that are unused by the
  platform should be disabled. Past this function, the switch is expected to be
  fully configured and ready to serve any kind of request. It is recommended
  to issue a software reset of the switch during this setup function in order to
  avoid relying on what a previous software agent such as a bootloader/firmware
  may have previously configured.

PHY devices and link management
-------------------------------

- ``get_phy_flags``: Some switches are interfaced to various kinds of Ethernet PHYs,
  if the PHY library PHY driver needs to know about information it cannot obtain
  on its own (e.g.: coming from switch memory mapped registers), this function
  should return a 32-bit bitmask of "flags" that is private between the switch
  driver and the Ethernet PHY driver in ``drivers/net/phy/*``.

- ``phy_read``: Function invoked by the DSA slave MDIO bus when attempting to read
  the switch port MDIO registers. If unavailable, return 0xffff for each read.
  For builtin switch Ethernet PHYs, this function should allow reading the link
  status, auto-negotiation results, link partner pages etc.
391*4882a593Smuzhiyun 392*4882a593Smuzhiyun- ``phy_write``: Function invoked by the DSA slave MDIO bus when attempting to write 393*4882a593Smuzhiyun to the switch port MDIO registers. If unavailable return a negative error 394*4882a593Smuzhiyun code. 395*4882a593Smuzhiyun 396*4882a593Smuzhiyun- ``adjust_link``: Function invoked by the PHY library when a slave network device 397*4882a593Smuzhiyun is attached to a PHY device. This function is responsible for appropriately 398*4882a593Smuzhiyun configuring the switch port link parameters: speed, duplex, pause based on 399*4882a593Smuzhiyun what the ``phy_device`` is providing. 400*4882a593Smuzhiyun 401*4882a593Smuzhiyun- ``fixed_link_update``: Function invoked by the PHY library, and specifically by 402*4882a593Smuzhiyun the fixed PHY driver asking the switch driver for link parameters that could 403*4882a593Smuzhiyun not be auto-negotiated, or obtained by reading the PHY registers through MDIO. 404*4882a593Smuzhiyun This is particularly useful for specific kinds of hardware such as QSGMII, 405*4882a593Smuzhiyun MoCA or other kinds of non-MDIO managed PHYs where out of band link 406*4882a593Smuzhiyun information is obtained 407*4882a593Smuzhiyun 408*4882a593SmuzhiyunEthtool operations 409*4882a593Smuzhiyun------------------ 410*4882a593Smuzhiyun 411*4882a593Smuzhiyun- ``get_strings``: ethtool function used to query the driver's strings, will 412*4882a593Smuzhiyun typically return statistics strings, private flags strings etc. 413*4882a593Smuzhiyun 414*4882a593Smuzhiyun- ``get_ethtool_stats``: ethtool function used to query per-port statistics and 415*4882a593Smuzhiyun return their values. 
DSA overlays slave network devices general statistics: 416*4882a593Smuzhiyun RX/TX counters from the network device, with switch driver specific statistics 417*4882a593Smuzhiyun per port 418*4882a593Smuzhiyun 419*4882a593Smuzhiyun- ``get_sset_count``: ethtool function used to query the number of statistics items 420*4882a593Smuzhiyun 421*4882a593Smuzhiyun- ``get_wol``: ethtool function used to obtain Wake-on-LAN settings per-port, this 422*4882a593Smuzhiyun function may, for certain implementations also query the master network device 423*4882a593Smuzhiyun Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN 424*4882a593Smuzhiyun 425*4882a593Smuzhiyun- ``set_wol``: ethtool function used to configure Wake-on-LAN settings per-port, 426*4882a593Smuzhiyun direct counterpart to set_wol with similar restrictions 427*4882a593Smuzhiyun 428*4882a593Smuzhiyun- ``set_eee``: ethtool function which is used to configure a switch port EEE (Green 429*4882a593Smuzhiyun Ethernet) settings, can optionally invoke the PHY library to enable EEE at the 430*4882a593Smuzhiyun PHY level if relevant. 
This function should enable EEE at the switch port MAC 431*4882a593Smuzhiyun controller and data-processing logic 432*4882a593Smuzhiyun 433*4882a593Smuzhiyun- ``get_eee``: ethtool function which is used to query a switch port EEE settings, 434*4882a593Smuzhiyun this function should return the EEE state of the switch port MAC controller 435*4882a593Smuzhiyun and data-processing logic as well as query the PHY for its currently configured 436*4882a593Smuzhiyun EEE settings 437*4882a593Smuzhiyun 438*4882a593Smuzhiyun- ``get_eeprom_len``: ethtool function returning for a given switch the EEPROM 439*4882a593Smuzhiyun length/size in bytes 440*4882a593Smuzhiyun 441*4882a593Smuzhiyun- ``get_eeprom``: ethtool function returning for a given switch the EEPROM contents 442*4882a593Smuzhiyun 443*4882a593Smuzhiyun- ``set_eeprom``: ethtool function writing specified data to a given switch EEPROM 444*4882a593Smuzhiyun 445*4882a593Smuzhiyun- ``get_regs_len``: ethtool function returning the register length for a given 446*4882a593Smuzhiyun switch 447*4882a593Smuzhiyun 448*4882a593Smuzhiyun- ``get_regs``: ethtool function returning the Ethernet switch internal register 449*4882a593Smuzhiyun contents. 
This function might require user-land code in ethtool to 450*4882a593Smuzhiyun pretty-print register values and registers 451*4882a593Smuzhiyun 452*4882a593SmuzhiyunPower management 453*4882a593Smuzhiyun---------------- 454*4882a593Smuzhiyun 455*4882a593Smuzhiyun- ``suspend``: function invoked by the DSA platform device when the system goes to 456*4882a593Smuzhiyun suspend, should quiesce all Ethernet switch activities, but keep ports 457*4882a593Smuzhiyun participating in Wake-on-LAN active as well as additional wake-up logic if 458*4882a593Smuzhiyun supported 459*4882a593Smuzhiyun 460*4882a593Smuzhiyun- ``resume``: function invoked by the DSA platform device when the system resumes, 461*4882a593Smuzhiyun should resume all Ethernet switch activities and re-configure the switch to be 462*4882a593Smuzhiyun in a fully active state 463*4882a593Smuzhiyun 464*4882a593Smuzhiyun- ``port_enable``: function invoked by the DSA slave network device ndo_open 465*4882a593Smuzhiyun function when a port is administratively brought up, this function should be 466*4882a593Smuzhiyun fully enabling a given switch port. DSA takes care of marking the port with 467*4882a593Smuzhiyun ``BR_STATE_BLOCKING`` if the port is a bridge member, or ``BR_STATE_FORWARDING`` if it 468*4882a593Smuzhiyun was not, and propagating these changes down to the hardware 469*4882a593Smuzhiyun 470*4882a593Smuzhiyun- ``port_disable``: function invoked by the DSA slave network device ndo_close 471*4882a593Smuzhiyun function when a port is administratively brought down, this function should be 472*4882a593Smuzhiyun fully disabling a given switch port. 
DSA takes care of marking the port with 473*4882a593Smuzhiyun ``BR_STATE_DISABLED`` and propagating changes to the hardware if this port is 474*4882a593Smuzhiyun disabled while being a bridge member 475*4882a593Smuzhiyun 476*4882a593SmuzhiyunBridge layer 477*4882a593Smuzhiyun------------ 478*4882a593Smuzhiyun 479*4882a593Smuzhiyun- ``port_bridge_join``: bridge layer function invoked when a given switch port is 480*4882a593Smuzhiyun added to a bridge, this function should be doing the necessary at the switch 481*4882a593Smuzhiyun level to permit the joining port from being added to the relevant logical 482*4882a593Smuzhiyun domain for it to ingress/egress traffic with other members of the bridge. 483*4882a593Smuzhiyun 484*4882a593Smuzhiyun- ``port_bridge_leave``: bridge layer function invoked when a given switch port is 485*4882a593Smuzhiyun removed from a bridge, this function should be doing the necessary at the 486*4882a593Smuzhiyun switch level to deny the leaving port from ingress/egress traffic from the 487*4882a593Smuzhiyun remaining bridge members. When the port leaves the bridge, it should be aged 488*4882a593Smuzhiyun out at the switch hardware for the switch to (re) learn MAC addresses behind 489*4882a593Smuzhiyun this port. 490*4882a593Smuzhiyun 491*4882a593Smuzhiyun- ``port_stp_state_set``: bridge layer function invoked when a given switch port STP 492*4882a593Smuzhiyun state is computed by the bridge layer and should be propagated to switch 493*4882a593Smuzhiyun hardware to forward/block/learn traffic. 
  The switch driver is responsible for
  computing an STP state change based on the current and requested parameters
  and performing the relevant ageing based on the intersection results

Bridge VLAN filtering
---------------------

- ``port_vlan_filtering``: bridge layer function invoked when the bridge gets
  configured for turning on or off VLAN filtering. If nothing specific needs to
  be done at the hardware level, this callback does not need to be implemented.
  When VLAN filtering is turned on, the hardware must be programmed to
  reject 802.1Q frames which have VLAN IDs outside of the programmed allowed
  VLAN ID map/rules. If there is no PVID programmed into the switch port,
  untagged frames must be rejected as well. When turned off the switch must
  accept any 802.1Q frames irrespective of their VLAN ID, and untagged frames are
  allowed.

- ``port_vlan_prepare``: bridge layer function invoked when the bridge prepares the
  configuration of a VLAN on the given port. If the operation is not supported
  by the hardware, this function should return ``-EOPNOTSUPP`` to inform the bridge
  code to fall back to a software implementation. No hardware setup must be done
  in this function. See ``port_vlan_add`` for the hardware setup and details.

- ``port_vlan_add``: bridge layer function invoked when a VLAN is configured
  (tagged or untagged) for the given switch port

- ``port_vlan_del``: bridge layer function invoked when a VLAN is removed from the
  given switch port

- ``port_vlan_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each VLAN the given port is a member
  of. A switchdev object is used to carry the VID and bridge flags.

- ``port_fdb_add``: bridge layer function invoked when the bridge wants to install a
  Forwarding Database entry, the switch hardware should be programmed with the
  specified address in the specified VLAN ID in the forwarding database
  associated with this VLAN ID. If the operation is not supported, this
  function should return ``-EOPNOTSUPP`` to inform the bridge code to fall back to
  a software implementation.

.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
        of DSA, would be its port-based VLAN, used by the associated bridge device.
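
The ``port_fdb_add`` contract above can be illustrated with a small user-space
model. This is only a sketch, not kernel code: ``struct toy_switch``,
``toy_port_fdb_add``, the ``has_vlan_fdb`` capability flag and the table depth
are all hypothetical names invented for the example, and returning ``-ENOSPC``
on a full table is an assumption of this sketch rather than something this
document mandates.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_FDB_SIZE 4			/* hypothetical hardware table depth */

struct toy_fdb_entry {
	uint8_t addr[6];
	uint16_t vid;			/* 0 selects the port private database */
	int port;
	int used;
};

struct toy_switch {
	int has_vlan_fdb;		/* hypothetical: per-VLAN databases exist */
	struct toy_fdb_entry fdb[TOY_FDB_SIZE];
};

/* Install addr in the database selected by vid, pointing at port. */
static int toy_port_fdb_add(struct toy_switch *sw, int port,
			    const uint8_t *addr, uint16_t vid)
{
	struct toy_fdb_entry *slot = NULL;
	int i;

	/* No per-VLAN database: only VID 0 can be offloaded. */
	if (vid != 0 && !sw->has_vlan_fdb)
		return -EOPNOTSUPP;

	for (i = 0; i < TOY_FDB_SIZE; i++) {
		struct toy_fdb_entry *e = &sw->fdb[i];

		if (!e->used) {
			if (!slot)
				slot = e;	/* remember first free slot */
			continue;
		}
		/* Same address in the same database: update the port. */
		if (e->vid == vid && !memcmp(e->addr, addr, 6)) {
			e->port = port;
			return 0;
		}
	}

	if (!slot)
		return -ENOSPC;		/* assumption of this sketch */

	memcpy(slot->addr, addr, 6);
	slot->vid = vid;
	slot->port = port;
	slot->used = 1;
	return 0;
}
```

In this model, a caller that receives ``-EOPNOTSUPP`` would keep the entry in
software only, which mirrors the fallback behaviour described for the bridge
code.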

- ``port_fdb_del``: bridge layer function invoked when the bridge wants to remove a
  Forwarding Database entry, the switch hardware should be programmed to delete
  the specified MAC address from the specified VLAN ID if it was mapped into
  this port forwarding database

- ``port_fdb_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each MAC address known to be behind
  the given port. A switchdev object is used to carry the VID and FDB info.

- ``port_mdb_prepare``: bridge layer function invoked when the bridge prepares the
  installation of a multicast database entry. If the operation is not supported,
  this function should return ``-EOPNOTSUPP`` to inform the bridge code to fall back
  to a software implementation. No hardware setup must be done in this function.
  See ``port_mdb_add`` for the hardware setup and details.

- ``port_mdb_add``: bridge layer function invoked when the bridge wants to install
  a multicast database entry, the switch hardware should be programmed with the
  specified address in the specified VLAN ID in the forwarding database
  associated with this VLAN ID.

.. note:: VLAN ID 0 corresponds to the port private database, which, in the context
        of DSA, would be its port-based VLAN, used by the associated bridge device.
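
The split between ``port_mdb_prepare`` (validate only, no hardware setup) and
``port_mdb_add`` (program the hardware) can be sketched in user space as
follows. Everything here is hypothetical and exists only for illustration:
``struct toy_mdb``, its ``has_mdb`` capability flag, the two-entry table and
the ``toy_*`` helpers are not part of DSA, and reporting a full table with
``-ENOSPC`` is an assumption of this sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_MDB_SIZE 2			/* hypothetical hardware table depth */

struct toy_mdb_entry {
	uint8_t addr[6];
	uint16_t vid;
	uint32_t port_mask;		/* one bit per member port */
	int used;
};

struct toy_mdb {
	int has_mdb;			/* hypothetical capability flag */
	struct toy_mdb_entry e[TOY_MDB_SIZE];
};

static struct toy_mdb_entry *toy_mdb_find(struct toy_mdb *mdb,
					  const uint8_t *addr, uint16_t vid)
{
	int i;

	for (i = 0; i < TOY_MDB_SIZE; i++)
		if (mdb->e[i].used && mdb->e[i].vid == vid &&
		    !memcmp(mdb->e[i].addr, addr, 6))
			return &mdb->e[i];
	return NULL;
}

/* Prepare phase: check feasibility, never modify the table. */
static int toy_port_mdb_prepare(struct toy_mdb *mdb, const uint8_t *addr,
				uint16_t vid)
{
	int i;

	if (!mdb->has_mdb)
		return -EOPNOTSUPP;	/* caller falls back to software */
	if (toy_mdb_find(mdb, addr, vid))
		return 0;		/* entry exists, only a port bit is added */
	for (i = 0; i < TOY_MDB_SIZE; i++)
		if (!mdb->e[i].used)
			return 0;
	return -ENOSPC;			/* assumption of this sketch */
}

/* Commit phase: expected to be called only after a successful prepare. */
static void toy_port_mdb_add(struct toy_mdb *mdb, int port,
			     const uint8_t *addr, uint16_t vid)
{
	struct toy_mdb_entry *e = toy_mdb_find(mdb, addr, vid);
	int i;

	if (!e) {
		for (i = 0; i < TOY_MDB_SIZE && !e; i++)
			if (!mdb->e[i].used)
				e = &mdb->e[i];
		if (!e)
			return;		/* prepare was skipped or ignored */
		memcpy(e->addr, addr, 6);
		e->vid = vid;
		e->port_mask = 0;
		e->used = 1;
	}
	e->port_mask |= 1u << port;
}
```

Keeping every failure path in the prepare step means the commit step has
nothing left to report, which is why ``toy_port_mdb_add`` returns ``void`` in
this model.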

- ``port_mdb_del``: bridge layer function invoked when the bridge wants to remove a
  multicast database entry, the switch hardware should be programmed to delete
  the specified MAC address from the specified VLAN ID if it was mapped into
  this port forwarding database.

- ``port_mdb_dump``: bridge layer function invoked with a switchdev callback
  function that the driver has to call for each MAC address known to be behind
  the given port. A switchdev object is used to carry the VID and MDB info.

TODO
====

Making SWITCHDEV and DSA converge towards a unified codebase
-------------------------------------------------------------

SWITCHDEV properly takes care of abstracting the networking stack with offload
capable hardware, but does not enforce a strict switch device driver model. On
the other hand, DSA enforces a fairly strict device driver model, and deals with
most of the switch specifics. At some point we should envision a merger between
these two subsystems and get the best of both worlds.

Other low-hanging fruit
-----------------------

- making the number of ports fully dynamic and not dependent on ``DSA_MAX_PORTS``
- allowing more than one CPU/management interface:
  http://comments.gmane.org/gmane.linux.network/365657
- porting more drivers from other vendors:
  http://comments.gmane.org/gmane.linux.network/365510