======================================================
Net DIM - Generic Network Dynamic Interrupt Moderation
======================================================

:Author: Tal Gilboa <talgi@mellanox.com>

.. contents:: :depth: 2

Assumptions
===========

This document assumes the reader has basic knowledge of network drivers
and of interrupt moderation in general.


Introduction
============

Dynamic Interrupt Moderation (DIM), in networking, refers to changing the
interrupt moderation configuration of a channel in order to optimize packet
processing. The mechanism includes an algorithm which decides if and how to
change moderation parameters for a channel, usually by analysing runtime data
sampled from the system. Net DIM is such a mechanism. In each iteration of the
algorithm, it analyses a given data sample, compares it to the previous sample
and, if required, changes some of the interrupt moderation configuration
fields. The data sample is composed of data bandwidth, the number of packets
and the number of events. The time between samples is also measured. Net DIM
compares the current and the previous data and returns an adjusted interrupt
moderation configuration object. In some cases, the algorithm might decide not
to change anything. The configuration fields are the minimum duration (in
microseconds) allowed between events and the maximum number of wanted packets
per event. The Net DIM algorithm gives priority to increasing bandwidth over
reducing the interrupt rate.


Net DIM Algorithm
=================

Each iteration of the Net DIM algorithm follows these steps:

#. Calculate a new data sample.
#. Compare it to the previous sample.
#. Make a decision: suggest interrupt moderation configuration fields.
#. Schedule a work function, which applies the suggested configuration.

The first two steps are straightforward: both the new and the previous data are
supplied by the driver registered to Net DIM. The previous data is the new data
supplied to the previous iteration. The comparison step checks the difference
between the new and previous data and decides on the result of the last step
taken by the algorithm. A step is rated "better" if bandwidth increases and
"worse" if bandwidth decreases. If there is no change in bandwidth, the packet
rate is compared in the same way: an increase is "better" and a decrease is
"worse". If the packet rate is unchanged as well, the interrupt rate is
compared. Here the algorithm optimizes for a lower interrupt rate, so an
increase in the interrupt rate is considered "worse" and a decrease is
considered "better". Step #2 includes an optimization for avoiding false
results: it only considers a difference between samples valid if it is greater
than a certain percentage. Also, since Net DIM does not measure anything by
itself, it assumes the data provided by the driver is valid.

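The comparison in step #2 can be sketched in plain C as follows. This is an
illustrative userspace model, not the kernel implementation: ``struct rates``,
``is_significant()``, ``compare_rates()`` and the 10% noise threshold are all
assumptions made for the example.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical per-interval rates; the names are illustrative only. */
  struct rates {
	uint64_t bytes_per_ms;   /* bandwidth */
	uint64_t packets_per_ms; /* packet rate */
	uint64_t events_per_ms;  /* interrupt rate */
  };

  enum cmp_result { CMP_SAME, CMP_BETTER, CMP_WORSE };

  /* Assumed threshold: differences of 10% or less are treated as noise. */
  #define SIGNIFICANT_PCT 10

  /* Return non-zero if cur differs from prev by more than the threshold. */
  int is_significant(uint64_t cur, uint64_t prev)
  {
	uint64_t diff = cur > prev ? cur - prev : prev - cur;

	if (prev == 0)
		return cur != 0;
	return (100 * diff) / prev > SIGNIFICANT_PCT;
  }

  /* Bandwidth first, then packet rate, then interrupt rate. */
  enum cmp_result compare_rates(const struct rates *cur,
				const struct rates *prev)
  {
	if (is_significant(cur->bytes_per_ms, prev->bytes_per_ms))
		return cur->bytes_per_ms > prev->bytes_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	if (is_significant(cur->packets_per_ms, prev->packets_per_ms))
		return cur->packets_per_ms > prev->packets_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	/* For interrupts, fewer is better. */
	if (is_significant(cur->events_per_ms, prev->events_per_ms))
		return cur->events_per_ms < prev->events_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	return CMP_SAME;
  }

In this sketch the interrupt rate only matters when both bandwidth and packet
rate are within the noise threshold, which mirrors the priority order described
above.
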
Step #3 decides on the suggested configuration based on the result from step #2
and the internal state of the algorithm. The states reflect the "direction" of
the algorithm: is it going left (reducing moderation), right (increasing
moderation) or standing still. Another optimization is that if a decision
to stay still is made multiple times, the interval between iterations of the
algorithm increases in order to reduce calculation overhead. Also, after
"parking" on the leftmost or rightmost decision, the algorithm may decide to
verify this decision by taking a step in the other direction. This is done in
order to avoid getting stuck in a "deep sleep" scenario. Once a decision is
made, an interrupt moderation configuration is selected from the predefined
profiles.

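A heavily simplified model of the direction logic in step #3 might look like
the sketch below. It is not the kernel's decision function: the names, the
profile table size (``NUM_PROFILES``) and the rule that parking happens only at
the two edges of the table are assumptions, and the lengthening of the interval
between iterations is left out entirely.

.. code-block:: c

  enum direction { GOING_LEFT, GOING_RIGHT, PARKED };
  enum cmp_result { CMP_SAME, CMP_BETTER, CMP_WORSE };

  #define NUM_PROFILES 5 /* assumed size of the predefined profile table */

  struct moder_state {
	enum direction dir;
	int profile_ix; /* 0 = least moderation, NUM_PROFILES - 1 = most */
  };

  /* Apply one decision; return 1 if the selected profile changed. */
  int decide(struct moder_state *st, enum cmp_result res)
  {
	int prev_ix = st->profile_ix;

	if (st->dir == PARKED) {
		/* Stay parked until the traffic pattern changes. */
		if (res == CMP_SAME)
			return 0;
		/* Verify the parked decision by stepping away from the edge. */
		st->dir = st->profile_ix == 0 ? GOING_RIGHT : GOING_LEFT;
	} else if (res != CMP_BETTER) {
		/* The last step did not help, so turn around. */
		st->dir = st->dir == GOING_LEFT ? GOING_RIGHT : GOING_LEFT;
	}

	if (st->dir == GOING_LEFT) {
		if (st->profile_ix == 0)
			st->dir = PARKED; /* leftmost profile reached */
		else
			st->profile_ix--;
	} else {
		if (st->profile_ix == NUM_PROFILES - 1)
			st->dir = PARKED; /* rightmost profile reached */
		else
			st->profile_ix++;
	}

	return st->profile_ix != prev_ix;
  }

Starting from ``{ GOING_RIGHT, 2 }``, a "better" result moves the index to 3,
and a following "worse" result turns the model around and moves it back to 2.
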
The last step is to notify the registered driver that it should apply the
suggested configuration. This is done by scheduling a work function, defined by
the Net DIM API and provided by the registered driver.

As you can see, Net DIM itself does not actively interact with the system. It
would have trouble making the correct decisions if the wrong data were supplied
to it, and it would be useless if the work function did not apply the suggested
configuration. This does, however, allow the registered driver some room for
manoeuvre, as it may provide partial data or ignore the algorithm's suggestion
under some conditions.


Registering a Network Device to DIM
===================================

The Net DIM API exposes the main function net_dim(). This function is the entry
point to the Net DIM algorithm and has to be called every time the driver would
like to check if it should change interrupt moderation parameters. The driver
should provide two data structures: :c:type:`struct dim <dim>` and
:c:type:`struct dim_sample <dim_sample>`. :c:type:`struct dim <dim>`
describes the state of DIM for a specific object (RX queue, TX queue,
other queues, etc.). This includes the currently selected profile, previous
data samples, the callback function provided by the driver and more.
:c:type:`struct dim_sample <dim_sample>` describes a data sample,
which will be compared to the data sample stored in :c:type:`struct dim <dim>`
in order to decide on the algorithm's next step. The sample should include
bytes, packets and interrupts, measured by the driver.

In order to use Net DIM from a networking driver, the driver needs to call the
main net_dim() function. The recommended method is to call net_dim() on each
interrupt. Since Net DIM has built-in moderation and might decide to skip
iterations under certain conditions, there is no need to moderate the net_dim()
calls as well. As mentioned above, the driver needs to provide an object of type
:c:type:`struct dim <dim>` to the net_dim() function call. It is advised for
each entity using Net DIM to hold a :c:type:`struct dim <dim>` as part of its
data structure and use it as the main Net DIM API object.
The :c:type:`struct dim_sample <dim_sample>` should hold the latest
bytes, packets and interrupts count. There is no need to perform any
calculations; just include the raw data.

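To illustrate why raw counters are sufficient, the sketch below shows one way
two consecutive raw samples could be turned into rates. It is a userspace
model under stated assumptions: ``struct raw_sample``, ``struct rates`` and
``compute_rates()`` are hypothetical names, the event counter is assumed to be
16 bits wide so that wraparound can be shown, and rates are expressed per
millisecond.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical raw sample: cumulative counters plus a timestamp. */
  struct raw_sample {
	uint64_t time_us;
	uint64_t bytes;
	uint64_t packets;
	uint16_t events; /* assumed to be a free-running 16-bit counter */
  };

  struct rates {
	uint64_t bytes_per_ms;
	uint64_t packets_per_ms;
	uint64_t events_per_ms;
  };

  /* Fill *out from two raw samples; return -1 if no time has elapsed. */
  int compute_rates(const struct raw_sample *start,
		    const struct raw_sample *end, struct rates *out)
  {
	uint64_t delta_us = end->time_us - start->time_us;
	uint64_t nbytes = end->bytes - start->bytes;
	uint64_t npackets = end->packets - start->packets;
	/* The cast keeps the difference correct across counter wraparound. */
	uint16_t nevents = (uint16_t)(end->events - start->events);

	if (delta_us == 0)
		return -1;

	out->bytes_per_ms = nbytes * 1000 / delta_us;
	out->packets_per_ms = npackets * 1000 / delta_us;
	out->events_per_ms = (uint64_t)nevents * 1000 / delta_us;
	return 0;
  }

Because only differences between samples are used, the driver can keep
incrementing its counters without ever resetting them.
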
The net_dim() call itself does not return anything. Instead, Net DIM relies on
the driver to provide a callback function, which is called when the algorithm
decides to change the interrupt moderation parameters. This callback is
scheduled and run in a separate thread in order not to add overhead to the data
flow. After the work is done, the Net DIM algorithm needs to be set to the
proper state in order to move to the next iteration.

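The hand-off between the per-interrupt call and the deferred callback can be
pictured as a small state machine. The following userspace model is only a
sketch: the ``MODEL_*`` states, ``struct dim_model``, both functions and the
``EVENTS_PER_ITERATION`` value are assumptions chosen for illustration, and no
real work queue is involved.

.. code-block:: c

  enum model_state {
	MODEL_START_MEASURE,
	MODEL_MEASURE_IN_PROGRESS,
	MODEL_APPLY_NEW_PROFILE,
  };

  /* Assumed number of events that make up one iteration. */
  #define EVENTS_PER_ITERATION 4

  struct dim_model {
	enum model_state state;
	int events_seen;     /* events since the measurement started */
	int pending_profile; /* suggestion waiting to be applied */
	int applied_profile; /* profile currently in effect */
  };

  /* Called on every interrupt; return 1 when deferred work is requested. */
  int model_on_interrupt(struct dim_model *m, int suggested_profile)
  {
	switch (m->state) {
	case MODEL_START_MEASURE:
		m->events_seen = 0;
		m->state = MODEL_MEASURE_IN_PROGRESS;
		return 0;
	case MODEL_MEASURE_IN_PROGRESS:
		if (++m->events_seen < EVENTS_PER_ITERATION)
			return 0; /* not enough data yet, skip */
		if (suggested_profile != m->applied_profile) {
			m->pending_profile = suggested_profile;
			m->state = MODEL_APPLY_NEW_PROFILE;
			return 1;
		}
		m->state = MODEL_START_MEASURE;
		return 0;
	case MODEL_APPLY_NEW_PROFILE:
		return 0; /* waiting for the deferred work to finish */
	}
	return 0;
  }

  /* Stands in for the driver callback: apply, then restart measuring. */
  void model_work_done(struct dim_model *m)
  {
	m->applied_profile = m->pending_profile;
	m->state = MODEL_START_MEASURE;
  }

While the model sits in ``MODEL_APPLY_NEW_PROFILE``, further interrupts are
ignored, which is why the callback has to reset the state when it finishes.
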

Example
=======

The following code demonstrates how to register a driver to Net DIM. The
example is not complete, but it should make the outline of the usage clear.

.. code-block:: c

  #include <linux/dim.h>

  /* Callback for net DIM to schedule on a decision to change moderation */
  void my_driver_do_dim_work(struct work_struct *work)
  {
	/* Get struct dim from struct work_struct */
	struct dim *dim = container_of(work, struct dim, work);

	/* Do interrupt moderation related stuff */
	...

	/* Signal net DIM work is done and it should move to next iteration */
	dim->state = DIM_START_MEASURE;
  }

  /* My driver's interrupt handler */
  int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
  {
	...
	/* A struct to hold current measured data */
	struct dim_sample dim_sample;
	...
	/* Initialize data sample struct with current data */
	dim_update_sample(my_entity->events,
			  my_entity->packets,
			  my_entity->bytes,
			  &dim_sample);
	/* Call net DIM */
	net_dim(&my_entity->dim, dim_sample);
	...
  }

  /* My entity's initialization function (my_entity was already allocated) */
  int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
  {
	...
	/* Initialize struct work_struct with my driver's callback function */
	INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
	...
  }

Dynamic Interrupt Moderation (DIM) library API
==============================================

.. kernel-doc:: include/linux/dim.h
    :internal: