.. SPDX-License-Identifier: GPL-2.0

=======================================
The padata parallel execution mechanism
=======================================

:Date: May 2020

Padata is a mechanism by which the kernel can farm jobs out to be done in
parallel on multiple CPUs while optionally retaining their ordering.

It was originally developed for IPsec, which needs to perform encryption and
decryption on large numbers of packets without reordering those packets.  This
is currently the sole consumer of padata's serialized job support.

Padata also supports multithreaded jobs, splitting up the job evenly while load
balancing and coordinating between threads.

Running Serialized Jobs
=======================

Initializing
------------

The first step in using padata to run serialized jobs is to set up a
padata_instance structure for overall control of how jobs are to be run::

    #include <linux/padata.h>

    struct padata_instance *padata_alloc(const char *name);

'name' simply identifies the instance.

Then, complete padata initialization by allocating a padata_shell::

    struct padata_shell *padata_alloc_shell(struct padata_instance *pinst);

A padata_shell is used to submit a job to padata and allows a series of such
jobs to be serialized independently.  A padata_instance may have one or more
padata_shells associated with it, each allowing a separate series of jobs.

Modifying cpumasks
------------------

The CPUs used to run jobs can be changed in two ways, programmatically with
padata_set_cpumask() or via sysfs.  The former is defined::

    int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
                           cpumask_var_t cpumask);

Here cpumask_type is one of PADATA_CPU_PARALLEL or PADATA_CPU_SERIAL, where a
parallel cpumask describes which processors will be used to execute jobs
submitted to this instance in parallel and a serial cpumask defines which
processors are allowed to be used as the serialization callback processor.
cpumask specifies the new cpumask to use.

There may be sysfs files for an instance's cpumasks.  For example, pcrypt's
live in /sys/kernel/pcrypt/<instance-name>.  Within an instance's directory
there are two files, parallel_cpumask and serial_cpumask, and either cpumask
may be changed by echoing a bitmask into the file, for example::

    echo f > /sys/kernel/pcrypt/pencrypt/parallel_cpumask

Reading one of these files shows the user-supplied cpumask, which may be
different from the 'usable' cpumask.

Padata maintains two pairs of cpumasks internally, the user-supplied cpumasks
and the 'usable' cpumasks.  (Each pair consists of a parallel and a serial
cpumask.)  The user-supplied cpumasks default to all possible CPUs on instance
allocation and may be changed as above.  The usable cpumasks are always a
subset of the user-supplied cpumasks and contain only the online CPUs in the
user-supplied masks; these are the cpumasks padata actually uses.  So it is
legal to supply a cpumask to padata that contains offline CPUs.  Once an
offline CPU in the user-supplied cpumask comes online, padata will start using
it.
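
The relationship between the user-supplied and usable cpumasks can be sketched
in plain userspace C.  This is only an illustration under simplifying
assumptions: CPUs are modelled as bits in a 64-bit word rather than as a kernel
cpumask, and the names ``mask_t``, ``usable_mask()`` and ``mask_has_cpu()`` are
hypothetical, not part of padata:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/*
 * Hypothetical helper: the usable mask is the user-supplied mask
 * restricted to the CPUs that are currently online.
 */
static mask_t usable_mask(mask_t user_supplied, mask_t online)
{
    return user_supplied & online;
}

/* Hypothetical helper: test whether a CPU number is set in a mask. */
static bool mask_has_cpu(mask_t mask, unsigned int cpu)
{
    assert(cpu < 64);
    return (mask >> cpu) & 1u;
}
```

With a user-supplied mask of ``0xf`` (CPUs 0-3) and only CPUs 0 and 2 online,
``usable_mask()`` yields ``0x5``.  If CPU 1 then comes online, the result
becomes ``0x7`` although the user-supplied mask has not changed.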

Changing the CPU masks is an expensive operation, so it should not be done with
great frequency.

Running A Job
-------------

Actually submitting work to the padata instance requires the creation of a
padata_priv structure, which represents one job::

    struct padata_priv {
        /* Other stuff here... */
        void                    (*parallel)(struct padata_priv *padata);
        void                    (*serial)(struct padata_priv *padata);
    };

This structure will almost certainly be embedded within some larger
structure specific to the work to be done.  Most of its fields are private to
padata, but the structure should be zeroed at initialisation time, and the
parallel() and serial() functions should be provided.  Those functions will
be called in the process of getting the work done as we will see
momentarily.

The submission of the job is done with::

    int padata_do_parallel(struct padata_shell *ps,
                           struct padata_priv *padata, int *cb_cpu);

The ps and padata structures must be set up as described above; cb_cpu
points to the preferred CPU to be used for the final callback when the job is
done; it must be in the current instance's CPU mask (if it is not, the value
that cb_cpu points to is updated to the CPU actually chosen).  The return value
from padata_do_parallel() is zero on success, indicating that the job is in
progress. -EBUSY means that somebody, somewhere else is messing with the
instance's CPU mask, while -EINVAL is a complaint about cb_cpu not being in the
serial cpumask, no online CPUs in the parallel or serial cpumasks, or a stopped
instance.

Each job submitted to padata_do_parallel() will, in turn, be passed to
exactly one call to the above-mentioned parallel() function, on one CPU, so
true parallelism is achieved by submitting multiple jobs.  parallel() runs with
software interrupts disabled and thus cannot sleep.  The parallel()
function gets the padata_priv structure pointer as its lone parameter;
information about the actual work to be done is probably obtained by using
container_of() to find the enclosing structure.
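
The embedding pattern described above can be sketched in userspace C.  All
names here are hypothetical: ``struct demo_priv`` is a cut-down stand-in for
padata_priv, ``demo_container_of()`` mimics the kernel's container_of() using
``offsetof``, and ``struct demo_request`` plays the role of the larger
job-specific structure:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct padata_priv (illustration only). */
struct demo_priv {
    void (*parallel)(struct demo_priv *padata);
    void (*serial)(struct demo_priv *padata);
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical job-specific structure embedding the padata-style member. */
struct demo_request {
    int input;
    int output;
    struct demo_priv padata;
};

/* parallel() callback: recover the enclosing request, then do the work. */
static void demo_parallel(struct demo_priv *padata)
{
    struct demo_request *req;

    assert(padata != NULL);
    req = demo_container_of(padata, struct demo_request, padata);
    req->output = req->input * 2;   /* placeholder for the real work */
}
```

The callback receives only the embedded member, yet it can reach ``input`` and
``output`` because the member's offset inside ``struct demo_request`` is known
at compile time.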

Note that parallel() has no return value; the padata subsystem assumes that
parallel() will take responsibility for the job from this point.  The job
need not be completed during this call, but, if parallel() leaves work
outstanding, it should be prepared to be called again with a new job before
the previous one completes.

Serializing Jobs
----------------

When a job does complete, parallel() (or whatever function actually finishes
the work) should inform padata of the fact with a call to::

    void padata_do_serial(struct padata_priv *padata);

At some point in the future, padata_do_serial() will trigger a call to the
serial() function in the padata_priv structure.  That call will happen on
the CPU requested in the initial call to padata_do_parallel(); it, too, is
run with local software interrupts disabled.
Note that this call may be deferred for a while since the padata code takes
pains to ensure that jobs are completed in the order in which they were
submitted.
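
The reason a serial() call may be deferred can be illustrated with a small
userspace model of in-order release.  This is a sketch of the general
reordering idea only, not padata's actual implementation; ``struct
demo_reorder``, ``demo_do_serial()`` and ``DEMO_MAX_JOBS`` are hypothetical
names, and jobs are identified simply by their submission sequence number:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_JOBS 8

/* Illustrative reorder state; not padata's actual data structure. */
struct demo_reorder {
    bool finished[DEMO_MAX_JOBS];  /* finished[i]: job i ended its parallel phase */
    size_t next;                   /* sequence number of the next job to serialize */
};

/*
 * Mark job 'seq' as finished, then release every job that is now next in
 * submission order.  Released sequence numbers are appended to 'out', which
 * must have room for DEMO_MAX_JOBS entries; the return value is the new
 * number of entries in 'out'.
 */
static size_t demo_do_serial(struct demo_reorder *r, size_t seq,
                             size_t *out, size_t count)
{
    assert(seq < DEMO_MAX_JOBS);
    r->finished[seq] = true;

    while (r->next < DEMO_MAX_JOBS && r->finished[r->next])
        out[count++] = r->next++;

    return count;
}
```

If job 2 finishes first, nothing is released; once jobs 0 and 1 have also
finished, the jobs are released in the order 0, 1, 2, matching the order of
submission.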

Destroying
----------

Cleaning up a padata instance predictably involves calling the two free
functions that correspond to the allocation in reverse::

    void padata_free_shell(struct padata_shell *ps);
    void padata_free(struct padata_instance *pinst);

It is the user's responsibility to ensure all outstanding jobs are complete
before any of the above are called.

Running Multithreaded Jobs
==========================

A multithreaded job has a main thread and zero or more helper threads, with the
main thread participating in the job and then waiting until all helpers have
finished.  padata splits the job into units called chunks, where a chunk is a
piece of the job that one thread completes in one call to the thread function.

A user has to do three things to run a multithreaded job.  First, describe the
job by defining a padata_mt_job structure, which is explained in the Interface
section.  This includes a pointer to the thread function, which padata will
call each time it assigns a job chunk to a thread.  Then, define the thread
function, which accepts three arguments, ``start``, ``end``, and ``arg``, where
the first two delimit the range that the thread operates on and the last is a
pointer to the job's shared state, if any.  Prepare the shared state, which is
typically allocated on the main thread's stack.  Last, call
padata_do_multithreaded(), which will return once the job is finished.
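
The shape of a thread function and the idea of chunking can be sketched in
userspace C.  Everything below is hypothetical and simplified:
``demo_run_chunks()`` is a sequential stand-in that only shows how a range is
cut into chunks, it does not use padata_mt_job or padata_do_multithreaded(),
and the range is assumed to be half-open, ``[start, end)``:

```c
#include <assert.h>

/* Hypothetical shared state for the job. */
struct demo_sum_state {
    unsigned long total;
};

/* Thread function: handles one chunk, assumed to be the range [start, end). */
static void demo_sum_chunk(unsigned long start, unsigned long end, void *arg)
{
    struct demo_sum_state *state = arg;
    unsigned long i;

    for (i = start; i < end; i++)
        state->total += i;
}

/*
 * Sequential stand-in for the chunking step: walk [start, start + size) in
 * pieces of at most 'chunk' units and hand each piece to the thread function.
 */
static void demo_run_chunks(unsigned long start, unsigned long size,
                            unsigned long chunk,
                            void (*fn)(unsigned long, unsigned long, void *),
                            void *arg)
{
    unsigned long end = start + size;

    assert(chunk > 0);
    while (start < end) {
        unsigned long len = end - start < chunk ? end - start : chunk;

        fn(start, start + len, arg);
        start += len;
    }
}
```

Because this model runs the chunks one after another, ``total`` needs no
protection.  With real helper threads running concurrently, any state shared
through ``arg`` would need appropriate synchronization.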

Interface
=========

.. kernel-doc:: include/linux/padata.h
.. kernel-doc:: kernel/padata.c