================================================
Generic bitfield packing and unpacking functions
================================================

Problem statement
-----------------

When working with hardware, one has to choose between several approaches of
interfacing with it.

One can memory-map a pointer to a carefully crafted struct over the hardware
device's memory region, and access its fields as struct members (potentially
declared as bitfields). But code written this way is less portable, due to
potential endianness mismatches between the CPU and the hardware device.
Additionally, one has to pay close attention when translating register
definitions from the hardware documentation into bit field indices for the
structs. Also, some hardware (typically networking equipment) tends to group
its register fields in ways that violate any reasonable word boundaries
(sometimes even 64 bit ones). This creates the inconvenience of having to
define "high" and "low" portions of register fields within the struct.

A more robust alternative to struct field definitions would be to extract the
required fields by shifting the appropriate number of bits. But this would
still not protect from endianness mismatches, except if all memory accesses
were performed byte-by-byte. Also, the code can easily get cluttered, and the
high-level idea might get lost among the many bit shifts required.

Many drivers take the bit-shifting approach and then attempt to reduce the
clutter with tailored macros, but more often than not these macros take
shortcuts that still prevent the code from being truly portable.
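
To illustrate the byte-by-byte shifting approach mentioned above, here is a
minimal userspace sketch. The register layout (a 12-bit field at logical bits
19..8 of a 32-bit big-endian register image) and the helper names read_be32()
and get_field_19_8() are hypothetical, invented only for this example.

```c
#include <stdint.h>

/*
 * Assemble a 32-bit value from a big-endian byte buffer, one byte at a
 * time, so the result does not depend on the endianness of the CPU.
 */
uint32_t read_be32(const uint8_t *buf)
{
	return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	       ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Extract the hypothetical 12-bit field that occupies bits 19..8. */
uint32_t get_field_19_8(const uint8_t *buf)
{
	return (read_be32(buf) >> 8) & 0xfff;
}
```

Even for a single field, the shift and the mask have to be derived by hand
from the register definition, which is where the clutter described above
starts to accumulate.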

The solution
------------

This API deals with 2 basic operations:

  - Packing a CPU-usable number into a memory buffer (with hardware
    constraints/quirks)
  - Unpacking a memory buffer (which has hardware constraints/quirks)
    into a CPU-usable number.

The API offers an abstraction over said hardware constraints and quirks,
over CPU endianness, and therefore over possible mismatches between
the two.

The basic unit of these API functions is the u64. From the CPU's
perspective, bit 63 always means bit offset 7 of byte 7, albeit only
logically. The question is: where do we lay this bit out in memory?

The following examples cover the memory layout of a packed u64 field.
The byte offsets in the packed buffer are always implicitly 0, 1, ... 7.
What the examples show is where the logical bytes and bits sit.

1. Normally (no quirks), we would do it like this:

::

  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
  7                       6                       5                        4
  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
  3                       2                       1                        0

That is, the MSByte (7) of the CPU-usable u64 sits at memory offset 0, and the
LSByte (0) of the u64 sits at memory offset 7.
This corresponds to what most folks would regard as "big endian", where
bit i corresponds to the number 2^i. This is also referred to in the code
comments as "logical" notation.
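
The no-quirk layout can be sketched in plain userspace C as follows. The
function names store_u64_no_quirks() and load_u64_no_quirks() are hypothetical
and not part of the API; the sketch only handles a full 8-byte field.

```c
#include <stdint.h>

/* Store val so that logical byte 7 (the MSByte) lands at memory offset 0. */
void store_u64_no_quirks(uint8_t buf[8], uint64_t val)
{
	for (int i = 0; i < 8; i++)
		buf[i] = (uint8_t)(val >> (8 * (7 - i)));
}

/* Read the same layout back into a CPU-usable number. */
uint64_t load_u64_no_quirks(const uint8_t buf[8])
{
	uint64_t val = 0;

	for (int i = 0; i < 8; i++)
		val = (val << 8) | buf[i];
	return val;
}
```

Because every access is done one byte at a time, the result is the same on
little endian and big endian CPUs.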


2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:

::

  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
  7                       6                        5                       4
  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
  3                       2                        1                       0

That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
inverts bit offsets inside a byte.
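
A minimal sketch of this bit inversion, assuming a full 8-byte field and no
other quirks. The names reverse_bits8() and store_u64_msb_on_the_right() are
hypothetical helpers for this example only.

```c
#include <stdint.h>

/* Mirror the 8 bits of a byte: bit 7 becomes bit 0, bit 6 becomes bit 1... */
uint8_t reverse_bits8(uint8_t b)
{
	uint8_t r = 0;

	for (int i = 0; i < 8; i++)
		if (b & (1u << i))
			r |= (uint8_t)(1u << (7 - i));
	return r;
}

/*
 * Byte positions are the same as in the no-quirk layout (MSByte at memory
 * offset 0); only the bits inside each byte are mirrored.
 */
void store_u64_msb_on_the_right(uint8_t buf[8], uint64_t val)
{
	for (int i = 0; i < 8; i++)
		buf[i] = reverse_bits8((uint8_t)(val >> (8 * (7 - i))));
}
```

With this layout, logical bit 63 ends up in the least significant position of
the byte at memory offset 0, and logical bit 0 in the most significant
position of the byte at memory offset 7.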


3. If QUIRK_LITTLE_ENDIAN is set, we do it like this:

::

  39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
  4                       5                       6                       7
  7  6  5  4  3  2  1  0  15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
  0                       1                       2                       3

Therefore, QUIRK_LITTLE_ENDIAN means that inside the memory region, every
byte from each 4-byte word is placed at its mirrored position compared to
the boundary of that word.

4. If QUIRK_MSB_ON_THE_RIGHT and QUIRK_LITTLE_ENDIAN are both set, we do it
   like this:

::

  32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
  4                       5                       6                       7
  0  1  2  3  4  5  6  7  8   9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  0                       1                       2                       3


5. If just QUIRK_LSW32_IS_FIRST is set, we do it like this:

::

  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
  3                       2                       1                        0
  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
  7                       6                       5                        4

In this case the 8 byte memory region is interpreted as follows: the first
4 bytes correspond to the least significant 4-byte word, the next 4 bytes to
the more significant 4-byte word.


6. If QUIRK_LSW32_IS_FIRST and QUIRK_MSB_ON_THE_RIGHT are set, we do it like
   this:

::

  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
  3                       2                        1                       0
  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
  7                       6                        5                       4


7. If QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN are set, it looks like
   this:

::

  7  6  5  4  3  2  1  0  15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
  0                       1                       2                       3
  39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
  4                       5                       6                       7


8. If QUIRK_LSW32_IS_FIRST, QUIRK_LITTLE_ENDIAN and QUIRK_MSB_ON_THE_RIGHT
   are set, it looks like this:

::

  0  1  2  3  4  5  6  7  8   9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  0                       1                       2                       3
  32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
  4                       5                       6                       7


We always think of our offsets as if there were no quirk, and we translate
them afterwards, before accessing the memory region.
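
One possible way to express this translation for byte positions is sketched
below. It is an illustration derived from the diagrams above, not a copy of
the in-kernel code: the function names, the numeric values given to the
QUIRK_* flags, and the requirement that the buffer length be a multiple of 4
when a word-based quirk is used are all assumptions made for this example.
QUIRK_MSB_ON_THE_RIGHT only affects bits inside a byte, so it plays no role
in the byte offset.

```c
#include <assert.h>
#include <stddef.h>

/* Flag names follow the text above; the values are assumed for this sketch. */
#define QUIRK_MSB_ON_THE_RIGHT	(1u << 0)
#define QUIRK_LITTLE_ENDIAN	(1u << 1)
#define QUIRK_LSW32_IS_FIRST	(1u << 2)

/* Mirror a byte offset inside its own 4-byte word. */
size_t le_offset(size_t off)
{
	size_t word_base = (off / 4) * 4;

	return word_base + (3 - (off - word_base));
}

/* Mirror the position of the 4-byte word inside a buffer of len bytes. */
size_t lsw32_first_offset(size_t off, size_t len)
{
	size_t word = off / 4;
	size_t within = off % 4;

	return (len / 4 - 1 - word) * 4 + within;
}

/* Translate a logical byte number (0 = LSByte) into a memory offset. */
size_t byte_mem_offset(size_t logical_byte, size_t len, unsigned int quirks)
{
	size_t off;

	assert(logical_byte < len);
	if (quirks & (QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST))
		assert(len % 4 == 0);

	/* Start from the no-quirk layout: MSByte at memory offset 0. */
	off = len - 1 - logical_byte;
	if (quirks & QUIRK_LITTLE_ENDIAN)
		off = le_offset(off);
	if (quirks & QUIRK_LSW32_IS_FIRST)
		off = lsw32_first_offset(off, len);
	return off;
}
```

For an 8-byte buffer, byte_mem_offset(7, 8, 0) gives 0 as in layout 1, and
byte_mem_offset(4, 8, QUIRK_LITTLE_ENDIAN) gives 0 as in layout 3.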

Intended use
------------

Drivers that opt to use this API first need to identify which combination of
the above 3 quirks (8 combinations in total) matches what the hardware
documentation describes. Then they should wrap the packing() function,
creating a new xxx_packing() that calls it with the proper QUIRK_* one-hot
bits set.

The packing() function returns an int-encoded error code, which protects the
programmer against incorrect API use. The errors are not expected to occur
during runtime, therefore it is reasonable for xxx_packing() to return void
and simply swallow those errors. Optionally it can dump stack or print the
error description.
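
The wrapper pattern can be sketched in userspace as follows. Everything
prefixed with demo_ is a hypothetical stand-in so that the example compiles
outside the kernel: demo_packing() only validates its arguments and does not
touch the buffer, and the error conditions it checks, the quirk value and the
xxx_error_count counter are assumptions made for this sketch. A real driver
would call packing() itself.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum demo_op { DEMO_PACK, DEMO_UNPACK };

/* Assumed value; stands for the quirk combination the hardware needs. */
#define DEMO_QUIRK_LSW32_IS_FIRST	(1u << 2)

/* Hypothetical stand-in for packing(): argument validation only. */
int demo_packing(void *pbuf, uint64_t *uval, int startbit, int endbit,
		 size_t pbuflen, enum demo_op op, unsigned int quirks)
{
	(void)pbuf; (void)uval; (void)pbuflen; (void)op; (void)quirks;

	if (startbit < endbit)
		return -EINVAL;	/* field boundaries are swapped */
	if (startbit - endbit + 1 > 64)
		return -ERANGE;	/* field is wider than a u64 */
	return 0;
}

/* Counts swallowed errors; present only so the sketch can be checked. */
int xxx_error_count;

/*
 * Driver-specific wrapper: fixes the quirks once, returns void, and
 * reports misuse instead of propagating the error code to every caller.
 */
void xxx_packing(void *buf, uint64_t *val, int start, int end, size_t len,
		 enum demo_op op)
{
	int rc = demo_packing(buf, val, start, end, len, op,
			      DEMO_QUIRK_LSW32_IS_FIRST);

	if (rc == 0)
		return;

	xxx_error_count++;
	fprintf(stderr, "xxx_packing: bad field [%d:%d] (error %d)\n",
		start, end, rc);
}
```

Callers then use xxx_packing(buf, &val, 7, 0, sizeof(buf), DEMO_PACK) without
checking a return value; a mistake such as swapping the start and end bits
shows up as a printed message during development.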