================================================
Generic bitfield packing and unpacking functions
================================================

Problem statement
-----------------

When working with hardware, one has to choose between several approaches to
interfacing with it.
One can memory-map a pointer to a carefully crafted struct over the hardware
device's memory region, and access its fields as struct members (potentially
declared as bitfields). But writing code this way would make it less portable,
due to potential endianness mismatches between the CPU and the hardware device.
Additionally, one has to pay close attention when translating register
definitions from the hardware documentation into bitfield indices for the
structs. Also, some hardware (typically networking equipment) tends to group
its register fields in ways that violate any reasonable word boundaries
(sometimes even 64-bit ones). This creates the inconvenience of having to
define "high" and "low" portions of register fields within the struct.
A more robust alternative to struct field definitions would be to extract the
required fields by shifting the appropriate number of bits. But this would
still not protect from endianness mismatches, except if all memory accesses
were performed byte-by-byte. Also the code can easily get cluttered, and the
high-level idea might get lost among the many bit shifts required.
Many drivers take the bit-shifting approach and then attempt to reduce the
clutter with tailored macros, but more often than not these macros take
shortcuts that still prevent the code from being truly portable.

The solution
------------

This API deals with 2 basic operations:

  - Packing a CPU-usable number into a memory buffer (with hardware
    constraints/quirks)
  - Unpacking a memory buffer (which has hardware constraints/quirks)
    into a CPU-usable number.

The API offers an abstraction over said hardware constraints and quirks,
over CPU endianness, and therefore over possible mismatches between
the two.

The basic unit of these API functions is the u64. From the CPU's
perspective, bit 63 always means bit offset 7 of byte 7, albeit only
logically. The question is: where do we lay this bit out in memory?

The following examples cover the memory layout of a packed u64 field.
The byte offsets in the packed buffer are always implicitly 0, 1, ... 7.
What the examples show is where the logical bytes and bits sit.

1. Normally (no quirks), we would do it like this:

::

  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
  7                       6                       5                       4
  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
  3                       2                       1                       0

That is, the MSByte (7) of the CPU-usable u64 sits at memory offset 0, and the
LSByte (0) of the u64 sits at memory offset 7.
This corresponds to what most folks would regard as "big endian", where
bit i corresponds to the number 2^i. This is also referred to in the code
comments as "logical" notation.


2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:

::

  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
  7                       6                       5                       4
  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
  3                       2                       1                       0

That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
inverts bit offsets inside a byte.


3. If QUIRK_LITTLE_ENDIAN is set, we do it like this:

::

  39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
  4                       5                       6                       7
   7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
  0                       1                       2                       3

Therefore, QUIRK_LITTLE_ENDIAN means that inside the memory region, every
byte from each 4-byte word is placed at its mirrored position compared to
the boundary of that word.

4. If QUIRK_MSB_ON_THE_RIGHT and QUIRK_LITTLE_ENDIAN are both set, we do it
   like this:

::

  32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
  4                       5                       6                       7
   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  0                       1                       2                       3


5. If just QUIRK_LSW32_IS_FIRST is set, we do it like this:

::

  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
  3                       2                       1                       0
  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
  7                       6                       5                       4

In this case the 8 byte memory region is interpreted as follows: the first
4 bytes correspond to the least significant 4-byte word, the next 4 bytes to
the more significant 4-byte word.


6. If QUIRK_LSW32_IS_FIRST and QUIRK_MSB_ON_THE_RIGHT are set, we do it like
   this:

::

  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
  3                       2                       1                       0
  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
  7                       6                       5                       4


7. If QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN are set, it looks like
   this:

::

   7  6  5  4  3  2  1  0 15 14 13 12 11 10  9  8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
  0                       1                       2                       3
  39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
  4                       5                       6                       7


8. If QUIRK_LSW32_IS_FIRST, QUIRK_LITTLE_ENDIAN and QUIRK_MSB_ON_THE_RIGHT
   are set, it looks like this:

::

   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
  0                       1                       2                       3
  32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
  4                       5                       6                       7


We always think of our offsets as if there were no quirk, and we translate
them afterwards, before accessing the memory region.
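
The translation from quirk-free ("logical") offsets to memory positions
described above can be sketched in plain userspace C. This is only an
illustration of the eight layouts for a single 8-byte field, not the
kernel implementation: the ``EX_`` prefixed names, the flag values and the
helper functions are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative flag values; they are assumptions of this sketch. */
enum {
	EX_QUIRK_MSB_ON_THE_RIGHT = 1 << 0,
	EX_QUIRK_LITTLE_ENDIAN    = 1 << 1,
	EX_QUIRK_LSW32_IS_FIRST   = 1 << 2,
};

struct ex_bit_pos {
	size_t byte;	/* offset into the 8-byte buffer */
	unsigned bit;	/* physical bit inside that byte, 7 = MSB */
};

/* Map logical bit 0..63 of a u64 to its place in an 8-byte buffer. */
static struct ex_bit_pos ex_locate_bit(unsigned logical_bit, unsigned quirks)
{
	unsigned bit = logical_bit % 8;
	unsigned off = 7 - logical_bit / 8;	/* no quirks: MSByte at offset 0 */
	unsigned word = off / 4;		/* which 4-byte word */
	unsigned pos = off % 4;			/* byte inside that word */
	struct ex_bit_pos p;

	if (quirks & EX_QUIRK_MSB_ON_THE_RIGHT)
		bit = 7 - bit;		/* invert bit offsets inside a byte */
	if (quirks & EX_QUIRK_LITTLE_ENDIAN)
		pos = 3 - pos;		/* mirror bytes inside each word */
	if (quirks & EX_QUIRK_LSW32_IS_FIRST)
		word = 1 - word;	/* least significant word comes first */

	p.byte = word * 4 + pos;
	p.bit = bit;
	return p;
}

/* Pack a whole u64 into buf according to the quirks. */
static void ex_pack_u64(uint8_t buf[8], uint64_t val, unsigned quirks)
{
	memset(buf, 0, 8);
	for (unsigned i = 0; i < 64; i++) {
		struct ex_bit_pos p = ex_locate_bit(i, quirks);

		if ((val >> i) & 1)
			buf[p.byte] |= (uint8_t)(1u << p.bit);
	}
}

/* Inverse of ex_pack_u64(). */
static uint64_t ex_unpack_u64(const uint8_t buf[8], unsigned quirks)
{
	uint64_t val = 0;

	for (unsigned i = 0; i < 64; i++) {
		struct ex_bit_pos p = ex_locate_bit(i, quirks);

		if (buf[p.byte] & (1u << p.bit))
			val |= (uint64_t)1 << i;
	}
	return val;
}
```

The bit-by-bit loop keeps the mapping obvious at the cost of speed. Note
that setting both the little-endian and the LSW32-first flags yields an
ordinary little-endian u64, which matches layout 7.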

Intended use
------------

Drivers that opt to use this API first need to identify which combination of
the above 3 quirks (8 combinations in total) matches what the hardware
documentation describes. Then they should wrap the packing() function, creating
a new xxx_packing() that calls it with the proper QUIRK_* one-hot bits set.

The packing() function returns an int-encoded error code, which protects the
programmer against incorrect API use. The errors are not expected to occur
during runtime, therefore it is reasonable for xxx_packing() to return void
and simply swallow those errors. Optionally it can dump stack or print the
error description.
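
The wrapper pattern described above might look roughly like the following
userspace sketch. Everything here is hypothetical: ``ex_packing()`` is a
simplified stand-in written for this example (it models only the quirk-free
layout and the MSB-on-the-right quirk, and its signature and error codes are
assumptions), and the ``XXX_`` constants describe an imaginary device.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum ex_packing_op { EX_PACK, EX_UNPACK };

#define EX_QUIRK_MSB_ON_THE_RIGHT 0x1u	/* illustrative value */

/* Stand-in for packing(): field [startbit:endbit] in logical notation. */
static int ex_packing(uint8_t *pbuf, uint64_t *uval, int startbit, int endbit,
		      size_t pbuflen, enum ex_packing_op op, unsigned quirks)
{
	int width = startbit - endbit + 1;

	if (startbit < endbit || endbit < 0 || width > 64)
		return -EINVAL;
	if ((size_t)startbit >= 8 * pbuflen)
		return -EINVAL;
	if (quirks & ~EX_QUIRK_MSB_ON_THE_RIGHT)
		return -EINVAL;	/* other quirks are not modelled here */
	if (op == EX_PACK && width < 64 && (*uval >> width) != 0)
		return -ERANGE;	/* value does not fit in the field */

	if (op == EX_UNPACK)
		*uval = 0;
	for (int i = 0; i < width; i++) {
		int n = endbit + i;	/* logical bit in the buffer */
		uint8_t *byte = &pbuf[pbuflen - 1 - (size_t)n / 8];
		unsigned bit = (unsigned)n % 8;
		uint8_t mask;

		if (quirks & EX_QUIRK_MSB_ON_THE_RIGHT)
			bit = 7 - bit;
		mask = (uint8_t)(1u << bit);
		if (op == EX_PACK) {
			if ((*uval >> i) & 1)
				*byte |= mask;
			else
				*byte &= (uint8_t)~mask;
		} else if (*byte & mask) {
			*uval |= (uint64_t)1 << i;
		}
	}
	return 0;
}

/* Hypothetical device: 8-byte registers, bits inverted inside each byte. */
#define XXX_REG_LEN	8
#define XXX_QUIRKS	EX_QUIRK_MSB_ON_THE_RIGHT

/* Driver-local wrapper: fixes the quirks, reports and swallows errors. */
static void xxx_packing(uint8_t *buf, uint64_t *val, int start, int end,
			enum ex_packing_op op)
{
	int rc = ex_packing(buf, val, start, end, XXX_REG_LEN, op, XXX_QUIRKS);

	if (rc)
		fprintf(stderr, "xxx_packing: bits %d-%d: error %d\n",
			start, end, rc);
}
```

Call sites then only name the field boundaries and the direction, for
example ``xxx_packing(buf, &val, 10, 8, EX_PACK)``, and never repeat the
quirk flags or the buffer length.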