.. _zsmalloc:

========
zsmalloc
========

This allocator is designed for use with zram. Thus, the allocator is
supposed to work well under low memory conditions. In particular, it
never attempts higher order page allocation which is very likely to
fail under memory pressure. On the other hand, if we just use single
(0-order) pages, it would suffer from very high fragmentation --
any object of size PAGE_SIZE/2 or larger would occupy an entire page.
This was one of the major issues with its predecessor (xvmalloc).

To overcome these issues, zsmalloc allocates a bunch of 0-order pages
and links them together using various 'struct page' fields. These linked
pages act as a single higher-order page i.e. an object can span 0-order
page boundaries. The code refers to these linked pages as a single entity
called zspage.

For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
since this satisfies the requirements of all its current users (in the
worst case, page is incompressible and is thus stored "as-is" i.e. in
uncompressed form). For allocation requests larger than this size, failure
is returned (see zs_malloc).

Additionally, zs_malloc() does not return a dereferenceable pointer.
Instead, it returns an opaque handle (unsigned long) which encodes actual
location of the allocated object.
The reason for this indirection is that
zsmalloc does not keep zspages permanently mapped since that would cause
issues on 32-bit systems where the VA region for kernel space mappings
is very small. So, before using the allocated memory, the object has to
be mapped using zs_map_object() to get a usable pointer and subsequently
unmapped using zs_unmap_object().

stat
====

With CONFIG_ZSMALLOC_STAT, we can see zsmalloc internal information via
``/sys/kernel/debug/zsmalloc/<user name>``. Here is a sample of stat output::

    # cat /sys/kernel/debug/zsmalloc/zram0/classes

     class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
        ...
        ...
         9   176           0            1           186        129          8                4
        10   192           1            0          2880       2872        135                3
        11   208           0            1           819        795         42                2
        12   224           0            1           219        159         12                4
        ...
        ...


class
    index
size
    object size zspage stores
almost_empty
    the number of ZS_ALMOST_EMPTY zspages (see below)
almost_full
    the number of ZS_ALMOST_FULL zspages (see below)
obj_allocated
    the number of objects allocated
obj_used
    the number of objects allocated to the user
pages_used
    the number of pages allocated for the class
pages_per_zspage
    the number of 0-order pages to make a zspage

We assign a zspage to ZS_ALMOST_EMPTY fullness group when n <= N / f, where

* n = number of allocated objects
* N = total number of objects zspage can store
* f = fullness_threshold_frac (i.e., 4 at the moment)

Similarly, we assign zspage to:

* ZS_ALMOST_FULL when n > N / f
* ZS_EMPTY when n == 0
* ZS_FULL when n == N
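
The fullness rules above can be sketched in a few lines of C. This is only an
illustration of the rules as stated in this document, not the kernel's
implementation: the enum, the ``classify_zspage()`` helper and the
``FULLNESS_THRESHOLD_FRAC`` macro are hypothetical names. Because n == 0 also
satisfies n <= N / f and n == N also satisfies n > N / f, the sketch assumes
the exact cases (ZS_EMPTY, ZS_FULL) are checked first.

```c
#include <assert.h>

/* Hypothetical names mirroring the groups described in the text. */
enum fullness_group {
	ZS_EMPTY,
	ZS_ALMOST_EMPTY,
	ZS_ALMOST_FULL,
	ZS_FULL,
};

/* "f" in the text; the document gives 4 as the current value. */
#define FULLNESS_THRESHOLD_FRAC 4

/*
 * n: number of allocated objects in the zspage
 * N: total number of objects the zspage can store
 *
 * Assumption: the exact cases take precedence over the threshold test.
 */
static enum fullness_group classify_zspage(unsigned int n, unsigned int N)
{
	assert(N > 0 && n <= N);

	if (n == 0)
		return ZS_EMPTY;
	if (n == N)
		return ZS_FULL;
	/* Integer division is enough here because n is an integer. */
	if (n <= N / FULLNESS_THRESHOLD_FRAC)
		return ZS_ALMOST_EMPTY;
	return ZS_ALMOST_FULL;
}
```

For example, with N = 16 and f = 4, a zspage holding 4 objects falls into
ZS_ALMOST_EMPTY, while one holding 5 objects falls into ZS_ALMOST_FULL.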