xref: /OK3568_Linux_fs/kernel/Documentation/core-api/xarray.rst (revision 4882a59341e53eb6f0b4789bf948001014eff981)
1*4882a593Smuzhiyun.. SPDX-License-Identifier: GPL-2.0+
2*4882a593Smuzhiyun
3*4882a593Smuzhiyun======
4*4882a593SmuzhiyunXArray
5*4882a593Smuzhiyun======
6*4882a593Smuzhiyun
7*4882a593Smuzhiyun:Author: Matthew Wilcox
8*4882a593Smuzhiyun
9*4882a593SmuzhiyunOverview
10*4882a593Smuzhiyun========
11*4882a593Smuzhiyun
12*4882a593SmuzhiyunThe XArray is an abstract data type which behaves like a very large array
13*4882a593Smuzhiyunof pointers.  It meets many of the same needs as a hash or a conventional
14*4882a593Smuzhiyunresizable array.  Unlike a hash, it allows you to sensibly go to the
15*4882a593Smuzhiyunnext or previous entry in a cache-efficient manner.  In contrast to a
16*4882a593Smuzhiyunresizable array, there is no need to copy data or change MMU mappings in
17*4882a593Smuzhiyunorder to grow the array.  It is more memory-efficient, parallelisable
18*4882a593Smuzhiyunand cache friendly than a doubly-linked list.  It takes advantage of
19*4882a593SmuzhiyunRCU to perform lookups without locking.
20*4882a593Smuzhiyun
21*4882a593SmuzhiyunThe XArray implementation is efficient when the indices used are densely
22*4882a593Smuzhiyunclustered; hashing the object and using the hash as the index will not
23*4882a593Smuzhiyunperform well.  The XArray is optimised for small indices, but still has
24*4882a593Smuzhiyungood performance with large indices.  If your index can be larger than
25*4882a593Smuzhiyun``ULONG_MAX`` then the XArray is not the data type for you.  The most
26*4882a593Smuzhiyunimportant user of the XArray is the page cache.
27*4882a593Smuzhiyun
28*4882a593SmuzhiyunNormal pointers may be stored in the XArray directly.  They must be 4-byte
29*4882a593Smuzhiyunaligned, which is true for any pointer returned from kmalloc() and
30*4882a593Smuzhiyunalloc_page().  It isn't true for arbitrary user-space pointers,
31*4882a593Smuzhiyunnor for function pointers.  You can store pointers to statically allocated
32*4882a593Smuzhiyunobjects, as long as those objects have an alignment of at least 4.
33*4882a593Smuzhiyun
34*4882a593SmuzhiyunYou can also store integers between 0 and ``LONG_MAX`` in the XArray.
35*4882a593SmuzhiyunYou must first convert it into an entry using xa_mk_value().
36*4882a593SmuzhiyunWhen you retrieve an entry from the XArray, you can check whether it is
37*4882a593Smuzhiyuna value entry by calling xa_is_value(), and convert it back to
38*4882a593Smuzhiyunan integer by calling xa_to_value().
39*4882a593Smuzhiyun
40*4882a593SmuzhiyunSome users want to tag the pointers they store in the XArray.  You can
41*4882a593Smuzhiyuncall xa_tag_pointer() to create an entry with a tag, xa_untag_pointer()
42*4882a593Smuzhiyunto turn a tagged entry back into an untagged pointer and xa_pointer_tag()
43*4882a593Smuzhiyunto retrieve the tag of an entry.  Tagged pointers use the same bits that
44*4882a593Smuzhiyunare used to distinguish value entries from normal pointers, so you must
45*4882a593Smuzhiyundecide whether they want to store value entries or tagged pointers in
46*4882a593Smuzhiyunany particular XArray.
47*4882a593Smuzhiyun
48*4882a593SmuzhiyunThe XArray does not support storing IS_ERR() pointers as some
49*4882a593Smuzhiyunconflict with value entries or internal entries.
50*4882a593Smuzhiyun
51*4882a593SmuzhiyunAn unusual feature of the XArray is the ability to create entries which
52*4882a593Smuzhiyunoccupy a range of indices.  Once stored to, looking up any index in
53*4882a593Smuzhiyunthe range will return the same entry as looking up any other index in
54*4882a593Smuzhiyunthe range.  Storing to any index will store to all of them.  Multi-index
55*4882a593Smuzhiyunentries can be explicitly split into smaller entries, or storing ``NULL``
56*4882a593Smuzhiyuninto any entry will cause the XArray to forget about the range.
57*4882a593Smuzhiyun
58*4882a593SmuzhiyunNormal API
59*4882a593Smuzhiyun==========
60*4882a593Smuzhiyun
61*4882a593SmuzhiyunStart by initialising an XArray, either with DEFINE_XARRAY()
62*4882a593Smuzhiyunfor statically allocated XArrays or xa_init() for dynamically
63*4882a593Smuzhiyunallocated ones.  A freshly-initialised XArray contains a ``NULL``
64*4882a593Smuzhiyunpointer at every index.
65*4882a593Smuzhiyun
66*4882a593SmuzhiyunYou can then set entries using xa_store() and get entries
67*4882a593Smuzhiyunusing xa_load().  xa_store will overwrite any entry with the
68*4882a593Smuzhiyunnew entry and return the previous entry stored at that index.  You can
69*4882a593Smuzhiyunuse xa_erase() instead of calling xa_store() with a
70*4882a593Smuzhiyun``NULL`` entry.  There is no difference between an entry that has never
71*4882a593Smuzhiyunbeen stored to, one that has been erased and one that has most recently
72*4882a593Smuzhiyunhad ``NULL`` stored to it.
73*4882a593Smuzhiyun
74*4882a593SmuzhiyunYou can conditionally replace an entry at an index by using
75*4882a593Smuzhiyunxa_cmpxchg().  Like cmpxchg(), it will only succeed if
76*4882a593Smuzhiyunthe entry at that index has the 'old' value.  It also returns the entry
77*4882a593Smuzhiyunwhich was at that index; if it returns the same entry which was passed as
78*4882a593Smuzhiyun'old', then xa_cmpxchg() succeeded.
79*4882a593Smuzhiyun
80*4882a593SmuzhiyunIf you want to only store a new entry to an index if the current entry
81*4882a593Smuzhiyunat that index is ``NULL``, you can use xa_insert() which
82*4882a593Smuzhiyunreturns ``-EBUSY`` if the entry is not empty.
83*4882a593Smuzhiyun
84*4882a593SmuzhiyunYou can copy entries out of the XArray into a plain array by calling
85*4882a593Smuzhiyunxa_extract().  Or you can iterate over the present entries in the XArray
86*4882a593Smuzhiyunby calling xa_for_each(), xa_for_each_start() or xa_for_each_range().
87*4882a593SmuzhiyunYou may prefer to use xa_find() or xa_find_after() to move to the next
88*4882a593Smuzhiyunpresent entry in the XArray.
89*4882a593Smuzhiyun
90*4882a593SmuzhiyunCalling xa_store_range() stores the same entry in a range
91*4882a593Smuzhiyunof indices.  If you do this, some of the other operations will behave
92*4882a593Smuzhiyunin a slightly odd way.  For example, marking the entry at one index
93*4882a593Smuzhiyunmay result in the entry being marked at some, but not all of the other
94*4882a593Smuzhiyunindices.  Storing into one index may result in the entry retrieved by
95*4882a593Smuzhiyunsome, but not all of the other indices changing.
96*4882a593Smuzhiyun
97*4882a593SmuzhiyunSometimes you need to ensure that a subsequent call to xa_store()
98*4882a593Smuzhiyunwill not need to allocate memory.  The xa_reserve() function
99*4882a593Smuzhiyunwill store a reserved entry at the indicated index.  Users of the
100*4882a593Smuzhiyunnormal API will see this entry as containing ``NULL``.  If you do
101*4882a593Smuzhiyunnot need to use the reserved entry, you can call xa_release()
102*4882a593Smuzhiyunto remove the unused entry.  If another user has stored to the entry
103*4882a593Smuzhiyunin the meantime, xa_release() will do nothing; if instead you
104*4882a593Smuzhiyunwant the entry to become ``NULL``, you should use xa_erase().
105*4882a593SmuzhiyunUsing xa_insert() on a reserved entry will fail.
106*4882a593Smuzhiyun
107*4882a593SmuzhiyunIf all entries in the array are ``NULL``, the xa_empty() function
108*4882a593Smuzhiyunwill return ``true``.
109*4882a593Smuzhiyun
110*4882a593SmuzhiyunFinally, you can remove all entries from an XArray by calling
111*4882a593Smuzhiyunxa_destroy().  If the XArray entries are pointers, you may wish
112*4882a593Smuzhiyunto free the entries first.  You can do this by iterating over all present
113*4882a593Smuzhiyunentries in the XArray using the xa_for_each() iterator.
114*4882a593Smuzhiyun
115*4882a593SmuzhiyunSearch Marks
116*4882a593Smuzhiyun------------
117*4882a593Smuzhiyun
118*4882a593SmuzhiyunEach entry in the array has three bits associated with it called marks.
119*4882a593SmuzhiyunEach mark may be set or cleared independently of the others.  You can
120*4882a593Smuzhiyuniterate over marked entries by using the xa_for_each_marked() iterator.
121*4882a593Smuzhiyun
122*4882a593SmuzhiyunYou can enquire whether a mark is set on an entry by using
123*4882a593Smuzhiyunxa_get_mark().  If the entry is not ``NULL``, you can set a mark on it
124*4882a593Smuzhiyunby using xa_set_mark() and remove the mark from an entry by calling
125*4882a593Smuzhiyunxa_clear_mark().  You can ask whether any entry in the XArray has a
126*4882a593Smuzhiyunparticular mark set by calling xa_marked().  Erasing an entry from the
127*4882a593SmuzhiyunXArray causes all marks associated with that entry to be cleared.
128*4882a593Smuzhiyun
129*4882a593SmuzhiyunSetting or clearing a mark on any index of a multi-index entry will
130*4882a593Smuzhiyunaffect all indices covered by that entry.  Querying the mark on any
131*4882a593Smuzhiyunindex will return the same result.
132*4882a593Smuzhiyun
133*4882a593SmuzhiyunThere is no way to iterate over entries which are not marked; the data
134*4882a593Smuzhiyunstructure does not allow this to be implemented efficiently.  There are
135*4882a593Smuzhiyunnot currently iterators to search for logical combinations of bits (eg
136*4882a593Smuzhiyuniterate over all entries which have both ``XA_MARK_1`` and ``XA_MARK_2``
137*4882a593Smuzhiyunset, or iterate over all entries which have ``XA_MARK_0`` or ``XA_MARK_2``
138*4882a593Smuzhiyunset).  It would be possible to add these if a user arises.
139*4882a593Smuzhiyun
140*4882a593SmuzhiyunAllocating XArrays
141*4882a593Smuzhiyun------------------
142*4882a593Smuzhiyun
143*4882a593SmuzhiyunIf you use DEFINE_XARRAY_ALLOC() to define the XArray, or
144*4882a593Smuzhiyuninitialise it by passing ``XA_FLAGS_ALLOC`` to xa_init_flags(),
145*4882a593Smuzhiyunthe XArray changes to track whether entries are in use or not.
146*4882a593Smuzhiyun
147*4882a593SmuzhiyunYou can call xa_alloc() to store the entry at an unused index
148*4882a593Smuzhiyunin the XArray.  If you need to modify the array from interrupt context,
149*4882a593Smuzhiyunyou can use xa_alloc_bh() or xa_alloc_irq() to disable
150*4882a593Smuzhiyuninterrupts while allocating the ID.
151*4882a593Smuzhiyun
152*4882a593SmuzhiyunUsing xa_store(), xa_cmpxchg() or xa_insert() will
153*4882a593Smuzhiyunalso mark the entry as being allocated.  Unlike a normal XArray, storing
154*4882a593Smuzhiyun``NULL`` will mark the entry as being in use, like xa_reserve().
155*4882a593SmuzhiyunTo free an entry, use xa_erase() (or xa_release() if
156*4882a593Smuzhiyunyou only want to free the entry if it's ``NULL``).
157*4882a593Smuzhiyun
158*4882a593SmuzhiyunBy default, the lowest free entry is allocated starting from 0.  If you
159*4882a593Smuzhiyunwant to allocate entries starting at 1, it is more efficient to use
160*4882a593SmuzhiyunDEFINE_XARRAY_ALLOC1() or ``XA_FLAGS_ALLOC1``.  If you want to
161*4882a593Smuzhiyunallocate IDs up to a maximum, then wrap back around to the lowest free
162*4882a593SmuzhiyunID, you can use xa_alloc_cyclic().
163*4882a593Smuzhiyun
164*4882a593SmuzhiyunYou cannot use ``XA_MARK_0`` with an allocating XArray as this mark
165*4882a593Smuzhiyunis used to track whether an entry is free or not.  The other marks are
166*4882a593Smuzhiyunavailable for your use.
167*4882a593Smuzhiyun
168*4882a593SmuzhiyunMemory allocation
169*4882a593Smuzhiyun-----------------
170*4882a593Smuzhiyun
171*4882a593SmuzhiyunThe xa_store(), xa_cmpxchg(), xa_alloc(),
172*4882a593Smuzhiyunxa_reserve() and xa_insert() functions take a gfp_t
173*4882a593Smuzhiyunparameter in case the XArray needs to allocate memory to store this entry.
174*4882a593SmuzhiyunIf the entry is being deleted, no memory allocation needs to be performed,
175*4882a593Smuzhiyunand the GFP flags specified will be ignored.
176*4882a593Smuzhiyun
177*4882a593SmuzhiyunIt is possible for no memory to be allocatable, particularly if you pass
178*4882a593Smuzhiyuna restrictive set of GFP flags.  In that case, the functions return a
179*4882a593Smuzhiyunspecial value which can be turned into an errno using xa_err().
180*4882a593SmuzhiyunIf you don't need to know exactly which error occurred, using
181*4882a593Smuzhiyunxa_is_err() is slightly more efficient.
182*4882a593Smuzhiyun
183*4882a593SmuzhiyunLocking
184*4882a593Smuzhiyun-------
185*4882a593Smuzhiyun
186*4882a593SmuzhiyunWhen using the Normal API, you do not have to worry about locking.
187*4882a593SmuzhiyunThe XArray uses RCU and an internal spinlock to synchronise access:
188*4882a593Smuzhiyun
189*4882a593SmuzhiyunNo lock needed:
190*4882a593Smuzhiyun * xa_empty()
191*4882a593Smuzhiyun * xa_marked()
192*4882a593Smuzhiyun
193*4882a593SmuzhiyunTakes RCU read lock:
194*4882a593Smuzhiyun * xa_load()
195*4882a593Smuzhiyun * xa_for_each()
196*4882a593Smuzhiyun * xa_for_each_start()
197*4882a593Smuzhiyun * xa_for_each_range()
198*4882a593Smuzhiyun * xa_find()
199*4882a593Smuzhiyun * xa_find_after()
200*4882a593Smuzhiyun * xa_extract()
201*4882a593Smuzhiyun * xa_get_mark()
202*4882a593Smuzhiyun
203*4882a593SmuzhiyunTakes xa_lock internally:
204*4882a593Smuzhiyun * xa_store()
205*4882a593Smuzhiyun * xa_store_bh()
206*4882a593Smuzhiyun * xa_store_irq()
207*4882a593Smuzhiyun * xa_insert()
208*4882a593Smuzhiyun * xa_insert_bh()
209*4882a593Smuzhiyun * xa_insert_irq()
210*4882a593Smuzhiyun * xa_erase()
211*4882a593Smuzhiyun * xa_erase_bh()
212*4882a593Smuzhiyun * xa_erase_irq()
213*4882a593Smuzhiyun * xa_cmpxchg()
214*4882a593Smuzhiyun * xa_cmpxchg_bh()
215*4882a593Smuzhiyun * xa_cmpxchg_irq()
216*4882a593Smuzhiyun * xa_store_range()
217*4882a593Smuzhiyun * xa_alloc()
218*4882a593Smuzhiyun * xa_alloc_bh()
219*4882a593Smuzhiyun * xa_alloc_irq()
220*4882a593Smuzhiyun * xa_reserve()
221*4882a593Smuzhiyun * xa_reserve_bh()
222*4882a593Smuzhiyun * xa_reserve_irq()
223*4882a593Smuzhiyun * xa_destroy()
224*4882a593Smuzhiyun * xa_set_mark()
225*4882a593Smuzhiyun * xa_clear_mark()
226*4882a593Smuzhiyun
227*4882a593SmuzhiyunAssumes xa_lock held on entry:
228*4882a593Smuzhiyun * __xa_store()
229*4882a593Smuzhiyun * __xa_insert()
230*4882a593Smuzhiyun * __xa_erase()
231*4882a593Smuzhiyun * __xa_cmpxchg()
232*4882a593Smuzhiyun * __xa_alloc()
233*4882a593Smuzhiyun * __xa_set_mark()
234*4882a593Smuzhiyun * __xa_clear_mark()
235*4882a593Smuzhiyun
236*4882a593SmuzhiyunIf you want to take advantage of the lock to protect the data structures
237*4882a593Smuzhiyunthat you are storing in the XArray, you can call xa_lock()
238*4882a593Smuzhiyunbefore calling xa_load(), then take a reference count on the
239*4882a593Smuzhiyunobject you have found before calling xa_unlock().  This will
240*4882a593Smuzhiyunprevent stores from removing the object from the array between looking
241*4882a593Smuzhiyunup the object and incrementing the refcount.  You can also use RCU to
242*4882a593Smuzhiyunavoid dereferencing freed memory, but an explanation of that is beyond
243*4882a593Smuzhiyunthe scope of this document.
244*4882a593Smuzhiyun
245*4882a593SmuzhiyunThe XArray does not disable interrupts or softirqs while modifying
246*4882a593Smuzhiyunthe array.  It is safe to read the XArray from interrupt or softirq
247*4882a593Smuzhiyuncontext as the RCU lock provides enough protection.
248*4882a593Smuzhiyun
249*4882a593SmuzhiyunIf, for example, you want to store entries in the XArray in process
250*4882a593Smuzhiyuncontext and then erase them in softirq context, you can do that this way::
251*4882a593Smuzhiyun
252*4882a593Smuzhiyun    void foo_init(struct foo *foo)
253*4882a593Smuzhiyun    {
254*4882a593Smuzhiyun        xa_init_flags(&foo->array, XA_FLAGS_LOCK_BH);
255*4882a593Smuzhiyun    }
256*4882a593Smuzhiyun
257*4882a593Smuzhiyun    int foo_store(struct foo *foo, unsigned long index, void *entry)
258*4882a593Smuzhiyun    {
259*4882a593Smuzhiyun        int err;
260*4882a593Smuzhiyun
261*4882a593Smuzhiyun        xa_lock_bh(&foo->array);
262*4882a593Smuzhiyun        err = xa_err(__xa_store(&foo->array, index, entry, GFP_KERNEL));
263*4882a593Smuzhiyun        if (!err)
264*4882a593Smuzhiyun            foo->count++;
265*4882a593Smuzhiyun        xa_unlock_bh(&foo->array);
266*4882a593Smuzhiyun        return err;
267*4882a593Smuzhiyun    }
268*4882a593Smuzhiyun
269*4882a593Smuzhiyun    /* foo_erase() is only called from softirq context */
270*4882a593Smuzhiyun    void foo_erase(struct foo *foo, unsigned long index)
271*4882a593Smuzhiyun    {
272*4882a593Smuzhiyun        xa_lock(&foo->array);
273*4882a593Smuzhiyun        __xa_erase(&foo->array, index);
274*4882a593Smuzhiyun        foo->count--;
275*4882a593Smuzhiyun        xa_unlock(&foo->array);
276*4882a593Smuzhiyun    }
277*4882a593Smuzhiyun
278*4882a593SmuzhiyunIf you are going to modify the XArray from interrupt or softirq context,
279*4882a593Smuzhiyunyou need to initialise the array using xa_init_flags(), passing
280*4882a593Smuzhiyun``XA_FLAGS_LOCK_IRQ`` or ``XA_FLAGS_LOCK_BH``.
281*4882a593Smuzhiyun
282*4882a593SmuzhiyunThe above example also shows a common pattern of wanting to extend the
283*4882a593Smuzhiyuncoverage of the xa_lock on the store side to protect some statistics
284*4882a593Smuzhiyunassociated with the array.
285*4882a593Smuzhiyun
286*4882a593SmuzhiyunSharing the XArray with interrupt context is also possible, either
287*4882a593Smuzhiyunusing xa_lock_irqsave() in both the interrupt handler and process
288*4882a593Smuzhiyuncontext, or xa_lock_irq() in process context and xa_lock()
289*4882a593Smuzhiyunin the interrupt handler.  Some of the more common patterns have helper
290*4882a593Smuzhiyunfunctions such as xa_store_bh(), xa_store_irq(),
291*4882a593Smuzhiyunxa_erase_bh(), xa_erase_irq(), xa_cmpxchg_bh()
292*4882a593Smuzhiyunand xa_cmpxchg_irq().
293*4882a593Smuzhiyun
294*4882a593SmuzhiyunSometimes you need to protect access to the XArray with a mutex because
295*4882a593Smuzhiyunthat lock sits above another mutex in the locking hierarchy.  That does
296*4882a593Smuzhiyunnot entitle you to use functions like __xa_erase() without taking
297*4882a593Smuzhiyunthe xa_lock; the xa_lock is used for lockdep validation and will be used
298*4882a593Smuzhiyunfor other purposes in the future.
299*4882a593Smuzhiyun
300*4882a593SmuzhiyunThe __xa_set_mark() and __xa_clear_mark() functions are also
301*4882a593Smuzhiyunavailable for situations where you look up an entry and want to atomically
302*4882a593Smuzhiyunset or clear a mark.  It may be more efficient to use the advanced API
303*4882a593Smuzhiyunin this case, as it will save you from walking the tree twice.
304*4882a593Smuzhiyun
305*4882a593SmuzhiyunAdvanced API
306*4882a593Smuzhiyun============
307*4882a593Smuzhiyun
308*4882a593SmuzhiyunThe advanced API offers more flexibility and better performance at the
309*4882a593Smuzhiyuncost of an interface which can be harder to use and has fewer safeguards.
310*4882a593SmuzhiyunNo locking is done for you by the advanced API, and you are required
311*4882a593Smuzhiyunto use the xa_lock while modifying the array.  You can choose whether
312*4882a593Smuzhiyunto use the xa_lock or the RCU lock while doing read-only operations on
313*4882a593Smuzhiyunthe array.  You can mix advanced and normal operations on the same array;
314*4882a593Smuzhiyunindeed the normal API is implemented in terms of the advanced API.  The
315*4882a593Smuzhiyunadvanced API is only available to modules with a GPL-compatible license.
316*4882a593Smuzhiyun
317*4882a593SmuzhiyunThe advanced API is based around the xa_state.  This is an opaque data
318*4882a593Smuzhiyunstructure which you declare on the stack using the XA_STATE()
319*4882a593Smuzhiyunmacro.  This macro initialises the xa_state ready to start walking
320*4882a593Smuzhiyunaround the XArray.  It is used as a cursor to maintain the position
321*4882a593Smuzhiyunin the XArray and let you compose various operations together without
322*4882a593Smuzhiyunhaving to restart from the top every time.
323*4882a593Smuzhiyun
324*4882a593SmuzhiyunThe xa_state is also used to store errors.  You can call
325*4882a593Smuzhiyunxas_error() to retrieve the error.  All operations check whether
326*4882a593Smuzhiyunthe xa_state is in an error state before proceeding, so there's no need
327*4882a593Smuzhiyunfor you to check for an error after each call; you can make multiple
328*4882a593Smuzhiyuncalls in succession and only check at a convenient point.  The only
329*4882a593Smuzhiyunerrors currently generated by the XArray code itself are ``ENOMEM`` and
330*4882a593Smuzhiyun``EINVAL``, but it supports arbitrary errors in case you want to call
331*4882a593Smuzhiyunxas_set_err() yourself.
332*4882a593Smuzhiyun
333*4882a593SmuzhiyunIf the xa_state is holding an ``ENOMEM`` error, calling xas_nomem()
334*4882a593Smuzhiyunwill attempt to allocate more memory using the specified gfp flags and
335*4882a593Smuzhiyuncache it in the xa_state for the next attempt.  The idea is that you take
336*4882a593Smuzhiyunthe xa_lock, attempt the operation and drop the lock.  The operation
337*4882a593Smuzhiyunattempts to allocate memory while holding the lock, but it is more
338*4882a593Smuzhiyunlikely to fail.  Once you have dropped the lock, xas_nomem()
339*4882a593Smuzhiyuncan try harder to allocate more memory.  It will return ``true`` if it
340*4882a593Smuzhiyunis worth retrying the operation (i.e. that there was a memory error *and*
341*4882a593Smuzhiyunmore memory was allocated).  If it has previously allocated memory, and
342*4882a593Smuzhiyunthat memory wasn't used, and there is no error (or some error that isn't
343*4882a593Smuzhiyun``ENOMEM``), then it will free the memory previously allocated.
344*4882a593Smuzhiyun
345*4882a593SmuzhiyunInternal Entries
346*4882a593Smuzhiyun----------------
347*4882a593Smuzhiyun
348*4882a593SmuzhiyunThe XArray reserves some entries for its own purposes.  These are never
349*4882a593Smuzhiyunexposed through the normal API, but when using the advanced API, it's
350*4882a593Smuzhiyunpossible to see them.  Usually the best way to handle them is to pass them
351*4882a593Smuzhiyunto xas_retry(), and retry the operation if it returns ``true``.
352*4882a593Smuzhiyun
353*4882a593Smuzhiyun.. flat-table::
354*4882a593Smuzhiyun   :widths: 1 1 6
355*4882a593Smuzhiyun
356*4882a593Smuzhiyun   * - Name
357*4882a593Smuzhiyun     - Test
358*4882a593Smuzhiyun     - Usage
359*4882a593Smuzhiyun
360*4882a593Smuzhiyun   * - Node
361*4882a593Smuzhiyun     - xa_is_node()
362*4882a593Smuzhiyun     - An XArray node.  May be visible when using a multi-index xa_state.
363*4882a593Smuzhiyun
364*4882a593Smuzhiyun   * - Sibling
365*4882a593Smuzhiyun     - xa_is_sibling()
366*4882a593Smuzhiyun     - A non-canonical entry for a multi-index entry.  The value indicates
367*4882a593Smuzhiyun       which slot in this node has the canonical entry.
368*4882a593Smuzhiyun
369*4882a593Smuzhiyun   * - Retry
370*4882a593Smuzhiyun     - xa_is_retry()
371*4882a593Smuzhiyun     - This entry is currently being modified by a thread which has the
372*4882a593Smuzhiyun       xa_lock.  The node containing this entry may be freed at the end
373*4882a593Smuzhiyun       of this RCU period.  You should restart the lookup from the head
374*4882a593Smuzhiyun       of the array.
375*4882a593Smuzhiyun
376*4882a593Smuzhiyun   * - Zero
377*4882a593Smuzhiyun     - xa_is_zero()
378*4882a593Smuzhiyun     - Zero entries appear as ``NULL`` through the Normal API, but occupy
379*4882a593Smuzhiyun       an entry in the XArray which can be used to reserve the index for
380*4882a593Smuzhiyun       future use.  This is used by allocating XArrays for allocated entries
381*4882a593Smuzhiyun       which are ``NULL``.
382*4882a593Smuzhiyun
383*4882a593SmuzhiyunOther internal entries may be added in the future.  As far as possible, they
384*4882a593Smuzhiyunwill be handled by xas_retry().
385*4882a593Smuzhiyun
386*4882a593SmuzhiyunAdditional functionality
387*4882a593Smuzhiyun------------------------
388*4882a593Smuzhiyun
389*4882a593SmuzhiyunThe xas_create_range() function allocates all the necessary memory
390*4882a593Smuzhiyunto store every entry in a range.  It will set ENOMEM in the xa_state if
391*4882a593Smuzhiyunit cannot allocate memory.
392*4882a593Smuzhiyun
393*4882a593SmuzhiyunYou can use xas_init_marks() to reset the marks on an entry
394*4882a593Smuzhiyunto their default state.  This is usually all marks clear, unless the
395*4882a593SmuzhiyunXArray is marked with ``XA_FLAGS_TRACK_FREE``, in which case mark 0 is set
396*4882a593Smuzhiyunand all other marks are clear.  Replacing one entry with another using
397*4882a593Smuzhiyunxas_store() will not reset the marks on that entry; if you want
398*4882a593Smuzhiyunthe marks reset, you should do that explicitly.
399*4882a593Smuzhiyun
400*4882a593SmuzhiyunThe xas_load() will walk the xa_state as close to the entry
401*4882a593Smuzhiyunas it can.  If you know the xa_state has already been walked to the
402*4882a593Smuzhiyunentry and need to check that the entry hasn't changed, you can use
403*4882a593Smuzhiyunxas_reload() to save a function call.
404*4882a593Smuzhiyun
405*4882a593SmuzhiyunIf you need to move to a different index in the XArray, call
406*4882a593Smuzhiyunxas_set().  This resets the cursor to the top of the tree, which
407*4882a593Smuzhiyunwill generally make the next operation walk the cursor to the desired
408*4882a593Smuzhiyunspot in the tree.  If you want to move to the next or previous index,
409*4882a593Smuzhiyuncall xas_next() or xas_prev().  Setting the index does
410*4882a593Smuzhiyunnot walk the cursor around the array so does not require a lock to be
411*4882a593Smuzhiyunheld, while moving to the next or previous index does.
412*4882a593Smuzhiyun
413*4882a593SmuzhiyunYou can search for the next present entry using xas_find().  This
414*4882a593Smuzhiyunis the equivalent of both xa_find() and xa_find_after();
415*4882a593Smuzhiyunif the cursor has been walked to an entry, then it will find the next
416*4882a593Smuzhiyunentry after the one currently referenced.  If not, it will return the
417*4882a593Smuzhiyunentry at the index of the xa_state.  Using xas_next_entry() to
418*4882a593Smuzhiyunmove to the next present entry instead of xas_find() will save
419*4882a593Smuzhiyuna function call in the majority of cases at the expense of emitting more
420*4882a593Smuzhiyuninline code.
421*4882a593Smuzhiyun
422*4882a593SmuzhiyunThe xas_find_marked() function is similar.  If the xa_state has
423*4882a593Smuzhiyunnot been walked, it will return the entry at the index of the xa_state,
424*4882a593Smuzhiyunif it is marked.  Otherwise, it will return the first marked entry after
425*4882a593Smuzhiyunthe entry referenced by the xa_state.  The xas_next_marked()
426*4882a593Smuzhiyunfunction is the equivalent of xas_next_entry().
427*4882a593Smuzhiyun
428*4882a593SmuzhiyunWhen iterating over a range of the XArray using xas_for_each()
429*4882a593Smuzhiyunor xas_for_each_marked(), it may be necessary to temporarily stop
430*4882a593Smuzhiyunthe iteration.  The xas_pause() function exists for this purpose.
431*4882a593SmuzhiyunAfter you have done the necessary work and wish to resume, the xa_state
432*4882a593Smuzhiyunis in an appropriate state to continue the iteration after the entry
433*4882a593Smuzhiyunyou last processed.  If you have interrupts disabled while iterating,
434*4882a593Smuzhiyunthen it is good manners to pause the iteration and reenable interrupts
435*4882a593Smuzhiyunevery ``XA_CHECK_SCHED`` entries.
436*4882a593Smuzhiyun
437*4882a593SmuzhiyunThe xas_get_mark(), xas_set_mark() and xas_clear_mark() functions require
438*4882a593Smuzhiyunthe xa_state cursor to have been moved to the appropriate location in the
439*4882a593SmuzhiyunXArray; they will do nothing if you have called xas_pause() or xas_set()
440*4882a593Smuzhiyunimmediately before.
441*4882a593Smuzhiyun
442*4882a593SmuzhiyunYou can call xas_set_update() to have a callback function
443*4882a593Smuzhiyuncalled each time the XArray updates a node.  This is used by the page
444*4882a593Smuzhiyuncache workingset code to maintain its list of nodes which contain only
445*4882a593Smuzhiyunshadow entries.
446*4882a593Smuzhiyun
447*4882a593SmuzhiyunMulti-Index Entries
448*4882a593Smuzhiyun-------------------
449*4882a593Smuzhiyun
450*4882a593SmuzhiyunThe XArray has the ability to tie multiple indices together so that
451*4882a593Smuzhiyunoperations on one index affect all indices.  For example, storing into
452*4882a593Smuzhiyunany index will change the value of the entry retrieved from any index.
453*4882a593SmuzhiyunSetting or clearing a mark on any index will set or clear the mark
454*4882a593Smuzhiyunon every index that is tied together.  The current implementation
455*4882a593Smuzhiyunonly allows tying ranges which are aligned powers of two together;
456*4882a593Smuzhiyuneg indices 64-127 may be tied together, but 2-6 may not be.  This may
457*4882a593Smuzhiyunsave substantial quantities of memory; for example tying 512 entries
458*4882a593Smuzhiyuntogether will save over 4kB.
459*4882a593Smuzhiyun
460*4882a593SmuzhiyunYou can create a multi-index entry by using XA_STATE_ORDER()
461*4882a593Smuzhiyunor xas_set_order() followed by a call to xas_store().
462*4882a593SmuzhiyunCalling xas_load() with a multi-index xa_state will walk the
463*4882a593Smuzhiyunxa_state to the right location in the tree, but the return value is not
464*4882a593Smuzhiyunmeaningful, potentially being an internal entry or ``NULL`` even when there
465*4882a593Smuzhiyunis an entry stored within the range.  Calling xas_find_conflict()
466*4882a593Smuzhiyunwill return the first entry within the range or ``NULL`` if there are no
467*4882a593Smuzhiyunentries in the range.  The xas_for_each_conflict() iterator will
468*4882a593Smuzhiyuniterate over every entry which overlaps the specified range.
469*4882a593Smuzhiyun
470*4882a593SmuzhiyunIf xas_load() encounters a multi-index entry, the xa_index
471*4882a593Smuzhiyunin the xa_state will not be changed.  When iterating over an XArray
472*4882a593Smuzhiyunor calling xas_find(), if the initial index is in the middle
473*4882a593Smuzhiyunof a multi-index entry, it will not be altered.  Subsequent calls
474*4882a593Smuzhiyunor iterations will move the index to the first index in the range.
475*4882a593SmuzhiyunEach entry will only be returned once, no matter how many indices it
476*4882a593Smuzhiyunoccupies.
477*4882a593Smuzhiyun
478*4882a593SmuzhiyunUsing xas_next() or xas_prev() with a multi-index xa_state is not
479*4882a593Smuzhiyunsupported.  Using either of these functions on a multi-index entry will
480*4882a593Smuzhiyunreveal sibling entries; these should be skipped over by the caller.
481*4882a593Smuzhiyun
482*4882a593SmuzhiyunStoring ``NULL`` into any index of a multi-index entry will set the
483*4882a593Smuzhiyunentry at every index to ``NULL`` and dissolve the tie.  A multi-index
484*4882a593Smuzhiyunentry can be split into entries occupying smaller ranges by calling
485*4882a593Smuzhiyunxas_split_alloc() without the xa_lock held, followed by taking the lock
486*4882a593Smuzhiyunand calling xas_split().
487*4882a593Smuzhiyun
488*4882a593SmuzhiyunFunctions and structures
489*4882a593Smuzhiyun========================
490*4882a593Smuzhiyun
491*4882a593Smuzhiyun.. kernel-doc:: include/linux/xarray.h
492*4882a593Smuzhiyun.. kernel-doc:: lib/xarray.c
493