.. _userfaultfd:

===========
Userfaultfd
===========

Objective
=========

Userfaults allow the implementation of on-demand paging from userland
and, more generally, they allow userland to take control of various
memory page faults, something otherwise only the kernel code could do.

For example, userfaults allow a proper and more optimal implementation
of the ``PROT_NONE+SIGSEGV`` trick.

Design
======

Userfaults are delivered and resolved through the ``userfaultfd`` syscall.

The ``userfaultfd`` (aside from registering and unregistering virtual
memory ranges) provides two primary functionalities:

1) a ``read/POLLIN`` protocol to notify a userland thread of the faults
   happening

2) various ``UFFDIO_*`` ioctls that can manage the virtual memory regions
   registered in the ``userfaultfd``, allowing userland to efficiently
   resolve the userfaults it receives via 1) or to manage the virtual
   memory in the background

The real advantage of userfaults compared to regular virtual memory
management via mremap/mprotect is that userfaults never involve
heavyweight structures like vmas in any of their operations (in fact the
``userfaultfd`` runtime load never takes the mmap_lock for writing).

Vmas are not suitable for page- (or hugepage-) granular fault tracking
when dealing with virtual address spaces that could span
terabytes. Too many vmas would be needed for that.

The ``userfaultfd``, once opened by invoking the syscall, can also be
passed using unix domain sockets to a manager process, so the same
manager process could handle the userfaults of a multitude of
different processes without them being aware of what is going on
(well, of course, unless they later try to use the ``userfaultfd``
themselves on the same region the manager is already tracking, which
is a corner case that would currently return ``-EBUSY``).

API
===

When first opened the ``userfaultfd`` must be enabled by invoking the
``UFFDIO_API`` ioctl with ``uffdio_api.api`` set to ``UFFD_API`` (or
a later API version), which specifies the ``read/POLLIN`` protocol
userland intends to speak on the ``UFFD``, and with the
``uffdio_api.features`` userland requires. If successful (i.e. if the
requested ``uffdio_api.api`` is also spoken by the running kernel and the
requested features are going to be enabled), the ``UFFDIO_API`` ioctl
returns in ``uffdio_api.features`` and ``uffdio_api.ioctls`` two 64-bit
bitmasks of, respectively, all the available features of the read(2)
protocol and the generic ioctls available.
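
The open/handshake sequence might look like the following. This is an
illustrative sketch, not code from the kernel sources; the helper name is
made up, error handling is minimal, and no optional features are
requested::

    #include <fcntl.h>
    #include <linux/userfaultfd.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Hypothetical helper: returns an enabled userfaultfd, or -1. */
    static int uffd_open_and_enable(void)
    {
        /* There is historically no libc wrapper, so use syscall(2). */
        int uffd = syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
        if (uffd < 0)
            return -1;

        struct uffdio_api api;
        memset(&api, 0, sizeof(api));
        api.api = UFFD_API;   /* the read/POLLIN protocol version */
        api.features = 0;     /* request no optional features */
        if (ioctl(uffd, UFFDIO_API, &api) < 0) {
            close(uffd);
            return -1;
        }
        /* api.features and api.ioctls now hold the two bitmasks. */
        return uffd;
    }

    int main(void)
    {
        int uffd = uffd_open_and_enable();
        if (uffd >= 0) {
            puts("userfaultfd enabled");
            close(uffd);
        } else {
            puts("userfaultfd unavailable");
        }
        return 0;
    }

Note that on kernels that restrict unprivileged ``userfaultfd`` the
syscall itself may fail with ``EPERM``; the sketch degrades to the
"unavailable" path in that case.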

The ``uffdio_api.features`` bitmask returned by the ``UFFDIO_API`` ioctl
defines what memory types are supported by the ``userfaultfd`` and what
events, other than page fault notifications, may be generated:

- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various events other
  than page faults are supported. These events are described in more
  detail below in the `Non-cooperative userfaultfd`_ section.

- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
  indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
  registrations for hugetlbfs and shared memory (covering all shmem APIs,
  i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
  etc) virtual memory areas, respectively.

- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
  ``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
  areas. ``UFFD_FEATURE_MINOR_SHMEM`` is the analogous feature indicating
  support for shmem virtual memory areas.

The userland application should set the feature flags it intends to use
when invoking the ``UFFDIO_API`` ioctl, to request that those features be
enabled if supported.

Once the ``userfaultfd`` API has been enabled, the ``UFFDIO_REGISTER``
ioctl should be invoked (if present in the returned ``uffdio_api.ioctls``
bitmask) to register a memory range in the ``userfaultfd`` by setting the
uffdio_register structure accordingly.
The ``uffdio_register.mode``
bitmask specifies to the kernel which kinds of faults to track for
the range. The ``UFFDIO_REGISTER`` ioctl returns the
``uffdio_register.ioctls`` bitmask of ioctls that are suitable to resolve
userfaults on the registered range. Not all ioctls will necessarily be
supported for all memory types (e.g. anonymous memory vs. shmem vs.
hugetlbfs), or for all types of intercepted faults.

Userland can use the ``uffdio_register.ioctls`` to manage the virtual
address space in the background (to add or potentially also remove
memory from the ``userfaultfd`` registered range). This means a userfault
could be triggering just before userland maps the user-faulted page in
the background.

Resolving Userfaults
--------------------

There are three basic ways to resolve userfaults:

- ``UFFDIO_COPY`` atomically copies some existing page contents from
  userspace.

- ``UFFDIO_ZEROPAGE`` atomically zeros the new page.

- ``UFFDIO_CONTINUE`` maps an existing, previously-populated page.

These operations are atomic in the sense that they guarantee nothing can
see a half-populated page, since readers will keep userfaulting until the
operation has finished.

By default, these wake up userfaults blocked on the range in question.
They support a ``UFFDIO_*_MODE_DONTWAKE`` ``mode`` flag, which indicates
that waking will be done separately at some later time.

Which ioctl to choose depends on the kind of page fault, and what we'd
like to do to resolve it:

- For ``UFFDIO_REGISTER_MODE_MISSING`` faults, the fault needs to be
  resolved by either providing a new page (``UFFDIO_COPY``), or mapping
  the zero page (``UFFDIO_ZEROPAGE``). By default, the kernel would map
  the zero page for a missing fault. With userfaultfd, userspace can
  decide what content to provide before the faulting thread continues.

- For ``UFFDIO_REGISTER_MODE_MINOR`` faults, there is an existing page (in
  the page cache). Userspace has the option of modifying the page's
  contents before resolving the fault. Once the contents are correct
  (modified or not), userspace asks the kernel to map the page and let the
  faulting thread continue with ``UFFDIO_CONTINUE``.

Notes:

- You can tell which kind of fault occurred by examining
  ``pagefault.flags`` within the ``uffd_msg``, checking for the
  ``UFFD_PAGEFAULT_FLAG_*`` flags.

- None of the page-delivering ioctls default to the range that you
  registered with. You must fill in all fields of the appropriate
  ioctl struct, including the range.

- You get the address of the access that triggered the missing page
  event out of a struct uffd_msg that you read in the thread from the
  uffd. You can supply as many pages as you want with these ioctls.
  Keep in mind that unless you used DONTWAKE, the first of any of
  those ioctls wakes up the faulting thread.

- Be sure to test for all errors, including
  ``pollfd[0].revents & POLLERR``. This can happen, e.g. when the ranges
  supplied were incorrect.

Write Protect Notifications
---------------------------

This is equivalent to (but faster than) using mprotect and a SIGSEGV
signal handler.

First you need to register a range with ``UFFDIO_REGISTER_MODE_WP``.
Then, instead of using mprotect(2), you use
``ioctl(uffd, UFFDIO_WRITEPROTECT, struct uffdio_writeprotect *)``
with ``mode = UFFDIO_WRITEPROTECT_MODE_WP``
in the struct passed in. The range does not default to, and does not
have to be identical to, the range you registered with. You can write
protect as many ranges as you like (inside the registered range).
Then, in the thread reading from uffd, the struct will have
``msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP`` set. Now you send
``ioctl(uffd, UFFDIO_WRITEPROTECT, struct uffdio_writeprotect *)``
again, this time without ``UFFDIO_WRITEPROTECT_MODE_WP`` set in the
``mode`` field. This wakes up the thread, which will continue to run
with writes.
This
allows you to do the bookkeeping about the write in the uffd reading
thread before the ioctl.

If you registered with both ``UFFDIO_REGISTER_MODE_MISSING`` and
``UFFDIO_REGISTER_MODE_WP``, then you need to think about the sequence in
which you supply a page and undo the write protection. Note that there is
a difference between writes into a WP area and into a !WP area. The
former will have ``UFFD_PAGEFAULT_FLAG_WP`` set, the latter
``UFFD_PAGEFAULT_FLAG_WRITE``. The latter did not fail on protection, but
you still need to supply a page when ``UFFDIO_REGISTER_MODE_MISSING`` was
used.

QEMU/KVM
========

QEMU/KVM uses the ``userfaultfd`` syscall to implement postcopy live
migration. Postcopy live migration is one form of memory
externalization consisting of a virtual machine running with part or
all of its memory residing on a different node in the cloud. The
``userfaultfd`` abstraction is generic enough that not a single line of
KVM kernel code had to be modified in order to add postcopy live
migration to QEMU.

Guest async page faults, ``FOLL_NOWAIT`` and all other ``GUP*`` features
work just fine in combination with userfaults. Userfaults trigger async
page faults in the guest scheduler so those guest processes that
aren't waiting for userfaults (i.e. network bound) can keep running in
the guest vcpus.

It is generally beneficial to run one pass of precopy live migration
just before starting postcopy live migration, in order to avoid
generating userfaults for readonly guest regions.

The implementation of postcopy live migration currently uses one
single bidirectional socket, but in the future two different sockets
will be used (to reduce the latency of the userfaults to the minimum
possible without having to decrease ``/proc/sys/net/ipv4/tcp_wmem``).

The QEMU on the source node writes all pages that it knows are missing
on the destination node into the socket, and the migration thread of
the QEMU running on the destination node runs ``UFFDIO_COPY|ZEROPAGE``
ioctls on the ``userfaultfd`` in order to map the received pages into the
guest (``UFFDIO_ZEROPAGE`` is used if the source page was a zero page).

A different postcopy thread on the destination node listens with
poll() to the ``userfaultfd`` in parallel. When a ``POLLIN`` event is
generated after a userfault triggers, the postcopy thread reads from
the ``userfaultfd`` and receives the fault address (or ``-EAGAIN`` in
case the userfault was already resolved and woken by a
``UFFDIO_COPY|ZEROPAGE`` run by the parallel QEMU migration thread).

After the QEMU postcopy thread (running on the destination node) gets
the userfault address, it writes the information about the missing page
into the socket.
The QEMU on the source node receives the information,
roughly "seeks" to that page address, and continues sending all
remaining missing pages from that new page offset. Soon after that
(just the time to flush the tcp_wmem queue through the network), the
migration thread of the QEMU running on the destination node will
receive the page that triggered the userfault and it'll map it as
usual with ``UFFDIO_COPY|ZEROPAGE`` (without actually knowing if it
was spontaneously sent by the source or if it was an urgent page
requested through a userfault).

By the time the userfaults start, the QEMU on the destination node
doesn't need to keep any per-page state bitmap relative to the live
migration around, and a single per-page bitmap has to be maintained in
the QEMU running on the source node to know which pages are still
missing on the destination node. The bitmap on the source node is
checked to find which missing pages to send in round robin, and we seek
over it when receiving incoming userfaults. After sending each page, of
course, the bitmap is updated accordingly. It's also useful to avoid
sending the same page twice (in case the userfault is read by the
postcopy thread just before ``UFFDIO_COPY|ZEROPAGE`` runs in the
migration thread).
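
The poll/read/``UFFDIO_COPY`` service loop described above can be
sketched in miniature outside of QEMU. The following stand-alone example
is illustrative only: it registers a single anonymous page, lets a
second thread fault on it, and resolves the fault from a local buffer
rather than a migration socket; it bails out quietly where the kernel
restricts ``userfaultfd``. Compile with ``-lpthread``::

    #include <fcntl.h>
    #include <linux/userfaultfd.h>
    #include <poll.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static void *toucher(void *p)
    {
        /* First access faults and blocks until the fault is resolved. */
        return (void *)(long)*(volatile char *)p;
    }

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);

        int uffd = syscall(SYS_userfaultfd, O_CLOEXEC);
        if (uffd < 0) { puts("userfaultfd unavailable"); return 0; }

        struct uffdio_api api = { .api = UFFD_API };
        if (ioctl(uffd, UFFDIO_API, &api)) { puts("userfaultfd unavailable"); return 0; }

        char *region = mmap(NULL, page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        char *src = mmap(NULL, page, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED || src == MAP_FAILED) return 1;
        memset(src, 'A', page);    /* contents to install on fault */

        struct uffdio_register reg = {
            .range = { .start = (unsigned long)region, .len = page },
            .mode  = UFFDIO_REGISTER_MODE_MISSING,
        };
        if (ioctl(uffd, UFFDIO_REGISTER, &reg)) { puts("userfaultfd unavailable"); return 0; }

        pthread_t t;
        pthread_create(&t, NULL, toucher, region);

        /* The service loop, reduced to a single iteration. */
        struct pollfd pfd = { .fd = uffd, .events = POLLIN };
        poll(&pfd, 1, -1);

        struct uffd_msg msg;
        if (read(uffd, &msg, sizeof(msg)) != sizeof(msg)) return 1;

        struct uffdio_copy copy = {
            .dst = msg.arg.pagefault.address & ~(page - 1),
            .src = (unsigned long)src,
            .len = page,           /* mode = 0: wake the faulter */
        };
        if (ioctl(uffd, UFFDIO_COPY, &copy)) return 1;

        pthread_join(t, NULL);
        printf("resolved: %c\n", region[0]);
        return 0;
    }

A real postcopy implementation would loop over many such messages and
fill ``src`` from the network instead of memset().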

Non-cooperative userfaultfd
===========================

When the ``userfaultfd`` is monitored by an external manager, the manager
must be able to track changes in the process's virtual memory
layout. Userfaultfd can notify the manager about such changes using
the same read(2) protocol as for the page fault notifications. The
manager has to explicitly enable these events by setting the appropriate
bits in ``uffdio_api.features`` passed to the ``UFFDIO_API`` ioctl:

``UFFD_FEATURE_EVENT_FORK``
        enable ``userfaultfd`` hooks for fork(). When this feature is
        enabled, the ``userfaultfd`` context of the parent process is
        duplicated into the newly created process. The manager
        receives ``UFFD_EVENT_FORK`` with the file descriptor of the new
        ``userfaultfd`` context in ``uffd_msg.fork``.

``UFFD_FEATURE_EVENT_REMAP``
        enable notifications about mremap() calls. When the
        non-cooperative process moves a virtual memory area to a
        different location, the manager will receive
        ``UFFD_EVENT_REMAP``. The ``uffd_msg.remap`` will contain the old
        and new addresses of the area and its original length.

``UFFD_FEATURE_EVENT_REMOVE``
        enable notifications about madvise(MADV_REMOVE) and
        madvise(MADV_DONTNEED) calls. The event ``UFFD_EVENT_REMOVE`` will
        be generated upon these calls to madvise().
        The ``uffd_msg.remove``
        will contain the start and end addresses of the removed area.

``UFFD_FEATURE_EVENT_UNMAP``
        enable notifications about memory unmapping. The manager will
        get ``UFFD_EVENT_UNMAP`` with ``uffd_msg.remove`` containing the
        start and end addresses of the unmapped area.

Although ``UFFD_FEATURE_EVENT_REMOVE`` and ``UFFD_FEATURE_EVENT_UNMAP``
are pretty similar, they differ significantly in the action expected from
the ``userfaultfd`` manager. In the former case, the virtual memory is
removed, but the area is not; the area remains monitored by the
``userfaultfd``, and if a page fault occurs in that area it will be
delivered to the manager. The proper resolution for such a page fault is
to zeromap the faulting address. However, in the latter case, when an
area is unmapped, either explicitly (with the munmap() system call) or
implicitly (e.g. during mremap()), the area is removed and in turn the
``userfaultfd`` context for such an area disappears too, and the manager
will not get further userland page faults from the removed area. Still,
the notification is required in order to prevent the manager from using
``UFFDIO_COPY`` on the unmapped area.

Unlike userland page faults, which have to be synchronous and require
explicit or implicit wakeup, all the events are delivered
asynchronously and the non-cooperative process resumes execution as
soon as the manager executes read().
The ``userfaultfd`` manager should 305*4882a593Smuzhiyuncarefully synchronize calls to ``UFFDIO_COPY`` with the events 306*4882a593Smuzhiyunprocessing. To aid the synchronization, the ``UFFDIO_COPY`` ioctl will 307*4882a593Smuzhiyunreturn ``-ENOSPC`` when the monitored process exits at the time of 308*4882a593Smuzhiyun``UFFDIO_COPY``, and ``-ENOENT``, when the non-cooperative process has changed 309*4882a593Smuzhiyunits virtual memory layout simultaneously with outstanding ``UFFDIO_COPY`` 310*4882a593Smuzhiyunoperation. 311*4882a593Smuzhiyun 312*4882a593SmuzhiyunThe current asynchronous model of the event delivery is optimal for 313*4882a593Smuzhiyunsingle threaded non-cooperative ``userfaultfd`` manager implementations. A 314*4882a593Smuzhiyunsynchronous event delivery model can be added later as a new 315*4882a593Smuzhiyun``userfaultfd`` feature to facilitate multithreading enhancements of the 316*4882a593Smuzhiyunnon cooperative manager, for example to allow ``UFFDIO_COPY`` ioctls to 317*4882a593Smuzhiyunrun in parallel to the event reception. Single threaded 318*4882a593Smuzhiyunimplementations should continue to use the current async event 319*4882a593Smuzhiyundelivery model instead. 320