.. SPDX-License-Identifier: GPL-2.0

Introduction of Uacce
---------------------

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from the data sharing between a CPU and an I/O device, which
share only data content rather than addresses.
Because of the unified address space, the hardware and the user space of a
process can share the same virtual addresses in their communication.
Uacce treats the hardware accelerator as a heterogeneous processor, while
the IOMMU shares the same CPU page tables and, as a result, the same
translation from va to pa.

::

         __________________________       __________________________
        |                          |     |                          |
        |  User application (CPU)  |     |   Hardware Accelerator   |
        |__________________________|     |__________________________|

                     |                                 |
                     | va                              | va
                     V                                 V
                 __________                        __________
                |          |                      |          |
                |   MMU    |                      |  IOMMU   |
                |__________|                      |__________|
                     |                                 |
                     |                                 |
                     V pa                              V pa
                 _______________________________________
                |                                       |
                |              Memory                   |
                |_______________________________________|



Architecture
------------

Uacce is the kernel module that takes charge of the IOMMU and address sharing.
The user drivers and libraries are called WarpDrive.

The uacce device, built around the IOMMU SVA API, can access multiple
address spaces, including the one without PASID.

A virtual concept, the queue, is used for the communication. It provides a
FIFO-like interface and maintains a unified address space between the
application and all involved hardware.

::

                             ___________________                  ________________
                            |                   |   user API     |                |
                            | WarpDrive library | ------------>  |  user driver   |
                            |___________________|                |________________|
                                     |                                    |
                                     |                                    |
                                     | queue fd                           |
                                     |                                    |
                                     |                                    |
                                     v                                    |
     ___________________         _________                                |
    |                   |       |         |                               | mmap memory
    | Other framework   |       |  uacce  |                               | r/w interface
    | crypto/nic/others |       |_________|                               |
    |___________________|                                                 |
             |                       |                                    |
             | register              | register                           |
             |                       |                                    |
             |                       |                                    |
             |                _________________       __________          |
             |               |                 |     |          |         |
              -------------  |  Device Driver  |     |  IOMMU   |         |
                             |_________________|     |__________|         |
                                     |                                    |
                                     |                                    V
                                     |                            ___________________
                                     |                           |                   |
                                     --------------------------  |  Device(Hardware) |
                                                                 |___________________|


How does it work
----------------

Uacce uses mmap and the IOMMU to do the trick.

Uacce creates a chrdev for every device registered to it. A new queue is
created when a user application opens the chrdev. The file descriptor is used
as the user handle of the queue.
The accelerator device presents itself as a Uacce object, which is exported as
a chrdev to user space. The user application communicates with the
hardware by ioctl (as control path) or shared memory (as data path).

The control path to the hardware is via file operations, while the data path
is via the mmap space of the queue fd.

The queue file address space:

::

        /**
        * enum uacce_qfrt: qfrt type
        * @UACCE_QFRT_MMIO: device mmio region
        * @UACCE_QFRT_DUS: device user share region
        */
        enum uacce_qfrt {
                UACCE_QFRT_MMIO = 0,
                UACCE_QFRT_DUS = 1,
        };

All regions are optional and differ from device type to device type.
Each region can be mmapped only once; otherwise -EEXIST is returned.

The device mmio region is mapped to the hardware mmio space. It is generally
used for doorbells or other notifications to the hardware. It is not fast
enough to serve as a data channel.

The device user share region is used for sharing data buffers between the user
process and the device.


The Uacce register API
----------------------

The register API is defined in uacce.h.

::

        struct uacce_interface {
                char name[UACCE_MAX_NAME_SIZE];
                unsigned int flags;
                const struct uacce_ops *ops;
        };

According to the IOMMU capability, uacce_interface flags can be:

::

        /**
        * UACCE Device flags:
        * UACCE_DEV_SVA: Shared Virtual Addresses
        *                Support PASID
        *                Support device page faults (PCI PRI or SMMU Stall)
        */
        #define UACCE_DEV_SVA               BIT(0)

        struct uacce_device *uacce_alloc(struct device *parent,
                                         struct uacce_interface *interface);
        int uacce_register(struct uacce_device *uacce);
        void uacce_remove(struct uacce_device *uacce);

uacce_alloc results can be:

a. If the uacce module is not compiled, ERR_PTR(-ENODEV)

b. Succeed with the desired flags

c. Succeed with the negotiated flags, for example

  uacce_interface.flags = UACCE_DEV_SVA but uacce->flags = ~UACCE_DEV_SVA

  So the calling driver needs to check the return value as well as the
  negotiated uacce->flags.
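
The check described in case (c) can be modeled in plain C as below. This is a
user-space sketch only, not kernel code: ``ERR_PTR``, ``IS_ERR`` and
``PTR_ERR`` are simplified stand-ins for the kernel helpers, ``struct
uacce_device`` is reduced to its ``flags`` field, and ``sva_granted`` is a
hypothetical helper. Since ERR_PTR() values are pointers, the check is modeled
on the pointer returned by uacce_alloc().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)        (1U << (n))
#define UACCE_DEV_SVA BIT(0)

/* User-space stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err)
{
	return (void *)err;
}

static inline long PTR_ERR(const void *p)
{
	return (long)p;
}

static inline bool IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Reduced model of struct uacce_device: only the negotiated flags. */
struct uacce_device {
	unsigned int flags;
};

/*
 * Hypothetical helper: returns a negative errno if the allocation failed,
 * 1 if SVA was granted, and 0 if the framework cleared UACCE_DEV_SVA
 * during negotiation.
 */
static int sva_granted(const struct uacce_device *uacce)
{
	if (IS_ERR(uacce))
		return (int)PTR_ERR(uacce);

	return (uacce->flags & UACCE_DEV_SVA) ? 1 : 0;
}
```

In a real driver the same two-step pattern applies: first test the returned
pointer for an error, then test ``uacce->flags`` before relying on SVA.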


The user driver
---------------

The queue file mmap space needs a user driver to wrap the communication
protocol. Uacce provides some attributes in sysfs for the user driver to
match the right accelerator accordingly.
More details are in Documentation/ABI/testing/sysfs-driver-uacce.
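
A user driver might read those sysfs attributes along the following lines.
This is a sketch under assumptions: the ``/sys/class/uacce/<dev>/<attr>``
layout and the attribute names should be confirmed against the ABI file
referenced above, and ``read_attr`` and ``uacce_attr_path`` are hypothetical
helper names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read a one-line attribute file into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_attr(const char *path, char *buf, size_t size)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)size, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Build the attribute path for a device. The directory layout is an
 * assumption; see Documentation/ABI/testing/sysfs-driver-uacce.
 * Returns 0 on success, -1 if the path does not fit in out.
 */
static int uacce_attr_path(char *out, size_t size, const char *dev,
			   const char *attr)
{
	int n = snprintf(out, size, "/sys/class/uacce/%s/%s", dev, attr);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The user driver can compare the values it reads against what it supports and
open the matching chrdev only when they agree.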