.. SPDX-License-Identifier: GPL-2.0
.. _VAS-API:

===================================================
Virtual Accelerator Switchboard (VAS) userspace API
===================================================

Introduction
============

The Power9 processor introduced the Virtual Accelerator Switchboard (VAS),
which allows both userspace and the kernel to communicate with a
co-processor (hardware accelerator) referred to as the Nest Accelerator
(NX). The NX unit comprises one or more hardware engines or co-processor
types such as 842 compression, GZIP compression and encryption. On Power9,
userspace applications have access only to the GZIP compression engine,
which supports the ZLIB and GZIP compression algorithms in hardware.

To communicate with NX, the kernel has to establish a channel, or window;
requests can then be submitted directly without kernel involvement.
Requests to the GZIP engine must be formatted as a Co-processor Request
Block (CRB), and these CRBs must be submitted to the NX using COPY/PASTE
instructions that paste the CRB to the hardware address associated with
the engine's request queue.

The GZIP engine provides two priority levels of requests: Normal and
High. Only Normal requests are currently supported from userspace.

This document explains the userspace API used to interact with the
kernel to set up a channel / window that can be used to send compression
requests directly to the NX accelerator.


Overview
========

Application access to the GZIP engine is provided through the
/dev/crypto/nx-gzip device node implemented by the VAS/NX device driver.
An application must open the /dev/crypto/nx-gzip device to obtain a file
descriptor (fd), then issue the VAS_TX_WIN_OPEN ioctl with this fd to
establish a connection to the engine. This opens a send window on the GZIP
engine for this process. Once a connection is established, the application
should use the mmap() system call to map the hardware address of the
engine's request queue into the application's virtual address space.

The application can then submit one or more requests to the engine by
using copy/paste instructions and pasting the CRBs to the virtual address
(aka paste_address) returned by mmap(). User space can close the
established connection or send window by closing the file descriptor
(close(fd)); the window is also closed when the process exits.

Note that applications can send several requests through the same window
or can establish multiple windows, but each file descriptor has only one
window.

The following sections provide additional details and references about the
individual steps.

NX-GZIP Device Node
===================

There is one /dev/crypto/nx-gzip node in the system and it provides
access to all GZIP engines in the system. The only valid operations on
/dev/crypto/nx-gzip are:

	* open() the device for read and write.
	* issue the VAS_TX_WIN_OPEN ioctl.
	* mmap() the engine's request queue into the application's virtual
	  address space (i.e. get a paste_address for the co-processor
	  engine).
	* close() the device node.

Other file operations on this device node are undefined.

Note that the copy and paste operations go directly to the hardware and
do not go through this device. Refer to the COPY/PASTE document for more
details.

Although a system may have several instances of the NX co-processor
engines (typically, one per P9 chip), there is just one
/dev/crypto/nx-gzip device node in the system. When the nx-gzip device
node is opened, the kernel opens a send window on a suitable instance of
the NX accelerator. It finds the CPU on which the user process is
executing and determines the NX instance for the chip to which this CPU
belongs.

Applications may choose a specific instance of the NX co-processor using
the vas_id field in the VAS_TX_WIN_OPEN ioctl as detailed below.

A userspace library, libnxz, is available here but is still in development:

	 https://github.com/abalib/power-gzip

Applications that use inflate / deflate calls can link with libnxz
instead of libz and use NX GZIP compression without any modification.

Open /dev/crypto/nx-gzip
========================

The nx-gzip device should be opened for read and write. No special
privileges are needed to open the device. Each window corresponds to one
file descriptor, so if the userspace process needs multiple windows,
it has to issue several open calls.

See the open(2) man page for other details such as return values,
error codes and restrictions.

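The one-window-per-descriptor rule can be sketched as a small helper that
opens several descriptors on the same node. This is only an illustration:
``open_window_fds`` is a hypothetical name, not part of the kernel or of
libnxz, and the path is passed in by the caller.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `count` descriptors on `path` for read and write, one per
 * window. Returns `count` on success. On failure, every descriptor
 * opened so far is closed again and -1 is returned.
 * (open_window_fds is a hypothetical helper name.)
 */
static int open_window_fds(const char *path, int *fds, int count)
{
	for (int i = 0; i < count; i++) {
		fds[i] = open(path, O_RDWR);
		if (fds[i] < 0) {
			while (i-- > 0)
				close(fds[i]);
			return -1;
		}
	}
	return count;
}
```

A caller would pass "/dev/crypto/nx-gzip" as ``path`` and then issue the
VAS_TX_WIN_OPEN ioctl on each returned descriptor.
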
VAS_TX_WIN_OPEN ioctl
=====================

Applications should use the VAS_TX_WIN_OPEN ioctl as follows to establish
a connection with the NX co-processor engine:

	::

		struct vas_tx_win_open_attr {
			__u32   version;
			__s16   vas_id; /* specific instance of vas or -1
						for default */
			__u16   reserved1;
			__u64   flags;	/* For future use */
			__u64   reserved2[6];
		};

	version:
		The version field must currently be set to 1.
	vas_id:
		If '-1' is passed, the kernel will make a best-effort attempt
		to assign an optimal instance of NX for the process. To
		select a specific VAS instance, refer to the
		"Discovery of available VAS engines" section below.

	The flags, reserved1 and reserved2[6] fields are for future extension
	and must be set to 0.

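One way to satisfy the "must be set to 0" rule is to zero the whole
structure before filling in the two meaningful fields. The sketch below
uses a local mirror of the structure (``tx_win_open_attr_sketch``, a
hypothetical name built from ``<stdint.h>`` types) only so that it compiles
without the powerpc kernel headers. Real code should use the kernel's
``struct vas_tx_win_open_attr``.

```c
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of struct vas_tx_win_open_attr, assumed to have the
 * same field order and sizes as the definition shown above.
 */
struct tx_win_open_attr_sketch {
	uint32_t version;
	int16_t  vas_id;	/* specific VAS instance, or -1 for default */
	uint16_t reserved1;
	uint64_t flags;
	uint64_t reserved2[6];
};

/* Zero everything first so flags and the reserved fields are 0. */
static void init_tx_attr(struct tx_win_open_attr_sketch *attr,
			 int16_t vas_id)
{
	memset(attr, 0, sizeof(*attr));
	attr->version = 1;
	attr->vas_id = vas_id;
}
```

Calling ``init_tx_attr(&attr, -1)`` leaves the choice of NX instance to the
kernel.
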
	The attributes attr for the VAS_TX_WIN_OPEN ioctl are defined as
	follows::

		#define VAS_MAGIC 'v'
		#define VAS_TX_WIN_OPEN _IOW(VAS_MAGIC, 1, \
						struct vas_tx_win_open_attr)

		struct vas_tx_win_open_attr attr;
		rc = ioctl(fd, VAS_TX_WIN_OPEN, &attr);

	The VAS_TX_WIN_OPEN ioctl returns 0 on success. On error, it
	returns -1 and sets the errno variable to indicate the error.

	Error conditions:

		======	================================================
		EINVAL	fd does not refer to a valid VAS device.
		EINVAL	Invalid vas ID
		EINVAL	version is not set to a proper value
		EEXIST	Window is already opened for the given fd
		ENOMEM	Memory is not available to allocate window
		ENOSPC	System has too many active windows (connections)
			opened
		EINVAL	reserved fields are not set to 0.
		======	================================================

	See the ioctl(2) man page for more details, error codes and
	restrictions.

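Because several distinct conditions share EINVAL, an application can only
report a combined hint for that value. A minimal sketch of such a mapping
follows. ``tx_win_open_hint`` is a hypothetical helper, and the strings
simply restate the error table in this section.

```c
#include <errno.h>

/* Map the errno values listed for VAS_TX_WIN_OPEN to short hints. */
static const char *tx_win_open_hint(int err)
{
	switch (err) {
	case EINVAL:
		return "bad fd, vas_id, version, or non-zero reserved field";
	case EEXIST:
		return "a window is already open for this fd";
	case ENOMEM:
		return "no memory available to allocate a window";
	case ENOSPC:
		return "too many active windows in the system";
	default:
		return "see ioctl(2)";
	}
}
```

A caller would typically pass ``errno`` right after a failed ioctl() and
print the returned string next to the numeric value.
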
mmap() NX-GZIP device
=====================

The mmap() system call for an NX-GZIP device fd returns a paste_address
that the application can use to copy/paste its CRB to the hardware engines.

	::

		paste_addr = mmap(addr, size, prot, flags, fd, offset);

	The only restrictions on mmap for an NX-GZIP device fd are:

		* size should be PAGE_SIZE
		* offset parameter should be 0ULL

	Refer to the mmap(2) man page for additional details/restrictions.
	In addition to the error conditions listed on the mmap(2) man
	page, mmap() can also fail with one of the following error codes:

		======	=============================================
		EINVAL	fd is not associated with an open window
			(i.e. mmap() does not follow a successful call
			to the VAS_TX_WIN_OPEN ioctl).
		EINVAL	offset field is not 0ULL.
		======	=============================================

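The two restrictions can be captured in a small wrapper. This sketch assumes
that ``sysconf(_SC_PAGESIZE)`` reports the same page size the kernel means by
PAGE_SIZE, and that the protection and sharing flags used in the example at
the end of this document are appropriate. ``map_paste_address`` is a
hypothetical helper name.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one page at offset 0 of `fd`. Returns the mapping, or NULL on
 * failure with errno left as set by mmap().
 */
static void *map_paste_address(int fd)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *addr;

	if (page_size <= 0)
		return NULL;
	addr = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, 0);
	return addr == MAP_FAILED ? NULL : addr;
}
```

Returning NULL instead of MAP_FAILED is a design choice of this sketch so
that callers can use a plain pointer check.
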
Discovery of available VAS engines
==================================

Each available VAS instance in the system has a device tree node
like /proc/device-tree/vas@* or /proc/device-tree/xscom@*/vas@*.
Determine the chip or VAS instance and use the corresponding ibm,vas-id
property value in this node to select a specific VAS instance.

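Reading the property from userspace might look like the sketch below. It
assumes that ibm,vas-id is a single 32-bit cell stored big-endian, as device
tree integer properties normally are. ``read_vas_id`` is a hypothetical
helper, and the caller supplies the node directory (one of the paths matched
by the patterns above).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Read <node_dir>/ibm,vas-id as one big-endian 32-bit cell.
 * Returns 0 and stores the id in *vas_id, or -1 on failure.
 */
static int read_vas_id(const char *node_dir, uint32_t *vas_id)
{
	char path[512];
	unsigned char cell[4];
	size_t n;
	FILE *f;
	int len;

	len = snprintf(path, sizeof(path), "%s/ibm,vas-id", node_dir);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;
	f = fopen(path, "rb");
	if (!f)
		return -1;
	n = fread(cell, 1, sizeof(cell), f);
	fclose(f);
	if (n != sizeof(cell))
		return -1;
	/* Assemble the value byte by byte so host endianness is irrelevant. */
	*vas_id = ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
		  ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
	return 0;
}
```

The value read this way would then be stored in the vas_id field passed to
the VAS_TX_WIN_OPEN ioctl.
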
Copy/Paste operations
=====================

Applications should use the copy and paste instructions to send a CRB to NX.
Refer to section 4.4 of the Power ISA for the Copy/Paste instructions:
https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0

CRB Specification and use of NX
===============================

Applications should format requests to the co-processor using the
Co-processor Request Block (CRB). Refer to the NX-GZIP user's manual for
the format of the CRB and for how to use NX from userspace, such as sending
requests and checking request status.

NX Fault handling
=================

Applications send requests to NX and wait for the status by polling on
the Co-processor Status Block (CSB) flags. NX updates the status in the CSB
after each request is processed. Refer to the NX-GZIP user's manual for the
format of the CSB and the status flags.

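Polling with a timeout could be sketched as follows. The sketch assumes the
CSB flags field is a single byte and takes the "valid" bit mask as a
parameter, because the actual bit layout is defined in the NX-GZIP user's
manual rather than in this document. ``poll_csb_flags`` is a hypothetical
helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/*
 * Busy-poll *flags until any bit in valid_mask is set or timeout_ms
 * milliseconds have elapsed. Returns 0 when the bit was seen and -1
 * on timeout.
 */
static int poll_csb_flags(const volatile uint8_t *flags, uint8_t valid_mask,
			  long timeout_ms)
{
	struct timespec start, now;
	long elapsed_ms;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (;;) {
		if (*flags & valid_mask)
			return 0;
		clock_gettime(CLOCK_MONOTONIC, &now);
		elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L +
			     (now.tv_nsec - start.tv_nsec) / 1000000L;
		if (elapsed_ms >= timeout_ms)
			return -1;
	}
}
```

The volatile qualifier keeps the compiler from caching the flags byte, which
is updated by NX or the OS rather than by the polling thread.
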
If NX encounters a translation error (called an NX page fault) on the CSB
address or any request buffer, it raises an interrupt on the CPU to handle
the fault. A page fault can happen if an application passes invalid
addresses or if request buffers are not in memory. The operating system
handles the fault by updating the CSB with the following data::

	csb.flags = CSB_V;
	csb.cc = CSB_CC_FAULT_ADDRESS;
	csb.ce = CSB_CE_TERMINATION;
	csb.address = fault_address;

When an application receives a translation error, it can touch or access
the page that contains the fault address so that this page will be in
memory. Then the application can resend this request to NX.

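Touching pages before a resend might be done with a helper like the one
below. It is a sketch only: ``touch_pages`` is a hypothetical name, the
``for_write`` parameter is an assumption for buffers that NX is expected to
write, and touching a page does not guarantee that it stays resident.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Access one byte in every page overlapping [buf, buf + len). When
 * for_write is non-zero the byte is written back unchanged so that the
 * page is mapped writable. Returns the number of pages touched.
 */
static size_t touch_pages(void *buf, size_t len, int for_write)
{
	uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
	uintptr_t first = (uintptr_t)buf;
	uintptr_t end = first + len;
	size_t touched = 0;

	if (len == 0)
		return 0;
	for (uintptr_t p = first & ~(page - 1); p < end; p += page) {
		/* Stay inside the buffer on the first, unaligned page. */
		volatile unsigned char *byte =
			(volatile unsigned char *)(p < first ? first : p);
		unsigned char value = *byte;

		if (for_write)
			*byte = value;
		touched++;
	}
	return touched;
}
```

An application could call this on the page containing csb.address, or on
the whole request buffer, before pasting the CRB again.
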
If the OS cannot update the CSB due to an invalid CSB address, it sends a
SEGV signal to the process that opened the send window on which the
original request was issued. This signal is delivered with the following
siginfo struct::

	siginfo.si_signo = SIGSEGV;
	siginfo.si_errno = EFAULT;
	siginfo.si_code = SEGV_MAPERR;
	siginfo.si_addr = CSB address;

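A signal handler can use these fields to tell an NX CSB fault from an
ordinary segmentation fault. The predicate below is a sketch.
``is_nx_csb_fault`` is a hypothetical name, and it assumes the application
remembers the CSB address it placed in the request.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/*
 * Return 1 if `info` carries the siginfo values listed above for a
 * request whose CSB address was `csb`, otherwise 0.
 */
static int is_nx_csb_fault(const siginfo_t *info, const void *csb)
{
	return info->si_signo == SIGSEGV &&
	       info->si_errno == EFAULT &&
	       info->si_code == SEGV_MAPERR &&
	       info->si_addr == csb;
}
```

It is meant to be called from a handler installed with sigaction() and the
SA_SIGINFO flag, which is what makes the siginfo_t argument available.
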
In the case of multi-threaded applications, NX send windows can be shared
across all threads. For example, a child thread can open a send window,
but other threads can send requests to NX using this window. These
requests will be successful even in the case of the OS handling faults, as
long as the CSB address is valid. If the NX request contains an invalid CSB
address, the signal will be sent to the child thread that opened the
window. But if that thread has exited without closing the window and a
request is issued using this window, the signal will be issued to the
thread group leader (tgid). It is up to the application whether to ignore
or handle these signals.

NX-GZIP User's Manual:
https://github.com/libnxz/power-gzip/blob/master/power_nx_gzip_um.pdf

Simple example
==============

	::

		#include <errno.h>
		#include <fcntl.h>
		#include <stdbool.h>
		#include <stdio.h>
		#include <string.h>
		#include <unistd.h>
		#include <sys/ioctl.h>
		#include <sys/mman.h>
		#include <asm/vas-api.h>

		int use_nx_gzip()
		{
			int rc, fd;
			void *addr;
			struct vas_tx_win_open_attr txattr;

			fd = open("/dev/crypto/nx-gzip", O_RDWR);
			if (fd < 0) {
				fprintf(stderr, "open nx-gzip failed\n");
				return -1;
			}
			/* Zero flags and reserved fields, then set the rest. */
			memset(&txattr, 0, sizeof(txattr));
			txattr.version = 1;
			txattr.vas_id = -1;
			rc = ioctl(fd, VAS_TX_WIN_OPEN,
					(unsigned long)&txattr);
			if (rc < 0) {
				fprintf(stderr, "ioctl() failed, rc %d, errno %d\n",
						rc, errno);
				return rc;
			}
			addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
					MAP_SHARED, fd, 0ULL);
			if (addr == MAP_FAILED) {
				fprintf(stderr, "mmap() failed, errno %d\n",
						errno);
				return -errno;
			}
			do {
				/*
				 * Format the CRB (crb, declaration not shown)
				 * for compression or decompression.
				 * Refer to the tests for vas_copy/vas_paste.
				 */
				vas_copy(&crb, 0, 1);
				vas_paste(addr, 0, 1);
				/*
				 * Poll on csb.flags with a timeout; the csb
				 * address is listed in the CRB. Leave the
				 * loop once all requests are done.
				 */
			} while (true);
			/* The window is also closed upon process exit. */
			close(fd);
			return 0;
		}

	Refer to https://github.com/abalib/power-gzip for tests and more
	use cases.