=================================================
Linux API for read access to z/VM Monitor Records
=================================================

Date  : 2004-Nov-26

Author: Gerald Schaefer (geraldsc@de.ibm.com)




Description
===========
This item delivers a new Linux API in the form of a misc char device that is
usable from user space and allows read access to the z/VM Monitor Records
collected by the `*MONITOR` System Service of z/VM.


User Requirements
=================
The z/VM guest on which you want to access this API must be configured to
allow IUCV connections to the `*MONITOR` service, i.e. it needs the
IUCV `*MONITOR` statement in its user entry. If the monitor DCSS to be used is
restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
This item uses the IUCV device driver to access the z/VM services, so you
need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.

There are two options for being able to load the monitor DCSS (the examples
assume that the monitor DCSS begins at 144 MB and ends at 152 MB). You can
query the location of the monitor DCSS with the Class E privileged CP command
Q NSS MAP (the values BEGPAG and ENDPAG are given in units of 4K pages).
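
As a rough illustration of the page arithmetic, the following Python sketch
converts a BEGPAG/ENDPAG pair into byte addresses. It assumes that the values
are hexadecimal page numbers, which is how CP usually prints them; the sample
values and the helper name ``dcss_range`` are made up for this example and
simply match the 144 MB to 152 MB range assumed in this document.

```python
PAGE_SIZE = 4096  # BEGPAG and ENDPAG count 4K pages
MB = 1024 * 1024


def dcss_range(begpag, endpag):
    """Return (first_byte, last_byte) of a segment from its page numbers.

    Assumption: both arguments are hexadecimal strings.
    """
    first = int(begpag, 16) * PAGE_SIZE
    # ENDPAG names the last page, so the segment ends at that page's last byte.
    last = (int(endpag, 16) + 1) * PAGE_SIZE - 1
    return first, last


if __name__ == "__main__":
    first, last = dcss_range("09000", "097FF")  # hypothetical sample values
    print(first // MB, (last + 1) // MB)        # start and end in MB
```

The two numbers obtained this way are the addresses that the storage
definition or the "mem=" value of the two options has to take into account.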

See also "CP Command and Utility Reference" (SC24-6081-00) for more information
on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning
and Administration" (SC24-6116-00) for more information on DCSSes.

1st option:
-----------
You can use the CP command DEF STOR CONFIG to define a "memory hole" in your
guest virtual storage around the address range of the DCSS.

Example::

	DEF STOR CONFIG 0.140M 200M.200M

This defines two blocks of storage: the first is 140MB in size and begins at
address 0MB, the second is 200MB in size and begins at address 200MB,
resulting in a total storage of 340MB. Note that the first block should
always start at 0 and be at least 64MB in size.
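
To double-check such a definition before using it, the block list can be
parsed and compared against the DCSS range. The Python sketch below handles
only the ``<start>.<size>`` form shown in the example above; the real CP
syntax may accept more variants, and the helper names are hypothetical.

```python
import re

UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
MB = 1 << 20


def parse_size(text):
    """Convert a value such as '140M' or '0' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if match is None:
        raise ValueError("unrecognised size: %r" % text)
    return int(match.group(1)) * UNITS[match.group(2)]


def parse_config(spec):
    """Parse '<start>.<size> ...' into a list of (start, end) byte ranges.

    The end of each range is exclusive.
    """
    blocks = []
    for item in spec.split():
        start, size = (parse_size(part) for part in item.split("."))
        blocks.append((start, start + size))
    return blocks


def overlaps(blocks, dcss_start, dcss_end):
    """Return True if any storage block intersects [dcss_start, dcss_end)."""
    return any(start < dcss_end and dcss_start < end for start, end in blocks)


if __name__ == "__main__":
    blocks = parse_config("0.140M 200M.200M")
    total = sum(end - start for start, end in blocks)
    print(total // MB, overlaps(blocks, 144 * MB, 152 * MB))
```

For the example definition this reports a total of 340MB and no overlap with
a DCSS located between 144 MB and 152 MB.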

2nd option:
-----------
Your guest virtual storage has to end below the starting address of the DCSS
and you have to specify the "mem=" kernel parameter in your parmfile with a
value greater than the ending address of the DCSS.

Example::

	DEF STOR 140M

This defines a storage size of 140MB for your guest; in addition, the
parameter "mem=160M" is added to the parmfile.


User Interface
==============
The char device is implemented as a kernel module named "monreader",
which can be loaded via the modprobe command, or it can be compiled into the
kernel instead. There is one optional module (or kernel) parameter, "mondcss",
to specify the name of the monitor DCSS. If the module is compiled into the
kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
in the parmfile.

The default name for the DCSS is "MONDCSS" if none is specified. If other
users are already connected to the `*MONITOR` service (e.g.
Performance Toolkit), the monitor DCSS is already defined and you have to use
the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name
of the monitor DCSS, if already defined, and the users connected to the
`*MONITOR` service.
Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
DCSS if your z/VM doesn't have one already. You need Class E privileges to
define and save a DCSS.

Example:
--------

::

	modprobe monreader mondcss=MYDCSS

This loads the module and sets the DCSS name to "MYDCSS".

NOTE:
-----
This API provides no interface to control the `*MONITOR` service, e.g. to
specify which data should be collected. This can be done with the CP command
MONITOR (Class E privileged), see "CP Command and Utility Reference".

Device nodes with udev:
-----------------------
After loading the module, a char device will be created along with the device
node /<udev directory>/monreader.

Device nodes without udev:
--------------------------
If your distribution does not support udev, a device node will not be created
automatically and you have to create it manually after loading the module.
To do so, you need to know the major and minor numbers of the device. These
numbers can be found in /sys/class/misc/monreader/dev.

Typing cat /sys/class/misc/monreader/dev will give an output of the form
<major>:<minor>. The device node can be created via the mknod command: enter
mknod <name> c <major> <minor>, where <name> is the name of the device node
to be created.

Example:
--------

::

	# modprobe monreader
	# cat /sys/class/misc/monreader/dev
	10:63
	# mknod /dev/monreader c 10 63

This loads the module with the default monitor DCSS (MONDCSS) and creates a
device node.
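
The same steps can also be scripted. The following Python sketch reads the
``<major>:<minor>`` pair from sysfs and creates the node with ``os.mknod``.
It is only an illustration: the function names and the 0600 permission mode
are assumptions made for this example, and creating a device node requires
root privileges.

```python
import os
import stat


def parse_dev(text):
    """Split the '<major>:<minor>' string from sysfs into two integers."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def create_node(name, sysfs_dev="/sys/class/misc/monreader/dev", mode=0o600):
    """Create a char device node for monreader (must be run as root)."""
    with open(sysfs_dev) as dev_file:
        major, minor = parse_dev(dev_file.read())
    # S_IFCHR marks the new node as a character device.
    os.mknod(name, stat.S_IFCHR | mode, os.makedev(major, minor))
```

Calling ``create_node("/dev/monreader")`` after loading the module would then
correspond to the mknod command shown above.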

File operations:
----------------
The following file operations are supported: open, release, read, poll.
There are two alternative methods for reading: either non-blocking read in
conjunction with polling, or blocking read without polling. IOCTLs are not
supported.
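
A minimal Python sketch of the first method (non-blocking read combined with
polling) could look as follows. The device path /dev/monreader, the helper
names, the timeout and the buffer size are assumptions made for this example.

```python
import os
import select


def open_nonblocking(path="/dev/monreader"):
    """Open the device for non-blocking reads and return the descriptor."""
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def poll_read(fd, timeout_ms=1000, bufsize=4096):
    """Wait until fd becomes readable, then perform a single read.

    Returns the bytes read (b"" corresponds to a 0 byte read) or None
    if nothing became readable within timeout_ms milliseconds.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return None
    return os.read(fd, bufsize)
```

For the second method, the device is opened without ``O_NONBLOCK`` and
``os.read`` is called directly; it then blocks until data is available.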

Read:
-----
Reading from the device provides a 12-byte monitor control element (MCE),
followed by a set of one or more contiguous monitor records (similar to the
output of the CMS utility MONWRITE without the 4K control blocks). The MCE
contains information on the type of the following record set (sample/event
data), the monitor domains contained within it and the start and end address
of the record set in the monitor DCSS. The start and end address can be used
to determine the size of the record set; the end address is the address of the
last byte of data. The start address is needed to handle "end-of-frame" records
correctly (domain 1, record 13), i.e. it can be used to determine the record
start offset relative to a 4K page (frame) boundary.
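
The size and offset calculations described above can be sketched in Python as
follows. The field layout used here is an assumption for illustration only:
four bytes of type and domain information followed by two big-endian 32-bit
addresses. Consult the "z/VM Performance" document for the authoritative MCE
layout before relying on it.

```python
import struct

MCE_SIZE = 12       # size of the monitor control element in bytes
FRAME_SIZE = 4096   # 4K page (frame)


def parse_mce(mce):
    """Extract the record set addresses from a 12-byte MCE.

    Assumed layout: 4 bytes of type/domain information, then the start
    and end address as big-endian 32-bit values.
    """
    if len(mce) != MCE_SIZE:
        raise ValueError("an MCE is exactly %d bytes long" % MCE_SIZE)
    info, start, end = struct.unpack(">4sII", mce)
    return {
        "info": info,
        "start": start,
        "end": end,
        # The end address points at the last byte, hence the +1.
        "size": end - start + 1,
        # Offset of the first record relative to its 4K frame boundary.
        "frame_offset": start % FRAME_SIZE,
    }
```

The ``size`` value tells the application how many bytes of monitor records
follow the MCE in the data stream.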

See "Appendix A: `*MONITOR`" in the "z/VM Performance" document for a description
of the monitor control element layout. The layout of the monitor records can
be found here (z/VM 5.1): https://www.vm.ibm.com/pubs/mon510/index.html

The layout of the data stream provided by the monreader device is as follows::

	...
	<0 byte read>
	<first MCE>              \
	<first set of records>    |
	...                       |- data set
	<last MCE>                |
	<last set of records>    /
	<0 byte read>
	...

There may be more than one combination of MCE and corresponding record set
within one data set, and the end of each data set is indicated by a successful
read with a return value of 0 (0 byte read).
Any received data must be considered invalid until a complete set has been
read successfully, including the closing 0 byte read. Therefore you should
always read the complete set into a buffer before processing the data.
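
The buffering rule can be sketched in Python like this. The function name and
the chunk size are assumptions for this example; the descriptor is expected
to be opened for blocking reads.

```python
import os


def read_data_set(fd, chunk=4096):
    """Collect one complete data set and return it as a bytes object.

    Data is gathered until the closing 0 byte read. If a read fails,
    os.read raises OSError and the partial data is dropped with the
    local list, so the caller never sees an incomplete set.
    """
    parts = []
    while True:
        data = os.read(fd, chunk)
        if not data:
            # A 0 byte read marks the end of the data set.
            return b"".join(parts)
        parts.append(data)
```

Only the value returned by ``read_data_set`` should be handed on for
processing, since it is known to be a complete set at that point.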

The maximum size of a data set can be as large as the size of the
monitor DCSS, so size the buffer accordingly or use dynamic memory allocation.
The size of the monitor DCSS will be printed into syslog after loading the
module. You can also use the (Class E privileged) CP command Q NSS MAP to
list all available segments and information about them.

As with most char devices, error conditions are indicated by returning a
negative value for the number of bytes read. In this case, the errno variable
indicates the error condition:

EIO:
	reply failed, read data is invalid and the application
	should discard the data read since the last successful read with 0 size.
EFAULT:
	copy_to_user failed, read data is invalid and the application should
	discard the data read since the last successful read with 0 size.
EAGAIN:
	occurs on a non-blocking read if there is no data available at the
	moment. There is no data missing or corrupted, just try again or rather
	use polling for non-blocking reads.
EOVERFLOW:
	message limit reached, the data read since the last successful
	read with 0 size is valid but subsequent records may be missing.

In the last case (EOVERFLOW) there may be missing data, in the first two cases
(EIO, EFAULT) there will be missing data. It's up to the application whether
to continue reading subsequent data or to exit.
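
One possible way to turn these rules into code is sketched below in Python,
where a failed read surfaces as an ``OSError`` carrying the errno value. The
action names are hypothetical and only serve this illustration.

```python
import errno

DISCARD = "discard"              # drop data since the last 0 byte read
RETRY = "retry"                  # nothing lost, read again or poll
KEEP_WITH_GAP = "keep-with-gap"  # data is valid, later records may be missing


def classify_read_error(err):
    """Map the errno of a failed read to the action described above."""
    if err.errno in (errno.EIO, errno.EFAULT):
        return DISCARD
    if err.errno == errno.EAGAIN:
        return RETRY
    if err.errno == errno.EOVERFLOW:
        return KEEP_WITH_GAP
    # Any other error is not covered by this document; pass it on.
    raise err
```

A reader loop would call ``classify_read_error`` in its ``except OSError``
branch and then decide whether to continue with the next data set or to exit.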

Open:
-----
Only one user is allowed to open the char device. If it is already in use, the
open function will fail (return a negative value) and set errno to EBUSY.
The open function may also fail if an IUCV connection to the `*MONITOR` service
cannot be established. In this case errno will be set to EIO and an error
message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
codes are described in the "z/VM Performance" book, Appendix A.
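
The two documented failure cases of open can be told apart as in the
following Python sketch. The device path /dev/monreader and the wording of
the messages are assumptions made for this example.

```python
import errno
import os


def explain_open_error(err):
    """Return a short explanation for a failed open of the device."""
    if err.errno == errno.EBUSY:
        return "device is already opened by another user"
    if err.errno == errno.EIO:
        return ("IUCV connection to *MONITOR failed, "
                "see syslog for the IPUSER SEVER code")
    # Errors not specific to monreader, e.g. a missing device node.
    return os.strerror(err.errno)


def open_monreader(path="/dev/monreader"):
    """Open the device for blocking reads and return the descriptor."""
    try:
        return os.open(path, os.O_RDONLY)
    except OSError as err:
        raise RuntimeError(explain_open_error(err)) from err
```

Since only one user may open the device, a long-running reader should keep
the returned descriptor open instead of reopening it for every data set.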

NOTE:
-----
As soon as the device is opened, incoming messages will be accepted and they
will count towards the message limit, i.e. opening the device without reading
from it will eventually provoke the "message limit reached" error (EOVERFLOW
error code).