xref: /OK3568_Linux_fs/external/xserver/doc/smartsched (revision 4882a59341e53eb6f0b4789bf948001014eff981)
1*4882a593Smuzhiyun			Client Scheduling in X
2*4882a593Smuzhiyun			    Keith Packard
3*4882a593Smuzhiyun			       SuSE
4*4882a593Smuzhiyun			     10/28/99
5*4882a593Smuzhiyun
6*4882a593SmuzhiyunHistory:
7*4882a593Smuzhiyun
8*4882a593SmuzhiyunSince the original X server was written at Digital in 1987, the OS and DIX
9*4882a593Smuzhiyunlayers shared responsibility for scheduling the order to service
10*4882a593Smuzhiyunclient requests.  The original design was simplistic; under the maximum
11*4882a593Smuzhiyunfirst make it work, then make it work well, this was a good idea.  Now
12*4882a593Smuzhiyunthat we have a bit more experience with X applications, it's time to
13*4882a593Smuzhiyunrethink the design.
14*4882a593Smuzhiyun
15*4882a593SmuzhiyunThe basic dispatch loop in DIX looks like:
16*4882a593Smuzhiyun
17*4882a593Smuzhiyun	for (;;)
18*4882a593Smuzhiyun	{
19*4882a593Smuzhiyun		nready = WaitForSomething (...);
20*4882a593Smuzhiyun		while (nready--)
21*4882a593Smuzhiyun		{
22*4882a593Smuzhiyun			isItTimeToYield = FALSE;
23*4882a593Smuzhiyun			while (!isItTimeToYield)
24*4882a593Smuzhiyun			{
25*4882a593Smuzhiyun				if (!ReadRequestFromClient (...))
26*4882a593Smuzhiyun					break;
27*4882a593Smuzhiyun				(execute request);
28*4882a593Smuzhiyun			}
29*4882a593Smuzhiyun		}
30*4882a593Smuzhiyun	}
31*4882a593Smuzhiyun
32*4882a593SmuzhiyunWaitForSomething looks like:
33*4882a593Smuzhiyun
34*4882a593Smuzhiyun	for (;;)
35*4882a593Smuzhiyun		if (ANYSET (ClientsWithInput))
36*4882a593Smuzhiyun			return popcount (ClientsWithInput);
37*4882a593Smuzhiyun		select (...)
38*4882a593Smuzhiyun		compute clientsReadable from select result;
39*4882a593Smuzhiyun		return popcount (clientsReadable)
40*4882a593Smuzhiyun	}
41*4882a593Smuzhiyun
42*4882a593SmuzhiyunReadRequestFromClient looks like:
43*4882a593Smuzhiyun
44*4882a593Smuzhiyun	if (!fullRequestQueued)
45*4882a593Smuzhiyun	{
46*4882a593Smuzhiyun		read ();
47*4882a593Smuzhiyun		if (!fullRequestQueued)
48*4882a593Smuzhiyun		{
49*4882a593Smuzhiyun			remove from ClientsWithInput;
50*4882a593Smuzhiyun			timesThisConnection = 0;
51*4882a593Smuzhiyun			return 0;
52*4882a593Smuzhiyun		}
53*4882a593Smuzhiyun	}
54*4882a593Smuzhiyun	if (twoFullRequestsQueued)
55*4882a593Smuzhiyun		add to ClientsWithInput;
56*4882a593Smuzhiyun
57*4882a593Smuzhiyun	if (++timesThisConnection >= 10)
58*4882a593Smuzhiyun	{
59*4882a593Smuzhiyun		isItTimeToYield = TRUE;
60*4882a593Smuzhiyun		timesThisConnection = 0;
61*4882a593Smuzhiyun	}
62*4882a593Smuzhiyun	return 1;
63*4882a593Smuzhiyun
64*4882a593SmuzhiyunHere's what happens in this code:
65*4882a593Smuzhiyun
66*4882a593SmuzhiyunWith a single client executing a stream of requests:
67*4882a593Smuzhiyun
68*4882a593Smuzhiyun	A client sends a packet of requests to the server.
69*4882a593Smuzhiyun
70*4882a593Smuzhiyun	WaitForSomething wakes up from select and returns that client
71*4882a593Smuzhiyun	to Dispatch
72*4882a593Smuzhiyun
73*4882a593Smuzhiyun	Dispatch calls ReadRequestFromClient which reads a buffer (4K)
74*4882a593Smuzhiyun	full of requests from the client
75*4882a593Smuzhiyun
76*4882a593Smuzhiyun	The server executes requests from this buffer until it emptys,
77*4882a593Smuzhiyun	in two stages -- 10 requests at a time are executed in the
78*4882a593Smuzhiyun	inner Dispatch loop, a buffer full of requests are executed
79*4882a593Smuzhiyun	because WaitForSomething immediately returns if any clients
80*4882a593Smuzhiyun	have complete requests pending in their input queues.
81*4882a593Smuzhiyun
82*4882a593Smuzhiyun	When the buffer finally emptys, the next call to ReadRequest
83*4882a593Smuzhiyun	FromClient will return zero and Dispatch will go back to
84*4882a593Smuzhiyun	WaitForSomething; now that the client has no requests pending,
85*4882a593Smuzhiyun	WaitForSomething will block in select again.  If the client
86*4882a593Smuzhiyun	is active, this select will immediately return that client
87*4882a593Smuzhiyun	as ready to read.
88*4882a593Smuzhiyun
89*4882a593SmuzhiyunWith multiple clients sending streams of requests, the sequence
90*4882a593Smuzhiyunof operations is similar, except that ReadRequestFromClient will
91*4882a593Smuzhiyunset isItTimeToYield after each 10 requests executed causing the
92*4882a593Smuzhiyunserver to round-robin among the clients with available requests.
93*4882a593Smuzhiyun
94*4882a593SmuzhiyunIt's important to realize here that any complete requests which have been
95*4882a593Smuzhiyunread from clients will be executed before the server will use select again
96*4882a593Smuzhiyunto discover input from other clients.  A single busy client can easily
97*4882a593Smuzhiyunmonopolize the X server.
98*4882a593Smuzhiyun
99*4882a593SmuzhiyunSo, the X server doesn't share well with clients which are more interactive
100*4882a593Smuzhiyunin nature.
101*4882a593Smuzhiyun
102*4882a593SmuzhiyunThe X server executes at most a buffer full of requests before again heading
103*4882a593Smuzhiyuninto select; ReadRequestFromClient causes the server to yield when the
104*4882a593Smuzhiyunclient request buffer doesn't contain a complete request.  When
105*4882a593Smuzhiyunthat buffer is executed quickly, the server spends a lot of time
106*4882a593Smuzhiyunin select discovering that the same client again has input ready.  Thus
107*4882a593Smuzhiyunthe server also runs busy clients less efficiently than is would be
108*4882a593Smuzhiyunpossible.
109*4882a593Smuzhiyun
110*4882a593SmuzhiyunWhat to do.
111*4882a593Smuzhiyun
112*4882a593SmuzhiyunThere are several things evident from the above discussion:
113*4882a593Smuzhiyun
114*4882a593Smuzhiyun 1	The server has a poor metric for deciding how much work it
115*4882a593Smuzhiyun	should do at one time on behalf of a particular client.
116*4882a593Smuzhiyun
117*4882a593Smuzhiyun 2	The server doesn't call select often enough to detect less
118*4882a593Smuzhiyun 	aggressive clients in the face of busy clients, especially
119*4882a593Smuzhiyun	when those clients are executing slow requests.
120*4882a593Smuzhiyun
121*4882a593Smuzhiyun 3	The server calls select too often when executing fast requests.
122*4882a593Smuzhiyun
123*4882a593Smuzhiyun 4	Some priority scheme is needed to keep interactive clients
124*4882a593Smuzhiyun 	responding to the user.
125*4882a593Smuzhiyun
126*4882a593SmuzhiyunAnd, there are some assumptions about how X applications work:
127*4882a593Smuzhiyun
128*4882a593Smuzhiyun 1	Each X request is executed relatively quickly; a request-granularity
129*4882a593Smuzhiyun 	is good enough for interactive response almost all of the time.
130*4882a593Smuzhiyun
131*4882a593Smuzhiyun 2	X applications receiving mouse/keyboard events are likely to
132*4882a593Smuzhiyun 	warrant additional attention from the X server.
133*4882a593Smuzhiyun
134*4882a593SmuzhiyunInstead of a request-count metric for work, a time-based metric should be
135*4882a593Smuzhiyunused.  The server should select a reasonable time slice for each client
136*4882a593Smuzhiyunand execute requests for the entire timeslice before yielding to
137*4882a593Smuzhiyunanother client.
138*4882a593Smuzhiyun
139*4882a593SmuzhiyunInstead of returning immediately from WaitForSomething if clients have
140*4882a593Smuzhiyuncomplete requests queued, the server should go through select each
141*4882a593Smuzhiyuntime and gather as many ready clients as possible.  This involves
142*4882a593Smuzhiyunpolling instead of blocking and adding the ClientsWithInput to
143*4882a593SmuzhiyunclientsReadable after the select returns.
144*4882a593Smuzhiyun
145*4882a593SmuzhiyunInstead of yielding when the request buffer is empty for a particular
146*4882a593Smuzhiyunclient, leave the yielding to the upper level scheduling and allow
147*4882a593Smuzhiyunthe server to try and read again from the socket.  If the client
148*4882a593Smuzhiyunis busy, another buffer full of requests will already be waiting
149*4882a593Smuzhiyunto be delivered thus avoiding the call through select and the
150*4882a593Smuzhiyunadditional overhead in WaitForSomething.
151*4882a593Smuzhiyun
152*4882a593SmuzhiyunFinally, the dispatch loop should not simply execute requests from the
153*4882a593Smuzhiyunfirst available client, instead each client should be prioritized with
154*4882a593Smuzhiyunbusy clients penalized and clients receiving user events praised.
155*4882a593Smuzhiyun
156*4882a593SmuzhiyunHow it's done:
157*4882a593Smuzhiyun
158*4882a593SmuzhiyunPolling the current time of day from the OS is too expensive to
159*4882a593Smuzhiyunbe done at each request boundary, so instead an interval timer is
160*4882a593Smuzhiyunset allowing the server to track time changes by counting invocations
161*4882a593Smuzhiyunof the related signal handler.  Instead of using the wall time for
162*4882a593Smuzhiyunthis purpose, the process CPU time is used instead.  This serves
163*4882a593Smuzhiyuntwo purposes -- first, it allows the server to consume no CPU cycles
164*4882a593Smuzhiyunwhen idle, second it avoids conflicts with SIGALRM usage in other
165*4882a593Smuzhiyunparts of the server code.  It's not without problems though; other
166*4882a593SmuzhiyunCPU intensive processes on the same machine can reduce interactive
167*4882a593Smuzhiyunresponse time within the X server.  The dispatch loop can now
168*4882a593Smuzhiyuncalculate an approximate time value using the number of signals
169*4882a593Smuzhiyunreceived.  The granularity of the timer sets the scheduling jitter,
170*4882a593Smuzhiyunat 20ms it's only occasionally noticeable.
171*4882a593Smuzhiyun
172*4882a593SmuzhiyunThe changes to WaitForSomething and ReadRequestFromClient are
173*4882a593Smuzhiyunstraightforward, adjusting when select is called and avoiding
174*4882a593Smuzhiyunsetting isItTimeToYield too often.
175*4882a593Smuzhiyun
176*4882a593SmuzhiyunThe dispatch loop changes are more extensive, now instead of
177*4882a593Smuzhiyunexecuting requests from all available clients, a single client
178*4882a593Smuzhiyunis chosen after each call to WaitForSomething, requests are
179*4882a593Smuzhiyunexecuted for that client and WaitForSomething is called again.
180*4882a593Smuzhiyun
181*4882a593SmuzhiyunEach client is assigned a priority, the dispatch loop chooses the
182*4882a593Smuzhiyunclient with the highest priority to execute.  Priorities are
183*4882a593Smuzhiyunupdated in three ways:
184*4882a593Smuzhiyun
185*4882a593Smuzhiyun 1.	Clients which consume their entire slice are penalized
186*4882a593Smuzhiyun 	by having their priority reduced by one until they
187*4882a593Smuzhiyun	reach some minimum value.
188*4882a593Smuzhiyun
189*4882a593Smuzhiyun 2.	Clients which have executed no requests for some time
190*4882a593Smuzhiyun 	are praised by having their priority raised until they
191*4882a593Smuzhiyun	return to normal priority.
192*4882a593Smuzhiyun
193*4882a593Smuzhiyun 3.	Clients which receive user input are praised by having
194*4882a593Smuzhiyun 	their priority rased until they reach some maximal
195*4882a593Smuzhiyun	value, above normal priority.
196*4882a593Smuzhiyun
197*4882a593SmuzhiyunThe effect of these changes is to both improve interactive application
198*4882a593Smuzhiyunresponse and benchmark numbers at the same time.
199