<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
 "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
 <!ENTITY % defs SYSTEM "/xserver/doc/xml/xserver.ent"> %defs;
]>
6
7<article>
8
9  <articleinfo>
10    <!-- Title information -->
11    <title>Distributed Multihead X Design</title>
12    <authorgroup>
13      <author><firstname>Kevin E.</firstname><surname>Martin</surname></author>
14      <author><firstname>David H.</firstname><surname>Dawes</surname></author>
15      <author><firstname>Rickard E.</firstname><surname>Faith</surname></author>
16    </authorgroup>
17    <pubdate>29 June 2004 (created 25 July 2001)</pubdate>
18    <releaseinfo>X Server Version &xserver.version;</releaseinfo>
19    <abstract><para>
20        This document covers the motivation, background, design, and
21        implementation of the distributed multihead X (DMX) system.  It
22        is a living document and describes the current design and
23        implementation details of the DMX system.  As the project
24        progresses, this document will be continually updated to reflect
25        the changes in the code and/or design.  <emphasis remap="it">Copyright 2001 by VA
26        Linux Systems, Inc., Fremont, California.  Copyright 2001-2004
27        by Red Hat, Inc., Raleigh, North Carolina</emphasis>
28      </para></abstract>
29  </articleinfo>
30
31<!-- Begin the document -->
32<sect1>
33<title>Introduction</title>
34
35<sect2>
36<title>The Distributed Multihead X Server</title>
37
38<para>Current Open Source multihead solutions are limited to a single
39physical machine.  A single X server controls multiple display devices,
40which can be arranged as independent heads or unified into a single
41desktop (with Xinerama).  These solutions are limited to the number of
42physical devices that can co-exist in a single machine (e.g., due to the
43number of AGP/PCI slots available for graphics cards).  Thus, large
44tiled displays are not currently possible.  The work described in this
45paper will eliminate the requirement that the display devices reside in
46the same physical machine.  This will be accomplished by developing a
47front-end proxy X server that will control multiple back-end X servers
48that make up the large display.
49</para>
50
51<para>The overall structure of the distributed multihead X (DMX) project is
52as follows: A single front-end X server will act as a proxy to a set of
53back-end X servers, which handle all of the visible rendering.  X
54clients will connect to the front-end server just as they normally would
55to a regular X server.  The front-end server will present an abstracted
56view to the client of a single large display.  This will ensure that all
57standard X clients will continue to operate without modification
58(limited, as always, by the visuals and extensions provided by the X
59server).  Clients that are DMX-aware will be able to use an extension to
60obtain information about the back-end servers (e.g., for placement of
61pop-up windows, window alignments by the window manager, etc.).
62</para>
63
64<para>The architecture of the DMX server is divided into two main sections:
65input (e.g., mouse and keyboard events) and output (e.g., rendering and
windowing requests).  Each of these is described briefly below, and the
rest of this design document will describe them in greater detail.
68</para>
69
70<para>The DMX server can receive input from three general types of input
71devices: "local" devices that are physically attached to the machine on
72which DMX is running, "backend" devices that are physically attached to
73one or more of the back-end X servers (and that generate events via the
74X protocol stream from the backend), and "console" devices that can be
75abstracted from any non-back-end X server.  Backend and console devices
76are treated differently because the pointer device on the back-end X
77server also controls the location of the hardware X cursor.  Full
78support for XInput extension devices is provided.
79</para>
80
81<para>Rendering requests will be accepted by the front-end server; however,
82rendering to visible windows will be broken down as needed and sent to
83the appropriate back-end server(s) via X11 library calls for actual
rendering.  The basic framework will follow an Xnest-style approach.  GC
85state will be managed in the front-end server and sent to the
86appropriate back-end server(s) as required.  Pixmap rendering will (at
87least initially) be handled by the front-end X server.  Windowing
requests (e.g., ordering, mapping, moving, etc.) will be handled in the
89front-end server.  If the request requires a visible change, the
90windowing operation will be translated into requests for the appropriate
91back-end server(s).  Window state will be mirrored in the back-end
92server(s) as needed.
93</para>
94</sect2>
95
96<sect2>
97<title>Layout of Paper</title>
98
99<para>The next section describes the general development plan that was
100actually used for implementation.  The final section discusses
101outstanding issues at the conclusion of development.  The first appendix
102provides low-level technical detail that may be of interest to those
103intimately familiar with the X server architecture.  The final appendix
104describes the four phases of development that were performed during the
105first two years of development.
106</para>
107
108<para>The final year of work was divided into 9 tasks that are not
109described in specific sections of this document.  The major tasks during
110that time were the enhancement of the reconfiguration ability added in
111Phase IV, addition of support for a dynamic number of back-end displays
112(instead of a hard-coded limit), and the support for back-end display
113and input removal and addition.  This work is mentioned in this paper,
114but is not covered in detail.
115</para>
116</sect2>
117</sect1>
118
119<!-- ============================================================ -->
120<sect1>
121<title>Development plan</title>
122
123<para>This section describes the development plan from approximately June
1242001 through July 2003.
125</para>
126
127<sect2>
128<title>Bootstrap code</title>
129
<para>To allow for rapid development of the DMX server by multiple
developers during the first development stage, the problem was
broken down into three tasks: the overall DMX framework, back-end
rendering services, and input device handling services.  However, before
work began on these tasks, a simple framework that each developer
could use was implemented to bootstrap the development effort.  This
136framework renders to a single back-end server and provides dummy input
137devices (i.e., the keyboard and mouse).  The simple back-end rendering
138service was implemented using the shadow framebuffer support currently
139available in the XFree86 environment.
140</para>
141
142<para>Using this bootstrapping framework, each developer has been able to
143work on each of the tasks listed above independently as follows: the
144framework will be extended to handle arbitrary back-end server
145configurations; the back-end rendering services will be transitioned to
146the more efficient Xnest-style implementation; and, an input device
147framework to handle various input devices via the input extension will
148be developed.
149</para>
150
<para>Status: The bootstrap code is complete.   <!-- August 2001 -->
152</para>
153
154</sect2>
155
156<sect2>
157<title>Input device handling</title>
158
159<para>An X server (including the front-end X server) requires two core
160input devices -- a keyboard and a pointer (mouse).  These core devices
161are handled and required by the core X11 protocol.  Additional types of
input devices may be attached and utilized via the XInput extension.
These are usually referred to as "XInput extension devices".
164</para>
165
166<para>There are some options as to how the front-end X server gets its core
167input devices:
168
169<orderedlist>
170<listitem>
171    <para>Local Input. The physical input devices (e.g., keyboard and
172    mouse) can be attached directly to the front-end X server.  In this
173    case, the keyboard and mouse on the machine running the front-end X
174    server will be used.  The front-end will have drivers to read the
175    raw input from those devices and convert it into the required X
176    input events (e.g., key press/release, pointer button press/release,
177    pointer motion).  The front-end keyboard driver will keep track of
178    keyboard properties such as key and modifier mappings, autorepeat
    state, keyboard sound, and LED state.  Similarly, the front-end
    pointer driver will keep track of pointer properties such as the
181    button mapping and movement acceleration parameters.  With this
182    option, input is handled fully in the front-end X server, and the
183    back-end X servers are used in a display-only mode.  This option was
184    implemented and works for a limited number of Linux-specific
185    devices.  Adding additional local input devices for other
186    architectures is expected to be relatively simple.
187</para>
188
189    <para>The following options are available for implementing local input
190    devices:
191
192<orderedlist>
193<listitem>
194        <para>The XFree86 X server has modular input drivers that could
195        be adapted for this purpose.  The mouse driver supports a wide
196        range of mouse types and interfaces, as well as a range of
197        Operating System platforms.  The keyboard driver in XFree86 is
198        not currently as modular as the mouse driver, but could be made
199        so.  The XFree86 X server also has a range of other input
200        drivers for extended input devices such as tablets and touch
201        screens.  Unfortunately, the XFree86 drivers are generally
202        complex, often simultaneously providing support for multiple
203        devices across multiple architectures; and rely so heavily on
204        XFree86-specific helper-functions, that this option was not
205        pursued.
206</para>
207</listitem>
208
209<listitem>
210        <para>The <command>kdrive</command> X server in XFree86 has built-in drivers that
        support PS/2 mice and keyboards under Linux.  The mouse driver
        can indirectly handle other mouse types if the Linux utility
        <command>gpm</command> is used to translate the native mouse protocol into
214        PS/2 mouse format.  These drivers could be adapted and built in
215        to the front-end X server if this range of hardware and OS
216        support is sufficient.  While much simpler than the XFree86
217        drivers, the <command>kdrive</command> drivers were not used for the DMX
218        implementation.
219</para>
220</listitem>
221
222<listitem>
223        <para>Reimplementation of keyboard and mouse drivers from
224        scratch for the DMX framework.  Because keyboard and mouse
225        drivers are relatively trivial to implement, this pathway was
226        selected.  Other drivers in the X source tree were referenced,
227        and significant contributions from other drivers are noted in
228        the DMX source code.
229</para>
230</listitem>
231</orderedlist>
232</para>
233</listitem>
234
235<listitem>
236    <para>Backend Input.  The front-end can make use of the core input
237    devices attached to one or more of the back-end X servers.  Core
238    input events from multiple back-ends are merged into a single input
239    event stream.  This can work sanely when only a single set of input
240    devices is used at any given time.  The keyboard and pointer state
241    will be handled in the front-end, with changes propagated to the
242    back-end servers as needed.  This option was implemented and works
243    well.  Because the core pointer on a back-end controls the hardware
244    mouse on that back-end, core pointers cannot be treated as XInput
    extension devices.  However, all back-end XInput extension devices
246    can be mapped to either DMX core or DMX XInput extension devices.
247</para>
248</listitem>
249
250<listitem>
251    <para>Console Input.  The front-end server could create a console
252    window that is displayed on an X server independent of the back-end
253    X servers.  This console window could display things like the
254    physical screen layout, and the front-end could get its core input
255    events from events delivered to the console window.  This option was
256    implemented and works well.  To help the human navigate, window
257    outlines are also displayed in the console window.  Further, console
258    windows can be used as either core or XInput extension devices.
259</para>
260</listitem>
261
262<listitem>
263    <para>Other options were initially explored, but they were all
264    partial subsets of the options listed above and, hence, are
265    irrelevant.
266</para>
267</listitem>
268
269</orderedlist>
270</para>
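<para>As a rough illustration of the Backend Input option above, the
following Python sketch shows one way pointer positions reported by
different back-ends could be translated into a single global coordinate
space before being merged into one event stream.  The names
BackendScreen, MotionEvent, to_global, and merge_motion are hypothetical
and do not come from the DMX source; the sketch also assumes that events
are simply processed in arrival order.
</para>

<programlisting><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendScreen:
    """Hypothetical record: where a back-end screen sits in the global display."""
    name: str
    origin_x: int
    origin_y: int


@dataclass(frozen=True)
class MotionEvent:
    """Hypothetical pointer event in the coordinates of one back-end screen."""
    screen: int
    x: int
    y: int


def to_global(screens, event):
    """Translate a back-end-local pointer position to global coordinates."""
    origin = screens[event.screen]
    return (origin.origin_x + event.x, origin.origin_y + event.y)


def merge_motion(screens, events):
    """Merge events from all back-ends, in arrival order, into one stream."""
    return [to_global(screens, event) for event in events]
```
]]></programlisting>

<para>With two side-by-side 1280-pixel-wide screens, a pointer position
of (10, 20) on the second back-end would map to (1290, 20) in the
unified display under these assumptions.
</para>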
271
272<para>Although extended input devices are not specifically mentioned in the
273Distributed X requirements, the options above were all implemented so
274that XInput extension devices were supported.
275</para>
276
277<para>The bootstrap code (Xdmx) had dummy input devices, and these are
278still supported in the final version.  These do the necessary
279initialization to satisfy the X server's requirements for core pointer
280and keyboard devices, but no input events are ever generated.
281</para>
282
283<para>Status: The input code is complete.  Because of the complexity of the
284XFree86 input device drivers (and their heavy reliance on XFree86
285infrastructure), separate low-level device drivers were implemented for
286Xdmx.  The following kinds of drivers are supported (in general, the
287devices can be treated arbitrarily as "core" input devices or as XInput
288"extension" devices; and multiple instances of different kinds of
289devices can be simultaneously available):
290<orderedlist>
291<listitem>
        <para> A "dummy" device driver that never generates events.
293</para>
294</listitem>
295
296<listitem>
297        <para> "Local" input is from the low-level hardware on which the
298        Xdmx binary is running.  This is the only area where using the
299        XFree86 driver infrastructure would have been helpful, and then
300        only partially, since good support for generic USB devices does
301        not yet exist in XFree86 (in any case, XFree86 and kdrive driver
302        code was used where possible).  Currently, the following local
303        devices are supported under Linux (porting to other operating
304        systems should be fairly straightforward):
305        <itemizedlist>
306            <listitem><para>Linux keyboard</para></listitem>
307            <listitem><para>Linux serial mouse (MS)</para></listitem>
308            <listitem><para>Linux PS/2 mouse</para></listitem>
309            <listitem><para>USB keyboard</para></listitem>
310            <listitem><para>USB mouse</para></listitem>
311            <listitem><para>USB generic device (e.g., joystick, gamepad, etc.)</para></listitem>
312        </itemizedlist>
313</para>
314</listitem>
315
316<listitem>
317        <para> "Backend" input is taken from one or more of the back-end
318        displays.  In this case, events are taken from the back-end X
319        server and are converted to Xdmx events.  Care must be taken so
320        that the sprite moves properly on the display from which input
321        is being taken.
322</para>
323</listitem>
324
325<listitem>
326        <para> "Console" input is taken from an X window that Xdmx
327        creates on the operator's display (i.e., on the machine running
328        the Xdmx binary).  When the operator's mouse is inside the
329        console window, then those events are converted to Xdmx events.
330        Several special features are available: the console can display
331        outlines of windows that are on the Xdmx display (to facilitate
332        navigation), the cursor can be confined to the console, and a
333        "fine" mode can be activated to allow very precise cursor
334        positioning.
335</para>
336</listitem>
337</orderedlist>
338
339</para>
340
341</sect2>
342
343<!-- May 2002; July 2003 -->
344
345<sect2>
346<title>Output device handling</title>
347
348<para>The output of the DMX system displays rendering and windowing
349requests across multiple screens.  The screens are typically arranged in
350a grid such that together they represent a single large display.
351</para>
352
353<para>The output section of the DMX code consists of two parts.  The first
354is in the front-end proxy X server (Xdmx), which accepts client
355connections, manages the windows, and potentially renders primitives but
356does not actually display any of the drawing primitives.  The second
357part is the back-end X server(s), which accept commands from the
358front-end server and display the results on their screens.
359</para>
360
361<sect3>
362<title>Initialization</title>
363
364<para>The DMX front-end must first initialize its screens by connecting to
365each of the back-end X servers and collecting information about each of
366these screens.  However, the information collected from the back-end X
367servers might be inconsistent.  Handling these cases can be difficult
and/or inefficient.  For example, consider a two-screen system in which
one back-end X server runs at 16bpp while the second runs at 32bpp.
370Converting rendering requests (e.g., XPutImage() or XGetImage()
371requests) to the appropriate bit depth can be very time consuming.
372Analyzing these cases to determine how or even if it is possible to
373handle them is required.  The current Xinerama code handles many of
374these cases (e.g., in PanoramiXConsolidate()) and will be used as a
375starting point.  In general, the best solution is to use homogeneous X
376servers and display devices.  Using back-end servers with the same depth
377is a requirement of the final DMX implementation.
378</para>
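<para>The same-depth requirement can be pictured as a simple
consistency check performed after the back-end information has been
collected.  The Python sketch below is only an illustration: the
function name mismatched_depths and the display names used with it are
hypothetical, and the real consolidation logic (e.g., in
PanoramiXConsolidate()) considers more than depth.
</para>

<programlisting><![CDATA[
```python
def mismatched_depths(depths):
    """Return the back-ends whose depth differs from the first one listed.

    `depths` maps a back-end display name to its bits per pixel.
    """
    names = list(depths)
    if not names:
        return []
    reference = depths[names[0]]
    return [name for name in names if depths[name] != reference]
```
]]></programlisting>

<para>An empty result means that all back-ends report the same depth;
any names returned identify back-ends that would have to be reconfigured
before they could be used together.
</para>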
379
380<para>Once this screen consolidation is finished, the relative position of
381each back-end X server's screen in the unified screen is initialized.  A
382full-screen window is opened on each of the back-end X servers, and the
383cursor on each screen is turned off.  The final DMX implementation can
384also make use of a partial-screen window, or multiple windows per
385back-end screen.
386</para>
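<para>For the common case of a regular grid of identically sized
screens, the relative position of each back-end screen follows directly
from its index.  The Python sketch below assumes row-major ordering and
uniform screen sizes; the function name grid_origins is hypothetical,
and real configurations may use arbitrary origins instead.
</para>

<programlisting><![CDATA[
```python
def grid_origins(count, columns, width, height):
    """Return the (x, y) origin of each screen in a row-major grid.

    Assumes every back-end screen has the same width and height.
    """
    return [((index % columns) * width, (index // columns) * height)
            for index in range(count)]
```
]]></programlisting>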
387</sect3>
388
389<sect3>
390<title>Handling rendering requests</title>
391
392<para>After initialization, X applications connect to the front-end server.
393There are two possible implementations of how rendering and windowing
394requests are handled in the DMX system:
395
396<orderedlist>
397<listitem>
398    <para>A shadow framebuffer is used in the front-end server as the
399    render target.  In this option, all protocol requests are completely
400    handled in the front-end server.  All state and resources are
401    maintained in the front-end including a shadow copy of the entire
402    framebuffer.  The framebuffers attached to the back-end servers are
403    updated by XPutImage() calls with data taken directly from the
404    shadow framebuffer.
405</para>
406
407    <para>This solution suffers from two main problems.  First, it does not
408    take advantage of any accelerated hardware available in the system.
409    Second, the size of the XPutImage() calls can be quite large and
410    thus will be limited by the bandwidth available.
411</para>
412
413    <para>The initial DMX implementation used a shadow framebuffer by
414    default.
415</para>
416</listitem>
417
418<listitem>
419    <para>Rendering requests are sent to each back-end server for
420    handling (as is done in the Xnest server described above).  In this
421    option, certain protocol requests are handled in the front-end
422    server and certain requests are repackaged and then sent to the
423    back-end servers.  The framebuffer is distributed across the
424    multiple back-end servers.  Rendering to the framebuffer is handled
425    on each back-end and can take advantage of any acceleration
426    available on the back-end servers' graphics display device.  State
427    is maintained both in the front and back-end servers.
428</para>
429
430    <para>This solution suffers from two main drawbacks.  First, protocol
431    requests are sent to all back-end servers -- even those that will
432    completely clip the rendering primitive -- which wastes bandwidth
433    and processing time.  Second, state is maintained both in the front-
434    and back-end servers.  These drawbacks are not as severe as in
435    option 1 (above) and can either be overcome through optimizations or
436    are acceptable.  Therefore, this option will be used in the final
437    implementation.
438</para>
439
440    <para>The final DMX implementation defaults to this mechanism, but also
441    supports the shadow framebuffer mechanism.  Several optimizations
442    were implemented to eliminate the drawbacks of the default
mechanism.  These optimizations are described in the section below and
444    in Phase II of the Development Results (see appendix).
445</para>
446</listitem>
447
448</orderedlist>
449</para>
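<para>The bandwidth concern with the shadow framebuffer option can be
made concrete with a back-of-the-envelope estimate.  The Python sketch
below is an illustration only: it counts raw pixel data, ignores
protocol overhead and scanline padding, and the function names and the
link speed used in the example are assumptions rather than measurements.
</para>

<programlisting><![CDATA[
```python
def putimage_bytes(width, height, bits_per_pixel):
    """Bytes of pixel data for one image, ignoring protocol overhead and padding."""
    return width * height * bits_per_pixel // 8


def full_updates_per_second(width, height, bits_per_pixel, link_bits_per_second):
    """Upper bound on full-screen updates per second over a given link."""
    bytes_per_second = link_bits_per_second / 8
    return bytes_per_second / putimage_bytes(width, height, bits_per_pixel)
```
]]></programlisting>

<para>Under these assumptions, a 1280x1024 screen at 32 bits per pixel
needs about 5 MB of pixel data per full update, so a 100 Mbit/s link
could carry at most roughly two full updates per second to a single
back-end.
</para>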
450
<para>Status: Both the shadow framebuffer code and the Xnest-style code are complete.
452<!-- May 2002 -->
453</para>
454
455</sect3>
456</sect2>
457
458<sect2>
459<title>Optimizing DMX</title>
460
461<para>Initially, the Xnest-style solution's performance will be measured
462and analyzed to determine where the performance bottlenecks exist.
463There are four main areas that will be addressed.
464</para>
465
466<para>First, to obtain reasonable interactivity with the first development
467phase, XSync() was called after each protocol request.  The XSync()
468function flushes any pending protocol requests.  It then waits for the
469back-end to process the request and send a reply that the request has
470completed.  This happens with each back-end server and performance
471greatly suffers.  As a result of the way XSync() is called in the first
472development phase, the batching that the X11 library performs is
473effectively defeated.  The XSync() call usage will be analyzed and
474optimized by batching calls and performing them at regular intervals,
475except where interactivity will suffer (e.g., on cursor movements).
476</para>
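<para>The effect of batching can be illustrated by counting round trips.
The Python sketch below models a connection to one back-end that
synchronizes either when a request is marked urgent (e.g., a cursor
movement) or after a fixed number of requests have accumulated.  The
class BackendConnection is hypothetical, and a count-based interval is
used here as a stand-in for whatever interval the real implementation
uses.
</para>

<programlisting><![CDATA[
```python
class BackendConnection:
    """Toy model of one back-end connection that counts sync round trips."""

    def __init__(self, sync_interval):
        self.sync_interval = sync_interval  # requests allowed between syncs
        self.pending = 0                    # requests queued since the last sync
        self.round_trips = 0                # number of simulated XSync() calls

    def send(self, urgent=False):
        """Queue one request; sync if it is urgent or the batch is full."""
        self.pending += 1
        if urgent or self.pending >= self.sync_interval:
            self.sync()

    def sync(self):
        """Simulate XSync(): flush the queue and wait for the back-end."""
        if self.pending:
            self.round_trips += 1
            self.pending = 0
```
]]></programlisting>

<para>With a sync interval of 1 (the behavior of the first development
phase), 25 requests cost 25 round trips in this model; with an interval
of 10, the same requests cost 2 round trips, plus one more when an
urgent request forces an early sync.
</para>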
477
478<para>Second, the initial Xnest-style solution described above sends the
479repackaged protocol requests to all back-end servers regardless of
480whether or not they would be completely clipped out.  The requests that
are trivially rejected on the back-end server waste the limited
482bandwidth available.  By tracking clipping changes in the DMX X server's
483windowing code (e.g., by opening, closing, moving or resizing windows),
484we can determine whether or not back-end windows are visible so that
485trivial tests in the front-end server's GC ops drawing functions can
486eliminate these unnecessary protocol requests.
487</para>
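<para>A simplified form of such a trivial test is a rectangle
intersection between the extents of a drawing request and each back-end
screen.  The Python sketch below is an illustration only: rectangles are
(x, y, width, height) tuples in global coordinates, the function names
are hypothetical, and the real server tracks window visibility rather
than raw screen rectangles.
</para>

<programlisting><![CDATA[
```python
def intersects(a, b):
    """Return True if two (x, y, width, height) rectangles overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def target_screens(screens, extents):
    """Return the indices of the screens that a request's extents touch."""
    return [index for index, screen in enumerate(screens)
            if intersects(screen, extents)]
```
]]></programlisting>

<para>A request whose extents fall entirely within one screen is sent
only to that back-end, and a request that touches no screen is not sent
at all.
</para>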
488
489<para>Third, each protocol request will be analyzed to determine if it is
490possible to break the request into smaller pieces at display boundaries.
491The initial ones to be analyzed are put and get image requests since
492they will require the greatest bandwidth to transmit data between the
493front and back-end servers.  Other protocol requests will be analyzed
494and those that will benefit from breaking them into smaller requests
495will be implemented.
496</para>
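<para>For an image request, breaking the request at display boundaries
amounts to intersecting the destination rectangle with each screen and
recording which part of the source image belongs to each piece.  The
Python sketch below is an illustration under the same conventions as
before: rectangles are (x, y, width, height) tuples in global
coordinates, and the function name and the dictionary keys are
hypothetical.
</para>

<programlisting><![CDATA[
```python
def split_at_boundaries(screens, rect):
    """Split a global rectangle into per-screen pieces.

    Each piece records the screen index, the destination position in that
    screen's local coordinates, the offset into the source image, and the
    size of the piece.
    """
    x, y, w, h = rect
    pieces = []
    for index, (sx, sy, sw, sh) in enumerate(screens):
        left, top = max(x, sx), max(y, sy)
        right, bottom = min(x + w, sx + sw), min(y + h, sy + sh)
        if left < right and top < bottom:
            pieces.append({
                "screen": index,
                "dst": (left - sx, top - sy),
                "src": (left - x, top - y),
                "size": (right - left, bottom - top),
            })
    return pieces
```
]]></programlisting>

<para>Each back-end then receives only the pixels it will actually
display, rather than the full image.
</para>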
497
498<para>Fourth, an extension is being considered that will allow font glyphs to
499be transferred from the front-end DMX X server to each back-end server.
500This extension will permit the front-end to handle all font requests and
501eliminate the requirement that all back-end X servers share the exact
502same fonts as the front-end server.  We are investigating the
503feasibility of this extension during this development phase.
504</para>
505
506<para>Other potential optimizations will be determined from the performance
507analysis.
508</para>
509
510<para>Please note that in our initial design, we proposed optimizing BLT
511operations (e.g., XCopyArea() and window moves) by developing an
512extension that would allow individual back-end servers to directly copy
513pixel data to other back-end servers.  This potential optimization was
514in response to the simple image movement implementation that required
515potentially many calls to GetImage() and PutImage().  However, the
516current Xinerama implementation handles these BLT operations
517differently.  Instead of copying data to and from screens, they generate
518expose events -- just as happens in the case when a window is moved from
519off a screen to on screen.  This approach saves the limited bandwidth
520available between front and back-end servers and is being standardized
521with Xinerama.  It also eliminates the potential setup problems and
522security issues resulting from having each back-end server open
523connections to all other back-end servers.  Therefore, we suggest
524accepting Xinerama's expose event solution.
525</para>
526
527<para>Also note that the approach proposed in the second and third
528optimizations might cause backing store algorithms in the back-end to be
529defeated, so a DMX X server configuration flag will be added to disable
530these optimizations.
531</para>
532
533<para>Status: The optimizations proposed above are complete.  It was
determined that using the xfs font server was sufficient and
535creating a new mechanism to pass glyphs was redundant; therefore, the
536fourth optimization proposed above was not included in DMX.
537<!-- September 2002 -->
538</para>
539
540</sect2>
541
542<sect2>
543<title>DMX X extension support</title>
544
545<para>The DMX X server keeps track of all the windowing information on the
546back-end X servers, but does not currently export this information to
547any client applications.  An extension will be developed to pass the
548screen information and back-end window IDs to DMX-aware clients.  These
549clients can then use this information to directly connect to and render
550to the back-end windows.  Bypassing the DMX X server allows DMX-aware
551clients to break up complex rendering requests on their own and send
552them directly to the windows on the back-end server's screens.  An
553example of a client that can make effective use of this extension is
554Chromium.
555</para>
556
557<para>Status: The extension, as implemented, is fully documented in
558"Client-to-Server DMX Extension to the X Protocol".  Future changes
559might be required based on feedback and other proposed enhancements to
560DMX.  Currently, the following facilities are supported:
561<orderedlist>
562<listitem><para>
563        Screen information (clipping rectangle for each screen relative
564        to the virtual screen)
565</para></listitem>
566<listitem><para>
567        Window information (window IDs and clipping information for each
568        back-end window that corresponds to each DMX window)
569</para></listitem>
570<listitem><para>
571        Input device information (mappings from DMX device IDs to
572        back-end device IDs)
573</para></listitem>
574<listitem><para>
575        Force window creation (so that a client can override the
576        server-side lazy window creation optimization)
577</para></listitem>
578<listitem><para>
579        Reconfiguration (so that a client can request that a screen
580        position be changed)
581</para></listitem>
582<listitem><para>
583        Addition and removal of back-end servers and back-end and
584        console inputs.
585</para></listitem>
586</orderedlist>
587</para>
588<!-- September 2002; July 2003 -->
589
590</sect2>
591
592<sect2>
593<title>Common X extension support</title>
594
595<para>The XInput, XKeyboard and Shape extensions are commonly used
596extensions to the base X11 protocol.  XInput allows multiple and
597non-standard input devices to be accessed simultaneously.  These input
598devices can be connected to either the front-end or back-end servers.
XKeyboard allows much finer control over keyboard mappings.  Shape adds
600support for arbitrarily shaped windows and is used by various window
601managers.  Nearly all potential back-end X servers make these extensions
602available, and support for each one will be added to the DMX system.
603</para>
604
605<para>In addition to the extensions listed above, support for the X
606Rendering extension (Render) is being developed.  Render adds digital
607image composition to the rendering model used by the X Window System.
608While this extension is still under development by Keith Packard of HP,
609support for the current version will be added to the DMX system.
610</para>
611
612<para>Support for the XTest extension was added during the first
613development phase.
614</para>
615
616<!-- WARNING: this list is duplicated in the Phase IV discussion -->
617<para>Status: The following extensions are supported and are discussed in
618more detail in Phase IV of the Development Results (see appendix):
619    BIG-REQUESTS,
620    DEC-XTRAP,
621    DMX,
622    DPMS,
623    Extended-Visual-Information,
624    GLX,
625    LBX,
626    RECORD,
627    RENDER,
628    SECURITY,
629    SHAPE,
630    SYNC,
631    X-Resource,
632    XC-APPGROUP,
633    XC-MISC,
634    XFree86-Bigfont,
635    XINERAMA,
636    XInputExtension,
637    XKEYBOARD, and
638    XTEST.
639<!-- November 2002; updated February 2003, July 2003 -->
640</para>
641</sect2>
642
643<sect2>
644<title>OpenGL support</title>
645
646<para>OpenGL support using the Mesa code base exists in XFree86 release 4
647and later.  Currently, the direct rendering infrastructure (DRI)
648provides accelerated OpenGL support for local clients and unaccelerated
649OpenGL support (i.e., software rendering) is provided for non-local
650clients.
651</para>
652
653<para>The single head OpenGL support in XFree86 4.x will be extended to use
654the DMX system.  When the front and back-end servers are on the same
655physical hardware, it is possible to use the DRI to directly render to
656the back-end servers.  First, the existing DRI will be extended to
657support multiple display heads, and then to support the DMX system.
OpenGL rendering requests will be rendered directly to each back-end X
659server.  The DRI will request the screen layout (either from the
660existing Xinerama extension or a DMX-specific extension).  Support for
661synchronized swap buffers will also be added (on hardware that supports
662it).  Note that a single front-end server with a single back-end server
663on the same physical machine can emulate accelerated indirect rendering.
664</para>
665
666<para>When the front and back-end servers are on different physical
667hardware or are using non-XFree86 4.x X servers, a mechanism to render
668primitives across the back-end servers will be provided.  There are
669several options as to how this can be implemented.
670</para>
671
672<orderedlist>
673<listitem>
674    <para>The existing OpenGL support in each back-end server can be
675    used by repackaging rendering primitives and sending them to each
676    back-end server.  This option is similar to the unoptimized
677    Xnest-style approach mentioned above.  Optimization of this solution
678    is beyond the scope of this project and is better suited to other
679    distributed rendering systems.
680</para></listitem>
681
682<listitem>
683    <para>Rendering to a pixmap in the front-end server using the
684    current XFree86 4.x code, and then displaying to the back-ends via
685    calls to XPutImage() is another option.  This option is similar to
686    the shadow frame buffer approach mentioned above.  It is slower and
687    bandwidth intensive, but has the advantage that the back-end servers
688    are not required to have OpenGL support.
689</para></listitem>
690</orderedlist>
691
692<para>These, and other, options will be investigated in this phase of the
693work.
694</para>
695
<para>Work by others has made Chromium DMX-aware.  Chromium will use the
697DMX X protocol extension to obtain information about the back-end
698servers and will render directly to those servers, bypassing DMX.
699</para>
700
701<para>Status: OpenGL support by the glxProxy extension was implemented by
702SGI and has been integrated into the DMX code base.
703</para>
704<!-- May 2003-->
705</sect2>
706
707</sect1>
708
709<!-- ============================================================ -->
710<sect1>
711<title>Current issues</title>
712
<para>This section outlines the current issues that require further
investigation.
715</para>
716
717<sect2>
718<title>Fonts</title>
719
720<para>The font path and glyphs need to be the same for the front-end and
721each of the back-end servers.  Font glyphs could be sent to the back-end
722servers as necessary but this would consume a significant amount of
723available bandwidth during font rendering for clients that use many
724different fonts (e.g., Netscape).  Initially, the font server (xfs) will
725be used to provide the fonts to both the front-end and back-end servers.
726Other possibilities will be investigated during development.
727</para>
728</sect2>
729
730<sect2>
731<title>Zero width rendering primitives</title>
732
733<para>To allow pixmap and on-screen rendering to be pixel perfect, all
734back-end servers must render zero width primitives exactly the same as
735the front-end renders the primitives to pixmaps.  For those back-end
736servers that do not exactly match, zero width primitives will be
737automatically converted to one width primitives.  This can be handled in
738the front-end server via the GC state.
739</para>
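
<para>The conversion described above can be sketched as follows.  This is
a minimal illustration only: the SkGC and SkBackend types, the
zero_width_matches flag and the function name are hypothetical stand-ins
and are not taken from the DMX sources.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the line-width part of a GC. */
typedef struct {
    unsigned int line_width;    /* 0 requests a zero width ("thin") line */
} SkGC;

/* Hypothetical per-back-end record.  The flag is true when the back-end
 * renders zero width primitives exactly as the front-end does. */
typedef struct {
    bool zero_width_matches;
} SkBackend;

/* Return the line width that the front-end would send to one back-end:
 * zero width is promoted to width one only for back-ends that do not
 * match the front-end's zero width rendering. */
unsigned int
sk_backend_line_width(const SkBackend *be, const SkGC *gc)
{
    assert(be != NULL && gc != NULL);

    if (gc->line_width == 0 && !be->zero_width_matches)
        return 1;
    return gc->line_width;
}
```
]]></programlisting>

<para>In this sketch the client-visible GC state is left untouched; only
the value forwarded to a non-matching back-end changes.
</para>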
740</sect2>
741
742<sect2>
743<title>Output scaling</title>
744
745<para>With very large tiled displays, it might be difficult to read the
746information on the standard X desktop.  In particular, the cursor can be
747easily lost and fonts could be difficult to read.  Automatic primitive
748scaling might prove to be very useful.  We will investigate the
749possibility of scaling the cursor and providing a set of alternate
750pre-scaled fonts to replace the standard fonts that many applications
751use (e.g., fixed).  Other options for automatic scaling will also be
752investigated.
753</para>
754</sect2>
755
756<sect2>
757<title>Per-screen colormaps</title>
758
<para>Each screen's default colormap in the set of back-end X servers
should be adjustable via a configuration utility.  This support
would allow the back-end screens to be calibrated via custom gamma
tables.  On 24-bit systems that support a DirectColor visual, this type
of correction can be accommodated.  One possible implementation would be
to advertise a TrueColor visual to X clients of the DMX server while
using DirectColor visuals on the back-end servers to implement this type
of color correction.  Other options will be investigated.
767</para>
768</sect2>
769</sect1>
770
771<!-- ============================================================ -->
772<appendix>
773<title>Appendix</title>
774
775<sect1>
776<title>Background</title>
777
778<para>This section describes the existing Open Source architectures that
779can be used to handle multiple screens and upon which this development
780project is based.  This section was written before the implementation
781was finished, and may not reflect actual details of the implementation.
782It is left for historical interest only.
783</para>
784
785<sect2>
786<title>Core input device handling</title>
787
788<para>The following is a description of how core input devices are handled
789by an X server.
790</para>
791
792<sect3>
793<title>InitInput()</title>
794
795<para>InitInput() is a DDX function that is called at the start of each
796server generation from the X server's main() function.  Its purpose is
797to determine what input devices are connected to the X server, register
798them with the DIX and MI layers, and initialize the input event queue.
799InitInput() does not have a return value, but the X server will abort if
either a core keyboard device or a core pointer device is not
801registered.  Extended input (XInput) devices can also be registered in
802InitInput().
803</para>
804
805<para>InitInput() usually has implementation specific code to determine
806which input devices are available.  For each input device it will be
807using, it calls AddInputDevice():
808
809<variablelist>
810<varlistentry>
811<term>AddInputDevice()</term>
812<listitem><para>This DIX function allocates the device structure,
813registers a callback function (which handles device init, close, on and
814off), and returns the input handle, which can be treated as opaque.  It
815is called once for each input device.
816</para></listitem>
817</varlistentry>
818</variablelist>
819</para>
820
<para>Input handles for the core keyboard and core pointer devices are
obtained from AddInputDevice().  Unless both core devices are
registered, the X server will exit with a fatal error when it
attempts to start the input devices in InitAndStartDevices(), which is
called directly after InitInput() (see below).
826</para>
827
828<para>The core pointer device is then registered with the miPointer code
829(which does the high level cursor handling).  While this registration
830is not necessary for correct miPointer operation in the current XFree86
831code, it is still done mostly for compatibility reasons.
832</para>
833
834<para><variablelist>
835
836<varlistentry>
837<term>miRegisterPointerDevice()</term>
838<listitem><para>This MI function registers the core
pointer's input handle with the miPointer code.
840</para></listitem></varlistentry>
841</variablelist>
842</para>
843
844<para>The final part of InitInput() is the initialization of the input
845event queue handling.  In most cases, the event queue handling provided
846in the MI layer is used.  The primary XFree86 X server uses its own
847event queue handling to support some special cases related to the XInput
848extension and the XFree86-specific DGA extension.  For our purposes, the
849MI event queue handling should be suitable.  It is initialized by
850calling mieqInit():
851
852<variablelist>
853<varlistentry>
854<term>mieqInit()</term>
855<listitem><para>This MI function initializes the MI event queue for the
856core devices, and is passed the public component of the input handles
857for the two core devices.
858</para></listitem></varlistentry>
859</variablelist>
860</para>
861
862<para>If a wakeup handler is required to deliver synchronous input
863events, it can be registered here by calling the DIX function
864RegisterBlockAndWakeupHandlers().  (See the devReadInput() description
865below.)
866</para>
867</sect3>
868
869<sect3>
870<title>InitAndStartDevices()</title>
871
872<para>InitAndStartDevices() is a DIX function that is called immediately
873after InitInput() from the X server's main() function.  Its purpose is
874to initialize each input device that was registered with
875AddInputDevice(), enable each input device that was successfully
876initialized, and create the list of enabled input devices.  Once each
877registered device is processed in this way, the list of enabled input
878devices is checked to make sure that both a core keyboard device and
879core pointer device were registered and successfully enabled.  If not,
InitAndStartDevices() returns failure, which results in the X server
exiting with a fatal error.
882</para>
883
884<para>Each registered device is initialized by calling its callback
885(dev-&gt;deviceProc) with the DEVICE_INIT argument:
886
887<variablelist>
888<varlistentry>
889<term>(*dev-&gt;deviceProc)(dev, DEVICE_INIT)</term>
890<listitem>
891<para>This function initializes the
892device structs with core information relevant to the device.
893</para>
894
895<para>For pointer devices, this means specifying the number of buttons,
896default button mapping, the function used to get motion events (usually
897miPointerGetMotionEvents()), the function used to change/control the
898core pointer motion parameters (acceleration and threshold), and the
899motion buffer size.
900</para>
901
902<para>For keyboard devices, this means specifying the keycode range,
903default keycode to keysym mapping, default modifier mapping, and the
904functions used to sound the keyboard bell and modify/control the
905keyboard parameters (LEDs, bell pitch and duration, key click, which
906keys are auto-repeating, etc).
907</para></listitem></varlistentry>
908</variablelist>
909</para>
910
911<para>Each initialized device is enabled by calling EnableDevice():
912
913<variablelist>
914<varlistentry>
915<term>EnableDevice()</term>
916<listitem>
917<para>EnableDevice() calls the device callback with
918DEVICE_ON:
919    <variablelist>
920    <varlistentry>
921    <term>(*dev-&gt;deviceProc)(dev, DEVICE_ON)</term>
922    <listitem>
923    <para>This typically opens and
924    initializes the relevant physical device, and when appropriate,
925    registers the device's file descriptor (or equivalent) as a valid
926    input source.
927    </para></listitem></varlistentry>
928    </variablelist>
929    </para>
930
931    <para>EnableDevice() then adds the device handle to the X server's
932    global list of enabled devices.
933</para></listitem></varlistentry>
934</variablelist>
935</para>
936
<para>InitAndStartDevices() then verifies that a valid core keyboard and
a valid core pointer have been initialized and enabled.  It returns
failure if either is missing.
940</para>
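
<para>The initialize, enable and verify sequence described in this section
can be sketched as follows.  This is a simplified model and not the DIX
code: every type, constant and function prefixed with Sk, SK_ or sk_ is
hypothetical, and checking the result of the enable step is an assumption
of the sketch.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the requests passed to a device callback. */
enum { SK_DEVICE_INIT, SK_DEVICE_ON, SK_DEVICE_OFF, SK_DEVICE_CLOSE };
enum { SK_SUCCESS = 0, SK_FAILURE = 1 };

typedef enum { SK_KEYBOARD, SK_POINTER } SkDeviceKind;

typedef struct SkDevice SkDevice;
typedef int (*SkDeviceProc)(SkDevice *dev, int what);

struct SkDevice {
    SkDeviceKind kind;
    SkDeviceProc deviceProc;    /* callback registered with the device */
    bool         inited;
    bool         enabled;
};

/* Example callback that accepts every request. */
int
sk_working_proc(SkDevice *dev, int what)
{
    (void)dev;
    switch (what) {
    case SK_DEVICE_INIT:
    case SK_DEVICE_ON:
    case SK_DEVICE_OFF:
    case SK_DEVICE_CLOSE:
        return SK_SUCCESS;
    default:
        return SK_FAILURE;
    }
}

/* Example callback whose initialization always fails. */
int
sk_broken_proc(SkDevice *dev, int what)
{
    (void)dev;
    return what == SK_DEVICE_INIT ? SK_FAILURE : SK_SUCCESS;
}

/* Initialize each registered device, enable the ones that initialized,
 * then report whether both a keyboard and a pointer ended up enabled. */
bool
sk_init_and_start_devices(SkDevice *devs, size_t n)
{
    bool have_keyboard = false;
    bool have_pointer = false;
    size_t i;

    assert(devs != NULL || n == 0);

    for (i = 0; i < n; i++) {
        SkDevice *dev = &devs[i];

        dev->inited = dev->deviceProc(dev, SK_DEVICE_INIT) == SK_SUCCESS;
        if (!dev->inited)
            continue;

        dev->enabled = dev->deviceProc(dev, SK_DEVICE_ON) == SK_SUCCESS;
        if (!dev->enabled)
            continue;

        if (dev->kind == SK_KEYBOARD)
            have_keyboard = true;
        else
            have_pointer = true;
    }
    return have_keyboard && have_pointer;
}
```
]]></programlisting>

<para>A caller modelled on main() would treat a false return value as a
fatal error.
</para>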
941</sect3>
942
943<sect3>
944<title>devReadInput()</title>
945
946<para>Each device will have some function that gets called to read its
947physical input. This function should do at least two things: make sure that
948input events get enqueued, and make sure that the cursor gets moved for motion
949events (except if these are handled later by the driver's own event queue
950processing function, which cannot be done when using the MI event queue
951handling).
952</para>
953
954<para>Events are queued by calling mieqEnqueue():
955
956<variablelist>
957<varlistentry>
958<term>mieqEnqueue()</term>
959<listitem>
960<para>This MI function is used to add input events to the
961event queue.  It is simply passed the event to be queued.
962</para></listitem></varlistentry>
963</variablelist>
964</para>
965
966<para>The cursor position should be updated when motion events are
967enqueued by calling miPointerDeltaCursor():
968
969<variablelist>
970<varlistentry>
971<term>miPointerDeltaCursor()</term>
972<listitem>
973<para>This MI function is used to move the cursor
974relative to its current position.
975</para></listitem></varlistentry>
976</variablelist>
977</para>
978</sect3>
979
980<sect3>
981<title>ProcessInputEvents()</title>
982
983<para>ProcessInputEvents() is a DDX function that is called from the X
984server's main dispatch loop when new events are available in the input
985event queue.  It typically processes the enqueued events, and updates
986the cursor/pointer position.  It may also do other DDX-specific event
987processing.
988</para>
989
990<para>Enqueued events are processed by mieqProcessInputEvents() and passed
991to the DIX layer for transmission to clients:
992
993<variablelist>
994<varlistentry>
995<term>mieqProcessInputEvents()</term>
996<listitem>
997<para>This function processes each event in the
998event queue, and passes it to the device's input processing function.
999The DIX layer provides default functions to do this processing, and they
1000handle the task of getting the events passed back to the relevant
1001clients.
1002</para></listitem></varlistentry>
1003<varlistentry>
1004<term>miPointerUpdate()</term>
1005<listitem>
<para>This function resynchronizes the cursor position
1007with the new pointer position.  It also takes care of moving the cursor
1008between screens when needed in multi-head configurations.
1009</para></listitem></varlistentry>
1010</variablelist>
1011</para>
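
<para>The enqueue and process cycle described in the two sections above
can be illustrated with a small ring buffer.  This sketch does not
reproduce the MI event queue: the queue size, the event layout and all
Sk-prefixed names are assumptions made for the example.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_QUEUE_SIZE 8         /* arbitrary size chosen for the sketch */

/* Hypothetical event: a type code plus a relative motion. */
typedef struct {
    int type;
    int dx, dy;
} SkEvent;

typedef struct {
    SkEvent events[SK_QUEUE_SIZE];
    size_t  head;               /* next event to process */
    size_t  tail;               /* next free slot */
} SkEventQueue;

typedef void (*SkProcessProc)(const SkEvent *ev, void *closure);

void
sk_eq_init(SkEventQueue *q)
{
    assert(q != NULL);
    q->head = q->tail = 0;
}

/* Called from a device read function.  Returns false when the queue is
 * full, in which case the event is dropped. */
bool
sk_eq_enqueue(SkEventQueue *q, const SkEvent *ev)
{
    size_t next;

    assert(q != NULL && ev != NULL);
    next = (q->tail + 1) % SK_QUEUE_SIZE;
    if (next == q->head)
        return false;
    q->events[q->tail] = *ev;
    q->tail = next;
    return true;
}

/* Called from the dispatch loop.  Passes every queued event to the
 * processing function and returns the number of events handled. */
size_t
sk_eq_process(SkEventQueue *q, SkProcessProc proc, void *closure)
{
    size_t handled = 0;

    assert(q != NULL && proc != NULL);
    while (q->head != q->tail) {
        proc(&q->events[q->head], closure);
        q->head = (q->head + 1) % SK_QUEUE_SIZE;
        handled++;
    }
    return handled;
}

/* Example processing function: applies relative motion to a cursor. */
typedef struct {
    int x, y;
} SkCursor;

void
sk_move_cursor(const SkEvent *ev, void *closure)
{
    SkCursor *cursor = closure;

    cursor->x += ev->dx;
    cursor->y += ev->dy;
}
```
]]></programlisting>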
1012
1013</sect3>
1014
1015<sect3>
1016<title>DisableDevice()</title>
1017
<para>DisableDevice() is a DIX function that removes an input device from the
1019list of enabled devices.  The result of this is that the device no
1020longer generates input events.  The device's data structures are kept in
1021place, and disabling a device like this can be reversed by calling
1022EnableDevice().  DisableDevice() may be called from the DDX when it is
1023desirable to do so (e.g., the XFree86 server does this when VT
1024switching).  Except for special cases, this is not normally called for
1025core input devices.
1026</para>
1027
1028<para>DisableDevice() calls the device's callback function with
1029<constant>DEVICE_OFF</constant>:
1030
1031<variablelist>
1032<varlistentry>
1033<term>(*dev-&gt;deviceProc)(dev, DEVICE_OFF)</term>
1034<listitem>
1035<para>This typically closes the
1036relevant physical device, and when appropriate, unregisters the device's
1037file descriptor (or equivalent) as a valid input source.
1038</para></listitem></varlistentry>
1039</variablelist>
1040</para>
1041
1042<para>DisableDevice() then removes the device handle from the X server's
1043global list of enabled devices.
1044</para>
1045
1046</sect3>
1047
1048<sect3>
1049<title>CloseDevice()</title>
1050
<para>CloseDevice() is a DIX function that removes an input device from the
1052list of available devices.  It disables input from the device and frees
1053all data structures associated with the device.  This function is
1054usually called from CloseDownDevices(), which is called from main() at
1055the end of each server generation to close all input devices.
1056</para>
1057
1058<para>CloseDevice() calls the device's callback function with
1059<constant>DEVICE_CLOSE</constant>:
1060
1061<variablelist>
1062<varlistentry>
1063<term>(*dev-&gt;deviceProc)(dev, DEVICE_CLOSE)</term>
1064<listitem>
1065<para>This typically closes the
1066relevant physical device, and when appropriate, unregisters the device's
1067file descriptor (or equivalent) as a valid input source.  If any device
1068specific data structures were allocated when the device was initialized,
1069they are freed here.
1070</para></listitem></varlistentry>
1071</variablelist>
1072</para>
1073
1074<para>CloseDevice() then frees the data structures that were allocated
1075for the device when it was registered/initialized.
1076</para>
1077
1078</sect3>
1079
1080<sect3>
1081<title>LegalModifier()</title>
1082<!-- dmx/dmxinput.c - currently returns TRUE -->
1083<para>LegalModifier() is a required DDX function that can be used to
1084restrict which keys may be modifier keys.  This seems to be present for
1085historical reasons, so this function should simply return TRUE
1086unconditionally.
1087</para>
1088
1089</sect3>
1090</sect2>
1091
1092<sect2>
1093<title>Output handling</title>
1094
1095<para>The following sections describe the main functions required to
1096initialize, use and close the output device(s) for each screen in the X
1097server.
1098</para>
1099
1100<sect3>
1101<title>InitOutput()</title>
1102
1103<para>This DDX function is called near the start of each server generation
1104from the X server's main() function.  InitOutput()'s main purpose is to
1105initialize each screen and fill in the global screenInfo structure for
1106each screen.  It is passed three arguments: a pointer to the screenInfo
1107struct, which it is to initialize, and argc and argv from main(), which
1108can be used to determine additional configuration information.
1109</para>
1110
1111<para>The primary tasks for this function are outlined below:
1112
1113<orderedlist>
1114<listitem>
1115    <para><emphasis remap="bf">Parse configuration info:</emphasis> The first task of InitOutput()
    is to parse any configuration information from the configuration
1117    file.  In addition to the XF86Config file, other configuration
1118    information can be taken from the command line.  The command line
1119    options can be gathered either in InitOutput() or earlier in the
1120    ddxProcessArgument() function, which is called by
1121    ProcessCommandLine().  The configuration information determines the
1122    characteristics of the screen(s).  For example, in the XFree86 X
1123    server, the XF86Config file specifies the monitor information, the
1124    screen resolution, the graphics devices and slots in which they are
1125    located, and, for Xinerama, the screens' layout.
1126</para>
1127</listitem>
1128
1129<listitem>
1130    <para><emphasis remap="bf">Initialize screen info:</emphasis> The next task is to initialize
1131    the screen-dependent internal data structures.  For example, part of
1132    what the XFree86 X server does is to allocate its screen and pixmap
1133    private indices, probe for graphics devices, compare the probed
1134    devices to the ones listed in the XF86Config file, and add the ones that
1135    match to the internal xf86Screens&lsqb;&rsqb; structure.
1136</para>
1137</listitem>
1138
1139<listitem>
1140    <para><emphasis remap="bf">Set pixmap formats:</emphasis> The next task is to initialize the
1141    screenInfo's image byte order, bitmap bit order and bitmap scanline
    unit/pad.  The depth, bits per pixel and scanline padding of each of
    the screenInfo's pixmap formats are also initialized at this stage.
1144</para>
1145</listitem>
1146
1147<listitem>
1148    <para><emphasis remap="bf">Unify screen info:</emphasis> An optional task that might be done at
1149    this stage is to compare all of the information from the various
    screens and determine whether they are compatible (i.e., if the set of
1151    screens can be unified into a single desktop).  This task has
1152    potential to be useful to the DMX front-end server, if Xinerama's
1153    PanoramiXConsolidate() function is not sufficient.
1154</para>
1155</listitem>
1156</orderedlist>
1157</para>
1158
1159<para>Once these tasks are complete, the valid screens are known and each
1160of these screens can be initialized by calling AddScreen().
1161</para>
1162</sect3>
1163
1164<sect3>
1165<title>AddScreen()</title>
1166
1167<para>This DIX function is called from InitOutput(), in the DDX layer, to
1168add each new screen to the screenInfo structure.  The DDX screen
1169initialization function and command line arguments (i.e., argc and argv)
1170are passed to it as arguments.
1171</para>
1172
1173<para>This function first allocates a new Screen structure and any privates
1174that are required.  It then initializes some of the fields in the Screen
1175struct and sets up the pixmap padding information.  Finally, it calls
1176the DDX screen initialization function ScreenInit(), which is described
below.  It returns the number of the screen that was just added, or -1
1178if there is insufficient memory to add the screen or if the DDX screen
1179initialization fails.
1180</para>
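
<para>The control flow described above can be sketched as follows.  This is
a simplified model, not the DIX implementation: the structures, the
screen limit and the Sk-prefixed names are hypothetical, and the real
function also sets up privates and pixmap padding information.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SK_MAXSCREENS 4         /* arbitrary limit chosen for the sketch */

/* Hypothetical, heavily reduced Screen structure. */
typedef struct {
    int myNum;
    int width, height;
} SkScreen;

typedef bool (*SkScreenInitProc)(SkScreen *pScreen, int argc, char **argv);

typedef struct {
    int       numScreens;
    SkScreen *screens[SK_MAXSCREENS];
} SkScreenInfo;

/* Allocate a screen, run the DDX initialization function on it, and
 * return its index, or -1 on allocation or initialization failure. */
int
sk_add_screen(SkScreenInfo *info, SkScreenInitProc init,
              int argc, char **argv)
{
    SkScreen *pScreen;
    int i;

    assert(info != NULL && init != NULL);

    i = info->numScreens;
    if (i >= SK_MAXSCREENS)
        return -1;

    pScreen = calloc(1, sizeof(*pScreen));
    if (pScreen == NULL)
        return -1;
    pScreen->myNum = i;

    if (!init(pScreen, argc, argv)) {
        free(pScreen);
        return -1;
    }

    info->screens[i] = pScreen;
    info->numScreens++;
    return i;
}

/* Example DDX screen initialization functions. */
bool
sk_good_screen_init(SkScreen *pScreen, int argc, char **argv)
{
    (void)argc;
    (void)argv;
    pScreen->width = 1024;
    pScreen->height = 768;
    return true;
}

bool
sk_bad_screen_init(SkScreen *pScreen, int argc, char **argv)
{
    (void)pScreen;
    (void)argc;
    (void)argv;
    return false;
}
```
]]></programlisting>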
1181</sect3>
1182
1183<sect3>
1184<title>ScreenInit()</title>
1185
1186<para>This DDX function initializes the rest of the Screen structure with
1187either generic or screen-specific functions (as necessary).  It also
1188fills in various screen attributes (e.g., width and height in
1189millimeters, black and white pixel values).
1190</para>
1191
<para>The screen init function usually calls several helper functions to
perform parts of the screen initialization.  They are described below:
1194
1195<variablelist>
1196<varlistentry>
1197<term>{mi,*fb}ScreenInit()</term>
1198<listitem>
1199<para>The DDX layer's ScreenInit() function usually
1200calls another layer's ScreenInit() function (e.g., miScreenInit() or
1201fbScreenInit()) to initialize the fallbacks that the DDX driver does not
1202specifically handle.
1203</para>
1204
1205<para>After calling another layer's ScreenInit() function, any
1206screen-specific functions either wrap or replace the other layer's
function pointers.  If a function is to be wrapped, the old
function pointer from the other layer is stored in a screen private
1209area.  Common functions to wrap are CloseScreen() and SaveScreen().
1210</para></listitem></varlistentry>
1211
1212<varlistentry>
1213<term>miDCInitialize()</term>
1214<listitem>
1215<para>This MI function initializes the MI cursor
1216display structures and function pointers.  If a hardware cursor is used,
the DDX layer's ScreenInit() function will wrap additional screen
function pointers and the MI cursor display function pointers.
1219</para></listitem></varlistentry>
1220</variablelist>
1221</para>
1222
<para>Another common task for the ScreenInit() function is to initialize the
1224output device state.  For example, in the XFree86 X server, the
1225ScreenInit() function saves the original state of the video card and
1226then initializes the video mode of the graphics device.
1227</para>
1228</sect3>
1229
1230<sect3>
1231<title>CloseScreen()</title>
1232
1233<para>This function restores any wrapped screen functions (and in
1234particular the wrapped CloseScreen() function) and restores the state of
1235the output device to its original state.  It should also free any
1236private data it created during the screen initialization.
1237</para>
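
<para>The wrap and unwrap pattern used by ScreenInit() and CloseScreen()
can be sketched as follows.  The structures and Sk-prefixed names are
hypothetical, and a plain pointer stands in for the screen private
area; a real driver would use the server's private mechanism instead.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SkScreen SkScreen;
typedef bool (*SkCloseScreenProc)(SkScreen *pScreen);

/* Hypothetical driver private data: the saved (wrapped) function
 * pointer plus a flag modelling the output device state. */
typedef struct {
    SkCloseScreenProc CloseScreen;
    bool              hw_initialized;
} SkDriverPriv;

struct SkScreen {
    SkCloseScreenProc CloseScreen;
    SkDriverPriv     *priv;         /* stand-in for a screen private */
    int               closeCalls;   /* calls reaching the lower layer */
};

/* Stand-in for the lower layer's CloseScreen(). */
bool
sk_lower_close_screen(SkScreen *pScreen)
{
    pScreen->closeCalls++;
    return true;
}

/* Driver CloseScreen(): restore the device state, unwrap, and chain
 * down to the function that was wrapped. */
bool
sk_driver_close_screen(SkScreen *pScreen)
{
    SkDriverPriv *priv = pScreen->priv;

    assert(priv != NULL);
    priv->hw_initialized = false;
    pScreen->CloseScreen = priv->CloseScreen;
    return pScreen->CloseScreen(pScreen);
}

/* Driver ScreenInit(): let the lower layer install its defaults, then
 * save the old pointer and install the wrapper. */
void
sk_driver_screen_init(SkScreen *pScreen, SkDriverPriv *priv)
{
    assert(pScreen != NULL && priv != NULL);

    pScreen->CloseScreen = sk_lower_close_screen;
    pScreen->closeCalls = 0;

    priv->hw_initialized = true;
    priv->CloseScreen = pScreen->CloseScreen;
    pScreen->priv = priv;
    pScreen->CloseScreen = sk_driver_close_screen;
}
```
]]></programlisting>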
1238</sect3>
1239
1240<sect3>
1241<title>GC operations</title>
1242
1243<para>When the X server is requested to render drawing primitives, it does
1244so by calling drawing functions through the graphics context's operation
1245function pointer table (i.e., the GCOps functions).  These functions
1246render the basic graphics operations such as drawing rectangles, lines,
1247text or copying pixmaps.  Default routines are provided either by the MI
1248layer, which draws indirectly through a simple span interface, or by the
1249framebuffer layers (e.g., CFB, MFB, FB), which draw directly to a
1250linearly mapped frame buffer.
1251</para>
1252
1253<para>To take advantage of special hardware on the graphics device,
1254specific GCOps functions can be replaced by device specific code.
1255However, many times the graphics devices can handle only a subset of the
1256possible states of the GC, so during graphics context validation,
1257appropriate routines are selected based on the state and capabilities of
1258the hardware.  For example, some graphics hardware can accelerate single
1259pixel width lines with certain dash patterns.  Thus, for dash patterns
1260that are not supported by hardware or for width 2 or greater lines, the
1261default routine is chosen during GC validation.
1262</para>
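
<para>The selection made during GC validation can be sketched as follows.
The example assumes hypothetical hardware that accelerates only lines of
width 0 or 1 whose dash pattern it supports.  The structures and
Sk-prefixed names are invented for the illustration and do not match the
server's GC definitions.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SkGC SkGC;
typedef const char *(*SkPolylinesProc)(const SkGC *gc);

/* Reduced operations table with a single entry. */
typedef struct {
    SkPolylinesProc Polylines;
} SkGCOps;

struct SkGC {
    unsigned int   lineWidth;
    bool           dashPatternSupported;  /* true for solid lines too */
    const SkGCOps *ops;
};

/* Stand-ins for the default and the device specific line routines. */
const char *
sk_default_polylines(const SkGC *gc)
{
    (void)gc;
    return "default";
}

const char *
sk_accel_polylines(const SkGC *gc)
{
    (void)gc;
    return "accelerated";
}

static const SkGCOps sk_default_ops = { sk_default_polylines };
static const SkGCOps sk_accel_ops = { sk_accel_polylines };

/* Choose the operations table that matches the current GC state. */
void
sk_validate_gc(SkGC *gc)
{
    assert(gc != NULL);

    if (gc->lineWidth <= 1 && gc->dashPatternSupported)
        gc->ops = &sk_accel_ops;
    else
        gc->ops = &sk_default_ops;
}
```
]]></programlisting>

<para>Rendering code then always calls through gc-&gt;ops and does not
need to know which routine was selected.
</para>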
1263
1264<para>Note that some pointers to functions that draw to the screen are
1265stored in the Screen structure.  They include GetImage(), GetSpans(),
1266CopyWindow() and RestoreAreas().
1267</para>
1268</sect3>
1269
1270<sect3>
1271<title>Xnest</title>
1272
1273<para>The Xnest X server is a special proxy X server that relays the X
protocol requests that it receives to a <quote>real</quote> X server that then
1275processes the requests and displays the results, if applicable.  To the X
1276applications, Xnest appears as if it is a regular X server.  However,
1277Xnest is both server to the X application and client of the real X
1278server, which will actually handle the requests.
1279</para>
1280
1281<para>The Xnest server implements all of the standard input and output
1282initialization steps outlined above.
1283</para>
1284
1285<para><variablelist>
1286<varlistentry>
1287<term>InitOutput()</term>
1288<listitem>
1289<para>Xnest takes its configuration information from
command line arguments via ddxProcessArgument().  This information
includes the real X server display to connect to, its default visual
class, the screen depth, the Xnest window's geometry, etc.  Xnest then
connects to the real X server, gathers visual, colormap, depth and
pixmap information about that server's display, and creates a window on
that server, which will be used as the root window for Xnest.
1296</para>
1297
1298<para>Next, Xnest initializes its internal data structures and uses the
1299data from the real X server's pixmaps to initialize its own pixmap
1300formats.  Finally, it calls AddScreen(xnestOpenScreen, argc, argv) to
1301initialize each of its screens.
1302</para></listitem></varlistentry>
1303
1304<varlistentry>
1305<term>ScreenInit()</term>
1306<listitem>
1307<para>Xnest's ScreenInit() function is called
1308xnestOpenScreen().  This function initializes its screen's depth and
1309visual information, and then calls miScreenInit() to set up the default
1310screen functions.  It then calls miDCInitialize() to initialize the
1311software cursor.
1312Finally, it replaces many of the screen functions with its own
1313functions that repackage and send the requests to the real X server to
1314which Xnest is attached.
1315</para></listitem></varlistentry>
1316
1317<varlistentry>
1318<term>CloseScreen()</term>
1319<listitem>
1320<para>This function frees its internal data structure
1321allocations.  Since it replaces instead of wrapping screen functions,
1322there are no function pointers to unwrap.  This can potentially lead to
1323problems during server regeneration.
1324</para></listitem></varlistentry>
1325
1326<varlistentry>
1327<term>GC operations</term>
1328<listitem>
1329<para>The GC operations in Xnest are very simple since
1330they leave all of the drawing to the real X server to which Xnest is
1331attached.  Each of the GCOps takes the request and sends it to the
1332real X server using standard Xlib calls.  For example, the X
application issues an XDrawLines() call.  This function turns into a
1334protocol request to Xnest, which calls the xnestPolylines() function
1335through Xnest's GCOps function pointer table.  The xnestPolylines()
1336function is only a single line, which calls XDrawLines() using the same
1337arguments that were passed into it.  Other GCOps functions are very
1338similar.  Two exceptions to the simple GCOps functions described above
1339are the image functions and the BLT operations.
1340</para>
1341
1342<para>The image functions, GetImage() and PutImage(), must use a temporary
image to hold the image to be put or the image that was just grabbed
1344from the screen while it is in transit to the real X server or the
1345client.  When the image has been transmitted, the temporary image is
1346destroyed.
1347</para>
1348
1349<para>The BLT operations, CopyArea() and CopyPlane(), handle not only the
1350copy function, which is the same as the simple cases described above,
1351but also the graphics exposures that result when the GC's graphics
1352exposure bit is set to True.  Graphics exposures are handled in a helper
1353function, xnestBitBlitHelper().  This function collects the exposure
events from the real X server and, if any result in regions being
exposed, those regions are passed back to the MI layer so that it
1356can generate exposure events for the X application.
1357</para></listitem></varlistentry>
1358</variablelist>
1359</para>
1360
1361<para>The Xnest server takes its input from the X server to which it is
1362connected.  When the mouse is in the Xnest server's window, keyboard and
1363mouse events are received by the Xnest server, repackaged and sent back
1364to any client that requests those events.
1365</para>
1366</sect3>
1367
1368<sect3>
1369<title>Shadow framebuffer</title>
1370
<para>The most common type of framebuffer is a linear array of memory that
1372maps to the video memory on the graphics device.  However, accessing
1373that video memory over an I/O bus (e.g., ISA or PCI) can be slow.  The
1374shadow framebuffer layer allows the developer to keep the entire
1375framebuffer in main memory and copy it back to video memory at regular
1376intervals.  It also has been extended to handle planar video memory and
1377rotated framebuffers.
1378</para>
1379
1380<para>There are two main entry points to the shadow framebuffer code:
1381
1382<variablelist>
1383<varlistentry>
1384<term>shadowAlloc(width, height, bpp)</term>
1385<listitem>
1386<para>This function allocates the in
1387memory copy of the framebuffer of size width*height*bpp.  It returns a
1388pointer to that memory, which will be used by the framebuffer
1389ScreenInit() code during the screen's initialization.
1390</para></listitem></varlistentry>
1391
1392<varlistentry>
1393<term>shadowInit(pScreen, updateProc, windowProc)</term>
1394<listitem>
1395<para>This function
1396initializes the shadow framebuffer layer.  It wraps several screen
1397drawing functions, and registers a block handler that will update the
1398screen.  The updateProc is a function that will copy the damaged regions
1399to the screen, and the windowProc is a function that is used when the
1400entire linear video memory range cannot be accessed simultaneously so
1401that only a window into that memory is available (e.g., when using the
1402VGA aperture).
1403</para></listitem></varlistentry>
1404</variablelist>
1405</para>
1406
1407<para>The shadow framebuffer code keeps track of the damaged area of each
1408screen by calculating the bounding box of all drawing operations that
1409have occurred since the last screen update.  Then, when the block handler
1410is next called, only the damaged portion of the screen is updated.
1411</para>
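
<para>The bounding box bookkeeping described above can be sketched as
follows.  This is an illustration only: the structures and Sk-prefixed
names are hypothetical, and the update callback takes a simplified
argument list rather than the real updateProc signature.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Rectangle with exclusive lower-right corner (x2, y2). */
typedef struct {
    int x1, y1, x2, y2;
} SkBox;

typedef struct {
    bool  dirty;        /* true when damage is pending */
    SkBox damage;       /* bounding box of all pending damage */
} SkShadow;

typedef void (*SkUpdateProc)(const SkBox *damage, void *closure);

/* Called by the wrapped drawing functions: grow the damage bounding
 * box so that it also covers the area just drawn. */
void
sk_shadow_add_damage(SkShadow *s, SkBox box)
{
    assert(s != NULL);

    if (box.x2 <= box.x1 || box.y2 <= box.y1)
        return;                     /* empty box, nothing drawn */

    if (!s->dirty) {
        s->damage = box;
        s->dirty = true;
        return;
    }
    if (box.x1 < s->damage.x1) s->damage.x1 = box.x1;
    if (box.y1 < s->damage.y1) s->damage.y1 = box.y1;
    if (box.x2 > s->damage.x2) s->damage.x2 = box.x2;
    if (box.y2 > s->damage.y2) s->damage.y2 = box.y2;
}

/* Called from the block handler: copy the damaged area to the screen
 * and reset the damage.  Returns true if an update was issued. */
bool
sk_shadow_block_handler(SkShadow *s, SkUpdateProc update, void *closure)
{
    assert(s != NULL && update != NULL);

    if (!s->dirty)
        return false;
    update(&s->damage, closure);
    s->dirty = false;
    return true;
}

/* Example update function: counts the pixels that would be copied. */
void
sk_count_pixels(const SkBox *damage, void *closure)
{
    long *pixels = closure;

    *pixels += (long)(damage->x2 - damage->x1) *
               (long)(damage->y2 - damage->y1);
}
```
]]></programlisting>

<para>A single bounding box is cheap to maintain, but two small drawing
operations in opposite corners of the screen cause nearly the whole
screen to be copied.
</para>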
1412
1413<para>Note that since the shadow framebuffer is kept in main memory, all
1414drawing operations are performed by the CPU and, thus, no accelerated
1415hardware drawing operations are possible.
1416</para>
1417
1418</sect3>
1419</sect2>
1420
1421<sect2>
1422<title>Xinerama</title>
1423
1424<para>Xinerama is an X extension that allows multiple physical screens
1425controlled by a single X server to appear as a single screen.  Although
1426the extension allows clients to find the physical screen layout via
1427extension requests, it is completely transparent to clients at the core
1428X11 protocol level.  The original public implementation of Xinerama came
1429from Digital/Compaq.  XFree86 rewrote it, filling in some missing pieces
1430and improving both X11 core protocol compliance and performance.  The
1431Xinerama extension will be passing through X.Org's standardization
1432process in the near future, and the sample implementation will be based
1433on this rewritten version.
1434</para>
1435
1436<para>The current implementation of Xinerama is based primarily in the DIX
1437(device independent) and MI (machine independent) layers of the X
1438server.  With few exceptions the DDX layers do not need any changes to
1439support Xinerama.  X server extensions often do need modifications to
1440provide full Xinerama functionality.
1441</para>
1442
1443<para>The following is a code-level description of how Xinerama functions.
1444</para>
1445
1446<para>Note: Because the Xinerama extension was originally called the
1447PanoramiX extension, many of the Xinerama functions still have the
1448PanoramiX prefix.
1449</para>
1450
1451<variablelist>
1452<varlistentry>
1453<term>PanoramiXExtensionInit()</term>
1454<listitem>
1455    <para>PanoramiXExtensionInit() is a
1456    device-independent extension function that is called at the start of
1457    each server generation from InitExtensions(), which is called from
1458    the X server's main() function after all output devices have been
1459    initialized, but before any input devices have been initialized.
1460    </para>
1461
1462    <para>PanoramiXNumScreens is set to the number of physical screens.  If
1463    only one physical screen is present, the extension is disabled, and
1464    PanoramiXExtensionInit() returns without doing anything else.
1465    </para>
1466
1467    <para>The Xinerama extension is registered by calling AddExtension().
1468    </para>
1469
1470    <para>GC and Screen private
1471    indexes are allocated, and both GC and Screen private areas are
1472    allocated for each physical screen.  These hold Xinerama-specific
1473    per-GC and per-Screen data.  Each screen's CreateGC and CloseScreen
1474    functions are wrapped by XineramaCreateGC() and
1475    XineramaCloseScreen() respectively.  Some new resource classes are
1476    created for Xinerama drawables and GCs, and resource types for
1477    Xinerama windows, pixmaps and colormaps.
1478    </para>
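    <para>The wrapping of a screen function described above can be
    pictured with the following minimal sketch.  It is not taken from
    the X server sources: the ToyScreen and ToyScreenPriv types and all
    function names are hypothetical stand-ins, and only the general
    save, replace and call-down pattern is illustrated.
    </para>
<programlisting><![CDATA[
```c
#include <assert.h>

typedef struct ToyScreen ToyScreen;
typedef int (*CloseScreenProc)(ToyScreen *);

/* Hypothetical per-screen private area holding the saved pointer. */
typedef struct {
    CloseScreenProc CloseScreen;  /* original function, saved when wrapping */
    int cleaned_up;               /* set once the wrapper has run */
} ToyScreenPriv;

struct ToyScreen {
    CloseScreenProc CloseScreen;
    ToyScreenPriv *priv;
};

/* Stand-in for the DDX layer's own CloseScreen function. */
int ddx_close_screen(ToyScreen *screen)
{
    (void)screen;
    return 1;   /* plays the role of TRUE */
}

/* Wrapper: restore the saved pointer, do the extension's cleanup,
 * then call down to the original function. */
int wrapper_close_screen(ToyScreen *screen)
{
    ToyScreenPriv *priv = screen->priv;
    screen->CloseScreen = priv->CloseScreen;
    priv->cleaned_up = 1;
    return screen->CloseScreen(screen);
}

/* Install the wrapper, saving the original pointer in the private area. */
void wrap_close_screen(ToyScreen *screen, ToyScreenPriv *priv)
{
    assert(screen != 0 && priv != 0);
    priv->CloseScreen = screen->CloseScreen;
    priv->cleaned_up = 0;
    screen->priv = priv;
    screen->CloseScreen = wrapper_close_screen;
}
```
]]></programlisting>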
1479
1480    <para>A region (PanoramiXScreenRegion) is
1481    initialized to be the union of the screen regions.
1482    The relative positioning information for the
1483    physical screens is taken from the ScreenRec x and y members, which
1484    the DDX layer must initialize in InitOutput().  The bounds of the
    combined screen are also calculated (PanoramiXPixWidth and
1486    PanoramiXPixHeight).
1487    </para>
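    <para>As a rough illustration of the bounds calculation, the sketch
    below derives a combined width and height from per-screen origins
    and sizes.  The ScreenGeom type and the combined_bounds() function
    are hypothetical; the sketch assumes that screen origins are
    non-negative offsets from the composite origin.
    </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few ScreenRec members used here. */
typedef struct {
    int x, y;            /* origin within the combined screen */
    int width, height;   /* size in pixels */
} ScreenGeom;

/* Compute the width and height of the bounding box of all screens,
 * measured from the composite origin (0, 0). */
void combined_bounds(const ScreenGeom *s, size_t n,
                     int *pix_width, int *pix_height)
{
    assert(s != NULL && n > 0);
    *pix_width = 0;
    *pix_height = 0;
    for (size_t i = 0; i < n; i++) {
        int right = s[i].x + s[i].width;
        int bottom = s[i].y + s[i].height;
        if (right > *pix_width)
            *pix_width = right;
        if (bottom > *pix_height)
            *pix_height = bottom;
    }
}
```
]]></programlisting>
    <para>For example, a 1280x1024 screen at (0,0) next to a 1024x768
    screen at (1280,0) gives a combined size of 2304x1024 in this
    sketch.
    </para>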
1488
1489    <para>The DIX layer has a list of function pointers
1490    (ProcVector&lsqb;&rsqb;) that
1491    holds the entry points for the functions that process core protocol
1492    requests.  The requests that Xinerama must intercept and break up
1493    into physical screen-specific requests are wrapped.  The original
1494    set is copied to SavedProcVector&lsqb;&rsqb;.  The types of requests
1495    intercepted are Window requests, GC requests, colormap requests,
1496    drawing requests, and some geometry-related requests.  This wrapping
1497    allows the bulk of the protocol request processing to be handled
1498    transparently to the DIX layer.  Some operations cannot be dealt with
1499    in this way and are handled with Xinerama-specific code within the
1500    DIX layer.
1501    </para>
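    <para>The save-and-replace step for the request table can be
    sketched as follows.  This is a toy model under stated assumptions:
    the table has only four entries, the handlers take a single integer,
    and the handler names, REQ_NOOP, REQ_DRAW and wrap_requests() are
    hypothetical.  Only the names ProcVector and SavedProcVector come
    from the text above.
    </para>
<programlisting><![CDATA[
```c
#include <assert.h>

enum { REQ_NOOP = 0, REQ_DRAW = 1, NUM_REQUESTS = 4 };  /* toy table size */
enum { NUM_SCREENS = 2 };                               /* toy screen count */

typedef int (*RequestProc)(int arg);

/* Toy stand-ins for the standard core request handlers. */
int ProcNoop(int arg) { return arg; }
int ProcDraw(int arg) { return arg + 1; }

RequestProc ProcVector[NUM_REQUESTS] = {
    ProcNoop, ProcDraw, ProcNoop, ProcNoop
};
RequestProc SavedProcVector[NUM_REQUESTS];

/* Wrapper: run the saved handler once per physical screen. */
int WrappedProcDraw(int arg)
{
    int total = 0;
    for (int s = 0; s < NUM_SCREENS; s++)
        total += SavedProcVector[REQ_DRAW](arg);
    return total;
}

/* Copy the original table, then replace only the intercepted entries. */
void wrap_requests(void)
{
    for (int i = 0; i < NUM_REQUESTS; i++)
        SavedProcVector[i] = ProcVector[i];
    assert(SavedProcVector[REQ_DRAW] != 0);
    ProcVector[REQ_DRAW] = WrappedProcDraw;
}
```
]]></programlisting>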
1502</listitem></varlistentry>
1503
1504<varlistentry>
1505<term>PanoramiXConsolidate()</term>
1506<listitem>
1507    <para>PanoramiXConsolidate() is a
1508    device-independent extension function that is called directly from
1509    the X server's main() function after extensions and input/output
1510    devices have been initialized, and before the root windows are
1511    defined and initialized.
1512</para>
1513
1514    <para>This function finds the set of depths (PanoramiXDepths&lsqb;&rsqb;) and
1515    visuals (PanoramiXVisuals&lsqb;&rsqb;)
1516    common to all of the physical screens.
1517    PanoramiXNumDepths is set to the number of common depths, and
1518    PanoramiXNumVisuals is set to the number of common visuals.
1519    Resources are created for the single root window and the default
1520    colormap.  Each of these resources has per-physical screen entries.
1521    </para>
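    <para>Finding the common depths amounts to a set intersection.  The
    sketch below is one simple way to compute it and is not the
    server's actual code; the ScreenDepths type, MAX_DEPTHS, and both
    function names are hypothetical.
    </para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTHS 8   /* hypothetical limit for this sketch */

typedef struct {
    int depths[MAX_DEPTHS];
    size_t num_depths;
} ScreenDepths;

int screen_has_depth(const ScreenDepths *s, int depth)
{
    for (size_t i = 0; i < s->num_depths; i++)
        if (s->depths[i] == depth)
            return 1;
    return 0;
}

/* Store in `common` the depths of screen 0 that every other screen
 * also supports, and return how many were found. */
size_t common_depths(const ScreenDepths *screens, size_t n, int *common)
{
    size_t count = 0;
    assert(screens != NULL && n > 0);
    for (size_t i = 0; i < screens[0].num_depths; i++) {
        int d = screens[0].depths[i];
        int shared = 1;
        for (size_t j = 1; j < n && shared; j++)
            shared = screen_has_depth(&screens[j], d);
        if (shared)
            common[count++] = d;
    }
    return count;
}
```
]]></programlisting>
    <para>A result of zero from such a computation corresponds to the
    case in which Xinerama mode is not possible.
    </para>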
1522</listitem></varlistentry>
1523
1524<varlistentry>
1525<term>PanoramiXCreateConnectionBlock()</term>
1526<listitem>
    <para>PanoramiXCreateConnectionBlock() is a
1528    device-independent extension function that is called directly from
1529    the X server's main() function after the per-physical screen root
1530    windows are created.  It is called instead of the standard DIX
1531    CreateConnectionBlock() function.  If this function returns FALSE,
1532    the X server exits with a fatal error.  This function will return
1533    FALSE if no common depths were found in PanoramiXConsolidate().
1534    With no common depths, Xinerama mode is not possible.
1535    </para>
1536
1537    <para>The connection block holds the information that clients get when
1538    they open a connection to the X server.  It includes information
1539    such as the supported pixmap formats, number of screens and the
    sizes, depths, visuals, default colormap information, etc., for each
    of the screens (much of the information that <command>xdpyinfo</command> shows).  The
1542    connection block is initialized with the combined single screen
1543    values that were calculated in the above two functions.
1544    </para>
1545
1546    <para>The Xinerama extension allows the registration of connection
1547    block callback functions.  The purpose of these is to allow other
1548    extensions to do processing at this point.  These callbacks can be
1549    registered by calling XineramaRegisterConnectionBlockCallback() from
1550    the other extension's ExtensionInit() function.  Each registered
1551    connection block callback is called at the end of
1552    PanoramiXCreateConnectionBlock().
1553    </para>
1554</listitem></varlistentry>
1555</variablelist>
1556
1557<sect3>
1558<title>Xinerama-specific changes to the DIX code</title>
1559
1560<para>There are a few types of Xinerama-specific changes within the DIX
1561code.  The main ones are described here.
1562</para>
1563
<para>Functions that deal with colormap- or GC-related operations outside of
the intercepted protocol requests have a test added to skip the
processing for screen numbers &gt; 0.  This is because these operations
are handled for the single Xinerama screen, so the processing is done
only once, for screen 0.
1568</para>
1569
1570<para>The handling of motion events does some coordinate translation between
1571the physical screen's origin and screen zero's origin.  Also, motion
1572events must be reported relative to the composite screen origin rather
1573than the physical screen origins.
1574</para>
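<para>The coordinate translation can be illustrated with the following
minimal sketch.  It assumes that each physical screen's origin is known
in composite-screen coordinates; the Point type and both function names
are hypothetical and do not appear in the DIX code.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>

typedef struct { int x, y; } Point;

/* Physical-screen coordinates to composite-screen coordinates. */
Point to_composite(Point physical, Point screen_origin)
{
    Point p;
    p.x = physical.x + screen_origin.x;
    p.y = physical.y + screen_origin.y;
    return p;
}

/* Composite coordinates back to one screen's own coordinates.  The
 * point must lie on that screen, whose size is width x height. */
Point to_physical(Point composite, Point screen_origin,
                  int width, int height)
{
    Point p;
    p.x = composite.x - screen_origin.x;
    p.y = composite.y - screen_origin.y;
    assert(p.x >= 0 && p.x < width && p.y >= 0 && p.y < height);
    return p;
}
```
]]></programlisting>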
1575
1576<para>There is some special handling for cursor, window and event processing
1577that cannot (either not at all or not conveniently) be done via the
1578intercepted protocol requests.  A particular case is the handling of
1579pointers moving between physical screens.
1580</para>
1581</sect3>
1582
1583<sect3>
1584<title>Xinerama-specific changes to the MI code</title>
1585
1586<para>The only Xinerama-specific change to the MI code is in miSendExposures()
1587to handle the coordinate (and window ID) translation for expose events.
1588</para>
1589</sect3>
1590
1591<sect3>
1592<title>Intercepted DIX core requests</title>
1593
1594<para>Xinerama breaks up drawing requests for dispatch to each physical
1595screen.  It also breaks up windows into pieces for each physical screen.
1596GCs are translated into per-screen GCs.  Colormaps are replicated on
1597each physical screen.  The functions handling the intercepted requests
1598take care of breaking the requests and repackaging them so that they can
1599be passed to the standard request handling functions for each screen in
turn.  In addition, to aid the repackaging, information from many of
the intercepted requests is used to keep the necessary state information
for the single composite screen up to date.  Requests
1603(usually those with replies) that can be satisfied completely from this
1604stored state information do not call the standard request handling
1605functions.
1606</para>
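<para>The repackaging loop described above might look roughly like the
following sketch.  It is a simplified model, not the Xinerama source:
it assumes a fixed number of screens, a request that carries only a
drawable ID and a position, and hypothetical names throughout
(CompositeRes, PointRequest, standard_handler() and
intercepted_handler()).
</para>
<programlisting><![CDATA[
```c
#include <assert.h>

#define NUM_SCREENS 2        /* hypothetical fixed screen count */

typedef struct { int x, y; } Origin;

/* One composite resource maps to one ID per physical screen. */
typedef struct { unsigned long id[NUM_SCREENS]; } CompositeRes;

typedef struct {
    unsigned long drawable;
    int x, y;
} PointRequest;

/* Records what each screen's standard handler received. */
PointRequest dispatched[NUM_SCREENS];

/* Stand-in for the standard (saved) request handler. */
int standard_handler(int screen, const PointRequest *req)
{
    dispatched[screen] = *req;
    return 0;   /* 0 plays the role of Success */
}

/* Intercepting handler: substitute the per-screen ID, shift the
 * coordinates by that screen's origin, and call the standard handler
 * for each screen in turn. */
int intercepted_handler(const CompositeRes *win, const Origin *origins,
                        int x, int y)
{
    assert(win != 0 && origins != 0);
    for (int s = 0; s < NUM_SCREENS; s++) {
        PointRequest req;
        req.drawable = win->id[s];
        req.x = x - origins[s].x;
        req.y = y - origins[s].y;
        int status = standard_handler(s, &req);
        if (status != 0)
            return status;
    }
    return 0;
}
```
]]></programlisting>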
1607
1608</sect3>
1609
1610</sect2>
1611
1612</sect1>
1613
1614<!-- ============================================================ -->
1615
1616<sect1>
1617<title>Development Results</title>
1618
1619<para>In this section the results of each phase of development are
1620discussed.  This development took place between approximately June 2001
1621and July 2003.
1622</para>
1623
1624<sect2>
1625<title>Phase I</title>
1626
1627<para>The initial development phase dealt with the basic implementation
1628including the bootstrap code, which used the shadow framebuffer, and the
1629unoptimized implementation, based on an Xnest-style implementation.
1630</para>
1631
1632<sect3>
1633<title>Scope</title>
1634
1635<para>The goal of Phase I is to provide fundamental functionality that can
1636act as a foundation for ongoing work:
1637<orderedlist>
1638<listitem>
1639    <para>Develop the proxy X server
1640    <itemizedlist>
1641	<listitem>
1642	<para>The proxy X server will operate on the X11 protocol and
1643	relay requests as necessary to correctly perform the request.
1644	</para></listitem>
1645	<listitem>
1646	<para>Work will be based on the existing work for Xinerama and
1647	Xnest.
1648	</para></listitem>
1649	<listitem>
1650	<para>Input events and windowing operations are handled in the
1651	proxy server and rendering requests are repackaged and sent to
1652	each of the back-end servers for display.
1653	</para></listitem>
1654	<listitem>
1655	<para>The multiple screen layout (including support for
1656	overlapping screens) will be user configurable via a
1657	configuration file or through the configuration tool.
1658	</para></listitem>
1659    </itemizedlist>
1660    </para></listitem>
1661    <listitem>
1662    <para>Develop graphical configuration tool
1663    <itemizedlist>
1664	<listitem>
	<para>There will potentially be a large number of X servers to
1666	configure into a single display.  The tool will allow the user
1667	to specify which servers are involved in the configuration and
1668	how they should be laid out.
1669	</para></listitem>
1670    </itemizedlist>
1671    </para></listitem>
1672    <listitem>
1673    <para>Pass the X Test Suite
1674    <itemizedlist>
1675	<listitem>
1676	<para>The X Test Suite covers the basic X11 operations.  All
1677	tests known to succeed must correctly operate in the distributed
1678	X environment.
1679	</para></listitem>
1680    </itemizedlist>
1681    </para></listitem>
1682</orderedlist>
1683
1684</para>
1685
1686<para>For this phase, the back-end X servers are assumed to be unmodified X
1687servers that do not support any DMX-related protocol extensions; future
1688optimization pathways are considered, but are not implemented; and the
1689configuration tool is assumed to rely only on libraries in the X source
1690tree (e.g., Xt).
1691</para>
1692</sect3>
1693
1694<sect3>
1695<title>Results</title>
1696
1697<para>The proxy X server, Xdmx, was developed to distribute X11 protocol
1698requests to the set of back-end X servers.  It opens a window on each
1699back-end server, which represents the part of the front-end's root
1700window that is visible on that screen.  It mirrors window, pixmap and
1701other state in each back-end server.  Drawing requests are sent to
1702either windows or pixmaps on each back-end server.  This code is based
1703on Xnest and uses the existing Xinerama extension.
1704</para>
1705
1706<para>Input events can be taken from (1) devices attached to the back-end
1707server, (2) core devices attached directly to the Xdmx server, or (3)
1708from a ``console'' window on another X server.  Events for these devices
1709are gathered, processed and delivered to clients attached to the Xdmx
1710server.
1711</para>
1712
1713<para>An intuitive configuration format was developed to help the user
1714easily configure the multiple back-end X servers.  It was defined (see
1715grammar in Xdmx man page) and a parser was implemented that is used by
1716the Xdmx server and by a standalone xdmxconfig utility.  The parsing
1717support was implemented such that it can be easily factored out of the X
1718source tree for use with other tools (e.g., vdl).  Support for
1719converting legacy vdl-format configuration files to the DMX format is
1720provided by the vdltodmx utility.
1721</para>
1722
1723<para>Originally, the configuration file was going to be a subsection of
1724XFree86's XF86Config file, but that was not possible since Xdmx is a
1725completely separate X server.  Thus, a separate config file format was
1726developed.  In addition, a graphical configuration
1727tool, xdmxconfig, was developed to allow the user to create and arrange
1728the screens in the configuration file.  The <emphasis remap="bf">-configfile</emphasis> and <emphasis remap="bf">-config</emphasis>
1729command-line options can be used to start Xdmx using a configuration
1730file.
1731</para>
1732
1733<para>An extension that enables remote input testing is required for the X
1734Test Suite to function.  During this phase, this extension (XTEST) was
1735implemented in the Xdmx server.  The results from running the X Test
1736Suite are described in detail below.
1737</para>
1738</sect3>
1739
1740<sect3>
1741<title>X Test Suite</title>
1742
1743        <sect4>
1744          <title>Introduction</title>
1745            <para>
1746              The X Test Suite contains tests that verify Xlib functions
1747              operate correctly.  The test suite is designed to run on a
1748              single X server; however, since X applications will not be
1749              able to tell the difference between the DMX server and a
1750              standard X server, the X Test Suite should also run on the
1751              DMX server.
1752            </para>
1753            <para>
1754              The Xdmx server was tested with the X Test Suite, and the
1755              existing failures are noted in this section.  To put these
1756              results in perspective, we first discuss expected X Test
1757              failures and how errors in underlying systems can impact
1758              Xdmx test results.
1759            </para>
1760        </sect4>
1761
1762        <sect4>
1763          <title>Expected Failures for a Single Head</title>
1764            <para>
1765              A correctly implemented X server with a single screen is
1766              expected to fail certain X Test tests.  The following
1767              well-known errors occur because of rounding error in the X
1768              server code:
1769              <literallayout>
1770XDrawArc: Tests 42, 63, 66, 73
1771XDrawArcs: Tests 45, 66, 69, 76
1772              </literallayout>
1773            </para>
1774            <para>
1775              The following failures occur because of the high-level X
1776              server implementation:
1777              <literallayout>
1778XLoadQueryFont: Test 1
1779XListFontsWithInfo: Tests 3, 4
1780XQueryFont: Tests 1, 2
1781              </literallayout>
1782            </para>
1783            <para>
1784              The following test fails when running the X server as root
1785              under Linux because of the way directory modes are
1786              interpreted:
1787              <literallayout>
1788XWriteBitmapFile: Test 3
1789              </literallayout>
1790            </para>
1791            <para>
1792              Depending on the video card used for the back-end, other
1793              failures may also occur because of bugs in the low-level
1794              driver implementation.  Over time, failures of this kind
1795              are usually fixed by XFree86, but will show up in Xdmx
1796              testing until then.
1797            </para>
1798        </sect4>
1799
1800        <sect4>
1801          <title>Expected Failures for Xinerama</title>
1802            <para>
1803              Xinerama fails several X Test Suite tests because of
1804              design decisions made for the current implementation of
1805              Xinerama.  Over time, many of these errors will be
1806              corrected by XFree86 and the group working on a new
1807              Xinerama implementation.  Therefore, Xdmx will also share
              X Test Suite failures with Xinerama.
1809            </para>
1810
1811            <para>
1812              We may be able to fix or work-around some of these
1813              failures at the Xdmx level, but this will require
1814              additional exploration that was not part of Phase I.
1815            </para>
1816
1817            <para>
1818              Xinerama is constantly improving, and the list of
1819              Xinerama-related failures depends on XFree86 version and
1820              the underlying graphics hardware.  We tested with a
1821              variety of hardware, including nVidia, S3, ATI Radeon,
1822              and Matrox G400 (in dual-head mode).  The list below
1823              includes only those failures that appear to be from the
1824              Xinerama layer, and does not include failures listed in
1825              the previous section, or failures that appear to be from
1826              the low-level graphics driver itself:
1827            </para>
1828
1829            <para>
1830              These failures were noted with multiple Xinerama
1831              configurations:
1832              <literallayout>
1833XCopyPlane: Tests 13, 22, 31 (well-known Xinerama implementation issue)
1834XSetFontPath: Test 4
1835XGetDefault: Test 5
1836XMatchVisualInfo: Test 1
1837              </literallayout>
1838            </para>
1839            <para>
1840              These failures were noted only when using one dual-head
1841              video card with a 4.2.99.x XFree86 server:
1842              <literallayout>
1843XListPixmapFormats: Test 1
1844XDrawRectangles: Test 45
1845              </literallayout>
1846            </para>
1847            <para>
1848              These failures were noted only when using two video cards
1849              from different vendors with a 4.1.99.x XFree86 server:
1850              <literallayout>
1851XChangeWindowAttributes: Test 32
1852XCreateWindow: Test 30
1853XDrawLine: Test 22
1854XFillArc: Test 22
1855XChangeKeyboardControl: Tests 9, 10
1856XRebindKeysym: Test 1
1857              </literallayout>
1858            </para>
1859        </sect4>
1860
1861        <sect4>
1862	  <title>Additional Failures from Xdmx</title>
1863
1864            <para>
1865              When running Xdmx, no unexpected failures were noted.
1866              Since the Xdmx server is based on Xinerama, we expect to
1867              have most of the Xinerama failures present in the Xdmx
1868              server.  Similarly, since the Xdmx server must rely on the
1869              low-level device drivers on each back-end server, we also
1870              expect that Xdmx will exhibit most of the back-end
1871              failures.  Here is a summary:
1872              <literallayout>
1873XListPixmapFormats: Test 1 (configuration dependent)
1874XChangeWindowAttributes: Test 32
1875XCreateWindow: Test 30
XCopyPlane: Tests 13, 22, 31
1877XSetFontPath: Test 4
1878XGetDefault: Test 5 (configuration dependent)
1879XMatchVisualInfo: Test 1
1880XRebindKeysym: Test 1 (configuration dependent)
1881                </literallayout>
1882            </para>
1883            <para>
1884              Note that this list is shorter than the combined list for
1885              Xinerama because Xdmx uses different code paths to perform
1886              some Xinerama operations.  Further, some Xinerama failures
1887              have been fixed in the XFree86 4.2.99.x CVS repository.
1888            </para>
1889        </sect4>
1890
1891        <sect4>
1892          <title>Summary and Future Work</title>
1893
1894            <para>
1895              Running the X Test Suite on Xdmx does not produce any
1896              failures that cannot be accounted for by the underlying
1897              Xinerama subsystem used by the front-end or by the
1898              low-level device-driver code running on the back-end X
1899              servers.  The Xdmx server therefore is as ``correct'' as
1900              possible with respect to the standard set of X Test Suite
1901              tests.
1902            </para>
1903
1904            <para>
1905              During the following phases, we will continue to verify
1906              Xdmx correctness using the X Test Suite.  We may also use
              other test suites or write additional tests that run
1908              under the X Test Suite that specifically verify the
1909              expected behavior of DMX.
1910            </para>
1911        </sect4>
1912</sect3>
1913
1914<sect3>
1915<title>Fonts</title>
1916
1917<para>In Phase I, fonts are handled directly by both the front-end and the
1918back-end servers, which is required since we must treat each back-end
1919server during this phase as a ``black box''.  What this requires is that
1920<emphasis remap="bf">the front- and back-end servers must share the exact same font
1921path</emphasis>.  There are two ways to help make sure that all servers share the
1922same font path:
1923
1924<orderedlist>
1925  <listitem>
1926    <para>First, each server can be configured to use the same font
1927    server.  The font server, xfs, can be configured to serve fonts to
1928    multiple X servers via TCP.
1929    </para></listitem>
1930
1931  <listitem>
1932    <para>Second, each server can be configured to use the same font
1933    path and either those font paths can be copied to each back-end
1934    machine or they can be mounted (e.g., via NFS) on each back-end
1935    machine.
1936    </para></listitem>
1937</orderedlist>
1938</para>
1939
1940<para>One additional concern is that a client program can set its own font
1941path, and if it does so, then that font path must be available on each
1942back-end machine.
1943</para>
1944
1945<para>The -fontpath command line option was added to allow users to
initialize the font path of the front-end server.  This font path is
1947propagated to each back-end server when the default font is loaded.  If
1948there are any problems, an error message is printed, which will describe
1949the problem and list the current font path.  For more information about
1950setting the font path, see the -fontpath option description in the man
1951page.
1952</para>
1953</sect3>
1954
1955<sect3>
1956<title>Performance</title>
1957
1958<para>Phase I of development was not intended to optimize performance.  Its
1959focus was on completely and correctly handling the base X11 protocol in
1960the Xdmx server.  However, several insights were gained during Phase I,
1961which are listed here for reference during the next phase of
1962development.
1963</para>
1964
1965<orderedlist>
1966  <listitem>
1967    <para>Calls to XSync() can slow down rendering since it requires a
1968    complete round trip to and from a back-end server.  This is
1969    especially problematic when communicating over long haul networks.
1970    </para></listitem>
1971
1972  <listitem>
1973    <para>Sending drawing requests to only the screens that they overlap
1974    should improve performance.
1975    </para></listitem>
1976</orderedlist>
1977</sect3>
1978
1979<sect3>
1980<title>Pixmaps</title>
1981
1982<para>Pixmaps were originally expected to be handled entirely in the
1983front-end X server; however, it was found that this overly complicated
1984the rendering code and would have required sending potentially large
images to each back-end server that required them when copying from pixmap
1986to screen.  Thus, pixmap state is mirrored in the back-end server just
1987as it is with regular window state.  With this implementation, the same
1988rendering code that draws to windows can be used to draw to pixmaps on
1989the back-end server, and no large image transfers are required to copy
1990from pixmap to window.
1991</para>
1992
1993</sect3>
1994
1995</sect2>
1996
1997<!-- ============================================================ -->
1998<sect2>
1999<title>Phase II</title>
2000
2001<para>The second phase of development concentrates on performance
2002optimizations.  These optimizations are documented here, with
2003<command>x11perf</command> data to show how the optimizations improve performance.
2004</para>
2005
2006<para>All benchmarks were performed by running Xdmx on a dual processor
20071.4GHz AMD Athlon machine with 1GB of RAM connecting over 100baseT to
2008two single-processor 1GHz Pentium III machines with 256MB of RAM and ATI
2009Rage 128 (RF) video cards.  The front end was running Linux
20102.4.20-pre1-ac1 and the back ends were running Linux 2.4.7-10 and
2011version 4.2.99.1 of XFree86 pulled from the XFree86 CVS repository on
2012August 7, 2002.  All systems were running Red Hat Linux 7.2.
2013</para>
2014
2015<sect3>
2016<title>Moving from XFree86 4.1.99.1 to 4.2.0.0</title>
2017
2018<para>For phase II, the working source tree was moved to the branch tagged
2019with dmx-1-0-branch and was updated from version 4.1.99.1 (20 August
20202001) of the XFree86 sources to version 4.2.0.0 (18 January 2002).
2021After this update, the following tests were noted to be more than 10%
2022faster:
2023<screen>
20241.13   Fill 300x300 opaque stippled trapezoid (161x145 stipple)
20251.16   Fill 1x1 tiled trapezoid (161x145 tile)
20261.13   Fill 10x10 tiled trapezoid (161x145 tile)
20271.17   Fill 100x100 tiled trapezoid (161x145 tile)
20281.16   Fill 1x1 tiled trapezoid (216x208 tile)
20291.20   Fill 10x10 tiled trapezoid (216x208 tile)
20301.15   Fill 100x100 tiled trapezoid (216x208 tile)
20311.37   Circulate Unmapped window (200 kids)
2032</screen>
2033And the following tests were noted to be more than 10% slower:
2034<screen>
20350.88   Unmap window via parent (25 kids)
20360.75   Circulate Unmapped window (4 kids)
20370.79   Circulate Unmapped window (16 kids)
20380.80   Circulate Unmapped window (25 kids)
20390.82   Circulate Unmapped window (50 kids)
20400.85   Circulate Unmapped window (75 kids)
2041</screen>
2042</para>
2043
2044<para>These changes were not caused by any changes in the DMX system, and
2045may point to changes in the XFree86 tree or to tests that have more
2046"jitter" than most other <command>x11perf</command> tests.
2047</para>
2048</sect3>
2049
2050<sect3>
2051<title>Global changes</title>
2052
2053<para>During the development of the Phase II DMX server, several global
2054changes were made.  These changes were also compared with the Phase I
2055server.  The following tests were noted to be more than 10% faster:
2056<screen>
20571.13   Fill 300x300 opaque stippled trapezoid (161x145 stipple)
20581.15   Fill 1x1 tiled trapezoid (161x145 tile)
20591.13   Fill 10x10 tiled trapezoid (161x145 tile)
20601.17   Fill 100x100 tiled trapezoid (161x145 tile)
20611.16   Fill 1x1 tiled trapezoid (216x208 tile)
20621.19   Fill 10x10 tiled trapezoid (216x208 tile)
20631.15   Fill 100x100 tiled trapezoid (216x208 tile)
20641.15   Circulate Unmapped window (4 kids)
2065</screen>
2066</para>
2067
2068<para>The following tests were noted to be more than 10% slower:
2069<screen>
20700.69   Scroll 10x10 pixels
20710.68   Scroll 100x100 pixels
20720.68   Copy 10x10 from window to window
20730.68   Copy 100x100 from window to window
20740.76   Circulate Unmapped window (75 kids)
20750.83   Circulate Unmapped window (100 kids)
2076</screen>
2077</para>
2078
2079<para>For the remainder of this analysis, the baseline of comparison will
2080be the Phase II deliverable with all optimizations disabled (unless
2081otherwise noted).  This will highlight how the optimizations in
2082isolation impact performance.
2083</para>
2084</sect3>
2085
2086<sect3>
2087<title>XSync() Batching</title>
2088
2089<para>During the Phase I implementation, XSync() was called after every
2090protocol request made by the DMX server.  This provided the DMX server
2091with an interactive feel, but defeated X11's protocol buffering system
2092and introduced round-trip wire latency into every operation.  During
2093Phase II, DMX was changed so that protocol requests are no longer
2094followed by calls to XSync().  Instead, the need for an XSync() is
noted, and XSync() calls are made only every 100 ms or when the DMX
server specifically needs to make a call to guarantee interactivity.
With this new system, X11 buffers protocol as much as possible during a
100 ms interval, and many unnecessary XSync() calls are avoided.
2099</para>
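<para>The batching policy can be modelled with the following minimal
sketch.  It is an illustration only: time is passed in explicitly
rather than read from a timer, do_sync() merely stands in for a real
XSync() call, and the SyncState type and all function names are
hypothetical rather than taken from the DMX sources.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

#define SYNC_INTERVAL_MS 100   /* interval stated in the text */

typedef struct {
    bool sync_needed;    /* a request was sent since the last sync */
    long last_sync_ms;   /* time of the last sync */
    int  sync_count;     /* round trips performed, for illustration */
} SyncState;

/* Stands in for a real XSync() round trip to a back-end server. */
void do_sync(SyncState *st, long now_ms)
{
    st->sync_needed = false;
    st->last_sync_ms = now_ms;
    st->sync_count++;
}

/* Called after each protocol request: only note that a sync is needed. */
void note_request(SyncState *st)
{
    st->sync_needed = true;
}

/* Called periodically, or with force set when interactivity needs it. */
void maybe_sync(SyncState *st, long now_ms, bool force)
{
    assert(st != 0);
    if (!st->sync_needed)
        return;
    if (force || now_ms - st->last_sync_ms >= SYNC_INTERVAL_MS)
        do_sync(st, now_ms);
}
```
]]></programlisting>
<para>In this model, 250 requests issued one per millisecond cause two
round trips instead of 250.
</para>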
2100
2101<para>Out of more than 300 <command>x11perf</command> tests, 8 tests became more than 100
2102times faster, with 68 more than 50X faster, 114 more than 10X faster,
2103and 181 more than 2X faster.  See table below for summary.
2104</para>
2105
2106<para>The following tests were noted to be more than 10% slower with
2107XSync() batching on:
2108<screen>
21090.88   500x500 tiled rectangle (161x145 tile)
21100.89   Copy 500x500 from window to window
2111</screen>
2112</para>
2113</sect3>
2114
2115<sect3>
2116<title>Offscreen Optimization</title>
2117
2118<para>Windows span one or more of the back-end servers' screens; however,
2119during Phase I development, windows were created on every back-end
2120server and every rendering request was sent to every window regardless
2121of whether or not that window was visible.  With the offscreen
2122optimization, the DMX server tracks when a window is completely off of a
2123back-end server's screen and, in that case, it does not send rendering
2124requests to those back-end windows.  This optimization saves bandwidth
2125between the front and back-end servers, and it reduces the number of
2126XSync() calls.  The performance tests were run on a DMX system with only
2127two back-end servers.  Greater performance gains will be had as the
2128number of back-end servers increases.
2129</para>
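<para>At its core, the offscreen test is a rectangle overlap check.  The
sketch below shows one way to decide which back-end screens need to
receive rendering requests for a window; the Rect type and both
function names are hypothetical, and window shapes and clipping by
parent windows are ignored.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int x, y, width, height; } Rect;

/* True when the two rectangles share at least one pixel. */
bool rects_overlap(Rect a, Rect b)
{
    return a.x < b.x + b.width  && b.x < a.x + a.width &&
           a.y < b.y + b.height && b.y < a.y + a.height;
}

/* Return a bit mask with bit i set when the window is at least
 * partially visible on back-end screen i. */
unsigned visible_mask(Rect window, const Rect *screens, int n)
{
    unsigned mask = 0;
    assert(screens != 0 && n >= 0 && n <= 32);
    for (int i = 0; i < n; i++)
        if (rects_overlap(window, screens[i]))
            mask |= 1u << i;
    return mask;
}
```
]]></programlisting>
<para>Rendering requests would then be sent only to the screens whose
bit is set in the mask.
</para>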
2130
2131<para>Out of more than 300 <command>x11perf</command> tests, 3 tests were at least twice as
2132fast, and 146 tests were at least 10% faster.  Two tests were more than
213310% slower with the offscreen optimization:
2134<screen>
21350.88   Hide/expose window via popup (4 kids)
21360.89   Resize unmapped window (75 kids)
2137</screen>
2138</para>
2139</sect3>
2140
2141<sect3>
2142<title>Lazy Window Creation Optimization</title>
2143
2144<para>As mentioned above, during Phase I, windows were created on every
2145back-end server even if they were not visible on that back-end.  With
2146the lazy window creation optimization, the DMX server does not create
2147windows on a back-end server until they are either visible or they
2148become the parents of a visible window.  This optimization builds on the
2149offscreen optimization (described above) and requires it to be enabled.
2150</para>
2151
2152<para>The lazy window creation optimization works by creating the window
2153data structures in the front-end server when a client creates a window,
2154but delays creation of the window on the back-end server(s).  A private
2155window structure in the DMX server saves the relevant window data and
2156tracks changes to the window's attributes and stacking order for later
2157use.  The only times a window is created on a back-end server are (1)
2158when it is mapped and is at least partially overlapping the back-end
2159server's screen (tracked by the offscreen optimization), or (2) when the
2160window becomes the parent of a previously visible window.  The first
2161case occurs when a window is mapped or when a visible window is copied,
2162moved or resized and now overlaps the back-end server's screen.  The
2163second case occurs when starting a window manager after having created
2164windows to which the window manager needs to add decorations.
2165</para>
2166
2167<para>When either case occurs, a window on the back-end server is created
2168using the data saved in the DMX server's window private data structure.
2169The stacking order is then adjusted to correctly place the window on the
2170back-end and lastly the window is mapped.  From this time forward, the
2171window is handled exactly as if the window had been created at the time
2172of the client's request.
2173</para>
2174
2175<para>Note that when a window is no longer visible on a back-end server's
2176screen (e.g., it is moved offscreen), the window is not destroyed;
2177rather, it is kept and reused later if the window once again becomes
2178visible on the back-end server's screen.  Originally with this
2179optimization, destroying windows was implemented but was later rejected
2180because it increased bandwidth when windows were opaquely moved or
2181resized, which is common in many window managers.
2182</para>
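<para>The creation rule described in the preceding paragraphs can be
summarized in a small sketch.  It tracks one window on one back-end
server, and every name in it (LazyWindow, update_backend_window() and
the structure members) is hypothetical; the real DMX window private
structure holds considerably more state.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool mapped;             /* the client has mapped the window */
    bool overlaps_screen;    /* tracked by the offscreen optimization */
    bool has_visible_child;  /* parent of a window already on the back-end */
    bool created;            /* the back-end window exists */
    int  create_count;       /* number of back-end creations so far */
} LazyWindow;

/* Re-evaluate the window after any change to its state. */
void update_backend_window(LazyWindow *w)
{
    assert(w != 0);
    bool needed = (w->mapped && w->overlaps_screen) || w->has_visible_child;
    if (needed && !w->created) {
        /* Here the real server would create the window from the saved
         * attributes, adjust the stacking order, and map it. */
        w->created = true;
        w->create_count++;
    }
    /* A window that is no longer needed is kept for later reuse
     * rather than destroyed. */
}
```
]]></programlisting>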
2183
2184<para>The performance tests were run on a DMX system with only two back-end
2185servers.  Greater performance gains will be had as the number of
2186back-end servers increases.
2187</para>

<para>This optimization improved the following <command>x11perf</command> tests by more
than 10%:
<screen>
1.10   500x500 rectangle outline
1.12   Fill 100x100 stippled trapezoid (161x145 stipple)
1.20   Circulate Unmapped window (50 kids)
1.19   Circulate Unmapped window (75 kids)
</screen>
</para>
</sect3>

<sect3>
<title>Subdividing Rendering Primitives</title>

<para>X11 imaging requests transfer significant data between the client and
the X server.  During Phase I, the DMX server would then transfer the
image data to each back-end server.  Even with the offscreen
optimization (above), these requests still required transferring
significant data to each back-end server that contained a visible
portion of the window.  For example, if the client uses XPutImage() to
copy an image to a window that overlaps the entire DMX screen, then the
entire image is copied by the DMX server to every back-end server.
</para>

<para>To reduce the amount of data transferred between the DMX server and
the back-end servers when XPutImage() is called, the image data is
subdivided and only the data that will be visible on a back-end server's
screen is sent to that back-end server.  Xinerama already implements a
subdivision algorithm for XGetImage() and no further optimization was
needed.
</para>
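
<para>The subdivision amounts to intersecting the image rectangle with each
back-end screen.  The following Python sketch illustrates the idea under
simplifying assumptions: rectangles are (x, y, width, height) tuples in
global DMX coordinates, window clipping is ignored, and the function names
clip_to_screen() and subdivide_put_image() are hypothetical rather than
taken from the Xdmx sources.
</para>
<programlisting><![CDATA[
```python
def clip_to_screen(image, screen):
    """Return the part of `image` that lies on `screen`, or None."""
    ix, iy, iw, ih = image
    sx, sy, sw, sh = screen
    left = max(ix, sx)
    top = max(iy, sy)
    right = min(ix + iw, sx + sw)
    bottom = min(iy + ih, sy + sh)
    if right <= left or bottom <= top:
        return None  # no visible portion on this back-end
    return (left, top, right - left, bottom - top)


def subdivide_put_image(image, screens):
    """Map back-end index -> (source sub-rectangle, destination origin).

    The source sub-rectangle is relative to the image; the destination
    origin is relative to the back-end screen that receives the data.
    """
    pieces = {}
    for index, screen in enumerate(screens):
        visible = clip_to_screen(image, screen)
        if visible is None:
            continue  # nothing is sent to this back-end
        vx, vy, vw, vh = visible
        src = (vx - image[0], vy - image[1], vw, vh)
        dst = (vx - screen[0], vy - screen[1])
        pieces[index] = (src, dst)
    return pieces
```
]]></programlisting>
<para>For two 1280x1024 screens placed side by side, a 200x50 image at
(1180, 100) is split into two 100x50 pieces, one per back-end, and an image
that lies entirely on one screen is sent only to that back-end.
</para>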

<para>Other rendering primitives were analyzed, but the time required to
subdivide these primitives was a significant proportion of the time
required to send the entire rendering request to the back-end server, so
this optimization was rejected for the other rendering primitives.
</para>

<para>Again, the performance tests were run on a DMX system with only two
back-end servers.  Greater performance gains are expected as the number
of back-end servers increases.
</para>

<para>This optimization improved the following <command>x11perf</command> tests by more
than 10%:
<screen>
1.12   Fill 100x100 stippled trapezoid (161x145 stipple)
1.26   PutImage 10x10 square
1.83   PutImage 100x100 square
1.91   PutImage 500x500 square
1.40   PutImage XY 10x10 square
1.48   PutImage XY 100x100 square
1.50   PutImage XY 500x500 square
1.45   Circulate Unmapped window (75 kids)
1.74   Circulate Unmapped window (100 kids)
</screen>
</para>

<para>The following test was noted to be more than 10% slower with this
optimization:
<screen>
0.88   10-pixel fill chord partial circle
</screen>
</para>
</sect3>

<sect3>
<title>Summary of x11perf Data</title>

<para>With all of the optimizations on, 53 <command>x11perf</command> tests are more than
100X faster than the unoptimized Phase II deliverable, with 69 more than
50X faster, 73 more than 10X faster, and 199 more than twice as fast.
No tests were more than 10% slower than the unoptimized Phase II
deliverable.  (Compared with the Phase I deliverable, only Circulate
Unmapped window (100 kids) was more than 10% slower than the Phase II
deliverable.  As noted above, this test seems to have wider variability
than other <command>x11perf</command> tests.)
</para>

<para>The following table summarizes relative <command>x11perf</command> test changes for
all optimizations individually and collectively.  Note that some of the
optimizations have a synergistic effect when used together.
<screen>

1: XSync() batching only
2: Off screen optimizations only
3: Window optimizations only
4: Subdivprims only
5: All optimizations

    1     2    3    4      5 Operation
------ ---- ---- ---- ------ ---------
  2.14 1.85 1.00 1.00   4.13 Dot
  1.67 1.80 1.00 1.00   3.31 1x1 rectangle
  2.38 1.43 1.00 1.00   2.44 10x10 rectangle
  1.00 1.00 0.92 0.98   1.00 100x100 rectangle
  1.00 1.00 1.00 1.00   1.00 500x500 rectangle
  1.83 1.85 1.05 1.06   3.54 1x1 stippled rectangle (8x8 stipple)
  2.43 1.43 1.00 1.00   2.41 10x10 stippled rectangle (8x8 stipple)
  0.98 1.00 1.00 1.00   1.00 100x100 stippled rectangle (8x8 stipple)
  1.00 1.00 1.00 1.00   0.98 500x500 stippled rectangle (8x8 stipple)
  1.75 1.75 1.00 1.00   3.40 1x1 opaque stippled rectangle (8x8 stipple)
  2.38 1.42 1.00 1.00   2.34 10x10 opaque stippled rectangle (8x8 stipple)
  1.00 1.00 0.97 0.97   1.00 100x100 opaque stippled rectangle (8x8 stipple)
  1.00 1.00 1.00 1.00   0.99 500x500 opaque stippled rectangle (8x8 stipple)
  1.82 1.82 1.04 1.04   3.56 1x1 tiled rectangle (4x4 tile)
  2.33 1.42 1.00 1.00   2.37 10x10 tiled rectangle (4x4 tile)
  1.00 0.92 1.00 1.00   1.00 100x100 tiled rectangle (4x4 tile)
  1.00 1.00 1.00 1.00   1.00 500x500 tiled rectangle (4x4 tile)
  1.94 1.62 1.00 1.00   3.66 1x1 stippled rectangle (17x15 stipple)
  1.74 1.28 1.00 1.00   1.73 10x10 stippled rectangle (17x15 stipple)
  1.00 1.00 1.00 0.89   0.98 100x100 stippled rectangle (17x15 stipple)
  1.00 1.00 1.00 1.00   0.98 500x500 stippled rectangle (17x15 stipple)
  1.94 1.62 1.00 1.00   3.67 1x1 opaque stippled rectangle (17x15 stipple)
  1.69 1.26 1.00 1.00   1.66 10x10 opaque stippled rectangle (17x15 stipple)
  1.00 0.95 1.00 1.00   1.00 100x100 opaque stippled rectangle (17x15 stipple)
  1.00 1.00 1.00 1.00   0.97 500x500 opaque stippled rectangle (17x15 stipple)
  1.93 1.61 0.99 0.99   3.69 1x1 tiled rectangle (17x15 tile)
  1.73 1.27 1.00 1.00   1.72 10x10 tiled rectangle (17x15 tile)
  1.00 1.00 1.00 1.00   0.98 100x100 tiled rectangle (17x15 tile)
  1.00 1.00 0.97 0.97   1.00 500x500 tiled rectangle (17x15 tile)
  1.95 1.63 1.00 1.00   3.83 1x1 stippled rectangle (161x145 stipple)
  1.80 1.30 1.00 1.00   1.83 10x10 stippled rectangle (161x145 stipple)
  0.97 1.00 1.00 1.00   1.01 100x100 stippled rectangle (161x145 stipple)
  1.00 1.00 1.00 1.00   0.98 500x500 stippled rectangle (161x145 stipple)
  1.95 1.63 1.00 1.00   3.56 1x1 opaque stippled rectangle (161x145 stipple)
  1.65 1.25 1.00 1.00   1.68 10x10 opaque stippled rectangle (161x145 stipple)
  1.00 1.00 1.00 1.00   1.01 100x100 opaque stippled rectangle (161x145...
  1.00 1.00 1.00 1.00   0.97 500x500 opaque stippled rectangle (161x145...
  1.95 1.63 0.98 0.99   3.80 1x1 tiled rectangle (161x145 tile)
  1.67 1.26 1.00 1.00   1.67 10x10 tiled rectangle (161x145 tile)
  1.13 1.14 1.14 1.14   1.14 100x100 tiled rectangle (161x145 tile)
  0.88 1.00 1.00 1.00   0.99 500x500 tiled rectangle (161x145 tile)
  1.93 1.63 1.00 1.00   3.53 1x1 tiled rectangle (216x208 tile)
  1.69 1.26 1.00 1.00   1.66 10x10 tiled rectangle (216x208 tile)
  1.00 1.00 1.00 1.00   1.00 100x100 tiled rectangle (216x208 tile)
  1.00 1.00 1.00 1.00   1.00 500x500 tiled rectangle (216x208 tile)
  1.82 1.70 1.00 1.00   3.38 1-pixel line segment
  2.07 1.56 0.90 1.00   3.31 10-pixel line segment
  1.29 1.10 1.00 1.00   1.27 100-pixel line segment
  1.05 1.06 1.03 1.03   1.09 500-pixel line segment
  1.30 1.13 1.00 1.00   1.29 100-pixel line segment (1 kid)
  1.32 1.15 1.00 1.00   1.32 100-pixel line segment (2 kids)
  1.33 1.16 1.00 1.00   1.33 100-pixel line segment (3 kids)
  1.92 1.64 1.00 1.00   3.73 10-pixel dashed segment
  1.34 1.16 1.00 1.00   1.34 100-pixel dashed segment
  1.24 1.11 0.99 0.97   1.23 100-pixel double-dashed segment
  1.72 1.77 1.00 1.00   3.25 10-pixel horizontal line segment
  1.83 1.66 1.01 1.00   3.54 100-pixel horizontal line segment
  1.86 1.30 1.00 1.00   1.84 500-pixel horizontal line segment
  2.11 1.52 1.00 0.99   3.02 10-pixel vertical line segment
  1.21 1.10 1.00 1.00   1.20 100-pixel vertical line segment
  1.03 1.03 1.00 1.00   1.02 500-pixel vertical line segment
  4.42 1.68 1.00 1.01   4.64 10x1 wide horizontal line segment
  1.83 1.31 1.00 1.00   1.83 100x10 wide horizontal line segment
  1.07 1.00 0.96 1.00   1.07 500x50 wide horizontal line segment
  4.10 1.67 1.00 1.00   4.62 10x1 wide vertical line segment
  1.50 1.24 1.06 1.06   1.48 100x10 wide vertical line segment
  1.06 1.03 1.00 1.00   1.05 500x50 wide vertical line segment
  2.54 1.61 1.00 1.00   3.61 1-pixel line
  2.71 1.48 1.00 1.00   2.67 10-pixel line
  1.19 1.09 1.00 1.00   1.19 100-pixel line
  1.04 1.02 1.00 1.00   1.03 500-pixel line
  2.68 1.51 0.98 1.00   3.17 10-pixel dashed line
  1.23 1.11 0.99 0.99   1.23 100-pixel dashed line
  1.15 1.08 1.00 1.00   1.15 100-pixel double-dashed line
  2.27 1.39 1.00 1.00   2.23 10x1 wide line
  1.20 1.09 1.00 1.00   1.20 100x10 wide line
  1.04 1.02 1.00 1.00   1.04 500x50 wide line
  1.52 1.45 1.00 1.00   1.52 100x10 wide dashed line
  1.54 1.47 1.00 1.00   1.54 100x10 wide double-dashed line
  1.97 1.30 0.96 0.95   1.95 10x10 rectangle outline
  1.44 1.27 1.00 1.00   1.43 100x100 rectangle outline
  3.22 2.16 1.10 1.09   3.61 500x500 rectangle outline
  1.95 1.34 1.00 1.00   1.90 10x10 wide rectangle outline
  1.14 1.14 1.00 1.00   1.13 100x100 wide rectangle outline
  1.00 1.00 1.00 1.00   1.00 500x500 wide rectangle outline
  1.57 1.72 1.00 1.00   3.03 1-pixel circle
  1.96 1.35 1.00 1.00   1.92 10-pixel circle
  1.21 1.07 0.86 0.97   1.20 100-pixel circle
  1.08 1.04 1.00 1.00   1.08 500-pixel circle
  1.39 1.19 1.03 1.03   1.38 100-pixel dashed circle
  1.21 1.11 1.00 1.00   1.23 100-pixel double-dashed circle
  1.59 1.28 1.00 1.00   1.58 10-pixel wide circle
  1.22 1.12 0.99 1.00   1.22 100-pixel wide circle
  1.06 1.04 1.00 1.00   1.05 500-pixel wide circle
  1.87 1.84 1.00 1.00   1.85 100-pixel wide dashed circle
  1.90 1.93 1.01 1.01   1.90 100-pixel wide double-dashed circle
  2.13 1.43 1.00 1.00   2.32 10-pixel partial circle
  1.42 1.18 1.00 1.00   1.42 100-pixel partial circle
  1.92 1.85 1.01 1.01   1.89 10-pixel wide partial circle
  1.73 1.67 1.00 1.00   1.73 100-pixel wide partial circle
  1.36 1.95 1.00 1.00   2.64 1-pixel solid circle
  2.02 1.37 1.00 1.00   2.03 10-pixel solid circle
  1.19 1.09 1.00 1.00   1.19 100-pixel solid circle
  1.02 0.99 1.00 1.00   1.01 500-pixel solid circle
  1.74 1.28 1.00 0.88   1.73 10-pixel fill chord partial circle
  1.31 1.13 1.00 1.00   1.31 100-pixel fill chord partial circle
  1.67 1.31 1.03 1.03   1.72 10-pixel fill slice partial circle
  1.30 1.13 1.00 1.00   1.28 100-pixel fill slice partial circle
  2.45 1.49 1.01 1.00   2.71 10-pixel ellipse
  1.22 1.10 1.00 1.00   1.22 100-pixel ellipse
  1.09 1.04 1.00 1.00   1.09 500-pixel ellipse
  1.90 1.28 1.00 1.00   1.89 100-pixel dashed ellipse
  1.62 1.24 0.96 0.97   1.61 100-pixel double-dashed ellipse
  2.43 1.50 1.00 1.00   2.42 10-pixel wide ellipse
  1.61 1.28 1.03 1.03   1.60 100-pixel wide ellipse
  1.08 1.05 1.00 1.00   1.08 500-pixel wide ellipse
  1.93 1.88 1.00 1.00   1.88 100-pixel wide dashed ellipse
  1.94 1.89 1.01 1.00   1.94 100-pixel wide double-dashed ellipse
  2.31 1.48 1.00 1.00   2.67 10-pixel partial ellipse
  1.38 1.17 1.00 1.00   1.38 100-pixel partial ellipse
  2.00 1.85 0.98 0.97   1.98 10-pixel wide partial ellipse
  1.89 1.86 1.00 1.00   1.89 100-pixel wide partial ellipse
  3.49 1.60 1.00 1.00   3.65 10-pixel filled ellipse
  1.67 1.26 1.00 1.00   1.67 100-pixel filled ellipse
  1.06 1.04 1.00 1.00   1.06 500-pixel filled ellipse
  2.38 1.43 1.01 1.00   2.32 10-pixel fill chord partial ellipse
  2.06 1.30 1.00 1.00   2.05 100-pixel fill chord partial ellipse
  2.27 1.41 1.00 1.00   2.27 10-pixel fill slice partial ellipse
  1.98 1.33 1.00 0.97   1.97 100-pixel fill slice partial ellipse
 57.46 1.99 1.01 1.00 114.92 Fill 1x1 equivalent triangle
 56.94 1.98 1.01 1.00  73.89 Fill 10x10 equivalent triangle
  6.07 1.75 1.00 1.00   6.07 Fill 100x100 equivalent triangle
 51.12 1.98 1.00 1.00 102.81 Fill 1x1 trapezoid
 51.42 1.82 1.01 1.00  94.89 Fill 10x10 trapezoid
  6.47 1.80 1.00 1.00   6.44 Fill 100x100 trapezoid
  1.56 1.28 1.00 0.99   1.56 Fill 300x300 trapezoid
 51.27 1.97 0.96 0.97 102.54 Fill 1x1 stippled trapezoid (8x8 stipple)
 51.73 2.00 1.02 1.02  67.92 Fill 10x10 stippled trapezoid (8x8 stipple)
  5.36 1.72 1.00 1.00   5.36 Fill 100x100 stippled trapezoid (8x8 stipple)
  1.54 1.26 1.00 1.00   1.59 Fill 300x300 stippled trapezoid (8x8 stipple)
 51.41 1.94 1.01 1.00 102.82 Fill 1x1 opaque stippled trapezoid (8x8 stipple)
 50.71 1.95 0.99 1.00  65.44 Fill 10x10 opaque stippled trapezoid (8x8...
  5.33 1.73 1.00 1.00   5.36 Fill 100x100 opaque stippled trapezoid (8x8...
  1.58 1.25 1.00 1.00   1.58 Fill 300x300 opaque stippled trapezoid (8x8...
 51.56 1.96 0.99 0.90 103.68 Fill 1x1 tiled trapezoid (4x4 tile)
 51.59 1.99 1.01 1.01  62.25 Fill 10x10 tiled trapezoid (4x4 tile)
  5.38 1.72 1.00 1.00   5.38 Fill 100x100 tiled trapezoid (4x4 tile)
  1.54 1.25 1.00 0.99   1.58 Fill 300x300 tiled trapezoid (4x4 tile)
 51.70 1.98 1.01 1.01 103.98 Fill 1x1 stippled trapezoid (17x15 stipple)
 44.86 1.97 1.00 1.00  44.86 Fill 10x10 stippled trapezoid (17x15 stipple)
  2.74 1.56 1.00 1.00   2.73 Fill 100x100 stippled trapezoid (17x15 stipple)
  1.29 1.14 1.00 1.00   1.27 Fill 300x300 stippled trapezoid (17x15 stipple)
 51.41 1.96 0.96 0.95 103.39 Fill 1x1 opaque stippled trapezoid (17x15...
 45.14 1.96 1.01 1.00  45.14 Fill 10x10 opaque stippled trapezoid (17x15...
  2.68 1.56 1.00 1.00   2.68 Fill 100x100 opaque stippled trapezoid (17x15...
  1.26 1.10 1.00 1.00   1.28 Fill 300x300 opaque stippled trapezoid (17x15...
 51.13 1.97 1.00 0.99 103.39 Fill 1x1 tiled trapezoid (17x15 tile)
 47.58 1.96 1.00 1.00  47.86 Fill 10x10 tiled trapezoid (17x15 tile)
  2.74 1.56 1.00 1.00   2.74 Fill 100x100 tiled trapezoid (17x15 tile)
  1.29 1.14 1.00 1.00   1.28 Fill 300x300 tiled trapezoid (17x15 tile)
 51.13 1.97 0.99 0.97 103.39 Fill 1x1 stippled trapezoid (161x145 stipple)
 45.14 1.97 1.00 1.00  44.29 Fill 10x10 stippled trapezoid (161x145 stipple)
  3.02 1.77 1.12 1.12   3.38 Fill 100x100 stippled trapezoid (161x145 stipple)
  1.31 1.13 1.00 1.00   1.30 Fill 300x300 stippled trapezoid (161x145 stipple)
 51.27 1.97 1.00 1.00 103.10 Fill 1x1 opaque stippled trapezoid (161x145...
 45.01 1.97 1.00 1.00  45.01 Fill 10x10 opaque stippled trapezoid (161x145...
  2.67 1.56 1.00 1.00   2.69 Fill 100x100 opaque stippled trapezoid (161x145..
  1.29 1.13 1.00 1.01   1.27 Fill 300x300 opaque stippled trapezoid (161x145..
 51.41 1.96 1.00 0.99 103.39 Fill 1x1 tiled trapezoid (161x145 tile)
 45.01 1.96 0.98 1.00  45.01 Fill 10x10 tiled trapezoid (161x145 tile)
  2.62 1.36 1.00 1.00   2.69 Fill 100x100 tiled trapezoid (161x145 tile)
  1.27 1.13 1.00 1.00   1.22 Fill 300x300 tiled trapezoid (161x145 tile)
 51.13 1.98 1.00 1.00 103.39 Fill 1x1 tiled trapezoid (216x208 tile)
 45.14 1.97 1.01 0.99  45.14 Fill 10x10 tiled trapezoid (216x208 tile)
  2.62 1.55 1.00 1.00   2.71 Fill 100x100 tiled trapezoid (216x208 tile)
  1.28 1.13 1.00 1.00   1.20 Fill 300x300 tiled trapezoid (216x208 tile)
 50.71 1.95 1.00 1.00  54.70 Fill 10x10 equivalent complex polygon
  5.51 1.71 0.96 0.98   5.47 Fill 100x100 equivalent complex polygons
  8.39 1.97 1.00 1.00  16.75 Fill 10x10 64-gon (Convex)
  8.38 1.83 1.00 1.00   8.43 Fill 100x100 64-gon (Convex)
  8.50 1.96 1.00 1.00  16.64 Fill 10x10 64-gon (Complex)
  8.26 1.83 1.00 1.00   8.35 Fill 100x100 64-gon (Complex)
 14.09 1.87 1.00 1.00  14.05 Char in 80-char line (6x13)
 11.91 1.87 1.00 1.00  11.95 Char in 70-char line (8x13)
 11.16 1.85 1.01 1.00  11.10 Char in 60-char line (9x15)
 10.09 1.78 1.00 1.00  10.09 Char16 in 40-char line (k14)
  6.15 1.75 1.00 1.00   6.31 Char16 in 23-char line (k24)
 11.92 1.90 1.03 1.03  11.88 Char in 80-char line (TR 10)
  8.18 1.78 1.00 0.99   8.17 Char in 30-char line (TR 24)
 42.83 1.44 1.01 1.00  42.11 Char in 20/40/20 line (6x13, TR 10)
 27.45 1.43 1.01 1.01  27.45 Char16 in 7/14/7 line (k14, k24)
 12.13 1.85 1.00 1.00  12.05 Char in 80-char image line (6x13)
 10.00 1.84 1.00 1.00  10.00 Char in 70-char image line (8x13)
  9.18 1.83 1.00 1.00   9.12 Char in 60-char image line (9x15)
  9.66 1.82 0.98 0.95   9.66 Char16 in 40-char image line (k14)
  5.82 1.72 1.00 1.00   5.99 Char16 in 23-char image line (k24)
  8.70 1.80 1.00 1.00   8.65 Char in 80-char image line (TR 10)
  4.67 1.66 1.00 1.00   4.67 Char in 30-char image line (TR 24)
 84.43 1.47 1.00 1.00 124.18 Scroll 10x10 pixels
  3.73 1.50 1.00 0.98   3.73 Scroll 100x100 pixels
  1.00 1.00 1.00 1.00   1.00 Scroll 500x500 pixels
 84.43 1.51 1.00 1.00 134.02 Copy 10x10 from window to window
  3.62 1.51 0.98 0.98   3.62 Copy 100x100 from window to window
  0.89 1.00 1.00 1.00   1.00 Copy 500x500 from window to window
 57.06 1.99 1.00 1.00  88.64 Copy 10x10 from pixmap to window
  2.49 2.00 1.00 1.00   2.48 Copy 100x100 from pixmap to window
  1.00 0.91 1.00 1.00   0.98 Copy 500x500 from pixmap to window
  2.04 1.01 1.00 1.00   2.03 Copy 10x10 from window to pixmap
  1.05 1.00 1.00 1.00   1.05 Copy 100x100 from window to pixmap
  1.00 1.00 0.93 1.00   1.04 Copy 500x500 from window to pixmap
 58.52 1.03 1.03 1.02  57.95 Copy 10x10 from pixmap to pixmap
  2.40 1.00 1.00 1.00   2.45 Copy 100x100 from pixmap to pixmap
  1.00 1.00 1.00 1.00   1.00 Copy 500x500 from pixmap to pixmap
 51.57 1.92 1.00 1.00  85.75 Copy 10x10 1-bit deep plane
  6.37 1.75 1.01 1.01   6.37 Copy 100x100 1-bit deep plane
  1.26 1.11 1.00 1.00   1.24 Copy 500x500 1-bit deep plane
  4.23 1.63 0.98 0.97   4.38 Copy 10x10 n-bit deep plane
  1.04 1.02 1.00 1.00   1.04 Copy 100x100 n-bit deep plane
  1.00 1.00 1.00 1.00   1.00 Copy 500x500 n-bit deep plane
  6.45 1.98 1.00 1.26  12.80 PutImage 10x10 square
  1.10 1.87 1.00 1.83   2.11 PutImage 100x100 square
  1.02 1.93 1.00 1.91   1.91 PutImage 500x500 square
  4.17 1.78 1.00 1.40   7.18 PutImage XY 10x10 square
  1.27 1.49 0.97 1.48   2.10 PutImage XY 100x100 square
  1.00 1.50 1.00 1.50   1.52 PutImage XY 500x500 square
  1.07 1.01 1.00 1.00   1.06 GetImage 10x10 square
  1.01 1.00 1.00 1.00   1.01 GetImage 100x100 square
  1.00 1.00 1.00 1.00   1.00 GetImage 500x500 square
  1.56 1.00 0.99 0.97   1.56 GetImage XY 10x10 square
  1.02 1.00 1.00 1.00   1.02 GetImage XY 100x100 square
  1.00 1.00 1.00 1.00   1.00 GetImage XY 500x500 square
  1.00 1.00 1.01 0.98   0.95 X protocol NoOperation
  1.02 1.03 1.04 1.03   1.00 QueryPointer
  1.03 1.02 1.04 1.03   1.00 GetProperty
100.41 1.51 1.00 1.00 198.76 Change graphics context
 45.81 1.00 0.99 0.97  57.10 Create and map subwindows (4 kids)
 78.45 1.01 1.02 1.02  63.07 Create and map subwindows (16 kids)
 73.91 1.01 1.00 1.00  56.37 Create and map subwindows (25 kids)
 73.22 1.00 1.00 1.00  49.07 Create and map subwindows (50 kids)
 72.36 1.01 0.99 1.00  32.14 Create and map subwindows (75 kids)
 70.34 1.00 1.00 1.00  30.12 Create and map subwindows (100 kids)
 55.00 1.00 1.00 0.99  23.75 Create and map subwindows (200 kids)
 55.30 1.01 1.00 1.00 141.03 Create unmapped window (4 kids)
 55.38 1.01 1.01 1.00 163.25 Create unmapped window (16 kids)
 54.75 0.96 1.00 0.99 166.95 Create unmapped window (25 kids)
 54.83 1.00 1.00 0.99 178.81 Create unmapped window (50 kids)
 55.38 1.01 1.01 1.00 181.20 Create unmapped window (75 kids)
 55.38 1.01 1.01 1.00 181.20 Create unmapped window (100 kids)
 54.87 1.01 1.01 1.00 182.05 Create unmapped window (200 kids)
 28.13 1.00 1.00 1.00  30.75 Map window via parent (4 kids)
 36.14 1.01 1.01 1.01  32.58 Map window via parent (16 kids)
 26.13 1.00 0.98 0.95  29.85 Map window via parent (25 kids)
 40.07 1.00 1.01 1.00  27.57 Map window via parent (50 kids)
 23.26 0.99 1.00 1.00  18.23 Map window via parent (75 kids)
 22.91 0.99 1.00 0.99  16.52 Map window via parent (100 kids)
 27.79 1.00 1.00 0.99  12.50 Map window via parent (200 kids)
 22.35 1.00 1.00 1.00  56.19 Unmap window via parent (4 kids)
  9.57 1.00 0.99 1.00  89.78 Unmap window via parent (16 kids)
 80.77 1.01 1.00 1.00 103.85 Unmap window via parent (25 kids)
 96.34 1.00 1.00 1.00 116.06 Unmap window via parent (50 kids)
 99.72 1.00 1.00 1.00 124.93 Unmap window via parent (75 kids)
112.36 1.00 1.00 1.00 125.27 Unmap window via parent (100 kids)
105.41 1.00 1.00 0.99 120.00 Unmap window via parent (200 kids)
 51.29 1.03 1.02 1.02  74.19 Destroy window via parent (4 kids)
 86.75 0.99 0.99 0.99 116.87 Destroy window via parent (16 kids)
106.43 1.01 1.01 1.01 127.49 Destroy window via parent (25 kids)
120.34 1.01 1.01 1.00 140.11 Destroy window via parent (50 kids)
126.67 1.00 0.99 0.99 145.00 Destroy window via parent (75 kids)
126.11 1.01 1.01 1.00 140.56 Destroy window via parent (100 kids)
128.57 1.01 1.00 1.00 137.91 Destroy window via parent (200 kids)
 16.04 0.88 1.00 1.00  20.36 Hide/expose window via popup (4 kids)
 19.04 1.01 1.00 1.00  23.48 Hide/expose window via popup (16 kids)
 19.22 1.00 1.00 1.00  20.44 Hide/expose window via popup (25 kids)
 17.41 1.00 0.91 0.97  17.68 Hide/expose window via popup (50 kids)
 17.29 1.01 1.00 1.01  17.07 Hide/expose window via popup (75 kids)
 16.74 1.00 1.00 1.00  16.17 Hide/expose window via popup (100 kids)
 10.30 1.00 1.00 1.00  10.51 Hide/expose window via popup (200 kids)
 16.48 1.01 1.00 1.00  26.05 Move window (4 kids)
 17.01 0.95 1.00 1.00  23.97 Move window (16 kids)
 16.95 1.00 1.00 1.00  22.90 Move window (25 kids)
 16.05 1.01 1.00 1.00  21.32 Move window (50 kids)
 15.58 1.00 0.98 0.98  19.44 Move window (75 kids)
 14.98 1.02 1.03 1.03  18.17 Move window (100 kids)
 10.90 1.01 1.01 1.00  12.68 Move window (200 kids)
 49.42 1.00 1.00 1.00 198.27 Moved unmapped window (4 kids)
 50.72 0.97 1.00 1.00 193.66 Moved unmapped window (16 kids)
 50.87 1.00 0.99 1.00 195.09 Moved unmapped window (25 kids)
 50.72 1.00 1.00 1.00 189.34 Moved unmapped window (50 kids)
 50.87 1.00 1.00 1.00 191.33 Moved unmapped window (75 kids)
 50.87 1.00 1.00 0.90 186.71 Moved unmapped window (100 kids)
 50.87 1.00 1.00 1.00 179.19 Moved unmapped window (200 kids)
 41.04 1.00 1.00 1.00  56.61 Move window via parent (4 kids)
 69.81 1.00 1.00 1.00 130.82 Move window via parent (16 kids)
 95.81 1.00 1.00 1.00 141.92 Move window via parent (25 kids)
 95.98 1.00 1.00 1.00 149.43 Move window via parent (50 kids)
 96.59 1.01 1.01 1.00 153.98 Move window via parent (75 kids)
 97.19 1.00 1.00 1.00 157.30 Move window via parent (100 kids)
 96.67 1.00 0.99 0.96 159.44 Move window via parent (200 kids)
 17.75 1.01 1.00 1.00  27.61 Resize window (4 kids)
 17.94 1.00 1.00 0.99  25.42 Resize window (16 kids)
 17.92 1.01 1.00 1.00  24.47 Resize window (25 kids)
 17.24 0.97 1.00 1.00  24.14 Resize window (50 kids)
 16.81 1.00 1.00 0.99  22.75 Resize window (75 kids)
 16.08 1.00 1.00 1.00  21.20 Resize window (100 kids)
 12.92 1.00 0.99 1.00  16.26 Resize window (200 kids)
 52.94 1.01 1.00 1.00 327.12 Resize unmapped window (4 kids)
 53.60 1.01 1.01 1.01 333.71 Resize unmapped window (16 kids)
 52.99 1.00 1.00 1.00 337.29 Resize unmapped window (25 kids)
 51.98 1.00 1.00 1.00 329.38 Resize unmapped window (50 kids)
 53.05 0.89 1.00 1.00 322.60 Resize unmapped window (75 kids)
 53.05 1.00 1.00 1.00 318.08 Resize unmapped window (100 kids)
 53.11 1.00 1.00 0.99 306.21 Resize unmapped window (200 kids)
 16.76 1.00 0.96 1.00  19.46 Circulate window (4 kids)
 17.24 1.00 1.00 0.97  16.24 Circulate window (16 kids)
 16.30 1.03 1.03 1.03  15.85 Circulate window (25 kids)
 13.45 1.00 1.00 1.00  14.90 Circulate window (50 kids)
 12.91 1.00 1.00 1.00  13.06 Circulate window (75 kids)
 11.30 0.98 1.00 1.00  11.03 Circulate window (100 kids)
  7.58 1.01 1.01 0.99   7.47 Circulate window (200 kids)
  1.01 1.01 0.98 1.00   0.95 Circulate Unmapped window (4 kids)
  1.07 1.07 1.01 1.07   1.02 Circulate Unmapped window (16 kids)
  1.04 1.09 1.06 1.05   0.97 Circulate Unmapped window (25 kids)
  1.04 1.23 1.20 1.18   1.05 Circulate Unmapped window (50 kids)
  1.18 1.53 1.19 1.45   1.24 Circulate Unmapped window (75 kids)
  1.08 1.02 1.01 1.74   1.01 Circulate Unmapped window (100 kids)
  1.01 1.12 0.98 0.91   0.97 Circulate Unmapped window (200 kids)
</screen>
</para>
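
<para>Threshold counts such as those quoted at the start of this section
can be derived from the ratio columns.  The following Python sketch shows
the counting on a handful of column 5 ("All optimizations") values copied
from the table above.  The function names are hypothetical, and the sample
is far too small to reproduce the totals quoted for the full test set.
</para>
<programlisting><![CDATA[
```python
def count_faster_than(ratios, thresholds=(100, 50, 10, 2)):
    """For each threshold, count the tests whose speedup ratio exceeds it."""
    return {t: sum(1 for r in ratios.values() if r > t) for t in thresholds}


def slower_than(ratios, limit=0.90):
    """Names of tests that are more than 10% slower (ratio below 0.90)."""
    return sorted(name for name, r in ratios.items() if r < limit)


# Column 5 values for a few rows of the table above.
SAMPLE = {
    "Dot": 4.13,
    "500x500 rectangle": 1.00,
    "Fill 1x1 equivalent triangle": 114.92,
    "Char in 80-char line (6x13)": 14.05,
    "Scroll 10x10 pixels": 124.18,
    "Change graphics context": 198.76,
}
```
]]></programlisting>
<para>The counts are cumulative: a test that is more than 100X faster is
also counted as more than 50X, 10X, and 2X faster, which matches the way
the totals are reported in the summary paragraph.
</para>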
</sect3>

<sect3>
<title>Profiling with OProfile</title>

<para>OProfile (available from http://oprofile.sourceforge.net/) is a
system-wide profiler for Linux systems that uses processor-level
counters to collect sampling data.  OProfile can provide information
that is similar to that provided by <command>gprof</command>, but without the
necessity of recompiling the program with special instrumentation (i.e.,
OProfile can collect statistical profiling information about optimized
programs).  A test harness was developed to collect OProfile data for
each <command>x11perf</command> test individually.
</para>

<para>Test runs were performed using the RETIRED_INSNS counter on the AMD
Athlon and the CPU_CLK_HALTED counter on the Intel Pentium III (with a
test configuration different from the one described above).  We have
examined OProfile output and have compared it with <command>gprof</command> output.
This investigation has not produced results that yield performance
increases in <command>x11perf</command> numbers.
</para>

</sect3>

<!--
<sect3>Retired Instructions

<p>The initial tests using OProfile were done using the RETIRED_INSNS
counter with DMX running on the dual-processor AMD Athlon machine - the
same test configuration that was described above and that was used for
other tests.  The RETIRED_INSNS counter counts retired instructions and
showed drawing, text, copying, and image tests to be dominated (&gt;
30%) by calls to Hash(), SecurityLookupIDByClass(),
SecurityLookupIDByType(), and StandardReadRequestFromClient().  Some of
these tests also executed significant instructions in
WaitForSomething().

<p>In contrast, the window tests executed significant
instructions in SecurityLookupIDByType(), Hash(),
StandardReadRequestFromClient(), but also executed significant
instructions in other routines, such as ConfigureWindow().  Some time
was spent looking at the Hash() function, but optimizations in this routine
did not lead to a dramatic increase in <tt/x11perf/ performance.
-->

<!--
<sect3>Clock Cycles

<p>Retired instructions can be misleading because Intel/AMD instructions
execute in variable amounts of time.  The OProfile tests were repeated
using the Intel CPU_CLK_HALTED counter with DMX running on the second
back-end machine.  Note that this is a different test configuration than
the one described above.  However, these tests show the amount of time
(as measured in CPU cycles) that is spent in each routine.  Because
<tt/x11perf/ was running on the first back-end machine and because
window optimizations were on, the load on the second back-end machine
was not significant.

<p>Using CPU_CLK_HALTED, DMX showed simple drawing
tests spending more than 10% of their time in
StandardReadRequestFromClient(), with significant time (&gt; 20% total)
spent in SecurityLookupIDByClass(), WaitForSomething(), and Dispatch().
For these tests, &lt; 5% of the time was spent in Hash(), which explains
why optimizing the Hash() routine did not impact <tt/x11perf/ results.

<p>The trapezoid, text, scrolling, copying, and image tests were
dominated by time in ProcFillPoly(), PanoramiXFillPoly(), dmxFillPolygon(),
SecurityLookupIDByClass(), SecurityLookupIDByType(), and
StandardReadRequestFromClient().  Hash() time was generally above 5% but
less than 10% of total time.
-->

<sect3>
<title>X Test Suite</title>

<para>The X Test Suite was run on the fully optimized DMX server using the
configuration described above.  The following failures were noted:
<screen>
XListPixmapFormats: Test 1              [1]
XChangeWindowAttributes: Test 32        [1]
XCreateWindow: Test 30                  [1]
XFreeColors: Test 4                     [3]
XCopyArea: Test 13, 17, 21, 25, 30      [2]
XCopyPlane: Test 11, 15, 27, 31         [2]
XSetFontPath: Test 4                    [1]
XChangeKeyboardControl: Test 9, 10      [1]

[1] Previously documented errors expected from the Xinerama
    implementation (see Phase I discussion).
[2] Newly noted errors that have been verified as expected
    behavior of the Xinerama implementation.
[3] Newly noted error that has been verified as a Xinerama
    implementation bug.
</screen>
</para>

</sect3>

</sect2>

<!-- ============================================================ -->
<sect2>
<title>Phase III</title>

<para>During the third phase of development, support was provided for the
following extensions: SHAPE, RENDER, XKEYBOARD, XInput.
</para>

<sect3>
<title>SHAPE</title>

<para>The SHAPE extension is supported.  Test applications (e.g., xeyes and
oclock) and window managers that make use of the SHAPE extension will
work as expected.
</para>
</sect3>

<sect3>
<title>RENDER</title>

<para>The RENDER extension is supported.  The version included in the DMX
CVS tree is version 0.2, and this version is fully supported by Xdmx.
Applications using only version 0.2 functions will work correctly;
however, some applications that make use of functions from later
versions do not properly check the extension's major/minor version
numbers.  These applications will fail with a BadImplementation error
when using post-version 0.2 functions.  This is expected behavior.  When
the DMX CVS tree is updated to include newer versions of RENDER, support
for these newer functions will be added to the DMX X server.
</para>
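
<para>The version check that such applications omit is a numeric
comparison of the (major, minor) pair reported by the server against the
version that introduced the request.  In a C client the reported pair
would typically come from XRenderQueryVersion().  The Python sketch below
shows only the comparison logic.  The request names and the minimum
version assigned to "newer-request" are hypothetical placeholders, not
taken from the RENDER specification.
</para>
<programlisting><![CDATA[
```python
def version_at_least(server, required):
    """Compare (major, minor) pairs numerically, major first."""
    return tuple(server) >= tuple(required)


def usable_requests(server, minimum_versions):
    """Names of the requests the server's reported version allows."""
    return sorted(name for name, needed in minimum_versions.items()
                  if version_at_least(server, needed))


# Hypothetical request names and minimum versions, for illustration only.
EXAMPLE_REQUESTS = {
    "baseline-request": (0, 2),
    "newer-request": (0, 5),
}
```
]]></programlisting>
<para>Comparing the numbers as integers rather than as strings matters:
a server reporting version 0.10 is newer than one reporting 0.5, which a
textual comparison would get wrong.
</para>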
</sect3>

<sect3>
<title>XKEYBOARD</title>

<para>The XKEYBOARD extension is supported.  If present on the back-end X
servers, the XKEYBOARD extension will be used to obtain information
about the type of the keyboard for initialization.  Otherwise, the
keyboard will be initialized using defaults.  Note that this departs
from older behavior: when Xdmx is compiled without XKEYBOARD support,
the map from the back-end X server will be preserved.  With XKEYBOARD
support, the map is not preserved because better information about the
keyboard, and better control of it, are available.
</para>
</sect3>

<sect3>
<title>XInput</title>

<para>The XInput extension is supported.  Any device can be used as a core
device and be used as an XInput extension device, with the exception of
core devices on the back-end servers.  This limitation is present
because cursor handling on the back-end requires that the back-end
cursor sometimes track the Xdmx core cursor, a behavior that is
incompatible with using the back-end pointer as a non-core device.
</para>

<para>Currently, back-end extension devices are not available as Xdmx
extension devices, but this limitation should be removed in the future.
</para>

<para>To demonstrate the XInput extension, and to provide more examples for
low-level input device driver writers, USB device drivers have been
written for mice (usb-mou), keyboards (usb-kbd), and
non-mouse/non-keyboard USB devices (usb-oth).  Please see the man page
for information on Linux kernel drivers that are required for using
these Xdmx drivers.
</para>
</sect3>
2779
2780<sect3>
2781<title>DPMS</title>
2782
2783<para>The DPMS extension is exported but does not do anything at this time.
2784</para>
2785
2786</sect3>
2787
2788<sect3>
2789<title>Other Extensions</title>
2790
2791<para>The LBX,
2792       SECURITY,
2793       XC-APPGROUP, and
2794       XFree86-Bigfont
2795extensions do not require any special Xdmx support and have been exported.
2796</para>
2797
2798<para>The
2799    BIG-REQUESTS,
2800    DEC-XTRAP,
2801    DOUBLE-BUFFER,
2802    Extended-Visual-Information,
2803    FontCache,
2804    GLX,
2805    MIT-SCREEN-SAVER,
2806    MIT-SHM,
2807    MIT-SUNDRY-NONSTANDARD,
2808    RECORD,
2809    SECURITY,
2810    SGI-GLX,
2811    SYNC,
2812    TOG-CUP,
2813    X-Resource,
2814    XC-MISC,
2815    XFree86-DGA,
2816    XFree86-DRI,
2817    XFree86-Misc,
2818    XFree86-VidModeExtension, and
2819    XVideo
2820extensions are <emphasis remap="it">not</emphasis> supported at this time, but will be evaluated
2821for inclusion in future DMX releases.  <emphasis remap="bf">See below for additional work
2822on extensions after Phase III.</emphasis>
2823</para>
2824</sect3>
2825</sect2>
2826
2827<sect2>
2828<title>Phase IV</title>
2829
2830<sect3>
2831<title>Moving to XFree86 4.3.0</title>
2832
2833<para>For Phase IV, the recent release of XFree86 4.3.0 (27 February 2003)
2834was merged onto the dmx.sourceforge.net CVS trunk and all work is
2835proceeding using this tree.
2836</para>
2837</sect3>
2838
2839<sect3>
<title>Extensions</title>
2841
2842<sect4>
2843<title>XC-MISC (supported)</title>
2844
2845<para>XC-MISC is used internally by the X library to recycle XIDs from the
2846X server.  This is important for long-running X server sessions.  Xdmx
2847supports this extension.  The X Test Suite passed and failed the exact
2848same tests before and after this extension was enabled.
2849<!-- Tested February/March 2003 -->
2850</para>
2851</sect4>
2852
2853<sect4>
2854<title>Extended-Visual-Information (supported)</title>
2855
2856<para>The Extended-Visual-Information extension provides a method for an X
2857client to obtain detailed visual information.  Xdmx supports this
2858extension.  It was tested using the <filename>hw/dmx/examples/evi</filename> example
2859program.  <emphasis remap="bf">Note that this extension is not Xinerama-aware</emphasis> -- it will
2860return visual information for each screen even though Xinerama is
2861causing the X server to export a single logical screen.
2862<!-- Tested March 2003 -->
2863</para>
2864</sect4>
2865
2866<sect4>
2867<title>RES (supported)</title>
2868
2869<para>The X-Resource extension provides a mechanism for a client to obtain
2870detailed information about the resources used by other clients.  This
2871extension was tested with the <filename>hw/dmx/examples/res</filename> program.  The
2872X Test Suite passed and failed the exact same tests before and after
2873this extension was enabled.
2874<!-- Tested March 2003 -->
2875</para>
2876</sect4>
2877
2878<sect4>
2879<title>BIG-REQUESTS (supported)</title>
2880
2881<para>This extension enables the X11 protocol to handle requests longer
2882than 262140 bytes.  The X Test Suite passed and failed the exact same
2883tests before and after this extension was enabled.
2884<!-- Tested March 2003 -->
2885</para>
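<para>The 262140-byte limit follows from the core protocol's request
header, which stores the request length in a 16-bit field counted in
4-byte units: 65535 * 4 = 262140.  The following C fragment is a small
sketch of that arithmetic; the names
<literal remap="tt">CORE_MAX_REQUEST_BYTES</literal> and
<function>needs_big_requests</function> are illustrative only, and a real server
may advertise a smaller maximum request length at connection setup:
<programlisting><![CDATA[
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The core protocol stores a request's length in a 16-bit field that
 * counts 4-byte units, so the largest core request is 65535 * 4 bytes. */
#define CORE_MAX_REQUEST_BYTES ((size_t)UINT16_MAX * 4)

/* Returns true when a request of the given size cannot be expressed
 * with the 16-bit core length field. */
bool
needs_big_requests(size_t request_bytes)
{
    return request_bytes > CORE_MAX_REQUEST_BYTES;
}
]]></programlisting>
</para>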
2886</sect4>
2887
2888<sect4>
2889<title>XSYNC (supported)</title>
2890
2891<para>This extension provides facilities for two different X clients to
synchronize their requests.  This extension was minimally tested with
<command>xdpyinfo</command>, and the X Test Suite passed and failed the exact same
2894tests before and after this extension was enabled.
2895<!-- Tested March 2003 -->
2896</para>
2897</sect4>
2898
2899<sect4>
2900<title>XTEST, RECORD, DEC-XTRAP (supported) and XTestExtension1 (not supported)</title>
2901
<para>The XTEST and RECORD extensions were developed by the X Consortium for
2903use in the X Test Suite and are supported as a standard in the X11R6
2904tree.  They are also supported in Xdmx.  When X Test Suite tests that
2905make use of the XTEST extension are run, Xdmx passes and fails exactly
2906the same tests as does a standard XFree86 X server.  When the
2907<literal remap="tt">rcrdtest</literal> test (a part of the X Test Suite that verifies the RECORD
2908extension) is run, Xdmx passes and fails exactly the same tests as does
2909a standard XFree86 X server. <!-- Tested February/March 2003 -->
2910</para>
2911
<para>There are two older XTEST-like extensions: DEC-XTRAP and
XTestExtension1.  The XTestExtension1 extension was developed for the
X Testing Consortium, for use with a test suite that eventually
became (part of?) the X Test Suite.  Unlike XTEST, which only allows
events to be sent to the server, the XTestExtension1 extension also
allowed events to be recorded (similar to the RECORD extension).  The
DEC-XTRAP extension was developed by the Digital Equipment
Corporation.
2920</para>
2921
2922<para>The DEC-XTRAP extension is available from Xdmx and has been tested
2923with the <command>xtrap*</command> tools which are distributed as standard X11R6
2924clients. <!-- Tested March 2003 -->
2925</para>
2926
<para>The XTestExtension1 extension is <emphasis>not</emphasis> supported because it does not appear
2928to be used by any modern X clients (the few that support it also support
2929XTEST) and because there are no good methods available for testing that
2930it functions correctly (unlike XTEST and DEC-XTRAP, the code for
2931XTestExtension1 is not part of the standard X server source tree, so
2932additional testing is important). <!-- Tested March 2003 -->
2933</para>
2934
2935<para>Most of these extensions are documented in the X11R6 source tree.
Further, several original papers exist that this author was unable to
locate; for completeness and historical interest, citations are
provided:
2939<variablelist>
2940<varlistentry>
2941<term>XRECORD</term>
2942<listitem>
<para>Martha Zimet. Extending X For Recording.  8th Annual X
Technical Conference, Boston, MA, January 24-26, 1994.
2945</para></listitem></varlistentry>
2946<varlistentry>
2947<term>DEC-XTRAP</term>
2948<listitem>
2949<para>Dick Annicchiarico, Robert Chesler, Alan Jamison. XTrap
2950Architecture. Digital Equipment Corporation, July 1991.
2951</para></listitem></varlistentry>
2952<varlistentry>
2953<term>XTestExtension1</term>
2954<listitem>
2955<para>Larry Woestman. X11 Input Synthesis Extension
2956Proposal. Hewlett Packard, November 1991.
2957</para></listitem></varlistentry>
2958</variablelist>
2959</para>
2960</sect4>
2961
2962<sect4>
2963<title>MIT-MISC (not supported)</title>
2964
2965<para>The MIT-MISC extension is used to control a bug-compatibility flag
2966that provides compatibility with xterm programs from X11R1 and X11R2.
2967There does not appear to be a single client available that makes use of
this extension and there is no way to verify that it works correctly.
2969The Xdmx server does <emphasis>not</emphasis> support MIT-MISC.
2970</para>
2971</sect4>
2972
2973<sect4>
2974<title>SCREENSAVER (not supported)</title>
2975
<para>This extension provides special support for the X screen saver.  It
was tested with <command>beforelight</command>, which appears to be the only client
that works with it.  When Xinerama was not active, <command>beforelight</command>
behaved as expected.  However, when Xinerama was active,
<command>beforelight</command> did not behave as expected.  Further, when this
extension is not active, <command>xscreensaver</command> (a widely-used X screen
saver program) did not behave as expected.  Because this extension is
not Xinerama-aware and few clients use it with the expected results, we
have left it disabled at this time.
2985</para>
2986</sect4>
2987
2988<sect4>
2989<title>GLX (supported)</title>
2990
2991<para>The GLX extension provides OpenGL and GLX windowing support.  In
2992Xdmx, the extension is called glxProxy, and it is Xinerama aware.  It
2993works by either feeding requests forward through Xdmx to each of the
2994back-end servers or handling them locally.  All rendering requests are
2995handled on the back-end X servers.  This code was donated to the DMX
2996project by SGI.  For the X Test Suite results comparison, see below.
2997</para>
2998</sect4>
2999
3000<sect4>
3001<title>RENDER (supported)</title>
3002
3003<para>The X Rendering Extension (RENDER) provides support for digital image
composition.  Geometric and text rendering are supported.  RENDER is
only partially Xinerama-aware: text and the most basic compositing
operator are handled; however, its higher-level primitives (triangles,
triangle strips, and triangle fans) are not yet Xinerama-aware.  The RENDER
3008extension is still under development, and is currently at version 0.8.
3009Additional support will be required in DMX as more primitives and/or
3010requests are added to the extension.
3011</para>
3012
3013<para>There is currently no test suite for the X Rendering Extension;
3014however, there has been discussion of developing a test suite as the
3015extension matures.  When that test suite becomes available, additional
3016testing can be performed with Xdmx.  The X Test Suite passed and failed
3017the exact same tests before and after this extension was enabled.
3018</para>
3019</sect4>
3020
3021<sect4>
3022<title>Summary</title>
3023
3024<!-- WARNING: this list is duplicated in the "Common X extension
3025support" section -->
3026<para>To summarize, the following extensions are currently supported:
3027    BIG-REQUESTS,
3028    DEC-XTRAP,
3029    DMX,
3030    DPMS,
3031    Extended-Visual-Information,
3032    GLX,
3033    LBX,
3034    RECORD,
3035    RENDER,
3036    SECURITY,
3037    SHAPE,
3038    SYNC,
3039    X-Resource,
3040    XC-APPGROUP,
3041    XC-MISC,
3042    XFree86-Bigfont,
3043    XINERAMA,
3044    XInputExtension,
3045    XKEYBOARD, and
3046    XTEST.
3047</para>
3048
3049<para>The following extensions are <emphasis>not</emphasis> supported at this time:
3050    DOUBLE-BUFFER,
3051    FontCache,
3052    MIT-SCREEN-SAVER,
3053    MIT-SHM,
3054    MIT-SUNDRY-NONSTANDARD,
3055    TOG-CUP,
3056    XFree86-DGA,
3057    XFree86-Misc,
3058    XFree86-VidModeExtension,
    XTestExtension1, and
3060    XVideo.
3061</para>
3062</sect4>
3063</sect3>
3064
3065<sect3>
3066<title>Additional Testing with the X Test Suite</title>
3067
3068<sect4>
3069<title>XFree86 without XTEST</title>
3070
3071<para>After the release of XFree86 4.3.0, we retested the XFree86 X server
3072with and without using the XTEST extension.  When the XTEST extension
3073was <emphasis>not</emphasis> used for testing, the XFree86 4.3.0 server running on our
3074usual test system with a Radeon VE card reported unexpected failures in
3075the following tests:
3076<literallayout>
3077XListPixmapFormats: Test 1
3078XChangeKeyboardControl: Tests 9, 10
3079XGetDefault: Test 5
3080XRebindKeysym: Test 1
3081</literallayout>
3082</para>
3083</sect4>
3084
3085<sect4>
3086<title>XFree86 with XTEST</title>
3087
3088<para>When using the XTEST extension, the XFree86 4.3.0 server reported the
3089following errors:
3090<literallayout>
3091XListPixmapFormats: Test 1
3092XChangeKeyboardControl: Tests 9, 10
3093XGetDefault: Test 5
3094XRebindKeysym: Test 1
3095
3096XAllowEvents: Tests 20, 21, 24
3097XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3098XGrabKey: Test 8
3099XSetPointerMapping: Test 3
3100XUngrabButton: Test 4
3101</literallayout>
3102</para>
3103
3104<para>While these errors may be important, they will probably be fixed
3105eventually in the XFree86 source tree.  We are particularly interested
3106in demonstrating that the Xdmx server does not introduce additional
3107failures that are not known Xinerama failures.
3108</para>
3109</sect4>
3110
3111<sect4>
3112<title>Xdmx with XTEST, without Xinerama, without GLX</title>
3113
3114<para>Without Xinerama, but using the XTEST extension, the following errors
were reported from Xdmx (note that these are the same as for the XFree86
4.3.0 server, except that XGetDefault no longer fails):
3117<literallayout>
3118XListPixmapFormats: Test 1
3119XChangeKeyboardControl: Tests 9, 10
3120XRebindKeysym: Test 1
3121
XAllowEvents: Tests 20, 21, 24
3123XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3124XGrabKey: Test 8
3125XSetPointerMapping: Test 3
3126XUngrabButton: Test 4
3127</literallayout>
3128</para>
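<para>The comparison described above amounts to a set difference between
two failure lists.  A minimal sketch in C follows; the entries are
transcribed from the lists in this section, while the array and function
names (<literal remap="tt">xfree86_failures</literal>, <literal remap="tt">xdmx_failures</literal>,
<function>contains</function>, and <function>count_missing</function>) are illustrative and
do not come from the X Test Suite or the Xdmx source.  Entries are
compared as whole lines, so a function that fails a different set of
tests would count as a different entry:
<programlisting><![CDATA[
#include <stddef.h>
#include <string.h>

/* Failures reported by the XFree86 4.3.0 server when using XTEST. */
const char *const xfree86_failures[] = {
    "XListPixmapFormats: Test 1",
    "XChangeKeyboardControl: Tests 9, 10",
    "XGetDefault: Test 5",
    "XRebindKeysym: Test 1",
    "XAllowEvents: Tests 20, 21, 24",
    "XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25",
    "XGrabKey: Test 8",
    "XSetPointerMapping: Test 3",
    "XUngrabButton: Test 4",
};

/* Failures reported by Xdmx with XTEST, without Xinerama or GLX. */
const char *const xdmx_failures[] = {
    "XListPixmapFormats: Test 1",
    "XChangeKeyboardControl: Tests 9, 10",
    "XRebindKeysym: Test 1",
    "XAllowEvents: Tests 20, 21, 24",
    "XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25",
    "XGrabKey: Test 8",
    "XSetPointerMapping: Test 3",
    "XUngrabButton: Test 4",
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Returns 1 if entry appears in list, 0 otherwise. */
int
contains(const char *const *list, size_t n, const char *entry)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], entry) == 0)
            return 1;
    return 0;
}

/* Counts the entries of a that do not appear in b. */
size_t
count_missing(const char *const *a, size_t na,
              const char *const *b, size_t nb)
{
    size_t missing = 0;

    for (size_t i = 0; i < na; i++)
        if (!contains(b, nb, a[i]))
            missing++;
    return missing;
}
]]></programlisting>
Counting the Xdmx entries that are missing from the XFree86 list gives
0 (no new failures), and counting in the other direction gives 1 (the
XGetDefault entry).
</para>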
3129</sect4>
3130
3131<sect4>
3132<title>Xdmx with XTEST, with Xinerama, without GLX</title>
3133
3134<para>With Xinerama, using the XTEST extension, the following errors
3135were reported from Xdmx:
3136<literallayout>
3137XListPixmapFormats: Test 1
3138XChangeKeyboardControl: Tests 9, 10
3139XRebindKeysym: Test 1
3140
3141XAllowEvents: Tests 20, 21, 24
3142XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3143XGrabKey: Test 8
3144XSetPointerMapping: Test 3
3145XUngrabButton: Test 4
3146
3147XCopyPlane: Tests 13, 22, 31 (well-known XTEST/Xinerama interaction issue)
3148XDrawLine: Test 67
3149XDrawLines: Test 91
3150XDrawSegments: Test 68
3151</literallayout>
Note that the first two sets of errors are the same as for the XFree86
4.3.0 server, and that the XCopyPlane errors are a well-known result of
an XTEST/Xinerama interaction when the request crosses a screen
boundary.  The XDraw* errors are resolved when the tests are run
individually and do not cross a screen boundary.  We will investigate
these errors further to determine their cause.
3158</para>
3159</sect4>
3160
3161<sect4>
3162<title>Xdmx with XTEST, with Xinerama, with GLX</title>
3163
3164<para>With GLX enabled, using the XTEST extension, the following errors
were reported from Xdmx (these results are from early in Phase IV
development, but were confirmed with a late Phase IV snapshot):
3167<literallayout>
3168XListPixmapFormats: Test 1
3169XChangeKeyboardControl: Tests 9, 10
3170XRebindKeysym: Test 1
3171
3172XAllowEvents: Tests 20, 21, 24
3173XGrabButton: Tests 5, 9-12, 14, 16, 19, 21-25
3174XGrabKey: Test 8
3175XSetPointerMapping: Test 3
3176XUngrabButton: Test 4
3177
3178XClearArea: Test 8
3179XCopyArea: Tests 4, 5, 11, 14, 17, 23, 25, 27, 30
3180XCopyPlane: Tests 6, 7, 10, 19, 22, 31
3181XDrawArcs: Tests 89, 100, 102
3182XDrawLine: Test 67
3183XDrawSegments: Test 68
3184</literallayout>
Note that the first two sets of errors are the same as for the XFree86
4.3.0 server, and that the third set has different failures than when
Xdmx does not include GLX support.  Since the GLX extension adds new
visuals to support GLX's visual configs and the X Test Suite runs tests
over the entire set of visuals, additional rendering tests were run and
presumably more of them crossed a screen boundary.  This conclusion is
supported by the fact that nearly all of the rendering errors reported
are resolved when the tests are run individually and do not cross a
screen boundary.
3194</para>
3195
3196<para>Further, when hardware rendering is disabled on the back-end displays,
3197many of the errors in the third set are eliminated, leaving only:
3198<literallayout>
3199XClearArea: Test 8
XCopyArea: Tests 4, 5, 11, 14, 17, 23, 25, 27, 30
XCopyPlane: Tests 6, 7, 10, 19, 22, 31
3202</literallayout>
3203</para>
3204</sect4>
3205
3206<sect4>
3207<title>Conclusion</title>
3208
3209<para>We conclude that all of the X Test Suite errors reported for Xdmx are
3210the result of errors in the back-end X server or the Xinerama
3211implementation.  Further, all of these errors that can be reasonably
3212fixed at the Xdmx layer have been.  (Where appropriate, we have
3213submitted patches to the XFree86 and Xinerama upstream maintainers.)
3214</para>
3215</sect4>
3216</sect3>
3217
3218<sect3>
3219<title>Dynamic Reconfiguration</title>
3220
3221<para>During this development phase, dynamic reconfiguration support was
3222added to DMX.  This support allows an application to change the position
3223and offset of a back-end server's screen.  For example, if the
3224application would like to shift a screen slightly to the left, it could
3225query Xdmx for the screen's &lt;x,y&gt; position and then dynamically
3226reconfigure that screen to be at position &lt;x+10,y&gt;.  When a screen
3227is dynamically reconfigured, input handling and a screen's root window
3228dimensions are adjusted as needed.  These adjustments are transparent to
3229the user.
3230</para>
3231
3232<sect4>
3233<title>Dynamic reconfiguration extension</title>
3234
3235<para>The application interface to DMX's dynamic reconfiguration is through
3236a function in the DMX extension library:
3237<programlisting>
3238Bool DMXReconfigureScreen(Display *dpy, int screen, int x, int y)
3239</programlisting>
where <parameter>dpy</parameter> is the DMX server's display, <parameter>screen</parameter> is the number of the
screen to be reconfigured, and <parameter>x</parameter> and <parameter>y</parameter> are the new coordinates of
the upper left-hand corner of that screen.
3243</para>
3244
<para>The coordinates are not limited other than as required by the X
protocol, which limits all coordinates to signed 16-bit numbers.  In
addition, all coordinates within a screen must also be legal values.
Therefore, it is illegal to set a screen's upper left-hand coordinates
such that the right or bottom edge of the screen is greater than 32,767.
3250</para>
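<para>The constraint above can be checked before a reconfiguration
request is made.  The following is a minimal sketch in C, not code from
the Xdmx source: the function name
<function>screen_position_is_legal</function> is hypothetical, and the sketch
assumes that the right and bottom edges are the last pixel column and
row of the screen (x + width - 1 and y + height - 1):
<programlisting><![CDATA[
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a screen of the given size, placed with its upper
 * left-hand corner at (x, y), keeps every coordinate within the signed
 * 16-bit range used by the X protocol. */
bool
screen_position_is_legal(long x, long y, long width, long height)
{
    if (width <= 0 || height <= 0)
        return false;
    if (x < INT16_MIN || x > INT16_MAX || y < INT16_MIN || y > INT16_MAX)
        return false;
    /* Compare against the remaining headroom to avoid overflow. */
    return width - 1 <= INT16_MAX - x && height - 1 <= INT16_MAX - y;
}
]]></programlisting>
Under these assumptions, a 1280x1024 screen may be placed at
x = 31488, where its last column is 32,767, but not at x = 31489.
</para>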
3251</sect4>
3252
3253<sect4>
3254<title>Bounding box</title>
3255
3256<para>When the Xdmx server is started, a bounding box is calculated from
3257the screens' layout given either on the command line or in the
3258configuration file.  This bounding box is currently fixed for the
3259lifetime of the Xdmx server.
3260</para>
3261
3262<para>While it is possible to move a screen outside of the bounding box, it
3263is currently not possible to change the dimensions of the bounding box.
3264For example, it is possible to specify coordinates of &lt;-100,-100&gt;
3265for the upper, left-hand corner of the bounding box, which was
3266previously at coordinates &lt;0,0&gt;.  As expected, the screen is moved
3267down and to the right; however, since the bounding box is fixed, the
3268left side and upper portions of the screen exposed by the
3269reconfiguration are no longer accessible on that screen.  Those
3270inaccessible regions are filled with black.
3271</para>
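<para>The effect of a fixed bounding box can be modeled with simple
rectangle arithmetic.  The following C sketch is only a model of the
behavior described above, not the Xdmx implementation; the
<literal remap="tt">Rect</literal> type and the functions <function>bounding_box</function> and
<function>clip_to_box</function> are hypothetical names introduced for
illustration:
<programlisting><![CDATA[
#include <stdbool.h>

/* A rectangle; (x1, y1) is inclusive and (x2, y2) is exclusive. */
typedef struct {
    int x1, y1, x2, y2;
} Rect;

/* Returns the smallest rectangle containing all n screens (n >= 1). */
Rect
bounding_box(const Rect *screens, int n)
{
    Rect box = screens[0];

    for (int i = 1; i < n; i++) {
        if (screens[i].x1 < box.x1) box.x1 = screens[i].x1;
        if (screens[i].y1 < box.y1) box.y1 = screens[i].y1;
        if (screens[i].x2 > box.x2) box.x2 = screens[i].x2;
        if (screens[i].y2 > box.y2) box.y2 = screens[i].y2;
    }
    return box;
}

/* Stores in *visible the part of screen that lies inside box.  Returns
 * false when the screen lies entirely outside the box. */
bool
clip_to_box(Rect screen, Rect box, Rect *visible)
{
    visible->x1 = screen.x1 > box.x1 ? screen.x1 : box.x1;
    visible->y1 = screen.y1 > box.y1 ? screen.y1 : box.y1;
    visible->x2 = screen.x2 < box.x2 ? screen.x2 : box.x2;
    visible->y2 = screen.y2 < box.y2 ? screen.y2 : box.y2;
    return visible->x1 < visible->x2 && visible->y1 < visible->y2;
}
]]></programlisting>
In this model, if two 1280x1024 screens side by side give a box from
(0,0) to (2560,1024), then moving the first screen to (-100,-100)
leaves only the region from (0,0) to (1180,924) inside the box.  The
leftmost 100 columns and topmost 100 rows of that screen fall outside
the box, which corresponds to the inaccessible regions described above.
</para>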
3272
3273<para>This fixed bounding box limitation will be addressed in a future
3274development phase.
3275</para>
3276</sect4>
3277
3278<sect4>
3279<title>Sample applications</title>
3280
3281<para>An example of where this extension is useful is in setting up a video
3282wall.  It is not always possible to get everything perfectly aligned,
3283and sometimes the positions are changed (e.g., someone might bump into a
3284projector).  Instead of physically moving projectors or monitors, it is
3285now possible to adjust the positions of the back-end server's screens
3286using the dynamic reconfiguration support in DMX.
3287</para>
3288
3289<para>Other applications, such as automatic setup and calibration tools,
3290can make use of dynamic reconfiguration to correct for projector
3291alignment problems, as long as the projectors are still arranged
3292rectilinearly.  Horizontal and vertical keystone correction could be
3293applied to projectors to correct for non-rectilinear alignment problems;
3294however, this must be done external to Xdmx.
3295</para>
3296
3297<para>A sample test program is included in the DMX server's examples
3298directory to demonstrate the interface and how an application might use
3299dynamic reconfiguration.  See <filename>dmxreconfig.c</filename> for details.
3300</para>
3301</sect4>
3302
3303<sect4>
3304<title>Additional notes</title>
3305
3306<para>In the original development plan, Phase IV was primarily devoted to
3307adding OpenGL support to DMX; however, SGI became interested in the DMX
3308project and developed code to support OpenGL/GLX.  This code was later
3309donated to the DMX project and integrated into the DMX code base, which
3310freed the DMX developers to concentrate on dynamic reconfiguration (as
3311described above).
3312</para>
3313</sect4>
3314</sect3>
3315
3316<sect3>
3317<title>Doxygen documentation</title>
3318
3319<para>Doxygen is an open-source (GPL) documentation system for generating
3320browseable documentation from stylized comments in the source code.  We
3321have placed all of the Xdmx server and DMX protocol source code files
3322under Doxygen so that comprehensive documentation for the Xdmx source
3323code is available in an easily browseable format.
3324</para>
3325</sect3>
3326
3327<sect3>
3328<title>Valgrind</title>
3329
3330<para>Valgrind, an open-source (GPL) memory debugger for Linux, was used to
3331search for memory management errors.  Several memory leaks were detected
3332and repaired.  The following errors were not addressed:
3333<orderedlist>
3334    <listitem><para>
3335        When the X11 transport layer sends a reply to the client, only
3336        those fields that are required by the protocol are filled in --
3337        unused fields are left as uninitialized memory and are therefore
3338        noted by valgrind.  These instances are not errors and were not
3339        repaired.
3340    </para></listitem>
3341    <listitem><para>
3342        At each server generation, glxInitVisuals allocates memory that
        is never freed.  The amount of memory lost each generation is
3344        approximately equal to 128 bytes for each back-end visual.
3345        Because the code involved is automatically generated, this bug
3346        has not been fixed and will be referred to SGI.
3347    </para></listitem>
3348    <listitem><para>
3349        At each server generation, dmxRealizeFont calls XLoadQueryFont,
3350        which allocates a font structure that is not freed.
3351        dmxUnrealizeFont can free the font structure for the first
3352        screen, but cannot free it for the other screens since they are
3353        already closed by the time dmxUnrealizeFont could free them.
3354        The amount of memory lost each generation is approximately equal
3355        to 80 bytes per font per back-end.  When this bug is fixed in
        the X server's device-independent (dix) code, DMX will be
3357        able to properly free the memory allocated by XLoadQueryFont.
3358    </para></listitem>
3359</orderedlist>
3360</para>
3361</sect3>
3362
3363<sect3>
3364<title>RATS</title>
3365
3366<para>RATS (Rough Auditing Tool for Security) is an open-source (GPL)
3367security analysis tool that scans source code for common
3368security-related programming errors (e.g., buffer overflows and TOCTOU
3369races).  RATS was used to audit all of the code in the hw/dmx directory
3370and all "High" notations were checked manually.  The code was either
3371re-written to eliminate the warning, or a comment containing "RATS" was
3372inserted on the line to indicate that a human had checked the code.
3373Unrepaired warnings are as follows:
3374<orderedlist>
3375    <listitem><para>
3376        Fixed-size buffers are used in many areas, but code has been
3377        added to protect against buffer overflows (e.g., snprintf).
3378        The only instances that have not yet been fixed are in
3379        config/xdmxconfig.c (which is not part of the Xdmx server) and
3380        input/usb-common.c.
3381    </para></listitem>
3382    <listitem><para>
3383        vprintf and vfprintf are used in the logging routines.  In
3384        general, all uses of these functions (e.g., dmxLog) provide a
3385        constant format string from a trusted source, so the use is
3386        relatively benign.
3387    </para></listitem>
3388    <listitem><para>
3389        glxProxy/glxscreens.c uses getenv and strcat.  The use of these
3390        functions is safe and will remain safe as long as
        ExtensionsString is longer than GLXServerExtensions (ensuring
        this may not be obvious to the casual programmer, but this is in
3393        automatically generated code, so we hope that the generator
3394        enforces this constraint).
3395    </para></listitem>
3396</orderedlist>
3397
3398</para>
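<para>The kinds of protection mentioned above (bounded formatting and
length-checked concatenation) can be sketched as follows.  This is an
illustrative C example under stated assumptions, not code taken from
hw/dmx: the function names <function>format_bounded</function> and
<function>append_bounded</function> are hypothetical, and callers are assumed to
pass a constant format string from a trusted source:
<programlisting><![CDATA[
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Formats into buf without writing past size bytes.  Returns 0 on
 * success and -1 if the output was truncated or an error occurred. */
int
format_bounded(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Appends src to the NUL-terminated string in dst only when the result,
 * including its terminator, fits in size bytes.  Returns 0 on success
 * and -1 (leaving dst unchanged) otherwise. */
int
append_bounded(char *dst, size_t size, const char *src)
{
    size_t used = strlen(dst);
    size_t need = strlen(src);

    if (used >= size || need >= size - used)
        return -1;
    memcpy(dst + used, src, need + 1);
    return 0;
}
]]></programlisting>
Checking the return value lets the caller report truncation instead of
silently overrunning or shortening a fixed-size buffer.
</para>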
3399
3400</sect3>
3401
3402</sect2>
3403
3404</sect1>
3405
3406</appendix>
3407
3408  </article>
3409
3410  <!-- Local Variables: -->
3411  <!-- fill-column: 72  -->
3412  <!-- End:             -->
3413