Apertis uses Wayland as the protocol for its graphics stack, which provides a very mature and much simpler architecture than X11. The architecture is explained in Wayland compositors; however, since Wayland is just a core protocol, with some extensions defined in Wayland Protocols, several different implementations can be found.

Currently Apertis uses agl-compositor, a libweston-based compositor from the AGL (Automotive Grade Linux) project targeted at the automotive industry, while older releases ship mildenhall-compositor, which is based on Mutter.

New compositor: agl-compositor

Since the main use case for Apertis is the same as AGL’s, it was decided to align efforts and contribute to the development of agl-compositor.

The main goal of this new compositor is to simplify the graphical stack and the way clients interact with the compositor, while avoiding unmaintained pieces of software like Wayland IVI. Since XDG-shell is the standard protocol used on the desktop to manage windows, agl-compositor relies on it and adds its own private extension to support its unique use cases.

Support for XDG-shell allows standard toolkits, such as GTK, to be used to develop graphical applications that can be shipped using Flatpak.


Since agl-compositor is conceived to serve as a compositor for embedded devices and not for desktops, several simplifications have been made. As a result, applications designed to run on it must follow some guidelines:

  • Use the whole screen: the compositor checks that the buffers provided by applications match the screen size; if they don’t, the applications will trigger an error.
  • Use only one main surface: the compositor will manage only one main surface per application.
  • Provide a valid application_id: the compositor uses the application_id to manage the different applications, so it requires that a valid one is set before the first surface commit.
  • Popups: instead of using xdg_popup, a compositor-specific protocol is used to implement them.

All these restrictions are aligned with the typical use cases of non-desktop HMIs, like industrial devices and automotive setups.
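The expected setup sequence can be sketched in pseudocode following standard xdg-shell naming (this is an illustrative outline, not a complete client; variables are assumed to have been obtained from the registry, and error handling is omitted):

```
/* Pseudocode sketch of the ordering agl-compositor expects. */
surface  = wl_compositor_create_surface(compositor);
xdg_surf = xdg_wm_base_get_xdg_surface(wm_base, surface);
toplevel = xdg_surface_get_toplevel(xdg_surf);

/* 1. A valid app_id must be set BEFORE the first commit. */
xdg_toplevel_set_app_id(toplevel, "org.example.myapp");

/* 2. Initial commit with no buffer attached; wait for configure. */
wl_surface_commit(surface);
/* ...handle xdg_surface.configure and acknowledge it... */

/* 3. Attach a buffer matching the full screen size; a buffer of
 *    any other size triggers a compositor error. */
wl_surface_attach(surface, fullscreen_buffer, 0, 0);
wl_surface_commit(surface);
```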


Apertis uses a fork of Maynard as the shell to manage windows in agl-compositor. Maynard uses GTK 3 for rendering and implements the private extensions that allow it to interact with the compositor, configure the different surface roles and manage the running applications.

It also supports the Freedesktop Desktop Entry Specification, which allows it to list the currently installed applications, both native and Flatpak, and to keep track of their life cycle.
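For reference, a minimal desktop entry as defined by that specification might look like the following (the application name, executable and icon are illustrative):

```
[Desktop Entry]
Type=Application
Name=Example Browser
Exec=example-browser %U
Icon=example-browser
Categories=Network;WebBrowser;
```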

Previous compositor: mildenhall-compositor

During its initial Wayland support, Apertis used mildenhall-compositor, which is based on Mutter, a compositor built with Clutter. Mutter gained support for Wayland from 2013 onwards as part of the GNOME project’s roadmap for moving to Wayland. While Mutter was originally created as an X11 window manager for desktop use cases, Weston was created to be the reference compositor for the Wayland protocols and is better suited to the non-desktop use cases that are the focus for Apertis. In its role as reference compositor, Weston also has an advantage in terms of hardware support: vendors are much more likely to test against Weston and add dedicated enhancements for it in their BSPs, while it is unlikely that they would do the same for Mutter.

Mutter also supports XDG-shell for implementing management of client windows (such as minimisation, fullscreening and maximisation). There are a few parts of the XDG-shell protocol which are not needed for touchscreen IVI systems, such as xdg_popup; these could not be eliminated from the protocol, and stub implementations are present in the Apertis compositor.

Wayland architecture overview

Overview diagram of how Wayland, its clients, and the Linux kernel graphics infrastructure (DRM/DRI) interact.

Wayland is a protocol specifying communication between a display server (or compositor) and its clients, which are individual applications. Unlike X11, it does not specify a set of rendering primitives, or even a canonical protocol for transferring pixel data between clients and compositor. The most important primitives which Wayland defines are surfaces and buffers. A surface is an object representing a rectangular area on the screen, defined by a location, size and pixel content; a buffer provides pixel data when attached to a surface. The interactions between surfaces and buffers allow for double-buffering and glitch-free updates of windows, but are not directly relevant to how composition works, so this document will simply use ‘surface’ to refer to the combination of the two.

In the Wayland architecture, the window manager and compositor are combined (and together known as simply the ‘compositor’), performing these two main tasks:

  1. Receive pixel data from clients and composite it to form frames which are output to the screen.
  2. Handle user input and direct it to the appropriate client to be handled.

As well as performing composition of client windows, the compositor performs window management from within the same process. Weston is the reference, and most developed, implementation of a Wayland compositor. Mutter is another implementation.

Compositor terminology

Although not necessarily the case for all compositors, Weston contains several components, which are ‘plugged together’ to support each platform.

  • Shell: Provides the shell UI, such as a task bar and clock. Different shells can be plugged in to provide different UI experiences.
  • Renderer: Implements a specific method of rendering surfaces. For example, Weston has a software renderer (pixman), a GLES renderer (DRI), and hardware-specific renderers (e.g. RPI for the DispmanX hardware on Raspberry Pi).
  • Backend: Implements the platform-specific parts of the compositor, instantiating one or more renderers, and handling all composition of surfaces. For example, if a platform can use hardware-specific APIs, it instantiates the appropriate hardware-specific renderer; if it also needs GLES support, it instantiates the DRI renderer. It can then pass surfaces to the most appropriate renderer for the current frame.

Other compositors are typically arranged similarly; for example, Mutter has a plugin system which can be used to implement different shells; GNOME Shell is one example.

Some more general terminology:

  • Display controller: Hardware which processes pixels and overlays (see Compositing) and drives the screen.
  • GPU: Hardware which implements 3D acceleration and the GLES pipeline. Its output is fed into the display controller.

Client-side rendering

In the Wayland architecture, all rendering of client UIs is performed by client code, typically by the graphics toolkit the client uses. This is no different from modern usage of X11.

The graphics toolkit may use whatever method it wishes to render UI elements: software rendering on the CPU, or hardware rendering using GLES. All Wayland requires is for the resulting pixels to be sent to the compositor for each frame and window the client renders. Pixel data may be transported in several ways, depending on how it was rendered, and what is mutually supported by the clients and compositor:

  • Shared memory buffers containing actual pixel data. These are a fallback mechanism, supported when no other mechanism is available.
  • GPU buffer sharing (DRM/DRI). Clients render windows directly on the GPU, and the resulting pixel data remains in GPU memory and a handle to it is passed to the compositor. This prevents unnecessary and expensive copying of pixel data.


Compositing

Once the compositor has all the pixel data (or handles to GPU buffers containing it), it can composite a frame. As with client-side rendering, this can be done in several ways:

  • Software rendering. CPU-intensive and used as a fallback. This also entails pulling pixel data out of GPU memory, which is expensive.
  • Full GPU rendering using GLES. This takes the pixel data and composites it on the GPU, potentially applying shaders and 3D transformations if required for animations.
  • Hardware-specific APIs on the display controller. These are generally 2D composition APIs which are less resource intensive than full 3D computation, but still keep processing on the display controller rather than the CPU, and do not require extra copies of the pixel data.

Different compositors use different approaches. As Mutter is tied to Clutter, it is constrained to using GLES for all rendering. Conversely, Weston implements several different renderers, so can choose the most efficient method of rendering depending on the requirements of the current frame (for example, if an animation is underway, or if any effects are being applied). For typical UIs, this will mean using hardware-specific APIs to composite the pixel data in 2D, as 3D effects are rarely needed, even when performing simple animations such as slides and fades. If full GLES 3D support is needed, Weston can choose to use the full GPU capabilities instead.

This is supported in Weston by the use of planes (known on some hardware as ‘overlays’). Planes are collections of surfaces, and each plane maps to a different overlay in hardware. Before rendering each frame, the compositor backend can choose which surfaces to put on each plane, resulting in them being rendered differently by the hardware. There was a good introductory talk on Planes in Weston given at FOSDEM 2013.

For example, on a traditional graphics card, there are perhaps four hard-coded overlays:

  • Primary: main overlay
  • Scanout: a single, full-screen surface
  • Sprite: typically a video overlay in a different colour space
  • Cursor

Within each plane, the display controller is used to composite all the surfaces to form an output frame for that plane, using the normal GLES pipeline. The hardware then has special support in the display controller for compositing the four planes to form the final output frame sent to the monitor. In this example, the planes are composited as a stack, with the cursor on top of the sprite, on top of the scanout, on top of the primary.

On more powerful embedded systems, the display controller (which is separate from the GLES pipeline) often has many more overlays, which are more general purpose. This means that Weston can assign fewer surfaces to each overlay, or divide them so that only one overlay needs to run through the GLES pipeline. In the best case, there is at most one surface per overlay, and no GLES processing needs to be done. There is an article on how the Weston Raspberry Pi backend uses planes to best advantage.

Journey of a pixel

As an illustration, consider the journey of a single UI element from being programmatically created in a client application, to appearing on the user’s screen. This assumes that textures are shared between clients and compositor by passing handles to GPU resources (rather than the other methods listed in client-side rendering). For this example, we use Weston as the compositor; if Mutter were used, steps 5–7 would be replaced by a single step: compositing all surfaces on the GPU using GLES, to form the final output frame.

  1. The program creates a new widget using its UI toolkit.
  2. The toolkit sets up the widget in its GLES context and uploads any necessary textures to the GPU.
  3. When the application next renders a frame (e.g. due to part of the UI changing), it pushes its GLES context through the GPU’s GLES pipeline, creating an output texture in the GPU which contains pixel data for the entire application window.
  4. The application uses the Wayland protocol to notify the compositor (Weston) of the updated window, passing a handle to the GPU texture.
  5. When Weston next renders a frame, it determines if any surfaces need GLES transformations applied to them, and assigns the surfaces to planes and hardware overlays as required.
  6. For each plane, Weston composites all the surfaces in that plane, creating output pixel data for that plane.
    1. If any transformations are needed for a plane, the Weston renderer pushes the surfaces in that plane through the GPU GLES pipeline.
    2. If no transformations are needed, or if the needed transformations can be implemented using more efficient hardware-specific APIs, this step is skipped.
  7. Weston uses hardware-specific APIs to composite all the planes to form the final output frame.
  8. The output frame is sent to the user’s screen.


  • The windows as the developer would like them to appear:

  • Uploading textures to the graphics memory.

  • Client-side (e.g. Clutter) rendering of individual windows in GLES.

  • Notifying Weston of the updated window.

  • Compositing surfaces within each overlay in GLES.

  • Compositing overlays to form the final output in the display controller.