The compositor is the component of Apertis that is responsible for drawing application windows and other graphical elements on the screen.
The compositor is a process responsible for combining surfaces (texture buffers) representing application windows into the single 2D image displayed on the screen. In an X11 environment, it combines the roles of a window manager and a compositing manager. In a Wayland environment, it also takes on the role of the display server from X11.
In Apertis 15.12 with either X11 or Wayland, the compositor runs as the
mutter executable. This is a thin executable wrapper around the
libmutter library, which provides the majority of its functionality;
both of these components are part of GNOME’s Mutter project.
Additionally, the mildenhall-mutter-plugin component is loaded by the
mutter executable as a plugin, and provides an Apertis-specific
reference user experience (UX). Near-future versions of Apertis might
move to a model more like GNOME Shell, where the component responsible
for the compositor’s UX is a standalone executable linked to libmutter,
with the equivalent of the code from mildenhall-mutter-plugin included
in that executable; this would have little effect on the compositor’s
If Apertis moves from Mutter to Weston as its compositor in a future release, we anticipate that the UX layer equivalent to mildenhall-mutter-plugin would become a Weston plugin.
In X11, unprivileged graphical programs cannot display their graphics before the display server has started. Programs arrange for their graphics to be displayed by connecting to the X11 display server and sending a request to create a window. Such requests are always granted: if the compositor has not yet been started, the X11 display server itself carries out fallback window management behaviour in which the window is displayed with the size and position that the program requested. If the compositor has already been started, the window is not immediately displayed, but is instead made available to the compositor, which may choose whether to composite the window into the final 2D scene (and if so, where to place it).
In Wayland, the compositor is the display server. Graphical programs arrange for their graphics to be displayed by creating a buffer (a surface) in GPU memory, drawing their text, images etc. into that buffer, then sending requests to the Wayland compositor which ask the compositor to include that surface in the final 2D scene. Unprivileged programs cannot display graphics until the compositor is ready, so we can be sure that the compositor’s policies are applied to every surface.
We aim to provide the usual security properties described in the Security design document (confidentiality, integrity and availability) for the two mechanisms provided by the compositor:
- output (placing application windows on the screen)
- input (dispatching input events such as touchscreen touches and gestures to applications)
Wayland Compositors - Why and How to Handle Privileged Clients provides a good overview of how those security properties apply to compositors.
In GNOME 3 on either Wayland or X11, GNOME Shell is a standalone executable linked to the libmutter library, similar to the design proposed above.
Android’s SurfaceFlinger and Windows' Desktop Window Manager also fulfil essentially the same role as our compositor.
“The platform” refers to the overall Apertis platform, including the compositor, application manager and so on.
Because we anticipate that the desired graphical presentation and user experience (UX) will be a point of differentiation for OEMs, each of these requirements should be interpreted as a requirement that it is possible for the platform to behave as specified, and a recommendation that OEMs' platform variants should do so unless it conflicts with their desired UX. For example, for brevity, we will use “the compositor must …” as shorthand for “it must be possible for the compositor to …, and we recommend that OEMs' compositors should have that behaviour unless it conflicts with their desired UX”.
In some circumstances, such as when the Apertis device is switched on for the first time, it must go into a default state.
- The platform must draw a “home screen” or launcher from which further programs can be launched.
- The home screen may either be part of the compositor, or a separate graphical program.
- Pressing a button or menu entry representing an application entry point results in the relevant graphical program being started.
(These are aspects of input and output availability).
Platform UI elements
In addition to the home screen, there might be UI elements which are outside the scope of any particular application window, such as a status bar, Notifications, System-modal dialogs, or the UI controls used for application-switching.
- The OEM-specific visual design might reserve regions of the screen
for these visual elements. We recommend that this is done.
- For example, the equivalent features in Android are the small region at the top of the screen that is normally reserved for the status bar, and the larger region at the bottom or side of the screen that is normally reserved for the navigation bar (Back, Home and Apps buttons).
- The compositor may either draw each of those UI elements itself, or arrange for separate programs to provide them.
- Some of these UI elements must remain visible at all times (they
must be displayed on top of ordinary program windows), unless the
compositor’s UX calls for them to be hidden under certain specific
circumstances.
- For example, Android allows applications to request that the status bar and navigation bar are hidden, but the gestures to reinstate them are always available, and the operating system displays a reminder of those gestures when they become hidden.
- If separate programs provide some or all of these UI elements, then normal platform startup must arrange for them to be launched.
(These are aspects of input and output availability).
- The compositor must not allow unprivileged programs to display their
content in the regions of the screen that are reserved for these UI
elements, unless the compositor’s UX design specifically allows it.
This is a trusted path with which the platform can display
information to the user. (Output integrity)
- Ideally, the APIs provided to programs should be designed so that it is impossible to request display in a forbidden area.
- If the APIs provided to programs are such that the program can attempt to display in these regions, and an unprivileged program attempts to do so, this must be detected and prevented.
- Trusted paths are discussed in academic security literature; see References.
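The rule that unprivileged programs must not draw into reserved regions can be illustrated with a simple rectangle test. This is a minimal sketch, not Apertis code: the `Rect` type, the `RESERVED_REGIONS` layout (a 48-pixel status bar and an 80-pixel navigation bar on a 1280x720 screen) and the `placement_allowed` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

    def intersects(self, other: "Rect") -> bool:
        # Two rectangles overlap only if they overlap on both axes.
        return (self.x < other.x + other.width
                and other.x < self.x + self.width
                and self.y < other.y + other.height
                and other.y < self.y + self.height)


# Hypothetical OEM layout: status bar at the top, navigation bar at the bottom.
RESERVED_REGIONS = [Rect(0, 0, 1280, 48), Rect(0, 640, 1280, 80)]


def placement_allowed(requested: Rect, privileged: bool = False) -> bool:
    """Return True if a surface may be composited at the requested rectangle."""
    if privileged:
        # Assumption: the UX design explicitly lets privileged components draw here.
        return True
    return not any(requested.intersects(region) for region in RESERVED_REGIONS)
```

In a Wayland compositor the check is usually unnecessary for ordinary windows, because the compositor chooses the placement itself; a test like this matters mainly where an API lets the program express a position.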
Launching a program
When a graphical program is launched, after carrying out some non-graphical initialization, it will create a surface, fill it with the first frame that it wants to be displayed, and submit that surface to the compositor for display.
- The compositor must be able to identify that surface as having come
from that graphical program. In particular, it must be able to
determine the app-bundle and
user account that originated
the surface. (Input and output integrity)
- Non-requirement: If an app-bundle is allowed to contain multiple graphical programs, the ability to distinguish between those graphical programs is optional. We treat the app-bundle as a security boundary, but we do not place a security boundary between individual graphical programs within an app-bundle.
- This identification must be securely authenticated. If a different
user account or app-bundle asks to display a surface, one of these
options must be true:
- (Preferred) The compositor obtains the originating program’s user account and app-bundle directly from the Linux kernel or some other trusted platform component, and there is no opportunity for the originating program to give false information.
- The originating program tells the compositor which user account and app-bundle it claims to be, and the compositor verifies in a secure way that this claim is true.
- Non-requirement: If an app-bundle is allowed to contain multiple graphical programs and the compositor distinguishes between them, it is acceptable for a graphical program to be able to impersonate a different graphical program in the same bundle.
- The compositor must perform whatever appropriate smooth graphical transition is desired (for example a cross-fade, animated movement, or a simple atomic change between one frame and the next) between the home screen and the graphical program’s surface as the main contents of the screen.
- If the compositor’s UX involves multiple tiled content areas, the
graphical program must be displayed in the desired content area.
- In Wayland, the application only controls the content of its surfaces, and the compositor chooses where they are displayed, so this is easy to ensure.
- If the compositor’s UX involves floating or cascading windows (as
seen in GNOME, Windows, etc.), the graphical program must be
displayed in the location chosen by the compositor. It may influence
that location by setting “hints” in its requests, but the compositor
must be free to ignore those hints.
- Again, this is how Wayland always works in any case.
- The compositor must arrange for any UI elements that should remain
visible at all times to remain visible and interactive during this
process (input and output availability):
- if they are provided by the compositor itself, they must be layered above the graphical program’s surfaces in the compositor’s scene-graph;
- if they are provided by a separate “shell” program, the surfaces representing them must be layered above the surfaces from the graphical program.
- The compositor must deliver location-specific input events such as touchscreen touches to the application at the relevant location, and to no other application. (Input availability, input confidentiality)
- In particular, if application windows can overlap (for example stacking or cascading), and application A is in front of application B, then application A must not be able to trick the user into entering confidential input that was intended for application B by making itself transparent or almost-transparent, so that the user interface of application B shows through (clickjacking). (Input confidentiality)
- The compositor must deliver non-location-specific input events such as touchscreen edge-swipe gestures to the current application, using a definition of “current” that is part of its UX, and to no other application. (Input availability, input confidentiality)
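The delivery rule for location-specific input can be sketched as a hit test over the stacking order. The following is an illustrative sketch only: `Surface`, `pick_target` and the `MIN_INPUT_OPACITY` threshold are hypothetical, and refusing input to a nearly transparent surface is just one possible clickjacking mitigation, not a policy stated by this document.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold: surfaces more transparent than this never receive input.
MIN_INPUT_OPACITY = 0.5


@dataclass
class Surface:
    bundle_id: str
    x: int
    y: int
    width: int
    height: int
    opacity: float = 1.0

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def pick_target(stack: List[Surface], px: int, py: int) -> Optional[Surface]:
    """Return the single surface that receives a touch at (px, py).

    `stack` is ordered bottom to top. Only the topmost surface under the
    touch is considered, so no other application sees the event.
    """
    for surface in reversed(stack):
        if surface.contains(px, py):
            if surface.opacity < MIN_INPUT_OPACITY:
                # Assumed mitigation: drop the event rather than deliver it to
                # a see-through surface or to the surface underneath it.
                return None
            return surface
    return None
```

Dropping the event, instead of passing it through to the surface below, keeps the "and to no other application" property while denying the transparent surface any input.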
In some circumstances, such as when the Apertis device is switched off with a particular app active, UX designers may wish to return to a previous saved state, for example one that was saved during device shutdown (“last-used mode”).
- The platform must arrange for each of the graphical programs that was previously active and visible (in the foreground) to be restarted.
- When one of those graphical programs asks the compositor to display a surface, the compositor must place it in the same location where it was previously visible.
- The platform may launch other graphical programs that were running
but not visible when the state was saved. They must not become
visible until the user makes a request to switch to them.
- Alternatively, the platform may delay starting those graphical programs until the user makes a request to switch to them.
(Input and output availability)
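The restore behaviour described above can be sketched as a saved list of windows that is split into programs to restart immediately and programs whose start may be deferred. This is a minimal sketch under assumed names: the JSON layout and the `save_state` and `restore_plan` functions are hypothetical and are not an Apertis API.

```python
import json
from typing import Dict, List, Tuple


def save_state(windows: List[Dict]) -> str:
    """Serialize the bundle, geometry and visibility of each open window."""
    keep = ("bundle_id", "x", "y", "width", "height", "visible")
    return json.dumps([{key: window[key] for key in keep} for window in windows])


def restore_plan(saved: str) -> Tuple[List[Dict], List[Dict]]:
    """Split saved windows into (restart now, may be deferred).

    Windows that were visible are restarted and placed at their saved
    geometry; the others stay hidden until the user switches to them.
    """
    entries = json.loads(saved)
    restart_now = [entry for entry in entries if entry["visible"]]
    deferred = [entry for entry in entries if not entry["visible"]]
    return restart_now, deferred
```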
Main window selection
The user should have the opportunity to switch between the main (top-level) windows presented by various programs.
A graphical program might make it difficult for the user to leave, either accidentally (because the program has become unresponsive) or deliberately as a denial of service (because the program is maliciously written or has been compromised by an attacker).
- The compositor must have the opportunity to intercept input events (touchscreen touches, touchscreen gestures, hardware button presses) regardless of the actions of the program. (Input availability)
- The compositor should always provide a way to return to a home screen or application switcher, from which an unresponsive program can be terminated. (Input and output availability)
- The way to return to a home screen or application switcher should be
consistent and predictable. For example, Android reserves a small
area of the screen for Back, Home and Applications buttons. In older
Android versions, applications such as the camera may request that
these buttons are displayed unobtrusively, but are not able to hide
them altogether; in newer versions, these buttons can be hidden, but
the swipe gesture to make them available cannot be disabled, and the
user is given a reminder of that gesture which cannot be hidden by
the application. (Input availability, output integrity)
- Optionally, specially privileged app-bundles might be given the opportunity to hide these UI elements, or arrange for one of the app-bundle’s surfaces to be displayed as an overlay “above” them. However, this should be a “red flag” in app-store review, to be granted only to trusted applications. For example, Android requires the SYSTEM_ALERT_WINDOW permission for applications that use overlays, and additionally requires that the user has been specifically prompted by the platform to grant this permission to this app. (Output integrity)
- If the compositor receives an input event that it interprets as a
request to switch away from the graphical program, for example
pressing a “home” or “application switcher” button, then this switch
must occur within a reasonable time, even if the current graphical
program does not cooperate with that operation. This must have a
smooth graphical transition (cross-fade or animation) if that is the
desired UX. (Input and output availability)
- For example, if a bug in the current graphical program results in it ceasing to respond to messages from the compositor (for example a deadlock or live-lock situation) and the window switching operation involves communicating with it, the compositor must not wait indefinitely for a response. If it gets a response, it may switch immediately; if it does not, it may wait a short time, but after that time it must continue switching anyway. The maximum wait time should be chosen so that switching still appears responsive.
- Similarly, if the current graphical program is deliberately/maliciously written with the intention of delaying task-switching as much as possible, the compositor must still switch within a reasonable time.
- Each window offered for switching must be associated with the
relevant app-bundle, for example with a title and/or icon, so that
when the user believes they are switching to a particular window,
they can know that they are in fact switching to a window from the
correct trust domain. (Input and output integrity)
- The ability to distinguish between windows from different graphical programs in the same app-bundle is optional, because graphical programs in an app-bundle share a trust domain.
- A UX designer might require a limit on the number of simultaneous windows per app-bundle. For example, an app-bundle might be limited to having up to 5 entry points in the same or different processes, each with up to 2 main windows open at any given time.
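The bounded wait described above, where the compositor switches immediately on a response and otherwise switches after a short timeout, can be sketched as follows. The names `request_switch`, `cooperative_program` and `hung_program`, and the 0.2 second limit, are hypothetical; a real compositor would use its own event loop rather than threads.

```python
import threading
import time

# Hypothetical upper bound on how long the compositor waits for the program.
SWITCH_TIMEOUT_SECONDS = 0.2


def request_switch(notify_program, timeout: float = SWITCH_TIMEOUT_SECONDS) -> str:
    """Ask the current program to yield, but never wait longer than `timeout`."""
    acknowledged = threading.Event()
    # The program is notified in the background so that it cannot block us.
    worker = threading.Thread(target=notify_program, args=(acknowledged,), daemon=True)
    worker.start()
    if acknowledged.wait(timeout):
        return "switched-after-ack"
    # No response in time: continue switching anyway.
    return "switched-after-timeout"


def cooperative_program(acknowledged: threading.Event) -> None:
    acknowledged.set()


def hung_program(acknowledged: threading.Event) -> None:
    # Simulates a deadlocked or malicious program that never acknowledges.
    time.sleep(0.5)
```

Either outcome ends with the switch taking place; the program's behaviour only affects how long the compositor waits, up to the fixed limit.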
A graphical program might include dialogs in its UX.
- We recommend that dialogs should normally appear as a direct result of user activity, but they may also appear as a result of an external event.
- If the graphical program’s corresponding main window is currently displayed in a particular location, the dialog should overlay that location. If the API to open dialogs makes it possible to attempt to place dialogs elsewhere, and the program does so, the compositor must prevent this. (Output integrity)
- If surfaces (windows) are tiled, stacked or floating, the dialog may extend outside the boundaries of the graphical program’s main window if desired, but we recommend that this pattern is discouraged. If this is done, it should always be made obvious which surface the dialog belongs to. (Output integrity)
- The dialog must not prevent the user from switching away from the program, even if it extends outside the main window; in other words, it may be app-modal or document-modal, but must not be system-modal. (Input and output availability)
- We suggest encouraging the use of document-modal dialogs similar to those in OS X and GNOME.
A graphical program might include pop-up or drop-down menus in its UX.
- Menus typically behave like a document-modal window immediately above their “parent” window.
- The requirements are essentially the same as for dialogs, although the visual presentation is likely to be different.
External events might result in a notification, typically implemented as a “pop-up” window.
- A calendar might trigger notifications as time passes, for example when an appointment will occur soon.
- A messaging application (for example email or Twitter) might trigger a notification when new messages are available.
These notifications should be displayed by the platform user interface (HMI), either as part of the compositor (like in GNOME Shell) or a separate process.
- If there is a current notification, the platform should draw a visual representation of it, displaying it “above” any current window. (Output availability for the notification)
- If there is no current notification, any program (including non-graphical programs such as agents) may trigger a new notification. (Output availability for the notification)
- Each notification should be visually associated with the appropriate app-bundle, perhaps via an icon and title. (Output integrity)
- Notifications should be drawn in such a way that only the compositor (or the trusted notification service, if separate) can produce the same visual result, for example by displaying it over the top of Platform UI elements in a way that would not be possible or would not be allowed for an ordinary application window. (Output integrity)
- There should be a straightforward mechanism by which the driver can close any notification, minimizing distraction. (Input and output availability for other UI components)
- High-priority platform components such as navigation must be able to force their notifications to be displayed instead of, or “above”, other components' notifications. (Output availability for the higher-priority notification)
- Excessive notifications by an application might distract the driver. The compositor must have the opportunity to limit the number of notifications per app-bundle or deny notification display altogether, with an optional user-configurable limit per application so that the user could selectively silence an app-bundle that they found distracting.
- The precise handling of notifications (for example topics such as how multiple simultaneous notifications are handled) is outside the scope of this document.
- If the notification has “actions”, for example a button to go to the relevant app-bundle, these actions must be able to bring that app-bundle to the foreground.
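Two of the requirements above, priority ordering and a per-app-bundle limit, can be sketched with a small priority queue. This is an illustrative sketch only: `NotificationQueue`, its `muted` set and the default limit of 3 are hypothetical, and the real handling of simultaneous notifications is outside the scope of this document.

```python
import heapq
from collections import Counter
from itertools import count


class NotificationQueue:
    def __init__(self, per_bundle_limit: int = 3):
        self._heap = []
        self._pending = Counter()   # pending notifications per app-bundle
        self._order = count()       # tie-breaker: first posted, first shown
        self.per_bundle_limit = per_bundle_limit
        self.muted = set()          # bundles the user has silenced

    def post(self, bundle_id: str, text: str, priority: int = 0) -> bool:
        """Queue a notification; return False if it is refused."""
        if bundle_id in self.muted:
            return False
        if self._pending[bundle_id] >= self.per_bundle_limit:
            return False
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), bundle_id, text))
        self._pending[bundle_id] += 1
        return True

    def next_to_display(self):
        """Return (bundle_id, text) for the next notification, or None."""
        if not self._heap:
            return None
        _, _, bundle_id, text = heapq.heappop(self._heap)
        self._pending[bundle_id] -= 1
        return bundle_id, text
```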
GNOME’s design page for notifications, in addition to GNOME’s own designs, has some useful references to other platforms in the “See Also” section.
A graphical program might attempt to get the user’s attention by creating new main windows while it is in the background.
- These windows must not be displayed or given input focus, to avoid user distraction and focus-stealing.
- We recommend encouraging application developers to use Notifications instead.
- Some programs ported from non-Apertis environments might rely on the
ability to create a window at any time as a way to get the user’s
attention. If a program does this, the compositor must not display
it or give it input focus until the user requests main window
selection.
- The compositor could handle this with no user distraction at all, by making the window available in the Main window selection list, but not showing it. However, this would not have the desired effect of informing the user that something has happened.
- Additionally, the compositor could optionally provide a visual cue to the user while minimizing distraction, by behaving as though that program had requested a notification, with content based on the program and/or window title, and one action button which would bring the new window to the foreground.
- If the window would exceed a limit on the number of simultaneous windows or graphical programs in an app-bundle, as described in Main window selection, the compositor must not display those excessive windows, and may terminate the graphical program.
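The focus-stealing rules above amount to a small decision procedure for each newly created main window. The sketch below is one possible reading, with hypothetical names (`Action`, `on_new_main_window`, `MAX_WINDOWS_PER_BUNDLE`); showing the window immediately when the bundle is already in the foreground is an assumption, and the optional notification cue is folded into the hidden case.

```python
from enum import Enum


class Action(Enum):
    SHOW = "show"
    HIDE_AND_NOTIFY = "hide-and-notify"
    REJECT = "reject"


# Hypothetical per-app-bundle limit chosen by the UX designer.
MAX_WINDOWS_PER_BUNDLE = 2


def on_new_main_window(bundle_in_foreground: bool,
                       windows_already_open: int,
                       limit: int = MAX_WINDOWS_PER_BUNDLE) -> Action:
    """Decide what the compositor does with a newly created main window."""
    if windows_already_open >= limit:
        # Excess windows are never displayed; the program may also be terminated.
        return Action.REJECT
    if bundle_in_foreground:
        # Assumption: a window opened by the foreground bundle is displayed.
        return Action.SHOW
    # Background bundle: list the window for selection without displaying it,
    # and optionally raise a notification instead of giving it focus.
    return Action.HIDE_AND_NOTIFY
```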
A previously non-graphical program could connect to the display server and create a new main window, becoming a graphical program.
- Unresolved: what happens?
- The simplest resolution would be to treat it as though it had always been graphical and was previously in the background, and apply the Focus-stealing requirements to it. Is this sufficient?
- If there is a requirement that we are able to classify programs into (potentially) graphical and non-graphical in the manifest, with only graphical programs allowed to open windows, this would somewhat undermine the idea that there is no security boundary within an app-bundle.
- If the window creation is allowed, it must be treated as though a graphical program in the background had opened that window, for the purposes of Focus-stealing prevention.
- A program from one app-bundle must not be able to copy the texture
data of a window from a different app-bundle, which might contain
confidential information. (Output confidentiality)
- In particular, this forbids taking screenshots of a program from a different app-bundle.
- The ability for programs in the same app-bundle to take screenshots of each other is optional. For “least-privilege”, we suggest that the platform should not allow app-bundles to request that the platform takes a screenshot of that app-bundle. The programs can communicate directly with each other to share their texture data, if desired, so the platform’s involvement is not needed.
- A program from an app-bundle must not be able to copy the texture
data of platform UI elements, which might contain confidential
information. (Output confidentiality)
- In particular, this forbids screenshots again.
- Unresolved: Is there a requirement that specially privileged
app-bundles must be able to take screenshots, bypassing these
restrictions?
- If this is required, we suggest an interface similar to GNOME Shell’s org.gnome.Shell.Screenshot D-Bus API, with which these privileged app-bundles can submit a request to the compositor, which the compositor can accept or reject according to the permissions flags in that app-bundle’s manifest.
- Screencasting or video recording is essentially equivalent to an ongoing stream of screenshots, and has equivalent requirements.
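If the unresolved question is answered in favour of privileged screenshots, the accept-or-reject step in the compositor could look like the following sketch. The manifest representation (a mapping from app-bundle ID to a set of permission flags) and the `"screenshot"` flag name are hypothetical; this document does not define them.

```python
from typing import Dict, Set

# Hypothetical name of the manifest permission flag.
SCREENSHOT_PERMISSION = "screenshot"


def screenshot_allowed(requester: str, manifests: Dict[str, Set[str]]) -> bool:
    """Accept a screenshot or screencast request only from a privileged bundle.

    Unknown bundles and bundles without the flag are refused, which covers
    other app-bundles' windows, platform UI elements, and (for least
    privilege) the requesting bundle's own windows.
    """
    return SCREENSHOT_PERMISSION in manifests.get(requester, set())
```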
- A program from one app-bundle must not be able to synthesize input events for delivery to a window in a different app-bundle, which could be used to force the target program to carry out undesired actions. (Input integrity)
- A program from one app-bundle must not be able to synthesize input events for delivery to the compositor, which could be used to force the compositor or other programs to carry out undesired actions. (Input integrity)
Trusted input paths
In some situations the platform may need to ask the user for input, in such a way that the user can be confident that their input will in fact go to the platform and not to a potentially malicious app-bundle. One prominent example of a trusted input path is the “Ctrl+Alt+Del to log in” mechanism in Windows operating systems: Windows does not allow ordinary applications to intercept this key sequence, which means that the user can be confident that the resulting login dialog actually belongs to Windows, and not to an ordinary application that is mimicking it.
GNOME uses system-modal dialogs for a similar purpose when carrying out platform-related actions like asking for confirmation of a potentially dangerous system-wide action or when unlocking access to stored passwords: to make it more difficult for an ordinary application to present the same visual effect, GNOME .
- The compositor must be able to request input from the user regardless of any other factors, for example application windows or notifications. For example, if this is done via system-modal dialogs like the ones in GNOME, then the system-modal dialog must replace or be displayed “above” all application or notification windows. (Availability, integrity)
- Other platform components might need to request input from the user
in a similar way.
- We anticipate that this would be implemented by providing a privileged API on the compositor that is only accessible by those components.
- Unprivileged app-bundles must not be able to make equivalent requests. (Output integrity; output availability for everything else)
- The trusted input path must be displayed in such a way that only the
compositor or another trusted service can produce the same visual
result, for example by displaying it over the top of Platform UI
elements in a way that would not
be possible or would not be allowed for an ordinary application
window. (Output integrity, input integrity)
- This is an example of a trusted output path; see References.
Interaction with the automotive domain
If the Apertis device (infotainment domain, CE domain) shares its input and output device with a separate automotive domain, graphics from the automotive domain must in general be displayed “above” anything from the infotainment domain. As an exception, if the relevant surfaces in the automotive domain are associated with something for which input and output availability and integrity do not need to be preserved against a potentially hostile infotainment domain, they may be displayed differently. For example, if the main navigation view in a navigation app is to be displayed by the automotive domain, it could be displayed in the same way as an ordinary app window originating from the infotainment domain.
The requirements in this document can be re-stated for the compositor in the automotive domain, with the infotainment domain taking on the role of an ordinary application from the automotive compositor’s point of view. For example, Synthesized input requires that ordinary applications cannot send input events to the infotainment compositor or to each other. The corresponding requirement for the automotive compositor is that the infotainment domain must not be able to send input to the automotive compositor, or to another client of the automotive compositor (if there are others).
The UX of the automotive domain might reserve particular areas of the screen for platform UI and/or a trusted path. If it does, the compositor in the infotainment domain must avoid relying on those areas for its own UX (either for application windows or its own platform UI), because they would never be visible in practice: the automotive domain would draw its UI elements “above” the output of the compositor.
Unresolved: Are trusted input and output paths to the automotive domain within the scope of this document?
Some of these requirements are known to be impossible to meet in X11, so we do not aim to solve them there.
Platform features which are likely to be useful in implementing this:
- Wayland surfaces are not displayed unless the compositor chooses to do so. If we can write down whatever policy is required for a particular UX, then the compositor can be programmed to have exactly that policy.
- The Wayland protocol operates via an AF_UNIX socket,
like D-Bus, so we can identify peer applications by their AppArmor
profile and uid using the same credentials-passing mechanisms that
we already use in D-Bus.
- Wayland already has an API for the uid/gid/pid. A similar API for the LSM context should be straightforward to add.
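The credentials-passing mechanism mentioned above can be demonstrated on any AF_UNIX socket. The sketch below assumes Linux, where the `SO_PEERCRED` socket option returns the peer's pid, uid and gid as recorded by the kernel; `peer_credentials` is a hypothetical helper name, and retrieving the LSM (AppArmor) label would need a separate mechanism such as `SO_PEERSEC`, which is not shown.

```python
import os
import socket
import struct

# Layout of the kernel's struct ucred: pid_t pid; uid_t uid; gid_t gid.
_UCRED_FORMAT = "iII"


def peer_credentials(sock: socket.socket):
    """Return (pid, uid, gid) of the process at the other end of an AF_UNIX socket.

    Linux-only: the values come from the kernel, so the peer cannot forge them.
    """
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(_UCRED_FORMAT))
    return struct.unpack(_UCRED_FORMAT, raw)


if __name__ == "__main__":
    # A socket pair created by this process reports this process as the peer.
    server_end, client_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    print(peer_credentials(server_end), (os.getpid(), os.getuid(), os.getgid()))
    server_end.close()
    client_end.close()
```

This corresponds to the preferred option in the launch requirements: the identity comes directly from the kernel rather than from a claim made by the client.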
not yet written