Summary:
- API designs must make sense from the point of view of a third-party app developer (start by designing high-level APIs, only add daemons if it is necessary)
- Interfaces that don’t have to be API should not be API (minimize surface area)
- Use existing frameworks where we can; if we can’t use them directly, learn from their design
- Identify privilege boundaries, do not trust less-privileged components, and consider whether some features should be restricted
Minimize “surface area”
The “SDK API” is intended to remain stable and compatible over time. By putting an interface in the “SDK API”, we commit to it continuing to work in future: third-party code that uses stable Apertis APIs should keep working without needing changes.
As a result, one of the most important questions to ask about new public interfaces is: does this need to be public API right now? If it doesn’t, then it can be private, at least to begin with. We can change private APIs to be public later if they turn out to be necessary, but changing public APIs to be private would be a compatibility break.
Initially omitting interfaces from the public API means fewer things whose stability we need to guarantee, which means fewer constraints on how we improve the platform in future. Conversely, if we put too many things in the public API (guarantee too much) too early, we’ll probably regret it in future.
Some examples of applying that principle:
- hiding struct contents by using a MyObjectPrivate struct instead of putting members in the MyObject struct (in GObject, use G_ADD_PRIVATE() or G_DEFINE_TYPE_WITH_PRIVATE())
- considering the D-Bus API between a built-in or otherwise special app, and the system components it uses, to be private
- if in doubt, making things private initially, and making them public if it later proves to be necessary
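The struct-hiding point can be illustrated without GObject. The following is a minimal sketch in plain C, assuming a hypothetical DemoCounter type (none of these names are Apertis APIs): the public declarations expose only an opaque type, so the struct layout stays private and can change without breaking callers.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Public header: everything a third-party app would see ---- */

/* Opaque type: its layout is not part of the API. */
typedef struct DemoCounter DemoCounter;

DemoCounter *demo_counter_new (void);
void         demo_counter_increment (DemoCounter *self);
int          demo_counter_get_value (const DemoCounter *self);
void         demo_counter_free (DemoCounter *self);

/* ---- Private implementation: free to change between releases ---- */

struct DemoCounter
{
  int value;
  int n_increments; /* internal bookkeeping, invisible to callers */
};

DemoCounter *
demo_counter_new (void)
{
  /* calloc() zero-initialises both members; returns NULL on failure. */
  return calloc (1, sizeof (DemoCounter));
}

void
demo_counter_increment (DemoCounter *self)
{
  assert (self != NULL); /* passing NULL is a caller bug */
  self->value++;
  self->n_increments++;
}

int
demo_counter_get_value (const DemoCounter *self)
{
  assert (self != NULL);
  return self->value;
}

void
demo_counter_free (DemoCounter *self)
{
  free (self); /* free(NULL) is a no-op */
}
```

Because callers can only reach the counter through the four functions, a member such as n_increments could later be removed or replaced without a compatibility break. In a GObject-based library the same effect would normally come from the private-struct macros mentioned above rather than from hand-written code like this.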
Have as few daemons as possible, but no fewer
Sometimes it’s necessary to have more than one process (app code talking to a daemon/service/agent, typically a D-Bus service). There are lots of good reasons to do that:
- having a privilege boundary
- mediating between multiple processes that all want to manipulate the state of the same object (for instance, the now deprecated Barkway decides the order of the popup stack, which is global shared state)
- having something persist in the background when switching between apps or closing/reopening apps (for instance, Telepathy puts telephony, instant messaging and other real-time communications in the background)
However, every time we introduce inter-process communication between two components, we increase the complexity of the system, which increases the cost (time) of building and maintaining it.
As a result, if we don’t need the extra process because none of the reasons above apply, we should try not to have it: it’s “cheaper” to use a shared library that gets loaded into the app process. For instance, Grilo is just a library, and doesn’t have an associated daemon. Similarly, if there’s an opportunity to reduce the number of layers of “a daemon talking to a daemon talking to a daemon”, we should probably consider it.
Another consideration is whether daemons with similar privileges, performance characteristics and lifetimes can be merged together: for instance, as of May 2015 we have Barkway and Mutter as separate components, whereas GNOME Shell puts notifications and the window manager (among other things) in the same process. If the requirements allow it, we should do the same.
Be aware of where the privilege boundaries are
Not all of the code in Apertis is equally-privileged: components that run as root are more privileged than components that run as the user, and components with a permissive AppArmor profile are more privileged than components with a restrictive profile.
Whenever two components with different privilege levels communicate, we should be aware of where the boundary is, and what should be allowed. If the more privileged component trusts the less privileged component too much, then we have privilege escalation: the less privileged component can effectively gain the more privileged component’s privileges by asking it to carry out whatever action is being restricted. The privilege boundary then provides no actual security.
When crossing a privilege boundary, the question to ask is: if the less privileged side is being actively malicious, what’s the worst that can happen? For instance, the less privileged side might be a malicious third-party app (malware), or it might be a legitimate component which an attacker has been able to compromise via a security vulnerability.
One important example is where the less privileged component communicates with a service via a D-Bus API. If the method call has a username or app_name parameter, a malicious or compromised app could use a different user’s name, or a different app’s name: what would happen then?
We can avoid problems in situations like this by using information that comes from a trusted component (the Linux kernel, systemd or dbus-daemon) and cannot be faked. For instance, the GetConnectionCredentials D-Bus method tells you the user ID and AppArmor profile of a process on the bus: this information comes from dbus-daemon, and can be trusted. If we can derive the app name from the AppArmor profile, then that cannot be faked either, so we can reliably identify an app by its profile.
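As a rough model of that idea, the sketch below is plain C with no real D-Bus calls. DemoPeerCredentials stands in for the identity that a service would obtain from a trusted component, and the "demo-app-" profile naming scheme is purely an assumption made for illustration, not an Apertis convention. The key design point is that the access check takes no caller-supplied app name at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for identity information obtained from a trusted component
 * (for example via GetConnectionCredentials), never from the caller. */
typedef struct
{
  unsigned long unix_user_id;
  const char *apparmor_profile; /* NULL if the peer has no profile */
} DemoPeerCredentials;

/* Hypothetical naming scheme: app profiles are "demo-app-<name>". */
#define DEMO_PROFILE_PREFIX "demo-app-"

/* Returns the app name embedded in the profile, or NULL if the profile
 * does not follow the (assumed) app naming scheme. */
static const char *
demo_app_name_from_profile (const char *profile)
{
  size_t prefix_len = strlen (DEMO_PROFILE_PREFIX);

  if (profile == NULL ||
      strncmp (profile, DEMO_PROFILE_PREFIX, prefix_len) != 0 ||
      profile[prefix_len] == '\0')
    return NULL;

  return profile + prefix_len;
}

/* Decides whether the peer may access data owned by owner_app.
 * The peer's identity is derived only from trusted credentials. */
bool
demo_may_access_app_data (const DemoPeerCredentials *peer,
                          const char *owner_app)
{
  const char *caller_app;

  assert (peer != NULL);
  assert (owner_app != NULL);

  caller_app = demo_app_name_from_profile (peer->apparmor_profile);

  /* Peers that cannot be identified as an app are denied. */
  return caller_app != NULL && strcmp (caller_app, owner_app) == 0;
}
```

With this shape, a compromised app cannot claim to be another app: the only way to change the answer is to change the profile it runs under, which is outside its control.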
Start from the API that the app developer will use
This is related to the surface area and fewer daemons reasoning above. The requirements for our SDK APIs take the form “third-party apps should be able to do X, Y and Z”. In many cases, it is tempting to address this by providing a D-Bus API that the third-party app can use, with auto-generated GDBus C “bindings”.
In Collabora’s experience, auto-generated code has limits: it’s often the quickest way to get from nothing to a minimum acceptable API, but it usually produces somewhat weird APIs, which could be easier to use if they were designed as C APIs from the beginning. When we started developing telepathy-glib, we mistakenly thought that most of it could be generated code. However, over time, we realised that there was quite a low limit to its quality and usability if we stuck to that idea: the current approach has much more focus on high-level C APIs, with the D-Bus API designed separately in support of the desired C API.
It often results in better (nicer to use) APIs if the starting point for the design is the GObject-style C API that we intend third-party applications to use, in the form of a library logically arranged into appropriate objects: we could even consider starting with stub implementations that work with “mock” results, and filling in the real implementation afterwards (for instance a mock version of a contacts database might return a list of hard-coded contacts).
That C API might have to evolve over time as we fill in the implementation, but if the general outline can stay intact, it’s a good indication that the result is going to be something that apps can use.
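The mock-first approach described above might look like the following sketch. It is plain C with hypothetical names (DemoContact and the demo_contacts_* functions are invented for illustration), and the data is hard-coded; a real implementation could later replace the function bodies while keeping the declarations intact.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- API that the app developer would use ---- */

typedef struct
{
  const char *name;
  const char *phone;
} DemoContact;

size_t             demo_contacts_get_n_contacts (void);
const DemoContact *demo_contacts_get_nth (size_t index);
const DemoContact *demo_contacts_find_by_name (const char *name);

/* ---- Mock implementation: hard-coded results, no database ---- */

static const DemoContact mock_contacts[] = {
  { "Alice Example", "+1 555 0100" },
  { "Bob Example", "+1 555 0101" },
};

size_t
demo_contacts_get_n_contacts (void)
{
  return sizeof mock_contacts / sizeof mock_contacts[0];
}

/* Returns the contact at index, or NULL if index is out of range. */
const DemoContact *
demo_contacts_get_nth (size_t index)
{
  if (index >= demo_contacts_get_n_contacts ())
    return NULL;

  return &mock_contacts[index];
}

/* Returns the first contact whose name matches exactly, or NULL. */
const DemoContact *
demo_contacts_find_by_name (const char *name)
{
  size_t i;

  assert (name != NULL); /* passing NULL is a caller bug */

  for (i = 0; i < demo_contacts_get_n_contacts (); i++)
    {
      if (strcmp (mock_contacts[i].name, name) == 0)
        return &mock_contacts[i];
    }

  return NULL;
}
```

Writing a small app against a mock like this is a cheap way to find out whether the API is pleasant to use before any daemon or D-Bus interface exists.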
Think about how much we want the app to be allowed to do
This is not normally a concern, but in Apertis it is, because there’s a privilege boundary between apps (whether our apps or third-party apps). We need to consider (and document) which of these categories each feature is in:
- all apps (including third-party ones) should be able to do this
- apps should be able to do this if they have some special flag in their manifest
- only trusted components within Apertis should be able to do this
For instance, in Android, all apps can write to /sdcard; only apps that have asked for the “permission” can record audio; and only trusted system components can install/uninstall apps.
If two parts of an API are in different categories, or if two parts of an API should be in the second category and have different flags (be usable by different apps) that’s probably a sign that there needs to be a split between those APIs.
For instance, looking at Frampton and Tinwell as of May 2015, recording probably needs to be locked-down more than playback: the worst case for playback is that the driver is annoyed and turns down or mutes the volume, whereas the worst case for recording is sending private conversations to the Internet. So recording and playback should have a clear division between them.
Follow GNOME conventions
Our platform includes a lot of GNOME-related libraries such as GLib and Clutter, and our API guidelines follow GObject/GNOME style quite closely. Application developers will find it easier to use our APIs if we are consistent about following GObject conventions.
Many of the developers working on these components are not necessarily familiar with GObject conventions. If in doubt, it might be helpful to ask a developer with more GObject experience, or look at how GIO or GTK+ 3 does similar things.
See the Coding Conventions for more on this topic, but here is a brief summary:
- namespace objects with the library’s appropriate prefix
- use GLib naming conventions: CapitalizedWords for types; CAPITALS_AND_UNDERSCORES for constants; lowercase_and_underscores for parameters, struct members and variables
- avoid Hungarian notation (pSomePointer, v_func_returning_void(), etc.)
- use GError to report recoverable runtime errors
- use g_return_[val_]if_fail() to report failed precondition checks and other library-user errors
- use GObject features where appropriate: signals, properties, construct-time properties, virtual methods (vfuncs)
- use GIO features where appropriate: GAsyncResult, GCancellable, GTask
- prefer to use GAsyncResult instead of your own function typedef for asynchronous operations
- prefer to use signals instead of your own function typedef for events
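The naming rules and the split between recoverable errors and caller errors can be shown in a few lines. The sketch below deliberately avoids GLib so that it stands alone: DemoParseError is only a simplified stand-in for GError, and demo_return_val_if_fail() imitates the spirit of g_return_val_if_fail(). All names are hypothetical; real code on the platform would use the GLib facilities directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Constants: CAPITALS_AND_UNDERSCORES, with the library prefix. */
#define DEMO_VOLUME_MAX 100

/* Types: CapitalizedWords, with the library prefix. */
typedef enum
{
  DEMO_PARSE_ERROR_EMPTY = 1,
  DEMO_PARSE_ERROR_NOT_A_NUMBER,
  DEMO_PARSE_ERROR_OUT_OF_RANGE
} DemoParseError;

/* Caller bugs: complain on stderr and return early, without touching
 * the error out-parameter. */
#define demo_return_val_if_fail(cond, val) \
  do { \
    if (!(cond)) \
      { \
        fprintf (stderr, "%s: precondition '%s' failed\n", __func__, #cond); \
        return (val); \
      } \
  } while (0)

static void
demo_set_error (DemoParseError *error_out, DemoParseError code)
{
  assert (code >= DEMO_PARSE_ERROR_EMPTY); /* internal invariant */

  if (error_out != NULL) /* callers may pass NULL to ignore details */
    *error_out = code;
}

/* Parses text as a volume in the range 0 to DEMO_VOLUME_MAX.
 * Returns true and sets *volume_out on success; returns false and
 * sets *error_out (if non-NULL) on a recoverable runtime error. */
bool
demo_volume_parse (const char *text, int *volume_out,
                   DemoParseError *error_out)
{
  char *end = NULL;
  long parsed;

  demo_return_val_if_fail (text != NULL, false);
  demo_return_val_if_fail (volume_out != NULL, false);

  if (text[0] == '\0')
    {
      demo_set_error (error_out, DEMO_PARSE_ERROR_EMPTY);
      return false;
    }

  parsed = strtol (text, &end, 10);

  if (end == text || *end != '\0')
    {
      demo_set_error (error_out, DEMO_PARSE_ERROR_NOT_A_NUMBER);
      return false;
    }

  if (parsed < 0 || parsed > DEMO_VOLUME_MAX)
    {
      demo_set_error (error_out, DEMO_PARSE_ERROR_OUT_OF_RANGE);
      return false;
    }

  *volume_out = (int) parsed;
  return true;
}
```

Bad user input is something a correct program can encounter, so it is reported through the error out-parameter; a NULL text pointer can only come from a bug in the calling code, so it is reported as a failed precondition instead.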
Use existing software, or if we can’t, learn from it
Several components in Apertis overlap with existing open-source projects, in which people have already spent a lot of time understanding a particular problem-space, making mistakes (for instance MPRIS version 1), and recovering from those mistakes. If we can learn from their mistakes, we can take a short-cut past the process of making and learning from our own mistakes, and arrive directly at a solution.
For some of those components, we might be able to adopt entire APIs from the existing project, perhaps with Apertis-specific extensions to fill in missing functionality. For instance, if a media player provided an MPRIS2 interface, that would cover 90% of what it does, and would ensure we avoid the problems that MPRIS1 had.
If our requirements prevent us from re-using existing APIs (for instance, we’re not using GTK+ for user interface widgets), the next best thing is to compare our APIs with the existing ones. Where they differ, there are several possibilities: perhaps our API specifically needs to be different to meet one of our requirements (in which case we keep it); or perhaps the difference doesn’t really matter either way; or perhaps the difference points to a design issue in our API, in which case correcting that issue gives us a better outcome.
API review process
Before a public API may be marked as stable and suitable for use by third-party apps, it must be reviewed. The process for this is:
- Some or all of the APIs of a component are selected for review. For example, it might be decided to review the C APIs for a component, but not the D-Bus APIs which it uses internally to communicate with a background service.
- The component authors produce:
  - A description of the goals of the component and the APIs under review. These should be high-level descriptions, covering the use cases for that component and its APIs, goals of the implementation, and specific non-goals. Where possible, this should refer back to a design document. These descriptions should not be simply a copy of the API documentation describing how the goals are implemented.
  - A list of required features and APIs which must pass review before the API may be marked as stable.
  - A list of other components which depend on the APIs under review, and a description of how those other components should interact with the APIs under review. This ensures the final APIs are still usable as intended from other components, and is another way of describing the use cases of the API.
  - An explicit description of where privilege boundaries are required in the system, and whether any of the APIs cross those boundaries (for example, if they implement functionality which should not be callable without elevated privileges).
  - A description of how the API may need to be extended. For example, should it be extensible by the component maintainers, by third-party apps, or not at all?
- The reviewers examine the APIs and provided documentation, and make recommendations for the component as a whole, its interactions with other components, and specific recommendations for each API under review (for example, ‘keep this API unchanged’, ‘remove this API’, ‘keep it with the following changes’, etc.).
- The feedback is discussed, potentially iterated, and changes are made to the public API accordingly.
- The changes are re-reviewed, and the process iterates until the reviewers are happy.
- The APIs are marked as public and stable, and their API documentation is updated.
The main goal of the review process is to ensure that the stable APIs, which have to be supported for a long time in the future, are a good match for all current and anticipated use cases, and follow good system architecture practices. A large part of this effort is in ensuring that all use cases for the APIs are known and documented. A non-goal of the review process is to look at each API individually and mechanically run through some ‘API review checklist’: this can easily miss mismatches between the API and the component’s use cases, which are architectural issues that are important to catch.
Components’ APIs should not be marked as stable until they have been reviewed.