Apertis is predominantly written in C, so dynamically allocated memory has to be managed manually. Through use of GLib convenience APIs, memory management can be trivial, but programmers always need to keep memory in mind when writing code.
It is assumed that users of Apertis are familiar with the idea of heap allocation of memory using malloc() and free(), and know of the GLib equivalents, g_malloc() and g_free().
There are three situations to avoid, in order of descending importance:
- Using memory after freeing it (use-after-free).
- Using memory before allocating it.
- Not freeing memory after allocating it (leaking).
Key principles, in no particular order:
- Determine and document whether each variable is owned or unowned. They must never change from one to the other at runtime.
- Determine and document the ownership transfers at function boundaries.
- Ensure that each assignment, function call and function return respects the relevant ownership transfers.
- Use reference counting rather than explicit finalisation where possible.
- Use GLib convenience functions like g_clear_object() where possible.
- Do not split memory management across code paths.
- Use the single-path cleanup pattern for large or complex functions.
- Leaks should be checked for using Valgrind.
Principles of memory management
The normal approach to memory management is for the programmer to keep track of which variables point to allocated memory, and to manually free them when they are no longer needed. This is correct, but can be clarified by introducing the concept of ‘ownership’: the owner of a piece of allocated memory (an ‘allocation’) is the piece of code (such as a function, struct or object) which is responsible for freeing it. Each allocation has exactly one owner; this owner may change as the program runs, by ‘transferring’ ownership to another piece of code. Each variable is ‘owned’ or ‘unowned’, according to whether the scope containing it is always its owner. Each function parameter and return type either transfers ownership of the values passed to it, or it doesn’t. By statically calculating which variables are owned, memory management becomes a simple task of unconditionally freeing the owned variables before they leave their scope, and not freeing the unowned variables (see Single-path cleanup).
There is an important restriction here: variables must never change from owned to unowned (or vice-versa) at runtime. This restriction is key to simplifying memory management.
For example, given the functions:
the following code has been annotated to note where the ownership transfers happen:
There are a few points here: Firstly, the ‘owned’ comments by the variable declarations denote that those variables are owned by the local scope, and hence need to be freed before they go out of scope. The alternative is ‘unowned’, which means the local scope does not have ownership, and must not free the variables before going out of scope. Similarly, ownership must not be transferred to them on assignment.
Secondly, the variable type modifiers reflect the ownership status of each variable: as my_str is owned by the local scope, it has the non-const type gchar *, whereas template is const to denote it is unowned. Similarly, the template parameter of generate_string() and the str parameter of print_string() are const because no ownership is transferred when those functions are called. As ownership is transferred for the string parameter of g_value_take_string(), we can expect its type to be a non-const gchar *.
(Note that this is not the case for GObjects and subclasses, which can never be const. It is only the case for strings and simple structs.)
Given this ownership and transfer infrastructure, the correct approach to memory allocation can be mechanically determined for each situation. In each case, the copy() function must be appropriate to the data type, e.g. g_strdup() for strings, or g_object_ref() for GObjects.
|Assigning from/to||Owned destination||Unowned destination|
|Owned source||Copy or move the source to the destination.||Pure assignment, assuming the unowned variable is not used after the owned one is freed.|
|Unowned source||Copy the source to the destination.||Pure assignment.|
|Call from/to||Transfer full parameter||Transfer none parameter|
|Owned source||Copy or move the source for the parameter.||Pure parameter passing.|
|Unowned source||Copy the source for the parameter.||Pure parameter passing.|
|Return from/to||Transfer full return||Transfer none return|
|Owned source||Pure variable return.||Invalid. The source needs to be freed, so the return value would use freed memory: a use-after-free error.|
|Unowned source||Copy the source for the return.||Pure variable passing.|
Documenting the ownership transfer for each function parameter and return, and the ownership for each variable, is important. While they may be clear when writing the code, they are not clear a few months later; and may never be clear to users of an API. They should always be documented.
The best way to document ownership transfer is to use the (transfer) annotation introduced by gobject-introspection. Include this in the API documentation comment for each function parameter and return type. If a function is not public API, write a documentation comment for it anyway and include the (transfer) annotations. By doing so, the gobject-introspection tools can also read the annotations and use them to correctly introspect the API.
Ownership for variables can be documented using inline comments. These are non-standard, and not read by any tools, but can form a convention if used consistently.
The documentation for container types is similarly only a convention; it includes the type of the contained elements too:
Note also that owned variables should always be initialised so that freeing them is more convenient. See Convenience functions.
Also note that some types, e.g. basic C types like strings, can have the const modifier added if they are unowned, to take advantage of compiler warnings resulting from assigning those variables to owned variables (which must not use the const modifier). If so, the /* unowned */ comment may be omitted.
As well as conventional malloc()/free()-style types, GLib has various reference counted types, GObject being a prime example.
The concepts of ownership and transfer apply just as well to reference counted types as they do to allocated types. A scope owns a reference counted type if it holds a strong reference to the instance (e.g. by calling g_object_ref()). An instance can be ‘copied’ by calling g_object_ref() again. Ownership can be freed with g_object_unref(): even though this may not actually finalise the instance, it frees the current scope’s ownership of that instance.
See g_clear_object() for a convenient way of handling GObject references.
There are other reference counted types in GLib, such as GHashTable (using g_hash_table_ref() and g_hash_table_unref()), or GVariant (g_variant_ref(), g_variant_unref()). Some types, like GHashTable, support both reference counting and explicit finalisation. Reference counting should always be used in preference, because it allows instances to be easily shared between multiple scopes (each holding their own reference) without having to allocate multiple copies of the instance. This saves memory.
GLib provides various convenience functions for memory management, especially for GObjects. Three will be covered here, but others exist — check the GLib API documentation for more. They typically follow similar naming schemas to these three (using ‘_full’ suffixes, or the verb ‘clear’ in the function name).
g_clear_object() is a version of g_object_unref() which unrefs a GObject and then sets the pointer to it to NULL. This makes it easier to implement code that guarantees a GObject pointer is always either NULL, or has ownership of a GObject (but which never points to a GObject it no longer owns).
By initialising all owned GObject pointers to NULL, freeing them at the end of the scope is as simple as calling g_clear_object() without any checks, as discussed in single-path cleanup.
g_list_free_full() frees all the elements in a linked list, and all their data. It is much more convenient than iterating through the list to free all the elements’ data, then calling g_list_free() to free the GList elements themselves.
g_hash_table_new_full() is a newer version of g_hash_table_new() which allows setting functions to destroy each key and value in the hash table when they are removed. These functions are then automatically called for all keys and values when the hash table is destroyed, or when an entry is removed using g_hash_table_remove().
Essentially, it simplifies memory management of keys and values to the question of whether they are present in the hash table. See container types for a discussion on ownership of elements within container types.
A similar function exists for GPtrArray: g_ptr_array_new_with_free_func().
When using container types, such as GPtrArray or GList, an additional level of ownership is introduced: as well as the ownership of the container instance, each element in the container is either owned or unowned too. By nesting containers, multiple levels of ownership must be tracked. Ownership of owned elements belongs to the container; ownership of the container belongs to the scope it’s in (which may be another container).
A key principle for simplifying this is to ensure that all elements in a container have the same ownership: they are either all owned, or all unowned. This happens automatically if the normal g_ptr_array_new_with_free_func() or g_hash_table_new_full() convenience functions are used for types like GPtrArray and GHashTable.
If elements in a container are owned, adding them to the container is essentially an ownership transfer. For example, for an array of strings, if the elements are owned, g_ptr_array_add() effectively behaves as if its data parameter were annotated (transfer full). So, for example, constant (unowned) strings must be added to the array using g_ptr_array_add (array, g_strdup ("constant string")).
Whereas if the elements are unowned, the data parameter is effectively (transfer none), and constant strings can be added without copying them: g_ptr_array_add (array, "constant string").
See the documentation section for examples of comments to add to variable definitions to annotate them with the element type and ownership.
A useful design pattern for more complex functions is to have a single control path which cleans up (frees) allocations and returns to the caller. This vastly simplifies tracking of allocations, as it’s no longer necessary to mentally work out which allocations have been freed on each code path — all code paths end at the same point, so perform all the frees then. The benefits of this approach rapidly become greater for larger functions with more owned local variables; it may not make sense to apply the pattern to smaller functions.
This approach has two requirements:
- The function returns from a single point, and uses goto to reach that point from other paths.
- All owned variables are set to NULL when initialised or when ownership is transferred away from them.
The example below is for a small function (for brevity), but should illustrate the principles for application of the pattern to larger functions:
Memory leaks can be checked for in two ways: static analysis, and runtime leak checking.
Static analysis with tools like Coverity or Tartan can catch some leaks, but requires knowledge of the ownership transfer of every function called in the code. Domain-specific static analysers like Tartan (which knows about GLib memory allocation and transfer) can perform better here, but Tartan is quite a young project and still misses things (a low true positive rate). It is recommended that code be put through a static analyser, but the primary tool for detecting leaks should be runtime leak checking.
Runtime leak checking is done using Valgrind, using its memcheck tool. Any leak it detects as ‘definitely losing memory’ should be fixed. Many of the leaks which ‘potentially’ lose memory are not real leaks, and should be added to the suppression file.
See the tooling guide for more information on using Valgrind.