There are a few anti-patterns to consider when accessing the filesystem. This article assumes knowledge of the standard GFile, GInputStream and GOutputStream APIs.


Asynchronous I/O

All I/O should be performed asynchronously, that is, without blocking the GLib main context. This can be achieved by always using the *_async() and *_finish() variants of each I/O function: for example, g_input_stream_read_async() rather than g_input_stream_read().

Synchronous I/O blocks the main loop, which means that other events, such as user input, incoming networking packets, timeouts and idle callbacks, are not handled until the blocking function returns.

Note that the alternative, running synchronous I/O in a separate thread, is highly discouraged; see the threading guidelines for more information.

File path construction

File names and paths are not normal strings: on some systems, they can use a character encoding other than UTF-8, while normal strings in GLib are guaranteed to always use UTF-8. For this reason, special functions should be used to build and handle file names and paths. (Modern Linux systems almost universally use UTF-8 for filename encoding, so this is not an issue in practice, but the file path functions should still be used.)

For example, file paths should be built using g_build_filename() rather than g_strconcat(). Doing so makes it clearer what the code is meant to do, and also eliminates duplicate directory separators between the elements being joined. The returned path is not necessarily absolute, and components such as . and .. are left as they are.

As another example, paths should be disassembled using g_path_get_basename() and g_path_get_dirname() rather than g_strrstr() and other manual searching functions.

Path validation and sandboxing

If a filename or path comes from external input, such as a web page or user input, it should be validated to ensure that putting it into a file path will not produce an arbitrary path. For example, suppose a filename is constructed from the constant string ~/ plus some user input. If the user inputs ../../etc/passwd, they can (potentially) gain access to sensitive account information, depending on which user the program is running as and what it does with data loaded from the constructed path.

This can be avoided by validating constructed paths before using them: use g_file_resolve_relative_path() to convert any relative path to an absolute one, then check that the result is beneath a root sandboxing directory appropriate for the operation. For example, if code downloads a file, it could validate that all paths are beneath ~/Downloads, using g_file_has_parent() for immediate children or g_file_has_prefix() for files nested at any depth.

As a second line of defence, all projects which access the filesystem should provide an AppArmor profile which limits the directories they can read from and write to. See the AppArmor guidelines for more information.

External links