Current situation

When g_malloc() (and thus g_new()) can't allocate memory, it calls abort() and the program dies. Many allocations done in GTK+, Pango, GLib, etc. go through g_malloc(). Applications don't want to crash when they run out of memory; they want to recover gracefully.

The situation with GSlice is not clear. I couldn't tell from a 10-second look at the source code whether it will abort() or return NULL. The documentation doesn't say.

FIXME: The docs for g_malloc() don't say that it will abort() if it can't allocate memory.

FIXME: The docs for GSlice don't say what will happen if the allocation fails.

All the code in GTK+ assumes the following: if it does a known-to-be-small allocation, it will do so through g_new() or g_malloc(), and it won't bother to check the return value because it knows that it will be non-NULL (the program would have aborted if the allocation failed). It's only potentially large allocations through g_try_malloc() and g_try_new() that are checked, for example, allocating the large RGB buffer for a GdkPixbuf.
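The two conventions can be sketched in plain C. This is only an illustration: sketch_malloc() and sketch_try_malloc() are hypothetical stand-ins for g_malloc() and g_try_malloc(), alloc_rgb_buffer() is a made-up caller, and sketch_simulate_oom is a test hook that pretends the heap is exhausted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Test hook (hypothetical): when nonzero, every allocation "fails". */
static int sketch_simulate_oom = 0;

/* Stand-in for g_try_malloc(): reports failure by returning NULL. */
void *
sketch_try_malloc (size_t n_bytes)
{
  assert (n_bytes > 0);
  return sketch_simulate_oom ? NULL : malloc (n_bytes);
}

/* Stand-in for g_malloc(): never returns NULL, aborts instead. */
void *
sketch_malloc (size_t n_bytes)
{
  void *mem = sketch_try_malloc (n_bytes);

  if (mem == NULL)
    abort ();
  return mem;
}

/* A potentially large allocation: the caller must check for NULL. */
unsigned char *
alloc_rgb_buffer (size_t width, size_t height)
{
  /* Reject empty images and sizes whose byte count would overflow. */
  if (width == 0 || height == 0 || width > SIZE_MAX / 3 / height)
    return NULL;
  return sketch_try_malloc (width * height * 3);
}
```

Small allocations go through sketch_malloc() and are used unchecked; only callers of alloc_rgb_buffer() need a NULL test.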

We cannot change the ABI. We cannot suddenly make g_malloc() and friends return NULL if the allocation fails, since programs do not check for this condition.

Plan of action

A common technique is to have an "emergency memory pool" for this kind of situation. Turbo Pascal used it for its Turbo Vision widget toolkit, and it worked very well.

The idea is that you can do

  foo = gtk_foo_new ();

  if (g_needed_emergency_memory_pool ()) {
      gtk_foo_free (foo);
      dialog_box ("out of memory");
  }

That is, there is a pool of memory which will only be touched if an allocation fails within the normal heap. The pool is large enough to contain typical "small" allocations (e.g. widget structures, strings). You can ask if the pool contains any data: if it does, it means that the normal heap is exhausted and you should immediately free some stuff. The pool is also large enough that it lets you bring up an "out of memory" dialog.

We can change GSlice to dip into the memory pool if its normal allocation machinery fails to find some free space.
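A minimal sketch of such a pool, in plain C, might look like the following. Everything here is hypothetical: pool_backed_malloc(), pool_backed_free(), needed_emergency_memory_pool() and the pool size are made-up names and values, not existing GLib API, and simulate_heap_exhaustion is only a test hook. The pool is a simple bump allocator that is reset once every block taken from it has been freed; a real implementation inside GSlice would presumably be more careful about reuse and threads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One pool unit; the union keeps every block suitably aligned. */
typedef union { long long ll; long double ld; void *ptr; } pool_unit_t;

#define POOL_UNITS 4096  /* arbitrary size chosen for this sketch */

static pool_unit_t emergency_pool[POOL_UNITS];
static size_t pool_units_used = 0;   /* bump pointer, in units */
static size_t pool_live_blocks = 0;  /* pool blocks not yet freed */

/* Test hook: when nonzero, pretend the normal heap is exhausted. */
static int simulate_heap_exhaustion = 0;

static int
is_pool_pointer (const void *mem)
{
  uintptr_t p = (uintptr_t) mem;
  uintptr_t start = (uintptr_t) emergency_pool;

  return p >= start && p < start + sizeof emergency_pool;
}

/* Try the normal heap first; dip into the pool only if that fails. */
void *
pool_backed_malloc (size_t n_bytes)
{
  void *mem;
  size_t free_bytes, units;

  assert (n_bytes > 0);
  mem = simulate_heap_exhaustion ? NULL : malloc (n_bytes);
  if (mem != NULL)
    return mem;

  free_bytes = (POOL_UNITS - pool_units_used) * sizeof (pool_unit_t);
  if (n_bytes > free_bytes)
    abort ();  /* pool exhausted too: give up, as g_malloc() does today */

  units = (n_bytes + sizeof (pool_unit_t) - 1) / sizeof (pool_unit_t);
  mem = &emergency_pool[pool_units_used];
  pool_units_used += units;
  pool_live_blocks++;
  return mem;
}

void
pool_backed_free (void *mem)
{
  if (!is_pool_pointer (mem))
    {
      free (mem);
      return;
    }
  assert (pool_live_blocks > 0);
  if (--pool_live_blocks == 0)
    pool_units_used = 0;  /* pool is empty again: reuse it from the start */
}

/* Nonzero while the pool holds data, i.e. the normal heap ran out. */
int
needed_emergency_memory_pool (void)
{
  return pool_live_blocks > 0;
}
```

Callers never see NULL, so existing code keeps working; code that cares checks needed_emergency_memory_pool() after a batch of allocations and frees what it just created if the answer is nonzero.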

Advantages

The GTK+ code wouldn't have to change. Widget code can still assume that normal allocations will succeed. User code can check the memory pool after "important" operations (for example, creating a whole window or dialog box) to see that memory has not been exhausted.

Existing application code doesn't have to change, either. People can retrofit their applications with the memory pool gradually, but their current behavior would not change at all.

You would do this:

  /* Create a bunch of widgets as usual */

  GtkWidget *window;
  GtkWidget *box;
  GtkWidget *widget;

  window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
  box = gtk_box_new (...);
  gtk_container_add (GTK_CONTAINER (window), box);

  widget = gtk_entry_new ();
  gtk_container_add (...);

  ...

  /* Now, check for exhausted memory */

  if (g_needed_emergency_memory_pool ()) {
      gtk_widget_destroy (window);
      dialog_box ("out of memory");
  }

Disadvantages

This will only work out of the box for applications which allocate memory through GLib. So, it won't work for Cairo, Fontconfig, etc. Hopefully those libraries are already able to return NULL if allocations fail. If not, we can probably get the glibc maintainers to add an emergency memory pool for the basic malloc().

On cairo and out-of-memory

Cairo already has a system for dealing with out-of-memory more gracefully than abort(). It checks the return value of all malloc calls (or at least we intend it to; we haven't done exhaustive analysis or testing to see if we missed any). On encountering NULL, it doesn't actually return a NULL pointer to the user expecting an object. Instead, it returns the address of a special static "nil" object, and every entry point for functions operating on that kind of object returns immediately if it is passed a nil object.

This scheme is intended to allow applications to gracefully deal with out-of-memory situations, while at the same time not requiring applications to test the return value of every call that might perform allocation. Instead, applications can check the error status at a convenient time (similar to your example application taking advantage of the memory pool).
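The nil-object scheme can be sketched in a few lines of C. This is not cairo's actual code: shape_t, shape_create(), shape_set_width(), shape_status() and shape_destroy() are hypothetical names invented for the illustration, and shape_simulate_oom is a test hook that makes the allocation fail.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum {
  SHAPE_STATUS_SUCCESS,
  SHAPE_STATUS_NO_MEMORY
} shape_status_t;

typedef struct {
  shape_status_t status;
  double width;
} shape_t;

/* The single static "nil" object handed out when allocation fails. */
static shape_t shape_nil = { SHAPE_STATUS_NO_MEMORY, 0.0 };

/* Test hook: when nonzero, pretend malloc() returned NULL. */
static int shape_simulate_oom = 0;

/* Never returns NULL: on failure the caller gets the inert nil object. */
shape_t *
shape_create (void)
{
  shape_t *shape = shape_simulate_oom ? NULL : malloc (sizeof *shape);

  if (shape == NULL)
    return &shape_nil;
  shape->status = SHAPE_STATUS_SUCCESS;
  shape->width = 0.0;
  return shape;
}

/* Every entry point returns immediately when handed an object in error. */
void
shape_set_width (shape_t *shape, double width)
{
  assert (shape != NULL);
  if (shape->status != SHAPE_STATUS_SUCCESS)
    return;
  shape->width = width;
}

/* The only correct way to tell a real object from the nil one. */
shape_status_t
shape_status (const shape_t *shape)
{
  return shape->status;
}

void
shape_destroy (shape_t *shape)
{
  if (shape == &shape_nil)
    return;  /* static object, nothing to free */
  free (shape);
}
```

An application can create an object, call any number of setters on it, and check shape_status() once at the end; a test for NULL would always pass and prove nothing.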

So, the cairo approach has benefits similar to the memory pool, but it has the additional benefit of only using static objects, so there's no pool that could be exhausted. On the other hand, cairo does not guarantee that any useful work can be done while the out-of-memory situation persists. For example, if the application needs to create a new surface in order to report the error message, then the user will need to free up memory before doing that (or the application could have allocated emergency resources like that in advance for this very purpose).

It's possible that cairo's scheme is worth considering for GTK+. One potential disadvantage that it has over the memory pool approach is that functions return these inert "nil" objects that are non-NULL, yet can't be used for anything. That can lead to unexpected effects in some situations. For example, an application might actually want to ensure the allocation worked, and use a test for NULL, thinking that is sufficient. In fact, however, the correct way to distinguish an inert object from a real one is to call the status() function on that object. Perhaps that's outweighed by the fact that object creation functions have a strong guarantee that they will never return NULL (and once that concept is internalized, it gets easy to know that a test for NULL is not correct).

Attic/GlibLowMemoryPool (last edited 2018-12-05 18:23:04 by EmmanueleBassi)