Iterators API

Status

Bug with patch being proposed: http://bugzilla.gnome.org/show_bug.cgi?id=560061 Git repository with this proposal implemented: http://git.codethink.co.uk/?p=glib;a=shortlog;h=collections

git clone git://git.codethink.co.uk/git/glib
cd glib
git checkout collections

The idea

We would like to introduce an iterator API in glib/gio. This could allow us to migrate our public APIs away from C-isms like GList, GHashTable and GPtrArray to container types that contain typed objects. These could be used with a type-safe GObject-based API that could be wrapped easily by language bindings.

You would not be forced to use these types for all of your list algorithms, but we would like you to use these types in your public API.

Take a look at the Subversion repository of libgee to get an idea: http://svn.gnome.org/viewvc/libgee/trunk/gee/. Note that the type names are not always the same as the ones that we have in mind (especially for the List type). However, its shows the general API and structure that we have in mind.

This is a simplified version of that API (using a C#-like syntax for simplicity):

interface Iterator {
        bool next ();
        object current ();
}

interface Iterable {
        Iterator iterator ();
}

interface Collection: Iterable {
        int size { get; }
        bool contains (object item);
        bool add (object item);
        bool remove (object item);
        void clear ();
}

interface Seq: Collection {
        object get (int index);
        void set (int index, object item);
        int index_of (object item);
        void insert (int index, object item);
        void remove_at (int index);
}

Comments

These are not (yet) real examples

This page illustrates ideas and concepts with examples, but these are not yet real examples that use a real API.

Boxing/Unboxing

Lists of non-object types (value types) will contain boxed versions of the items. We have not yet decided how this boxing will be done. For example GValue could be used, but would make these list types up to four times slower for just the boxing and unboxing. Although usually it's only important to make the internal (cached) lists as fast as possible, this could make the public API slower.

One idea is to provide convenience functions such as g_seq_add_object, g_seq_add_int, g_seq_add_string, g_seq_add_double and g_iterator_current_object, g_iterator_current_int, g_iterator_current_string, g_iterator_current_double.

void MyMethod () {
        GList *glist = g_list_prepend (glist, "Abc");
        GSeq *seq = g_glist_seq_new (glist);
        GIterator *iter = g_iterable_iterator (G_ITERABLE (seq));

        while (g_iterator_next (iter)) {
                gchar *str = g_iterator_current_string (iter);
                g_print ("%s\n", str);
                g_free (str);
        }

        g_object_unref (iter);
        g_object_unref (seq);
        g_list_free (glist);
}

You want boxing because you want reference counting. It would be inconsistent if caller-owns would be utilized for one kind of GIterator implementation and not for another kind of GIterator implementation. You want consistency in this to help language binding developers.

Meanwhile you want reference counting not only because that's the right thing to do for a GObject based API, but also for thread safety. If every return adds a reference, you don't have to worry about another thread taking a reference: your context will have its own reference until you are done with it.

This consistency would help language bindings developers.

Naming

GList is already used, so we suggest GSeq. Other types that we have in mind are GIterable, GCollection and GIterator

Why isn't an Iterable always a List?

Consider the example of a Directory type. A directory is certainly not a list, but you could structure it in such a way that you can iterate over its sub-directories.

At the moment we would do this:

GList * g_directory_get_subdirs (GDirectory *d)

But that GList is untyped. There is no way at runtime to know what types of object are in the GList. You have to read the documentation, which often forgets to mention it, or whether the caller should free or unref the items and/or the list itself.

But with the iterator API, this is possible. Let's try this in C# first. We want to know how it'll look in the high-level language first, right?

class Directory : Iterable { ... }

private void MyMethod () {
        Directory root = new Directory ("/");
        foreach (Directory subdir in root) {
                Console.WriteLine (subdir.ToString());
        }
}

That doesn't look too bad. Let's now try putting this in C:

void main (..) {
        GDirectory *root = g_directory_new ("/");
        GIterator *iter = g_iterable_iterator (G_ITERABLE (root));

        while (g_iterator_next (iter)) {
                GDirectory *subdir = G_DIRECTORY (g_iterator_current (iter));
                g_print ("%s\n", g_directory_get_path (subdir));
                g_object_unref (subdir);
        }

        g_object_unref (iter);
        g_object_unref (root);
}

You probably wonder how GDirectory would look? GDirectory would implement GIterable by providing an implementation for g_iterable_iterator. For example, like this:

#include "g-directory-iterator-priv.h"
static GIterator*
g_directory_iterator (GIterable *self) {
        return _g_directory_iterator_new (self);
}

static void
g_directory_iterable_iface_init (GIterableIface *iface, ...) {
        iface->iterator = g_directory_iterator;
}

The GDirectoryIterator would be a private type. Even its constructor wouldn't be publicly exported because it's only needed in the implementation of g_iterator_iterator in the static g_directory_iterator. We illustrated this by using -priv.h for its .h file.

Let's take a look at another example:

The result of a database query does not have to be a list, but you can nevertheless make it iterable. For example:

public class QueryResult : Iterable { ... }
public class QueryRow : Iterable { ... }

private void MyMethod (Database database) {
        Query query = new Query (database, "* FROM Table");
        QueryResult result = query.Execute ();

        foreach (QueryRow row in result) {
                foreach (QueryCell cell in row) {
                        Console.WriteLine (cell.ToString ());
                }
        }
}

The example doesn't really illustrate how the cursor is used. While the first foreach loops, it'll be executing NEXT SQL commands to the database that is within the context of the query that was executed. This page is not about how you do this, so we've written "..." as QueryResult's implementation.

Putting this in C. This will look less nice, but it'll still understandable. Let's try:

void MyMethod (GDatabase *database) {
        GQuery *query = g_query_new (database, "* FROM Table");
        GQueryResult *result = g_query_execute ();
        GIterator *row_iter = g_iterable_iterator (G_ITERABLE (result));

        while (g_iterator_next (row_iter)) {
                GQueryRow *row = G_QUERY_ROW (g_iterator_current (row_iter));
                GIterator *col_iter = g_iterable_iterator (G_ITERABLE (row));
                while (g_iterator_next (col_iter)) {
                        GQueryCell *cell = Q_QUERY_CELL (g_iterator_current (cell_iter));
                        g_print ("%s\n", g_query_cell_get_content_as_string (cell));
                        g_object_unref (cell);
                }
                g_object_unref (col_iter);
                g_object_unref (row);
        }

        g_object_unref (row_iter);
        g_object_unref (result);
        g_object_unref (query);
}

And here's an example of an XML parsing API:

public void MyMethod (GLib.XmlDocument document) {
        GLib.XmlNode node = document.Root;
        foreach (GLib.XmlNode sub_node in node) {
                Console.WriteLine (sub_node.ToString ());
        }
}

And in C:

void MyMethod (GXmlDocument *document) {
        GXmlNode *node = g_xml_document_get_root (document);
        GIterator *iter = g_iterable_iterator (G_ITERABLE (node));

        while (g_iterator_next (iter)) {
                GXmlNode *sub_node = G_XML_NODE (g_iterator_current (iter));
                g_print ("%s\n", g_xml_node_get_content (sub_node));
                g_object_unref (sub_node);
        }

        g_object_unref (iter);
        g_object_unref (node);
}

Isn't this going to make the transition very difficult?

Consider this example:

/**
 * gtk_container_get_children:
 * ...
 * language-bindings: (cism)
 * Returns: a doubly link...
 **/
GList * gtk_container_get_children (GtkContainer *container);

It could easily be converted to a much easier API for high-level languages:

/**
 * gtk_container_get_children_in_seq:
 * ...
 * language-bindings: (uncism-for gtk_container_get_children) (remove _in_seq)
 **/
void gtk_container_get_children_in_seq (GtkContainer *container, GSeq *seq) {
        GList *old_list = gtk_container_get_children (container);
        while (old_list) {
                g_collection_add_object (G_COLLECTION (seq), old_list->data);
                old_list = old_list->next;
        }
        g_list_free (old_list);
}

In the high-level language:

This is a bad example for a Model View Controller because it sounds unlikely that you would want to make a view for a list of Gtk.Widgets and steal them from a Gtk.Container. But it illustrates the example with an existing Gtk+ API.

To know how GLib.SeqImplementor would work in .NET, take a look at the GtkTreeModelDemo and Implementing GInterfaces of the Mono documentation.

public class MyModel : GLib.SeqImplementor, IList { 
        ...
}
public class MyView {
        public MyModel Model { get {..} set {..} }
        ...
}

private void MyMethod () {
        MyModel model = new MyModel ()
        Gtk.Container container = ...
        MyView view = new MyView ();
        view.Model = model;
        container.get_children (view.Model);
}

Implementing an Iterable in a high-level programming language

The previous example already showed it but let's give a more complex example: at the GtkTreeModelDemo and Implementing GInterfaces are the Mono documentation that describe how this works. For information about IEnumerator and IEnumerable go to the IEnumerator and the IEnumerable documentation pages of the .NET framework. You'll notice that this is the same example but adapted to be compatible with GIterator and GIterable.

Note that some of this code could be put (by the language binding developer) in a .custom file.

public class PeopleIterator: IEnumerator, GLib.IteratorImplementor {
        private Person[] people;
        private int position = -1;

#region In the .custom file
        // The standard .NET IEnumerator interface
        public bool MoveNext () {
                return this.next ();
        }
        public object Current {
                get {
                        return this.current ();
                }
        }
#endregion

        public void Reset () {
                this.position = -1;
        }

        // The GLib.IteratorImplementor interface
        public bool next () {
                this.position++;
                return (this.position < this.people.Length);
        }
        public object current () {
                if (this.position == -1)
                        throw new InvalidOperationException ();
                else
                        return this.people[this.position];
        }
}

public class People: IEnumerable, GLib.IterableImplementor {
        private Person[] people;

#region In the .custom file
        // The standard .NET IEnumerable interface
        public IEnumerator GetEnumerator() {
                return this.iterator ();
        }
#endregion

        // The GLib.IterableImplentor interface
        public GLib.Iterator iterator () {
                return new PeopleIterator (this.people);
        }

        public People(Person[] pArray) {
                this.people = new Person[pArray.Length];
                for (int i = 0; i < pArray.Length; i++)
                        this.people[i] = pArray[i];
        }
}

public class App {
        static void Main() {
                Person[] peopleArray = new Person[3] {
                        new Person("John", "Smith"),
                        new Person("Jim", "Johnson"),
                        new Person("Sue", "Rabon"),
                };
                People peopleList = new People (peopleArray);

                foreach (Person p in peopleList)
                        Console.WriteLine(p.firstName + " " + p.lastName);
        }
}

You probably notice that to implement the standard .NET interfaces, we just need to call the implementations for GIterator and GIterable. The GIterator and GIterable concept is modelled precisely on the infrastructure provided by the current high-level languages. We also could have kept the sample from the .NET documentation as is and called the IEnumerator and IEnumerable API from our GIterator and GIterable ones.

Libraries already using custom iterator interfaces

Actually all doubly-linked list style APIs are kind of like iterators/iterable too.

  • libxml
  • glist
  • etc etc

A complete example

This would not necessarily be the best way to use the API:

GCollection *
g_volume_monitor_get_volumes (GVolumeMonitor *mon)
{
    GCollection *coll = g_list_collection_new (G_TYPE_VOLUME);
    ...
   return coll;
}

A better API usage might be this (the native C code):

static inline void
foreachvol (GVolume *volume, GCollection *collection)
{
   g_collection_add (collection, g_object_ref (volume));
}

void
g_volume_monitor_get_volumes (GVolumeMonitor *mon, GCollection *collection)
{
  g_list_foreach (mon->priv->volumes, (GFunc) foreachvol, collection);
}

This would enable you to do this in the higher language:

public class VolumeModel : Gtk.TreeModel, GLib.Collection {

  public override void add (GLib.Object volume) {
       Gtk.TreePath path;
       Gtk.TreeIter iter;

       iter = this.insert_in_treemodel (volume, out path);
       this.row_inserted (path, out iter);
  }

  public override void remove (GLib.Object volume) {
       Gtk.TreePath path;
       Gtk.TreeIter iter;

       iter = this.delete_from_treemodel (volume, out path);
       this.row_deleted (path, out iter);
  }

  public VolumeModel (GLib.VolumeMonitor monitor) {
      this.monitor = monitor;

      this.monitor.added += add (GLib.Volume volume) => {
         this.add (volume);
      }

      this.monitor.deleted += del (GLib.Volume volume) => {
         this.remove (volume);
      }

      monitor.get_volumes (this);
  }
}

public class VolumeView : Gtk.TreeView {

  private void setup_columns () {
        /* ... */
  }

  public VolumeView (VolumeModel model) {
        this.setup_columns ();
        this.model = model
  }
}

public class MyClass : Gtk.Window {
  public void MyMethod () {
     GLib.VolumeMonitor monitor = new GLib.VolumeMonitor ();
     VolumeModel model = new VolumeModel (monitor);
     VolumeView view = new VolumeView (model);

     this.add (view);
  }
}

Conclusions

This API makes the common things easy while the complex things are still possible. You can still implement extremely efficient iterators and iterables if your application needs those. You can even override existing ones to make them more efficient or to make them more application-domain specific.

Such things would not be possible with an introspected GList. Such a solution would only make the complex cases possible, wihtout making the common cases simple. And you would lose a certain amount of performance to the introspection tasks.

Attic/IteratorsAPI (last edited 2013-11-23 01:41:23 by WilliamJonMcCann)