GtkRecentManager is our standard interface to ~/.local/share/recently-used.xbel (described in detail in the Desktop Bookmark Specification).

Zeitgeist is a project to keep a chronological log of events (things the user does or that otherwise happen): accessing/editing files (local, remote and web), having IM conversations, launching applications, etc.

In this page, we talk about making GtkRecentManager use Zeitgeist's logging backend instead of (or in addition to) using that XBEL file, and about extending its API to support some concepts which would make things much more useful for Zeitgeist.

Summary

The idea is to keep GtkRecentManager as the simple logging API for apps that deal with "real file URIs". Applications handling special things like IM conversations or music can use the Zeitgeist API directly.

The logging and querying API will be kept as legacy for apps that use them (to generate recently-used menus and such). For other more sophisticated usage, the Zeitgeist API should be used directly.

Objectives

  • Get more details logged. Particularly, correct timestamps and a distinction between create, open, modify and (new) close events.
  • Get data from Zeitgeist, since it's expected to be more complete (i.e. not only from GNOME apps).

API Proposal (WIP)

Insertion

The following operation will replace gtk_recent_manager_add_item and gtk_recent_manager_add_full:

void gtk_recent_manager_add_activity (manager, const enum GtkRecentManagerActivityType type, const GtkRecentActivityData *data);

The method returns void so that the operation can be made asynchronous. Type can take the following values: _CREATED, _OPENED, _CLOSED and _MODIFIED. Data can carry the following information: uri, display_name, mime_type, application.
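As a rough sketch of what the type and data arguments could look like (plain C, no GLib): the enum and struct names below are hypothetical, since the proposal only gives the value suffixes and the field names. The mapping onto Zeitgeist interpretation names is an assumption based on the interpretations mentioned elsewhere on this page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names: the proposal only gives the suffixes
 * _CREATED, _OPENED, _CLOSED and _MODIFIED. */
typedef enum {
    RECENT_ACTIVITY_CREATED,
    RECENT_ACTIVITY_OPENED,
    RECENT_ACTIVITY_CLOSED,
    RECENT_ACTIVITY_MODIFIED
} RecentActivityType;

/* Fields are the ones listed in the proposal; the layout is assumed. */
typedef struct {
    const char *uri;
    const char *display_name;
    const char *mime_type;
    const char *application; /* name of the .desktop file */
} RecentActivityData;

/* Assumption of this sketch: only the URI is mandatory. */
int recent_activity_data_is_valid(const RecentActivityData *data)
{
    return data != NULL && data->uri != NULL && data->uri[0] != '\0';
}

/* Name of the Zeitgeist event interpretation an activity would map to. */
const char *recent_activity_interpretation(RecentActivityType type)
{
    switch (type) {
    case RECENT_ACTIVITY_CREATED:  return "CREATE_EVENT";
    case RECENT_ACTIVITY_OPENED:   return "ACCESS_EVENT";
    case RECENT_ACTIVITY_CLOSED:   return "LEAVE_EVENT";
    case RECENT_ACTIVITY_MODIFIED: return "MODIFY_EVENT";
    }
    assert(!"unknown activity type");
    return NULL;
}
```

Keeping the type as an enum means a backend can translate it to whatever it stores, without the application knowing about Zeitgeist at all.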

Outstanding issue:

  • Do we need the timestamp parameter or can GTK+ just use the current time?

Retrieval

GList* get_latest_items_for_app(GtkRecentManager *manager, gchar* app, gint limit)  // returns at most limit items for app, most recently used first; app should be the name of the .desktop file

GList* get_latest_items_for_mime_types(GtkRecentManager *manager, GList* mime_types, gint limit)  // returns at most limit items matching mime_types, most recently used first

Function gtk_recent_manager_get_items gets deprecated. We need to figure out how it would make sense to implement it in terms of get_latest_items_for_{app/mime_types} (requesting just the last 100 subjects of any kind may be too noisy: all of them may be websites).

Targeted retrieval

gtk_recent_manager_lookup_item(uri) -> GtkRecentInfo

The GtkRecentInfo structure isn't ideal for Zeitgeist, but I suppose for backwards compatibility we don't really want to change it.

GtkRecentInfo:
 uri: OK
 display name: OK
 description: deprecated (always NULL)
 mime-type: OK
 app-name: .desktop file name
 app_exec: deprecated (???)
 groups: deprecated
 is_private: deprecated (always false)
 - get_added: what's the point of this?
 - get_modified, get_visited: need ZG queries
 - get_applications: needs ZG query
 - get_application_info(app): needs ZG query,
                     changed to take a .desktop file
 - last_application: OK (app-name)
 - has_application: needs query
 - has_group: deprecated (always false)
 - get_icon, get_gicon: OK
 - get_short_name: OK
 - get_uri_display: OK
 - get_age: needs query
 - is_local, exists, match: OK

gtk_recent_manager_has_item(uri) -> bool

Can be implemented. However, what's it good for? Should probably be deprecated.

Modification and removal

gtk_recent_manager_move_item -> bool

Insert a MOVE_EVENT.

gtk_recent_manager_remove_item(uri) -> bool

We could implement it with ZG, but it may not be a good idea (e.g. gedit uses it to delete inaccessible items, but those still belong in Zeitgeist's database).

gtk_recent_manager_purge_items -> gint

Doesn't make sense in the context of Zeitgeist. Deprecate (NO-OP).

Outstanding issue:

  • Do we want to introduce some API for deletion (and maybe make purge_items use that)? What would it look like, removing stuff only for that application, or only recent stuff or what?


OLD DISCUSSION

Snapshot from the whiteboard at the Desktop Summit

FIXME: this needs detailing

1. Example of app

2. Docs

3. Tests

4. Bugs

5. Simple API

6. Add missing pieces for Zeitgeist

7. Zeitgeist backend

Event interpretations:

gtk_recent_opened (uri);
gtk_recent_closed (uri);
gtk_recent_modified (uri);
...

Decision: Keep GtkRecentManager as the simple API for apps that deal in "real file URIs". Special stuff like IM conversations or music can use the Zeitgeist API directly.

Keep the querying APIs as legacy for apps that use them (to generate recently-used menus and such). For other more sophisticated usage, promote using the Zeitgeist API directly.

Right now GtkRecentManager doesn't deal with timestamps at all; they are magically generated in GBookmarkFile. Zeitgeist needs an accurate and finer-grained view of timestamps - bug.

Zeitgeist likes to have "event interpretations" that say what happened to a file - opened? closed? edited? So we need an API to specify that.

Problems with GtkRecentManager

Vague timestamps

GtkRecentManager doesn't provide a way for apps to specify the timestamps for their items, and it does not add timestamps automatically itself. Instead, it assumes that the underlying GBookmarkFile will do it. However, GBookmarkFile handles this in a very simplistic way when the caller doesn't specify timestamps at all:

  • added/modified/visited timestamps start as a magic value of -1.
  • When a new item is added (one whose URI wasn't present in the bookmarks file), its added/modified timestamps are set to NOW. This is okay for a "save" action, but not for an "open" action, as the latter does not modify the data.
  • When an item is written out to the bookmarks file, any timestamps that remain as -1 are set and written as NOW.
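The three rules above can be modelled in a few lines of plain C. This is a toy model of the behaviour as described here, not GBookmarkFile's actual code; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Magic "not set" value, as described above. */
#define STAMP_UNSET ((time_t) -1)

typedef struct {
    time_t added;
    time_t modified;
    time_t visited;
} ItemStamps;

/* All three stamps start out unset. */
void item_stamps_init(ItemStamps *s)
{
    assert(s != NULL);
    s->added = s->modified = s->visited = STAMP_UNSET;
}

/* Adding a new item stamps added/modified with "now",
 * whether the app saved the file or merely opened it. */
void item_stamps_on_add(ItemStamps *s, time_t now)
{
    assert(s != NULL);
    s->added = now;
    s->modified = now;
}

/* On write-out, anything still unset becomes "now". */
void item_stamps_on_write(ItemStamps *s, time_t now)
{
    assert(s != NULL);
    if (s->added == STAMP_UNSET)    s->added = now;
    if (s->modified == STAMP_UNSET) s->modified = now;
    if (s->visited == STAMP_UNSET)  s->visited = now;
}
```

If an app opens a file at time 100 and the bookmarks file is written at time 200, the item ends up with modified = 100 (although nothing was modified) and visited = 200 (although the visit happened at 100).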

Clearly, Zeitgeist needs a more accurate view of timestamps for each event: it thinks in terms of "events", not "items".

Event information

Zeitgeist likes to have "event interpretations" that say what happened to a file. Was the file opened? saved (with modifications)? closed?

We need a way in the GtkRecentManager API to specify a few standard or common actions, ideally along with their timestamps ("opened at $time", "saved at $time").

See the Zeitgeist docs on interpretations and look for ACCESS_EVENT, CREATE_EVENT, MODIFY_EVENT, LEAVE_EVENT.

Note: Zeitgeist also has "event manifestations", which answer the question "how did this come to happen?". For example, did something take place because of direct action from the user, or did the system do something on their behalf? For GtkRecentManager, we can probably always say that the user did it, or we can leave the manifestation field blank.

Usage in applications

Gedit

gedit-window.c has two simple wrappers, _gedit_recent_add() and _gedit_recent_remove(), for gtk_recent_manager_add_full() and gtk_recent_manager_remove_item(), respectively. They only use the URI and mime-type when adding items; everything else (application, groups) is the "obvious" info for gedit.

Items get added to the recent manager when a file is finished loading (look for _gedit_recent_add() in gedit-tab.c:document_loaded()), and when a file is finished saving (document_saved() in the same file).

Items get removed from the recent manager when a file fails to load (look for _gedit_recent_remove() in various places in gedit-tab.c). I'm not sure if removing items from the list is a good idea in this case.

The Zeitgeist plugin for Gedit (FIXME: link to source) saves these:

  • subject interpretation: Interpretation.DOCUMENT
  • subject manifestation: Manifestation.FILE_DATA_OBJECT
  • event interpretation: Interpretation.ACCESS_EVENT, Interpretation.MODIFY_EVENT or Interpretation.LEAVE_EVENT; see below
  • event manifestation: Manifestation.USER_ACTIVITY

Note: The plugin doesn't do CREATE_EVENT. The analogous plugin for Emacs handles CREATE_EVENT by doing "if file_exists then MODIFY_EVENT else CREATE_EVENT" when saving.
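The Emacs rule could be sketched in plain C roughly like this. The function name is hypothetical, and the existence check via fopen() is a simplification: a file that exists but cannot be read would be misreported as new.

```c
#include <assert.h>
#include <stdio.h>

/* Decide which interpretation a save should be logged with.
 * Must be called BEFORE the file is written, otherwise every
 * save looks like a modification. */
const char *interpretation_for_save(const char *path)
{
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f != NULL) {
        fclose(f);
        return "MODIFY_EVENT"; /* file already existed */
    }
    return "CREATE_EVENT";     /* first save of a new file */
}
```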

EOG

EOG only uses gtk_recent_manager_add_full() when opening images; it doesn't ever remove them from the recent manager (this is the right thing to do).

There is a Zeitgeist plugin for EOG (FIXME: link to source). It saves these:

  • subject interpretation: Interpretation.IMAGE
  • subject manifestation: Manifestation.FILE_DATA_OBJECT
  • event interpretation: Interpretation.MODIFY_EVENT / LEAVE_EVENT / ACCESS_EVENT
  • event manifestation: Manifestation.USER_ACTIVITY

Geany

This is practically the whole source for the Zeitgeist plugin for Geany. It is particularly enlightening as to what an API should look like: we can pass event interpretations as strings; we could have some #defines for them (what about language bindings?).

static void insert_zeitgeist(GeanyDocument *doc, const char *action)
{
        char           *uri;
        gchar          *filetype;
        ZeitgeistEvent *event;

        uri = DOC_FILENAME(doc);
        filetype = doc->file_type->name;

        event = zeitgeist_event_new_full (
                action,                                 /* ev. interpretation */
                ZEITGEIST_ZG_USER_ACTIVITY,             /* ev. manifestation */
                "app://geany.desktop",                  /* actor */
                zeitgeist_subject_new_full (
                        uri,                            /* uri */
                        ZEITGEIST_NFO_TEXT_DOCUMENT,    /* subj. interpretation */
                        ZEITGEIST_NFO_FILE_DATA_OBJECT, /* subj. manifestation */
                        filetype,                       /* mime-type */
                        uri,                            /* origin */
                        uri,                            /* text (display name) */
                        "net"),                         /* storage (volume UUID or "net") */
                NULL);                                  /* no more subjects for the event */

        g_debug("inserting event");
        zeitgeist_log_insert_events_no_reply(log, event, NULL);
        g_debug("zeitgeist end");
}

static void on_document_open(GObject *obj, GeanyDocument *doc, gpointer user_data)
{
        g_debug("Example: %s was opened\n", DOC_FILENAME(doc));
        insert_zeitgeist(doc, ZEITGEIST_ZG_ACCESS_EVENT);
}

static void on_document_close(GObject *obj, GeanyDocument *doc, gpointer user_data)
{
        g_debug("Example: %s was closed\n", DOC_FILENAME(doc));
        insert_zeitgeist(doc, ZEITGEIST_ZG_LEAVE_EVENT);
}

static void on_document_new(GObject *obj, GeanyDocument *doc, gpointer user_data)
{
        g_debug("Example: %s was created\n", DOC_FILENAME(doc));
        insert_zeitgeist(doc, ZEITGEIST_ZG_CREATE_EVENT);
}

static void on_document_activate(GObject *obj, GeanyDocument *doc, gpointer user_data)
{
        g_debug("Example: %s was opened\n", DOC_FILENAME(doc));
        insert_zeitgeist(doc, ZEITGEIST_ZG_ACCESS_EVENT);
}

API discussion

Insertion

The Gedit plugin uses a subject interpretation of Interpretation.DOCUMENT, while the EOG plugin uses Interpretation.IMAGE. In the Desktop Summit, we decided to leave GtkRecentManager as an "easy API" for apps that deal in plain old files (i.e. not Evolution, which would represent emails in Zeitgeist events differently). For those apps, can the subject's interpretation simply be derived from the MIME-type, or do they really need to specify it by hand? Yes, it looks like we can derive this from the MIME-type.
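A minimal sketch of deriving the subject interpretation from the MIME-type, in plain C. Only the two interpretations seen in the plugins above are covered, and the prefix rules are an assumption; a real mapping would need a proper table (for instance, application/pdf is a document but does not start with "text/").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the name of a subject interpretation for a MIME-type,
 * or NULL when this sketch has no rule for it. */
const char *subject_interpretation_for_mime(const char *mime_type)
{
    assert(mime_type != NULL);

    if (strncmp(mime_type, "image/", 6) == 0)
        return "IMAGE";    /* what the EOG plugin uses */
    if (strncmp(mime_type, "text/", 5) == 0)
        return "DOCUMENT"; /* what the Gedit plugin uses */

    return NULL;
}
```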

We can probably always assume a subject manifestation of Manifestation.FILE_DATA_OBJECT (i.e. users of this API deal in plain old files), and an event manifestation of Manifestation.USER_ACTIVITY (i.e. not for automated notifications and such). Are these assumptions correct? Yes, FILE_DATA_OBJECT and USER_ACTIVITY are correct. RainCT: Not really, gedit can handle remote files (e.g. sftp://) and GtkRecentManager currently handles those correctly; deciding between local and remote file programmatically shouldn't be a problem though.
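Telling local from remote files could be as simple as looking at the URI scheme. A sketch in plain C, assuming that "local" means exactly the file:// scheme (the function name is made up):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if the URI uses the file:// scheme, 0 otherwise.
 * URI schemes are case-insensitive, so FILE:// counts too. */
int uri_is_local_file(const char *uri)
{
    static const char prefix[] = "file://";
    size_t i;

    assert(uri != NULL);
    for (i = 0; prefix[i] != '\0'; i++) {
        /* Also stops safely at the end of a URI shorter than the prefix. */
        if (tolower((unsigned char) uri[i]) != prefix[i])
            return 0;
    }
    return 1;
}
```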

Events have a storage property. What do we do about the storage type? Gedit's plugin doesn't even specify it, while Geany always uses "net". RainCT: Just leave the field empty and Zeitgeist will know to do the right thing (if Geany is using "net" that's badly broken).

From the above, it looks like we could simply use the following, with some provision to specify the app-specific information like app identifier (gedit.desktop) and the MIME-type.

void gtk_recent_manager_add_create_event (manager, uri, timestamp);
void gtk_recent_manager_add_access_event (manager, uri, timestamp);
void gtk_recent_manager_add_leave_event (manager, uri, timestamp);
void gtk_recent_manager_add_modify_event (manager, uri, timestamp);

Queries would remain the same as they are right now, to return "items" rather than "events" (i.e. all accesses of foo.txt constitute an item, while Zeitgeist may be storing multiple events for that subject internally).
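To make the item/event distinction concrete, here is a sketch of collapsing a list of events into items, one per URI, in plain C. The types are hypothetical, and which data an item keeps (latest timestamp, event count) is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *uri; long timestamp; } UriEvent;
typedef struct { const char *uri; long last_seen; size_t n_events; } UriItem;

/* Fold events into at most max_items items, one per distinct URI.
 * Returns the number of items written to `items`. */
size_t collapse_events(const UriEvent *events, size_t n_events,
                       UriItem *items, size_t max_items)
{
    size_t n_items = 0, i, j;

    assert(events != NULL || n_events == 0);
    for (i = 0; i < n_events; i++) {
        /* Look for an existing item with the same URI. */
        for (j = 0; j < n_items; j++) {
            if (strcmp(items[j].uri, events[i].uri) == 0)
                break;
        }
        if (j < n_items) {
            items[j].n_events++;
            if (events[i].timestamp > items[j].last_seen)
                items[j].last_seen = events[i].timestamp;
        } else if (n_items < max_items) {
            items[n_items].uri = events[i].uri;
            items[n_items].last_seen = events[i].timestamp;
            items[n_items].n_events = 1;
            n_items++;
        }
    }
    return n_items;
}
```

Two accesses of foo.txt and one of bar.txt are three events but only two items.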

We can probably use a GtkRecentData in the API above, rather than the URI, to allow specifying the app name and MIME-type. We may need to deprecate some fields in that structure (groups? app_exec?).

Retrieval

Queries in the current API don't let you filter upfront for the URIs you are interested in. Currently, to find all files opened with gedit, one needs to:

  • use get_items(), which returns a list of all GtkRecent items

  • filter the items one by one to keep only the gedit ones

This needs to be done differently. With a Zeitgeist backend, get_items will return a much bigger list than what we are used to from GtkRecentManager. We could limit it to "n" unique items, but that would not be fair: playing "n" items with Rhythmbox would result in the current API not returning any items for gedit. For that reason I suggest adding a new API.

GList* get_latest_items_for_app(GtkRecentManager *manager, gchar* app, gint limit)  // returns at most limit items for app, most recently used first; app should be the name of the .desktop file

GList* get_latest_items_for_mime_types(GtkRecentManager *manager, GList* mime_types, gint limit)  // returns at most limit items matching mime_types, most recently used first

This API would let Zeitgeist do the filtering for us. It should respond more quickly than iterating through a list and filtering it in the client, since everything is done directly in the Zeitgeist DB.
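The semantics intended for get_latest_items_for_app can be sketched over an in-memory event list in plain C. This only illustrates why the limit has to be applied after filtering by app; with Zeitgeist the work would happen in its database, and the names below are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *uri; const char *app; } AppEvent;

/* Collect up to `limit` distinct URIs used by `app`, most recent first.
 * `events` is assumed to be ordered oldest first. Returns the count. */
size_t latest_uris_for_app(const AppEvent *events, size_t n_events,
                           const char *app, size_t limit, const char **out)
{
    size_t n_out = 0, i, j;

    assert(app != NULL && out != NULL);
    for (i = n_events; i > 0 && n_out < limit; i--) {
        const AppEvent *ev = &events[i - 1];
        int seen = 0;

        if (strcmp(ev->app, app) != 0)
            continue; /* another app's event: does not use up the limit */
        for (j = 0; j < n_out; j++) {
            if (strcmp(out[j], ev->uri) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            out[n_out++] = ev->uri;
    }
    return n_out;
}
```

With one gedit event followed by three Rhythmbox events, asking for the latest 2 items for gedit.desktop still returns the gedit file, whereas taking the latest 2 items overall and filtering afterwards would return nothing for gedit.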

Comments

MatthiasClasen:

void gtk_recent_manager_add_create_event (manager, uri, timestamp);
void gtk_recent_manager_add_access_event (manager, uri, timestamp);
void gtk_recent_manager_add_leave_event (manager, uri, timestamp);
void gtk_recent_manager_add_modify_event (manager, uri, timestamp);

This naming is just awkward: 'add_create', 'add_access', 'add_leave' - confusing to put verbs next to each other like that. And I don't think we really want to introduce 'events' in this api at all. Events are something quite different in GTK+.

I also need to see self-contained documentation describing what these do, just referring to some Zeitgeist docs does not suffice.

Finally, I'd caution against an API that collects data without a way to remove the collected data.

Proposal from ebassi:

typedef enum {
  CREATE, /* Document -> New */
  OPEN, /* Document -> Open */
  SAVE, /* Document -> Save, Save as? */
  CLOSE /* Document -> Close */
} GtkRecentActivityType;

void gtk_recent_manager_add_activity (GtkRecentManager *, GtkRecentActivityType, GFile *, const char *display_name_or_null, GCancellable *, GError **);

int gtk_recent_activity_iter_init (GtkRecentActivityIter *, GtkRecentManager *, GList *mimetypes); /* for stuff from this app */
int gtk_recent_activity_iter_init_for_mimetypes (GtkRecentActivityIter *, GtkRecentManager *, GList *mimetypes); /* for stuff from anyone matching the mimetypes */
-- we don't use mime types: we use GContentType.
     don't use GList, it's an awful data structure
     you can use: GContentType **types, int *n_types
     though I'd wager that people will use "text/*", "image/*", and something similar

why do you need the mime types there as well? just let the people check if it matches a content type during the iteration; it's not like it's going to be more efficient if it's done by the server or by the client app, and the client app encodes much more knowledge than the server can possibly do.

gboolean gtk_recent_activity_iter_next (GtkRecentActivityIter*, GFile*, GtkRecentInfo*);

From Benjamin Otte:

<Company> i'd try to model the API after GFileEnumerator
<Company> GFileEnumerator *gtk_recent_manager_list (...)
<ebassi> Company: you can create a private RecentFileEnumerator
<ebassi> Company: and map RecentInfo to a FileInfo with custom attributes
<ebassi> actually, that would be way, *way* better
<Company> g_file_info_get (info, "recent::whatever", &data);

i'm fine with gtk_recent_manager_enumerate_filtered (GtkFileFilter *filter);

FedericoMenaQuintero:

I like the idea of using an enum for the event type ("interpretation" in Zeitgeist's parlance) instead of having a different function for each one. Maybe even macros with strings for extra extensibility.

Aren't we discussing the query API too much? This was supposed to be an easy API to put things into Zeitgeist. The query API is pretty much only for GTK+'s internals ("gimme the recent files for this File menu"); is there something that uses GtkRecentManager in a more exotic way? I.e. apps that need sophisticated queries are better off using the Zeitgeist API directly.

However, I do like the idea of a GFileEnumerator - probably makes things more consistent. Zeitgeist queries reply asynchronously but contain the whole reply in a single chunk; you don't get multiple callbacks - not sure how well the enumeration API would work with that scheme.

Matthias: I think all the stuff I wrote above explains the interpretation types; please tell me if it's not clear enough. They are about what happened to a certain file or URI.

MatthiasClasen:

Some more comments, after a long time:

From my perspective, GtkRecentManager is more or less a dead-end API in GTK+. It is designed for Win95-era File menus with a list of recent files; and these menus are not really something you expect to see in modern apps. So, I question somewhat the value of investing tons of effort into improving the recent manager implementation at this point. That being said, there's no harm in making GtkRecentManager write proper timestamps etc. But a zeitgeist dependency in GTK+ is not going to be acceptable. And a backend abstraction with loadable modules is even more effort/overhead.

Isn't DELETE missing from the activity types ? Or are deletions not logged ?

With respect to the lack of documentation for the proposed APIs, things that need to be documented include:

  • are app authors expected to call this themselves, or does gtk call it behind the scenes ? If so, where and when ?
  • are there consistency rules that have to be followed ? e.g can't close without a prior open
  • what files are you expected to call this api for ? (I guess ~/.myconfig doesn't qualify...)

SeifLotfy:

I understand: no Zeitgeist in GTK+; there is no place for it there. I also agree about the overhead of loadable modules. However, if the new GTK+ API follows the scheme discussed above, it is easy for Zeitgeist to passively log any events inserted into GtkRecentManager. Deleting a file should be logged too. But moving a file around would kind of have to trigger it too; I suggest not logging that.

Authors would have to call the methods themselves from within their application when a file is opened/modified/closed/created. You can't really enforce rules, because activities are sequential: at some point the file might be opened again, which would give it an open timestamp greater than the close timestamp, and the other way around too. The only timestamp that needs to be smaller than all others is the "CREATED" timestamp.

As for the files this applies to, the rule is "if you touched it, then it's logged". The UI can later decide whether it wants to display hidden files or not.
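The one ordering rule mentioned above (CREATED comes before everything else) could be checked like this. A sketch in plain C with made-up types; treating equal timestamps as acceptable is an assumption.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ACT_CREATED, ACT_OPENED, ACT_CLOSED, ACT_MODIFIED } ActKind;
typedef struct { ActKind kind; long timestamp; } Act;

/* Returns 1 if no other activity on the file is stamped earlier than
 * a CREATED activity, 0 otherwise. Open/close/modify may interleave
 * in any order, so nothing else is checked. */
int created_is_earliest(const Act *acts, size_t n_acts)
{
    size_t i, j;

    assert(acts != NULL || n_acts == 0);
    for (i = 0; i < n_acts; i++) {
        if (acts[i].kind != ACT_CREATED)
            continue;
        for (j = 0; j < n_acts; j++) {
            if (acts[j].kind != ACT_CREATED &&
                acts[j].timestamp < acts[i].timestamp)
                return 0;
        }
    }
    return 1;
}
```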

Attic/GtkRecentManagerAndZeitgeist (last edited 2018-12-05 18:17:07 by EmmanueleBassi)