


Log API Revamp (OBSOLETE)

Note: Zeitgeist 0.3.0 had a major API overhaul, obsoleting this blueprint. The concept of private logs is still not addressed, though.

Launchpad blueprint: Not registered

This document proposes a new DBus API for the Zeitgeist engine which aims to accomplish a few things that the API used hitherto does not allow for:

Important Concepts

Why Private Logs

There are a few reasons for allowing dynamic creation and deletion of new Log objects

Warning on JOINs

Splitting the Repository from the Log will change the way the Zeitgeist API is used. Queries like:

Give me the most recent documents tagged "foo"

are not possible directly against the Zeitgeist engine. One would first need to resolve the URI of the Tag with label "foo" against the Repository, and then ask Zeitgeist:

Give me the most recent docs with tag tag://mjfd673bfw82bf3
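
The two-step lookup described above can be sketched in Python. This is only an illustration: the Repository interface is not specified in this document, so InMemoryRepository, InMemoryLog, resolve_tag and most_recent_with_tag are hypothetical stand-ins, not part of any Zeitgeist API.

```python
# Hypothetical stand-ins for the Repository and Log services; none of these
# names come from the proposal itself.

class InMemoryRepository:
    """Maps tag labels to tag URIs (assumed role of the Repository)."""

    def __init__(self, tags):
        self._tags = dict(tags)  # label -> tag URI

    def resolve_tag(self, label):
        return self._tags[label]


class InMemoryLog:
    """Holds (timestamp, document URI, set of tag URIs) records."""

    def __init__(self, events):
        self._events = list(events)

    def most_recent_with_tag(self, tag_uri, limit):
        # Newest first, keeping only documents that carry the tag URI.
        newest_first = sorted(self._events, key=lambda e: e[0], reverse=True)
        docs = [doc for _, doc, tags in newest_first if tag_uri in tags]
        return docs[:limit]


def most_recent_tagged(repo, log, label, limit=10):
    # Step 1: resolve the label against the Repository.
    tag_uri = repo.resolve_tag(label)
    # Step 2: ask the Log using the resolved URI only.
    return log.most_recent_with_tag(tag_uri, limit)
```

The point of the sketch is that the Log never sees the label "foo", only the resolved tag URI; the client performs the join.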

DBus Addresses

The Zeitgeist LogManager object runs as:

      Address : org.gnome.zeitgeist.LogManager
  Object path : /org/gnome/zeitgeist/log
    Interface : None

Logs controlled by the log manager run as:

      Address : org.gnome.zeitgeist.LogManager
  Object path : /org/gnome/zeitgeist/log/[a-z_]+   (the 'activity' log is there by default)
    Interface : org.gnome.zeitgeist.Log

The Zeitgeist Repository object runs as:

      Address : org.gnome.zeitgeist.Repository
  Object path : /org/gnome/zeitgeist/repository
    Interface : org.gnome.zeitgeist.Repository

DBus Protocol

DBus Wire Representations of Events

We use the signature T as shorthand for event templates:

T   =   (asaas)

We use the signature E as shorthand for full event metadata:

E   =   (asaasay)

The first array of strings (as) contains the event data fields enumerated below. The following array of arrays of strings (aas) contains one array of metadata per subject; consult the table below for the details. The trailing array of bytes (ay), which is not present in the event template, is the event payload. The event payload is a free-form binary blob completely controlled by the logging application (i.e. the client).

Event data member offsets:

seqnum              = 0
timestamp           = 1
interpretation      = 2
manifestation       = 3
app                 = 4
origin              = 5

Subject metadata member offsets:

subj_uri            = 0            
subj_interpretation = 1
subj_manifestation  = 2
subj_mimetype       = 3
subj_origin         = 4
subj_text           = 5
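
As a rough illustration of the two signatures, the sketch below builds T and E values as plain Python tuples, using the offsets listed above. The helper names make_template and make_event are assumptions made for this example, and the sample field values in the usage note are invented.

```python
# Offsets copied from the tables above.
SEQNUM, TIMESTAMP, INTERPRETATION, MANIFESTATION, APP, ORIGIN = range(6)
(SUBJ_URI, SUBJ_INTERPRETATION, SUBJ_MANIFESTATION,
 SUBJ_MIMETYPE, SUBJ_ORIGIN, SUBJ_TEXT) = range(6)


def _check_strings(values, expected_len, what):
    # Both "as" and "aas" carry strings only, so reject anything else.
    if len(values) != expected_len:
        raise ValueError(f"{what} needs {expected_len} fields, got {len(values)}")
    if not all(isinstance(v, str) for v in values):
        raise ValueError(f"{what} fields must all be strings")
    return list(values)


def make_template(event_data, subjects):
    """Return a T = (asaas) value: (event data, list of subject arrays)."""
    data = _check_strings(event_data, 6, "event data")
    subjs = [_check_strings(s, 6, "subject metadata") for s in subjects]
    return (data, subjs)


def make_event(event_data, subjects, payload=b""):
    """Return an E = (asaasay) value: a template plus the payload bytes."""
    data, subjs = make_template(event_data, subjects)
    return (data, subjs, bytes(payload))
```

For instance, make_event(["", "1254000000000", "", "", "", ""], [["file:///tmp/a.txt", "", "", "text/plain", "", "a.txt"]], b"\x00") yields a three-element tuple whose [1][0][SUBJ_MIMETYPE] entry is "text/plain".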

org.gnome.zeitgeist.LogManager

No interfaces besides the standard DBus Introspectable interface. DBus introspection is used to enumerate all Log objects. Log objects live under the DBus Object path

/org/gnome/zeitgeist/log/[a-z_]+

E.g. the default activity log is at

/org/gnome/zeitgeist/log/activity

Sending DBus messages to other paths matching the pattern above will create them dynamically. I.e. to create a new log called mylog, simply start sending messages to the org.gnome.zeitgeist.Log interface on:

/org/gnome/zeitgeist/log/mylog
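
A client might want to check a log name against the [a-z_]+ pattern before sending anything. The following is a minimal sketch of that check; log_object_path is a hypothetical helper, not part of the proposed API, and it does not talk to DBus.

```python
import re

LOG_ROOT = "/org/gnome/zeitgeist/log"
_LOG_NAME = re.compile(r"[a-z_]+")  # pattern taken from the object path above


def log_object_path(name):
    """Return the object path for the log called `name`, or raise ValueError."""
    if not _LOG_NAME.fullmatch(name):
        raise ValueError(f"invalid log name: {name!r}")
    return f"{LOG_ROOT}/{name}"
```

With this helper, log_object_path("mylog") gives "/org/gnome/zeitgeist/log/mylog", while a name such as "MyLog" is rejected because it contains upper-case letters.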

org.gnome.zeitgeist.Log

Represents a Log for some specific collection of data. Most events should go to the Log object at /org/gnome/zeitgeist/log/activity unless the client specifically needs private logging.

org.gnome.zeitgeist.Repository

This is a very simple interface for accessing general item metadata. It is designed to be simple enough to be implemented on top of most modern desktop repository systems such as Nepomuk/Soprano, Tracker, CouchDB, etc. Zeitgeist ships with a simple default implementation backed by an SQLite database.

FIXME

Zeitgeist Internals and Implementation Details

This section contains a rough draft of how the above API could be implemented.

Rename Content and Source

The concepts of "Content" and "Source" types have evidently confused a lot of people. In this proposal they have been renamed to "Interpretation" and "Manifestation" respectively. So:

Log Database

There are a number of value tables that store (integer, string) pairs, with the integer part being the primary key. These are used to remove duplicate strings from the DB and shrink the DB file size to a minimum. The core event table links to these value tables via the integer primary key.

Value Tables

There are 6 value tables:

uri
interpretation
manifestation
mimetype
actor
text

Each value table is constructed like the uri table below:

CREATE TABLE IF NOT EXISTS uri (id INTEGER PRIMARY KEY, value VARCHAR UNIQUE);

CREATE UNIQUE INDEX IF NOT EXISTS uri_value ON uri(value);
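
The de-duplication idea can be sketched with Python's sqlite3 module: a string is inserted once and every later lookup returns the same integer id. The function names create_value_tables and intern_value are assumptions made for this example; only the table layout comes from the statements above.

```python
import sqlite3

VALUE_TABLES = ("uri", "interpretation", "manifestation",
                "mimetype", "actor", "text")


def create_value_tables(conn):
    # Table names cannot be bound as SQL parameters, so they are taken
    # from the fixed tuple above rather than from user input.
    for name in VALUE_TABLES:
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {name} "
            "(id INTEGER PRIMARY KEY, value VARCHAR UNIQUE)")
        conn.execute(
            f"CREATE UNIQUE INDEX IF NOT EXISTS {name}_value ON {name}(value)")


def intern_value(conn, table, value):
    """Return the id for `value` in `table`, inserting it if it is new."""
    if table not in VALUE_TABLES:
        raise ValueError(f"unknown value table: {table!r}")
    # The UNIQUE constraint makes the insert a no-op for known strings.
    conn.execute(f"INSERT OR IGNORE INTO {table} (value) VALUES (?)", (value,))
    row = conn.execute(
        f"SELECT id FROM {table} WHERE value = ?", (value,)).fetchone()
    return row[0]
```

Calling intern_value twice with the same string returns the same id and leaves a single row in the table, which is what lets the event table store integers instead of repeated strings.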

Payload Table

Any event can have a payload assigned, which is simply a binary blob that can contain any kind of data. The payloads are stored in a table similar to a value table, but with a BLOB instead of a VARCHAR:

CREATE TABLE IF NOT EXISTS payload (id INTEGER PRIMARY KEY, value BLOB);
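
Storing and reading back a payload could look roughly like the following sqlite3 sketch. The helpers store_payload and load_payload are hypothetical names; unlike the value tables, nothing here de-duplicates blobs, since the table has no UNIQUE constraint.

```python
import sqlite3


def create_payload_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payload (id INTEGER PRIMARY KEY, value BLOB)")


def store_payload(conn, blob):
    """Insert the raw bytes and return the new payload id."""
    cur = conn.execute("INSERT INTO payload (value) VALUES (?)", (bytes(blob),))
    return cur.lastrowid


def load_payload(conn, payload_id):
    """Return the stored bytes, or None if the id is unknown."""
    row = conn.execute(
        "SELECT value FROM payload WHERE id = ?", (payload_id,)).fetchone()
    return None if row is None else bytes(row[0])
```

The id returned by store_payload is the value that would go into the payload column of the event table.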

Storage/Availability Table

The storage table contains an entry for each storage medium or resource the user has data on. Storage media can be UUIDs of hard drives or USB sticks, or well-known names such as "online" for data that requires online access. If the user is not online, then the row with the "online" storage medium is marked as not available. The idea with this table is for the log to be able to return only events about items that are currently available.

CREATE TABLE IF NOT EXISTS storage
                        (id INTEGER PRIMARY KEY,
                         value VARCHAR UNIQUE,
                         state INTEGER);

CREATE UNIQUE INDEX IF NOT EXISTS storage_value ON storage(value);

Storage states:

  0 : Not available
  1 : Available
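
A minimal sqlite3 sketch of how the availability flag might be maintained and queried is given below. The function names are assumptions for this example, and how the engine would actually learn that a medium came or went is outside what this document describes.

```python
import sqlite3

NOT_AVAILABLE, AVAILABLE = 0, 1  # storage states listed above


def create_storage_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS storage "
        "(id INTEGER PRIMARY KEY, value VARCHAR UNIQUE, state INTEGER)")
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS storage_value ON storage(value)")


def set_storage_state(conn, medium, state):
    """Record `medium` if it is new, then set its state."""
    conn.execute(
        "INSERT OR IGNORE INTO storage (value, state) VALUES (?, ?)",
        (medium, state))
    conn.execute(
        "UPDATE storage SET state = ? WHERE value = ?", (state, medium))


def available_media(conn):
    """Return the names of all media currently marked as available."""
    rows = conn.execute(
        "SELECT value FROM storage WHERE state = ? ORDER BY value",
        (AVAILABLE,))
    return [r[0] for r in rows]
```

Going offline would then be a single call, set_storage_state(conn, "online", NOT_AVAILABLE), after which "online" no longer appears in available_media(conn).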

Event Table

The event table does not contain any data itself, only relational ids pointing to values in the value tables.

Notice that event.id is not the primary key. This is because we can have several subjects per event, and each subject gets its own row.

CREATE TABLE IF NOT EXISTS event
                    (id INTEGER,                  -- uri.id
                     timestamp INTEGER,           -- timestamp in system millis
                     interpretation INTEGER,      -- interpretation.id
                     manifestation INTEGER,       -- manifestation.id
                     actor INTEGER,               -- actor.id
                     origin INTEGER,              -- uri.id
                     payload INTEGER,             -- payload.id
                     subj_id INTEGER,             -- uri.id
                     subj_interpretation INTEGER, -- interpretation.id
                     subj_manifestation INTEGER,  -- manifestation.id
                     subj_mimetype INTEGER,       -- mimetype.id
                     subj_origin INTEGER,         -- uri.id
                     subj_text INTEGER,           -- text.id
                     subj_storage INTEGER         -- storage.id
                     );
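
To illustrate why event.id cannot be the primary key, the sketch below stores one event with two subjects as two rows sharing the same id, then joins against the uri table to list the subjects again. It fills only a few columns, and insert_event and subjects_of are hypothetical helpers written for this example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS uri (id INTEGER PRIMARY KEY, value VARCHAR UNIQUE);
CREATE TABLE IF NOT EXISTS event (
    id INTEGER, timestamp INTEGER, interpretation INTEGER,
    manifestation INTEGER, actor INTEGER, origin INTEGER, payload INTEGER,
    subj_id INTEGER, subj_interpretation INTEGER, subj_manifestation INTEGER,
    subj_mimetype INTEGER, subj_origin INTEGER, subj_text INTEGER,
    subj_storage INTEGER
);
"""


def uri_id(conn, value):
    # Look up (or create) the integer id for a URI string.
    conn.execute("INSERT OR IGNORE INTO uri (value) VALUES (?)", (value,))
    return conn.execute(
        "SELECT id FROM uri WHERE value = ?", (value,)).fetchone()[0]


def insert_event(conn, event_id, timestamp, subject_uris):
    # One row per subject; all rows of one event share the same event id.
    for uri in subject_uris:
        conn.execute(
            "INSERT INTO event (id, timestamp, subj_id) VALUES (?, ?, ?)",
            (event_id, timestamp, uri_id(conn, uri)))


def subjects_of(conn, event_id):
    rows = conn.execute(
        "SELECT uri.value FROM event JOIN uri ON uri.id = event.subj_id "
        "WHERE event.id = ? ORDER BY uri.value", (event_id,))
    return [r[0] for r in rows]
```

After conn.executescript(SCHEMA) and insert_event(conn, 1, 1000, ["file:///a", "file:///b"]), the event table holds two rows with id 1, and subjects_of(conn, 1) returns both URIs.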

2024-10-23 11:37