Usage of URIs in Tracker

Tracker currently uses a mix of local file paths (for files) and URIs (for e-mails). The path/URI is split in two parts which are stored in the columns Path and Name in the Services table. The full URI is formed by concatenating Path and Name separated by a slash.

To store the identifier for files and e-mails in a uniform way, we should also store local file paths as URIs in the database. There has been an IRC discussion about how exactly we should store the URI in the database, so far without a conclusion. This wiki page describes the two proposals.

Store full URIs in the database

The idea of this proposal is to replace the Path column in the Services table with a Uri column and store the full URI in that column. The Name column will stay but will no longer be used to identify services.

The advantage of this approach is that it supports arbitrary URI schemes and simplifies the code in various places.

Test code is available in the uri branch in SVN.

Store URI split in two components in the database

The idea of this proposal is to store the URI of the parent in the Path column and the basename in the Name column.

The advantage of this approach is that it allows fast access to all children of a directory. However, this can be implemented equally fast with the other proposal at the cost of an additional integer index (storing the ID of the parent).
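The parent-ID alternative mentioned above can be sketched as follows. This is only an illustration, not Tracker's actual schema: the Services table and Uri column come from the first proposal, while the ParentID column, the index name, and the helper functions are hypothetical names chosen for this example.

```python
import sqlite3

# In-memory database with a simplified, hypothetical Services table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Services (
        ID       INTEGER PRIMARY KEY,
        Uri      TEXT UNIQUE NOT NULL,
        ParentID INTEGER REFERENCES Services(ID)  -- assumed extra column
    );
    CREATE INDEX ServicesParentIndex ON Services (ParentID);
""")

def add_service(uri, parent_id=None):
    """Insert a service with its full URI and return its row ID."""
    cur = conn.execute(
        "INSERT INTO Services (Uri, ParentID) VALUES (?, ?)",
        (uri, parent_id),
    )
    return cur.lastrowid

def children_of(parent_id):
    """Return the URIs of all direct children, found via the integer index."""
    rows = conn.execute(
        "SELECT Uri FROM Services WHERE ParentID = ? ORDER BY Uri",
        (parent_id,),
    )
    return [row[0] for row in rows]

docs = add_service("file:///home/user/Documents")
add_service("file:///home/user/Documents/a.txt", docs)
add_service("file:///home/user/Documents/b.txt", docs)
add_service("news:comp.lang.python")  # no parent, no slash needed

children = children_of(docs)
```

Because the lookup goes through an integer column, it does not depend on the URI containing slashes at all.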

The issue with this approach is that it only works with URIs that contain slashes, which is not the case for all URI schemes. We could work around this limitation by allowing the Name column to be NULL and storing the full URI in the Path column if there is no slash in the URI. However, we would have to add code handling this special case in various places when splitting and combining URIs, which increases code complexity unnecessarily.
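A minimal sketch of the workaround described above, assuming the split happens at the last slash. The function names split_uri and combine_uri are hypothetical and do not refer to existing Tracker code; None stands in for a NULL Name column.

```python
def split_uri(uri):
    """Split a URI into (path, name) at the last slash.

    If the URI contains no slash, the full URI goes into path and
    name is None (the NULL special case).
    """
    head, sep, tail = uri.rpartition("/")
    if not sep:
        return uri, None
    return head, tail

def combine_uri(path, name):
    """Rebuild the full URI from the two stored components."""
    if name is None:
        # Special case: the Path column already holds the full URI.
        return path
    return path + "/" + name

file_parts = split_uri("file:///home/user/notes.txt")
news_parts = split_uri("news:comp.lang.python")
```

Every caller that splits or combines URIs has to go through helpers like these (or repeat the None check), which is the additional complexity the paragraph above refers to.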

URI schemes

If Tracker wants to use a more expressive RDF-based ontology such as NEPOMUK at some point, we cannot restrict Tracker to file-compatible URI schemes.

For example, it might make sense to use the mid URI scheme (RFC 2111) to identify e-mails:

URIs of newsgroups do not contain slashes:

URIs of IMAP mailboxes contain slashes, but the split leads to unexpected results: 'imap://user@server' is stored as Path 'imap:/' and Name 'user@server'.
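The effect of a plain last-slash split on different schemes can be checked with a short sketch. The function naive_split is a hypothetical stand-in for the Path/Name split, and the mid and news URIs are made-up illustrative values, not taken from the original discussion; only the imap URI appears in the text above.

```python
def naive_split(uri):
    """Split at the last slash with no special handling for missing slashes."""
    path, _sep, name = uri.rpartition("/")
    return path, name

examples = [
    "file:///home/user/notes.txt",  # file URI: splits as intended
    "imap://user@server",           # slash belongs to the authority part
    "mid:1234@example.org",         # illustrative mid URI, no slash
    "news:comp.lang.python",        # illustrative newsgroup URI, no slash
]

results = {uri: naive_split(uri) for uri in examples}

for uri, (path, name) in results.items():
    print(f"{uri!r}: Path={path!r} Name={name!r}")
```

For the URIs without a slash, Path comes out empty and Name holds the whole URI, so joining the two parts with a slash would no longer reproduce the original URI.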

Projects/Tracker/URIs (last edited 2013-11-25 12:54:24 by WilliamJonMcCann)