Tracker and Zeitgeist

<!> NOTE: This document does not indicate that I necessarily want to run Zeitgeist within Tracker or somehow use Tracker as backend. Only that I am considering the possibility. It may or may not be applicable depending on a lot of stuff...

This page is an informal writeup of what I (where I == kamstrup) see Zeitgeist will need from Tracker, or indeed any storage technology we may want to adopt as our primary log database. At the very least I want clear answers on all of these points before we can start real work.

I am writing this with a "no regressions hat" on - which means that we could in theory relax a little on some of the points, but OTOH I also believe that our current API is quite good. I want to keep it. We developed it through many iterations and it withstood a week of intense debates at the hackfest in Bolzano.

Technical Requirements

Template Based Queries

We must be able to compile our list of event templates into a fielded boolean query.

  • (!) Note: We will also want query expansion, prefix queries, and negation queries in the near future.
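As a rough illustration of what "compiling templates into a fielded boolean query" could mean, here is a minimal Python sketch. It assumes, purely for the example, that fields set within one template are ANDed, that templates are ORed together, and that the target is a parameterised SQL WHERE clause. The `EventTemplate` class, its field names and `compile_templates` are hypothetical and not taken from the Zeitgeist API.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EventTemplate:
    # Hypothetical fields; None means "match anything" for that field.
    interpretation: Optional[str] = None
    actor: Optional[str] = None
    subject_uri: Optional[str] = None


def compile_templates(templates):
    """Return (where_clause, params): AND within a template, OR across templates."""
    clauses, params = [], []
    for template in templates:
        terms = []
        for f in fields(template):
            value = getattr(template, f.name)
            if value is not None:
                terms.append(f"{f.name} = ?")
                params.append(value)
        # Assumption: a template with no fields set matches every event.
        clauses.append("(" + " AND ".join(terms) + ")" if terms else "(1)")
    # Assumption: an empty template list also matches every event.
    return (" OR ".join(clauses) or "1"), params


where, params = compile_templates([
    EventTemplate(actor="application://gedit.desktop"),
    EventTemplate(interpretation="AccessEvent", subject_uri="file:///tmp/notes.txt"),
])
```

Negation and prefix queries would need more than plain equality terms, which is one reason the question of what the backend's query language can express matters.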

Notifications

We must be able to deliver low-latency (not necessarily instant) notifications to monitoring clients.
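One way to read "low-latency but not necessarily instant" is batched delivery: inserted events are queued and pushed to matching monitors on a short timer. The sketch below only illustrates that reading. `MonitorHub` and its method names are hypothetical, and each monitor is reduced to a plain predicate plus callback.

```python
class MonitorHub:
    """Queues inserted events and delivers them to monitors in batches."""

    def __init__(self):
        self._monitors = []  # list of (predicate, callback) pairs
        self._pending = []   # events inserted since the last flush

    def install_monitor(self, predicate, callback):
        self._monitors.append((predicate, callback))

    def event_inserted(self, event):
        # Nothing is delivered here; the event only joins the next batch.
        self._pending.append(event)

    def flush(self):
        """Deliver pending events; meant to be called from a short timer."""
        batch, self._pending = self._pending, []
        for predicate, callback in self._monitors:
            matching = [e for e in batch if predicate(e)]
            if matching:
                callback(matching)
```

Whatever storage backend is used, it has to let us hook into inserts cheaply enough for something like `event_inserted` to be called for every event.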

Blacklisting

We need a runtime-configurable blacklisting system. Our blacklist consists of a set of queries that inserted events must not match.
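A minimal sketch of the idea, assuming events and blacklist entries are both plain dictionaries and that an entry matches when every field it sets equals the event's field. `Blacklist`, `matches` and the field names are hypothetical.

```python
def matches(template, event):
    """A template matches when every field it sets equals the event's field."""
    return all(event.get(key) == value for key, value in template.items())


class Blacklist:
    """A set of templates that can be changed while the daemon is running."""

    def __init__(self):
        self._templates = []

    def add(self, template):
        self._templates.append(template)

    def remove(self, template):
        self._templates.remove(template)

    def allows(self, event):
        # An event may be inserted only if no blacklist template matches it.
        return not any(matches(t, event) for t in self._templates)
```

The check runs on the insert path, so the backend must either let us run it in-process before the write or support an equivalent filter itself.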

Binary Event Payloads

It must be possible to attach binary blobs to events. This may be a problem for text-only APIs, e.g. SPARQL - we are not going to require manual base64 encoding!
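For comparison, this is what storing a raw payload looks like with an API that has a native binary type. The sketch uses Python's standard `sqlite3` module purely as an illustration; the table layout and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one BLOB column holds the raw event payload.
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, actor TEXT, payload BLOB)")

payload = bytes([0x00, 0xFF, 0x10, 0x80])  # arbitrary non-text bytes
conn.execute(
    "INSERT INTO event (actor, payload) VALUES (?, ?)",
    ("application://example.desktop", payload),
)

# The bytes come back unchanged, with no base64 round trip in client code.
(stored,) = conn.execute("SELECT payload FROM event").fetchone()
```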

Small Disk Size Footprint

Our current DB is very optimized storage-wise: roughly 150 MB for 500,000 events, probably a little less since these numbers were extrapolated from a worst-case scenario.

Small Memory Footprint

We run in under 10 MB of RAM and barely spike above that - even under peak stress.

No IPC to the DB

We must have direct in-process access to the DB, so we can do funky heuristic algorithms with several queries without suffering the IPC overhead.

Advanced Sorting Capabilities

We can not only sort by recency, we can also sort by frequency of subject (aka MostPopularSubject). We can also group on subject and sort on the most recent subjects.

  • (!) Note: In the future we might also want to sort by "most recent access or modification time" or something like that.
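The two non-trivial orderings above can be expressed as grouped queries. The sketch below uses Python's standard `sqlite3` module and a made-up two-column `event` table only to show the shape of the queries; it is not the actual Zeitgeist schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema: one row per event.
conn.execute("CREATE TABLE event (timestamp INTEGER, subject_uri TEXT)")
conn.executemany("INSERT INTO event VALUES (?, ?)", [
    (1, "file:///a"), (2, "file:///b"), (3, "file:///a"),
    (4, "file:///c"), (5, "file:///a"), (6, "file:///b"),
])

# Most popular subjects: group on subject, order by number of events.
most_popular = [row[0] for row in conn.execute(
    "SELECT subject_uri FROM event GROUP BY subject_uri "
    "ORDER BY COUNT(*) DESC, MAX(timestamp) DESC")]

# Most recent subjects: group on subject, order by each subject's latest event.
most_recent = [row[0] for row in conn.execute(
    "SELECT subject_uri FROM event GROUP BY subject_uri "
    "ORDER BY MAX(timestamp) DESC")]
```

Any candidate backend needs an equivalent of grouping plus ordering on an aggregate, otherwise these result types would have to be computed client-side.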

Meta/Political Requirements

Autonomous Development

We must be able to hack as we see fit. We cannot develop at the mercy of 3rd parties. Considering Tracker, this probably means that we'd require an out-of-tree plugin.

KDE has Shown Some Interest in ZG...

I am not sure the K-camp would like a hard Tracker dep. very much. Maybe we could have some static libzeitgeistcore that gets linked into both a Tracker plugin and a Soprano plugin...

C

We must port everything to C. Personally I am fine with this, but it's going to take a shitload of development resources and the gains for the C port alone will probably not be big (perf. and memory wise at least).

Projects/Zeitgeist/Blueprint/TrackerAndZeitgeist (last edited 2013-12-03 14:54:41 by WilliamJonMcCann)