Log-database Schema Changes (SUPERSEDED)
This blueprint has been implemented in manner resembling solutions 2. and .3. Consider this blueprint obsolete
- Log-database Schema Changes (SUPERSEDED)
Launchpad blueprint: database-schema-changes
During the 0.2 cycles we have identified a collection of changes we want to do to the database schema. This blueprint suggests three solution models of which one should be picked.
Solution 1: Small Changes
Documentation on the database design used in the 0.2 can be found in Zeitgeist/Document/DatabaseDesign0.2
In the 0.2 series we have the mimetype of an item as a VARCHAR column in the db. We should the item.mimetype column to be an INT and introduce a new table called mimetype looking exactly like the uri, content, and source tables. Ie. create the table like:
CREATE TABLE IF NOT EXISTS mimetype (id INTEGER PRIMARY KEY, value VARCHAR UNIQUE) CREATE UNIQUE INDEX IF NOT EXISTS mimetype_value ON mimetype(value)
The event.end column is not being used and should be removed
Just like the item.mimetype was split out in its own table, item.origin should simply point into the uri table (so no need for a new table).
Multiple Subjects Per Event
Right now the event table uses item_id as primary key, meaning that we can only have one subject per event. This might be a very bad idea considering that the user can very well delete 1000 selected files with 1 mouse click, or apply the tag "puppies" to the same 1000 files with one click from Nautilus. If we use a compound PK like the annotation-table already does then we could have multiple subjects per event. Specifically the event table should be created as
CREATE TABLE IF NOT EXISTS event (item_id INTEGER , subject_id INTEGER, start INTEGER, end INTEGER, app_id INTEGER, PRIMARY KEY (item_id, subject_id))
(we use almost the same now, except that item_id is declared as PK)
Questionable: Origin moved to uri Table?
Siegfried has an idea - I don't completely get it https://bugs.launchpad.net/zeitgeist/+bug/425258
Questionable: Remove item.origin?
Is item.origin being used, and is it useful?
Questionable: Remove the app table all together?
When we reference applications we use the app table. It currently has two columns id and info. We not just make applications normal items and set item.content = Content.APPLICATION? One less table can save us a SQL JOIN.
Solution 2: Separate Tables
This model is more invasive than solution 1, but addresses the same issues and then some. The idea is to apply solution 1 and then also not make the event and annotation tables "inherit data" from the item table. Specifically the item table will be exactly as from solution 1 - the event and annotation tables will look exactly like the item table but adding an extra column subject_id:
id INT # uri.id, PRIMARY KEY subject_id INT # uri.id, the subject of the event or annotation content_id INT # content.id source_id INT # source.id origin INT # uri.id, url the item can be said to originate from text VARCHAR # the title or the name of the item (first thing the users sees) mimetype INT # mimetype.id icon VARCHAR # payload BLOB # Free-for-use array of raw bytes
Solution 3: Split Databases
This model is more invasive than solutions 1 and 2, and addresses the same issues. In order to better fit with having items stored in an external storage (such as Tracker or DesktopCouch) we should split items, events, and annotations not just into separate tables but different SQLite database files on disk. This probably means that we can no longer do JOINs between these bases - perhaps, haven't checked the SQLite docs yet.
Perhaps keep the items+annotations in one DB and the events in another...