Moving Mail to EDS

This document outlines some preliminary thoughts on moving the mail service to EDS, the so-called "Evolution Data Server". Personally, I don't think it is really a very good idea. I think there will be reasonably large performance and technical hurdles to overcome, and very limited benefit from doing it anyway.

CORBA interfaces

The CORBA interfaces required have pretty well already been defined in the implementation of the mail-remote plugin. That doesn't mean they are fully implemented, but they are defined well enough to be used as is, with some additional missing interfaces added.

I will go through each of them in turn, and list how they interrelate to the corresponding Camel or Mail interfaces.

MailException

This is the standard exception that most interfaces will return. It primarily wraps CamelException's textual data, but can be used for other purposes. Having a single typed exception simplifies CORBA exception processing in the C language.

        enum ErrorType {
                SYSTEM_ERROR,
                CAMEL_ERROR,
                FAILED,
                NOT_SUPPORTED,
                NO_PERMISSION
        };
 
        exception MailException {
                ErrorType id;
                string desc;
        };

Factory

This is a proposed interface, which isn't currently implemented in the mail-remote API. It would allow the EDS server to support multiple, independent backend sessions concurrently, for example multiple email clients with independent mail stores.

This should probably be implemented for an EDS mail server.

        interface Factory : Bonobo::Unknown {
                Session getSession(in string base, in SessionListener listener);
        };

SessionListener

The SessionListener interface is used by EDS to notify the client of changes to the session, and give the client access to the virtual method callbacks on the CamelSession object.

        interface SessionListener : Bonobo::Unknown {
                oneway void changed(in Session session, in SessionChanges changes);
 
                oneway void shutdown(in Session session);
 
                // We need to use gnome-keyring instead of an interface like this?
                // Or does gnome-keyring run off this?
                //string getPassword(in Session session, string uri, string domain, string item, string prompt, long flags);
        };

This is incomplete. getPassword is required for EDS to ask for passwords. Unlike the other EDS interfaces, passwords are not passed a priori; they are requested on demand. This is required because additional passwords may be needed without knowing in advance that they are required, and it just makes sense to do it this way. As mentioned above, however, gnome-keyring could perhaps be used more directly.

There are other calls missing too, like userMessage() and others from CamelSession.

SessionChanges

The session changes correspond to account manipulation.

        struct SessionChange {
                ChangeType type;
                StoreInfos stores;
        };
        typedef sequence <SessionChange> SessionChanges;

Session

The session maps more or less directly to a CamelSession. Currently in mail-remote, this is the singleton root object used to access the data in the Evolution mail session (MailSession). If the Factory interface above were to be implemented, then it would be a singleton for each base directory instead, so multiple instances may exist, one of which would be the Evolution mail store in ~/.evolution/mail.

        interface Session : Bonobo::Unknown {
                boolean getProperties(in PropertyNames names, out Properties props);
 
                StoreInfos getStores(in string pattern, in StoreListener listener)
                        raises (MailException);
 
                void addListener(in SessionListener listener);
                void removeListener(in SessionListener listener);
        };

Note that addListener and removeListener are only required whilst this is a singleton object; they would be removed if the Factory object were implemented.

Also note that the Stores are actually created from the EAccountList, so this isn't strictly the same as CamelSession, whose Stores and Transports are created by external code. The mailer creates the stores based on the EAccountList, so they are the same information.

StoreListener

The StoreListener interface wraps the various Camel Event Hooks on the CamelStore underlying the Store.

        interface StoreListener : Bonobo::Unknown {
                oneway void changed(in Store store, in StoreChanges changes);
        };

For moving to EDS, this should be changed slightly, so that changed sends a StoreInfo rather than a Store, to simplify client processing.

StoreChanges

Store changes come in blocks, which may include the normal added/removed/changed types.

        struct StoreChange {
                ChangeType type;
                FolderInfos folders;
        };
        typedef sequence <StoreChange> StoreChanges;

StoreInfo

Because CORBA object property lookups require round trips, calls returning a Store should normally return a StoreInfo, since it includes some common information you might need to look up about the store, for example, its uid or name.

        struct StoreInfo {
                string name;
                string uid;
                Store store;
        };
        typedef sequence <StoreInfo> StoreInfos;

Store

A Store interface maps pretty well directly to a CamelStore interface. To simplify everything, though, it actually wraps an EAccount object, which may include a CamelStore AND a CamelTransport reference. The Store interface hides these details; they are not important to users of the class. If it is a sending-only account (e.g. receiving type is 'None'), then Store-related methods just return exceptions.

        interface Store : Bonobo::Unknown {
                boolean getProperties(in PropertyNames names, out Properties props);
 
                FolderInfos getFolders(in string pattern, in FolderListener listener)
                        raises (MailException);
 
                void sendMessage(in MessageStream msg)
                        raises (MailException);
        };

To simplify the code (at both ends), folders are just returned in a completely flat namespace. getFolders could perhaps be passed a pattern to list subsets of the namespace, or unsubscribed folders; it is unclear what the best solution is here.

Note that the listener is used to return any changes to the folder list, so you call it once and then get any changes as they occur, rather than attaching signals and then asking for the list. The listener must be supplied; it is also used as life-cycle detection, so the EDS server can clean up resources when clients no longer exist.
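
As a rough illustration of that call-once-then-listen pattern, here is a minimal client-side sketch in Python. It is not part of mail-remote: the ChangeType values, the dictionary stand-ins for the FolderInfo and StoreChange structs, and the choice of full_name as the key are all assumptions made for the example.

```python
# Assumed ChangeType values; the IDL for ChangeType is not shown in this document.
ADDED, REMOVED, CHANGED = "added", "removed", "changed"


class FolderListTracker:
    """Hypothetical client-side FolderListener that keeps a folder list current."""

    def __init__(self, initial_infos):
        # Seed from the FolderInfos returned by the single getFolders() call.
        self.folders = {info["full_name"]: info for info in initial_infos}

    def changed(self, store, changes):
        # Each StoreChange carries one ChangeType and a block of FolderInfos.
        for change in changes:
            for info in change["folders"]:
                if change["type"] == REMOVED:
                    self.folders.pop(info["full_name"], None)
                else:
                    # ADDED and CHANGED both replace whatever we had before.
                    self.folders[info["full_name"]] = info


tracker = FolderListTracker([
    {"name": "INBOX", "full_name": "INBOX"},
    {"name": "Sent", "full_name": "Sent"},
])
tracker.changed(None, [
    {"type": ADDED, "folders": [{"name": "gnome", "full_name": "Lists/gnome"}]},
    {"type": REMOVED, "folders": [{"name": "Sent", "full_name": "Sent"}]},
])
current = sorted(tracker.folders)
```

The client never asks for the list a second time; everything after the initial call arrives through changed().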

Also note that there are additional interfaces required, for example, to create, delete, and rename folders, which are not defined in mail-remote.

FolderInfo

The FolderInfo works similarly to the StoreInfo, but ... for folders.

        struct FolderInfo {
                string name;
                string full_name;
                Folder folder;
        };
        typedef sequence <FolderInfo> FolderInfos;

full_name will be the / separated Unicode name, as normally present in the CamelFolderInfo.full_name field.
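
Since the namespace is flat, a client that wants a folder tree has to rebuild it from the full_name values. The following Python sketch shows one way that might be done; the function name and the nested-dictionary representation are purely illustrative, and it assumes that "/" never appears inside a single folder name.

```python
def build_tree(full_names):
    """Rebuild a nested folder tree from flat, '/'-separated full names."""
    tree = {}
    for full_name in sorted(full_names):
        node = tree
        for part in full_name.split("/"):
            # Parents that were not listed explicitly are created on the way down.
            node = node.setdefault(part, {})
    return tree


tree = build_tree(["INBOX", "Lists/gnome", "Lists/gnome/evolution", "Sent"])
```

Here "Lists" appears in the tree even though it was never returned as a folder in its own right.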

Folder

The Folder interface defines ... wait for it ... a CamelFolder wrapper. Since much of the (old) CamelFolder interface is more fine-grained than will work on a remote interface, it is drastically reduced in complexity.

Also note that any interfaces which can work with more than one item don't take single items to work with, and interfaces that may return all items will do so through batched iterators. This is very important for maintaining any semblance of efficiency and scalability in an IPC environment.

        interface Folder : Bonobo::Unknown {
                boolean getProperties(in PropertyNames names, out Properties props);
 
                MessageIterator getMessages(in string pattern)
                        raises (MailException);
 
                void changeMessages(in MessageInfoSets infos)
                        raises (MailException);
 
                MessageStream getMessage(in string uid)
                        raises (MailException);
 
                void appendMessage(in MessageInfoSet info, in MessageStream msg)
                        raises (MailException);
        };

getMessages is the primary listing interface. It takes a pattern, so this one interface does searching as well as listing.

Note that the disksummary branch has a completely new CamelFolder model, which in some ways maps more closely to this model.

FolderListener

        interface FolderListener : Bonobo::Unknown {
                oneway void changed(in Folder folder, in FolderChanges changes);
        };

Again, changed should take a FolderInfo rather than a Folder, for an EDS version.

FolderChanges

Folder change information comes in blocks.

        struct FolderChange {
                ChangeType type;
                MessageInfos messages;
        };
        typedef sequence <FolderChange> FolderChanges;

MessageIterator

Rather than pass full arrays of message information around, a MessageIterator is used instead. But because individual calls and round trips are expensive, the iterator doesn't return individual objects; it returns an array of them in a block. The interface is almost identical to the MessageStream one.

        interface MessageIterator : Bonobo::Unknown {
                MessageInfos next(in long limit)
                        raises (MailException);

                void dispose();
        };
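
To make the batching concrete, here is a small Python sketch of how a client might drain such an iterator. FakeMessageIterator is a hypothetical in-memory stand-in for the remote object, and treating an empty block as the end of the data is an assumption; the interface above does not say how the end is signalled.

```python
class FakeMessageIterator:
    """In-memory stand-in for a remote MessageIterator (hypothetical)."""

    def __init__(self, infos):
        self._infos = list(infos)
        self._pos = 0
        self.calls = 0          # counts simulated round trips
        self.disposed = False

    def next(self, limit):
        self.calls += 1
        block = self._infos[self._pos:self._pos + limit]
        self._pos += len(block)
        return block

    def dispose(self):
        self.disposed = True


def fetch_all(iterator, limit=100):
    """Drain the iterator block by block, always disposing it afterwards."""
    infos = []
    try:
        while True:
            block = iterator.next(limit)
            if not block:       # assumed end-of-data signal
                break
            infos.extend(block)
    finally:
        iterator.dispose()
    return infos


it = FakeMessageIterator(range(250))
messages = fetch_all(it, limit=100)
```

With 250 items and a limit of 100, this costs four calls (three blocks plus the empty one) rather than 250 individual ones.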

MessageStream

To simplify the CORBA interfaces, folders do not return structured messages like their Camel counterparts, they simply return MIME encoded streams which can easily be decoded in client code using native Camel calls. This could be any stream interface, but because they are read-only, I just wrote a tiny iterator-like interface to implement it.

        typedef sequence <octet> Buffer;
 
        interface MessageStream : Bonobo::Unknown {
                Buffer next(in long size)
                        raises (MailException);

                void dispose();
        };
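
The sketch below illustrates the idea of pulling the raw stream across and decoding it on the client side. It is written in Python, so the standard email module stands in for the native Camel calls a real client would use; FakeMessageStream is a hypothetical in-memory substitute for the remote object, and an empty buffer is assumed to mean end of stream.

```python
import email
import io


class FakeMessageStream:
    """In-memory stand-in for a remote MessageStream (hypothetical)."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self.disposed = False

    def next(self, size):
        return self._buf.read(size)

    def dispose(self):
        self.disposed = True


def read_message(stream, chunk_size=4096):
    """Read the whole stream, then decode the MIME data locally."""
    chunks = []
    try:
        while True:
            chunk = stream.next(chunk_size)
            if not chunk:       # assumed end-of-stream signal
                break
            chunks.append(chunk)
    finally:
        stream.dispose()
    return email.message_from_bytes(b"".join(chunks))


raw = b"From: a@example.com\r\nSubject: Hello\r\n\r\nBody text\r\n"
stream = FakeMessageStream(raw)
msg = read_message(stream, chunk_size=16)
```

The server only ever hands over bytes; all structure is recovered by the client's own MIME parser.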

MessageInfo

MessageInfo structures are used to wrap the read-only, data portions of the CamelMessageInfo structure.

        typedef string UserFlag;
        typedef sequence <UserFlag> UserFlags;
 
        struct UserTag {
                string name;
                string value;   // when setting, value == "" == unset
        };
        typedef sequence <UserTag> UserTags;
 
        struct MessageInfo {
                string uid;
                string subject;
                string to;
                string from;
                long flags;     // CamelMessageInfo flag bits
                UserFlags userFlags;
                UserTags userTags;
        };
        typedef sequence <MessageInfo> MessageInfos;

Note that this structure is not entirely complete; it is missing at least the References list and message list information.

MessageInfoSet

Rather than have a single MessageInfo object which behaves like the CamelMessageInfo structure, a separate setting structure is used for append and changeMessages operations. This structure is then mapped to corresponding camel_message_info_set invocations on the corresponding CamelMessageInfo.

        struct MessageInfoSet {
                string uid;
                long flagSet;   // values of the bits to set in the flags
                long flagMask;  // mask of bits to change in the flags
                UserFlags userFlagSet;
                UserFlags userFlagUnset;
                UserTags userTags;
        };
        typedef sequence <MessageInfoSet> MessageInfoSets;

A sequence of these is passed to change a number of messages at once.
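
A minimal Python sketch of how one MessageInfoSet might be applied to a message's existing state follows. The flag bit values are made up for illustration (the real ones are the CamelMessageInfo flag bits), the dataclass is only a stand-in for the IDL struct, and letting userFlagUnset win over userFlagSet when a flag appears in both is an assumption.

```python
from dataclasses import dataclass, field

# Hypothetical flag bits; the real values are the CamelMessageInfo flag bits.
FLAG_SEEN = 1 << 0
FLAG_DELETED = 1 << 1
FLAG_FLAGGED = 1 << 2


@dataclass
class MessageInfoSet:
    """Python stand-in for the IDL MessageInfoSet struct."""
    uid: str
    flagSet: int = 0
    flagMask: int = 0
    userFlagSet: list = field(default_factory=list)
    userFlagUnset: list = field(default_factory=list)
    userTags: list = field(default_factory=list)  # (name, value) pairs


def apply_info_set(flags, user_flags, user_tags, change):
    """Return the (flags, user_flags, user_tags) that result from one change."""
    # Only bits selected by flagMask change; they take their value from flagSet.
    new_flags = (flags & ~change.flagMask) | (change.flagSet & change.flagMask)
    new_user_flags = (set(user_flags) | set(change.userFlagSet)) - set(change.userFlagUnset)
    new_tags = dict(user_tags)
    for name, value in change.userTags:
        if value == "":
            new_tags.pop(name, None)    # an empty value unsets the tag
        else:
            new_tags[name] = value
    return new_flags, new_user_flags, new_tags


# Mark a message as seen and undelete it, leaving the FLAGGED bit untouched.
change = MessageInfoSet(uid="42", flagSet=FLAG_SEEN,
                        flagMask=FLAG_SEEN | FLAG_DELETED,
                        userFlagSet=["todo"], userTags=[("follow-up", "")])
result = apply_info_set(FLAG_DELETED | FLAG_FLAGGED, {"old"},
                        {"follow-up": "x"}, change)
```

Separating the mask from the values lets a single struct both set and clear bits without the client needing to know the message's current flags.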

Helper Interfaces

It is highly recommended that the CORBA interfaces be wrapped as thinly as possible, both because the CORBA interfaces are quite simple and do the job, and to avoid additional overhead. But some bare Bonobo-based objects are still required to instantiate the CORBA interfaces in usable ways.

These are not client interfaces; they are implemented at the 'service' end of the call. In fact the listeners are used by the clients, but they do little work.

EvolutionMailSession

Wraps an Evolution.Mail.Session object at the service end. This will run against a thread-per-invocation POA.

EvolutionMailSessionListener

Wraps the Evolution.Mail.SessionListener object. It provides signal wrappers for the changed method. It needs to implement some mechanism for the getPassword method, etc. It will run against an idle-handler POA to simplify implementation in the client code.

The use of GObject signals here is almost certainly useless; they could probably be simple function pointers. There is no need for any application to have more than one event handler for each listener.

EvolutionMailStore

Wraps an Evolution.Mail.Store object at the service end.

EvolutionMailStoreListener

Wraps an Evolution.Mail.StoreListener object. Again using signals, which are passed the raw CORBA data types.

EvolutionMailFolder

Wraps an Evolution.Mail.Folder object at the service end.

EvolutionMailFolderListener

Wraps an Evolution.Mail.FolderListener object. Again using signals, which are passed the raw CORBA data types.

EMMessageStream

Wraps an Evolution.Mail.MessageStream object. This is used by both the server and the client code to communicate messages from one end to the other.

Other things to consider

Here are some other notes which may be important, or may just be future possibilities.

Performance

It is likely that these changes will significantly affect performance and memory use.

Memory

Given the current implementation of CamelFolder in CVS HEAD, they will mean, for example, the complete duplication of every 'MessageInfo' in memory, in every client as well as in the EDS process, for the currently displayed folder.

Also, given that this is currently the single most memory-intensive aspect of the mail component, this will likely have a noticeable impact on memory use. This is only a real problem on large folders, larger than 10K-50K messages depending on your system.

Time

Again, since potentially vast gobs of material are going to be moved from one process to another, I would expect about a 100% time-performance drop (i.e. 2x slower) when selecting a different folder to display.

Evolution Mail conversion

Once the above interfaces exist (and they already exist well enough to get something working), the task remains to convert the Evolution Mail code to use them. As stated above, it should use the CORBA interfaces as closely as possible, purely for performance reasons.

This isn't terribly difficult, but there are just a lot of changes required.

Also note that the changes may need to add additional threaded-tasks, since calls which were once guaranteed to be local-memory may now block for indefinite time periods (i.e. message state changing).

Linking

To ensure no remote interfaces are used locally, simply remove libcamel-provider from the link lines of libevolution-mail (and ALL plugins). Any calls they are not allowed to use will thus be removed instantly. This may need adjusting for the use of remote camel ssl streams though.

EAccountList

The current interfaces propose nothing to fix the abomination that is EAccountList. I presume this means we would still be using the shoddy gconf-based IPC mechanism to edit the configuration from client code, and have EDS know about it. This is rubbish and would need to be fixed. This is where the "unified account" thing comes in, and should be solved before mail moves to EDS.

Creating and editing accounts

i.e. EAccount. Currently, editing accounts requires access to libcamel-provider for backend-specific account preferences, and the like. This would most likely need to change, and should also be part of the unified account solution.

A related issue may be the dropping of the CamelProvider configuration information entirely; moving completely to an EPlugin based configuration mechanism. This is probably the right way to go here.

Filtering

Currently filtering is a mixture of backend-invoked filter-on-arrival, client-invoked filter-on-arrival, and user-invoked filter-on-demand.

Ideally these should all be done at the EDS end somehow, so additional interfaces may be required. Certainly the current interface to get the FilterDriver object from the client using the CamelSession virtual method will need to be changed.

VFolders

Although none of these interfaces preclude it, vFolders are completely unaccounted for in the current mail-remote implementation. Obviously they would need to be added somehow, and they must live in the backend for performance and memory reasons.

Ideally this would be done either through extended Store objects specific to controlling vFolders, or through new base Store object interfaces that provide a more generic configuration environment. vFolders are another aspect which should probably be taken into account for the unified account architecture.

Note that vFolders and Filtering are closely tied in implementation details.

DiskSummary branch

The disksummary branch requires changes to almost all of the same code. This should be taken into account before embarking on any of these changes in a separate branch. Otherwise the work will be done twice, for no benefit.

Also note that it has significantly different vFolder mechanisms, and a completely different CamelFolder interface, both of which require slightly different CORBA interfaces as well.

Self Containment

Note that all of these interfaces are fully self-contained. They require no extra discovery interfaces for finding out which accounts are available, and they hide internal details like connection state. It would be best if this were to continue, although an explicit off-line state is probably required for the session.

Additional interfaces should be added for creating accounts; they cannot rely on gconf.

Whether account editing utilities should be moved to libedataserverui is very debatable. Well, as is this whole concept anyway.

Is libcamel-provider needed at all?

Given that these interfaces are so much simpler than the equivalent Camel ones (even once they are complete), it may make sense to drop the Camel equivalents anyway, and leave the libcamel-provider library behind.

This would have the added benefit of making it very easy to implement mail stores in any language that supports CORBA, and would also open the possibility of remote mail stores via CORBA IPC.

Is this all worth it?

I don't think so. Given the extensibility of Evolution using e-plugin, the need for this effort at all is highly suspect. Generally, most applications would want, at most, to send an email or identify who you are.

Any operations that need to do more with email should just sit in the email application; that is what it is good at and what it is there for. EPlugin can be used to easily monitor and access data more directly for external indexing applications to track the data in a reliable manner.

Apps/Evolution/Mail_in_EDS (last edited 2013-08-08 22:50:11 by WilliamJonMcCann)