Camel.Utils, disksummary

This page documents some utility classes developed on the disksummary branch.

Camel.Record

The CamelRecord utilities are used to encode or decode binary record structures in a way that is intended to be relatively efficient, portable, and version-independent.

They rework the Evolution/Camel.Misc#Camel.FileUtils interfaces, adding a bit more structure to allow this version independence.

Each record consists of variable-sized, tagged blocks of data, which are versioned independently. This lets new versions of code append new data fields without upsetting the structure of neighbouring records.

Each subclass which might write additional fields of a given structure puts them in its own tagged block, which gives it independence from the version of its parent or subclasses for the same structure. The only requirement is that each subclass use a unique tag identifier for its section in this record - tags needn't be globally unique, just unique per class.

Data format

The encoded data format consists of a number of sections concatenated together - the overall size of the record must be recorded externally. All records are stored in native endianness to simplify decoding. There is currently no endianness marker, so these records are not sharable across different architectures; this could be fixed by storing an endianness marker inside the TAG information.

Each section consists of a 4-byte header followed by the section data records.

 TAG  VER  SIZE

TAG: 8 bit tag information.
VER: 8 bit version information.
SIZE: 16 bit size of this record, in native-endian format. Currently no endianness information is stored; it should probably be the top bit of the TAG.
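
As a rough illustration of the header layout above, the following C sketch packs and unpacks the four header bytes. The names SectionHeader, section_header_pack and section_header_unpack are hypothetical and not part of Camel; only the TAG, VER, SIZE order and the native byte order of SIZE are taken from the description.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names: this mirrors the TAG/VER/SIZE layout described
 * above, it is not the actual Camel source. */
enum { SECTION_HEADER_SIZE = 4 };

typedef struct {
    uint8_t tag;    /* 8-bit tag */
    uint8_t ver;    /* 8-bit version */
    uint16_t size;  /* 16-bit size, native byte order */
} SectionHeader;

/* Write the 4-byte header to out[0..3]. */
static void
section_header_pack(unsigned char *out, const SectionHeader *h)
{
    assert(out != NULL && h != NULL);
    out[0] = h->tag;
    out[1] = h->ver;
    /* memcpy keeps the native byte order and avoids unaligned stores */
    memcpy(out + 2, &h->size, sizeof h->size);
}

/* Read a header back. Returns 0 if fewer than 4 bytes are available. */
static int
section_header_unpack(const unsigned char *in, size_t len, SectionHeader *h)
{
    assert(in != NULL && h != NULL);
    if (len < SECTION_HEADER_SIZE)
        return 0;
    h->tag = in[0];
    h->ver = in[1];
    memcpy(&h->size, in + 2, sizeof h->size);
    return 1;
}
```

Because SIZE is copied in native byte order, a header written on one architecture only reads back correctly on a machine with the same endianness.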

Each type is then stored in a packed format after this, depending on the size of the value. All integral types are stored as a direct copy of the value, at their native size, packed at the next storage location.

Strings are stored as NUL-terminated character strings.

Because no information about the structure of individual records is stored, it is up to the code to interpret each record properly.
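
The packing rules above can be sketched in C as follows. The helpers put_u8, put_u32 and put_string are hypothetical names used only for illustration; they assume the caller has already made sure the buffer is large enough.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers illustrating the packed layout; not the Camel API.
 * Each one writes at offset `off` and returns the next free offset. */

static size_t
put_u8(unsigned char *buf, size_t off, uint8_t v)
{
    buf[off] = v;
    return off + 1;
}

/* Copy a 32-bit value at its native size and byte order, with no padding. */
static size_t
put_u32(unsigned char *buf, size_t off, uint32_t v)
{
    memcpy(buf + off, &v, sizeof v);
    return off + sizeof v;
}

/* Copy a string including its terminating NUL. */
static size_t
put_string(unsigned char *buf, size_t off, const char *s)
{
    size_t n;

    assert(s != NULL);
    n = strlen(s) + 1;
    memcpy(buf + off, s, n);
    return off + n;
}
```

Writing an 8-bit value, a 32-bit value and the string "hi" in that order occupies 1 + 4 + 3 bytes. Nothing in the bytes says which types were written, so the reader has to consume them in the same order.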

Camel.RecordEncoder

The encoder encodes multiple blocks of data into a GByteArray which can then be accessed as a single data block for i/o.

 struct _CamelRecordEncoder {
        GByteArray *out;
        int section;
 };
 
 CamelRecordEncoder *camel_record_encoder_new(void);
 void camel_record_encoder_free(CamelRecordEncoder *cde);

Encoders can be re-used to encode more than one record; just call reset.

 void camel_record_encoder_reset(CamelRecordEncoder *cde);

All data must exist inside a section. Sections cannot be nested. Simply start a new section before writing data and then end it when finished. Note that although tag and version are passed as ints, only 8-bit values are stored for them.

 void camel_record_encoder_start_section(CamelRecordEncoder *cde, int tag, int version);
 void camel_record_encoder_end_section(CamelRecordEncoder *cde);

Then there are the writing functions for each basic type.

 void camel_record_encoder_string(CamelRecordEncoder *cde, const char *s);
 void camel_record_encoder_int8(CamelRecordEncoder *cde, guint8 v);
 void camel_record_encoder_int32(CamelRecordEncoder *cde, guint32 v);
 void camel_record_encoder_int64(CamelRecordEncoder *cde, guint64 v);
 void camel_record_encoder_timet(CamelRecordEncoder *cde, time_t v);
 void camel_record_encoder_sizet(CamelRecordEncoder *cde, size_t v);

NULL strings can be encoded, but they are translated to "" when decoded.
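
To make the start/write/end flow concrete, here is a miniature stand-in for the encoder in C. It is a sketch under several assumptions, not the Camel implementation: it writes to a fixed buffer instead of a GByteArray, it assumes the section field holds the offset of the open section header, it assumes SIZE counts only the payload bytes after the header, and it assumes a NULL string is written as an empty string. All Mini* and mini_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature encoder over a fixed buffer. */
typedef struct {
    unsigned char out[256];
    size_t len;     /* bytes written so far */
    int section;    /* offset of the open section header, or -1 */
} MiniEncoder;

static void
mini_reset(MiniEncoder *e)
{
    e->len = 0;
    e->section = -1;
}

static void
mini_append(MiniEncoder *e, const void *p, size_t n)
{
    assert(e->section >= 0);            /* data must live inside a section */
    assert(e->len + n <= sizeof e->out);
    memcpy(e->out + e->len, p, n);
    e->len += n;
}

static void
mini_start_section(MiniEncoder *e, int tag, int version)
{
    assert(e->section < 0);             /* sections cannot be nested */
    assert(e->len + 4 <= sizeof e->out);
    e->section = (int) e->len;
    e->out[e->len++] = (unsigned char) tag;     /* only the low 8 bits are kept */
    e->out[e->len++] = (unsigned char) version;
    e->out[e->len++] = 0;               /* SIZE placeholder, patched on end */
    e->out[e->len++] = 0;
}

static void
mini_end_section(MiniEncoder *e)
{
    uint16_t size;

    assert(e->section >= 0);
    /* Assumption: SIZE counts the payload bytes after the 4-byte header. */
    size = (uint16_t) (e->len - (size_t) e->section - 4);
    memcpy(e->out + e->section + 2, &size, sizeof size);
    e->section = -1;
}

static void
mini_int32(MiniEncoder *e, uint32_t v)
{
    mini_append(e, &v, sizeof v);
}

static void
mini_string(MiniEncoder *e, const char *s)
{
    if (s == NULL)
        s = "";                         /* assumption: NULL is written as "" */
    mini_append(e, s, strlen(s) + 1);
}
```

With this sketch, a section holding one 32-bit value, the string "abc" and a NULL string takes 4 header bytes plus 9 payload bytes, and the patched SIZE field is 9.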

Camel.RecordDecoder

The record decoder helps decode records generated with the encoder. It can be used to iterate through all sections, and then extract the structured record data from each section.

It works on a binary buffer of known size, and will enforce both section boundaries and the data size.

 struct _CamelRecordDecoder {
        const unsigned char *data;
        const unsigned char *dataend;
 
        const unsigned char *pos;
        const unsigned char *end;
 };
 
 CamelRecordDecoder *camel_record_decoder_new(const unsigned char *data, int len);
 void camel_record_decoder_free(CamelRecordDecoder *cdd);
 void camel_record_decoder_reset(CamelRecordDecoder *cdd);

Sections are iterated using next_section. You can also get the version back, if you need to do versioning. This should only ever be used to short-cut how many data items are read back - any changes to a record should purely consist of appending new values.

 int camel_record_decoder_next_section(CamelRecordDecoder *cdd, int *verp);

And then the corresponding decoder functions for each type.

 const char *camel_record_decoder_string(CamelRecordDecoder *cdd);
 guint8 camel_record_decoder_int8(CamelRecordDecoder *cdd);
 guint32 camel_record_decoder_int32(CamelRecordDecoder *cdd);
 guint64 camel_record_decoder_int64(CamelRecordDecoder *cdd);
 time_t camel_record_decoder_timet(CamelRecordDecoder *cdd);
 size_t camel_record_decoder_sizet(CamelRecordDecoder *cdd);

Reads beyond the end of a section return default values, i.e. 0 for integral types and "" for strings. So even if a newer implementation adds new values, it might not need to version the decoding.
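
The section iteration and the default-value behaviour can be sketched with a miniature decoder in C. This is an illustration under assumptions rather than the Camel source: the field names follow the struct shown earlier, but the return convention of the section iterator (the tag, or -1 when no sections remain) and the meaning of SIZE (payload bytes only) are guesses. All Mini* and mini_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature decoder. */
typedef struct {
    const unsigned char *data, *dataend;    /* whole record */
    const unsigned char *pos, *end;         /* current section payload */
} MiniDecoder;

static void
mini_decoder_init(MiniDecoder *d, const unsigned char *data, size_t len)
{
    assert(d != NULL && data != NULL);
    d->data = data;
    d->dataend = data + len;
    d->pos = d->end = data;     /* no section selected yet */
}

/* Advance to the next section. Returns its tag, or -1 when none remain. */
static int
mini_next_section(MiniDecoder *d, int *verp)
{
    const unsigned char *h = d->end;    /* next header follows this payload */
    uint16_t size;

    if (d->dataend - h < 4)
        return -1;
    memcpy(&size, h + 2, sizeof size);
    if ((size_t) (d->dataend - h - 4) < size)
        return -1;                      /* truncated section */
    if (verp != NULL)
        *verp = h[1];
    d->pos = h + 4;
    d->end = d->pos + size;
    return h[0];
}

static uint32_t
mini_get_int32(MiniDecoder *d)
{
    uint32_t v = 0;

    if ((size_t) (d->end - d->pos) < sizeof v) {
        d->pos = d->end;                /* past the end: default to 0 */
        return 0;
    }
    memcpy(&v, d->pos, sizeof v);
    d->pos += sizeof v;
    return v;
}

static const char *
mini_get_string(MiniDecoder *d)
{
    const unsigned char *s = d->pos;
    size_t left = (size_t) (d->end - s);
    const unsigned char *nul = left ? memchr(s, 0, left) : NULL;

    if (nul == NULL) {
        d->pos = d->end;                /* missing or unterminated: default to "" */
        return "";
    }
    d->pos = nul + 1;
    return (const char *) s;
}
```

A reader built this way can ask for a field that an older writer never stored and simply gets 0 or "" back, which is what lets new fields be appended without bumping the version.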

Record Notes

Perhaps the packing rules should be similar to C, or even identical to C, i.e. bytes are packed as bytes but multi-byte values are aligned to their natural boundaries. However, the data is designed as a storage format, so this is a relatively weak argument.
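
The difference between the two layouts can be seen with a small C sketch. The struct and names below are hypothetical; the padding a compiler inserts depends on the ABI, so the code computes it rather than assuming a value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one byte followed by one 32-bit value. */
struct example {
    uint8_t flag;
    uint32_t count;
};

/* Size when each field is packed at the next storage location: 1 + 4. */
enum { PACKED_SIZE = sizeof(uint8_t) + sizeof(uint32_t) };

/* Extra bytes the C compiler adds for alignment (ABI-dependent). */
static size_t
padding_bytes(void)
{
    assert(sizeof(struct example) >= (size_t) PACKED_SIZE);
    return sizeof(struct example) - (size_t) PACKED_SIZE;
}
```

On many common ABIs the struct occupies 8 bytes, so C-style alignment would cost 3 bytes of padding for this record compared with the 5 bytes of the packed format.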

Apps/Evolution/CamelDS.Misc (last edited 2013-08-08 22:50:05 by WilliamJonMcCann)