Shotwell Architecture: Photo File Formats

1. Some Commentary

One enlightened user on Slashdot proffered this comment when the discussion turned toward Shotwell 0.5 not supporting PNG:

They don't do PNG? What, are they writing their own image handling codecs from scratch? What kind of half-assed project doesn't build on the existing available libraries to handle low-level things like image formats? Sounds like someone needs to review a college first year CS textbook.

To be clear, you'll not find a line of image decoding in half-assed Shotwell; we do use existing libraries. We didn't simply support every image type out there because supporting a format well means handling all of its quirks and issues: alpha channel, metadata, performance characteristics, even thumbnail generation all have a role in Shotwell supporting a particular file type. We could have simply supported every file type that GDK decodes. The problem is, we wouldn't have done it very well.

In 0.6, we expanded support to PNG and RAW photos (and, thanks to Exiv2 and gexiv2, we support all the metadata they might hold). PNG is supported via GDK's internal support. RAW support comes from LibRaw, which is built on Dave Coffin's dcraw. Support for other file types is planned. Adding them is relatively painless using Shotwell's PhotoFileFormat abstraction, which operates as a kind of driver (or adapter) providing read/write primitives, metadata support, and file format information, so that Shotwell can manage the photo file generically. (There is some special-case code for RAW, which has particular user requirements.)

2. A Note on RAW

RAW files are oddballs in the world of digital photography. When working with and discussing RAW files, there are two things to keep in mind. First, there is no single RAW format; there are dozens. Although Adobe has proposed a standard DNG file format, most camera manufacturers opt to use their own proprietary formats, some even going so far as to encrypt the file to avoid use by third-party (and open source) tools. Thus, when this document speaks of “RAW files” or the “RAW file format,” it means all of these many dozens of file formats. Thanks to LibRaw and dcraw, however, the differences between these formats disappear, and Shotwell can treat them as though they are one.

Second, Shotwell cannot work with RAW image data directly. RAW files are not traditional raster images made up of rows of RGB pixel values at some level of precision (such as 16 bits or 24 bits per pixel). Instead, RAW files are a direct dump of the light intensities recorded by a digital camera’s image sensor (CCD or CMOS) at the moment of exposure. Like traditional raster image files, RAW files are rows and columns of intensity numbers. Unlike traditional raster images, what these numbers mean is highly camera-specific. The best way to think of a RAW photo is as a digital negative. Just like film negatives in traditional analog photography, RAW digital negatives aren’t usable until they’re “developed” into conventional, 8-bit-per-component RGB pixel data for display. In digital photography, this development involves the application of mathematical tone-mapping curves that scale and bias the RAW image data into traditional pixel values. Shotwell supports selectable developments of RAW files. See the RAW Files and RAW Developments section of this guide for more information.
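
To make "scale and bias" a little more concrete, here is a minimal numeric sketch of developing a single sensor reading into an 8-bit value. It is written in Python purely for illustration and is not how LibRaw or Shotwell develop RAW files. The 12-bit white level, the black level, and the plain gamma curve are all assumptions chosen for the example, and real development also involves steps such as demosaicing and white balance that are left out here.

```python
def develop_sample(raw_value, black_level=64, white_level=4095, gamma=2.2):
    """Map one linear sensor reading to an 8-bit display value.

    black_level, white_level and gamma are illustrative assumptions,
    not values taken from any particular camera.
    """
    # Bias: clamp to the valid range, then subtract the black level.
    clamped = min(max(raw_value, black_level), white_level)
    # Scale: normalise the reading to the range 0.0 .. 1.0.
    linear = (clamped - black_level) / (white_level - black_level)
    # Tone curve: a plain gamma curve, which brightens the midtones.
    toned = linear ** (1.0 / gamma)
    return round(toned * 255)


def develop_row(raw_row, **params):
    """Develop one row of sensor readings with the same parameters."""
    return [develop_sample(value, **params) for value in raw_row]
```

Choosing a different curve (or different levels) yields a different "development" of the same negative, which is the idea behind selectable developments.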

3. PhotoFileFormat

The key to the entire system is the PhotoFileFormat enum. Enums in Vala are quite powerful. They can support their own methods, for example. PhotoFileFormat is simply a list of the file formats (currently BMP, JPEG/JFIF, PNG, RAW, and TIFF) that Shotwell understands. From this single enum, the entire subsystem for file format support unfolds.

Each file format provides a PhotoFileFormatDriver (a singleton class) which can be obtained via PhotoFileFormat. In turn, this driver provides accessors to other classes/objects which are relevant to examining and manipulating files of that format: PhotoFileReader, PhotoMetadata, and PhotoFileSniffer. PhotoFileFormat provides helper methods to automatically return these objects, simply to protect the caller from an additional level of indirection.
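
The relationship between the enum, its driver, and the helper objects can be sketched as follows. This is a Python analogy, not Shotwell's Vala source: the method names (get_driver, create_reader, get_instance) are assumptions for illustration, and a single driver class stands in for what would really be one driver per format.

```python
import threading
from enum import Enum


class PhotoFileReader:
    """Stand-in reader: remembers only the file path and its format."""

    def __init__(self, filepath, file_format):
        self.filepath = filepath
        self.file_format = file_format


class PhotoFileFormatDriver:
    """One shared instance per format; hands out format-specific helpers."""

    _instances = {}
    _lock = threading.Lock()

    def __init__(self, file_format):
        self.file_format = file_format

    @classmethod
    def get_instance(cls, file_format):
        # Create each driver once and reuse it afterwards. The lock keeps
        # the lazy creation safe when several threads ask at once.
        with cls._lock:
            if file_format not in cls._instances:
                cls._instances[file_format] = cls(file_format)
            return cls._instances[file_format]

    def create_reader(self, filepath):
        return PhotoFileReader(filepath, self.file_format)


class PhotoFileFormat(Enum):
    # The formats listed in this section; the values are arbitrary labels.
    BMP = "bmp"
    JFIF = "jfif"
    PNG = "png"
    RAW = "raw"
    TIFF = "tiff"

    def get_driver(self):
        return PhotoFileFormatDriver.get_instance(self)

    def create_reader(self, filepath):
        # Helper that spares the caller the extra hop through the driver.
        return self.get_driver().create_reader(filepath)
```

With this shape, a caller that knows only the format can write PhotoFileFormat.PNG.create_reader("photo.png") and never touch the driver directly.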

Note that all these objects are expected to be thread-safe. However, file system atomicity (i.e. file locking) is not expected of them.

4. PhotoFileFormatProperties and PhotoFileSniffer

PhotoFileFormatProperties provides a smorgasbord of methods offering details about the file format itself (and not any particular file): common file extensions, MIME types, and flags that are of interest to the photo subsystem. One use of these details is in the import system, to weed out non-photo files from photo files.
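
As an illustration of the import-time use, the sketch below filters file names by extension using a small properties object. It is a Python stand-in with assumed names (is_recognized_extension, looks_like_photo, KNOWN_FORMATS), and the extension and MIME lists are common values rather than Shotwell's actual tables.

```python
import os


class PhotoFileFormatProperties:
    """Static facts about a file format, independent of any one file."""

    def __init__(self, name, extensions, mime_types):
        self.name = name
        self.extensions = tuple(e.lower() for e in extensions)
        self.mime_types = tuple(mime_types)

    def is_recognized_extension(self, extension):
        # Accept "JPG", ".jpg", "jpg" and so on.
        return extension.lower().lstrip(".") in self.extensions


# Illustrative table only; not the list Shotwell ships with.
KNOWN_FORMATS = [
    PhotoFileFormatProperties("JPEG", ["jpg", "jpeg", "jpe"], ["image/jpeg"]),
    PhotoFileFormatProperties("PNG", ["png"], ["image/png"]),
    PhotoFileFormatProperties("TIFF", ["tif", "tiff"], ["image/tiff"]),
    PhotoFileFormatProperties("BMP", ["bmp", "dib"], ["image/bmp"]),
]


def looks_like_photo(filename):
    """Cheap first-pass filter that looks only at the file name."""
    extension = os.path.splitext(filename)[1]
    return any(p.is_recognized_extension(extension) for p in KNOWN_FORMATS)
```

A check like this is cheap because it never opens the file, which is why it suits a first pass over a large import directory.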

PhotoFileSniffer performs a more aggressive examination of a photo file. It determines if the file is actually a photo file by examining its contents. It also returns information about the photo itself (dimensions, colorspace, MD5 checksums for duplicate detection) and photographic metadata. Since each file format has its own sniffer, the PhotoFileInterrogator automates sniffing of a file to determine its type. (Note that PhotoFileInterrogator itself is not thread-safe.)
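
The sniff-then-interrogate idea can be sketched in a few lines. This Python example is an assumption-laden illustration, not Shotwell's implementation: it recognises formats by their well-known leading bytes, reads PNG dimensions from the IHDR chunk, skips JPEG dimensions entirely, and the function names (sniff_png, sniff_jpeg, interrogate) are made up for the example.

```python
import hashlib
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SOI = b"\xff\xd8\xff"


def sniff_png(data):
    """Return basic facts about a PNG, or None if the data is not PNG."""
    if not data.startswith(PNG_SIGNATURE) or len(data) < 24:
        return None
    # The IHDR chunk follows the signature; width and height are
    # big-endian 32-bit integers at byte offsets 16 and 20.
    width, height = struct.unpack(">II", data[16:24])
    return {"format": "PNG", "width": width, "height": height}


def sniff_jpeg(data):
    """Recognise a JPEG by its start-of-image marker (dimensions omitted)."""
    if not data.startswith(JPEG_SOI):
        return None
    return {"format": "JPEG"}


SNIFFERS = [sniff_png, sniff_jpeg]


def interrogate(data):
    """Try each format's sniffer in turn and return the first match."""
    for sniff in SNIFFERS:
        result = sniff(data)
        if result is not None:
            # A checksum of the contents lets an importer spot duplicates.
            result["md5"] = hashlib.md5(data).hexdigest()
            return result
    return None
```

Two files with the same checksum can be treated as duplicate candidates regardless of their names.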

5. PhotoFileReader and PhotoFileWriter

PhotoFileReader and PhotoFileWriter (which derive from a common base class, PhotoFileAdapter) are concerned with I/O on the file format. Because classes named FooReader and BarWriter are common in many streams-based I/O libraries, let's be clear: These classes are not stream-based. Their only concern is reading and writing images (i.e. GDK pixbufs) and metadata. They keep almost no state (beyond the filename), and should avoid caching unless absolutely necessary. They may come and go constantly (they're not necessarily held long-term by any object).

One unusual aspect of PhotoFileReader is its scaled read, which loads the image at a requested size rather than at full size. This takes advantage of decoding optimizations in some libraries. This call does require that the full-sized dimensions of the image are known prior to loading it; this is one reason why PhotoFileSniffer (and the entire import process) is important to Shotwell's performance.
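
Here is a small sketch of why the full-sized dimensions have to be known up front. It is a Python illustration, not Shotwell code, and both function names are hypothetical. The second function assumes a decoder that can reduce by 1/2, 1/4, or 1/8 while decoding, as libjpeg can for JPEG; other libraries may offer different factors or none at all.

```python
def scaled_dimensions(full_width, full_height, max_side):
    """Fit the image inside a max_side x max_side box, keeping its shape."""
    largest = max(full_width, full_height)
    if largest <= max_side:
        return full_width, full_height  # never scale up
    ratio = max_side / largest
    return (max(1, round(full_width * ratio)),
            max(1, round(full_height * ratio)))


def decoder_reduction(full_width, full_height, target_width, target_height):
    """Pick the largest 1/2, 1/4 or 1/8 reduction that still yields at
    least the target size, so the final resize only ever shrinks."""
    for denominator in (8, 4, 2):
        if (full_width // denominator >= target_width
                and full_height // denominator >= target_height):
            return denominator
    return 1  # decode at full size
```

Neither calculation is possible without the original width and height, which the sniffer has already recorded at import time.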

6. PhotoMetadata

Shotwell switched from libexif to Exiv2 in 0.6. Exiv2 offers a broader range of support for various file formats as well as more metadata domains, including XMP and IPTC. PhotoMetadata provides additional abstractions and utility functions over Exiv2.

Unlike the other classes in this subsystem, PhotoMetadata carries no expectation of thread-safety. (Unlike the others, it's mutable and stores a lot of state in memory.) This may change in the future, but that is not expected.

PhotoMetadata offers various primitives for loading and saving metadata to/from photo files, reading and updating metadata of various types (string, integers, dates, rationals), and numerous utility methods to save the trouble of searching the various metadata domains for a particular value (for example, there are potentially four different places to look in a photo file for its caption). PhotoMetadata also offers a way to query and load the various previews (i.e. thumbnails) that may be stored in the file.
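
The "several places to look" problem boils down to a fallback search across metadata domains, sketched below in Python. The metadata is modelled as a plain dictionary rather than a real Exiv2 binding, the function names are hypothetical, and while the four keys are genuine Exiv2 key names, which keys Shotwell actually consults and in what order is an assumption here.

```python
# Candidate locations for a caption, in an assumed order of preference.
CAPTION_KEYS = [
    "Exif.Image.ImageDescription",
    "Exif.Photo.UserComment",
    "Iptc.Application2.Caption",
    "Xmp.dc.description",
]


def first_value(metadata, keys):
    """Return the first non-empty string found under any of the keys."""
    for key in keys:
        value = metadata.get(key)
        if value is not None and value.strip():
            return value.strip()
    return None


def get_caption(metadata):
    """Utility wrapper so callers need not know where captions live."""
    return first_value(metadata, CAPTION_KEYS)
```

Callers ask one question ("what is the caption?") and the domain-by-domain search stays in one place.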