Shotwell Architecture Overview: Importing

Shotwell can import photos into its library from disk or from a camera (via gPhoto). An import involves several steps: copying the photo from the camera onto disk (if necessary), inspecting the file to ensure it’s a valid photo in a format Shotwell understands, adding it to the database, and creating thumbnails for it. So that the user can keep working with Shotwell while an import is in progress, most of these operations run in the background. To keep the import fast, they also run in parallel whenever possible.

1. Initiating an Import

There are three ways to start an import: by attaching a camera and selecting photos to import from the camera’s page; by selecting File -> Import or File -> Import From F-Spot; or by dragging and dropping photos from Nautilus onto the Shotwell window.

2. BatchImportJob and BatchImport

However the import is started, the initiating code creates one or more BatchImportJobs. A BatchImportJob may represent a file, a directory (which is searched recursively for files), or a photo on the camera. It’s an abstraction that hides the details of the photo’s source.

The code then creates a BatchImport object (the importer) which holds the list of BatchImportJobs. BatchImport offers a number of signals to report when files have been imported (or failed to import), as well as when it’s started and when it’s finished. The camera page, for example, needs to know when the BatchImport is finished so it can unlock the camera.
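The relationship between jobs, the importer, and its signals can be sketched roughly as follows. This is an illustrative Python model, not Shotwell’s actual code: only the names BatchImportJob and BatchImport come from the text above, while FileImportSource, CameraImportSource, the on_* handler lists, and the import_one callback are hypothetical stand-ins.

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterable, List


class BatchImportJob(ABC):
    """Hides where a photo comes from (file, directory, or camera)."""

    @abstractmethod
    def describe(self) -> str:
        """Return a short human-readable description of the source."""


class FileImportSource(BatchImportJob):  # hypothetical name
    def __init__(self, path: str) -> None:
        self.path = path

    def describe(self) -> str:
        return f"file:{self.path}"


class CameraImportSource(BatchImportJob):  # hypothetical name
    def __init__(self, camera: str, filename: str) -> None:
        self.camera = camera
        self.filename = filename

    def describe(self) -> str:
        return f"camera:{self.camera}/{self.filename}"


class BatchImport:
    """Holds the job list and reports progress through callback lists,
    which stand in for signals in this sketch."""

    def __init__(self, jobs: Iterable[BatchImportJob]) -> None:
        self.jobs: List[BatchImportJob] = list(jobs)
        self.on_started: List[Callable[[], None]] = []
        self.on_imported: List[Callable[[BatchImportJob], None]] = []
        self.on_failed: List[Callable[[BatchImportJob], None]] = []
        self.on_finished: List[Callable[[], None]] = []

    @staticmethod
    def _emit(handlers, *args) -> None:
        for handler in handlers:
            handler(*args)

    def run(self, import_one: Callable[[BatchImportJob], bool]) -> None:
        # import_one is a hypothetical callback returning True on success.
        self._emit(self.on_started)
        for job in self.jobs:
            if import_one(job):
                self._emit(self.on_imported, job)
            else:
                self._emit(self.on_failed, job)
        self._emit(self.on_finished)
```

In this model, a camera page would append a handler to on_finished that unlocks the camera, without needing to know how the individual jobs were processed.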

Once the BatchImport is created, it’s handed over to the ImportQueuePage.

3. ImportQueuePage

WorkSniffer -- This background job walks through all the BatchImportJobs to determine whether each represents a file or a directory. If it’s a file, a File object is added to the result list. If it’s a directory, it is traversed recursively and every file found within it is added to the result list as well. No attempt is made to determine the contents of the files.
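The traversal described above might look something like this in Python. It is a minimal sketch that assumes jobs are given as plain filesystem paths; the function name sniff_work is hypothetical, and camera-backed jobs are not modeled.

```python
import os
from typing import Iterable, List


def sniff_work(paths: Iterable[str]) -> List[str]:
    """Expand a mix of files and directories into a flat list of files.

    File contents are never opened here; only the directory structure
    is inspected.
    """
    found: List[str] = []
    for path in paths:
        if os.path.isdir(path):
            for root, dirs, names in os.walk(path):
                dirs.sort()  # make the traversal order deterministic
                for name in sorted(names):
                    found.append(os.path.join(root, name))
        elif os.path.isfile(path):
            found.append(path)
        # Paths that are neither files nor directories are skipped.
    return found
```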

PrepareFilesJob -- This job walks the list of files. If the file needs to be downloaded from the camera, that happens here. Each file’s type is determined from its extension. Then, if the file has EXIF data, that data is loaded, along with any thumbnails embedded in the file, and hashed with MD5. If no EXIF is found, the entire file is hashed. After each file is examined, a notification is sent back to the BatchImport object, which in turn fires off more background threads to continue. Thus, this step may repeat while further import work occurs.
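The hashing rule (hash the EXIF data when present, otherwise hash the whole file) can be illustrated with Python’s hashlib. This is a simplified sketch: read_exif is a hypothetical callback that returns the raw EXIF bytes or None, and embedded thumbnails are left out for brevity.

```python
import hashlib
from typing import Callable, Optional


def md5_of_file(path: str, chunk_size: int = 64 * 1024) -> str:
    """MD5 of the entire file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def import_fingerprint(
    path: str,
    read_exif: Callable[[str], Optional[bytes]],  # hypothetical EXIF reader
) -> str:
    """Return the MD5 used to recognize this photo later."""
    exif = read_exif(path)
    if exif:
        # EXIF present: hash only the metadata block.
        return hashlib.md5(exif).hexdigest()
    # No EXIF: fall back to hashing the full file.
    return md5_of_file(path)
```

Hashing only the metadata avoids reading large image files in full when EXIF is available; the full-file hash is the fallback for everything else.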

FileImportJob -- Duplicate detection occurs in the main thread by searching the database for photos with the same MD5 as the file. If no duplicates are found, a FileImportJob is fired off. This job copies the file into the library (generally ~/Pictures) if necessary or requested by the user. A PhotoFileInterrogator object sniffs out the file’s contents to determine various relevant information, such as the photo’s dimensions, colorspace, etc. That information is used to prepare a data structure for adding to the database. Thumbnails are also generated during this stage.

If all has gone well up to this point, the file is formally imported into the database. Because the thumbnail cache needs a database ID to store its thumbnails, writing the thumbnails has to be delayed until this point.
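The ordering constraint here (check for duplicates, insert the row, and only then store thumbnails under the new ID) can be sketched with an in-memory SQLite database. The table layout, function names, and the dictionary standing in for the thumbnail cache are all assumptions for illustration, not Shotwell’s actual schema.

```python
import sqlite3
from typing import Dict, Optional


def open_library() -> sqlite3.Connection:
    """Create a throwaway in-memory library with a hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE photos (id INTEGER PRIMARY KEY, path TEXT, md5 TEXT)"
    )
    return db


def find_duplicate(db: sqlite3.Connection, md5: str) -> Optional[int]:
    """Return the ID of an existing photo with the same MD5, if any."""
    row = db.execute("SELECT id FROM photos WHERE md5 = ?", (md5,)).fetchone()
    return row[0] if row else None


def import_photo(
    db: sqlite3.Connection,
    thumbnail_cache: Dict[int, bytes],  # stand-in for the thumbnail cache
    path: str,
    md5: str,
    thumbnail: bytes,
) -> Optional[int]:
    """Insert the photo unless it is a duplicate; return its new ID."""
    if find_duplicate(db, md5) is not None:
        return None
    cursor = db.execute(
        "INSERT INTO photos (path, md5) VALUES (?, ?)", (path, md5)
    )
    db.commit()
    photo_id = cursor.lastrowid
    # The thumbnail can only be stored now that the database ID exists.
    thumbnail_cache[photo_id] = thumbnail
    return photo_id
```

Note that the thumbnail bytes are generated before the insert but written to the cache only afterward, mirroring the delay described above.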

ImportManifest -- Details which files were examined and what happened to them (i.e. whether they were imported and, if not, why). When the BatchImport signals it’s finished, this manifest is used to present a result dialog to the user.
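A manifest of this kind is essentially a list of (file, outcome) records plus a way to summarize them. The sketch below shows one possible shape in Python; the ImportResult values, the ManifestEntry type, and the summary format are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ImportResult(Enum):  # hypothetical set of outcomes
    SUCCESS = "imported"
    DUPLICATE = "duplicate"
    UNSUPPORTED_FORMAT = "unsupported format"
    FILE_ERROR = "file error"


@dataclass
class ManifestEntry:
    path: str
    result: ImportResult


@dataclass
class ImportManifest:
    entries: List[ManifestEntry] = field(default_factory=list)

    def record(self, path: str, result: ImportResult) -> None:
        """Note what happened to one examined file."""
        self.entries.append(ManifestEntry(path, result))

    def imported(self) -> List[ManifestEntry]:
        return [e for e in self.entries if e.result is ImportResult.SUCCESS]

    def skipped(self) -> List[ManifestEntry]:
        return [e for e in self.entries if e.result is not ImportResult.SUCCESS]

    def summary(self) -> str:
        """Text a result dialog could display once the import finishes."""
        lines = [
            f"{len(self.imported())} imported, "
            f"{len(self.skipped())} not imported"
        ]
        for entry in self.skipped():
            lines.append(f"  {entry.path}: {entry.result.value}")
        return "\n".join(lines)
```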