Projects/GdkPixbuf/Optimization

First of all, I have no evidence that GdkPixbuf is a bottleneck in normal operation. However, it does a number of things that waste memory or CPU time, or that add latency. Implementing some of the ideas here would likely make GNOME snappier and/or work better on low-end hardware.

The particular situation where GdkPixbuf could be faster is displaying an image from a file to the screen, either scaled or unscaled. It is assumed that this is the primary usage of GdkPixbuf. GdkPixbuf supports a number of other situations, such as incremental loading of images from a network connection, which directly conflict with optimizing the primary use.

The optimal design for getting an image from a file to the display (assuming no additional features) is to mmap() the file and have the backend decode directly to an XShmImage, in X's native image format. In this case, the time-consuming part is the actual decoding.

Some notes:

Calling mmap() is usually pretty fast, since it doesn't really do much. If the file is not cached, there's a penalty due to the disk seek when you read the first byte in the mmap'd area.
Reading a file using read() incurs the disk seek penalty during the read() system call.
Some people have measured performance of a function operating on the contents of a file and claim that using read() is faster than mmap() -- this is typically due to incorrect measurement, i.e., not including the cost of the mmap() or read().
Algorithms using mmap() are always faster than using read() unless the files are very small.
POSIX treats file descriptors for files on disk as always ready for reading. If you assume that the disk is infinitely fast, this is true. But if your application could be doing something more useful in the 8 ms it takes to do a disk seek, select() and poll() won't tell you that the read is going to block.
mincore() can be used to test whether pages of an mmap'd file are resident, i.e., whether they can be accessed without going to disk. An application can use this to decide whether to decode an image now or to wait until the entire file is resident.
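
As a rough illustration of the mincore() note above, here is a minimal sketch that maps a file and counts how many of its pages are already resident. It assumes Linux and glibc (the mincore() prototype differs slightly on other systems), and count_resident_pages is a hypothetical helper name, not something GdkPixbuf provides.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes mincore() on glibc */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the number of pages of `path` that are
 * resident in memory and stores the total page count in *total_pages.
 * Returns -1 on any error, and also for empty files (nothing to map).
 */
long count_resident_pages(const char *path, long *total_pages)
{
    long resident = -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t npages = (len + page - 1) / page;

        /* Mapping is cheap; no file data is read at this point. */
        void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map != MAP_FAILED) {
            unsigned char *vec = malloc(npages);
            if (vec != NULL && mincore(map, len, vec) == 0) {
                resident = 0;
                for (size_t i = 0; i < npages; i++)
                    resident += vec[i] & 1; /* low bit set: page is resident */
                *total_pages = (long)npages;
            }
            free(vec);
            munmap(map, len);
        }
    }
    close(fd);
    return resident;
}
```

A loader could compare the returned count with *total_pages and decode immediately when they are equal, or defer the decode otherwise. The answer is only a snapshot: pages can be evicted between the check and the decode.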

FedericoMenaQuintero rants

But is the common case really about immediately displaying an image file to the screen? I would think that the most common use case in GNOME is to load an icon into a pixbuf, and keep it around forever. The pixbuf then gets painted on the screen when you pop up a menu, or open a Nautilus window.

The second most common case is that of an app loading an image that will be a part of its user interface, and displaying it when the user activates that part of the GUI.

If you want to fix the "load icons quickly" problem, you need to extend the mmap()able icon theme cache format to allow embedding the actual image data in it. Then, you can have stock icons whose pixbufs reference the mmap()ed data directly.

If you want to solve the general problem of loading images quickly, you need to fix the image libraries. Libjpeg is horrendously slow. Compare it to Photoshop, for example, and you'll see what I mean. I don't know if libpng is slow; maybe it is limited by zlib's own performance --- someone needs to profile that.

You could of course mmap() an image file and then have libjpeg read out of the mapped data. With the current state of things, however, I'd say that performance is limited by the decoder itself, not by reading the file.

In the context of GUI apps, you never really do "read an image from a file and decode it to the screen as fast as possible" in a single step; you need to read the image and keep the data around, waiting for an Expose event to come in.

MatthiasClasen comments

This kind of unsubstantiated talk is not leading anywhere. If you have no idea if GdkPixbuf is a bottleneck, then go measure it before making wild claims about mmap() "always being faster".

BobMurphy remarks

Speaking as someone who works in the embedded space on low-end hardware (e.g. cell phones), I have run into some specific problems with GdkPixbuf reading and scaling down raster images.

As Matthias remarks, memory usage can be a problem. If you want to display an image on a small screen, or make a thumbnail of it, GdkPixbuf's standard approach is to decode the entire image into memory and then scale it down. That can be a problem if you've got a 5 megapixel image from a digital camera, which expands to 20 MB in RAM at 32 bpp, and your device only has 64 MB RAM in the first place.
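
To make the arithmetic concrete, here is a small sketch of the decoded-size calculation. The 2592x1944 resolution is an assumption (a typical 5 megapixel sensor size, not a figure from the discussion), decoded_size_bytes is a hypothetical helper, and any row padding a real pixbuf might add is ignored.

```c
#include <stdint.h>

/*
 * Bytes needed to hold a fully decoded image, ignoring any row padding.
 * Hypothetical helper for illustration only.
 */
uint64_t decoded_size_bytes(uint32_t width, uint32_t height,
                            uint32_t bits_per_pixel)
{
    uint64_t bytes_per_pixel = (bits_per_pixel + 7u) / 8u;
    return (uint64_t)width * height * bytes_per_pixel;
}
```

For the assumed 2592x1944 image at 32 bpp this gives 20,155,392 bytes, which is where the figure of roughly 20 MB comes from, and close to a third of the RAM on a 64 MB device.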

GdkPixbuf does have a partial solution for JPEG/JFIF images. That's because libjpeg will scale the image for you to 1/2, 1/4, or 1/8 the original size in each dimension while decoding. So the JPEG loader does that first, and then applies GdkPixbuf's internal scaling if needed. But that doesn't help for other raster formats.
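
As a sketch of how such a scale could be chosen, the function below picks the largest of the denominators 8, 4, 2 that still leaves the decoded image at least as large as the requested size, so the final scaling step only ever shrinks. pick_jpeg_scale_denom is a hypothetical name and this is not necessarily how GdkPixbuf's loader decides. With libjpeg itself, the result would go into the scale_num (set to 1) and scale_denom fields of the decompress struct after jpeg_read_header() and before jpeg_start_decompress().

```c
#include <stddef.h>

/* ceil(a / b) for b > 0; libjpeg rounds scaled dimensions up. */
static unsigned ceil_div(unsigned a, unsigned b)
{
    return (a + b - 1) / b;
}

/*
 * Hypothetical helper: returns 8, 4, 2 or 1, the largest denominator for
 * which the decoder's output is still at least dst_w x dst_h pixels.
 */
unsigned pick_jpeg_scale_denom(unsigned src_w, unsigned src_h,
                               unsigned dst_w, unsigned dst_h)
{
    static const unsigned denoms[] = { 8, 4, 2 };

    for (size_t i = 0; i < sizeof denoms / sizeof denoms[0]; i++) {
        unsigned d = denoms[i];
        if (ceil_div(src_w, d) >= dst_w && ceil_div(src_h, d) >= dst_h)
            return d;
    }
    return 1; /* target is close to full size: decode unscaled */
}
```

For an assumed 2592x1944 source, a 320x240 target can use 1/8 (the decoder produces 324x243), while a 640x480 target has to settle for 1/4.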

For raster formats whose underlying libraries report sequential (e.g. non-interlaced) scan lines, it would be possible for GdkPixbuf itself to do something similar to what libjpeg does:

1. Read one or several scan lines at a time, and combine them into a smaller intermediate GdkPixbuf image
2. Scale the intermediate image into the final GdkPixbuf, either via the default bilinear interpolation, or a user-chosen algorithm

This limits memory requirements by using a smaller intermediate image. Also, step 1 could be done using a stupid-but-fast algorithm, with the intermediate GdkPixbuf sized heuristically so that step 2 still provides a good-quality result, thus accelerating the scaling.
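
Here is a minimal sketch of what the "stupid-but-fast" first step might look like: a streaming box filter that averages factor x factor blocks as scan lines arrive, so only one row of running sums is held in memory. BoxPrescaler and its functions are hypothetical names, not GdkPixbuf API. The sketch assumes 8-bit channels and an integer shrink factor, and it simply drops any leftover columns or rows that do not fill a whole block.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    unsigned channels;   /* e.g. 3 for RGB, 4 for RGBA */
    unsigned factor;     /* integer shrink factor in both dimensions */
    unsigned out_width;  /* src_width / factor, in pixels */
    unsigned rows_seen;  /* scan lines accumulated in the current band */
    uint32_t *sums;      /* out_width * channels running sums */
} BoxPrescaler;

/* Returns 0 on success, -1 on bad arguments or allocation failure. */
int box_prescaler_init(BoxPrescaler *p, unsigned src_width,
                       unsigned channels, unsigned factor)
{
    if (factor == 0 || channels == 0 || src_width < factor)
        return -1;
    p->channels = channels;
    p->factor = factor;
    p->out_width = src_width / factor;
    p->rows_seen = 0;
    p->sums = calloc((size_t)p->out_width * channels, sizeof *p->sums);
    return p->sums ? 0 : -1;
}

/*
 * Feed one input scan line. Returns 1 and fills out_row
 * (out_width * channels bytes) once a band of `factor` lines is
 * complete; returns 0 while the band is still being accumulated.
 */
int box_prescaler_feed(BoxPrescaler *p, const uint8_t *row, uint8_t *out_row)
{
    unsigned used = p->out_width * p->factor; /* leftover columns ignored */

    for (unsigned x = 0; x < used; x++)
        for (unsigned c = 0; c < p->channels; c++)
            p->sums[(x / p->factor) * p->channels + c] +=
                row[x * p->channels + c];

    if (++p->rows_seen < p->factor)
        return 0;

    unsigned area = p->factor * p->factor;
    for (unsigned i = 0; i < p->out_width * p->channels; i++) {
        out_row[i] = (uint8_t)((p->sums[i] + area / 2) / area); /* rounded mean */
        p->sums[i] = 0;
    }
    p->rows_seen = 0;
    return 1;
}

void box_prescaler_free(BoxPrescaler *p)
{
    free(p->sums);
    p->sums = NULL;
}
```

Each completed output row would be copied into the intermediate image, which step 2 then scales to the exact target size. Choosing the factor so that the intermediate image stays somewhat larger than the target leaves the quality-sensitive work to the second pass.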

If this is of interest, I could look into taking a stab at it.

Also, a new major revision of libjpeg was just released a few weeks ago. Is there any thought of using that?

Projects/GdkPixbuf/Optimization (last edited 2013-12-04 20:14:16 by WilliamJonMcCann)