- The problem
- The plan
- Patch Nautilus to log a message when the desktop is loaded and painted
- Patch gnome-panel to log a message when all the applets are loaded
- Patch gnome-session to log a message when all the clients are launched
- Implement negative key cache GConf
- Using DTrace, find chatty programs during login
- For popular distros, find a way to run strace on the whole session
- Document a way to ensure that popular distros have the right preload/readahead
- Document a way to ensure that popular distros have an updated GTK+ icon cache
- Related work
GOAL: login should take 3 seconds.
We need a one-step way to get login timings. Currently one must do many things by hand to even get an strace log or a bootchart; we need to automate this so that we can run these tests frequently. Otherwise, any performance fixes we put in will be worthless when someone adds more slow code. We need to be able to see regressions.
The timeline plots require programs to be modified to provide the necessary information ("instrumented" for buzzword-compliance). This post describes how to modify programs for use with the plot-timeline.py script.
It is hard to know when the login sequence is finished. Both Nautilus and the panel have many idle handlers, timers, threads, and other asynchronous processes that make it hard to know when an operation is finished. For example, when Nautilus loads the desktop, it asks gnome-vfs for an async directory load in a thread, which later causes icons to be created, which get loaded and painted in an idle handler, etc. We need to add code to emit a notification when the program is done loading. This can be hard to do, but it is fundamental to getting timings automatically.
Patch Nautilus to log a message when the desktop is loaded and painted
Patch gnome-panel to log a message when all the applets are loaded
Patch gnome-session to log a message when all the clients are launched
I.e. when the session's state machine goes to the IDLE state. Grep around to see where that happens.
Seems the idle state we are interested in is set in manager.c:1372, update_save_state(), just after launching all the clients in the start_list.
Implement negative key cache GConf
Everyone uses gconf_client_add_dir ("/apps/my-app", PRELOAD_RECURSIVE). However, this doesn't cache the non-existence of a key and will query the daemon every time an unset key is read. This means that we have a lot of roundtrips to the GConf daemon. For all the apps run at login, these roundtrips end up taking about 1 second (where this happens, plot of roundtrips).
The gconf client library needs to keep track if it has a complete cache (ie, nothing has been cleared or removed from the cache) and avoid roundtrips to the server for keys unset keys that fall under a previous gconf_client_add_dir.
Using DTrace, find chatty programs during login
We can optimize away things like read()s from files and stat()s by careful preloading. What we cannot optimize that way is when programs go to sleep because they do IPC with each other.
Examples: programs request hundreds of keys from GConf, and each one takes a round-trip to the GConf daemon (during login we spend about 1 whole second doing this!). Programs do D-Bus IPC during login. Programs wait for gnome-session to respond (and vice-versa).
We need to know:
- When a program starts blocking on a socket or pipe.
- When the program comes back to life after that operation.
- When a program goes to sleep by doing select()/poll().
- When a program wakes up from a select()/poll().
For each of these events, we need a timestamp and a stack trace. Later we can do the work of correlating who was waiting for what, and we can see if we can eliminate round-trips.
Rationale: Ideally, during the login sequence each process should just churn away within its CPU timeslice, and be done as fast as possible. We want to prevent context-switches; to do this we need to know if things do unnecessary IPC (we can mostly optimize away disk I/O with preloading).
For popular distros, find a way to run strace on the whole session
This used to work for openSUSE (or was it SLED 10?):
start an X server in :1 ("X -ac :1") DISPLAY=:1 strace -ttt -f -o /tmp/login.strace /usr/bin/dbus-launch --sh-syntax --exit-with-session /usr/X11R6/bin/gnome
For Debian and Ubuntu, instead of using /usr/X11R6/bin/gnome just use /usr/X11R6/bin/gnome-session.
Once this is done, you can run the plot-timeline.py script on the resulting strace log.
Document a way to ensure that popular distros have the right preload/readahead
openSUSE 10.2 has a bug where the /etc/rc.d/earlygdm script is not run by default. This does preloading of a bunch of libraries and binaries. To enable it,
cd /etc/rc.d/rc5.d ln -s ../earlygdm S05earlygdm
Document a way to ensure that popular distros have an updated GTK+ icon cache
In openSUSE, it is enough to run "SuSEconfig" as root.
Ubuntu boot time tests results using bootchart: http://wiki.ubuntu.com/BootCharting
Fedora 11 improve boot time effort: https://fedoraproject.org/wiki/Features/20SecondStartup