Parallel Builds with Jhbuild
This is related to gnomebug:312910 and gnomebug:133567
In the context of Jhbuild, parallel build refers to building multiple modules at the same time. This is desirable, because it allows you to make optimal use of SMP systems.
While you can currently make use of multiple CPUs by passing the -j flag to make, some modules have trouble with parallel make and there are often choke points where only one CPU is getting used. If we have multiple modules available that can be built in parallel though, we can potentially make better use of the CPUs.
The proposed model to support parallel builds is via multiple threads communicating via async queues (provided by the Queue module):
                     Modules
+-----------------+  to build           +--------------------+
|                 | ==================> |                    |
|     Manager     |                     |       Worker       |
|     Thread      |                     |       Threads      |
|                 | <================== |                    |
+-----------------+  Build              +--------------------+
                     status
The manager thread does the following:
- assemble the set of modules to build
- start N worker threads
- for each module with no dependencies, add it to the "to build" queue
- read an item from the "build status" queue
- if the module built successfully, add any unbuilt modules whose dependencies are now satisfied to the "to build" queue.
- if the module failed to build, remove modules that depend on it from the unbuilt modules set
- if the unbuilt modules set is non-empty, loop to step 4
- otherwise, write N "exit" tokens to the "to build" queue, and join all the worker threads.
Each worker thread does the following:
- read a module name from the "to build" queue.
- if the module name is a special "exit" token, quit
- build the module as normal
- pass the build summary back to the manager thread via the "build status" queue.
- loop to step 1.
With a single worker thread, this should behave identically to the current single-threaded build.
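The manager and worker steps above can be sketched roughly as follows. This is only an illustration, not jhbuild code: the names `parallel_build`, `worker`, `EXIT` and `build_func` are hypothetical, the dependency graph is assumed to be acyclic with every dependency present as a key, and it uses the Python 3 `queue` module (the renamed `Queue` module). It also assumes that modules depending indirectly on a failed module should be dropped too.

```python
import queue
import threading

EXIT = object()  # hypothetical special "exit" token


def worker(to_build, build_status, build_func):
    """Worker thread: build modules until an exit token arrives."""
    while True:
        module = to_build.get()
        if module is EXIT:
            return
        try:
            ok = bool(build_func(module))
        except Exception:
            ok = False
        build_status.put((module, ok))


def parallel_build(deps, build_func, n_workers=2):
    """deps maps each module name to the set of modules it depends on.

    Returns (built, failed, skipped) as sets of module names.
    """
    to_build = queue.Queue()
    build_status = queue.Queue()
    unbuilt = set(deps)          # modules not yet finished
    queued = set()               # modules already handed to a worker
    built, failed, skipped = set(), set(), set()

    def queue_ready():
        # Queue every unbuilt module whose dependencies are all built.
        for module in sorted(unbuilt - queued):
            if deps[module] <= built:
                queued.add(module)
                to_build.put(module)

    workers = [
        threading.Thread(target=worker,
                         args=(to_build, build_status, build_func))
        for _ in range(n_workers)
    ]
    for thread in workers:
        thread.start()

    queue_ready()
    while unbuilt:
        module, ok = build_status.get()
        unbuilt.discard(module)
        if ok:
            built.add(module)
            queue_ready()
            continue
        failed.add(module)
        # Drop everything that depends, directly or indirectly,
        # on the failed module.
        changed = True
        while changed:
            changed = False
            for other in list(unbuilt):
                if deps[other] & (failed | skipped):
                    unbuilt.discard(other)
                    skipped.add(other)
                    changed = True

    for _ in workers:
        to_build.put(EXIT)
    for thread in workers:
        thread.join()
    return built, failed, skipped
```

Calling `parallel_build(deps, build_func, n_workers=1)` processes one module at a time, which is the single-worker case described above.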
In the normal jhbuild build mode, modules are currently built one after another, and the user is prompted for input when an error occurs. If multiple modules are being built at once, things are a bit more complex.
The easiest answer here would be to intermix messages from each build, as make -j does.
To handle user prompts, we'll need some way to control the printing of messages. This could be done by piping all command output through jhbuild (similar to how the jhbuild tinderbox mode works) and requiring that a mutex be held while printing to the screen. Then it is a simple matter of holding the mutex while prompting the user to fix the problem.
To get separated output for module builds, people will be required to use the tinderbox mode.
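A minimal sketch of the mutex idea, assuming all output passes through two helper functions. The names `console_lock`, `emit` and `prompt_user` are hypothetical and do not exist in jhbuild; the `[module]` prefix is likewise just an illustration of how intermixed lines could be told apart.

```python
import sys
import threading

console_lock = threading.Lock()  # hypothetical: guards all screen output


def emit(module, line, out=None):
    """Print one line of build output, tagged with the module name."""
    out = out or sys.stdout
    with console_lock:
        out.write("[%s] %s\n" % (module, line))
        out.flush()


def prompt_user(module, message, choices, read=input, out=None):
    """Ask the user what to do about a failed module.

    The lock is held for the whole prompt, so output from other
    builds waits until the user has answered.
    """
    out = out or sys.stdout
    with console_lock:
        out.write("[%s] %s\n" % (module, message))
        for key, text in choices:
            out.write("  [%s] %s\n" % (key, text))
        out.flush()
        return read("choice: ").strip()
```

Note that `prompt_user` writes directly rather than calling `emit`, because `threading.Lock` is not reentrant and a nested acquire would deadlock.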
Tinderbox Index Page
Currently the index page generated by the Tinderbox mode is written as the build progresses. If multiple modules are being built at once, then this will need to change.
The simplest solution would be to make the manager thread write this page, and update it as it pops build summaries off the "build status" queue.
Even on a single-processor system, the "update" stage of a build can be a bottleneck on slow network connections. For the most part, the "update" stage does not require many resources, so it could be run in parallel with the builds. The above system could be fairly easily modified to handle this case:
- manager thread starts with an empty unbuilt packages set.
- worker threads do not run the "update" stage.
- a new type of thread is created to handle the update stage:
  - for each module to be built (in topologically sorted order), run the "update" stage
  - push an "updated" or "error" summary to the "build status" queue
- when the manager thread receives an "updated" summary, it will add the package to the unbuilt modules set. Any relevant modules are pushed to the "to build" queue.
- the manager thread must not exit while there are still modules waiting to be updated or built, even if the unbuilt modules set is currently empty.
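The update thread and the manager's changed exit condition might look something like this. It is a sketch only: `updater`, `collect_updates` and `update_func` are hypothetical names, the module list is assumed to be already topologically sorted, and the build side (worker threads and the "to build" queue) is left out to keep the example short.

```python
import queue
import threading


def updater(modules, update_func, build_status):
    """Run the "update" stage for each module, in the order given."""
    for module in modules:
        try:
            ok = bool(update_func(module))
        except Exception:
            ok = False
        build_status.put(("updated" if ok else "error", module))


def collect_updates(modules, update_func):
    """Manager-side bookkeeping for update summaries (builds omitted)."""
    build_status = queue.Queue()
    thread = threading.Thread(target=updater,
                              args=(modules, update_func, build_status))
    thread.start()

    unbuilt = set()                 # starts empty, filled as updates finish
    errors = set()
    pending_updates = len(modules)
    # Keep going while updates are outstanding, even though the
    # unbuilt set may be empty at this point.
    while pending_updates:
        kind, module = build_status.get()
        pending_updates -= 1
        if kind == "updated":
            unbuilt.add(module)     # a full manager would queue ready modules here
        else:
            errors.add(module)

    thread.join()
    return unbuilt, errors
```

In a complete manager, the loop condition would combine both checks, for example continuing while either `pending_updates` is non-zero or the unbuilt set is non-empty.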
Current Working Directory
Changing the working directory could cause problems for other threads. All uses of os.chdir() should be checked. The subprocess module allows you to set a working directory for the child process, which should be sufficient for all current uses.
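As a small illustration of the `subprocess` approach, the helper below runs a command in a given directory without calling `os.chdir()` in the parent. The name `run_in_dir` is hypothetical; the `cwd` argument is part of the standard `subprocess` API.

```python
import subprocess


def run_in_dir(argv, directory):
    """Run argv with `directory` as the child's working directory.

    Only the child process changes directory; the parent's working
    directory, which is shared by all threads, is left untouched.
    """
    result = subprocess.run(argv, cwd=directory,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode()
```

For example, `run_in_dir(["make"], "/path/to/checkout")` would run make inside the checkout (the path here is a placeholder) while other threads continue to see the original working directory.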
Builds on Clusters
DavydMadeley is looking at doing a wrapper for jhbuild to handle builds on clusters of machines that share a file system. This work is available in the jhfarmer module in CVS. Would the setup described here help in the implementation of such a system?