Wool - fine grained independent task parallelism in C
Short version: Wool is an extremely fast (low overhead) implementation of
independent task parallelism. It spawns and joins with a new task in
under 20 cycles on our development machine, a quite ordinary dual quadcore
AMD Opteron server. This is more than an order of magnitude faster than
comparable systems like TBB, Cilk++ and OpenMP, and gives Wool a definite
performance advantage over these systems.
Wool works with plain ordinary C and is released under GPL.
Long version: Wool is a C-library supporting fine
grained independent task parallelism. Ok, we'll explain what we mean.
- Parallelism:
Modern microprocessors typically have more than one
processor core on the chip, making them multiprocessors. In order
to realize more than a fraction of their performance potential,
one must write parallel programs.
- Task:
A task is a unit of parallel execution. Tasks can create other tasks,
so that the tasks of a program form a tree. Each task has its own control
flow, in contrast to for instance data parallelism where the same operation
is always applied to each data item in a collection.
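The task tree can be pictured with a small sequential model. The sketch below is plain C, not Wool code: `fib_task` and `task_stats` are hypothetical names invented for illustration, and the comments only mark where a task library would spawn and join.

```c
#include <assert.h>

/* Hypothetical bookkeeping: counts how many "tasks" the computation creates. */
typedef struct {
    long tasks_created;
} task_stats;

/* Each call models one task. A task with n >= 2 creates two child tasks,
   so the calls form a binary tree rooted at the first call. */
static long fib_task(int n, task_stats *stats)
{
    assert(n >= 0);
    stats->tasks_created++;      /* this call is one node of the task tree */
    if (n < 2)
        return n;                /* leaf task: creates no children */

    /* A task-parallel library would spawn the first child here ... */
    long a = fib_task(n - 1, stats);
    /* ... the second child could run on the parent's own thread ... */
    long b = fib_task(n - 2, stats);
    /* ... and the parent would join with the spawned child before adding. */
    return a + b;
}
```

Calling `fib_task(10, &stats)` with a zeroed `stats` walks the whole tree once; the counter then holds the number of nodes, which is the number of tasks a parallel version would create.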
- Independent:
Two tasks are independent if neither of them writes a location that
the other task reads or writes. Two independent tasks can be executed in
arbitrary order or in parallel with no difference in observable behaviour
(for instance memory content when both tasks have completed). In independent
task parallelism, only independent tasks may be executed in parallel,
obviating the need for any synchronization between concurrent activities.
This is in contrast to the situation in more general multithreading
which allows dependencies between threads that need to be managed using
explicit synchronization.
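As a minimal illustration of independence, the sketch below runs two "tasks" over disjoint halves of an array in either order. It is plain sequential C rather than Wool code, and `square_range` and `run_both` are hypothetical names; the point is only that neither task touches a location the other reads or writes, so the final memory content does not depend on the order.

```c
#include <assert.h>
#include <stddef.h>

/* One "task": squares the elements in [lo, hi) in place.
   It reads and writes only that slice of the array. */
static void square_range(int *data, size_t lo, size_t hi)
{
    assert(lo <= hi);
    for (size_t i = lo; i < hi; i++)
        data[i] *= data[i];
}

/* Runs the two tasks over disjoint halves of data, in either order. */
static void run_both(int *data, size_t n, int second_half_first)
{
    size_t mid = n / 2;
    if (second_half_first) {
        square_range(data, mid, n);
        square_range(data, 0, mid);
    } else {
        square_range(data, 0, mid);
        square_range(data, mid, n);
    }
}
```

Because the two slices never overlap, both orders leave the array in the same state, and the two calls could equally run at the same time without any locking.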
- Fine grained:
In order for independent task parallelism to be expressive
enough to be useful, task creation must be cheap, costing roughly as
much as synchronization does in more general threading models.
- Supporting: Currently, Wool provides primitives for defining,
spawning and joining with tasks as well as for defining and invoking
parallel for-loops. Plans for the future include reducers (inspired
by hyperobjects in Cilk++) and a scalable memory allocation library
similar to that provided by TBB.
- Library: Wool is implemented as a set of C functions to link with the
program and a header file defining a number of (rather complex) macros.
Thus Wool is not as convenient as, for example, Cilk(++), which is a
programming language in its own right, implemented using a compiler,
which enables much saner error messages. Then again, Wool is a research vehicle,
not a product, even if some documentation actually exists. On the other hand,
Wool will immediately benefit from any improvements in the compiler used to
compile the program and library.
- C: Wool is written in and for C. It is probably usable with C++, but I
have not tested that combination. Most other current offerings are based
on C++ (Intel TBB) or C# (Microsoft TPL). These languages provide features
that make it easy to write powerful libraries; for Wool, we use the
C preprocessor and its macro facilities to achieve the same effect.
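To give a feel for the general technique, here is a minimal sketch of a macro that expands one task-like declaration into several pieces of boilerplate. It is not taken from Wool: `SKETCH_TASK_1`, `run_task` and the generated `_desc`, `_run` and `_body` names are all hypothetical, and Wool's real macros are considerably more involved.

```c
#include <assert.h>

/* Hypothetical macro, not part of Wool. From one declaration it generates:
   (1) a descriptor struct holding the argument and the result,
   (2) a forward declaration of the task body,
   (3) a uniform wrapper that can be called through a void pointer, and
   (4) the header of the body, which the user completes with a block. */
#define SKETCH_TASK_1(RTYPE, NAME, ATYPE, ARG)                         \
    typedef struct { ATYPE ARG; RTYPE result; } NAME##_desc;           \
    static RTYPE NAME##_body(ATYPE ARG);                               \
    static void NAME##_run(void *p)                                    \
    {                                                                  \
        NAME##_desc *d = (NAME##_desc *)p;                             \
        d->result = NAME##_body(d->ARG);                               \
    }                                                                  \
    static RTYPE NAME##_body(ATYPE ARG)

/* The user writes what looks like an ordinary function definition. */
SKETCH_TASK_1(long, square, long, x)
{
    return x * x;
}

/* Stand-in for a scheduler: it knows only the uniform wrapper signature. */
static void run_task(void (*run)(void *), void *desc)
{
    assert(run != 0 && desc != 0);
    run(desc);
}
```

With this in place, filling in a `square_desc` and passing it with `square_run` to `run_task` stores the result in the descriptor. A C++ library would typically get the same effect from templates; in C the preprocessor has to do the work, which is also why errors inside such macros tend to produce hard-to-read compiler messages.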
For more information, check the Wool User's Guide.
Download
Please drop me a line if you download Wool, and if you wish, I'll put
you on a mailing list of Wool users. It can be a good way to get help!
The current version is
0.1.5alpha. This is a snapshot of my current development version. Worth noting:
- The User Guide is updated with new information. Read it!
- There are even more options than before. When invoking
a Wool program, give it "--" before any options and arguments that
your code needs to see.
- Fewer options that you actually *need* to tweak, in particular task pool size and stealable tasks.
- Tasks can be invoked from ordinary C functions; main() does not need to be a task.
- Now supports TilePRO64 (and possibly other Tilera chips).
- More example programs added, as well as a few papers and a bibliography.
- Do not read the source. You'll get lost in a twisted maze of conditional
compilation directives and never get out! ;-)
- Wool is now faster for programs that have very unbalanced task trees, thanks to transitive leap frogging.
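The "--" convention mentioned in the list above can be sketched from the application's side. The helper below is hypothetical and is not Wool's actual option handling; it assumes that everything before "--" is meant for the runtime, and it treats a command line without "--" as carrying no application arguments, which is an assumption of this sketch only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of Wool: returns the index of the first
   argument after "--", or argc if there is no "--" on the command line. */
static int first_app_arg(int argc, char **argv)
{
    assert(argc >= 0);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--") == 0)
            return i + 1;        /* application arguments start here */
    }
    return argc;                 /* no separator found */
}
```

For a command line such as `prog <runtime options> -- input.txt 100`, the returned index points at `input.txt`, so the application code can ignore whatever came before the separator.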
The previous version is
0.1.2alpha. Worth noting:
- There are a lot more options enabled than are documented. When invoking
a Wool program, give it "--" before any options and arguments that
your code needs to see.
- Now supports IA64 under Linux (tested on SGI Altix).
- Not tested under SPARC Solaris.
- Supports event logging; compile with "-DLOG_EVENTS=1 -lrt" and you'll
get a lot of stuff to stderr. Program runs a bit slower, of course.
- Do not read the source. You'll get lost in a twisted maze of conditional
compilation directives and never get out! ;-)
- And of course, Wool is now a lot faster :-)
Alternate versions
Here is a version where main is not a task, which is better suited to
parallelizing complex existing code. Use ROOT_FOR and ROOT_CALL when
invoking Wool code from sequential code.
Version 0.2 (yes, I know I should merge this functionality with the development version!).
Older versions
- 0.1.1 (friendlier to MacOS than 0.1).
- 0.1
Material for CSL lab meeting
A sequential quicksort program and rmw.
Quick instructions for unpacking and building Wool:
tar xzf wool-0.1.1.tgz
make
Contact
If you have any questions, please contact Karl-Filip Faxén (kff in the
domain sics.se). If you download Wool, do drop me a line as well!