The multicore processor architecture that is taking over desktop and server systems delivers its best advantage when running threaded applications. Numerous articles on this site explain the many ways to use threading to derive the performance benefits of multiple cores. One issue that comes up, however, as you begin using threads regularly is the problem of creating and managing many threads: How to do this efficiently with a minimum of fuss and bother.
This article examines the classic solution to this problem, the thread pool, and presents new features in the Java 5.0 release that make thread pools easy to use. I presume you are conversant with Java and have some basic exposure to using Java threads. If you lack the latter, there are many tutorials on the topic. You will quickly come to see that threads in Java are easier to use than in most other languages.
Programs that are amenable to threading are generally broken down into thread-sized pieces using either functional decomposition or data decomposition. Functional decomposition assigns a thread to each distinct task. (For example, in a word processor, assigning the task of real-time spell checking to a thread would be an instance of functional decomposition.) Data decomposition takes a data-oriented task, breaks the data into smaller chunks, and gives one chunk to each of several threads to process in parallel. Processing large data arrays, for example, often uses this technique. Each thread, let's say, processes one quarter of the array, with all four threads running in parallel on different processor cores.
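As a rough illustration of data decomposition, the sketch below sums an array by handing each of four threads one contiguous quarter. The class name QuarterSum and the method parallelSum are invented for this example, and it uses plain threads rather than a pool.

```java
// Hypothetical example: QuarterSum and parallelSum are names made up for this sketch.
final class QuarterSum
{
    /* Sums data[] by splitting it into `parts` contiguous chunks, one per thread. */
    static long parallelSum( final int[] data, int parts )
    {
        final long[] partial = new long[parts];          // one result slot per thread
        Thread[] workers = new Thread[parts];
        int chunk = ( data.length + parts - 1 ) / parts; // ceiling division

        for( int t = 0; t < parts; t++ )
        {
            final int slot = t;
            final int from = Math.min( t * chunk, data.length );
            final int to = Math.min( from + chunk, data.length );
            workers[t] = new Thread( new Runnable() {
                public void run()
                {
                    long sum = 0;
                    for( int i = from; i < to; i++ )
                        sum += data[i];
                    partial[slot] = sum;                 // each thread writes only its own slot
                }
            } );
            workers[t].start();
        }

        long total = 0;
        for( int t = 0; t < parts; t++ )
        {
            try {
                workers[t].join();                       // wait for this chunk to finish
            } catch( InterruptedException e ) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException( "interrupted while waiting", e );
            }
            total += partial[t];
        }
        return total;
    }

    public static void main( String[] args )
    {
        int[] data = new int[1000];
        for( int i = 0; i < data.length; i++ )
            data[i] = i + 1;
        System.out.println( parallelSum( data, 4 ) );    // prints 500500
    }
}
```

Each thread touches only its own range and its own result slot, so no locking is needed; the join() calls are what make the partial sums safely visible to the calling thread.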
Many programs use both types of decomposition. They assign specific tasks to identified threads and then create a bunch of threads to handle data processing. Managing this bunch of threads has certain challenges: creating threads is expensive, so you want to create the minimum needed for optimal performance. Moreover, when the threads are finished with their initial assignment, you'd rather they not die off (the default behavior) but stay alive to handle upcoming work. In addition, you need a mechanism that doles out the work to the available threads.
Writing such a mechanism yourself is certainly possible, but it would require a lot of delicate code. Fortunately, Java provides us with a thread pool, which is the instantiation of just this concept: a bunch of threads, a manager that keeps them alive when they're done working, and a queue that feeds them tasks. The Java thread pool APIs were greatly improved with the Java 5.0 release, which provides some easy-to-use, high-level constructs.
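To make those three parts concrete, here is a deliberately simplified, hand-rolled pool. It is only a sketch: the names TinyPool, execute, stop, and demo are invented for this illustration, it is not how java.util.concurrent implements its pools, and it omits much of the delicate code a real pool needs (for instance, a task that throws an exception kills its worker thread).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified pool for illustration only.
final class TinyPool
{
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
    private final Thread[] workers;

    TinyPool( int size )
    {
        workers = new Thread[size];
        for( int i = 0; i < size; i++ )
        {
            workers[i] = new Thread( new Runnable() {
                public void run()
                {
                    try {
                        while( true )
                            queue.take().run();   // block until work arrives, then run it
                    } catch( InterruptedException e ) {
                        // interrupted by stop(): let this worker thread end
                    }
                }
            } );
            workers[i].start();
        }
    }

    /* Places a task in the work queue; an idle worker will pick it up. */
    void execute( Runnable task )
    {
        queue.add( task );
    }

    /* Interrupts every worker; tasks still waiting in the queue are abandoned. */
    void stop()
    {
        for( int i = 0; i < workers.length; i++ )
            workers[i].interrupt();
    }

    /* Runs six trivial tasks on two workers and returns how many completed. */
    static int demo()
    {
        final AtomicInteger done = new AtomicInteger();
        final CountDownLatch latch = new CountDownLatch( 6 );
        TinyPool pool = new TinyPool( 2 );
        for( int i = 0; i < 6; i++ )
        {
            pool.execute( new Runnable() {
                public void run()
                {
                    done.incrementAndGet();
                    latch.countDown();
                }
            } );
        }
        try {
            latch.await();                        // wait until all six tasks have run
        } catch( InterruptedException e ) {
            Thread.currentThread().interrupt();
        }
        pool.stop();
        return done.get();
    }

    public static void main( String[] args )
    {
        System.out.println( demo() );             // prints 6
    }
}
```

Even this toy version has to deal with blocking, interruption, and shutdown, which is the bookkeeping the Java 5.0 constructs take off your hands.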
Let's look at some code. We'll start with a small routine that we want each thread to execute. To keep things simple, we'll have the threads count from 7 down to 0. The class CountDown, shown next, does this and also assigns itself a unique sequential identifier each time it's instantiated. With every iteration of the loop, the identifier and the current loop count are printed to the console.
/*
 * This class counts down from 7 to 0, printing the task ID
 * and the count value with each iteration. After printing,
 * it yields, hinting to the scheduler that another thread may
 * run. The task stays on its pool thread until run() returns.
 */
public class CountDown implements Runnable
{
    protected int count = 8;

    /*
     * the following counter is incremented once
     * each time the class is instantiated, giving each
     * instance a unique number, which is printed in run().
     * (not thread-safe; here all instances are created in main())
     */
    private static int taskCount = 0;
    private final int id = taskCount++;

    public CountDown() {}

    /*
     * print the id and the iteration count to the console, then
     * yield to another thread.
     */
    public void run()
    {
        while ( count-- > 0 )
        {
            System.out.printf( "Task %d, count = %d\n", id, count );
            Thread.yield();
        }
    }
}
You will note two things about this little routine: it implements the Runnable interface, which every task handed to a thread in this way must do; and once it prints an iteration count to the screen, it calls Thread.yield(). This call hints to the thread scheduler that it can swap out the thread at this point, should it need to do so. It is only a suggestion, and the scheduler is free to ignore it, but by placing it at convenient points in your logic, you play nice with the other threads. Its inclusion here is optional; in this example it mainly encourages the tasks to interleave.
Next we have the code for the thread pool. As you can see, it is surprisingly short. This brevity is due to the use of Executors, which are the new, high-level feature of Java 5.0 threads.
/*
 * example of a fixed-size thread pool using the
 * executors introduced in Java 5
 */
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPool
{
    public static void main(String[] args)
    {
        /*
         * create a thread pool with four threads
         */
        ExecutorService execSvc = Executors.newFixedThreadPool( 4 );

        /*
         * place six tasks in the work queue for the thread pool
         */
        for( int i = 0; i < 6; i++ )
            execSvc.execute( new CountDown() );

        /*
         * prevent other tasks from being added to the queue
         */
        execSvc.shutdown();
    }
}
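That's all it takes: two statements, one to create the pool with four threads and one to assign the work. The final statement, the call to shutdown(), stops the pool from accepting new tasks while letting the tasks already queued run to completion. It should not be skipped in a program like this one: without it, the idle worker threads of a fixed-size pool stay alive and keep the JVM from exiting after main() returns. This code uses the Executors factory to create the pool, queues six tasks (each one an instance of CountDown), and then closes up shop. The output from one run of this routine shows how the pool works: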
Task 2, count = 7
Task 3, count = 7
Task 0, count = 7
Task 1, count = 7
Task 2, count = 6
Task 3, count = 6
Task 1, count = 6
Task 0, count = 6
Task 2, count = 5
Task 1, count = 5
Task 0, count = 5
Task 2, count = 4
Task 3, count = 5
Task 0, count = 4
Task 2, count = 3
Task 0, count = 3
Task 2, count = 2
Task 0, count = 2
Task 2, count = 1
Task 0, count = 1
Task 2, count = 0
Task 1, count = 4
Task 3, count = 4
Task 1, count = 3
Task 0, count = 0
Task 4, count = 7
Task 3, count = 3
Task 1, count = 2
Task 5, count = 7
Task 4, count = 6
Task 3, count = 2
Task 1, count = 1
Task 5, count = 6
Task 4, count = 5
Task 3, count = 1
Task 1, count = 0
Task 5, count = 5
Task 4, count = 4
Task 3, count = 0
Task 5, count = 4
Task 4, count = 3
Task 5, count = 3
Task 4, count = 2
Task 5, count = 2
Task 4, count = 1
Task 5, count = 1
Task 4, count = 0
Task 5, count = 0
Notice that the initial lines show only Tasks 0-3, meaning that only four threads are running in the pool—just as we'd expect. Task 4 does not begin until one of the previous tasks has reached 0. And Task 5 does not begin until a second Task has been completed. At the end, again as we'd expect, only those two Tasks are running.
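Note that shutdown() returns immediately; it does not wait for the queued tasks to finish. If the calling thread needs to know when the pool has drained, ExecutorService.awaitTermination() can be used after shutdown(). The sketch below assumes a hypothetical class ShutdownDemo, a trivial counting task in place of CountDown, and an arbitrary 10-second timeout.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: the class name, task, and timeout are assumptions.
final class ShutdownDemo
{
    /*
     * Submits `tasks` trivial jobs to a four-thread pool, waits for the
     * pool to terminate, and returns how many jobs completed.
     */
    static int runAndWait( int tasks )
    {
        final AtomicInteger finished = new AtomicInteger();
        ExecutorService execSvc = Executors.newFixedThreadPool( 4 );

        for( int i = 0; i < tasks; i++ )
        {
            execSvc.execute( new Runnable() {
                public void run()
                {
                    finished.incrementAndGet();
                }
            } );
        }

        execSvc.shutdown();          // no new tasks accepted; queued tasks still run
        try {
            // the 10-second limit is an arbitrary choice for this sketch
            if( !execSvc.awaitTermination( 10, TimeUnit.SECONDS ) )
                execSvc.shutdownNow();   // give up and interrupt whatever is still running
        } catch( InterruptedException e ) {
            execSvc.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return finished.get();
    }

    public static void main( String[] args )
    {
        System.out.println( runAndWait( 6 ) );   // prints 6
    }
}
```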
The Executors and ExecutorService are high-level constructs that shield you from lots of calls. This code would not have been nearly as short and intuitive prior to this release. The details of Executors are not terribly important. View them as high-level interfaces that mask many lower-level management calls. The Javadoc for java.util.concurrent.Executors can provide insight into some additional APIs. One of these, which I discuss next, is particularly useful.
In the previous example, we used a fixed-size thread pool so as to illustrate in the output exactly what was going on under the covers. However, in real life, there are few occasions where you would hard-code the number of threads in the pool. These situations occur when, due to known limitations, you must carefully control the use of execution resources.
In most other cases, however, you will want to call Executors.newCachedThreadPool(), which creates a thread pool of indeterminate size. The pool creates new threads as tasks arrive, reuses threads that have become idle, and discards threads that have sat idle for 60 seconds. But for this one change, the remaining code in this example would remain the same.
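One way to see the pool grow on demand is to submit tasks that all wait for each other to start. In the sketch below (the class CachedPoolDemo and the method threadsUsed are names invented for this illustration), each of six tasks blocks until all six are running, so the cached pool has to create six threads. A fixed pool of four would stall on the same workload, because the last two tasks could never start.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example: class and method names are assumptions for this sketch.
final class CachedPoolDemo
{
    /* Runs `tasks` mutually waiting jobs and returns how many distinct threads ran them. */
    static int threadsUsed( final int tasks )
    {
        final Set<String> names = Collections.synchronizedSet( new HashSet<String>() );
        final CountDownLatch allStarted = new CountDownLatch( tasks );
        ExecutorService execSvc = Executors.newCachedThreadPool();

        for( int i = 0; i < tasks; i++ )
        {
            execSvc.execute( new Runnable() {
                public void run()
                {
                    names.add( Thread.currentThread().getName() );
                    allStarted.countDown();
                    try {
                        allStarted.await();   // hold this thread until every task is running
                    } catch( InterruptedException e ) {
                        Thread.currentThread().interrupt();
                    }
                }
            } );
        }

        execSvc.shutdown();
        try {
            // arbitrary 10-second limit for this sketch
            execSvc.awaitTermination( 10, TimeUnit.SECONDS );
        } catch( InterruptedException e ) {
            Thread.currentThread().interrupt();
        }
        return names.size();
    }

    public static void main( String[] args )
    {
        System.out.println( threadsUsed( 6 ) );   // prints 6
    }
}
```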
The preference for not hard-coding the thread count is consistent with a principle that appears time and again in parallel programming: the more you trust the operating system or runtime framework to manage threads, the better off (and more portable) your code will be. Operating systems are very finely tuned when it comes to thread scheduling, so the less you limit their operation with explicit constraints, the faster your code is likely to run, and the less you'll have to keep tweaking it for different runtime environments.
Now that you see how simple thread pools are in Java, you should try them out. I suspect you'll like the results.
Note: the example presented here is modified from code presented in Thinking in Java, 4th Edition, by Bruce Eckel (ISBN 0-13-187248-6), which is one of the best introductions to the topic of thread pools. Coverage of the topic, which surprisingly is not listed in the index, begins at page 1118.
Anderson Bailey is a developer with a longstanding interest in the techniques for using code to exploit processor features. He can be reached at chip.coder@gmail.com.