Multicore computing: now and future

Moore’s Law has been the golden rule in predicting the increasing of personal computing power for more than a decade, but change has arrived. A couple of years ago, Amdahl’s Law became the governing law (without an inauguration). Multicore computing is now the critical driven force of computer performance. As of October 2008, the 7400-series “Dunnington” of Intel’s Xeon offers hexa-core capacity.

The trend is, knowing it or not, computers will have more cores and your personal computer will rival the computing power of a supercomputer defined by the government only a decade ago. There are, of course, some debates going on among CPU designers whether to make a personal computer something like a shiny iPhone box, or to make it a little machine that can easily crack the current military code (http://www.eetimes.com/showArticle.jhtml?articleID=206105179). This is a critical decision that will affect the architecture of future computers: will the future generation of CPUs be a bundle of different special-purpose cores, or will they will be made of homogeneous, generic cores that can be assigned any task? (Or, maybe they should be the combination of the special-purpose cores such as GPUs and generic cores, as that seems to be the way our brains work.)

As a software developer, I clearly want to have more generic cores, as they are apparently my power base. One could suggest that a developer can try to access the power of things like a GPU (as folding@home seems to be doing well with), but the real questions are: (1) Do we really want to learn those lower-level libraries for each type of the special-purpose cores, in order to use them? (2) Do we really want our applications to be bound to special-purpose cores, which raises the cross-platform issues?

On the one hand, if a CPU comes with a lot of power that cannot be easily harnessed by an average programmer, then it will become only a few elite developers’ privilege. On the other hand, if average programmers like me cannot come up with a convincing argument that we can develop killer applications if we are given more generic power, then the industry has the right reason to doubt that generic power will be useful to the vast majority of people out there.

So, can we come up with some cool ideas how multicore computing may be good to average people like Joe and Jane (not just offering dream machines to evil hackers — for them to break into our bank accounts)?

I feel it is not easy to present a clear example of how I would use a 128-core CPU predicted to be available on a single notebook machine in less than 10 years (note that each core will surely run faster than the ones currently in your dual-core CPU — just imagine the power we will have at our fingers). It is hard for me to imagine an application that will invoke 128 processes simultaneously at any time. But I recognize that I am probably in a mind block. The fact that I cannot see the big picture now simply does not mean it does not exist.

My background as a computational physicist gave me some hints of how things might develop. Parallel computing is essential in solving many scientific problems that involve huge calculations. Computational scientists are used to think in the language of parallelism. So a 128-core computer is nothing new for them. It is just a shared-memory supercomputer condensed to a laptop box.

Molecular dynamics is a “lab rat” for parallel computing research, because it is relatively simple to implement and study. Given the fact that the Molecular Workbench does molecular dynamics on a personal computer, it may be a wonderful candidate for us to make a highly relevant case.

The Molecular Workbench currently benefits from multicore computing in two ways. First, there exists embarrassingly parallel problems that automatically utilize this power. For example, one can do multiple simulations at the same time. If there are enough cores available, each will run on a core independently. This needs no extra work from the programmer, because a simulation runs on a thread that is assigned by the JVM and OS to a core. It is interesting to note that the model containers in the Molecular Workbench could provide a way to decompose a larger system, if the simulations are synchronized by communication among them through scripts.

Second, the graphics part of a simulation is handled in a different thread than the calculation thread. Therefore, a single simulation can have its molecular dynamics calculations running on one core, and the graphics running asynchronously on another. This is most helpful when the refreshing rate needs to be high to render the motion smoothly.

The Java molecular dynamics code of the Molecular Workbench, however, has not been parallelized. I have been playing with java.util.concurrent to parallelize it, but at this point, it seems the gain won’t be measurable (if positive at all!) if we only have two cores, as is the case of most personal computers as of today. The overhead cost of task coordination may be higher than what it worths.

But suppose I had a 128-core CPU to back my pool of simulation threads, the story I am writing could be quite different.

Besides scientific simulations, 3D navigation environments such as SecondLife would also benefit enormously from multicore computing. The process of landscape download and construction can be easily decomposed into chunks and assigned to different cores.