The age of the supercomputer has fast given way to the age of the cluster – a collection of machines, individually low-powered, that when linked together can rival, or even exceed, the power of a supercomputer at a fraction of the cost. At the heart of this move towards cluster computing is the idea of “distribution”, that is, working out an approach to a problem that lets a swarm of independent agents each attack some small subproblem in isolation.
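To make the idea of "distribution" concrete, here is a minimal sketch in Python of the pattern just described: split a large job into independent subproblems, let a pool of workers attack each one in isolation, and combine the partial results at the end. Everything here (the sum-of-squares "problem", the chunk size, the function names) is illustrative, not any particular project's code.

```python
# Sketch of the distribution pattern: split -> solve in parallel -> combine.
from multiprocessing import Pool

def solve_subproblem(chunk):
    # Stand-in for real work: each worker processes its slice independently.
    return sum(x * x for x in chunk)

def split(data, n_chunks):
    # Break the input into roughly equal, independent pieces.
    size = (len(data) + n_chunks - 1) // n_chunks
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    data = list(range(10_000))
    with Pool(4) as pool:                          # a tiny "cluster" of 4 workers
        partials = pool.map(solve_subproblem, split(data, 4))
    total = sum(partials)                          # the combine step
    print(total)
```

The essential property is that no subproblem needs to talk to any other while it is being solved; only the split and the combine steps are centralised, which is exactly what lets the workers be scattered across cheap, loosely connected machines.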
The idea of applying these techniques to medical research is not new. The chief bottlenecks are identifying problems that will benefit from a distributed attack and working out how to break them into subproblems; after that, of course, comes the problem of gathering and linking the required number of processors. Here are three highlights from a vast and active field:
The folding@home project is based on the premise that personal computers are, for the most part, lying idle, or at least vastly underutilised (things like web surfing and writing documents usually take up a small fraction of your computer’s processing power). The project, run out of Stanford University, distributes a program that you can download and install; it then detects when your computer is idle, fetches a protein-folding problem from the folding@home site, solves it and sends back the result. The central site then puts together millions of individual results to attack hard problems like the causes of Alzheimer’s Disease and cancer. Folding@home thereby leverages the connectedness of the internet to vastly increase the computational power available to crunch through these problems.
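The client-side loop described above can be sketched in a few lines. This is a hedged illustration of the general volunteer-computing pattern, not folding@home's actual protocol: the URLs, the idleness check and the `solve` function are all hypothetical stand-ins.

```python
# Sketch of a volunteer-computing client: check idle -> fetch -> solve -> report.
import json
from urllib.request import urlopen, Request

WORK_URL = "https://example.org/api/work"      # hypothetical work-unit endpoint
RESULT_URL = "https://example.org/api/result"  # hypothetical result endpoint

def machine_is_idle():
    # Stand-in for a real idleness check (CPU load, user activity, etc.).
    return True

def solve(work_unit):
    # Stand-in for the actual folding simulation on one work unit.
    return {"id": work_unit["id"], "score": 0.0}

def run_once():
    if not machine_is_idle():
        return                                     # only run on spare cycles
    work_unit = json.load(urlopen(WORK_URL))       # fetch one subproblem
    result = solve(work_unit)                      # crunch it locally
    body = json.dumps(result).encode()
    urlopen(Request(RESULT_URL, data=body,         # POST the answer back
                    headers={"Content-Type": "application/json"}))
```

Note that the client never coordinates with other volunteers; all the intelligence about which work units to hand out, and how to assemble millions of returned results, lives on the central server.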
At the other end of the spectrum, a single company, Google, has put some of its vast computational resources to work in a partnership with the startup Adimab to discover new antibody drugs. Google has, perhaps more than any other company, mastered the art of building and maintaining vast clusters of machines and distributing problems across them, in its efforts to index more and more of the web. Adimab, in its turn, is researching more efficient ways of developing precisely targeted antibodies. The venture is in its early stages, but the synergy is exciting, and both partners have an excellent track record.
And finally, Harvard is harnessing a global, distributed network of both computational power and brainpower in a crowdsourced attack on type 1 diabetes. This is in some ways the most exciting development of the lot: a distributed model in which the hard problems of how to break a problem up, attack the individual subproblems, and combine the results meaningfully have not been solved up front, but have themselves been handed over to the distributed network to solve. Along the way, social issues of credit for the research, publication priority, and the ability of individual researchers to get funding to solve a distributed problem will have to be tackled and overcome. As the linked article notes, the expected benefits of the project are not just a better handle on type 1 diabetes, but “shared knowledge of how to share knowledge”:
One reason the NIH is funding this crowdsourcing experiment is the hope that its findings will stretch subsequent research dollars further. “Focusing across teams on solving very specific, well-defined problems could crack the code on collaboration and improve the yield of all of this medical research,” said Spradlin.