distributed medical research

The age of the supercomputer has fast given way to the age of the cluster – a collection of machines, individually low-powered, that when linked together can rival, or even exceed, the power of a supercomputer at a fraction of the cost. At the heart of this move towards cluster computing is the idea of “distribution”, that is, working out an approach to a problem that lets a swarm of independent agents each attack some small subproblem in isolation.
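The split/solve/combine pattern described above can be sketched in a few lines. The subproblem here (summing squares over chunks of a list) is purely illustrative, and a real cluster would ship chunks to separate machines rather than to local processes, but the shape is the same: break the problem up, let independent workers attack the pieces, combine the partial results.

```python
from concurrent.futures import ProcessPoolExecutor

def solve_subproblem(chunk):
    """Each independent worker attacks its own small piece in isolation."""
    return sum(x * x for x in chunk)

def distribute(data, n_workers=4):
    # Break the problem up into independent subproblems...
    size = max(1, len(data) // n_workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # ...solve them in parallel, then combine the partial results.
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(solve_subproblem, chunks))

if __name__ == "__main__":
    # Gives the same answer as the serial sum(x * x for x in range(1000)).
    print(distribute(list(range(1000))))
```

The hard part, as the post notes, is rarely the parallel machinery; it is finding a decomposition in which the subproblems really are independent.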

The idea of applying these techniques to medical research is not new. The chief bottlenecks are identifying problems that will benefit from a distributed attack and working out how to break them into subproblems; after that, of course, comes the problem of gathering up and linking the required number of processors. Here are three highlights from a vast and active field:

The Folding@home project is based on the premise that personal computers are, for the most part, lying idle, or at the very least vastly underutilised (things like web surfing and writing documents usually take up a small fraction of your computer’s processing power). The project, run out of Stanford University, distributes a program that you can download and install, which will then detect when your computer is idle, fetch a protein folding problem from the Folding@home site, solve it and send back the result. The central site then puts together millions of individual results to attack hard problems like the causes of Alzheimer’s disease and cancer. Folding@home thereby leverages the connectedness of the internet to vastly increase the computational power available to crunch through these problems.
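The client loop described above (detect idle, fetch a work unit, solve it, report back) can be sketched as follows. This is not Folding@home's actual client; every function name and the toy work-unit format here are invented for illustration, with plain Python lists standing in for the central server.

```python
import time

def is_idle():
    """Stand-in for detecting that the machine is otherwise unused."""
    return True

def fetch_work_unit(server_queue):
    """Stand-in for downloading a work unit from the central server."""
    return server_queue.pop() if server_queue else None

def solve(work_unit):
    """Stand-in for the actual folding simulation: here, a toy sum."""
    return sum(work_unit)

def report_result(server_results, work_unit, answer):
    """Stand-in for uploading the result back to the central site."""
    server_results.append((work_unit, answer))

def client_loop(server_queue, server_results):
    while server_queue:
        if not is_idle():
            time.sleep(60)  # back off while the owner is using the machine
            continue
        unit = fetch_work_unit(server_queue)
        if unit is None:
            break
        report_result(server_results, unit, solve(unit))

# Toy "server" state: three work units, and the results the central
# site will later combine.
queue, results = [[1, 2, 3], [4, 5], [6]], []
client_loop(queue, results)
print(results)
```

In the real system the fetch and report steps are network calls, and the central site's job of combining millions of such results is where the science happens.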

At the other end of the spectrum, a single company, Google, has put some of its vast computational resources to work in a partnership with startup Adimab to discover new antibody drugs. Google has, perhaps more than any other company, mastered the art of building, maintaining and distributing a problem across a vast cluster of machines, in its efforts to index more and more of the web. Adimab, in its turn, is researching more efficient ways of developing precisely targeted antibodies. This is a venture in its early stages, but the synergy is exciting, and both partners have an excellent track record.
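Google's publicly described recipe for distributing a job like web indexing across a cluster is the MapReduce pattern: map each input to key/value pairs, group the pairs by key, then reduce each group to a combined result. Here is a toy, single-process word-count sketch of that pattern; the function names are illustrative, not Google's actual code, and the real system runs each phase on many machines.

```python
from collections import defaultdict

def map_phase(doc):
    # Map: emit a (word, 1) pair for every word in one document.
    return [(word.lower(), 1) for word in doc.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key, so each reducer
    # sees every count for a single word.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: combine each word's counts into a total.
    return {word: sum(counts) for word, counts in grouped.items()}

docs = ["protein folding", "folding at home", "protein structure"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts["folding"])  # 2
```

The appeal of the pattern is that the map and reduce steps are embarrassingly parallel, so the same three-phase shape scales from one process to thousands of machines.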

And finally, Harvard is harnessing a global, distributed network of both computing power and brainpower in a crowdsourced attack on type 1 diabetes. This is in some ways the most exciting development of the lot: a distributed model in which the hard problems of how to break a problem up, attack the individual subproblems, and combine the results meaningfully have not been solved up front, but have themselves been handed over to the distributed network to solve. Along the way, social issues of credit for the research, publication priority, and the ability of individual researchers to get funding to solve a distributed problem will have to be tackled and overcome. As the linked article notes, the expected benefits of the project are not just a better handle on type 1 diabetes, but “shared knowledge of how to share knowledge”:

One reason the NIH is funding this crowdsourcing experiment is the hope that its findings will stretch subsequent research dollars further. “Focusing across teams on solving very specific, well-defined problems could crack the code on collaboration and improve the yield of all of this medical research,” said Spradlin.


One Response to “distributed medical research”

  1. vk Says:

    Hi Martin,

    This is a pretty fascinating blog that you have. I am less optimistic than you are regarding the utility of brute-force computational methods of the kind you mention. For example, the folding@home project has been less successful than initially expected, in that it has provided a lot of data from computational studies of folding, but has not been terribly successful at explaining the fundamental mechanisms that drive folding, at least no more so than other, less computational approaches.

    The reason for this, which may apply equally to other such computational projects, is essentially that these problems are computationally hard when regarded as targets of brute-force, large-scale simulation. This, of course, does not mean that these projects are not worth doing, but one should not be surprised if the results of these endeavours are less ambitious than claimed (e.g. note that in the case of antibody design, most pharma companies use similar exhaustive-search approaches). Maybe Adimab/Google have a better approach.

