
50,000-core megacluster powers cancer research

Minor detail: cost to run is close to $5,000 per hour


Cycle Computing can best be described as a utility supercomputing company. They take Amazon Web Services' Elastic Compute Cloud (EC2), a cloud-computing service that offers tens of thousands of cores for remote use, layer their own software on top of it, and present clients with a high-performance, cloud-based supercomputer that can be rented by the hour.

For those of you interested in the business model (and wondering how one company profits off another's technology): clients subscribe to Cycle's service, while the per-hour compute charge goes directly to Amazon.

Brief history

Last September, Cycle made headlines when they put together a jaw-dropping 30,000-core cluster that they rented for $1,279 per hour to an unnamed pharmaceutical customer.

They amazed us again when they held a competition that gave away $10,000 worth of cloud computing to a worthy cause (roughly eight hours on the 30,000-core cluster). They received research proposals in the fields of diabetes, Parkinson's disease, and photovoltaic cells, but ultimately decided upon the Morgridge Institute for Research's proposal for stem cell research.

Shattering their own record

On the night of March 30, 2012, at a peak cost of a whopping $4,828.85 per hour, Cycle ran up to 51,132 cores on a CentOS Linux system. The operation spanned 6,742 computers around the globe, tapping data centers on four continents and every available Amazon region, including Tokyo, Singapore, and São Paulo, as well as Virginia, Oregon, and California.

By running across multiple continents, Cycle ensured it could get all of the capacity it needed.

Cycle layered its software, called CycleServer, on top of EC2 and let a new job-submission algorithm dole out work to each region based on real-time measurements from that region. With this architecture in place, the company essentially built a secured, automated 50,000-core supercomputer... all in less than two hours.
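CycleServer's actual scheduler isn't public, but the idea is easy to sketch: each region periodically reports its available capacity, and the submission layer hands out work in proportion to those measurements. Below is a minimal Python sketch of that policy; the region names are real EC2 regions, but the capacity figures and the proportional-split logic are illustrative assumptions on our part, not Cycle's code.

    # Minimal sketch: split a job queue across regions in proportion to
    # each region's reported idle capacity (illustrative, not CycleServer).
    from typing import Dict, List

    def distribute(jobs: List[str], idle_cores: Dict[str, int]) -> Dict[str, List[str]]:
        """Assign jobs to regions in proportion to reported idle capacity."""
        total = sum(idle_cores.values())
        assignments: Dict[str, List[str]] = {r: [] for r in idle_cores}
        cursor = 0
        for region, cores in idle_cores.items():
            share = round(len(jobs) * cores / total)  # region's weighted slice
            assignments[region] = jobs[cursor:cursor + share]
            cursor += share
        # Any remainder left over from rounding goes to the largest region.
        if cursor < len(jobs):
            biggest = max(idle_cores, key=idle_cores.get)
            assignments[biggest] += jobs[cursor:]
        return assignments

    # Example: the latest measurements say us-east-1 has the most idle
    # cores, so it receives the largest slice of the work queue.
    measurements = {"us-east-1": 18000, "us-west-2": 9000,
                    "ap-northeast-1": 6000, "sa-east-1": 3000}
    batches = distribute([f"compound-{i}" for i in range(1000)], measurements)
    print({region: len(b) for region, b in batches.items()})

The appeal of a design like this is that no region needs global knowledge: the scheduler simply reweights the split as fresh measurements arrive.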

The project used 58.78 terabytes of RAM, was secured with HTTPS, SSH, and 256-bit AES encryption, and ran over a mix of 10 Gigabit and 1 Gigabit Ethernet interconnects.

Cycle’s 50,000-core cluster scaled up steadily over three hours.

About the project

The client who requested this gargantuan project was Schrödinger, a New York pharmaceutical company and a leader in computational chemistry research. Schrödinger needed to test 21 million synthetic compounds for a potential cancer drug, and turned to Cycle's megacluster for virtual screening: predicting which compounds would bind to a particular protein target implicated in multiple cancers.

To better understand the sheer magnitude of this project, consider that Amazon measures EC2 workloads in compute hours benchmarked against the CPU capacity of a 2007-era Opteron or Xeon processor. By that yardstick, the Schrödinger workload came to 109,927 compute hours, or a little under 13 years of nonstop serial execution.
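As a sanity check on that figure (our arithmetic, using the numbers above): 109,927 compute hours divided by the 8,760 hours in a year comes out to roughly 12.5 years of round-the-clock work for a single benchmark machine.

    # Back-of-the-envelope check on the "13 years" claim.
    compute_hours = 109_927
    hours_per_year = 24 * 365              # 8,760
    print(compute_hours / hours_per_year)  # ~12.55 years on one benchmark machine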

Cycle was able to complete the project in three hours.

As a result of the run, Schrödinger identified compounds it never would have found without Cycle's supercomputing service. The company is now acquiring those compounds and will test them in the lab.

Outlook

Schrödinger, like most biotech companies, has its own computing cluster for performing virtual screening in-house, but that system has only 1,500 cores; running the 21-million-compound test on it would have taken far too long to be practical.
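For a rough sense of why (our arithmetic, assuming perfect linear speedup, which real clusters never achieve): the in-house cluster has about 34 times fewer cores than the 51,132-core run, so even the ideal case stretches Cycle's three-hour job past four days, and real-world scaling losses, queueing, and the need to iterate would push it far beyond that.

    # Illustrative lower bound only, assuming perfect linear speedup.
    mega_cores, mega_hours = 51_132, 3
    in_house_cores = 1_500
    ideal_hours = mega_hours * mega_cores / in_house_cores
    print(ideal_hours)  # ~102 hours, i.e. over four days in the best case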

This is one of the major hurdles facing pharmaceutical and biotechnology research companies today: to further their research, they require the massive capabilities of a supercomputer, but what they have in-house is simply not enough. What’s more, building the kind of data center that could house an actual 50,000-core supercomputer would cost tens of millions of dollars.

What Cycle achieved here is another example of how cloud computing is not the gimmick some portray it to be. Rather, it's a legitimate technology that will soon shape how industries of every kind, from product design to manufacturing, energy development to aerospace research, continue to grow and develop. And it's forward-thinking companies like Cycle Computing that will keep breaking down barriers and help us understand all that's possible with this awe-inspiring technology.

Want to learn more about cloud computing? Check out our article that compares cloud computing and grid computing.

Via: CycleComputing.com
