The Business Takeaways From HPC
Four IBM Champions share lessons for enterprises from research computing environments
By Rebecca Lubecki
11/01/2019
From biomolecular simulation projects and computational accelerators for high-performance computing (HPC) to genome research, it’s evident that Linux distributions on POWER9 have a variety of capabilities outside the business world. As organizations examine these capabilities to stay innovative, it’s worth taking a deeper dive into other organizations using the technology. What challenges are they facing? What are these workers doing to solve those issues? And most importantly, what can our company learn from them? To answer those questions, IBM Systems magazine, Power Systems, spoke to IBM Champions Volodymyr Kindratenko, John Stone, Christopher Sullivan and Simon Thompson about the challenges their respective labs face. While the issues research environments encounter every day might be lab-specific, you’ll find that, more often than not, their solutions apply far more broadly.
Christopher Sullivan: Assistant Director for Biocomputing, Center for Genome Research and Biocomputing, Oregon State University, Corvallis, Oregon. Photography by Evan Kaufman
Fueling Discovery With Data
Christopher Sullivan, who serves as the assistant director for biocomputing in the Center for Genome Research and Biocomputing at Oregon State University, has been working in the computational science field for 19 years. Today, he assists more than 25 departments researching topics that range from identifying plankton in the oceans to assembling genomes. Working in biocomputing and life sciences, Sullivan has encountered his share of the challenges that research environments face.
“The data becomes the main resource and creates all limits to the majority of our research work,” says Sullivan. Some groups in his lab collect video to identify plankton and can generate over 100 TB of data in just one week, while others work with over 250 TB of data per season. With so much data, his lab often needs to create new pathways of analysis. As the data grows, they need machines that speed up I/O and that work with other parts of the computing stack able to exploit that speed, such as GPUs. Sullivan says his team abandoned daily backups and realized it’s cheaper to re-sample the data and collect it again. It also costs the same as backup hardware.
However, Sullivan says this process hinders their rate of discovery. To ensure data collection remains unbiased, more data is always needed, and storage limitations challenge those needs. Sullivan says groups working with his team often clear space by deleting data from past experiments to make room for new data. If they don’t process the data quickly enough, they fall behind in their work and risk increasing costs for hardware and labor.
Because the majority of the large data sets they work with are related to artificial intelligence (AI) or genome work, they can use AC922 machines with four GPUs to help process data quickly. “We currently find about a 3x to 4x increase on CPU side using the POWER9 and about a 2x to 4x increase when processing data on the GPU,” he says, adding that one group doing AI processing can segment on the CPU and classify on the GPU for a 4x to 8x speedup on larger data sets. “For example, we have a group using sound data to identify owls in the forest and they use the AC922 for both segment and then classify, giving them a 6x speed over the x86 based machines they were using.”
What Businesses Can Learn
Businesses can take note of this solution, according to Sullivan. Processing data faster creates more transactions, which translates to more money in the business world. Sullivan feels that GPU technologies create the best pathways for new computing at his lab but understands that will change in time. “We feel a heterogeneous infrastructure is the only way to ensure our group can handle every challenge,” he says. “We’re not going to give up the hardware we’re all familiar with. However, we need to embrace different hardware to overcome limits and move into the future.”
Sullivan understands, however, that not every company can afford to upgrade in the future. In that case, he suggests organizations break down the role their hardware plays in their computing needs to find the limits around CPU and then GPU. From there, he recommends testing the throughput for each on both x86 and PPC64LE. Then, evaluate the costs. That, he says, has become a business skill his lab applies in its everyday work. “We do a cost-benefit analysis on just about everything,” he says. “We many times have to create new more cost-effective methods of data collection or management to keep funds available for labor and other reoccurring costs.”
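The evaluation Sullivan describes (measure CPU and GPU throughput on each platform, then weigh the costs) can be sketched in a few lines of Python. This is only an illustration, not the lab’s actual tooling: the `BenchmarkResult` class, the `cheapest_per_stage` helper, the hourly cost field and every number in the usage section are hypothetical placeholders that a team would replace with its own measurements.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One throughput test of one stage (CPU or GPU) on one platform."""
    platform: str         # e.g. "x86" or "ppc64le"
    stage: str            # e.g. "cpu" or "gpu"
    items_processed: int  # work units completed during the test run
    seconds: float        # wall-clock duration of the test run
    cost_per_hour: float  # assumed amortized hardware and power cost

    @property
    def throughput(self) -> float:
        """Work units processed per second."""
        return self.items_processed / self.seconds

    @property
    def cost_per_million_items(self) -> float:
        """Cost of processing one million work units at this throughput."""
        hours_needed = 1_000_000 / self.throughput / 3600
        return hours_needed * self.cost_per_hour


def cheapest_per_stage(results):
    """Return, for each stage, the result with the lowest cost per million items."""
    best = {}
    for result in results:
        current = best.get(result.stage)
        if current is None or (
            result.cost_per_million_items < current.cost_per_million_items
        ):
            best[result.stage] = result
    return best


if __name__ == "__main__":
    # Placeholder numbers for illustration only; these are not measurements.
    runs = [
        BenchmarkResult("x86", "cpu", 1_000, 10.0, 3.6),
        BenchmarkResult("ppc64le", "cpu", 3_000, 10.0, 5.4),
        BenchmarkResult("x86", "gpu", 4_000, 10.0, 7.2),
        BenchmarkResult("ppc64le", "gpu", 8_000, 10.0, 9.0),
    ]
    for stage, winner in cheapest_per_stage(runs).items():
        print(
            f"{stage}: {winner.platform} at "
            f"{winner.cost_per_million_items:.2f} per million items"
        )
```

Comparing cost per unit of work, rather than raw speed alone, keeps the decision tied to budget: a faster machine only wins if its extra throughput outweighs its extra cost.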