A Deeper Look at POWER8 CAPI and Data Engine for NoSQL
In all of the press surrounding POWER8 technology and the OpenPOWER Foundation, you’ve likely seen mentions of the Coherent Accelerator Processor Interface (CAPI), and you might have questions. This article explains CAPI technology and describes how it’s being used in the Data Engine for NoSQL, the first exploiter of the CAPI platform.
IT organizations must provide increased system performance as their workloads grow with demands for big data analysis, social media applications, technical computing, continuous customer connectivity and business-specific applications. Increases in processor performance alone can no longer satisfy these workload demands, so solutions must also come from system-level advances such as hybrid computing, processing-engine customization and open platform development that enables cross-company innovation.
NoSQL and CAPI create a new tier of memory by attaching up to 40 TB of auxiliary flash memory to the processor without the latency issues of traditional I/O storage
Many solutions will continue to improve price-performance through function-specific accelerators, typically delivered as GPU- or field-programmable gate array (FPGA)-based solutions. We already see these today as I/O-attached acceleration engines, such as an NVIDIA GPU on a PCIe card. But the overhead of controlling and communicating with the accelerator device often offsets the raw algorithm speedup, yielding a less-than-optimal gain in overall application performance. This overhead comes from copying data to pinned memory pages and from communicating with the accelerator through interrupts and memory-mapped I/O. The programming complexity of dealing with these issues has made hardware acceleration unattractive for many applications.
The CAPI Difference
CAPI on POWER8 systems provides a high-performance platform for the implementation of client-specific, computation-heavy algorithms on an FPGA. This innovation can replace either application programs running on a core or custom acceleration implementations attached via I/O. CAPI removes the overhead and complexity of the I/O subsystem, allowing an accelerator to operate as part of an application. IBM’s platform enables higher system performance with a much smaller programming investment, allowing hybrid computing to be successful across a broader range of applications.
In the CAPI paradigm, the specific algorithm for acceleration is contained in a unit on the FPGA called the accelerator function unit (AFU or accelerator). The AFU provides applications with a higher computational unit density for customized functions to improve the performance of the application and offload the host processor. At the same time, the FPGA draws less power than the core for a better overall IT solution. Using an AFU for application acceleration allows for cost-effective processing over a wide range of applications. A key innovation in CAPI is that the POWER8 system contains custom silicon that enables the infrastructure to treat the client’s AFU as a coherent peer to the POWER8 processors.
Because of CAPI’s peer-to-peer coherent relationship with the POWER8 processors, data-intensive programs are easily offloaded to the FPGA, freeing the POWER8 cores to run standard software. Any algorithm that you can code into an FPGA can now run on a POWER8 system through this low-overhead mechanism. CAPI’s overall value proposition is that it significantly reduces development time for new algorithm implementations and improves application performance by connecting the processor to hardware accelerators and letting them communicate in the same language, eliminating intermediaries such as I/O drivers.