How Power Systems Helps Clients Manage Today’s Data Challenges

Data, the lifeblood of any business, used to be structured and organized neatly in columns and rows. Today, data categories are no longer tidy. The volume, variety, velocity and veracity of data lead to complex data management challenges. Companies are grappling with a growing tangle of unstructured data types, including social media feeds, streaming video, system logs and human notes such as those from patient-client interactions or call center user interactions. As the Internet of Things becomes part of our everyday lives, sensor data from devices joins the mix.

This volume of data is potentially as useful as it is unwieldy. Companies are gaining business insights from this rich array of data, which lets them target customers more effectively, improve operational efficiencies and weed out security risks. However, without new approaches to handling the data, companies will flounder.

IBM is developing innovations to help clients derive insights from the data and handle it efficiently. “Our clients recognize that they won’t be competitive in their markets if they don’t efficiently tackle and learn from these data sources,” says Steve Roberts, big data offering manager for IBM Power Systems*.

When demand for big data and analytics took off a few years ago, IBM responded by specifically designing POWER8* processors to accommodate large amounts of data. The POWER8 balanced architecture uses processor multithreading, large intelligent caches, and memory and I/O bandwidth to handle today’s workloads easily.

“When you look at the overall system design, the capability to provide in the neighborhood of 2.5 times I/O and memory bandwidth means you can move data in and out of the processor and among memory and CPU better,” says Linton B. Ward, IBM Distinguished Engineer and chief engineer big data, Power Systems.

POWER8 technology has an edge from the start with its numerous threads and cores that accelerate its processing capabilities. Larger cache and memory as well as more registers provide enhanced performance. The POWER8 processor has 64 registers compared to 16 registers in an x86 processor. More registers mean faster data handling, says Keshav Ranganathan, Power Systems analytics offering manager. “We start with the high-performance POWER8 processor and take a holistic view of all surrounding elements, memory, network, accelerators, et cetera, to design systems that deliver exceptional performance for a range of big data and analytics workloads,” he says.

Adding Acceleration

To enable greater client value, IBM introduced an innovative technology called the Coherent Accelerator Processor Interface (CAPI) for attaching accelerators and external devices. CAPI enables external processing engines to act like extensions of the processor, providing a high-bandwidth, low-latency path between external devices, the POWER8 core and the system’s open-memory architecture. CAPI adapters reside in regular Peripheral Component Interconnect Express (PCIe) x16 slots and use PCIe Gen 3 as the underlying transport mechanism.

Because of CAPI’s peer-to-peer coherent relationship with the POWER8 processor, data-intensive programs are easily offloaded to a field-programmable gate array (FPGA), freeing the POWER8 processor cores to run standard software. CAPI can also serve as a base for flash memory expansion, and that’s how it’s employed in the IBM Data Engine for NoSQL—Power Systems Edition.

IBM partner Redis Labs develops Redis, an in-memory key-value NoSQL database that delivers the fast response times critical for web and mobile applications, where data must reside in memory. By attaching flash to the Power Systems server via CAPI, the flash can act as extended memory, and CAPI reduces the time it takes a flash system to respond to NoSQL queries.

“With CAPI-attached flash capability, managing a large Redis Labs database doesn’t necessarily require your system to have enormous amounts of memory,” Ranganathan says. “It’s not exactly the same as having all the data in memory, but for the majority of applications having a chunk of data in memory and a large chunk of data in flash still provides the required performance characteristic.” In working with Redis Labs, IBM delivers the memory capacity, data-access latency and performance of 24 x86 servers and does it with a single Power Systems server with flash attached via CAPI, he contends.
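
The idea behind this arrangement is that only the hot working set needs to sit in DRAM, while the bulk of the dataset lives on flash that behaves like extended memory. The sketch below illustrates that tiering concept in plain Python; it is a toy model, not the Redis Labs or CAPI implementation, and the class name, capacity limit and LRU eviction policy are all assumptions made for the example.

```python
import collections, os, shelve, tempfile

class TieredStore:
    """Toy key-value store: a small in-memory (DRAM) tier backed by a
    larger on-disk tier standing in for CAPI-attached flash."""

    def __init__(self, mem_slots):
        self.mem = collections.OrderedDict()   # hot tier, kept in LRU order
        self.mem_slots = mem_slots             # how many keys fit "in DRAM"
        path = os.path.join(tempfile.mkdtemp(), "flash")
        self.flash = shelve.open(path)         # cold tier on disk

    def put(self, key, value):
        self.mem[key] = value
        self.mem.move_to_end(key)              # mark as most recently used
        while len(self.mem) > self.mem_slots:  # spill coldest keys to "flash"
            cold_key, cold_val = self.mem.popitem(last=False)
            self.flash[cold_key] = cold_val

    def get(self, key):
        if key in self.mem:                    # DRAM hit: fastest path
            self.mem.move_to_end(key)
            return self.mem[key]
        value = self.flash[key]                # flash hit: slower path
        self.put(key, value)                   # promote back into memory
        return value

store = TieredStore(mem_slots=2)
for i in range(4):
    store.put(f"k{i}", i)
# k0 and k1 have been spilled to the flash tier
print(sorted(store.flash.keys()))  # ['k0', 'k1']
print(store.get("k0"))             # 0, promoted back into the memory tier
```

The point of the model is the one Ranganathan makes: most requests hit the small fast tier, so the system behaves almost as if everything were in memory while the large, cheaper tier holds the rest.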

IBM offers other in-memory solutions, such as DB2* BLU, that give clients an order-of-magnitude difference in performance compared with competing platforms. “IBM is developing the technologies that have impact for our clients in real-world situations,” Ranganathan says.

Encouraging Collaboration

Thanks to IBM’s adoption of the OpenPOWER Foundation licensing model, partners are able to innovate around core technologies such as CAPI, Ward explains. IBM’s embrace of the open-source community could one day lead to advances in field-programmable gate arrays, flash, remote direct memory access and other technologies, he notes.

IBM also works directly with partners on solutions such as NVLink, the result of its collaboration with NVIDIA. NVLink enables the memory in a graphics processing unit (GPU) to access data in the server memory, allowing the GPU to work with the CPU using a simple, lower-latency programming model. NVLink creates an asymmetrical multiprocessor, something the computer science world has sought for some time, Ward says. Both CAPI and NVLink reduce latency and improve bandwidth, thereby lowering the cost of moving data between the computer and the accelerator.

Utilizing Hadoop

Acceleration isn’t the only focus for innovation. Companies are finding value in consolidating data stores into an Apache Hadoop-based environment and allowing employees and partners to gain insight from unstructured data. For example, “clients on the leading edge of predictive analytics can use Hadoop to detect security breaches as they occur by mining system logs and call logs,” Roberts says.

Hadoop helps companies economically augment relational data, such as that found in a data warehouse, with unstructured data. An analytics system using Hadoop is pushing pricing for the entire package—storage, network, compute and software stack—below $1 per usable gigabyte, Roberts says. That’s significant compared with the cost of traditional data warehousing.
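
One reason the figure is quoted per *usable* gigabyte is that HDFS stores multiple replicas of each block (three by default), so usable capacity is only a fraction of raw disk. A quick back-of-the-envelope calculation shows how the two relate; the raw-capacity and system-cost numbers below are illustrative assumptions, not figures from the article:

```python
# Back-of-the-envelope cost per usable GB in a Hadoop (HDFS) cluster.
# Only the 3x replication default is an HDFS fact; the other inputs
# are assumed for illustration.
raw_capacity_gb = 1_200_000   # total raw disk across the cluster (assumed)
replication_factor = 3        # HDFS default: each block is stored 3 times
system_cost_usd = 380_000     # storage + network + compute + software (assumed)

usable_gb = raw_capacity_gb / replication_factor
cost_per_usable_gb = system_cost_usd / usable_gb

print(f"usable capacity: {usable_gb:,.0f} GB")          # 400,000 GB
print(f"cost: ${cost_per_usable_gb:.2f} per usable GB") # $0.95
```

With these assumed inputs the package lands just under the $1-per-usable-gigabyte mark Roberts cites; tripling replication or halving raw capacity moves the figure proportionally.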

Hadoop also gives clients access to a huge ecosystem of partners and ISVs that are investing in the open-source community and adding tools and capabilities on a regular basis.

IBM is adding value to Hadoop for Power Systems. Analytics workloads demand high memory and I/O bandwidth, so IBM has developed an appliance based on IBM Spectrum Scale* software that acts as an Elastic Storage server. It provides consolidated, high-efficiency data storage for different analytics workloads, eliminating the need to copy data as a typical Hadoop system requires, Roberts says. This consolidated data storage footprint delivers resiliency through native software RAID support while providing superior performance to larger infrastructures based on local storage and data replication.

Introducing Spark

However, because Hadoop is a batch environment, it isn’t ideal for those who need real-time results. For these workloads, the Apache community has developed Spark, which provides even greater acceleration for certain types of workloads and lets customers economically tap into innovation, Ward says. Apache Spark enables more complex analytics tasks such as machine learning, interactive analytics and operational analytics. That capability comes to the fore when a business needs analytics results in real time to manage the supply chain, IT infrastructure and security vulnerabilities. Further, Spark’s SQL interfaces let it work with traditional data warehouse analytics tools such as SPSS*, Cognos* or SAS.
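
The batch-versus-real-time distinction can be made concrete with a toy example: recomputing an aggregate over the full dataset on every query (Hadoop-style batch) versus folding each record into a running state as it arrives (the style of incremental processing Spark’s streaming model supports). The code below is a plain-Python illustration of the two styles, not Spark code:

```python
# Toy contrast: batch recomputation vs. streaming (incremental) update
# of a running average. Plain Python for illustration; not Spark itself.

def batch_average(all_records):
    """Batch style: rescan the entire dataset every time an answer is needed."""
    return sum(all_records) / len(all_records)

class StreamingAverage:
    """Streaming style: fold each new record into running state, so the
    current answer is available immediately after every event."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, record):
        self.total += record
        self.count += 1
        return self.total / self.count  # up-to-date answer per event

records = [4, 8, 15, 16, 23, 42]
stream = StreamingAverage()
for r in records:
    latest = stream.update(r)       # answer tracks the data in real time

print(latest)                        # 18.0
print(batch_average(records))        # 18.0, but only after a full pass
```

Both styles reach the same answer; the difference is that the streaming version has a current result after every record, while the batch version must wait for a complete pass over the data, which is the gap Spark addresses for real-time workloads.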

IBM is looking at how technologies from the OpenPOWER Foundation can help users of Spark, too. In June, IBM announced a major commitment to Spark with plans to embed it into its platforms, offer Spark as a service in the cloud and put 3,500 IBM developers and researchers to work on Spark-related projects. “Spark leverages in-memory and high bandwidth with low-latency networking and IBM believes those are very significant areas to go after,” Ward says. In addition, IBM’s partnership with NVIDIA around machine learning will offer advantages for workloads in the Spark environment.

Satisfying Needs

IBM is helping companies of all sizes benefit from these innovations. The technology allows small, midsized and enterprise organizations to take advantage of analytics although the challenges may be different for each. For instance, a large enterprise may need to reconcile multiple data silos. A midsized company may start with a smaller Hadoop platform and then scale out as its needs grow. “Whether you are small or large, you need to make sense of your business including products, customers and potential customers,” Ranganathan says.

Every company today requires analytics to be competitive, and the difference in strategy boils down to the skills and maturity of the tools available, Ward says. Larger companies tend to invest in a broader diversity of skills so they’re going to be ahead of some of the midrange companies. But some small companies have very deep skills and may lead in certain areas based on their focus and insight.

The availability and importance of data mean that companies will be built around data in the not-too-distant future. “Because data will be their core service, it will transform a number of industries and we’ll see new business models based on data and analytics,” Ranganathan contends.

A Unique Position

Power Systems technology holds a unique position in the marketplace, offering various form factors that help clients with any in-memory need, from a single server with a few hundred gigabytes of RAM all the way up to a client that needs 16 TB of RAM, notes Anirban Chatterjee, IBM Power Systems portfolio marketing. “We serve the entire spectrum, which is unique in the business. The architecture really is designed to provide efficiency and performance for both extremes and anywhere in between,” he says. “That said, it’s an enterprise architecture, and design features for data efficiency and resiliency are built in to ensure that the system has no problems dealing with constant load at high utilization.”

Power Systems architecture will continue to evolve as businesses transform. The chaos of raw data is transformed into insight when the data is handled skillfully and swiftly. Companies know the benefits that insight provides. Any size business can use the solutions offered by IBM and Power Systems to make sense of the ever-growing data feeds and thrive.