Big Data Analytics Drives Efforts to Bring Hadoop to the Mainframe
Most mainframe shops consider Hadoop a distributed, open-source systems play. It conjures up visions of thousands or tens of thousands of Hadoop instances running on the cheapest, commodity-priced x86 servers. At least, that’s the lore around Hadoop: Google and Yahoo distributing queries across vast numbers of cheap servers to achieve fast response times.
In truth, this open-source technology, supported by the large Apache Hadoop community and modeled on the MapReduce and distributed file system designs that Google published, underpins Yahoo, Facebook, Twitter, eBay and many other big-name Internet players that need to sift and sort the vast amounts of data generated by their operations. They run separate instances of Hadoop on massive numbers of commodity servers so they can replicate data and spread query processing across multiple machines, knowing that some will fail but others will pick up the workload, and nobody will notice or care. Do you mind if your query runs a few tenths of a second slower?
IBM Joins the Hadoop Party
In April, IBM started making Hadoop more interesting to its mainframe base. Tucked into its Vivisimo acquisition announcement, IBM declared that its big data platform is based on open-source Apache Hadoop. According to IBM, the platform makes it easier for data-intensive applications to manage and analyze petabytes of data by providing an integrated approach to analytics that helps organizations turn raw data into actionable business insights.
IBM further explained that its evolving strategic big data platform promises the industry's broadest array of advanced, Hadoop-based business analytics, stream computing, data warehousing, integration, visualization, systems management, governance and consulting services. IBM, in effect, is taking a federated approach to the big data challenge by blending traditional data management technologies with what it sees as complementary new technologies, like Hadoop, that address speed and flexibility and are well suited to data exploration, discovery and unstructured analysis.
In addition, the company is expanding this evolving analytics environment to run on various distributions of Hadoop, beginning with Cloudera, a top contributor to the Hadoop development community. The result: Cloudera Hadoop clients will be able to take advantage of IBM's big data platform to perform complex analytics.