Microsoft is focused this week on the launch of its converged Windows 8 operating system, which a number of pundits and industry watchers have declared a make-or-break release for the company. In the meantime, it is also setting its sights on the nascent but much-hyped big data market by giving organizations the capability to deploy and manage Hadoop in a familiar Windows context.
Two days ahead of the Windows 8 launch, Microsoft used the platform provided by the O'Reilly Strata Conference + Hadoop World here in New York to announce an expanded partnership with Hortonworks, a provider of a Hadoop distribution and one of the companies that have taken a leading role in the open source Apache Hadoop project, and to unveil new previews of a cloud-based solution and an on-premise solution for deploying and managing Hadoop. The previews also give customers the capability to use Excel, PowerPivot for Excel and Power View for business intelligence (BI) and data visualization on the data in Hadoop.
Microsoft has dubbed the cloud-based version Windows Azure HDInsight Service, while the on-premise offering is Microsoft HDInsight Server for Windows.
"Microsoft's entry expands the potential market dramatically and connects Hadoop directly to the largest population of business analysts: users of Microsoft's BI tools," says Merv Adrian, research vice president, Information Management, at Gartner. "If used effectively, Microsoft HDInsight will enable a significant expansion of the scope of data available to analysts without introducing substantial new complexity to them."
Microsoft Promises to Reduce Big Data Complexity
"This provides a unique set of offerings in the marketplace," says Doug Leland, general manager of SQL Server Marketing at Microsoft. "For the first time, customers will have the enterprise characteristics of a Windows offering-the simplicity and manageability of Hadoop on Windows-wrapped up with the security of the Windows infrastructure in an offering that is available both on-premise and in the cloud. This will ultimately take out some of the complexity that customers have experienced with some of their earlier investigations of big data technologies."
"Big data should provide answers for business, not complexity for IT," says David Campbell, technical fellow, Microsoft. "Providing Hadoop compatibility on Windows Server and Azure dramatically lowers the barriers to setup and deployment and enables customers to pull insights from any data, any size, on-premises or in the cloud."
One of the pain points experienced by just about any organization that seeks to deploy Hadoop is the shortage of Hadoop skills among the IT staff. Engineers and developers with Hadoop chops are difficult to come by. Gartner's Adrian is quick to note that HDInsight in either flavor won't eliminate that issue, but it will allow more people in the organization to benefit from big data faster.
"The shortage of skills continues to be a major impediment to adoption," Adrian says. "Microsoft's entry does not relieve the shortage of experienced Hadoop staff, but it does amplify their ability to deliver their solutions to a broad audience when their key foundation work has been done."
"It opens the world of data outside traditional data stores to commonly used business tools in a way that will accelerate adoption by today's business analysts-but not without the infrastructure creation that will still depend on the skills of technical professionals using Hadoop technologies," he adds.
Seamlessly Move Big Data Between Cloud and On-Premise
One immediate benefit, Adrian notes, is the ability to prototype rapidly in the cloud and then seamlessly bring the solution on-premise using the common platform offered by Windows Azure HDInsight Service and Microsoft HDInsight Server for Windows.
"Early experimentation and testing in the cloud eliminates early capital expenditures that can be a barrier to adoption," Adrian says. "For example, Amazon's AWS has been used for millions of Elastic MapReduce jobs, but those experiments stayed in the cloud. The connection of cloud to on-premise systems offers the promise of more rapid time to value-and Microsoft has established a lead in connecting to its widely used stack."
Microsoft's Leland notes that the capabilities provided by the two flavors of HDInsight also allow customers to run most of their big data analytics on-premise but cloudburst to Azure when they need additional compute capacity.
"Ultimately it's an ideal scenario," he says. "You can manage your infrastructure cost, but really have unlimited scale when you need it. You can spin up a cluster in less than 10 minutes [in Azure]. Having that kind of additional capacity, that quickly, on demand, that simply, is certainly going to change the game."
Microsoft Committed to Open Source
For its part, Hortonworks is working closely with Microsoft on its big data initiatives.
"This is a joint engineering effort," explains John Kreisa, vice president of marketing for Hortonworks. "This is an engineering relationship in which our engineers have been working with Microsoft engineers in porting Hadoop, which is traditional Linux infrastructure, over to Windows."
"Hortonworks and Microsoft share a very common and aligned view on the Hadoop technology, the strategy and the evolution of this market," Leland adds. "Hortonworks is the only provider of Hadoop that is committed to delivering a 100 percent open source distribution of Apache Hadoop-no proprietary code-and that, as we came out earlier in the process, was an absolute stated direction for Microsoft. We have a commitment on our part to contribute back all the code that we build to the open source community. Given that alignment and approach, Hortonworks was really the natural partner for us."
For now, both flavors of HDInsight are in preview and free to use. Leland says Microsoft is encouraging customers to use the previews and help it round out the offering before its commercial launch.
Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com. Follow Thor on Twitter @ThorOlavsrud. Follow everything from CIO.com on Twitter @CIOonline and on Facebook.