Hewlett-Packard's Autonomy subsidiary will release an add-on component that links the company's flagship IDOL search software to the Apache Hadoop data processing platform, the company announced on 4 June at its HP Discover user conference, held this week in Las Vegas.
While Hadoop provides a good platform for holding vast amounts of information, it offers little in the way of prebuilt analysis tools, said Matt Malden, Autonomy vice president. Organisations must write their own Java programs in the MapReduce framework to analyse their data.
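To illustrate the kind of program Malden is describing, here is a minimal sketch of the MapReduce pattern applied to a word count. It is written in plain Python purely for illustration and runs in a single process; an actual Hadoop job would typically be written in Java against Hadoop's own API and spread across a cluster. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, the step a framework performs between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    docs = ["big data on Hadoop", "Hadoop stores big files"]
    print(word_count(docs))
```

Even a task this simple has to be broken into separate map and reduce steps, which gives a sense of why prebuilt analysis functions would appeal to organisations.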
With Autonomy's Hadoop package, users can instead embed an IDOL 10 engine in each node of their Hadoop cluster. They can then use IDOL's 500 functions to analyse and summarise the data held in the Hadoop implementation.
Autonomy's IDOL (Intelligent Data Operating Layer) provides enterprise users with the ability to conduct complex queries across large amounts of unstructured data, such as web pages, email and digitised office documents. Over 400 organisations use this software, according to the company.
All the functionality in IDOL itself can be applied to a Hadoop dataset, Malden said. The software offers such functionality as concept searching, where a search on one word will also return items that contain synonyms of that word. It can do sentiment analysis, offering a summary of how negative or positive the information in a set of documents may be. Such sentiment analysis can be used to understand user satisfaction levels, perhaps over a select period of time. IDOL can also offer conceptual clustering, whereby it groups documents under broad themes, potentially simplifying a search process.
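As a rough illustration of two of the ideas described above, the following Python sketch shows a toy concept search that expands a query word to its synonyms, and a toy sentiment score that counts positive and negative words. It is not IDOL's method or API: the synonym table, the word lists and the function names (`concept_search`, `sentiment_score`) are hand-made assumptions for this example, and the sketch ignores punctuation and context.

```python
# Hypothetical, hand-made data for illustration only.
# These are not IDOL's synonym model or sentiment lexicon.
SYNONYMS = {"car": {"car", "automobile", "vehicle"}}
POSITIVE = {"good", "great", "happy"}
NEGATIVE = {"bad", "poor", "unhappy"}


def concept_search(term, documents):
    """Return documents containing the term or any of its listed synonyms."""
    terms = SYNONYMS.get(term, {term})
    return [doc for doc in documents if terms & set(doc.lower().split())]


def sentiment_score(document):
    """Count positive words minus negative words; above zero leans positive."""
    words = document.lower().split()
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


if __name__ == "__main__":
    docs = [
        "The automobile was great",
        "Poor service at the bank",
        "My car is bad",
    ]
    print(concept_search("car", docs))
    print([sentiment_score(doc) for doc in docs])
```

A search for "car" here also matches the document that only mentions "automobile", which is the behaviour the article attributes to concept searching, albeit in a far cruder form.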
The pairing of Hadoop and IDOL was a natural fit, Malden said. "You don't need to move data into IDOL to use its functions. Whatever technology choice you make for data storage, we are able to process it," he said, adding that IDOL has 400 connectors to various other platforms, and can understand over 1,000 different data formats.
Autonomy is one of a number of data-centred software companies that have bridged their offerings to the increasingly popular Hadoop. Teradata partnered with Hadoop distributor Hortonworks to bridge Hadoop with its own data warehouse software. In a similar move, Oracle partnered with Cloudera, another Hadoop distributor.
HP Autonomy declined to reveal pricing or availability of the IDOL Hadoop plug-in.
Autonomy also announced on Monday that it has released Autonomy Optimost Clickstream Analytics, a tool that analyses how users navigate a company's website. The software summarises user visits, as well as online purchases and other pertinent information.