New job opportunities are emerging for IT professionals in the field of "big data," the term used to describe how corporations gather vast amounts of real-time data about their customers and analyze that data to drive decision making and increase profitability.
A new job title -- data scientist -- is all the rage. A data scientist typically has a background in computer science or mathematics as well as the analytical skills necessary to find the proverbial needle in a haystack of data gathered by the corporation.
"A data scientist is somebody who is inquisitive, who can stare at data and spot trends," says Anjul Bhambhri, vice president of Big Data Products at IBM. "It's almost like a Renaissance individual who really wants to learn and bring change to an organization."
Unheard of 18 months ago, "data scientist" has exploded in popularity as a Google search term. Google searches for "data scientist" hit peaks 20 times higher than normal in the last quarter of 2011 and the first quarter of 2012. It's a popular search term in high-tech hotspots such as San Francisco, Washington, D.C., and New York.
Among the U.S. companies looking to hire data scientists are PayPal, Amazon and HP. Indeed, the term "data scientist" is mentioned in 195 job listings on the Dice.com website for IT professionals.
IT departments are also adding data-centric developers and system administrators who specialize in tools such as the open source Apache Hadoop software. Hadoop is designed for data-intensive, distributed applications and is used by such popular websites as Yahoo, Facebook, LinkedIn and eBay.
Hadoop is mentioned in 612 of 83,122 job listings on Dice.com, or about 0.7 percent. Among the companies looking to hire Hadoop software engineers and developers are AT&T Interactive, Sears, PayPal, AOL and Deloitte.
Hadoop "is an emerging skill," says Alice Hill, managing director of Dice.com. "Companies need to manage large-scale data operations, and the whole idea of Hadoop is that you can do it inexpensively. That works really well with what we're seeing in terms of the movement to the cloud."
Hill sees opportunities related to Hadoop for both entry-level and experienced IT staff, as well as for hardware and software specialists.
"Traditional hardware people need to figure out how to cluster in different environments. It's not just about buying a database and hooking it up to a hard drive. Now you have distributed databases that are hooked up to multiple servers and multiple hard drives," Hill says. Hadoop "is inexpensive, but it requires somebody who really knows how to scale hardware."
Hill says Hadoop is also a good skill for IT professionals with relational database management experience to pursue. "If you really understand data structure and queries, there's going to be a lot of job opportunities," she adds.
Job opportunities for data scientists and Hadoop specialists are emerging across industries, from Web companies and e-retailers to financial services, healthcare, energy, utilities and media.
"There are so many directions you can go in with these [data management] skills," Hill says. "It's very fertile ground for experienced IT professionals, but also for people coming out with computer science degrees. It's a great area to specialize in."
Bhambhri says IT departments will be looking to hire new people in the area of big data as well as to retrain some of their existing staff to add data analysis and Hadoop-related programming and administrative skills. For example, IBM retrained 2,400 IT professionals at Big Data Bootcamps that it held for its customers and partners last year.
"The IT departments really have to expand their data platforms and not be restricted to structured data repositories," Bhambhri says. "They have to bring in new sources of unstructured data in their platforms to answer the questions that C-level executives are asking for their decision-making processes. From an IT perspective, it's very important for the IT folks to not only identify these data sources but to work with their business counterparts to discover what other sources of data need to be seamlessly integrated into their platforms."
IBM has a new initiative called Big Data University aimed at training undergraduate and graduate students in the area of big data and exposing them to Hadoop. Since Big Data University launched last October, more than 14,000 students have registered for its online courses. IBM offers six online courses related to Hadoop and big data.
"We are trying to get the students to really see the potential of big data and what the business outcomes can be from these new sources of data," Bhambhri says. "We're giving them use cases from retail, healthcare and telecommunications companies. We're showing them what was not possible before is possible now because of the work we have done with different customers in these different industries."
Bhambhri is optimistic about the career prospects for IT professionals with data management and Hadoop skills.
"In every industry, there is a lot of data that is getting captured, whether it is sensor data or log data or data coming from social media like Facebook and Twitter," Bhambhri says. "The volumes of the data are huge. So what has been happening is that a lot of our customers are capturing the data, but until now there was no technology that was available that they could use to analyze this data quickly in a cost-effective manner. That was a huge bottleneck. Now we read what Yahoo and Google are doing around Hadoop and MapReduce, and it certainly looks like these open source tools will solve that bottleneck."
Read more about infrastructure management in Network World's Infrastructure Management section.