NEWS ANALYSIS
Look Before You Leap Into Hadoop
Analysts and early users warn that most data centers lack the analytics expertise needed for the open-source big data technology.
By Todd R. Weiss
Now that Apache.org has listed more than 150 enterprises as Hadoop users, including JPMorgan Chase, IBM, Google, Booz Allen Hamilton and the New York Times, it seems likely that the big data management system could soon become all the rage among corporate IT executives.
But analysts and early users warn that
companies should move slowly to take
advantage of the open-source technology,
noting that Hadoop requires extensive
training along with analytics expertise not
seen in many IT shops today.
Some also noted that the swollen ranks of
suppliers of Hadoop technology could soon
thin out, leaving some users without vendor
support for the complex technology.
To be sure, Hadoop clearly has some technical
advantages over traditional database management
systems, especially its ability to simultaneously handle
both structured data and unstructured information
such as video, audio and email messages. Hadoop
systems can also scale with minimal fuss and bother.
Forrester Research analyst James Kobielus pointed
out that only about 1% of U.S. enterprises are currently
using Hadoop in production environments. That
figure should remain small for now, perhaps growing
to 2% or 3% over the course of the year, he projected.
Concurrent Computer and eBay may be more
typical of today’s early Hadoop adopters; they use
the big data technology for specific applications
while maintaining traditional relational database
technology for the bulk of their IT operations.
As such IT operations build up expertise, they
can figure out more things to do with Hadoop,
Kobielus said.
Online auction house eBay stores unstructured data
on Hadoop-based clusters running on “thousands” of
nodes, while using relational databases for key tasks
like transaction processing, said Hugh Williams, vice
president of experience, search and platforms.
“We see value in using multiple technologies to
work with our data,” Williams said. “Hadoop is a
terrific choice for certain uses, while other technologies
work alongside it for other purposes.”
In the long term, he said, the idea is to remain
“flexible in what technologies we use; we don’t see a
world [with] one unifying technology.”
Concurrent, a maker of video-streaming systems,
uses Hadoop to “do the heavy lifting, such as large-scale
data processing,” said William Lazzaro, director
of engineering.
Concurrent continues to use multiple relational
databases, including MySQL, PostgreSQL and
Oracle, for other tasks, Lazzaro added.
Kobielus also warned that today’s market for
Hadoop technology is “turbulent,” with a fast-
growing community of vendors that continues
to “rapidly evolve.”
Marcus Collins, an analyst at Gartner, suggested
that IT managers take the time needed to seek out hard-to-find
Hadoop experts before getting too immersed in the technology.
“You need to train your staff and invest in analytics,” he said.
“It’s not trivial,” agreed eBay’s Williams. “We’ve put a lot
of training in place, so our engineers know how to use
Hadoop and can write code.”
Analysts and users also stressed the
need to educate corporate executives
on the use of an open-source system for
mission-critical applications.
Using it for a few under-the-radar kinds
of projects is one thing, but using it to
develop a massive system for all the world
to see is another thing entirely.
Weiss is a freelance technology writer.
“It’s not trivial. We’ve put a lot of training in place so our engineers know how to use Hadoop and can write code.”
HUGH WILLIAMS, VICE PRESIDENT OF EXPERIENCE, SEARCH AND PLATFORMS, eBAY