Friday, January 25, 2013

Nice Summary of How to Set Up Hive

Massoud Mazar has a nice summary of how to set up Hive.

It is written for a particular version and operating system, but it is easy to extend to other Unix or Linux flavors and to Hadoop versions from 0.20.x all the way through 1.x.

Make sure you set conf/masters on every node that you set up according to Mazar's instructions. The host(s) listed in that configuration file need to be reachable from all (slave) nodes. (Each slave node can just carry "localhost" in its conf/slaves. That's alright.) Even with ssh authorization set up for the password-less identity, I found that I had to ssh from the master to each slave node at least once.
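A quick way to check the conf/masters point from a slave node is to read the host list and see which names fail to resolve. The following is only a minimal Python sketch: the function names parse_hosts and unresolvable are made up for illustration, the "#" comment handling is an assumption about the file format, and name resolution is a weaker test than real reachability (it does not prove that ssh or the Hadoop ports are open).

```python
import socket


def parse_hosts(text):
    """Return the host names found in the contents of a conf/masters or conf/slaves file."""
    hosts = []
    for line in text.splitlines():
        # Assumption: anything after '#' is a comment; blank lines are ignored.
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def unresolvable(hosts, resolve=socket.gethostbyname):
    """Return the hosts whose names cannot be resolved from this node."""
    failed = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(host)
    return failed
```

On a slave node one would call something like unresolvable(parse_hosts(open("conf/masters").read())) from the Hadoop directory; an empty list means every master name at least resolves from that node.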

I was able to create clusters of 10 machines, and to grow and shrink these clusters, using Linux containers. Occasionally, when the proper start-up and shut-down sequence is not followed and VERSION mismatches occur in the data layout, all HDFS data and configuration files have to be cleared.
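Before wiping everything, it can help to confirm that a VERSION mismatch really is the problem. The sketch below compares two VERSION files, which are plain key=value text. It assumes the Hadoop 1.x field names namespaceID and layoutVersion; the function names are hypothetical, and where the files live depends on your dfs.name.dir and dfs.data.dir settings.

```python
def parse_version(text):
    """Parse the key=value lines of a VERSION file, skipping comments and blank lines."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def mismatched_fields(name_version, data_version,
                      keys=("namespaceID", "layoutVersion")):
    """Return the keys whose values differ between the two VERSION file contents."""
    name_fields = parse_version(name_version)
    data_fields = parse_version(data_version)
    return [k for k in keys if name_fields.get(k) != data_fields.get(k)]
```

Feed it the text of the name node's VERSION file and of one data node's VERSION file. If namespaceID comes back in the result, that data node was formatted against a different name node state, which matches the situation described above.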

Further cluster set-up information for the current version can be found in the Apache Hadoop documentation. See cluster set-up. (I should note that Mazar's concise instructions were adequate, and I didn't need to refer to this documentation to get things up and running.)

