Hbase

Hadoop-hbase-streaming/src at master · wanpark/hadoop-hbase-streaming. HBase InputFormat/OutputFormat for Hadoop Streaming - こども(てれび)

What is this?
InputFormat/OutputFormat to use HBase tables as input/output of MapReduce in Hadoop Streaming.

Usage

Setting:
Set up Hadoop 1.7.2, HBase 0.2.1.
Download hadoop-hbase-streaming.jar from the repository.
Edit $HADOOP_HOME/conf/hadoop-env.sh and add the downloaded jar to HADOOP_CLASSPATH.
Run Hadoop Streaming with the selected format and jobconf options.

Supported Options:
-jobconf map.input.table=<value>: Input table name for the Map step.
-jobconf map.input.columns=<value>: Column name to scan.
-jobconf map.input.binary=<true|false>: Optional.
-jobconf map.input.timestamp=<true|false>: Optional.
-jobconf reduce.output.table=<value>: Output table name.
-jobconf reduce.output.binary=<true|false>: Set true when column names and cell values are Base64 encoded.

InputFormats:
org.childtv.hadoop.hbase.mapred.JSONTableInputFormat -inputformat=json -jobconf map.input.timestamp=true
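The options listed above can be assembled into a Hadoop Streaming command line. The following is a minimal sketch, not the project's own tooling: the helper name `build_streaming_args`, the streaming jar path, the placeholder `-input` directory, and the sample table and column names are all assumptions made for illustration. Only the `-jobconf` keys and the input format class name come from the excerpt.

```python
import shlex

# Input format class named in the excerpt above.
INPUT_FORMAT = "org.childtv.hadoop.hbase.mapred.JSONTableInputFormat"


def build_streaming_args(streaming_jar, table, columns, mapper, output_dir,
                         input_dir="dummy", with_timestamp=False, binary=False):
    """Return an argv list for a Hadoop Streaming job that scans an HBase table.

    `input_dir` is a hypothetical placeholder: this sketch assumes Streaming
    still expects an -input argument even though rows come from the table.
    """
    jobconf = {
        "map.input.table": table,      # input table name for the Map step
        "map.input.columns": columns,  # column name to scan
    }
    # The two optional flags are only emitted when switched on.
    if binary:
        jobconf["map.input.binary"] = "true"
    if with_timestamp:
        jobconf["map.input.timestamp"] = "true"

    args = [
        "hadoop", "jar", streaming_jar,
        "-input", input_dir,
        "-output", output_dir,
        "-mapper", mapper,
        "-inputformat", INPUT_FORMAT,
    ]
    for key, value in jobconf.items():
        args += ["-jobconf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # All values below are made up for the example.
    argv = build_streaming_args(
        streaming_jar="/path/to/hadoop-streaming.jar",
        table="webpages",
        columns="content:",
        mapper="/bin/cat",
        output_dir="out",
        with_timestamp=True,
    )
    print(shlex.join(argv))
```

Building the command as a list rather than a single string keeps quoting out of the picture until the very end, where `shlex.join` produces a copy-pasteable line.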

Chicago Data Summit: Apache HBase: An Introduction.

Apache HBase Do’s and Don’ts. I recently gave a talk at the LA Hadoop User Group about Apache HBase Do’s and Don’ts. The audience was excellent and had very informed and well articulated questions. Jody from Shopzilla was an excellent host and I owe him a big thanks for giving the opportunity to speak with over 60 LA Hadoopers.

Since not everyone lives in LA or could make it to the meetup, I’ve summarized some of the salient points here. For those of you with a busy day, here’s the tl;dr:

HBase is good, but not an RDBMS or HDFS replacement
Good configuration means good operation
Monitor monitor monitor monitor monitor

We at Cloudera are big fans of HBase.

When to use HBase

The most important consideration when looking at HBase is that, while it is a great solution to many problems, it is not a silver bullet.

With that caveat out of the way – why should you use HBase?

Supporting Services

Assuming you’re convinced that HBase is a good fit for your application, here are some tips you need to consider when deploying it.

The Apache HBase™ Reference Guide. The documentation below is generated using the default hbase configuration file, hbase-default.xml, as source.

hbase.tmp.dir
Temporary directory on the local filesystem. Change this setting to point to a location more permanent than '/tmp', the usual resolve for java.io.tmpdir, as the '/tmp' directory is cleared on machine restart.
Default: ${java.io.tmpdir}/hbase-${user.name}

hbase.rootdir
The directory shared by region servers and into which HBase persists.
Default: ${hbase.tmp.dir}/hbase

hbase.cluster.distributed
The mode the cluster will be in.
Default: false

hbase.zookeeper.quorum
Comma separated list of servers in the ZooKeeper ensemble (This config. should have been named hbase.zookeeper.ensemble).
Default: localhost

hbase.local.dir
Directory on the local filesystem to be used as a local storage.
Default: ${hbase.tmp.dir}/local/

hbase.master.port
The port the HBase Master should bind to.
Default: 16000

hbase.master.info.port
The port for the HBase Master web UI.
Default: 16010

Default: 0.0.0.0
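Defaults such as these are normally overridden in hbase-site.xml, which uses a `<configuration>` root holding `<property>` elements with `<name>` and `<value>` children. As a rough sketch of what moving hbase.tmp.dir off '/tmp' might look like, the script below generates such a file body. The helper names, the directory path, and the ZooKeeper host names are hypothetical and chosen only for the example; the property names are the ones listed above.

```python
import xml.etree.ElementTree as ET


def build_hbase_site(overrides):
    """Build a <configuration> element with one <property> per override."""
    root = ET.Element("configuration")
    for name, value in overrides.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return root


def to_xml_string(root):
    """Serialize the element tree, pretty-printing where the runtime supports it."""
    if hasattr(ET, "indent"):  # ET.indent is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example values only; adjust to the actual cluster layout.
    overrides = {
        "hbase.tmp.dir": "/var/lib/hbase/tmp",  # hypothetical permanent location
        "hbase.cluster.distributed": "true",
        "hbase.zookeeper.quorum": "zk1.example.com,zk2.example.com,zk3.example.com",
    }
    print(to_xml_string(build_hbase_site(overrides)))
```

Only the properties that differ from hbase-default.xml need to appear in the generated file; everything else keeps its documented default.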

Recommender System at Scale Using HBase and Hadoop.
Accelerating Big Data with Hadoop (HDFS, MapReduce and HBase) and Memcached.
Low Latency "OLAP" with Hadoop and HBase.