background preloader

HBase - Apache HBase™ Home

HBase - Apache HBase™ Home
Welcome to Apache HBase™ Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. When Would I Use Apache HBase? Use Apache HBase when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al.

HBase Un article de Wikipédia, l'encyclopédie libre. HBase est un système de gestion de base de données non-relationnelles distribué, écrit en Java, disposant d'un stockage structuré pour les grandes tables. HBase est inspirée des publications de Google sur BigTable. Comme BigTable, elle est une base de données orientée colonnes. HBase est un sous-projet d'Hadoop, un framework d'architecture distribuée. La base de données HBase s'installe généralement sur le système de fichiers HDFS d'Hadoop pour faciliter la distribution, même si ce n'est pas obligatoire.

Understanding HBase and BigTable - From The hardest part about learning HBase (the open source implementation of Google's BigTable), is just wrapping your mind around the concept of what it actually is. I find it rather unfortunate that these two great systems contain the words table and base in their names, which tend to cause confusion among RDBMS indoctrinated individuals (like myself). This article aims to describe these distributed data storage systems from a conceptual standpoint. After reading it, you should be better able to make an educated decision regarding when you might want to use HBase vs when you'd be better off with a "traditional" database.

The Hadoop ecosystem: the (welcome) elephant in the room (infographic) To say Hadoop has become really big business would be to understate the case. At a broad level, it’s the focal point of a immense big data movement, but Hadoop itself is now a software and services market of its very own. In this graphic, we aim to map out the current ecosystem of Hadoop software and services — application and infrastructure software, as well as open source projects — and where those products fall in terms of use cases and delivery model.

Jetspeed 2 - Jetspeed 2 Home Page Jetspeed is an Open Portal Platform and Enterprise Information Portal, written entirely in open source under the Apache license in Java and XML and based on open standards. All access to the portal is managed through a robust portal security policy. Within a Jetspeed portal, individual portlets can be aggregated to create a page. Each portlet is an independent application with Jetspeed acting as the central hub making information from multiple sources available in an easy to use manner. Jetspeed has been fully conformant to the Java Portlet 2.0 Standard since release 2.2.0 in May 2009. All releases prior, such as the 2.1.x releases, are conformant to the first Java Portlet Specification, the Java Portlet 1.0 Standard.

Hive! Bigtable: système de bases de données distribué version Google Aspirer l’intégralité du Net comme le fait Google et l’indexer – afin de satisfaire plus d’un milliard de requêtes par jours – nécessite un système d’accès aux données capable de trouver une information rapidement dans une volumétrie considérable. Contrainte supplémentaire, les temps de réponse doivent être très rapides pour ne pas éveiller l’impatience des internautes. Les SGBDR traditionnelles ne suffisent plus pour satisfaire de tels besoins: les temps de réponses deviennent trop important sur de telles volumétries (on parle ici de Petabytes ).

Apache Hadoop! How Hadoop Works? HDFS case study The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. The Hadoop library contains two major components HDFS and MapReduce, in this post we will go inside each HDFS part and discover how it works internally.

MySQL Engines: InnoDB vs. MyISAM – A Comparison of Pros and Cons The 2 major types of table storage engines for MySQL databases are InnoDB and MyISAM. To summarize the differences of features and performance, InnoDB is newer while MyISAM is older.InnoDB is more complex while MyISAM is simpler.InnoDB is more strict in data integrity while MyISAM is loose.InnoDB implements row-level lock for inserting and updating while MyISAM implements table-level lock.InnoDB has transactions while MyISAM does not.InnoDB has foreign keys and relationship contraints while MyISAM does not.InnoDB has better crash recovery while MyISAM is poor at recovering data integrity at system crashes.MyISAM has full-text search index while InnoDB has not. In light of these differences, InnoDB and MyISAM have their unique advantages and disadvantages against each other. They each are more suitable in some scenarios than the other.

Sqoop BigTable Un article de Wikipédia, l'encyclopédie libre. BigTable est un système de gestion de base de données compressées, haute performance, propriétaire, développé et exploité par Google[1]. Chez Google, BigTable est stockée sur le système de fichiers distribué GoogleFS. Google ne distribue pas sa base de données mais propose une utilisation publique de BigTable via sa plateforme d'application Google App Engine.

Transcript of HBase for Architects Presentation - Nick Dimiduk I was invited to speak at the Seattle Technical Forum’s first ”Big Data Deep Dive”. The event was very well organized and all three presentations dove-tailed into each other quite well. No recording was made of the event, so this is a transcription of my talk based on notes and memory. The deck is available on slideshare, and embedded at the bottom of the post. Hi everyone, thanks for having me. My name is Nick Dimiduk, I’m an engineer on the HBase team at Hortonworks, contributing code and advising our customers on their HBase deployments.

Cordova About Apache Cordova™ Apache Cordova is a set of device APIs that allow a mobile app developer to access native device function such as the camera or accelerometer from JavaScript. Combined with a UI framework such as jQuery Mobile or Dojo Mobile or Sencha Touch, this allows a smartphone app to be developed with just HTML, CSS, and JavaScript. When using the Cordova APIs, an app can be built without any native code (Java, Objective-C, etc) from the app developer. Instead, web technologies are used, and they are hosted in the app itself locally (generally not on a remote http server).

A distributed, scalable, big data store with random, real time read/write access. by sergeykucherov Jul 15

Related:  ColumnolegwatersColumn OrientedHadoop Tools