HBase - Apache HBase™ Home

Welcome to Apache HBase™ Apache HBase™ is the Hadoop database, a distributed, scalable, big data store. When Would I Use Apache HBase? Use Apache HBase when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al.
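The "versioned", Bigtable-style model mentioned above can be pictured as a sparse, sorted map from (row key, column, timestamp) to a value. The following is only a minimal in-memory sketch of that idea in Python: `ToyTable` and its methods are hypothetical names invented for illustration, not the HBase client API, and nothing here is distributed or persistent.

```python
from collections import defaultdict


class ToyTable:
    """Hypothetical in-memory stand-in for a Bigtable-style table (not the HBase API)."""

    def __init__(self):
        # row key -> column name -> {timestamp: value}; only written cells exist (sparse).
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp):
        """Store one version of a cell; earlier versions are kept."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, stop_row):
        """Yield (row, column, newest value) for start_row <= row < stop_row, in key order."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                for column in sorted(self._rows[row]):
                    yield row, column, self.get(row, column)


# Usage: two versions of the same cell, read back by timestamp.
table = ToyTable()
table.put("user#42", "info:name", "Ada", timestamp=1)
table.put("user#42", "info:name", "Ada L.", timestamp=2)
table.put("user#7", "info:name", "Grace", timestamp=1)
latest = table.get("user#42", "info:name")
older = table.get("user#42", "info:name", timestamp=1)
```

Random reads and writes address a single row key directly, while `scan` relies on rows being kept in key order; a real cluster would additionally split that key range across servers.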

http://hbase.apache.org/

HBase An article from Wikipedia, the free encyclopedia. HBase is a distributed, non-relational database management system written in Java that provides structured storage for large tables. HBase is inspired by Google's publications on BigTable. Like BigTable, it is a column-oriented database. HBase is a subproject of Hadoop, a framework for distributed architectures. HBase is generally installed on top of Hadoop's HDFS file system to make distribution easier, although this is not mandatory.

Understanding HBase and BigTable - Jimbojw.com The hardest part about learning HBase (the open source implementation of Google's BigTable) is just wrapping your mind around the concept of what it actually is. I find it rather unfortunate that these two great systems contain the words table and base in their names, which tend to cause confusion among RDBMS-indoctrinated individuals (like myself). This article aims to describe these distributed data storage systems from a conceptual standpoint. After reading it, you should be better able to make an educated decision about when you might want to use HBase and when you'd be better off with a "traditional" database.

Jetspeed 2 - Jetspeed 2 Home Page Jetspeed is an Open Portal Platform and Enterprise Information Portal, written entirely in open source under the Apache license in Java and XML and based on open standards. All access to the portal is managed through a robust portal security policy. Within a Jetspeed portal, individual portlets can be aggregated to create a page. Each portlet is an independent application with Jetspeed acting as the central hub making information from multiple sources available in an easy to use manner. Jetspeed has been fully conformant to the Java Portlet 2.0 Standard since release 2.2.0 in May 2009. All releases prior, such as the 2.1.x releases, are conformant to the first Java Portlet Specification, the Java Portlet 1.0 Standard.

22 free tools for data visualization and analysis You may not think you've got much in common with an investigative journalist or an academic medical researcher. But if you're trying to extract useful information from an ever-increasing inflow of data, you'll likely find visualization useful -- whether it's to show patterns or trends with graphics instead of mountains of text, or to try to explain complex issues to a nontechnical audience. There are many tools around to help turn data into graphics, but they can carry hefty price tags. The cost can make sense for professionals whose primary job is to find meaning in mountains of information, but you might not be able to justify such an expense if you or your users only need a graphics application from time to time, or if your budget for new tools is somewhat limited.

Hive!

Bigtable: Google's distributed database system Crawling the entire Web as Google does and indexing it, in order to serve more than a billion queries per day, requires a data access system that can find a piece of information quickly within a very large volume of data. As an additional constraint, response times must be very short so as not to try users' patience. Traditional RDBMSs are no longer sufficient for such needs: response times become too long at such volumes (we are talking about petabytes here).

Apache Hadoop!

MySQL Engines: InnoDB vs. MyISAM – A Comparison of Pros and Cons The two major table storage engines for MySQL databases are InnoDB and MyISAM. To summarize the differences in features and performance: InnoDB is newer while MyISAM is older. InnoDB is more complex while MyISAM is simpler. InnoDB is stricter about data integrity while MyISAM is looser. InnoDB uses row-level locking for inserts and updates while MyISAM uses table-level locking. InnoDB has transactions while MyISAM does not. InnoDB has foreign keys and relationship constraints while MyISAM does not. InnoDB has better crash recovery while MyISAM is poor at recovering data integrity after system crashes. MyISAM has a full-text search index, which InnoDB lacked until MySQL 5.6. In light of these differences, InnoDB and MyISAM each have their own advantages and disadvantages, and each is more suitable than the other in some scenarios.

Big Data Is As Misunderstood As Twitter Was Back In 2008 Boonsri Dickinson, Business Insider In 2008, when Howard Lindzon started StockTwits, no one knew what Twitter was. Obviously, that has changed. Now that Twitter is more of a mainstream communication channel, Lindzon has figured out the secret to getting past all the noise on Twitter. By using human curation, StockTwits can serve up relevant social media content to major players like MSN Money.

Sqoop

Cordova About Apache Cordova™ Apache Cordova is a set of device APIs that allow a mobile app developer to access native device functions such as the camera or accelerometer from JavaScript. Combined with a UI framework such as jQuery Mobile, Dojo Mobile, or Sencha Touch, this allows a smartphone app to be developed with just HTML, CSS, and JavaScript. When using the Cordova APIs, an app can be built without any native code (Java, Objective-C, etc.) from the app developer. Instead, web technologies are used, and they are hosted locally in the app itself (generally not on a remote HTTP server).

Semantic Bits
What Machine Learning Can Do And What Machine Learning Cannot Do Posted by Stacy on 24-08-2018
How Do Machines Learn An Introduction to Machine Learning Posted by Stacy on 14-08-2018
Deep learning based Object Detection and Instance Segmentation using Mask R-CNN in OpenCV (Python / C++) A few weeks back we wrote a post on Object detection using YOLOv3. The output of an object detector is an array of bounding boxes around objects detected in the image or video frame, but we do not get any clue about the shape of the object inside the bounding box. Wouldn’t it be cool if we could find a binary mask containing the object instead of just the bounding box? In this post, we will learn how to do just that. We will show how to use a Convolutional Neural Network (CNN) model called Mask-RCNN (Region based Convolutional Neural Network) for object detection and segmentation.

A distributed, scalable, big data store with random, real time read/write access. by sergeykucherov Jul 15

Related: Column Oriented, Hadoop Tools