Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time.
Download Titan or clone from GitHub. Maven dependency:
    <dependency>
      <groupId>com.thinkaurelius.titan</groupId>
      <artifactId>titan-core</artifactId>
      <version>1.0.0</version>
    </dependency>
Continue with the Getting Started with Titan guide for a step-by-step introduction.

The Benefits of Titan · thinkaurelius/titan Wiki
Titan is designed to support the processing of graphs so large that they require storage and computational capacities beyond what a single machine can provide. This is Titan’s foundational benefit. This section discusses the specific benefits of Titan and of its supported persistence solutions.
General Titan Benefits: Support for very large graphs. Titan graphs scale with the number of machines in the cluster.
Benefits of Titan with Cassandra: Continuously available with no single point of failure.
Benefits of Titan with HBase: Tight integration with the Hadoop ecosystem.
Titan and the CAP Theorem: When using a database, the CAP theorem should be thoroughly considered (C=Consistency, A=Availability, P=Partition tolerance). “Despite your best efforts, your system will experience enough faults that it will have to make a choice between reducing yield (i.e., stop answering requests) and reducing harvest (i.e., giving answers based on incomplete data).”
Titan In-Memory
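The yield/harvest trade-off quoted in the CAP discussion can be made concrete with a little arithmetic. The sketch below is plain Python, not anything from Titan; the `QueryStats` class, its field names, and the numbers (1000 requests, 4 shards with one unreachable) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    """Hypothetical counters for one observation window."""
    requests_received: int  # requests sent to the system
    requests_answered: int  # requests that received any answer
    shards_total: int       # shards a complete answer would draw on
    shards_reachable: int   # shards actually consulted per answer


def yield_fraction(stats: QueryStats) -> float:
    """Yield: share of requests that were answered at all."""
    return stats.requests_answered / stats.requests_received


def harvest_fraction(stats: QueryStats) -> float:
    """Harvest: share of the data reflected in each answer."""
    return stats.shards_reachable / stats.shards_total


# One of four shards is unreachable. Two ways to react:
# (a) reject the 250 requests that need the missing shard (lower yield),
reduce_yield = QueryStats(1000, 750, shards_total=4, shards_reachable=4)
# (b) answer everything from the three shards that remain (lower harvest).
reduce_harvest = QueryStats(1000, 1000, shards_total=4, shards_reachable=3)

for name, stats in [("reduce yield", reduce_yield), ("reduce harvest", reduce_harvest)]:
    print(name, yield_fraction(stats), harvest_fraction(stats))
```

Neither choice is free: option (a) keeps every answer complete but turns users away, while option (b) keeps the system responsive at the cost of partial answers.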

DEX high-performance graph database
TinkerPop
Graph database
Graph databases are part of the NoSQL databases created to address the limitations of the existing relational databases. While the graph model explicitly lays out the dependencies between nodes of data, the relational model and other NoSQL database models link the data by implicit connections. Graph databases, by design, allow simple and fast retrieval of complex hierarchical structures that are difficult to model in relational systems. Graph databases differ from graph compute engines.
Background: Graph databases, on the other hand, portray the data as it is viewed conceptually.
Graph: Graph databases employ nodes, properties, and edges. A graph within graph databases is based on graph theory. Nodes represent entities or instances such as people, businesses, accounts, or any other item to be tracked.
Graph models
Labeled-property graph: A labeled-property graph model is represented by a set of nodes, relationships, properties, and labels.
Graph types
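As a rough illustration of the labeled-property graph model described above (nodes, relationships, properties, and labels), here is a minimal in-memory sketch in plain Python. It is not the API of any particular graph database; the class names, the `WORKS_AT` relationship type, and the sample data are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set                                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # key/value pairs


@dataclass
class Relationship:
    start: int                                       # id of the source node
    end: int                                         # id of the target node
    rel_type: str                                    # e.g. "WORKS_AT"
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy labeled-property graph: a dict of nodes plus a list of relationships."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def add_relationship(self, start, end, rel_type, **properties):
        # Relationships are explicit records, not implicit foreign keys.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.relationships.append(Relationship(start, end, rel_type, properties))

    def neighbors(self, node_id, rel_type):
        """Nodes reached from node_id over outgoing relationships of rel_type."""
        return [self.nodes[r.end] for r in self.relationships
                if r.start == node_id and r.rel_type == rel_type]


graph = PropertyGraph()
graph.add_node(1, ["Person"], name="Alice")
graph.add_node(2, ["Business"], name="Acme")
graph.add_relationship(1, 2, "WORKS_AT", since=2015)

employers = [n.properties["name"] for n in graph.neighbors(1, "WORKS_AT")]
```

The list scan in `neighbors` is deliberately naive; the sketch only shows the data model and makes no attempt at indexing.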

AllegroGraph RDFStore Web 3.0's Database
Geospatial and Temporal Reasoning: AllegroGraph stores geospatial and temporal data types as native data structures. Combined with its indexing and range query mechanisms, AllegroGraph lets you perform geospatial and temporal reasoning efficiently.
Social Networking Analysis: AllegroGraph includes an SNA library that treats a triple-store as a graph of relations, with functions for measuring importance and centrality as well as several families of search functions.
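To show what "treating a triple-store as a graph of relations" can look like, the following sketch computes degree centrality over a list of (subject, predicate, object) triples in plain Python. It does not use AllegroGraph or its SNA library; the `degree_centrality` function, the `knows` predicate, and the sample triples are hypothetical.

```python
from collections import defaultdict


def degree_centrality(triples, predicate):
    """Degree centrality over the undirected graph induced by one predicate.

    Each matching triple (s, predicate, o) becomes an edge between s and o.
    A node's score is its number of distinct neighbors divided by (n - 1),
    where n is the number of nodes that appear in at least one such edge.
    """
    neighbors = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate and subject != obj:
            neighbors[subject].add(obj)
            neighbors[obj].add(subject)
    n = len(neighbors)
    if n < 2:
        return {}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("alice", "knows", "dave"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),  # ignored: not a "knows" relation
]

centrality = degree_centrality(triples, "knows")
most_central = max(centrality, key=centrality.get)
```

Degree centrality is only the simplest of the importance measures such a library might offer; it is used here because it needs nothing beyond counting neighbors.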

New Apache project will Drill big data in near real time
August 16, 2012, 3:02 PM
Working with big data is a lot like dealing with the Heisenberg Uncertainty Principle: either you're going to have a massive amount of data on hand or you're going to be able to query that data in real time--never both. But now a new open source project has just been accepted as an Apache Software Foundation Incubation project that will let you do both: have your data and search it fast, too. Apache Drill is an ad-hoc query system based on Dremel, a big data system that, like the MapReduce design behind Hadoop, was invented by Google engineers to not only manage large datasets but also perform interactive analysis in near real time. To explain Drill, you can first examine the architecture of Hadoop, which uses the Hadoop distributed file system (HDFS) for storage and the MapReduce framework to perform batch analysis on whatever data is stored within Hadoop. Because Hadoop uses MapReduce to perform data queries, searches have to be done in batches.
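The batch nature of MapReduce mentioned above is easier to see in a toy version. The sketch below imitates the map, shuffle, and reduce phases in a single Python process; it is not Hadoop's API, and the function names and the sample log records are assumptions for illustration only.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record; each call may emit (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into one result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    # The whole input is scanned before any result is available: a batch job.
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


def status_mapper(line):
    # Emit (status_code, 1) for a log line shaped like "GET /path 200".
    yield line.split()[-1], 1


def count_reducer(key, values):
    return sum(values)


log_lines = ["GET /a 200", "GET /b 404", "GET /c 200"]
status_counts = run_job(log_lines, status_mapper, count_reducer)
```

Every new question means another full pass over the stored records, which is the latency that interactive systems in the Dremel style aim to avoid.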

COMBINATORIAL_BLAS: Combinatorial BLAS Library (MPI reference implementation)
Authors: Aydın Buluç, John R. Gilbert, Adam Lugowski
This material is based upon work supported by the National Science Foundation under Grant No. 0709385. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
The Combinatorial BLAS is an extensible distributed-memory parallel graph library offering a small but powerful set of linear algebra primitives specifically targeting graph analytics. The Combinatorial BLAS is also the backend of the Python Knowledge Discovery Toolbox (KDT).
Download. Read release notes.
Requirements: You need a recent C++ compiler (gcc version 4.4+, Intel version 11.0+ and compatible), a compliant MPI implementation, and a C++11 standard library (the libstdc++ that comes with g++ qualifies).
Documentation: This is a reference implementation of the Combinatorial BLAS Library in C++/MPI.
New in version 1.3: Some features it uses:
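The idea of expressing graph analytics through linear algebra primitives can be sketched with breadth-first search written as a repeated sparse vector-matrix product over a Boolean (OR, AND) semiring. The code below is single-process Python with dictionaries standing in for sparse structures; it is not the Combinatorial BLAS API, and `spmv`, `bfs_levels`, and the sample graph are hypothetical names and data.

```python
def spmv(matrix, vector, add, multiply):
    """Sparse vector-matrix product over a user-supplied semiring.

    matrix[i][j] is the stored entry for edge i -> j, vector[i] a stored entry.
    result[j] combines multiply(vector[i], matrix[i][j]) over all stored i with add.
    """
    result = {}
    for i, x in vector.items():
        for j, a in matrix.get(i, {}).items():
            term = multiply(x, a)
            result[j] = add(result[j], term) if j in result else term
    return result


def bfs_levels(matrix, source):
    """Breadth-first levels, one semiring product per level."""
    levels = {source: 0}
    frontier = {source: True}
    depth = 0
    while frontier:
        depth += 1
        reached = spmv(matrix, frontier,
                       add=lambda a, b: a or b,        # OR plays the role of +
                       multiply=lambda x, a: x and a)  # AND plays the role of *
        # Mask out vertices that already have a level.
        frontier = {j: True for j, hit in reached.items() if hit and j not in levels}
        for j in frontier:
            levels[j] = depth
    return levels


# Adjacency "matrix" of a small directed graph: 0->1, 0->2, 1->3, 2->3, 3->4.
adjacency = {0: {1: True, 2: True}, 1: {3: True}, 2: {3: True}, 3: {4: True}}
levels = bfs_levels(adjacency, 0)
```

Swapping in a different semiring, for example (min, +), would turn the same product into a shortest-path relaxation step, which is the appeal of building on a small set of primitives.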

For fast, interactive Hadoop queries, Drill may be the answer - Cloud Computing News
twitter/cassovary
Getting Started · thinkaurelius/titan Wiki
The Graph of the Gods: The examples in this section make extensive use of a toy graph distributed with Titan called The Graph of the Gods. This graph is diagrammed below.
    bold key: a graph indexed key.
    bold key with star: a graph indexed key that must have a unique value.
    underlined key: a vertex-centric indexed key.
    hollow-head edge: a functional/unique edge (no duplicates).
    tail-crossed edge: a unidirectional edge (can only traverse in one direction).
Downloading Titan and Running the Gremlin Shell: Unbeknownst to the Gods, there still lived one Titan. Titan can be downloaded from the Downloads section of the project repository (get the titan-all version).
    $ unzip
    Archive:
      creating: titan/
    ...
    $ cd titan
    $ bin/
             \,,,/
             (o o)
    -----oOOo-(_)-oOOo-----
    gremlin>
The Gremlin terminal is a Groovy shell. NOTE: Please refer to GremlinDocs for an easy-to-use Gremlin reference.
Loading Data Into Titan: The example below will load The Graph of the Gods dataset diagrammed above into Titan.
Pluto’s Brothers
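A minimal sketch of the kind of traversal this guide builds toward, written in plain Python rather than Gremlin. The edge list is a small subset of The Graph of the Gods typed in by hand as an assumption, not loaded from Titan, and the `out` helper is a hypothetical stand-in for following outgoing edges by label.

```python
# Subset of The Graph of the Gods as (source, label, target) edges.
# Assumption: hand-written for illustration, not read from a Titan instance.
EDGES = [
    ("hercules", "father", "jupiter"),
    ("jupiter", "father", "saturn"),
    ("jupiter", "brother", "neptune"),
    ("jupiter", "brother", "pluto"),
    ("neptune", "brother", "jupiter"),
    ("neptune", "brother", "pluto"),
    ("pluto", "brother", "jupiter"),
    ("pluto", "brother", "neptune"),
]


def out(vertices, label):
    """Follow outgoing edges with the given label from each vertex in `vertices`."""
    return [target
            for vertex in vertices
            for (source, edge_label, target) in EDGES
            if source == vertex and edge_label == label]


# Two "father" hops from hercules reach his grandfather.
grandfather = out(out(["hercules"], "father"), "father")

# One "brother" hop from pluto reaches his brothers.
brothers = sorted(out(["pluto"], "brother"))
```

Chaining `out` calls mirrors how a traversal language composes steps: each step takes the vertices produced by the previous one and moves along edges with a given label.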

Welcome To Apache Giraph
Titan: Big Graph Data with Cassandra
pregel_paper
Marko A. Rodriguez