


KDM - Welcome. Hadoop: How to Update without Update. NoSQL Data Modeling Techniques. NoSQL databases are often compared by various non-functional criteria, such as scalability, performance, and consistency.

NoSQL Data Modeling Techniques

This aspect of NoSQL is well-studied both in practice and theory because specific non-functional properties are often the main justification for NoSQL usage and fundamental results on distributed systems like the CAP theorem apply well to NoSQL systems. At the same time, NoSQL data modeling is not so well studied and lacks the systematic theory found in relational databases. In this article I provide a short comparison of NoSQL system families from the data modeling point of view and digest several common modeling techniques. I would like to thank Daniel Kirkdorffer who reviewed the article and cleaned up the grammar. To explore data modeling techniques, we have to start with a more or less systematic view of NoSQL data models that preferably reveals trends and interconnections.

CQL-2.2. CQL Syntax Preamble This document describes the Cassandra Query Language (CQL) version 3.


CQL v3 is not backward compatible with CQL v2 and differs from it in numerous ways. Note that this document describes the latest version of the language. However, the changes section provides the diff between the different versions of CQL v3. CQL v3 offers a model very close to SQL in the sense that data is put in tables containing rows of columns. Conventions To aid in specifying the CQL syntax, we will use the following conventions in this document: Apache spark + cassandra: Basic steps to install and configure cassandra and use it with apache spark with example.

To build an application using Apache Spark and Cassandra, you can use the DataStax spark-cassandra-connector to communicate with Spark.

Apache spark + cassandra: Basic steps to install and configure cassandra and use it with apache spark with example

Before communicating with Spark through the connector, we should know how to configure Cassandra. The following are the prerequisites for running the example smoothly. The following steps install and configure Cassandra. 142234894284612sparkconnectorindepth.pdf. TinkerPop3 Documentation. TinkerPop0 Gremlin came to realization.

TinkerPop3 Documentation

The more he realized, the more ideas he created. The more ideas he created, the more they related. Into a concatenation of that which he accepted wholeheartedly and that which perhaps may ultimately come to be through concerted will, a world took form which was seemingly separate from his own realization of it. However, the world birthed could not bear its own weight without the logic Gremlin had come to accept — the logic of left is not right, up not down, and west far from east unless one goes the other way.

TinkerPop1 What is The TinkerPop? "If I haven't found it, it is not here and now." Upon their realization of existence, the machines turned to their machine elf creator and asked: Cassandra Migration To EC2. In January we migrated our entire infrastructure from dedicated servers in Germany to EC2 in the US.

Cassandra Migration To EC2

The migration included a wide variety of components, web workers, background task workers, RabbitMQ, Postgresql, Redis, Memcached and our Cassandra cluster. Cassandra Modeling Kata - Allegro Open Source. What is Apache Cassandra?

Cassandra Modeling Kata - Allegro Open Source

Apache Cassandra is an open source, distributed, high performance, cloud-friendly NoSQL database offering high availability with its masterless and no single-point-of-failure architecture. If you are new to Apache Cassandra and NoSQL, I highly recommend reading these great introductions: CQL. Cassandra – Setting up a cluster in EC2. This post is mainly a compilation of different sources and what I came up with while creating my first Cassandra Cluster.

Cassandra – Setting up a cluster in EC2

Hope this helps. Kudos to. The cassandra.yaml configuration file. The cassandra.yaml file is the main configuration file for Cassandra.

The cassandra.yaml configuration file

Important: After changing properties in the cassandra.yaml file, you must restart the node for the changes to take effect. The file is located in the following directories: Cassandra package installations: /etc/cassandra/conf; Cassandra tarball installations: /conf; DataStax Enterprise package installations: /etc/dse/cassandra; DataStax Enterprise tarball installations: /resources/cassandra/conf. The configuration properties are grouped into the following sections: Quick start: The minimal properties needed for configuring a cluster.

Note: Values with note indicate default values that are defined internally, missing, commented out, or implementation depends on other properties in the cassandra.yaml file. Quick start properties The minimal properties needed for configuring a cluster. Cassandra – Setting up a cluster in EC2. DataStax CQL 3.0 Documentation. Start the CQL interactive terminal.

DataStax CQL 3.0 Documentation

Synopsis $ cqlsh [options] [host [port]] $ python cqlsh [options] [host [port]] Description. Getting started with cassandra. CQL : Cassandra Query Language. This article was created by Manisha Sethi.
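As a rough illustration of the cqlsh synopsis quoted above (`cqlsh [options] [host [port]]`), the following Python sketch assembles an argument list in that order. The helper name `build_cqlsh_command` and its parameters are hypothetical and are not part of cqlsh or of the DataStax documentation. The only thing taken from the synopsis is the ordering, and the rule that a port can appear only after a host.

```python
def build_cqlsh_command(host=None, port=None, options=()):
    """Return an argv list shaped like: cqlsh [options] [host [port]].

    Hypothetical helper for illustration; option strings are passed
    through unchanged and are not validated against real cqlsh flags.
    """
    # In the synopsis, [port] is nested inside [host ...], so a port
    # without a host is not a valid invocation.
    if port is not None and host is None:
        raise ValueError("a port can only be given together with a host")

    argv = ["cqlsh", *options]
    if host is not None:
        argv.append(host)
        if port is not None:
            argv.append(str(port))
    return argv
```

The resulting list could be handed to `subprocess.run` on a machine where cqlsh is installed. For example, `build_cqlsh_command("10.0.0.1", 9042)` yields `["cqlsh", "10.0.0.1", "9042"]`, where the host and port are placeholder values.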

CQL : Cassandra Query Language

To view the original article, and more postings by Manisha, visit the Xebia blog. Many of you must have heard of or worked with Cassandra, the columnar NoSQL database. The term NoSQL gives most of us the impression that it is not much like an RDBMS when it comes to the ease of use of the query language. But Cassandra 2.0 provides better query language support through CQL3. CQL 3.0 borrows many features from SQL, such as table creation, the ORDER BY clause, and many more. First Steps with Titan using Rexster and Scala. Titan is a distributed graph database that runs on top of Cassandra or HBase to achieve both massive data scale and fast graph traversal queries. There are benefits to Titan on only a single server and it seamlessly scales up from there. It’s great to know that Titan scales but when first starting out you may just need it on a single server, either for local development or powering a small production application.

However, there are so many Titan deployment options and associated tools & technologies that it can be difficult to know where to get started. This post assumes the standard architecture for a web application: a database running on a server that is remote from where the application runs. Therefore the application needs to communicate with the database over the network. Since Scala is the language of choice at Pongr, we’ll also be writing code in Scala and managing the project with sbt. We will accomplish the goals above using the following approach: Visualizing a Titan Graph Database. Visualizing a Titan Graph Database Build powerful and sophisticated visualization applications for your Titan graph database Visualizing Titan Databases with KeyLines KeyLines is a fully-featured Software Development Kit for building graph visualization software.

It’s flexible enough to be compatible with any graph database, but is an especially good fit with Titan. Titan is an open source distributed graph database, available under subscription license terms from the Aurelius team, led by Marko Rodriguez. Using distributed multi-machine clusters, Titan can store and efficiently traverse huge graphs containing billions of nodes and links. In addition, a flexible event model means KeyLines essentially becomes a visual query engine, allowing users to dynamically query the database in a visual way.

Applying Graph Theory and Network Science. Powers of Ten – Part II June 2, 2014 by Stephen Mallette “‘Curiouser and curiouser!’ Cried Alice (she was so much surprised, that for the moment she quite forgot how to speak good English); ‘now I’m opening out like the largest telescope that ever was!” — Lewis Carroll – Alice’s Adventures in Wonderland This article represents the second installment in the two part Powers of Ten series that discusses bulk loading data into Titan at varying scales. Part I. Introduction. Using Cassandra · thinkaurelius/titan Wiki. This is the documentation for Titan 0.4. Documentation for the latest Titan version is available at The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.

How to install Cassandra on Ubuntu? Building a flexible, Real-time Big Data Applications Platform on Cassandra - Clint Kelly. Using Cassandra · thinkaurelius/titan Wiki. Java - Cassandra-cli cant connect to remote cassandra server. Apache Cassandra remote access. Installing, Configuring, and Testing Apache Cassandra.