The Apache Cassandra Project

Welcome to Apache Cassandra

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages. Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, strong support for denormalization and materialized views, and powerful built-in caching.


Berkeley DB Java Edition

Oracle Berkeley DB Java Edition is an open source, embeddable, transactional storage engine written entirely in Java. It takes full advantage of the Java environment to simplify development and deployment. The architecture of Oracle Berkeley DB Java Edition supports very high performance and concurrency for both read-intensive and write-intensive workloads.

MySQL Community Edition

MySQL Community Edition is the freely downloadable version of the world's most popular open source database. It is available under the GPL license and is supported by a huge and active community of open source developers. MySQL Community Edition is available on over 20 platforms and operating systems, including Linux, Unix, Mac and Windows.

Bulk Loading into Cassandra with Hadoop

To bulk load data into Cassandra using Hadoop, Cassandra introduces a new OutputFormat, BulkOutputFormat. It is implemented so that each map or reduce task (depending on the implementation) generates SSTables from the data provided and then streams them to Cassandra with sstableloader. You do not need to know all of these implementation details to use BulkOutputFormat; all you need is some job configuration and the basic Thrift calls to create columns and mutations.

The post covers the initial setup for writing a Hadoop job with BulkOutputFormat, the Cassandra-related configuration, and using BulkOutputFormat.

To start development, you need all the jars from Cassandra-1.1.x on the classpath. All the Hadoop-related jars should be on the classpath as well. To execute the job, all the Cassandra jars must also be on Hadoop's classpath.

Titan: Distributed Graph Database

Titan is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time. Download Titan or clone from GitHub. Read the Titan documentation and join the mailing list.

    <dependency>
      <groupId>com.thinkaurelius.titan</groupId>
      <artifactId>titan-core</artifactId>
      <version>1.0.0</version>
    </dependency>

PostgreSQL: About

PostgreSQL is a powerful, open source object-relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness. It runs on all major operating systems, including Linux, UNIX (AIX, BSD, HP-UX, SGI IRIX, Mac OS X, Solaris, Tru64), and Windows.

Representing time dependent graphs in Neo4j · SocioPatterns/neo4j-dynagraph Wiki

Background

Large-scale data collection efforts using wearable sensors to mine for proximity of individuals (for example, the SocioPatterns project) produce time-varying social graphs, where nodes are individuals, edges represent proximity/contact relations between individuals, and the proximity graph changes over time. Both nodes and edges can have rich attributes. Data formats for exchanging time-dependent graphs are available; see, for instance, the GEXF format.

Get to know Firebird in two minutes

Introduction

If you are reading this paper, this is probably your first encounter with the Firebird RDBMS. This paper will present to you the main features of the Firebird database. At the end, I am sure you will be anxious to download its lightweight installer and try it out yourself.

History

Advanced Time Series with Cassandra

Cassandra is an excellent fit for time series data, and it’s widely used for storing many types of data that follow the time series pattern: performance metrics, fleet tracking, sensor data, logs, financial data (pricing and ratings histories), user activity, and so on. A great introduction to this topic is Kelley Reynolds’ Basic Time Series with Cassandra. If you haven’t read that yet, I highly recommend starting with it. This post builds on that material, covering a few more details, corner cases, and advanced techniques.

Indexes vs Materialized Views
