Home · tinkerpop/frames Wiki
Neo4j -- or why graph dbs kick ass
JDBC access with distributed ruby : Ewout blogt. JDBC (Java Database Connectivity) is the standard database driver interface on the Java platform. Since Java is ubiquitous, most database vendors provide JDBC drivers to access their data. When a Ruby application has to use a legacy data source, sometimes the only option is to go through JDBC. The database toolkit Sequel can use JDBC data sources, but only when running on JRuby. Although JRuby is compatible with Ruby 1.8.7, not every application can run on it, especially one that depends on gems that define C extensions. Fortunately, distributed Ruby (DRb) exists. Both the server-side and the client-side code are pretty straightforward.
    # server side
    require 'drb'
    require 'java'
    require 'rubygems'
    require 'sequel'
    DRb.start_service ' Sequel
    # It might be needed to instantiate the driver here,
    # so it is available when a connection string is given.
NativeException
Inside the JDBC driver, native Java exceptions can be raised.
    DRb::DRbUnknownError: NativeException
Home · tinkerpop/pipes Wiki Pipes is a dataflow framework using process graphs. A process graph is composed of Pipe vertices connected by communication edges. A Pipe implements a simple computational step that can be composed with other Pipe objects to create a larger computation. Such data flow graphs allow for the splitting, merging, looping, and in general, the transformation of data from input to output. There are numerous Pipe classes that come with the main Pipes distribution. Once a good understanding of each Pipe is accomplished, using the framework is straightforward. Please join the Gremlin users group at for all TinkerPop related discussions. Pipes JavaDoc: 2.5.0 – 2.4.0 – 2.3.0 – 2.2.0 – 2.1.0 – 2.0.0 – 1.0 – 0.9 – 0.8 – 0.7 – 0.6 – 0.5 – 0.4 – 0.3 – 0.2 – 0.1 Pipes WikiDocs: 2.5.0 – 2.4.0 – 2.3.0 – 2.2.0 – 2.1.0 – 2.0.0 <dependency><groupId>com.tinkerpop</groupId><artifactId>pipes</artifactId><version>2.5.0</version></dependency>
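The idea of composing simple steps into a larger computation can be sketched outside Java as well. The following is a minimal illustration using Python generators; it is not the Pipes API, and the names `transform_pipe`, `filter_pipe`, and `compose` are hypothetical helpers invented for this example.

```python
from typing import Callable, Iterable, Iterator


def transform_pipe(fn: Callable) -> Callable[[Iterable], Iterator]:
    """Build a step that lazily applies fn to every item flowing through."""
    def pipe(items: Iterable) -> Iterator:
        for item in items:
            yield fn(item)
    return pipe


def filter_pipe(predicate: Callable) -> Callable[[Iterable], Iterator]:
    """Build a step that lazily keeps only items satisfying predicate."""
    def pipe(items: Iterable) -> Iterator:
        for item in items:
            if predicate(item):
                yield item
    return pipe


def compose(*pipes: Callable) -> Callable[[Iterable], Iterator]:
    """Chain steps so the output of each one feeds the next."""
    def pipeline(items: Iterable) -> Iterator:
        stream = iter(items)
        for pipe in pipes:
            stream = pipe(stream)
        return stream
    return pipeline


# Keep the even numbers, then square them.
pipeline = compose(
    filter_pipe(lambda n: n % 2 == 0),
    transform_pipe(lambda n: n * n),
)
result = list(pipeline(range(1, 7)))
print(result)
```

Because every step is a generator, nothing is computed until the final `list(...)` pulls items through the chain, which mirrors the lazy, step-by-step flow described above.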
Neo4j - a Graph Database that Kicks Buttox Update: Social networks in the database: using a graph database. A nice post on representing, traversing, and performing other common social network operations using a graph database. If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. For those of more modest means, Neo4j, a graph database, is a good alternative. A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. A graph looks something like: For more lovely examples take a look at the Graph Image Gallery. Here's a good summary by Emil Eifrem, founder of Neo4j, making the case for why graph databases rule: Most applications today handle data that is deeply associative, i.e. structured as graphs (networks). So relational databases can't handle complex relationships.
Neo4j's Key Characteristics
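To make the nodes-and-edges definition concrete, here is a minimal in-memory sketch in Python. It is not Neo4j's API: the `Graph` class, the `KNOWS` label, and the `friends_of_friends` helper are all assumptions made up for illustration.

```python
from collections import defaultdict


class Graph:
    """A tiny directed graph: nodes are names, edges carry a relationship label."""

    def __init__(self):
        self.nodes = set()
        self._out = defaultdict(list)  # source -> [(label, target), ...]

    def add_edge(self, source, label, target):
        self.nodes.update((source, target))
        self._out[source].append((label, target))

    def neighbors(self, node, label=None):
        """Targets reachable from node in one hop, optionally filtered by label."""
        return [t for (l, t) in self._out.get(node, []) if label is None or l == label]


def friends_of_friends(graph, person):
    """People two KNOWS hops away who are not already direct friends."""
    direct = set(graph.neighbors(person, "KNOWS"))
    found = set()
    for friend in direct:
        for candidate in graph.neighbors(friend, "KNOWS"):
            if candidate != person and candidate not in direct:
                found.add(candidate)
    return found


g = Graph()
g.add_edge("alice", "KNOWS", "bob")
g.add_edge("alice", "KNOWS", "dave")
g.add_edge("bob", "KNOWS", "carol")
g.add_edge("bob", "KNOWS", "alice")
g.add_edge("dave", "KNOWS", "carol")
print(friends_of_friends(g, "alice"))
```

The traversal only follows edges from the nodes it has already reached, which is the kind of relationship-hopping query that the excerpt argues is natural in a graph model.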
rubyrep: Home
Home · tinkerpop/furnace Wiki
How Twitter Uses NoSQL InfoQ has released a video of Twitter's Kevin Weil speaking at Strange Loop earlier this year on how the company uses NoSQL. Weil is quick to point out that Twitter is heavily dependent on MySQL. However, Twitter does employ NoSQL solutions for many purposes for which MySQL isn't ideal.
Scribe
Syslog stopped scaling for Twitter after a while, so instead it uses Scribe, a log collection framework created and open-sourced by Facebook. Twitter uses Scribe to write logs to Hadoop.
Hadoop
Twitter needs to store more data per day than it can reliably write to a single hard drive, so it needs to store data on clusters. Weil says MySQL isn't efficient at doing analytics at the scale Twitter needs.
Pig
This Pig script finds the top five pages of your site visited by people aged 18 to 25. Weil says the most natural way to "talk to" Hadoop is through Java.
HBase
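The Pig script itself is not reproduced in this excerpt. As a rough single-machine illustration of the same query (filter users by age, join with page visits, count, take the top five), here is a Python sketch; the data layout and the `top_pages` function are assumptions for this example, not the script shown in the talk.

```python
from collections import Counter


def top_pages(users, visits, min_age=18, max_age=25, n=5):
    """Return the n most visited URLs among users aged min_age..max_age.

    users  -- iterable of (name, age) pairs (hypothetical layout)
    visits -- iterable of (name, url) pairs (hypothetical layout)
    """
    # Filter step: keep only users in the age range.
    eligible = {name for name, age in users if min_age <= age <= max_age}
    # Join and group step: count visits made by eligible users, per URL.
    counts = Counter(url for name, url in visits if name in eligible)
    # Order and limit step: most visited first, at most n results.
    return [url for url, _ in counts.most_common(n)]


users = [("amy", 20), ("bob", 30), ("cat", 25), ("dan", 17)]
visits = [
    ("amy", "/home"), ("amy", "/news"), ("bob", "/home"),
    ("cat", "/home"), ("cat", "/news"), ("cat", "/home"),
    ("dan", "/shop"), ("amy", "/about"),
]
print(top_pages(users, visits))
```

On a cluster the filter, join, group, and sort steps would each be distributed across machines; this sketch only shows the logical shape of the computation.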
Building a Twitter Filter With CherryPy, Redis, and tweetstream
Background
all the code is available at Since reading this post by Simon Willison I've been interested in Redis and have been following its development.
Tools
tweetstream - provides the interface to the Twitter Streaming API
CherryPy - used for handling the web app side, no need for an ORM
Jinja2 - HTML templating
jQuery - for doing the AJAXy stuff and visual effects
redis-py - Python client for Redis
Redis - the "database", look here for the documentation on how to install it
Retrieving tweets
The first thing we need to do is retrieve tweets from the Twitter Streaming API. Here is the code for filter_daemon.py, which, when executed as a script from the command line, will start streaming tweets from Twitter that contain the words "why", "how", "when", "lol", or "feeling"; the tweet must also end in a question mark. The important part of this class is the push method, which will push data onto the tail of a Redis list.
Web App
That's it.
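The code for filter_daemon.py is not included in this excerpt. The sketch below only illustrates the two ideas described: the filter condition and pushing matches onto the tail of a list. It assumes whole-word, case-insensitive keyword matching, and it uses a hypothetical `FakeRedis` stand-in so it runs without a server; a real redis-py client offers an `rpush` method that could be used in its place.

```python
import re

KEYWORDS = ("why", "how", "when", "lol", "feeling")
_WORD_RE = re.compile(r"[a-z']+")


def is_question_tweet(text):
    """True if the text ends in '?' and contains at least one keyword.

    Assumption: keywords match as whole words, case-insensitively.
    """
    stripped = text.strip()
    if not stripped.endswith("?"):
        return False
    words = set(_WORD_RE.findall(stripped.lower()))
    return any(keyword in words for keyword in KEYWORDS)


class FakeRedis:
    """In-memory stand-in that mimics only RPUSH (append to the tail)."""

    def __init__(self):
        self.lists = {}

    def rpush(self, key, *values):
        self.lists.setdefault(key, []).extend(values)
        return len(self.lists[key])


def push_matching(client, key, tweets):
    """Push every matching tweet onto the tail of the list stored at key."""
    for text in tweets:
        if is_question_tweet(text):
            client.rpush(key, text)


client = FakeRedis()
push_matching(client, "tweets", [
    "Why is the sky blue?",
    "lol that was great",
    "How are you feeling today?",
    "Is it Friday yet?",
])
print(client.lists["tweets"])
```

Pushing onto the tail keeps the list in arrival order, so a reader consuming from the head sees the oldest matching tweet first.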
Home · tinkerpop/rexster Wiki Rexster is a graph server that exposes any Blueprints graph through REST and a binary protocol called RexPro. The HTTP web service provides standard low-level GET, POST, PUT, and DELETE methods, a flexible extensions model which allows plug-in-like development for external services (such as ad hoc graph queries through Gremlin), server-side “stored procedures” written in Gremlin, and a browser-based interface called The Dog House. Rexster Console makes it possible to do remote script evaluation against configured graphs inside of a Rexster Server. Rexster Kibbles is a collection of various Rexster server extensions provided by TinkerPop. Please join the Gremlin users group at for all TinkerPop related discussions. Access graphs via the Basic REST API: or through RexPro: Rexster JavaDoc: 2.5.0 – 2.4.0 – 2.3.0 – 2.2.0 – 2.1.0 – 2.0.0 Rexster WikiDoc: 2.5.0 – 2.4.0 – 2.3.0 – 2.2.0 – 2.1.0 – 2.0.0
Why Use a Graph-Oriented Database? | YarcData Suppose you worked for a business analysis software company, and your CEO wanted you to look into the possibility of developing a product that would help investment banks detect insider trading. Further suppose that the CEO wanted you to brief her on your proposed technical approach to insider trading detection, and you’re standing in front of a whiteboard with a marker in your hand (you know that she likes hand-sketched diagrams), and you’ve decided to use a fictionalized version of this story you read in Bloomberg as an example. What would you draw on the whiteboard? I’m thinking it might look something like this: Let’s look at the last three arrows in the diagram. In this story, these represent a chain of causality – this thing happened, which caused this other thing to happen, and so on. There are other situations when it would be natural to draw a sort of diagram like this. Let’s consider a different example. Is there anything inherently graph-oriented about this information?
Finding your soulmate: autocomplete with Redis in Rails 3.1 Back in February the SeatGeek team open sourced a gem they call Soulmate that implements autocomplete using a Redis back end. “Soulmate finishes your sentences” as they say on their GitHub readme page. You can see it in action on SeatGeek.com. Soulmate is very useful: many of us need type-ahead search behavior, and using Redis is a great way to make it fast and snappy. But more importantly, Soulmate is a great example of how to create an index in Redis ahead of time, allowing for very fast lookups later. Take a few minutes to learn how Soulmate works; chances are you’ll be able to use the same approach in your own app with a completely different data set. I’ll get started today by showing you step by step how to set up a new Rails 3.1 app with Soulmate and Redis, using the jQuery UI autocomplete widget.
How Soulmate was intended to work
Here’s a conceptual diagram showing how you would normally use Soulmate: The Soulmate gem contains two components that you interact with directly:
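The general idea of building an index ahead of time so that lookups are cheap later can be sketched as a prefix index. The following Python example uses a plain dictionary in place of Redis and is not Soulmate's actual data layout; `prefixes`, `build_index`, `complete`, and the sample venue names are assumptions made up for illustration.

```python
from collections import defaultdict


def prefixes(word, min_length=2):
    """All prefixes of word with at least min_length characters."""
    word = word.lower()
    return [word[:i] for i in range(min_length, len(word) + 1)]


def build_index(items):
    """Precompute prefix -> set of items, once, before any query arrives."""
    index = defaultdict(set)
    for item in items:
        for word in item.lower().split():
            for prefix in prefixes(word):
                index[prefix].add(item)
    return index


def complete(index, query):
    """Look up a typed prefix; the cost is one dictionary access plus a sort."""
    return sorted(index.get(query.lower(), set()))


index = build_index(["Yankee Stadium", "Yankees vs Red Sox", "Red Rocks"])
print(complete(index, "yan"))
print(complete(index, "red"))
```

All the expensive work (splitting into words and enumerating prefixes) happens at indexing time, so answering a keystroke does not require scanning the data set.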
Home · tinkerpop/blueprints Wiki Blueprints is a collection of interfaces, implementations, ouplementations, and test suites for the property graph data model. Blueprints is analogous to the JDBC, but for graph databases. As such, it provides a common set of interfaces to allow developers to plug-and-play their graph database backend. Moreover, software written atop Blueprints works over all Blueprints-enabled graph databases.
Pipes: A lazy, data flow framework
Gremlin: A graph traversal language
Frames: An object-to-graph mapper
Furnace: A graph algorithms package
Rexster: A graph server
The documentation herein will provide information regarding the use of Blueprints. Please join the Gremlin users group at for all TinkerPop related discussions. Blueprints JavaDoc: 2.4.0 – 2.3.0 – 2.2.0 – 2.1.0 – 2.0.0 – 1.2 – 1.1 – 1.0 – 0.9 – 0.8 – 0.7 – 0.6 – 0.5 – 0.4 – 0.3 – 0.2 – 0.1 Blueprints WikiDoc: 2.4.0 – 2.3.0 – 2.2.0 – 2.1.0 – 2.0.0