
Matches are the New Hotness. How do you help a person without a job find one online?

Matches are the New Hotness

A search screen. A walk in graph databases v1.0.


Meronymy Corporation. SaaS In The Enterprise - Lenny Liebmann - For Unstructured Data, SaaS Matters.

SaaS In The Enterprise - Lenny Liebmann - For Unstructured Data, SaaS Matters

Graph Databases, NOSQL and Neo4j. Introduction Of the many different datamodels, the relational model has been dominating since the 80s, with implementations like Oracle, MySQL and MSSQL - also known as Relational Database Management System (RDBMS).

Graph Databases, NOSQL and Neo4j

Lately, however, in an increasing number of cases the use of relational databases leads to problems, both because of deficits in the modeling of data and because of constraints on horizontal scalability across several servers and large amounts of data. There are two trends that are bringing these problems to the attention of the international software community: Patentula. In the early 2000s BountyQuest attempted to crowdsource prior art discovery.


It failed. We believe it was ahead of its time. Today, the landscape is different. We are overcoming previous obstacles by building non-keyword based tools to implement crowdsourcing in patent analysis effectively. One part of our approach revolves around visualization; the others make up our patent-pending secret sauce. The USPTO’s public dataset on US/International patent applications and grants has been approached in similar ways, and has yet to be fully utilized. Hunting Trolls with Neo4j! Allison Sparrow shared a link to Patentula, a company interested in finding better ways to explore patent data and hunt patent trolls.

Hunting Trolls with Neo4j!

What caught my attention is this quote from the video: What we tried to do with it, is bypass any sort of keyword processing in order to find similar patents. The reason we’ve done this is to avoid the problems encountered by other systems that rely on natural language processing or semantic analysis simply because patents are built to avoid detection by similar keywords…we use network topology (specifically citation network topology) to mine the US patent database in order to predict similar documents. When dealing with a large text dataset, most folks jump right into NLP and semantic analysis; it’s interesting to learn when that’s not such a good idea.
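The quote does not say which topology measure Patentula uses, so the following is only a minimal sketch of the general idea, assuming a simple stand-in: two patents count as similar when they cite many of the same earlier patents (Jaccard overlap of their citation sets). The patent ids, the `citations` data, and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: patent id -> set of patent ids it cites.
citations = {
    "P1": {"A", "B", "C"},
    "P2": {"B", "C", "D"},
    "P3": {"X", "Y"},
}


def jaccard(a, b):
    """Overlap of two citation sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(citations, top_n=3):
    """Rank patent pairs by how much their cited sets overlap, highest first."""
    pairs = [
        (jaccard(citations[p], citations[q]), p, q)
        for p, q in combinations(sorted(citations), 2)
    ]
    # Sort by descending score, then by ids so ties come out in a stable order.
    return sorted(pairs, key=lambda t: (-t[0], t[1], t[2]))[:top_n]


ranked = most_similar(citations)
print(ranked[0])
```

No patent text is read at any point; only the citation links are compared, which is why wording chosen to dodge keyword search would not affect the score.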

Manage Your Data: Data Management: Subject Guides. The MIT Libraries supports the MIT community in the management and curation of research data by providing the following services: Data Management Guide This Data Management and Publishing Guide is a practical self-help guide to the management and curation of research data throughout its life cycle.

Manage Your Data: Data Management: Subject Guides

It provides guidance on a range of topics, including: planning for data management, documentation/metadata, file formats, data organization, data security and backup, citing data, data integration, funder requirements, ethical and legal issues, and sharing and archiving data. Assistance with Creating Data Management Plans. Runaway complexity in Big Data... and a plan to stop it. Getting a Big Neo4j Test Box for Cheap! When embarking on a new Neo4j project, one of the things you have to figure out is where to run it.

Getting a Big Neo4j Test Box for Cheap!

Most of the time the answer is just your laptop. Other times, using Heroku works great. However, if you are at the stage of your testing where you have billions of nodes and relationships, you need something a little bigger. If you are not ready to commit to purchasing a 100k server for testing, then I suggest you borrow one for a short time. You can try to spin up an Amazon EC2 instance, the high memory large ones go up to 60 gigs of RAM.

The best deals, however, are not found on their site. If you search the dedicated hosting offers forum for webnx, you’ll find threads like this: Yeah… that’s 128GB of RAM for $339 a month. Dedicated Hosting Offers. Data Analytics Supercomputer - LexisNexis. LexisNexis® Data Analytics Supercomputer is a powerful processing technology that provides unprecedented capabilities for managing hundreds of terabytes of data—up to a year’s worth of network traffic.

Data Analytics Supercomputer - LexisNexis

Its fast, scalable engines and parallel processing architecture integrate large volumes of disparate data for effective and reliable analysis. Quickly transform data into decisions Data Analytics Supercomputer offers the speed and scalability that conventional processing technologies cannot support. Lawrence Livermore National Laboratory: Contact Us. Address and Numbers Lawrence Livermore National Laboratory 7000 East Ave., Livermore, CA 94550-9234 (Deliveries) P.O.

Lawrence Livermore National Laboratory: Contact Us

Box 808, Livermore, CA 94551-0808 (Mail) Main Operator (925) 422-1100 Fax (925) 422-1370, Fax verification (925) 422-1100 Employment Verification Hot Line (925) 422-9348 Information Line The Public Affairs information line is available to answer questions about the Laboratory's mission, programs, and activities. To reach this service, or for information about news, community relations, or employee communications, call (925) 422-4599. You can also visit the Public Affairs Contacts page for information on contacting a Public Affairs subject matter expert. Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned. On the surface nothing appears more different than soft data and hard raw materials like iron.

Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned

Then isn’t it ironic, in the Alanis Morissette sense, that in this Age of Information, great wealth still lies hidden deep beneath piles of stuff? It's so strange how directly digging for dollars in data parallels the great wealth producing models of the Industrial Revolution. The piles of stuff is the Internet. It takes lots of prospecting to find the right stuff. Mighty web crawling machines tirelessly collect stuff, bringing it into their huge maws, then depositing load after load into rack after rack of distributed file system machines.

Structural Abstractions in Brains and Graphs. A graph database is a software system that persists and represents data as a collection of vertices (i.e. nodes, dots) connected to one another by a collection of edges (i.e. links, lines). These databases are optimized for executing a type of process known as a graph traversal. At various levels of abstraction, both the structure and function of a graph yield a striking similarity to neural systems such as the human brain. It is posited that as graph systems scale to encompass more heterogenous data, a multi-level structural understanding can help facilitate the study of graphs and the engineering of graph systems. Finally, neuroscience may foster a realization and appreciation of the various structural abstractions that exist within the graph. The Network: A Data Structure that Links Domains. What is HyperGraphDB? Recently we’ve seen a lot of activity in the graph database world.

Better understanding the space will help us make smarter decisions, so I’ve decided to reach out to the main players in the market and run a series of interviews about their projects and goals. The first in this series is about HyperGraphDB, and Borislav Iordanov, its creator, has been kind enough to answer my questions. myNoSQL: What is HyperGraphDB? Graph Databases in Document Management. A graph database uses graph structures with nodes, edges, and properties to represent and store data. Compared with relational databases, graph databases are often faster for associative data sets, and map more directly to the structure of object-oriented applications.

They can scale more naturally to large data sets as they do not typically require expensive join operations. As they depend less on a rigid schema, they are more suitable to manage ad-hoc and changing data with evolving schema. Conversely, relational databases are typically faster at performing the same operation on large numbers of data elements.
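The point about avoiding joins can be illustrated with a small traversal. This is a minimal sketch, assuming an in-memory adjacency list rather than any particular graph database; the node names and the `shortest_path` helper are hypothetical. It finds a shortest path between two nodes by breadth-first search, following edges directly instead of joining tables.

```python
from collections import deque

# Hypothetical document-management graph as an undirected adjacency list.
graph = {
    "alice": ["report.doc"],
    "report.doc": ["alice", "project-x"],
    "project-x": ["report.doc", "bob"],
    "bob": ["project-x"],
}


def shortest_path(graph, start, goal):
    """Return a shortest path (fewest edges) as a list of nodes, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in previous:
                continue  # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the back-pointers from goal to start, then reverse.
                path = []
                step = neighbor
                while step is not None:
                    path.append(step)
                    step = previous[step]
                return path[::-1]
            queue.append(neighbor)
    return None  # goal is not reachable from start


path = shortest_path(graph, "alice", "bob")
print(path)
```

Each step of the search only looks at the neighbors of the current node, so the cost depends on the part of the graph that is actually explored rather than on the total size of the data set.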

Graph databases are a powerful tool for graph-like queries, for example computing the shortest path between two nodes in the graph. Hypergraph. Giant Global Graph. Well, it has been a long time since my last post here. So many topics, so little time. Some talks, a couple of Design Issues articles, but no blog posts. To dissipate the worry of expectation of quality, I resolve to lower the bar. More about what I had for breakfast. So The Graph word has been creeping in. Maybe it is because Net and Web have been used. The Net we normally use as short for Internet, which is the International Information Infrastructure. Using Neo4J to load and query OWL ontologies. Directed Edge - Blog - On Building a Stupidly Fast Graph Database.

It’s pretty clear to computer science geeks that Directed Edge is supposed to be doing groovy things with graphs. Multi-Relational Graph Structures: From Algebra to Application. Graph Databases and the Future of Large-Scale Knowledge Management. Breaking into the NoSQL Conversation. Semantic Web Community: I’m disappointed in us! Or at least in our group marketing prowess.

0830 - Cypher and Neo4j.