Databases

Introduction to Databases - Stanford University

About the course: A bold experiment in distributed education, "Introduction to Databases" is being offered free and online to students worldwide, October 10 - December 12, 2011. Students have access to lecture videos, are given assignments and exams, receive regular feedback on progress, and participate in a discussion forum. Those who successfully complete the course will receive a statement of accomplishment. Taught by Professor Jennifer Widom, the curriculum draws from Stanford's popular Introduction to Databases course. A high-speed internet connection is recommended, as the course content is based on videos and online exercises. http://www.db-class.com/
http://www.mpi-inf.mpg.de/~neumann/rdf3x/ RDF-3X is the experimental RDF storage and retrieval system described in Thomas Neumann, Gerhard Weikum. RDF-3X: a RISC-style Engine for RDF. JDMR (formerly Proc.

Thomas Neumann: D5: Databases and Information Systems (Max-Planck-Institut für Informatik)

What is Orient? OrientDB is an open source NoSQL DBMS that combines the features of document and graph DBMSs. It is written in Java and is amazingly fast: it can store up to 150,000 records per second on common hardware. Although it is a document-based database, relationships are managed as in graph databases, with direct connections among records. You can traverse entire trees and graphs of records, or parts of them, in a few milliseconds. It supports schema-less, schema-full and schema-mixed modes.

orient - NoSQL document database light, portable and fast. Supports ACID Tx, Indexes, asynch queries, SQL layer, clustering, etc - Google Project Hosting

http://code.google.com/p/orient/

neo4j open source nosql graph database

http://neo4j.org/ Spring Data Neo4j E-book Available Now!
In software engineering, an entity-relationship model (ER model for short) is an abstract and conceptual representation of data. Entity-relationship modeling is a database modeling method, used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion. http://en.wikipedia.org/wiki/Entity%E2%80%93relationship_model

Entity-relationship model - Wikipedia, the free encyclopedia

http://cassandra.apache.org/

The Apache Cassandra Project

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages. Cassandra's ColumnFamily data model offers the convenience of column indexes with the performance of log-structured updates, strong support for materialized views, and powerful built-in caching. Cassandra is in use at Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco, OpenX, Digg, CloudKick, Ooyala, and more companies that have large, active data sets.
Overview: YAGO2 is a huge semantic knowledge base, derived from Wikipedia, WordNet and GeoNames. Currently, YAGO2 has knowledge of more than 10 million entities (like persons, organizations, cities, etc.) and contains more than 120 million facts about these entities. http://www.mpi-inf.mpg.de/yago-naga/yago/

YAGO-NAGA - D5: Databases and Information Systems (Max-Planck-Institut für Informatik)

Unified Modeling Language (UML) is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed, and was created, by the Object Management Group. http://en.wikipedia.org/wiki/Unified_Modeling_Language

Unified Modeling Language - Wikipedia, the free encyclopedia

NoSQL is What? | Jeremy Zawodny's blog

http://blog.zawodny.com/2011/07/23/nosql-is-what/ I found myself reading NoSQL is a Premature Optimization a few minutes ago and threw up in my mouth a little. That article is so far off base that I’m not even sure where to start, so I guess I’ll go in order. "In fact, I would argue that starting with NoSQL because you think you might someday have enough traffic and scale to warrant it is a premature optimization, and as such, should be avoided by smaller and even medium sized organizations. You will have plenty of time to switch to NoSQL as and if it becomes helpful. Until that time, NoSQL is an expensive distraction you don’t need."

Database Models: Hierarchical, Network, Relational, Object-Oriented, Semistructured, Associative and Context.

Each Column Has a Unique Name. Certain fields may be designated as keys, which means that searches for specific values of that field will use indexing to speed them up. Where fields in two different tables take values from the same set, a join operation can be performed to select related records in the two tables by matching values in those fields. Often, but not always, the fields will have the same name in both tables. For example, an "orders" table might contain (customer-ID, product-code) pairs and a "products" table might contain (product-code, price) pairs, so to calculate a given customer's bill you would join the two tables on their product-code fields and sum the prices of all products ordered by that customer. This can be extended to joining multiple tables on multiple fields. http://unixspace.com/context/databases.html
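
The orders/products join described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the column names (customer_id, product_code), the sample rows, the index name, and the choice to store prices as integer cents are all invented for the example and do not come from the linked page.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database holding the two example tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (product_code TEXT PRIMARY KEY, price INTEGER);
        CREATE TABLE orders (customer_id TEXT, product_code TEXT);
        -- Index the field we search on so lookups by customer are fast.
        CREATE INDEX orders_by_customer ON orders (customer_id);
        """
    )
    # Hypothetical sample data; prices are in cents.
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("A1", 500), ("B2", 1250)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("c1", "A1"), ("c1", "B2"), ("c1", "A1"), ("c2", "B2")],
    )
    return conn


def customer_bill(conn, customer_id):
    """Join orders to products on product_code and sum one customer's prices."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(p.price), 0)
        FROM orders AS o
        JOIN products AS p ON o.product_code = p.product_code
        WHERE o.customer_id = ?
        """,
        (customer_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_sample_db()
    print(customer_bill(conn, "c1"))  # total for customer c1, in cents
```

The COALESCE call makes a customer with no orders return 0 rather than NULL, which keeps the function's return type consistent.
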
Gremlin is a graph traversal language. The documentation herein will provide all the information necessary to understand how to use Gremlin for graph query, analysis, and manipulation. Gremlin works over those graph databases/frameworks that implement the Blueprints property graph data model. Examples include TinkerGraph, Neo4j, OrientDB, DEX, InfiniteGraph, Rexster, and Sail RDF Stores. Gremlin is a style of graph traversal that can be natively used in various JVM languages. Currently, Gremlin provides native support for Java, Groovy, and Scala.
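
To give a rough feel for what traversing a property graph means, here is a small Python sketch. It is not Gremlin and does not use the Blueprints API: the PropertyGraph class, its method names, and the sample data are hypothetical, chosen only to show vertices with properties, labelled edges, and a traversal built by chaining single steps.

```python
from collections import defaultdict


class PropertyGraph:
    """Hypothetical minimal property graph: vertices carry key/value
    properties and directed edges carry a label."""

    def __init__(self):
        self.vertices = {}                   # vertex id -> properties dict
        self.out_edges = defaultdict(list)   # vertex id -> [(label, target id)]

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props

    def add_edge(self, source, label, target):
        self.out_edges[source].append((label, target))

    def out(self, vids, label):
        """One traversal step: follow outgoing edges with the given label."""
        return [
            target
            for vid in vids
            for (edge_label, target) in self.out_edges[vid]
            if edge_label == label
        ]


def friends_of_friends(graph, start):
    """Chain two 'knows' steps and return the names that are reached."""
    reached = graph.out(graph.out([start], "knows"), "knows")
    return [graph.vertices[vid]["name"] for vid in reached]


def build_sample_graph():
    g = PropertyGraph()
    g.add_vertex(1, name="alice")
    g.add_vertex(2, name="bob")
    g.add_vertex(3, name="carol")
    g.add_vertex(4, name="dave")
    g.add_edge(1, "knows", 2)
    g.add_edge(2, "knows", 3)
    g.add_edge(2, "knows", 4)
    return g


if __name__ == "__main__":
    print(friends_of_friends(build_sample_graph(), 1))
```

Each call to out() takes a list of vertices and returns the next list, so longer traversals are just longer chains of steps.
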

Home - GitHub

Associative model of data - Wikipedia, the free encyclopedia

The associative model of data is an alternative data model for database systems. Other data models, such as the relational model and the object data model, are record-based. These models involve encompassing attributes about a thing, such as a car, in a record structure. Such attributes might be registration, colour, make, model, etc.
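
As a rough illustration of the contrast, the sketch below holds the car's attributes first in a record structure and then as separate (source, verb, target) links. The link representation reflects a general understanding of the associative model (items joined by links) rather than anything stated in the excerpt, and the class name, verb strings, and sample values are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CarRecord:
    """Record-based view: all attributes of one car grouped in one structure."""
    registration: str
    colour: str
    make: str
    model: str


def record_to_links(car):
    """Express the same facts as separate (source, verb, target) links.

    Assumption: the registration is used as the item that identifies the car.
    """
    subject = car.registration
    return [
        (subject, "has colour", car.colour),
        (subject, "has make", car.make),
        (subject, "has model", car.model),
    ]


if __name__ == "__main__":
    car = CarRecord("ABC 123", "red", "Ford", "Fiesta")
    for link in record_to_links(car):
        print(link)
```
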

Getting the most *out* of your data

PyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data. You can download PyTables and use it for free. You can access documentation, some examples of use and presentations in the HowToUse section. PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast yet extremely easy-to-use tool for interactively browsing, processing and searching very large amounts of data.