distributed systems

http://incubator.apache.org/mesos/ Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks.

Mesos: Dynamic Resource Sharing for Clusters

Process Perfection

Well over a year ago, in a conversation with Alexis Richardson, I came up with a catchy acronym to articulate an idea that I had been kicking around as a simple way to respond to all of the Sturm und Drang in the press and the blogosphere about "lock-in", "data portability" and reliability of cloud computing providers. I said -- "You know what, mate, done properly, it would be like a RAID setup -- it would be an array of cloud providers. http://jroller.com/MasterMark/entry/raic_what_s_that
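The "array of cloud providers" idea can be illustrated with a minimal sketch. The excerpt does not say which RAID level the author had in mind, so this assumes simple mirroring (as in RAID 1). `InMemoryProvider` and `MirroredCloudArray` are hypothetical names, and the in-memory dictionaries stand in for real provider APIs.

```python
class InMemoryProvider:
    """Hypothetical stand-in for one cloud storage provider."""

    def __init__(self, name):
        self.name = name
        self.available = True   # flip to False to simulate an outage
        self._objects = {}

    def put(self, key, data):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self._objects[key] = data

    def get(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self._objects[key]


class MirroredCloudArray:
    """Mirrors every object across all providers (assumed RAID 1 style)."""

    def __init__(self, providers):
        self.providers = list(providers)

    def put(self, key, data):
        # Write to every reachable provider; return how many copies were stored.
        written = 0
        for provider in self.providers:
            try:
                provider.put(key, data)
                written += 1
            except ConnectionError:
                continue
        if written == 0:
            raise ConnectionError("no provider accepted the write")
        return written

    def get(self, key):
        # Read from the first provider that is reachable and holds the key.
        for provider in self.providers:
            try:
                return provider.get(key)
            except (ConnectionError, KeyError):
                continue
        raise KeyError(key)


if __name__ == "__main__":
    a, b = InMemoryProvider("provider-a"), InMemoryProvider("provider-b")
    array = MirroredCloudArray([a, b])
    array.put("report", b"q3 numbers")
    a.available = False            # one provider goes down
    print(array.get("report"))     # still served from the other provider
```

Because every object lives on more than one provider, losing a single provider affects neither reads nor the ability to move the data elsewhere, which is the portability and reliability argument the excerpt makes.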
Posted by Eric Brewer on May 30, 2012. Sections: Enterprise Architecture, Operations & Infrastructure, Architecture & Design

CAP Twelve Years Later: How the "Rules" Have Changed

http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed

Design and Implementation of a Real-Time Cloud Analytics Platform

http://www.oscon.com/oscon2011/public/schedule/detail/19836 Brendan Gregg is the lead performance engineer at Joyent, where he analyzes performance and scalability at any level of the software stack. He is the author of the upcoming book “Systems Performance” (Prentice Hall, 2013), and co-author of “DTrace” (Prentice Hall, 2011) and “Solaris Performance and Tools” (Prentice Hall, 2006). He was previously a performance lead and kernel engineer at Sun Microsystems, where he developed the ZFS L2ARC, and later at Oracle. He has also invented and developed numerous performance analysis tools, including some that are shipped by default in Mac OS X and Oracle Solaris 11. His recent work has included performance visualizations for illumos and Linux kernel analysis.

In Memory Data Grid Technologies

After winning a CSC Leading Edge Forum (LEF) research grant, I (Paul Colmer) wanted to publish some of the highlights of my research to share with the wider technology community. What is an In Memory Data Grid? It is not an in-memory relational database, a NoSQL database or a relational database. It is a different breed of software datastore. In summary, an IMDG is an ‘off the shelf’ software product that exhibits the following characteristics: http://highscalability.com/blog/2011/12/21/in-memory-data-grid-technologies.html

IndexTank is now open source!

We are proud to announce that the technology behind IndexTank has just been released as open-source software under the Apache 2.0 License! We promised to do this when LinkedIn acquired IndexTank, so here we go: indextank-engine (indexing engine) and indextank-service (API, BackOffice, Storefront, and Nebulizer). We know that many of our users and other interested parties have been patiently waiting for this release. We want to thank you for your patience, for your kind emails, and for your continued support. http://engineering.linkedin.com/open-source/indextank-now-open-source
Teams from Princeton and CMU are working together to solve one of the most difficult problems in the repertoire: scalable geo-distributed data stores. Major companies like Google and Facebook have been working on multi-datacenter database functionality for some time, but there's still a general lack of available systems that work for complex data scenarios. The ideas in this paper, Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS, are different. It's not another eventually consistent system, or a traditional transaction-oriented system, or a replication-based system, or a system that punts on the issue. It's something new: a causally consistent system that achieves ALPS system properties. http://highscalability.com/blog/2011/11/23/paper-dont-settle-for-eventual-scalable-causal-consistency-f.html

Paper: Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS
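As a rough illustration of what "causally consistent" means in practice, the sketch below shows a replica that makes a replicated write visible only once the writes it depends on are visible. This is a general dependency-checking sketch and not the COPS implementation; `CausalReplica`, its methods, and the integer versions are hypothetical names and simplifications chosen for the example.

```python
class CausalReplica:
    """Toy replica that applies writes only after their dependencies are visible."""

    def __init__(self):
        self.store = {}     # key -> (version, value) for visible writes
        self.pending = []   # replicated writes still waiting on dependencies

    def _satisfied(self, deps):
        # A dependency (key, version) is met once that version or a newer one is visible.
        return all(k in self.store and self.store[k][0] >= v for k, v in deps)

    def receive(self, key, version, value, deps=()):
        # Buffer the incoming write, then apply everything that has become ready.
        self.pending.append((key, version, value, tuple(deps)))
        self._drain()

    def _drain(self):
        progressed = True
        while progressed:
            progressed = False
            for write in list(self.pending):
                key, version, value, deps = write
                if self._satisfied(deps):
                    self.pending.remove(write)
                    current = self.store.get(key)
                    # Keep only the newest version of each key.
                    if current is None or version > current[0]:
                        self.store[key] = (version, value)
                    progressed = True

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


if __name__ == "__main__":
    replica = CausalReplica()
    # The comment depends on the photo but arrives first over the wide area.
    replica.receive("comment", 1, "nice photo", deps=[("photo", 1)])
    print(replica.read("comment"))   # None: hidden until the photo is visible
    replica.receive("photo", 1, "beach.jpg")
    print(replica.read("comment"))   # nice photo
```

The point of the buffering is that a reader at this replica can never see the comment without also being able to see the photo it refers to, which an eventually consistent store does not guarantee.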

Graphity: An efficient Graph Model for Retrieving the Top-k News Feeds for users in social networks

http://www.rene-pickhardt.de/graphity-an-efficient-graph-model-for-retrieving-the-top-k-news-feeds-for-users-in-social-networks/ UPDATE: the paper got accepted at SOCIALCOM2012, and the source code and data sets are online. UPDATE II: Download the paper (11 pages from SocialCom 2012, with co-authors Thomas Gottron, Jonas Kunze, Ansgar Scherp and Steffen Staab) and the slides. I already said that my first research results have been submitted to the SIGMOD conference, to the social networks and graph databases track. Time to sum up the results and blog about them. You can find a demo of the system here.
We've previously written about the importance of internal tooling for creating a culture of empowering engineers and building a leveraged business. Our first example was adding bash completion to a curl wrapper script. Today I'd like to describe some of the internal tooling we use to make ourselves more productive in the distributed service-oriented architecture that we maintain in our production environment. The three things I'll be talking about are distributed tracing, profiling across a large group of machines, and building a REPL environment for working with your code on an ad hoc basis.

Engineering: Tools for Debugging Distributed Systems

Below I’ve collected some links to advanced computer science courses online. I’m concentrating on courses with good lecture notes, rather than video lectures, and I’m applying a rather arbitrary filter for quality (otherwise this becomes a directory with less semantic utility). This is the good stuff! But only a subset of it – any recommendations for good courses are gratefully received. I’m mainly interested in systems, data structures and mathematics, so reserve the right to choose topics at will. Courses are organised by broad topic.

Advanced Computer Science Courses : Paper Trail

35+ Use Cases for Choosing Your Next NoSQL Database

We've asked What The Heck Are You Actually Using NoSQL For? We've asked 101 Questions To Ask When Considering A NoSQL Database.