distributed-computation

amazon

UCBerkeley

http://tech.backtype.com/using-gearman-for-distributed-alerts At BackType we manage over 30 virtual machines (EC2). We've leveraged the latest technology in cloud computing, storage, and data processing to index over one billion online reactions (comment-like data) and organize those conversations to help users find the latest news and opinions. When you run dozens of machines, you're inevitably going to want some kind of monitoring in place. There are plenty of existing tools available, such as monit, god, daemontools, etc., for lower-level systems management. Cloudkick provides free tools for managing virtual machines on EC2 and Slicehost with some basic monitoring.

Using Gearman For Distributed Alerts - BackType Technology

HBase Leads Discuss Hadoop, BigTable and Distributed Databases

Google's recent introduction of their Google Application Engine and its inclusion of access to BigTable has created renewed interest in alternative database technologies. A few weeks back InfoQ interviewed Doug Judd, a founder of the Hypertable project, which is inspired by Google's BigTable database. This week InfoQ has the pleasure of presenting an interview with HBase leads Jim Kellerman, Michael Stack, and Bryan Duxbury. HBase is an open-source, distributed, column-oriented store also modeled after BigTable. 1. How would you describe HBase to someone first hearing about it? http://www.infoq.com/news/2008/04/hbase-interview
http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey

Persistent Data Structures and Managed References

http://www.infoq.com/presentations/Systems-that-Never-Stop-Joe-Armstrong

Systems that Never Stop (and Erlang)

Jason’s .plan

http://blogs.digitar.com/jjww/ If you like Rabbit and Warrens, check out RabbitMQ in Action in the sidebar. The goal was simple enough: decouple a particular type of analysis out-of-band from mainstream e-mail processing. We started down the MySQL road…put the things to be digested into a table…consume them in another daemon…bada bing bada boom. But pretty soon, complex ugliness crept into the design phase… You want to have multiple daemons servicing the queue?…no problem we’ll just hard code node numbers…what?
Started with a few blog posts and with the help of many contributors, this project is now benchmarking much more than just protobuf and thrift. Thanks to all who looked at the code, contributed, made suggestions, and pointed out bugs. Three major contributions are from cowtowncoder, who fixed the stax code, Chris Pettitt, who added the json code, and David Bernard, for the xstream and java externalizable code. The charts below display the latest results. Note that the charts are scaled to best fit the results and might be misleading in some cases. If you wish to see the numbers, scroll down to the chart at the end of the page.

Benchmarking - thrift-protobuf-compare - Project Hosting on Go

http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
http://avro.apache.org/ Developers interested in getting more involved with Avro may join the mailing lists, report bugs, retrieve code from the version control system, and make contributions.

Welcome to Apache Avro!

http://www.cloudera.com/blog/2009/10/introducing-cloudera-desktop/

Introducing Cloudera Desktop » Cloudera Hadoop & Big Data Bl

Today at Hadoop World NYC, we’re announcing the availability of Cloudera Desktop, a unified and extensible graphical user interface for Hadoop. The product is free to download and can be used with either internal clusters or clusters running on public clouds. At Cloudera, we’re focused on making Hadoop easy to install, configure, manage, and use for all organizations. While there exist many utilities for developers who work with Hadoop, Cloudera Desktop is targeting beginning developers and non-developers in an organization who’d like to get value from the data stored in their Hadoop cluster.
[2011-06-29] java-gearman-service 0.4 released. You can find it at Google Code. This version addresses the issues with text commands not working, the standalone server dying from inactivity, and other miscellaneous bugs. I realize that it may not be clear how to install Gearman. The following list is meant to be a quick reference for those who just want it installed. Gearman provides a generic application framework to farm out work to other machines or processes that are better suited to do the work. http://gearman.org/index.php

Gearman
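
The "farm out work" idea in the Gearman description can be illustrated with a minimal sketch. It uses only the Python standard library and threads in place of separate machines; it is not the Gearman API or protocol. The names `run_jobs` and `HANDLERS` are hypothetical and chosen only for this example.

```python
import queue
import threading


def run_jobs(jobs, handlers, num_workers=3):
    """Dispatch (function_name, payload) jobs to workers; return results in job order."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this worker
                return
            job_id, name, payload = item
            outcome = handlers[name](payload)  # look up the registered function
            with lock:
                results[job_id] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for job_id, (name, payload) in enumerate(jobs):
        pending.put((job_id, name, payload))
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in range(len(jobs))]


# Hypothetical functions that workers know how to perform.
HANDLERS = {"reverse": lambda s: s[::-1], "upper": str.upper}

if __name__ == "__main__":
    print(run_jobs([("reverse", "abc"), ("upper", "hi")], HANDLERS))
```

The client only names the function it wants and hands over a payload; which worker picks up the job is decided by the queue, which is the decoupling the description refers to.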

mixi Engineers’ Blog

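http://alpha.mixi.co.jp/blog/?s=mapreduce LSH is an algorithm that quickly extracts pairs of highly similar instances from a large amount of data. Here, an instance means a single element of the data set. For example, if the data is the purchase log of an e-commerce site, each user is an instance; if it is an image data set, each individual image is an instance. Microsoft Zune, Umaibo, Nintendo DS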
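
As a rough illustration of how LSH finds similar pairs without comparing every instance against every other, here is a minimal sketch using MinHash signatures with banding. This is one common LSH scheme for set similarity and is an assumption; the mixi post may use a different hash family. The function names and the sample purchase data are hypothetical, and each instance is assumed to be a non-empty set of strings.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(items, num_hashes=20):
    """For each seeded hash function, keep the minimum hash value over the set."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(hashlib.md5(f"{seed}:{item}".encode()).digest()[:8], "big")
            for item in items
        ))
    return signature


def candidate_pairs(instances, num_hashes=20, bands=10):
    """Split signatures into bands; instances sharing any band bucket become candidates."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for name, items in instances.items():
        sig = minhash_signature(items, num_hashes)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(name)
    pairs = set()
    for names in buckets.values():
        for first, second in combinations(sorted(names), 2):
            pairs.add((first, second))
    return pairs


if __name__ == "__main__":
    # Hypothetical purchase logs: each user is an instance, items are what they bought.
    purchases = {
        "alice": {"zune", "umaibo", "ds"},
        "bob": {"zune", "umaibo", "ds"},
        "carol": {"book", "lamp", "desk"},
    }
    print(candidate_pairs(purchases))
```

Only pairs that land in the same bucket are ever compared, so the exact similarity check can be limited to a small candidate set. More bands with fewer rows each make the filter more permissive; fewer bands with more rows make it stricter.
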
Built on Tokyo Tyrant. One of the fastest key-value databases [benchmark]. Tokyo Tyrant has been in development for many years and is used in production by Plurk.com, mixi.jp and scribd.com (to name a few)... It's production ready, and Plurk.com is using it to store millions of keys on only two servers that run 3 lookup nodes and 6 storage nodes (these servers also run MySQL). How does LightCloud differ from memcached and MySQL?

Open Source - LightCloud - Distributed and persistent key value

This section provides a reasonable amount of detail on every user-facing aspect of the MapReduce framework. This should help users implement, configure and tune their jobs in a fine-grained manner. However, please note that the javadoc for each class/interface remains the most comprehensive documentation available; this is only meant to be a tutorial. Let us first take the Mapper and Reducer interfaces.

Hadoop Map/Reduce Tutorial
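
The roles of a mapper and a reducer can be sketched in a few lines of plain Python. This is an in-memory illustration of the programming model only, not Hadoop's Java `Mapper` and `Reducer` interfaces; the names `map_words`, `reduce_counts`, and `run_job` are hypothetical.

```python
from collections import defaultdict


def map_words(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word, 1) for word in line.split()]


def reduce_counts(word, counts):
    """Reducer: combine all values that were emitted for one key."""
    return word, sum(counts)


def run_job(lines, mapper, reducer):
    """Run the map phase, group values by key (the shuffle), then reduce each group."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(["a b a", "b c"], map_words, reduce_counts))
```

In a real framework the grouping step runs across machines and the input is split among many mapper tasks, but the contract is the same: mappers emit key/value pairs, and each reducer call sees one key together with all of its values.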

Setting up Disco — Disco v0.1.2 documentation

This document helps you to install Disco from source, either on a single server or a cluster of servers. This requires installation of some Prerequisites. Background: You should have a quick look at the Technical Overview before setting up the system, to get an idea of what should go where and why.
Message Passing Interface Forum Version 1.1: June, 1995. Beginning in March, 1995, the Message Passing Interface Forum reconvened to correct errors and make clarifications in the MPI document of May 5, 1994, referred to below as Version 1.0.

MPI: A Message-Passing Interface Standard

Developing High Performance Asynchronous IO Applications

Creating Financial Friction for Spammers: Why do spammers send billions of email messages advertising ridiculous products that most of us would never in our lives consider buying? How can someone possibly make money from this endeavor when the vast majority of spam either gets filtered out or at the very best read and discarded by a disgruntled end user?