
Distributed-computation


Amazon

UCBerkeley. Using Gearman For Distributed Alerts - BackType Technology. At BackType we manage over 30 virtual machines (EC2). We've leveraged the latest technology in cloud computing, storage, and data processing to index over one billion online reactions (comment-like data) and organize those conversations to help users find the latest news and opinions. When you run dozens of machines, you're inevitably going to want some kind of monitoring in place. There are plenty of existing tools available, such as monit, god, daemontools, etc., for lower-level systems management. Cloudkick provides free tools for managing virtual machines on EC2 and Slicehost with some basic monitoring. At BackType, we use a number of these. However, as we rapidly deploy new technology and features, we've required more customizable monitoring.

Gearman. Gearman is a system to farm out work to other machines, dispatching function calls to machines that are better suited to do work, to do work in parallel, to load balance lots of function calls, or to call functions between languages. HBase Leads Discuss Hadoop, BigTable and Distributed Databases. Google's recent introduction of their Google Application Engine and its inclusion of access to BigTable has created renewed interest in alternative database technologies.

A few weeks back InfoQ interviewed Doug Judd, a founder of the Hypertable project, which is inspired by Google's BigTable database. This week InfoQ has the pleasure of presenting an interview with HBase leads Jim Kellerman, Michael Stack, and Bryan Duxbury. HBase is an open-source, distributed, column-oriented store also modeled after BigTable. 1. How would you describe HBase to someone first hearing about it? HBase is an open-source, distributed, column-oriented store modeled after the Google paper, "Bigtable: A Distributed Storage System for Structured Data" by Chang et al.

Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. 2. 3. Clearly both projects set out to solve generally the same problem - open-source Bigtable. Persistent Data Structures and Managed References. Systems that Never Stop (and Erlang) Jason’s .plan. If you like Rabbit and Warrens, check out RabbitMQ in Action in the sidebar. The goal was simple enough: decouple a particular type of analysis out-of-band from mainstream e-mail processing.

We started down the MySQL road…put the things to be digested into a table…consume them in another daemon…bada bing bada boom. But pretty soon, complex ugliness crept into the design phase… You want to have multiple daemons servicing the queue? …no problem we’ll just hard code node numbers…what? You want dynamic load re-assignment when daemons join and die? You get the idea…what was supposed to be simple (decouple something) was spinning its own Gordian knot. A short search later, and we entered the world of message queueing. Open up your queue… Cutting to the chase, over the last 4 years there has been no shortage of open-source message queueing servers written. Apache ActiveMQ gets the most press, but it appears to have some issues not losing messages. That leaves us with the carrot muncher… That’s it.
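The "multiple daemons servicing the queue" problem described above can be sketched with a single shared queue that any number of workers pull from. This is only an in-process illustration using Python's standard library, not RabbitMQ or the author's setup; the names `run_workers`, `handler`, and `num_workers` are hypothetical.

```python
import queue
import threading


def run_workers(items, handler, num_workers=3):
    """Process items with num_workers threads pulling from one shared queue.

    No worker is hard-coded to a slice of the work: whichever worker is
    free takes the next item, so load is shared dynamically.
    """
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more work for this worker
                return
            out = handler(item)
            with lock:  # protect the shared results list
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:  # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Results arrive in completion order, so sort before printing.
    print(sorted(run_workers(range(5), lambda x: x * x)))
```

A real message broker adds what this sketch leaves out: persistence, delivery across machines, and redelivery when a consumer dies.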

Benchmarking - thrift-protobuf-compare - Project Hosting on Go. The wiki moved to For discussions please use Started with a few blog posts, and with the help of many contributors, this project now benchmarks much more than just protobuf and thrift. Thanks to all who looked at the code, contributed, made suggestions, and pointed out bugs.

Three major contributions are from cowtowncoder, who fixed the stax code, Chris Pettitt, who added the json code, and David Bernard, for the xstream and java externalizable. The charts below display the latest results. Note that the charts are scaled to best fit the results and they might be misleading in some cases. If you wish to see the numbers, scroll down to the chart at the end of the page. Benchmarks can be very misleading. Setup The following measurements were performed with revision r128 on Windows 7 64-bit using Sun's JVM 1.6.0_15 JRE 32-bit, with an Intel Core i7 920 CPU.

Total Time Serialization Time. Welcome to Apache Avro! Karmasphere. Introducing Cloudera Desktop » Cloudera Hadoop & Big Data Bl. Today at Hadoop World NYC, we’re announcing the availability of Cloudera Desktop, a unified and extensible graphical user interface for Hadoop. The product is free to download and can be used with either internal clusters or clusters running on public clouds. At Cloudera, we’re focused on making Hadoop easy to install, configure, manage, and use for all organizations. While there exist many utilities for developers who work with Hadoop, Cloudera Desktop is targeting beginning developers and non-developers in an organization who’d like to get value from the data stored in their Hadoop cluster.

By working within a web browser, users avoid the tedious client installation and upgrade cycle, and system administrators avoid custom firewall configurations. We’ve worked closely with the MooTools community to create a desktop environment inside of a web browser that should be familiar to navigate for most users. Initial applications for Cloudera Desktop include: Gearman. Mixi Engineers’ Blog. This is Nanao from mixi. About a week has already passed, but on 2/22-2/23 we held a hackathon called Photo Hack Day Japan jointly with the US company Aviary (read as "エイヴィアリー"). Once again, I would like to thank all the participants and the sponsors below. On the day, 23 teams in total presented their work, and the judging selected overall first through third place, two special prizes, and an API prize from each sponsor. I also managed to take part as a judge, so let me introduce the winning entries right away. 1st place: Back to the Future (prize: 300,000 yen). Members: Theeraphol Wattanavekin, Rapee Suveeranont, Yoonjo Shin, Thiti Luang. APIs used: Amazon / gettyimages / Leap Motion. URL: Description: Back to the Future is a cool web app that lets you travel through time. Impressions: Although limited to just four minutes, the demo was excellent, and I think the original concept really shone. 2nd place: Before The Filter (prize: 200,000 yen). Members: Benjamin Watanabe, Antony Tran. APIs used: Aviary. URL: Description: Many great image-editing tools are available, but many users do not know what a good photo requires.

Impressions: I had assumed that photo hacks would be technical things like fancy filters, face detection, or image compositing, but this team was unique in focusing on the technique of taking photos. 3rd place: VOCA Getty (prize: 100,000 yen). Members: Atsushi Onoda, Hiroshi Kanamura, Shinichi Segawa, Yasushi Takemoto. APIs used: gettyimages / imagga. Description: VOCA getty is a photo-based vocabulary flashcard app. Operation flow. Special prize: Na・Gu・Ri・A・I. Overview. Open Source - LightCloud - Distributed and persistent key value. Distributed and persistent key-value database. Features: Built on Tokyo Tyrant, one of the fastest key-value databases [benchmark]. Tokyo Tyrant has been in development for many years and is used in production by Plurk.com, mixi.jp and scribd.com (to name a few)... Great performance (comparable to memcached!). Can store millions of keys on very few servers - tested in production. Scale out by just adding nodes. Nodes are replicated via master-master replication.

But that's not all, we also support Redis (as an alternative to Tokyo Tyrant)! Check benchmarks and more details about Redis in "LightCloud adds support for Redis". Stability: It's production ready and Plurk.com is using it to store millions of keys on only two servers that run 3 lookup nodes and 6 storage nodes (these servers also run MySQL).
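The excerpt says LightCloud scales out "by just adding nodes" but does not say how keys are mapped to nodes. One common technique for that property is consistent hashing, sketched below. This is an illustration under that assumption, not LightCloud's actual code; `HashRing`, `get_node`, the node names, and the `replicas` count are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring (illustrative, not LightCloud's code)."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring at `replicas` pseudo-random points
        # so that keys spread evenly across nodes.
        self._ring = sorted(
            (self._hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def get_node(self, key):
        # A key belongs to the first ring point clockwise from its hash,
        # wrapping around at the end of the ring.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    old = HashRing(["n1", "n2", "n3"])
    new = HashRing(["n1", "n2", "n3", "n4"])
    keys = [f"user:{i}" for i in range(1000)]
    moved = [k for k in keys if old.get_node(k) != new.get_node(k)]
    print(f"{len(moved)} of {len(keys)} keys moved after adding n4")
```

The design point is that adding a node only relocates the keys that now fall to the new node; every other key stays where it was, which is what makes adding capacity cheap.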

How LightCloud differs from memcached and MySQL? Memcached is used for caching, meaning that after some time items saved to memcached are deleted. How LightCloud differs from redis and memcachedb? Setting up Disco — Disco v0.1.2 documentation. Developing High Performance Asynchronous IO Applications. Published on ONLamp.com, by Stas Bekman, 10/12/2006. Creating Financial Friction for Spammers. Why do spammers send billions of email messages advertising ridiculous products that most of us would never in our lives consider buying? How can someone possibly make money from this endeavor when the vast majority of spam either gets filtered out or at the very best read and discarded by a disgruntled end user?

What makes spamming profitable is huge volume. Spamming is profitable when a bait message, be it a commercial spam or a phishing email, reaches a substantial number (usually millions) of recipients. Ken Simpson and Will Whittaker, formerly developers at ActiveState, founded MailChannels to solve the spam problem. By observing spammer behavior, the MailChannels team realized that spammers are impatient. Nowadays, the majority of spam is sent from botnets--vast, distributed networks of compromised Windows PCs. Superorganism - Wikipedia, the free encycl. A superorganism is an organism consisting of many organisms.

The term was originally coined by James Hutton (1726-1797), the "Father of Geology", in 1789. See the discussion of Geophysiology for more on the use of this term in geological and ecological contexts. The term is now usually used to mean a social unit of eusocial animals, where division of labour is highly specialised and where individuals are not able to survive by themselves for extended periods of time. Ants are the best-known example of such a superorganism, while the naked mole rat is a famous example of a eusocial mammal.

The Gaia hypothesis of James Lovelock[2] and the work of James Hutton, Vladimir Vernadsky and Guy Murchie have suggested that the biosphere can be considered a superorganism. Superorganisms are important in cybernetics, particularly biocybernetics. Superorganic in social theory. The term superorganic was adopted by anthropologist Alfred L. Grid Computing Planet. The Free Lunch Is Over: A Fundamental Turn. By Herb Sutter. The biggest sea change in software development since the OO revolution is knocking at the door, and its name is Concurrency. This article appeared in Dr. Dobb's Journal, 30(3), March 2005. A much briefer version under the title "The Concurrency Revolution" appeared in C/C++ Users Journal, 23(2), February 2005. Update note: The CPU trends graph was last updated in August 2009 to include current data and show that the trend continues as predicted.

Your free lunch will soon be over. The major processor manufacturers and architectures, from Intel and AMD to Sparc and PowerPC, have run out of room with most of their traditional approaches to boosting CPU performance. And that puts us at a fundamental turning point in software development, at least for the next few years and for applications targeting general-purpose desktop computers and low-end servers (which happens to account for the vast bulk of the dollar value of software sold today). The Free Performance Lunch Okay. But cache is it. GPGPU.