Riak Overview

Faster than C (Andreas Zwinkau) When judging the performance of programming languages, C is usually called the leader, though Fortran is often faster. New programming languages commonly use C as their reference, and they are really proud to be only so much slower than C. Few language designers try to beat C. What does it take for a language to be faster than C? Better Aliasing Information: Aliasing describes the fact that two references might point to the same memory location. void* memcopy(void* dst, const void* src, size_t count) { char* d = dst; const char* s = src; while (count--) *d++ = *s++; return dst; } Depending on the target architecture, a compiler might perform a lot of optimizations with this code. In C99 the restrict keyword was added, which we could use here to encode that src and dst are different from all other references. Fortran semantics say that function arguments never alias, and there is an array type, whereas in C arrays are pointers. Push Computation to Compile-Time: Doing things at compile time reduces the run time. Runtime Optimization
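The aliasing point in the excerpt above can be illustrated with a minimal sketch. The function name copy_bytes is hypothetical and does not come from the article; the sketch only assumes a C99 compiler. Marking both parameters restrict is a promise by the caller that the two buffers do not overlap, which leaves the compiler free to reorder or widen the loads and stores.

```c
#include <stddef.h>

/* Hypothetical helper, not from the article: a byte copy whose
   parameters are restrict-qualified. The caller promises that the
   dst and src buffers do not overlap; calling it with overlapping
   buffers is undefined behavior. */
void *copy_bytes(void *restrict dst, const void *restrict src, size_t count)
{
    unsigned char *d = dst;        /* void* cannot be dereferenced, */
    const unsigned char *s = src;  /* so copy through byte pointers */

    while (count--)
        *d++ = *s++;

    return dst;                    /* start of the destination buffer */
}
```

Calling copy_bytes(out, in, n) with two distinct arrays behaves like a plain byte copy. Whether a given compiler actually produces faster code for it depends on the target and the optimization level.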

Apache Pig

Programmer’s Toolbox Part 3: Consistent Hashing « tomkleinpeter.com Next up in the toolbox series is an idea so good it deserves an entire article all to itself: consistent hashing. Let’s say you’re a hot startup and your database is starting to slow down. You decide to cache some results so that you can render web pages more quickly. If you want your cache to use multiple servers (scale horizontally, in the biz), you’ll need some way of picking the right server for a particular key. If you only have 5 to 10 minutes allocated for this problem on your development schedule, you’ll end up using what is known as the naïve solution: put your N server IPs in an array and pick one using key % N. I kid, I kid; I know you don’t have a development schedule. Anyway, this ultra simple solution has some nice characteristics and may be the right thing to do. You’ll have a second problem if your cache is read-through or you have some sort of processing occurring alongside your cached data. As I said, though, that might be OK. In a nutshell, here is how it works.
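The naïve key % N scheme from the excerpt above can be sketched in a few lines. The names pick_server and count_moved are hypothetical, and keys are assumed to be already hashed to unsigned integers. The second function illustrates a well-known property of modulo placement, which is the usual motivation for consistent hashing: changing N sends most keys to a different server.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: naive placement, server index = key mod N.
   Assumes n_servers > 0 and that key is already a hash value. */
size_t pick_server(uint64_t key, size_t n_servers)
{
    return (size_t)(key % n_servers);
}

/* Hypothetical helper: counts how many of the keys 0 .. n_keys-1 map
   to a different server index when the pool size changes from old_n
   to new_n. */
size_t count_moved(uint64_t n_keys, size_t old_n, size_t new_n)
{
    size_t moved = 0;

    for (uint64_t k = 0; k < n_keys; k++) {
        if (pick_server(k, old_n) != pick_server(k, new_n))
            moved++;
    }
    return moved;
}
```

For example, count_moved(20, 4, 5) returns 16: growing the pool from 4 to 5 servers leaves only keys 0 to 3 on their original server, so 80% of the cached entries would be looked up in the wrong place.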

Gizzard (Scala framework) Project Website

Apache Accumulo

Bit Twiddling Hacks By Sean Eron Anderson seander@cs.stanford.edu Individually, the code snippets here are in the public domain (unless otherwise noted); feel free to use them however you please. The aggregate collection and descriptions are © 1997-2005 Sean Eron Anderson. The code and descriptions are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY and without even the implied warranty of merchantability or fitness for a particular purpose. As of May 5, 2005, all the code has been tested thoroughly. Thousands of people have read it. Contents About the operation counting methodology When totaling the number of operations for algorithms here, any C operator is counted as one operation. Compute the sign of an integer The last expression above evaluates to sign = v >> 31 for 32-bit integers. Alternatively, if you prefer the result be either -1 or +1, then use: sign = +1 | (v >> (sizeof(int) * CHAR_BIT - 1)); // if v < 0 then -1, else +1 sign = (v ! Patented variation: f = v && ! Sean A.
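The two sign expressions that survive in the excerpt above can be wrapped in small functions, as a rough sketch. The function names are hypothetical. Both shift-based versions assume that right-shifting a negative int is an arithmetic shift, which C leaves implementation-defined, though mainstream compilers behave this way. The comparison-based sign_portable is a common idiom added here for contrast and makes no such assumption.

```c
#include <limits.h>

/* Hypothetical wrapper: -1 if v is negative, 0 otherwise.
   Assumes arithmetic right shift of negative ints
   (implementation-defined in C). */
int sign_mask(int v)
{
    return v >> (sizeof(int) * CHAR_BIT - 1);
}

/* Hypothetical wrapper around the expression quoted in the text:
   -1 if v is negative, +1 otherwise (so 0 maps to +1).
   Same arithmetic-shift assumption as sign_mask. */
int sign_pm1(int v)
{
    return +1 | (v >> (sizeof(int) * CHAR_BIT - 1));
}

/* Comparison-based alternative: -1, 0 or +1, with no dependence
   on how the compiler shifts negative values. */
int sign_portable(int v)
{
    return (v > 0) - (v < 0);
}
```

The shift-based forms avoid branches and comparisons, while sign_portable is the safer choice when the code has to run on an unknown compiler.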

Hive!

Google’s Bigtable Distributed Storage System, Pt. I Google rolls out new applications to millions of users with surprising frequency, which is pretty amazing all by itself. Yet when you look at the variety of the applications, ranging from data-sucking behemoths like webcrawling to intimate apps like Personalized Search and Writely, it is even more startling. How does the Google architecture manage the conflicting requirements of such a wide range of workloads? Bigtable, a Google-developed distributed storage system for structured data, is a big piece of the answer. Isn’t The Google File System The Answer? If It’s a Storage System, Where Are The Disks? This article is adapted from a paper entitled Bigtable: A Distributed Storage System for Structured Data (PDF) that was just released. Scale To Thousands Of Terabytes And Servers Every good product solves a problem. So, It’s A Database, Right? So It’s The Mother Of All Spreadsheets? So Far, I Get It. So Bigtable Stores Everything Forever . . . GFS. Comments, as always, welcome.

vitess - Scaling MySQL databases for the web The project is now hosted on Please use the new location to watch our progress. The main goal of the vitess project is to provide servers and tools to facilitate scaling of MySQL databases for the web. The Project Goals page has more details on this. You can also review the slides from our presentation at the MySQL/Percona conference. Vtocc is the first usable product of vitess. Vtocc is already being used in a large scale production environment. A Python DBAPI 2.0 compliant client interface (vt_occ2.py). Vtocc is still under active development. A consistent row cache and the ability to rewrite queries to maximize utilization of the row cache.

OpenStack Open Source Cloud Computing Software

Nginx Tutorials (version 2012.03.23) I've been doing a lot of work in the Nginx world over the last few years and I've also been thinking about writing a series of tutorial-like articles to explain to more people what I've done and what I've learned in this area. Now I have finally decided to post serial tutorials to the Sina Blog in Chinese. Every article will have one rough topic and will be in a rather casual style. They're not parts of a book after all. But I do have plans to reorganize this material to form a real book. The tutorials being written are divided into "series". The samples in my tutorials are at least compatible with Nginx 0.8.54; do not try the samples with older versions of Nginx. All of the Nginx modules mentioned in these tutorials are production-ready. I'm going to make extensive use of Nginx 3rd-party modules here. Do not reproduce these articles without explicit permission from us. October 30, 2011

HBase
