In the spring of 2010, the search team at Twitter started to rewrite our search engine in order to serve our ever-growing traffic, improve the end-user latency and availability of our service, and enable rapid development of new search features. As part of the effort, we launched a new real-time search engine, changing our back-end from MySQL to a real-time version of Lucene. Last week, we launched a replacement for our Ruby-on-Rails front-end: a Java server we call Blender.
“Big data” is data that becomes large enough that it cannot be processed using conventional methods.
A few of our friends have been asking us about the best practices we learnt over the last two years designing and implementing RESTful Web Services as the back-end of the feedly service. Here is a quick, high-level brain dump: Phase 1 – Defining a simple resource/service: Take a sample resource such as Customer Information and model it as JSON. Build a simple servlet where PUT creates a new customer, GET returns the customer information based on the customer key, DELETE deletes the customer, and POST updates the customer information.
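The Phase 1 mapping above can be sketched in a few lines. This is only an illustration, not feedly's code: it uses Python and an in-memory dictionary instead of a Java servlet and a real store, and the class name `CustomerResource`, the `handle` method, and the choice of status codes are all assumptions made for the example.

```python
import json


class CustomerResource:
    """Hypothetical in-memory stand-in for the customer servlet."""

    def __init__(self):
        # Maps customer key -> customer information (a dict parsed from JSON).
        self._customers = {}

    def handle(self, method, key, body=None):
        """Dispatch one request; returns (status_code, json_text_or_None)."""
        if method == "PUT":
            # PUT creates a new customer under the given key.
            if key in self._customers:
                return 409, None  # assumed: refuse to overwrite an existing customer
            self._customers[key] = json.loads(body)
            return 201, None
        if key not in self._customers:
            return 404, None
        if method == "GET":
            # GET returns the customer information for the key.
            return 200, json.dumps(self._customers[key], sort_keys=True)
        if method == "POST":
            # POST updates the stored customer with the supplied fields.
            self._customers[key].update(json.loads(body))
            return 200, json.dumps(self._customers[key], sort_keys=True)
        if method == "DELETE":
            # DELETE removes the customer.
            del self._customers[key]
            return 204, None
        return 405, None  # any other method is not supported


resource = CustomerResource()
resource.handle("PUT", "c42", '{"name": "Ada", "plan": "free"}')
resource.handle("POST", "c42", '{"plan": "pro"}')
status, payload = resource.handle("GET", "c42")
```

Keeping the dispatch in one small method makes the verb-to-operation mapping easy to read; a real servlet would split it across `doPut`, `doGet`, `doPost`, and `doDelete`.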
Update 6: Some interesting changes from Twitter's Evan Weaver: everything in RAM now, database is a backup; peaks at 300 tweets/second; every tweet followed by an average of 126 people; vector cache of tweet IDs; row cache; fragment cache; page cache; keep separate caches; GC makes Ruby optimization resistant so went with Scala; Thrift and HTTP are used internally; 100s of internal requests for every external request; rewrote MQ but kept interface the same; 3 queues are used to load balance requests; extensive A/B testing for backwards compatibility; switched to C memcached client for speed; optimize critical path; faster to get the cached results from the network memory than recompute them locally. Update 5: Twitter on Scala. A Conversation with Steve Jenson, Alex Payne, and Robey Pointer by Bill Venners.
ParaVM or full-VM or Cloud? Xen