
The C10K problem

It's time for web servers to handle ten thousand clients simultaneously, don't you think? After all, the web is a big place now. And computers are big, too. In 1999 one of the busiest ftp sites, cdrom.com, actually handled 10,000 clients simultaneously through a Gigabit Ethernet pipe. And the thin client model of computing appears to be coming back in style -- this time with the server out on the Internet, serving thousands of clients. With that in mind, here are a few notes on how to configure operating systems and write code to support thousands of clients. Related Sites See Nick Black's excellent Fast UNIX Servers page for a circa-2009 look at the situation. In October 2003, Felix von Leitner put together an excellent web page and presentation about network scalability, complete with benchmarks comparing various networking system calls and operating systems. Book to Read First I/O frameworks

Design and Implementation of a High-performance TCP/IP Communications Library Introduction This article is the second of a multi-part series that will cover the architecture and implementation of components needed to create and maintain a robust, scalable, high-performance, massively multiplayer online game server and game engine. The first article of the series focused on building a Scheduling Engine to drive organized, real-time change in a virtual world. This article focuses on the design and implementation of a TCP/IP communication component, designed to efficiently handle communications between the game server and the remote game clients (players). Background: BBS Games to MUDs Back in the 80's, I ran a modem-based bulletin board system (BBS) that let users dial in and leave messages for other users, share files, and play simple multi-player games. When I headed off for college in the early 90's, I had to shut down the BBS. Terminology - Sockets, and Outgoing / Incoming Connections Peer-to-Peer vs. Okay, enough confusion. Example Use

High-Performance Analytics - The SAS Big Data architecture Hadoop, MapReduce, NoSQL, appliances… all these technical terms are flourishing to describe the Big Data phenomenon, which is at the origin of Big Analytics at SAS. While the petabyte is not yet the basic unit of business intelligence applications, the data available to the analytics world can be expected to grow and diversify. The ability to extract value from this information and use it within a short time frame is the major challenge of the next three years. Big Data and Analytics: the answer to business challenges We offer a free white paper for download that sets out the business uses and benefits in various industry sectors. These examples shed relevant light on the management, storage, analysis and exploitation of large volumes of data carried out with SAS in the Big Data context. SAS® High-Performance Analytics Server: a dedicated offering Visual data exploration with SAS® Visual Analytics

TCP_CORK: More than you ever wanted to know | christopher baus.net April 6, 2005 I previously mentioned the leakiness of Unix's file metaphor. The leak often becomes a gushing torrent when trying to bump up performance. TCP_CORK is yet another example. Before I get into the details of TCP_CORK and the problem it addresses, I want to point out that this is a Linux-only option, although variants exist on other *nix flavors -- for instance TCP_NOPUSH on FreeBSD and Mac OS X (although from what I read the OS X implementation is buggy). This is one of the unfortunate aspects of modern Unix programming. What are "physical" socket writes? The root of the abstraction leak derives from the semantics of the write() function when applied to TCP/IP. Any data that has been sent to the kernel with write() is placed into one or more packets and immediately sent onto the wire. The resulting behavior is what application programmers expected. Nagle's algorithm Nagle not optimal for streams It also requires the peer to process more packets when network latency is low.
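
As a rough illustration of the option discussed in this excerpt, the sketch below corks a TCP socket, writes several small buffers, and then uncorks it so the kernel can coalesce them into fewer packets. It is a minimal sketch, not code from the article: the helper name send_corked is hypothetical, and it assumes Python's socket module, which exposes TCP_CORK on Linux only (on other platforms the helper simply sends without corking).

```python
import socket

# socket.TCP_CORK is only defined on Linux; fall back to None elsewhere.
TCP_CORK = getattr(socket, "TCP_CORK", None)


def send_corked(sock, parts):
    """Send several buffers while asking the kernel to hold back partial packets.

    send_corked is a hypothetical helper name used for illustration.
    """
    if TCP_CORK is not None:
        # Cork: the kernel queues small writes instead of sending each at once.
        sock.setsockopt(socket.IPPROTO_TCP, TCP_CORK, 1)
    try:
        for part in parts:
            sock.sendall(part)
    finally:
        if TCP_CORK is not None:
            # Uncork: clearing the option flushes whatever is still queued.
            sock.setsockopt(socket.IPPROTO_TCP, TCP_CORK, 0)
```

A typical use would be writing a response header and its body as two separate calls, for example send_corked(conn, [header_bytes, body_bytes]), without forcing them into separate packets.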

node.js In the "hello world" web server example below, many client connections can be handled concurrently. Node tells the operating system (through epoll, kqueue, /dev/poll, or select) that it should be notified when a new connection is made, and then it goes to sleep. If someone new connects, then it executes the callback. Each connection is only a small heap allocation. This is in contrast to today's more common concurrency model where OS threads are employed. Node is similar in design to and influenced by systems like Ruby's Event Machine or Python's Twisted. HTTP is a first class protocol in Node. But what about multiple-processor concurrency? See also:
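
The notification model described in this excerpt (register interest with the operating system, sleep, run a callback when a connection is ready) can be sketched outside Node as well. The following is a minimal sketch using Python's standard selectors module, which picks epoll, kqueue, /dev/poll, or select depending on the platform. The function names (start, run, on_accept, on_readable) are hypothetical, and the echo behaviour is chosen only to keep the example short; it is not Node's implementation.

```python
import selectors
import socket


def on_accept(sel, listener):
    # Callback run when the listening socket has a pending connection.
    conn, _addr = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, on_readable)


def on_readable(sel, conn):
    # Callback run when a client socket has data (or has been closed).
    data = conn.recv(4096)
    if data:
        # A real server would buffer bytes that send() could not write.
        conn.send(data)
    else:
        sel.unregister(conn)
        conn.close()


def start(host="127.0.0.1", port=0):
    # Create the listener and register it with the OS notification facility.
    sel = selectors.DefaultSelector()
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, on_accept)
    return sel, listener


def run(sel, rounds, timeout=1.0):
    # Sleep in select() until something is ready, then dispatch callbacks.
    for _ in range(rounds):
        for key, _events in sel.select(timeout):
            key.data(sel, key.fileobj)
```

Each connection costs one registered socket and a callback reference rather than an OS thread, which is the contrast the excerpt draws with thread-per-connection servers.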

A reusable, high performance, socket server class - Part 1 The following source was built using Visual Studio 6.0 SP5 and Visual Studio .NET. You need to have a version of the Microsoft Platform SDK installed. Note that the debug builds of the code waste a lot of CPU cycles due to the debug trace output. Overview Writing a high performance server that runs on Windows NT and uses sockets to communicate with the outside world isn't that hard once you dig through the API references. What does a socket server need to do? A socket server needs to be able to listen on a specific port, accept connections and read and write data from the socket. Before we can start accepting connections we need to have a socket to listen on. virtual SOCKET CreateListeningSocket( unsigned long address, unsigned short port); The server class provides a default implementation that's adequate in most circumstances. Note that we use a helper class, CSocket, to handle setting up our listening socket. Asynchronous IO Graceful shutdown Socket closure A simple server
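
For readers who want to see the listening-socket step in isolation, here is a minimal sketch in Python rather than the article's C++. The function create_listening_socket mirrors the name of the declaration quoted above but is only an illustration: it takes the address as a string, and the backlog parameter and the SO_REUSEADDR option are assumptions, not details taken from the article's default implementation.

```python
import socket


def create_listening_socket(address, port, backlog=128):
    """Create, bind, and start listening on a TCP socket.

    Illustrative only: backlog and SO_REUSEADDR are assumptions.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Allow quick restarts while old connections sit in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((address, port))
        sock.listen(backlog)
    except OSError:
        # Do not leak the descriptor if bind or listen fails.
        sock.close()
        raise
    return sock
```

Passing port 0 lets the operating system pick a free port, which can then be read back with getsockname().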

Box plot In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. Outliers may be plotted as individual points. Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions about the underlying statistical distribution. Types of boxplots Box and whisker plots are uniform in their use of the box: the bottom and top of the box are always the first and third quartiles, and the band inside the box is always the second quartile (the median). Some box plots include an additional character to represent the mean of the data.[2] Variations
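
The quantities a box plot draws can be computed directly. The sketch below is a minimal illustration using Python's statistics module (Python 3.8 or later). Two things here are assumptions rather than statements from the excerpt: the whiskers follow the common 1.5 × IQR convention, which is only one of several variations, and quartiles use the module's default "exclusive" method, so other tools may report slightly different values. The function name box_plot_stats is hypothetical.

```python
import statistics


def box_plot_stats(values, whisker=1.5):
    """Return the quartiles, whisker ends, and outliers for a box plot.

    Assumes the 1.5 * IQR whisker convention (one of several in use).
    """
    data = sorted(values)
    # Three cut points: first quartile, median, third quartile.
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at the most extreme points still inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Points beyond the fences are plotted individually.
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }
```

For the sample 1 through 9 plus a single value of 40, the value 40 falls above the upper fence and is reported as an outlier, while the upper whisker stops at 9.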

The ultimate SO_LINGER page, or: why is my tcp not reliable Posted by bert hubert on 01/18/2009 This post is about an obscure corner of TCP network programming, a corner where almost everybody doesn’t quite get what is going on. So I decided to trawl the web and consult the experts, promising them to write up their wisdom once and for all, in hopes that this subject can be put to rest. The experts (H. Even though I refer a lot to the Linux TCP implementation, the issue described is not Linux-specific, and can occur on any operating system. What is the issue? Sometimes, we have to send an unknown amount of data from one location to another. “TCP provides a reliable, stream-oriented, full-duplex connection between two sockets on top of ip(7), for both v4 and v6 versions.” However, when we naively use TCP to just send the data we need to transmit, it often fails to do what we want - with the final kilobytes or sometimes megabytes of data transmitted never arriving. What is going on How come?
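
The excerpt stops before any remedy is given, so the following is not taken from it. It is a minimal sketch of one commonly recommended pattern for this problem: after sending, shut down only the write side, then keep reading until the peer closes, and only then close the socket. The function name send_and_close_gracefully is hypothetical, and the sketch assumes the peer reads until end-of-stream and then closes its end.

```python
import socket


def send_and_close_gracefully(sock, payload):
    """Send payload, then wait for the peer to finish before closing.

    Assumes the peer reads until end-of-stream and then closes.
    """
    sock.sendall(payload)
    # Signal end-of-data to the peer while keeping the read side open.
    sock.shutdown(socket.SHUT_WR)
    # Drain until recv() returns b"", i.e. the peer has closed its side.
    # Closing with unread data pending can otherwise reset the connection.
    while sock.recv(4096):
        pass
    sock.close()
```

A production version would normally put a timeout on the drain loop so that an unresponsive peer cannot hold the socket open indefinitely.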

Node.js for Beginners » A Detailed Node.js Tutorial

How does a relational database work - Coding Geek When it comes to relational databases, I can’t help thinking that something is missing. They’re used everywhere. There are many different databases: from the small and useful SQLite to the powerful Teradata. But, there are only a few articles that explain how a database works. Are relational databases too old and too boring to be explained outside of university courses, research papers and books? As a developer, I HATE using something I don’t understand. Though the title of this article is explicit, the aim of this article is NOT to understand how to use a database. I’ll start with some computer science stuff like time complexity. Since it’s a long and technical article that involves many algorithms and data structures, take your time to read it. For the more knowledgeable of you, this article is more or less divided into 3 parts: A long time ago (in a galaxy far, far away….), developers had to know exactly the number of operations they were coding. O(1) vs O(n²) The concept Examples Merge

Using OpenSSH through a SOCKS compatible firewall on your LAN Using OpenSSH through a SOCKS compatible PROXY on your LAN This guide has been written by J. Grant. 2002-04-15 Version 0.9 Copyleft J. Grant. New versions can be found on the page hosted by Goto-san: This guide has been featured on the Mandrake website: Tested on Linux Mandrake 8.1; this should not affect you provided you have RPM support. Introduction The SOCKS firewall protocol was fostered by NEC; they currently DO NOT support a free version of their tools for UNIX (free as in freedom, not beer). There are 5 solutions covered in this SSH through a SOCKS PROXY guide. Currently I use Goto-san's connect.c and the wrapper "runsocks" for other applications; read this whole FAQ before making your decision! 1) Using the old NEC software I installed runsocks-1.0r11-3.i386.rpm successfully. rpm -ivh I have also run rpm --rebuild on the src.rpm to build the full packages, and installed those as well.

CouchApp: JavaScript applications in CouchDB Long ago, when I was practicing writing stored procedures, triggers, and cursors for MSSQL, I could not shake the thought of an application where all the business logic runs at the database level, and the presentation tier simply queries the database and is responsible for rendering the results. Many of my developer years have passed since then, but I never came across an opportunity to implement this idea… until I stumbled upon CouchDB. I think many have already heard of NoSQL databases, CouchDB among them. Here I want to describe the wonderful ability to embed JavaScript applications in CouchDB, known as CouchApp. The CouchDB book's site describes a CouchApp as "a JavaScript and HTML5 application served directly to the browser from CouchDB". I think this definition is not quite accurate, since in this case it is HTML that is served to the browser, and which HTML to serve is decided by JavaScript running on the server. An application in CouchDB starts with a "design document". Getting started Nothing special.

Google Architecture Update 2: Sorting 1 PB with MapReduce. PB is not peanut-butter-and-jelly misspelled. It's 1 petabyte or 1000 terabytes or 1,000,000 gigabytes. Google is the King of scalability. Information Sources Platform Linux A large diversity of languages: Python, Java, C++ What's Inside? The Stats Estimated 450,000 low-cost commodity servers in 2006 In 2005 Google indexed 8 billion web pages. The Stack Google visualizes their infrastructure as a three layer stack: Products: search, advertising, email, maps, video, chat, blogger Distributed Systems Infrastructure: GFS, MapReduce, and BigTable. Reliable Storage Mechanism with GFS (Google File System) Reliable scalable storage is a core need of any application. Do Something With the Data Using MapReduce Now that you have a good storage system, how do you do anything with so much data? Storing Structured Data in BigTable BigTable is a large scale, fault tolerant, self managing system that includes terabytes of memory and petabytes of storage. Hardware Misc
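
To make the MapReduce idea mentioned in this excerpt concrete, here is a toy, single-process sketch of the map, shuffle, and reduce steps applied to word counting. It is an illustration of the programming model only and says nothing about how Google's system is implemented. All function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and a real deployment would distribute each phase across many machines.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit one (word, 1) pair per word in the document.
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine the values for one key into a single result.
    return key, sum(values)


def word_count(documents):
    # Run the three phases in sequence within one process.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each map call sees only one document and each reduce call sees only one key, both phases can in principle be spread across machines without shared state.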
