Zooniverse (citizen science project)
Zooniverse is a citizen science web portal owned and operated by the Citizen Science Alliance. The organization grew from the original Galaxy Zoo project and now hosts dozens of projects that allow volunteers to participate in scientific research. Unlike many early internet-based citizen science projects such as SETI@home, which used spare computer processing power to analyse data (an approach known as volunteer computing), Zooniverse projects require the active participation of human volunteers to complete research tasks. Projects have been drawn from disciplines including astronomy, ecology, cell biology, humanities, and climate science.
MapReduce
Overview
MapReduce is a framework for processing parallelizable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster (if all nodes are on the same local network and use similar hardware) or a grid (if the nodes are shared across geographically and administratively distributed systems, and use more heterogeneous hardware). Processing can occur on data stored either in a filesystem (unstructured) or in a database (structured). MapReduce can take advantage of data locality, processing data on or near the storage assets to reduce the distance over which it must be transmitted.
"Map" step: Each worker node applies the "map()" function to the local data and writes the output to temporary storage. A master node ensures that, for redundant copies of input data, only one is processed.
MapReduce allows for distributed processing of the map and reduction operations.
Logical view: Map(k1,v1) → list(k2,v2)
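The logical view Map(k1,v1) → list(k2,v2) can be illustrated with a small single-process sketch. This is only a word-count toy under stated assumptions: the names map_fn, reduce_fn and map_reduce are hypothetical, the reduce signature is an assumption not given in the text above, and nothing here is distributed across nodes.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map(k1, v1) -> list(k2, v2): emit (word, 1) for every word in a line.

    Here k1 is a line number (unused) and v1 is the line's text.
    """
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key, values):
    """Assumed reduce step: collapse all values for one key into a total."""
    return key, sum(values)


def map_reduce(records, mapper, reducer):
    # Map phase: a real cluster would run this on each worker's local data.
    intermediate = []
    for k1, v1 in records:
        intermediate.extend(mapper(k1, v1))

    # Shuffle phase: group intermediate values by their key k2.
    groups = defaultdict(list)
    for k2, v2 in intermediate:
        groups[k2].append(v2)

    # Reduce phase: one reducer call per distinct key.
    return dict(reducer(k2, values) for k2, values in groups.items())


records = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "The fox")]
counts = map_reduce(records, map_fn, reduce_fn)
```

In this sketch the map and reduce functions never share state, which is the property that lets a real framework run them on separate nodes.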
YaCy: a free-software peer-to-peer search engine to replace Google
This article was published 3 years, 11 months and 4 days ago, so it may no longer be up to date and the information it offers may be outdated. This is my discovery of the day, which I owe to Twitter and more specifically to @glenux. I had never heard of YaCy, even though it has existed since 2006. Reading the presentation of YaCy, there is plenty to be enthusiastic about. The technical characteristics are what really win me over: A YaCy instance can store more than 20 million documents. Peer-to-peer index sharing: YaCy implements an index-sharing system that resembles a peer-to-peer (P2P) mechanism. YaCy is divided into four modules: a web crawler (the process that traverses the web pages to be indexed), an indexing engine, a database, and a user interface. The embedded database is specific to YaCy and uses an AVL-tree structure.
Google File System
Google File System (GFS or GoogleFS) is a proprietary distributed file system developed by Google for its own use. It is designed to provide efficient, reliable access to data using large clusters of commodity hardware. A new version of the Google File System is codenamed Colossus.
Design
GFS is designed for system-to-system interaction, not for user-to-system interaction. A GFS cluster consists of multiple nodes. The Master server does not usually store the actual chunks, but rather all the metadata associated with them: the tables mapping the 64-bit labels to chunk locations and the files they make up, the locations of the copies of the chunks, and which processes are reading or writing a particular chunk or taking a "snapshot" of it in order to replicate it (usually at the instigation of the Master server when, due to node failures, the number of copies of a chunk has fallen beneath the set number).
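The kind of metadata the Master server keeps can be sketched as two lookup tables. This is a hypothetical illustration, not GFS's actual data structures: the class MasterMetadata, its method names, the node names, and the replication target of 3 are all assumptions made for the example.

```python
from dataclasses import dataclass, field

REPLICATION_TARGET = 3  # assumed "set number" of copies per chunk


@dataclass
class MasterMetadata:
    # file name -> ordered list of 64-bit chunk labels making up the file
    file_chunks: dict = field(default_factory=dict)
    # chunk label -> set of nodes currently holding a copy of that chunk
    chunk_locations: dict = field(default_factory=dict)

    def add_chunk(self, filename, label, nodes):
        if not 0 <= label < 2**64:
            raise ValueError("chunk label must fit in 64 bits")
        self.file_chunks.setdefault(filename, []).append(label)
        self.chunk_locations[label] = set(nodes)

    def node_failed(self, node):
        # Forget every copy that lived on the failed node.
        for locations in self.chunk_locations.values():
            locations.discard(node)

    def under_replicated(self):
        # Chunks whose copy count fell beneath the target need re-replication.
        return sorted(
            label
            for label, locations in self.chunk_locations.items()
            if len(locations) < REPLICATION_TARGET
        )


master = MasterMetadata()
master.add_chunk("/logs/a", 0x1, ["n1", "n2", "n3"])
master.add_chunk("/logs/a", 0x2, ["n2", "n3", "n4"])
master.node_failed("n1")
needs_copy = master.under_replicated()
```

After the simulated failure of n1, only the first chunk drops below the assumed target, so it is the one the master would schedule for copying.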
Bandwidth throttling
Bandwidth throttling is the intentional slowing of Internet service by an Internet service provider. It is a reactive measure employed in communication networks in an apparent attempt to regulate network traffic and minimize bandwidth congestion. Bandwidth throttling can occur at different locations on the network. On a local area network (LAN), a sysadmin may employ bandwidth throttling to help limit network congestion and server crashes.
Operation
Clients make requests to servers, which respond by sending the required data. To prevent congestion and server crashes, a server administrator may implement bandwidth throttling to control the number of requests a server responds to within a specified period of time.
Application
Users likely to have their bandwidth throttled are typically those who constantly download and upload torrents, or who watch a lot of online video.
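Controlling the number of requests a server responds to within a specified period of time can be sketched with a fixed-window counter. This is one simple technique among several (token buckets and sliding windows are common alternatives); the class name FixedWindowThrottle and its parameters are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class FixedWindowThrottle:
    """Allow at most max_requests per window of window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_start = None  # start time of the current window
        self.count = 0            # requests served in the current window

    def allow(self, now):
        # Open a new window on the first request or once the old one expires.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False  # over the limit: the server would delay or reject


throttle = FixedWindowThrottle(max_requests=3, window_seconds=1.0)
decisions = [throttle.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 1.0)]
```

With a limit of three requests per second, the fourth request inside the first window is refused, and the request arriving at 1.0 s starts a fresh window.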
Free-software search engine
Welcome to Apache Pig!
Virtual private server
A virtual private server (VPS) is a virtual machine sold as a service by an Internet hosting service. A VPS runs its own copy of an operating system, and customers have superuser-level access to that operating system instance, so they can install almost any software that runs on that OS. For many purposes a VPS is functionally equivalent to a dedicated physical server and, being software-defined, can be created and configured much more easily. VPSs are priced much lower than an equivalent physical server, but because they share the underlying physical hardware with other VPSs, performance may be lower and may depend on the workload of other instances on the same hardware node.
Virtualization
The force driving server virtualization is similar to that which led to the development of time-sharing and multiprogramming in the past.
Hosting
With unmanaged or self-managed hosting, the customer is left to administer their own server instance.
Folding@home
The project has pioneered the use of graphics processing units (GPUs), PlayStation 3s, Message Passing Interface (used for computing on multi-core processors), and some Sony Xperia smartphones for distributed computing and scientific research. The project uses statistical simulation methodology that is a paradigm shift from traditional computing methods. As part of the client–server model network architecture, the volunteered machines each receive pieces of a simulation (work units), complete them, and return them to the project's database servers, where the units are compiled into an overall simulation. Volunteers can track their contributions on the Folding@home website, which makes volunteers' participation competitive and encourages long-term involvement. Folding@home is one of the world's fastest computing systems, with a speed of approximately 100 petaFLOPS.
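The work-unit flow described above (hand out pieces, complete them, return them, compile the results) can be sketched in a few lines. This is a toy illustration only: the function names are hypothetical, everything runs in one process, and a sum of squares stands in for the actual simulation, which the text does not specify.

```python
def split_into_work_units(total_steps, unit_size):
    """Server side: cut a job of total_steps into (start, stop) work units."""
    return [
        (start, min(start + unit_size, total_steps))
        for start in range(0, total_steps, unit_size)
    ]


def run_work_unit(unit):
    """Volunteer side: complete one work unit (stand-in computation)."""
    start, stop = unit
    return sum(i * i for i in range(start, stop))


def compile_results(results):
    """Server side: combine the returned units into an overall result."""
    return sum(results)


units = split_into_work_units(total_steps=10, unit_size=4)
returned = [run_work_unit(unit) for unit in units]  # each could run on a different machine
total = compile_results(returned)
```

Because each unit depends only on its own range, the units can be completed in any order and on any machine before the server combines them.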
GPUGRID.net
GPUGRID is a distributed computing project hosted by Pompeu Fabra University and running on the Berkeley Open Infrastructure for Network Computing (BOINC) software platform. It performs full-atom molecular biology simulations that are designed to run on Nvidia's CUDA-compatible graphics processing units.
Welcome to Hive!