
Welcome To Apache Incubator Giraph

Data Recipes

pregel_paper

Sqrrl Enterprise - Linked Data Analysis for Hadoop

Our flagship product is Sqrrl Enterprise, a unified solution for integrating data to enable secure, real-time search, discovery, and analytics, powered by Apache Accumulo. Sqrrl Enterprise enables organizations to ingest, secure, connect, and analyze massive amounts of structured, semi-structured, and unstructured data:

Ingest: Streaming or bulk data ingest from any source.
Secure: Encryption and labeling of data with fine-grained access controls.
Connect: Automatically organize data and extract information about the entities and relationships you care about.
Analyze: Web-based dashboarding and visual, contextual navigation of the data and relationships in the system.

Clients use Sqrrl Enterprise for a variety of real-time Big Data applications, including cybersecurity analytics, healthcare analytics, and intelligence analysis. Sqrrl licenses Sqrrl Enterprise via annual subscription models.

Pregel

Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms.
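
The iteration scheme described in the Pregel abstract can be sketched in a few lines of single-process Python. This is only an illustration under stated assumptions, not the Pregel or Giraph API: the function name run_pregel_max, the dictionary-based graph, and the max_supersteps cap are all hypothetical. Each vertex reads the messages sent to it in the previous iteration, updates its own value, and sends messages along its outgoing edges only if its value changed; the run ends when no messages are in flight.

```python
from collections import defaultdict


def run_pregel_max(graph, values, max_supersteps=30):
    """Propagate the largest vertex value through a directed graph.

    graph:  dict mapping each vertex to a list of its out-neighbours
    values: dict mapping each vertex to its initial integer value
    (Hypothetical helper for illustration; not part of any library.)
    """
    values = dict(values)  # work on a copy of the vertex state

    # Superstep 0: every vertex sends its value along its outgoing edges.
    inbox = defaultdict(list)
    for vertex, neighbours in graph.items():
        for neighbour in neighbours:
            inbox[neighbour].append(values[vertex])

    supersteps = 1
    while inbox and supersteps < max_supersteps:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                # State changed: update it and notify out-neighbours.
                values[vertex] = best
                for neighbour in graph[vertex]:
                    outbox[neighbour].append(best)
            # Otherwise the vertex stays silent, i.e. it votes to halt.
        inbox = outbox  # messages become visible in the next superstep
        supersteps += 1
    return values, supersteps


# A small chain with edges in both directions between neighbours.
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
initial = {1: 3, 2: 6, 3: 2, 4: 1}
final_values, steps = run_pregel_max(graph, initial)
print(final_values)
```

A real system would partition the vertices across machines and deliver the messages over the network between supersteps; the sketch keeps everything in one process to show only the per-vertex logic.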

twitter/cassovary

AWS Lambda

The code you run on AWS Lambda is called a “Lambda function.” After you create your Lambda function it is always ready to run as soon as it is triggered, similar to a formula in a spreadsheet. Each function includes your code as well as some associated configuration information, including the function name and resource requirements. After you upload your code to AWS Lambda, you can associate your function with specific AWS resources (e.g. a particular Amazon S3 bucket, Amazon DynamoDB table, or Amazon Kinesis stream).
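
As a rough illustration of what such a function might look like in Python, the sketch below handles an event from an associated Amazon S3 bucket. The name lambda_handler is only the conventional default (the handler name is part of the function's configuration), the event layout assumes the S3 notification format, and the bucket and key in the sample event are made up.

```python
def lambda_handler(event, context):
    """Collect the (bucket, key) pairs named in an S3 event notification.

    Assumes the S3 notification layout, where each entry of
    event["Records"] carries an "s3" section.
    """
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        objects.append((s3["bucket"]["name"], s3["object"]["key"]))
    return {"count": len(objects), "objects": objects}


# Local dry run with a hand-written event; the names are hypothetical.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/report.csv"}}}
    ]
}
print(lambda_handler(sample_event, None))
```

Calling the handler directly with a hand-built event, as in the last lines, is a convenient way to exercise the logic before uploading the code.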

Quick 'n' Comfortable Web Development in PHP | Nette Framework

SNAP: Stanford Network Analysis Project

COMBINATORIAL_BLAS: Combinatorial BLAS Library (MPI reference implementation)

Authors: Aydın Buluç, John R. Gilbert, Adam Lugowski

This material is based upon work supported by the National Science Foundation under Grant No. 0709385.

The Combinatorial BLAS is an extensible distributed-memory parallel graph library offering a small but powerful set of linear algebra primitives specifically targeting graph analytics. The Combinatorial BLAS is also the backend of the Python Knowledge Discovery Toolbox (KDT).

Download: Read release notes.

Requirements: You need a recent C++ compiler (gcc version 4.4+, Intel version 11.0+ and compatible), a compliant MPI implementation, and C++11 Standard library (libstdc++ that comes with g++ has them).

Documentation: This is a reference implementation of the Combinatorial BLAS Library in C++/MPI. The implementation supports both formatted and binary I/O.

SpParMat <int, float, SpDCCols<int,float> > A;

Sparse and dense vectors can be distributed either along the diagonal processors or to all processors.

New in version 1.3:

Platform as a Service | Pivotal Cloud Foundry | Pivotal

What is the Buildpack Architecture in Pivotal Cloud Foundry?

Pivotal CF uses a flexible approach called buildpacks to dynamically assemble and configure a complete runtime environment for executing a particular type of application. Since buildpacks are extensible to most modern runtimes and frameworks, applications written in nearly any language can be deployed to Pivotal Cloud Foundry. Developers benefit from an “it just works” experience as the platform applies the appropriate buildpack to detect, download and configure the language, framework, container and libraries for the application. The buildpacks that Pivotal Cloud Foundry provides for Java, Ruby, Node, PHP, Python and golang are part of a broad buildpack provider ecosystem that ensures constant updates and maintenance for virtually any language.

Containerization

Combining the power of virtualization with efficient container scheduling, Pivotal Cloud Foundry delivers a higher server density than traditional environments.

Spring

Hadoop's tremendous inefficiency on graph data management (and how to avoid it)

Hadoop is great. It seems clear that it will serve as the basis of the vast majority of analytical data management within five years. Already today it is extremely popular for unstructured and polystructured data analysis and processing, since it is hard to find other options that are superior from a price/performance perspective. The problem with Hadoop is that its strength is also its weakness. Although not the subject of this post, an example of this inefficiency can be found in a SIGMOD paper that a bunch of us from Yale and the University of Wisconsin published 5 weeks ago. Before we get into how to improve Hadoop's efficiency on graph data by a factor of 1000, let's pause for a second to comprehend how dangerous it is to let inefficiencies in Hadoop become widespread. Let's delve into the subject of graph data in more detail. Hadoop, by default, hash partitions data across nodes.
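
To see why hash partitioning can matter for graph data, the toy sketch below counts how many edges of a small ring graph end up with their two endpoints on different partitions. It is an assumption-laden illustration, not Hadoop code: a plain modulus stands in for the hash function, the contiguous range_partition scheme is included only for comparison, and all function names are hypothetical.

```python
def hash_partition(vertex_id, num_partitions):
    """Assign a vertex to a partition by hashing (modulus as a stand-in)."""
    return vertex_id % num_partitions


def range_partition(vertex_id, num_vertices, num_partitions):
    """Assign contiguous blocks of vertex ids to the same partition."""
    block_size = -(-num_vertices // num_partitions)  # ceiling division
    return vertex_id // block_size


def cut_fraction(edges, assign):
    """Fraction of edges whose endpoints land on different partitions."""
    cut = sum(1 for u, v in edges if assign(u) != assign(v))
    return cut / len(edges)


# Ring of 12 vertices: each vertex i points to vertex (i + 1) mod 12.
ring = [(i, (i + 1) % 12) for i in range(12)]

hash_cut = cut_fraction(ring, lambda v: hash_partition(v, 4))
range_cut = cut_fraction(ring, lambda v: range_partition(v, 12, 4))
print(hash_cut, range_cut)
```

In this toy case every edge crosses a partition boundary under the hash scheme, so any step that follows an edge would have to move data between nodes, while the locality-aware assignment keeps most neighbours together. Real graphs are less regular than a ring, so the gap will differ.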

GraphLab - Large-Scale Machine Learning on Graphs