
Big Data Analytics - Platfora

Related:  High Performance Big Data Analytics Infrastructure

What is BigQuery? - Google BigQuery
Querying massive datasets can be time-consuming and expensive without the right hardware and infrastructure. Google BigQuery solves this problem by enabling super-fast SQL queries against append-only tables using the processing power of Google's infrastructure. Simply move your data into BigQuery and let us handle the hard work. You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data. You can access BigQuery by using a web UI or a command-line tool, or by making calls to the BigQuery REST API using client libraries for languages such as Java, .NET, or Python. Get started now with creating an app, running a web query or using the command-line tool, or read on for more information about BigQuery fundamentals and how you can work with the product.
BigQuery fundamentals: There are four main concepts you should understand when using BigQuery: projects, tables, datasets, and jobs. Tables contain your data in BigQuery.
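As a rough illustration of how the concepts named in this entry relate, the sketch below treats a table as living inside a dataset, which in turn lives inside a project, and builds a SQL string against it. The class `TableRef`, the function `count_rows_sql`, and the names `my-project`, `web_logs`, and `page_views` are all hypothetical. The backtick-quoted `project.dataset.table` path is the form used by BigQuery's standard SQL dialect. Nothing here contacts BigQuery; actually running the query would go through the web UI, the command-line tool, or a client library, and would be tracked as a job.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """Hypothetical helper: a table lives in a dataset, which lives in a project."""
    project: str
    dataset: str
    table: str

    def qualified(self) -> str:
        # Backtick-quoted path, as written in BigQuery standard SQL.
        return f"`{self.project}.{self.dataset}.{self.table}`"


def count_rows_sql(ref: TableRef) -> str:
    """Build (but do not run) a row-count query for the given table."""
    return f"SELECT COUNT(*) AS row_count FROM {ref.qualified()}"


if __name__ == "__main__":
    # Placeholder names, not real resources.
    ref = TableRef("my-project", "web_logs", "page_views")
    print(count_rows_sql(ref))
```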

Platform as a Service | Pivotal Cloud Foundry | Pivotal
What is the Buildpack Architecture in Pivotal Cloud Foundry? Pivotal CF uses a flexible approach called buildpacks to dynamically assemble and configure a complete runtime environment for executing a particular type of application. Since buildpacks are extensible to most modern runtimes and frameworks, applications written in nearly any language can be deployed to Pivotal Cloud Foundry. Developers benefit from an “it just works” experience as the platform applies the appropriate buildpack to detect, download and configure the language, framework, container and libraries for the application. The buildpacks that Pivotal Cloud Foundry provides for Java, Ruby, Node, PHP, Python, and Go are part of a broad buildpack provider ecosystem that ensures constant updates and maintenance for virtually any language.
Containerization: Combining the power of virtualization with efficient container scheduling, Pivotal Cloud Foundry delivers a higher server density than traditional environments.
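The "detect" step mentioned in this entry can be pictured as each buildpack inspecting the application directory for a file it recognizes. The sketch below is a simplified illustration under that assumption: the `MARKERS` table and the `detect_buildpack` function are hypothetical and are not Pivotal's actual detection logic, which each real buildpack implements for itself.

```python
import os
import tempfile
from typing import Optional

# Assumed marker files, for illustration only; real buildpacks ship their own detect logic.
MARKERS = [
    ("ruby", "Gemfile"),
    ("node", "package.json"),
    ("python", "requirements.txt"),
    ("php", "composer.json"),
]


def detect_buildpack(app_dir: str) -> Optional[str]:
    """Return the first buildpack whose marker file exists in app_dir, else None."""
    for name, marker in MARKERS:
        if os.path.isfile(os.path.join(app_dir, marker)):
            return name
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as app_dir:
        # Simulate a Node application by creating an empty package.json.
        open(os.path.join(app_dir, "package.json"), "w").close()
        print(detect_buildpack(app_dir))
```

Checking the markers in a fixed order keeps the result deterministic when an application contains more than one recognizable file.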

AWS Lambda
The code you run on AWS Lambda is called a “Lambda function.” After you create your Lambda function it is always ready to run as soon as it is triggered, similar to a formula in a spreadsheet. Each function includes your code as well as some associated configuration information, including the function name and resource requirements. Lambda functions are “stateless,” with no affinity to the underlying infrastructure, so that Lambda can rapidly launch as many copies of the function as needed to scale to the rate of incoming events. After you upload your code to AWS Lambda, you can associate your function with specific AWS resources (e.g. a particular Amazon S3 bucket, Amazon DynamoDB table, or Amazon Kinesis stream). Then, when the resource changes, Lambda will execute your function and manage the compute resources as needed in order to keep up with incoming requests.
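A minimal sketch of what such a function might look like in Python, assuming it has been associated with an S3 bucket. The `handler(event, context)` signature and the `Records[].s3.bucket.name` / `Records[].s3.object.key` fields follow the S3 event notification format; the bucket and key names in the sample event are made up, and the sample event is trimmed to only the fields the code reads.

```python
def handler(event, context):
    """Collect the bucket and key of every object named in an S3 event."""
    changed = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        changed.append({"bucket": s3["bucket"]["name"], "key": s3["object"]["key"]})
    # Stateless: nothing is remembered between invocations; all input arrives in `event`.
    return {"changed": changed}


if __name__ == "__main__":
    # Hypothetical, trimmed-down event for local experimentation.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"}, "object": {"key": "logs/a.txt"}}}
        ]
    }
    print(handler(sample_event, None))
```

Because the function keeps no state of its own, any number of copies can process different events at the same time, which is what lets the service scale with the rate of incoming events.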

Cloud Services - HDInsight (Hadoop)
Scale elastically on demand: HDInsight is a Hadoop distribution powered by the cloud. This means HDInsight was architected to handle any amount of data, scaling from terabytes to petabytes on demand. You can spin up any number of nodes at any time. We charge only for the compute and storage you actually use.
"It's part of our audit requirements that we keep data for seven years, and some information has to be retained for as long as 30 years." –Don Wood, Beth Israel Deaconess Medical Center
Crunch all data – structured, semi-structured, unstructured: Since it's 100% Apache Hadoop, HDInsight can process unstructured or semi-structured data from web clickstreams, social media, server logs, devices and sensors, and more.
"With a solution based on SQL Server and Azure HDInsight Service, we can capture data written in plain English and use it to improve services…This will reinvent the way we work with medical records in the future." –Paul Henderson, Ascribe

Data-Driven Business | Big Data Business Insights | Teradata
Aster Products: Looking for data-driven business insights? With the integrated Teradata Aster Discovery Platform, organizations attain unmatched competitive advantage by making it faster and easier for a wider group of users to generate powerful, high-impact business insights from big data.
Products and Solutions: Teradata Workload Specific Platforms
Teradata Hardware Overview - The Teradata Platform Family, all running the Teradata Database, includes the Active Enterprise Data Warehouse, the Data Warehouse Appliance, the Data Mart Appliance, and the Integrated Big Data Platform.
How to Get Started: Assess Your Organization’s Big Data Analytics Competency. Evaluate your organization's data discovery and big data analytics competency with the IDC Discovery Platform Assessment Tool. Receive a customized report, based on your answers to a short questionnaire, with recommendations for moving up the maturity scale and becoming a data-driven business.

Sqrrl Enterprise - Linked Data Analysis for Hadoop
Our flagship product is Sqrrl Enterprise, a unified solution for integrating data to enable secure, real-time search, discovery, and analytics, powered by Apache Accumulo. Sqrrl Enterprise enables organizations to ingest, secure, connect, and analyze massive amounts of structured, semi-structured, and unstructured data:
Ingest: Streaming or bulk data ingest from any source.
Secure: Encryption and labeling of data with fine-grained access controls.
Connect: Automatically organize data and extract information about the entities and relationships you care about.
Analyze: Web-based dashboarding and visual, contextual navigation of the data and relationships in the system.
Clients use Sqrrl Enterprise for a variety of real-time Big Data applications, including cybersecurity analytics, healthcare analytics, and intelligence analysis. Sqrrl licenses Sqrrl Enterprise via annual subscription models.

Giraph - Welcome To Apache Giraph!
Handling five billion sessions a day – in real time
Since we first released Answers seven months ago, we’ve been thrilled by tremendous adoption from the mobile community. We now see about five billion sessions per day, and growing. Hundreds of millions of devices send millions of events every second to the Answers endpoint. During the time that it took you to read to here, the Answers back-end will have received and processed about 10,000,000 analytics events. The challenge for us is to use this information to provide app developers with reliable, real-time and actionable insights into their mobile apps.
At a high level, we base our architectural decisions on the principles of decoupled components, asynchronous communication and graceful service degradation in response to catastrophic failures. In practice, we need to design a system that receives events, archives them, performs offline and real-time computations, and merges the results of those computations into coherent information.
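The last sentence of this entry describes a receive, archive, compute, and merge flow. The toy sketch below shows one way that flow could be arranged for a simple per-app event count. It is an assumption-laden illustration, not the Answers back-end: the class `EventPipeline` and its method names are hypothetical, everything runs in memory in one process, and the "offline" computation is just a recount of the archive.

```python
from collections import Counter
from typing import Dict, List


class EventPipeline:
    """Toy receive/archive/compute/merge flow; all names are hypothetical."""

    def __init__(self) -> None:
        self.archive: List[str] = []              # every event ever received
        self.batch_view: Counter = Counter()      # counts as of the last offline run
        self.realtime_view: Counter = Counter()   # counts for events since that run

    def receive(self, app_id: str) -> None:
        # Archive first so the offline job can always recompute from raw events.
        self.archive.append(app_id)
        self.realtime_view[app_id] += 1

    def run_offline(self) -> None:
        # Recount the whole archive, then reset the real-time view it now covers.
        self.batch_view = Counter(self.archive)
        self.realtime_view.clear()

    def merged(self) -> Dict[str, int]:
        # Combine the offline result with what has arrived since.
        return dict(self.batch_view + self.realtime_view)


if __name__ == "__main__":
    pipeline = EventPipeline()
    for app in ["app-a", "app-a", "app-b"]:
        pipeline.receive(app)
    pipeline.run_offline()
    pipeline.receive("app-a")
    print(pipeline.merged())
```

Keeping the archive as the source of truth means the offline view can be rebuilt from scratch if the real-time path fails, which is one way to read the "graceful service degradation" principle mentioned above.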

Databricks - The next generation of Big Data