Tutorials

Pragmatic AWS: 3 Tips to enhance the AWS SDK with Scala. At Sumo Logic, most backend code is written in Scala.

Pragmatic AWS: 3 Tips to enhance the AWS SDK with Scala

Scala is a newer JVM (Java Virtual Machine) language created in 2001 by Martin Odersky, who also co-founded our Greylock sister company, Typesafe. Over the past two years at Sumo Logic, we’ve found Scala to be a great way to use the AWS SDK for Java. In this post, I’ll explain some use cases.

1. Tags as fields on AWS model objects

Accessing AWS resource tags can be tedious in Java:

```java
String deployment = null;
for (Tag tag : instance.getTags()) {
    if (tag.getKey().equals("Cluster")) {
        deployment = tag.getValue();
    }
}
```

While this isn’t horrible, it certainly doesn’t make the code easy to read. With the enrichment in place, the same lookup collapses to a field access:

```scala
val deployment = instance.cluster
```

Here is what it takes to make this magic work: whenever this functionality is desired, one just has to import RichAmazonEC2.

2. Scala 2.8.0 included a very powerful new set of collections libraries, which are very useful when manipulating lists of AWS resources.

3.

Elastic MapReduce Quickstart — mrjob 0.4 documentation

Analyze Log Data with Apache Hive, Windows PowerShell, and Amazon EMR : Articles & Tutorials. The example used in this tutorial is known as contextual advertising, and it is one example of what you can do with Amazon Elastic MapReduce (Amazon EMR).
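The Scala trick above is specific to the JVM SDK, but the underlying idea, turning AWS’s list-of-tags representation into direct lookups, works anywhere. As a rough illustration in plain Python, with literal dicts standing in for the tag objects an SDK would return (no real AWS API is used):

```python
def tags_to_dict(tags):
    """Collapse an AWS-style list of {"Key": ..., "Value": ...} tags into a dict."""
    return {t["Key"]: t["Value"] for t in tags}

# Stand-in for the tag list an SDK might return for one EC2 instance.
instance_tags = [
    {"Key": "Cluster", "Value": "prod-search"},
    {"Key": "Owner", "Value": "data-team"},
]

tags = tags_to_dict(instance_tags)
deployment = tags.get("Cluster")  # direct lookup instead of a loop over tags
```

The dict conversion plays the same role as the implicit enrichment: done once, every later access is a one-liner.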

Analyze Log Data with Apache Hive, Windows PowerShell, and Amazon EMR : Articles & Tutorials

It is an adaptation of an earlier article that used the Amazon EMR Command Line Interface (CLI) and the AWS Management Console instead of Windows PowerShell.

Storing Logs on Amazon S3

An ad server produces two types of log files: impression logs and click logs.

Try a Query Using the AWS SDK for Java. The following Java code example uses the AWS SDK for Java to perform the following tasks: Get an item from the ProductCatalog table.
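The contextual-advertising tutorial above joins impression logs against click logs with Hive. Purely as a sketch of that join in pure Python, with invented miniature records rather than the tutorial’s actual log schema:

```python
from collections import defaultdict

# Invented miniature log records: clicks reference an impression id.
impressions = [
    {"impression_id": "i1", "ad": "ad-42"},
    {"impression_id": "i2", "ad": "ad-42"},
    {"impression_id": "i3", "ad": "ad-7"},
]
clicks = [{"impression_id": "i2"}]

def click_through_rate(impressions, clicks):
    """Join clicks to impressions and compute clicks/impressions per ad."""
    clicked = {c["impression_id"] for c in clicks}
    shown = defaultdict(int)
    hit = defaultdict(int)
    for imp in impressions:
        shown[imp["ad"]] += 1
        if imp["impression_id"] in clicked:
            hit[imp["ad"]] += 1
    return {ad: hit[ad] / shown[ad] for ad in shown}

ctr = click_through_rate(impressions, clicks)
```

Hive expresses the same computation as a JOIN plus GROUP BY, and EMR spreads it across a cluster; the per-ad arithmetic is identical.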

Try a Query Using the AWS SDK for Java

Query the Reply table to find all replies posted in the last 15 days for a forum thread. In the code, you first describe your request by creating a QueryRequest object.

Newsapps/beeswithmachineguns

Unleash an Army of Bees With Machine Guns on Your Website. Want to know if your website can stand up to a sudden, massive deluge of traffic?
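The Reply-table query above hinges on a range condition over a timestamp sort key. A minimal sketch of that windowing logic in Python, against in-memory data instead of DynamoDB (the real sample expresses the same condition inside a QueryRequest):

```python
from datetime import datetime, timedelta

def replies_in_last_n_days(replies, now, days=15):
    """Keep replies whose timestamp falls inside the trailing window,
    the same condition the DynamoDB query puts on its sort key."""
    cutoff = now - timedelta(days=days)
    return [r for r in replies if r["posted_at"] >= cutoff]

now = datetime(2013, 7, 1)
replies = [
    {"id": "r1", "posted_at": datetime(2013, 6, 25)},  # inside the window
    {"id": "r2", "posted_at": datetime(2013, 5, 1)},   # too old
]
recent = replies_in_last_n_days(replies, now)
```

In DynamoDB the filtering happens server-side via the key condition, which is what makes the query cheap even on large tables.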

Unleash an Army of Bees With Machine Guns on Your Website

Sure, you could use some of the available tools like Flood, JMeter or The Grinder. But none of those options have bees with machine guns. The news applications team at the Chicago Tribune has released a new tool it calls Bees with Machine Guns that uses Amazon EC2 servers to launch what amounts to a distributed DoS attack against your site.

Genie is out of the bottle! By Sriram Krishnan. In a prior tech blog, we had discussed the architecture of our petabyte-scale data warehouse in the cloud.
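Conceptually, a load-testing swarm like the one described above fans a fixed number of requests out across many concurrent workers and tallies the responses. A toy sketch of that fan-out pattern, hitting a local stub instead of a real site (nothing here comes from the beeswithmachineguns source):

```python
from concurrent.futures import ThreadPoolExecutor

def fake_request(url):
    """Stub standing in for an HTTP GET; a real swarm would hit `url` over the network."""
    return 200

def swarm(url, total_requests, concurrency):
    """Fan `total_requests` out across `concurrency` workers and tally status codes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        statuses = list(pool.map(fake_request, [url] * total_requests))
    return {code: statuses.count(code) for code in set(statuses)}

report = swarm("http://example.com/", total_requests=100, concurrency=10)
```

The real tool gets its punch by running many such workers on separate EC2 instances, so the traffic arrives from many IPs at once.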

Genie is out of the bottle!

Salient features of our architecture include the use of Amazon’s Simple Storage Service (S3) as our "source of truth", leveraging the elasticity of the cloud to run multiple dynamically resizable Hadoop clusters to support various workloads, and our horizontally scalable Hadoop Platform as a Service called Genie. Today, we are pleased to announce that Genie is now open source, and available to the public from the Netflix OSS GitHub site.

What is Genie?

Genie provides job and resource management for the Hadoop ecosystem in the cloud.

Why did we build Genie?

There are two main reasons why we built Genie. Secondly, end-users simply want to run their Hadoop, Hive or Pig jobs - very few of them are actually interested in launching their own clusters, or even installing all the client-side software and downloading all the configurations needed to run such jobs.

Netflix open sources its Hadoop manager for AWS. Netflix runs a lot of Hadoop jobs on the Amazon Web Services cloud computing platform, and on Friday the video-streaming leader open sourced its software to make running those jobs as easy as possible.
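The excerpt does not reproduce Genie’s actual request format, so the field names below are purely hypothetical; the sketch only shows the shape of the idea, namely that a client submits a small job description to a REST service instead of owning a cluster:

```python
import json

def make_job_request(job_type, script_uri, cluster_tag):
    """Build a hypothetical job-submission payload; Genie's real API fields differ."""
    assert job_type in {"hadoop", "hive", "pig"}
    return {
        "jobType": job_type,        # hypothetical field name
        "script": script_uri,       # hypothetical field name
        "clusterTag": cluster_tag,  # hypothetical field name
    }

payload = json.dumps(make_job_request("hive", "s3://bucket/query.q", "prod"))
```

Because the payload only names a cluster by tag, the service is free to route the job to whichever dynamically resized cluster currently carries that tag.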

Netflix open sources its Hadoop manager for AWS

Called Genie, it’s a RESTful API that makes it easy for developers to launch new MapReduce, Hive and Pig jobs and to monitor longer-running jobs on transient cloud resources. In the blog post detailing Genie, Netflix’s Sriram Krishnan makes clear a lot more about what Genie is and is not. Essentially, Genie is a platform as a service running on top of Amazon’s Elastic MapReduce Hadoop service. It’s part of a larger suite of tools that handles everything from diagnostics to service registration.

How to process a million songs in 20 minutes. The recently released Million Song Dataset (MSD), a collaborative project between The Echo Nest and Columbia’s LabROSA, is a fantastic resource for music researchers.

How to process a million songs in 20 minutes

It contains detailed acoustic and contextual data for a million songs. However, getting started with the dataset can be a bit daunting.

Echonest/msd-examples

Credential Management for Mobile Applications : Sample Code & Libraries. This article is a supplement to Authenticating Users of AWS Mobile Applications with a Token Vending Machine.
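The msd-examples code runs this kind of computation as an mrjob MapReduce on EMR. Purely as a shape-of-the-computation sketch, here is a map and reduce of one statistic (average tempo per artist) over in-memory records with made-up values, loosely modeled on MSD fields:

```python
from collections import defaultdict

# Tiny stand-ins for MSD track records (fields modeled loosely on the dataset).
songs = [
    {"artist": "A", "tempo": 120.0},
    {"artist": "A", "tempo": 100.0},
    {"artist": "B", "tempo": 90.0},
]

def map_phase(songs):
    """Emit (artist, tempo) pairs, as a mapper would."""
    for s in songs:
        yield s["artist"], s["tempo"]

def reduce_phase(pairs):
    """Average the tempos per artist, as a reducer would."""
    totals = defaultdict(lambda: [0.0, 0])
    for artist, tempo in pairs:
        totals[artist][0] += tempo
        totals[artist][1] += 1
    return {artist: total / n for artist, (total, n) in totals.items()}

avg_tempo = reduce_phase(map_phase(songs))
```

On EMR the same two phases run in parallel across the full million-song corpus, which is how the twenty-minute turnaround becomes possible.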

Credential Management for Mobile Applications : Sample Code & Libraries

It provides additional details on how to secure your Amazon Web Services (AWS) resources when using the token vending machine (TVM) with mobile applications. First, it is important to understand why mobile security is hard. For native mobile applications, the application code exists and executes on the device. This makes it possible for malicious users to extract AWS credentials embedded within the application.

Parsing Logs with Apache Pig and Elastic MapReduce : Articles & Tutorials. This article is outdated.
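The TVM’s core idea is that the app never ships with long-lived keys; it fetches short-lived credentials at runtime, so anything an attacker extracts soon expires. A highly simplified sketch of that expiry logic (not the actual TVM protocol; all names and values here are invented placeholders):

```python
from datetime import datetime, timedelta

def vend_credentials(now, ttl_minutes=60):
    """Return short-lived credentials, as a TVM-style service might."""
    return {
        "access_key": "TEMP-KEY",     # placeholder, not a real credential
        "secret_key": "TEMP-SECRET",  # placeholder, not a real credential
        "expires_at": now + timedelta(minutes=ttl_minutes),
    }

def is_valid(creds, now):
    """Expired credentials are useless to an attacker who extracts them later."""
    return now < creds["expires_at"]

issued_at = datetime(2013, 1, 1, 12, 0)
creds = vend_credentials(issued_at)
```

Limiting the lifetime is what turns "credentials on the device" from a permanent compromise into a bounded one.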

Parsing Logs with Apache Pig and Elastic MapReduce : Articles & Tutorials

Apache LogAnalysis using Pig : Articles & Tutorials. The Pig script in this sample produces:

- Total bytes transferred per hour
- A list of the top 50 IP addresses by traffic per hour
- A list of the top 50 external referrers
- The top 50 search terms in referrals from Bing and Google

You can modify the Pig script to generate additional information.
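The Pig script derives those reports from Apache access logs. As a rough pure-Python sketch of one of them, total bytes per client IP followed by a top-N cut, using a couple of made-up combined-format lines:

```python
from collections import Counter

log_lines = [
    '10.0.0.1 - - [01/Jul/2013:10:00:00 +0000] "GET /a HTTP/1.1" 200 500',
    '10.0.0.2 - - [01/Jul/2013:10:00:01 +0000] "GET /b HTTP/1.1" 200 1500',
    '10.0.0.1 - - [01/Jul/2013:10:00:02 +0000] "GET /c HTTP/1.1" 200 700',
]

def top_ips_by_bytes(lines, n=50):
    """Sum the response-size field per client IP and keep the heaviest n."""
    traffic = Counter()
    for line in lines:
        fields = line.split()
        ip, size = fields[0], fields[-1]
        if size.isdigit():  # size may be "-" for some responses
            traffic[ip] += int(size)
    return traffic.most_common(n)

top = top_ips_by_bytes(log_lines)
```

Pig expresses the same thing as a GROUP BY plus ORDER and LIMIT, and EMR parallelizes it over log files stored in S3.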

Apache LogAnalysis using Pig : Articles & Tutorials

Running the Pig Sample Using AWS Management Console

To run the application using the AWS Management Console, please see our documentation.

Running the Pig Sample Using EMR's Ruby Command Line Client

If you have the Amazon Elastic MapReduce Command Line Client installed, you can generate the reports using the following commands.

$ INPUT_PATH=
$ OUTPUT_PATH=
$ PIG_SCRIPT=
$